
Re: Rasterizing polygon from numpy arrays

Brendan Ward
 

You need to convert each polygon in your numpy array of coordinates into a GeoJSON-like polygon.

You could use Shapely to do so.

You should then be able to use those as input to rasterize or geometry_mask.  Note that your polygons must be in the coordinate system of your images in order for this to work.  Since it sounds like your x,y dimensions of your images are in integer degrees, this means your x,y coordinates must also be in degrees (vertical axis shouldn't matter for rasterize / geometry_mask).

Hope that helps!


Rasterizing polygon from numpy arrays

alexisshakas@...
 

Hello,
I've been struggling with this for quite some time now, so it's time to ask for help.

My problem is set up as follows:
I want to create a binary mask for an image that is quite large (~10k rows and 360 columns). I have several polygons, all closed, and I want to assign all values that fall in the polygons as True.
The polygons are in the form of a numpy array, with (x,y) coordinates. Each polygon has roughly 1000 values. I have about 200 such polygons for a given image.
Ideally, since the polygons are quite small, I should not query all points of the image (as suggested for example here https://stackoverflow.com/questions/36399381/whats-the-fastest-way-of-checking-if-a-point-is-inside-a-polygon-in-python)



I found documentation about how to rasterize shapefiles here https://rasterio.readthedocs.io/en/latest/api/rasterio.features.html
using for example
rasterio.features.geometry_mask(geometries, out_shape, transform, all_touched=False, invert=False)
but I could not figure out how to do this with a simple numpy array.

Is it possible?

I should also mention that my image axes are mixed. The vertical axis is a float (depths, from 0m to ~100m at 1 mm spacing) while the horizontal axis contains integers (angles, from 0 to 360 in 1 degree increments).

But I could also change these to be simply indices (both integers).

Thank you for any feedback!



Re: Is it possible to ignore existing overview when performing decimated read?

Denis Rykov
 

You can find an example how to do that here: https://github.com/mapbox/rasterio/issues/1929.


Is it possible to ignore existing overview when performing decimated read?

Loïc Dutrieux
 

Hi everyone,

I'm trying to perform a decimated read of a dataset that already contains overviews. Regardless of which value I pass to the resampling= argument of the read method, it seems that the already existing overview is used. Is there any way to ignore it? See the reproducible example below.

Cheers,
Loïc

---

import tempfile
import os

import numpy as np
import rasterio
from rasterio.enums import Resampling


filename = os.path.join(tempfile.gettempdir(), 'overview_test.tif')

# Create random int array
shape = (100, 100)
arr = np.random.randint(0, 9999, size=shape, dtype=np.uint16)

meta = {'height': 100,
        'width': 100,
        'driver': 'GTiff',
        'dtype': np.uint16,
        'count': 1}

# Write random array to file and compute first overview
with rasterio.open(filename, 'w', **meta) as dst:
    dst.write(arr, 1)
    dst.build_overviews([2], Resampling.nearest)

# Read with downsample
with rasterio.open(filename) as src:
    arr_avg = src.read(1, out_shape=(1,50,50), resampling=Resampling.average,
                        out_dtype=np.float64)
    arr_nrt = src.read(1, out_shape=(1,50,50), resampling=Resampling.nearest)

print(np.max(arr_avg - arr_nrt))
# when the source file does not contain overviews, the max of the difference array
# is > 0


Re: Does rasterio.warp.reproject use overviews?

Sean Gillies
 

Loïc,

1. Correct. To use overviews, you must pass a dataset explicitly opened on an overview using `rasterio.open("example.tif", OVERVIEW_LEVEL=1)`.
2. Correct again. We'll add an option to use an automatically determined source file overview, as gdalwarp does.


Re: Does rasterio.warp.reproject use overviews?

Loïc Dutrieux
 

Hi Sean,


Thank you for your response.

Is it correct to interpret from your explanation that:

 1- currently the reproject function will never use overviews of the source file, even when it involves downsampling.

 2- In a future version `reproject()` will eventually use source file overviews. (and the user will be able to control that behavior via an argument (e.g. overviewLevel= as in gdal python API)?)


To provide some context about why I want to know that, I'm warping 10 m sentinel2 data to a 20 m grid and I want to make sure it does *not* use the source file overviews. Overviews are fine for visualization but often not suitable for analysis (unless they have been generated with the right resampling algorithm).


Thanks again,

Cheers,

Loïc




Re: Does rasterio.warp.reproject use overviews?

Sean Gillies
 

Hi Loïc,

On Tue, Nov 3, 2020 at 8:55 AM Loïc Dutrieux <loic.dutrieux@...> wrote:

Hi everyone,


I read that the gdalwarp command line defaults to using the overview level nearest to target resolution. What about `rasterio.warp.reproject()`? Does it use overview at all?


Thank you and kind regards,

Loïc

gdalwarp's logic is found in apps/gdalwarp_lib.cpp, not in gcore, so rasterio.warp.reproject() does not have it. We've got a rasterio ticket about switching over to use the main function in apps/gdalwarp_lib.cpp so that the behavior of rasterio.warp.reproject() is exactly the same as gdalwarp, but haven't started work on it yet.

--
Sean Gillies


Does rasterio.warp.reproject use overviews?

Loïc Dutrieux
 

Hi everyone,


I read that the gdalwarp command line defaults to using the overview level nearest to target resolution. What about `rasterio.warp.reproject()`? Does it use overview at all?


Thank you and kind regards,

Loïc


Re: HDF4 not recognized as a supported file format

leonidas_liakos@...
 

Thank you!
pip3 install rasterio --force-reinstall --no-binary rasterio  did the job.


Re: HDF4 not recognized as a supported file format

Sean Gillies
 

Hi,

In https://github.com/mapbox/rasterio/issues/2026 you reported that you'd previously installed a wheel from PyPI. In that case, I think you need to add an option when reinstalling:

pip3 install rasterio --force-reinstall --no-binary rasterio

--
Sean Gillies


HDF4 not recognized as a supported file format

leonidas_liakos@...
 

I have installed rasterio with pip3 install rasterio --no-binary rasterio
When I'm trying to read an HDF4 MODIS file I get an error:

Matplotlib created a temporary config/cache directory at /tmp/matplotlib-m5kstvas because the default path (/home/kokkytos/.config/matplotlib) is not a writable directory; it is highly recommended to set the MPLCONFIGDIR environment variable to a writable directory, in particular to speed up the import of Matplotlib and to better support multiprocessing.
Traceback (most recent call last):
  File "rasterio/_base.pyx", line 216, in rasterio._base.DatasetBase.__init__
  File "rasterio/_shim.pyx", line 67, in rasterio._shim.open_dataset
  File "rasterio/_err.pyx", line 205, in rasterio._err.exc_wrap_pointer
rasterio._err.CPLE_OpenFailedError: 'MOD11A1.A2014152.h18v04.006.2016204003609.hdf' not recognized as a supported file format.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/bin/rio", line 8, in <module>
    sys.exit(main_group())
  File "/usr/local/lib/python3.7/dist-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.7/dist-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.7/dist-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.7/dist-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/click/decorators.py", line 21, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/rasterio/rio/info.py", line 66, in info
    with ctx.obj['env'], rasterio.open(input) as src:
  File "/usr/local/lib/python3.7/dist-packages/rasterio/env.py", line 433, in wrapper
    return f(*args, **kwds)
  File "/usr/local/lib/python3.7/dist-packages/rasterio/__init__.py", line 218, in open
    s = DatasetReader(path, driver=driver, sharing=sharing, **kwargs)
  File "rasterio/_base.pyx", line 218, in rasterio._base.DatasetBase.__init__
rasterio.errors.RasterioIOError: 'MOD11A1.A2014152.h18v04.006.2016204003609.hdf' not recognized as a supported file format.





rio-mbtiles 1.5.0

Sean Gillies
 

Hi all,

Version 1.5.0 of rio-mbtiles is on PyPI now: https://pypi.org/project/rio-mbtiles/1.5.0/. Special thanks to James McBride for the reviews and feedback.

Share and enjoy,

--
Sean Gillies


Re: Silencing NotGeoreferencedWarning

Sean Gillies
 

Hi Nikos,

On Thu, Oct 29, 2020 at 4:29 PM Nikos Alexandris <nik@...> wrote:
Thank you for this. It's been bugging me for quite some time.
I use it here: https://gitlab.com/thermopolis/public/ecor/-/blob/master/ecor/utilities.py.

I have the same question, as this snippet needs to appear in every function that `rasterio.open()`s an HDF5 (sub)dataset:
is there a more elegant way to silence this warning?

Cheers

We're going to support RPCs in 1.2.0, so you'll see less of this warning, but in the meanwhile I can't suggest anything other than explicitly silencing warnings when you open a file using a context manager or changing Python's filter before you run a program like this:

$ python -W "ignore:Dataset has no geotransform set" -c "import rasterio; rasterio.open('/Users/seang/Desktop/DSC_1549.jpg')"

No warnings! Without -W you will get them.

$ python -c "import rasterio; rasterio.open('/Users/seang/Desktop/DSC_1549.jpg')"
/Users/seang/envs/data/lib/python3.6/site-packages/rasterio/__init__.py:218: NotGeoreferencedWarning: Dataset has no geotransform set. The identity matrix may be returned.
  s = DatasetReader(path, driver=driver, sharing=sharing, **kwargs)
 
--
Sean Gillies


Re: Silencing NotGeoreferencedWarning

Nikos Alexandris
 

Thank you for this. It's been bugging me for quite some time.
I use it here: https://gitlab.com/thermopolis/public/ecor/-/blob/master/ecor/utilities.py.

I have the same question, as this snippet needs to appear in every function that `rasterio.open()`s an HDF5 (sub)dataset:
is there a more elegant way to silence this warning?

Cheers


rio-mbtiles 1.5b3

Sean Gillies
 

Hi all,

1.5b3 is on PyPI now. It adds an option to constrain mbtiles output to only the tiles that cover a web mercator quadkey. See https://github.com/mapbox/rio-mbtiles/blob/master/CHANGES.txt#L4.

This is probably the last change before 1.5.0.

--
Sean Gillies


Re: Rasterio and GDAL_CACHEMAX

Angus Dickey
 

Sean,

Awesome, thanks for the response. There are a couple of places in the docs where the example sets the cache in MBs:

https://rasterio.readthedocs.io/en/latest/api/rasterio.env.html?highlight=GDAL_CACHEMAX
https://rasterio.readthedocs.io/en/latest/topics/switch.html?highlight=GDAL_CACHEMAX

Not a big deal, but might send people down the wrong path.

Thanks again,

Angus


Re: opening file with forced CRS

Alan Snow
 

This should be fixed in 1.1.5 IIRC: #1248


opening file with forced CRS

 

Hi all: as part of using rasterio for OGC API - Coverages work in pygeoapi, a client can pass a bbox parameter as a shortcut to spatially subset a coverage using lat/long coordinates. The idea here, then, is that lat/long coordinates would be reprojected into the data source's native coordinates and then the coverage would be subsetted accordingly.

Having said this, using example files in https://dd.weather.gc.ca/model_hrdps/continental/grib2/06/002, I'm unable to derive the native crs from rasterio in Python:

>>> import rasterio
>>> ds = rasterio.open('CMC_hrdps_continental_HGT_ISBL_0985_ps2.5km_2020102706_P002-00.grib2')
>>> ds.bounds
BoundingBox(left=-0.5, bottom=1455.5, right=2575.5, top=-0.5)
>>> ds.crs
>>> ds.crs is None
True
>>> 

However, when running through rio info, I am able to see bounds and CRS:

$ rio info CMC_hrdps_continental_HGT_ISBL_0985_ps2.5km_2020102706_P002-00.grib2
WARNING:rasterio._env:CPLE_AppDefined in Unable to perform coordinate transformations, so the correct projected geotransform could not be deduced from the lat/long control points.  Defaulting to ungeoreferenced.
{"bounds": [-2099127.4944969374, -5739388.521499627, 4340872.505503062, -2099388.5214996273], "colorinterp": ["undefined"], "count": 1, "crs": "PROJCS[\"unnamed\",GEOGCS[\"Coordinate System imported from GRIB file\",DATUM[\"unknown\",SPHEROID[\"Sphere\",6371229,0]],PRIMEM[\"Greenwich\",0],UNIT[\"degree\",0.0174532925199433]],PROJECTION[\"Polar_Stereographic\"],PARAMETER[\"latitude_of_origin\",60],PARAMETER[\"central_meridian\",252],PARAMETER[\"scale_factor\",1],PARAMETER[\"false_easting\",0],PARAMETER[\"false_northing\",0],UNIT[\"Metre\",1]]", "descriptions": ["98500[Pa] ISBL=\"Isobaric surface\""], "driver": "GRIB", "dtype": "float64", "height": 1456, "indexes": [1], "lnglat": [-92.0404517295466, 52.147885433427845], "mask_flags": [["nodata"]], "nodata": 9999.0, "res": [2500.0, 2500.0], "shape": [1456, 2576], "tiled": false, "transform": [2500.0, 0.0, -2099127.4944969374, 0.0, -2500.0, -2099388.5214996273, 0.0, 0.0, 1.0], "units": [null], "width": 2576}

I need the crs value to be able to reproject the incoming lat/long bbox into native coordinates to subset accordingly. Is there a way to set a given dataset's CRS through code (in readonly mode)?

Thanks

..Tom


Re: Rasterio and GDAL_CACHEMAX

Sean Gillies
 

Hi Angus,

Yes, GDAL_CACHEMAX passed to `Env()` must be in bytes (since https://github.com/mapbox/rasterio/pull/1042/files).

--
Sean Gillies


Rasterio and GDAL_CACHEMAX

Angus Dickey
 

Does GDAL_CACHEMAX have to be set in bytes when using rasterio? I see in the docs there is an example using MBs but it seems to be causing rasterio to set a very small cache size when I use it. For example, accessing a COG in S3 using rasterio 1.1.8:

import rasterio
from rasterio.env import get_gdal_config

# No problem here
with rasterio.Env() as env:
    # Prints 851132006 (5% of my system RAM in bytes)
    print(get_gdal_config('GDAL_CACHEMAX'))
    with rasterio.open('s3://path/to/cog') as src:
        pass  # Do stuff with the COG

# No problem here either
with rasterio.Env(GDAL_CACHEMAX=536870912) as env:
    # Prints  536870912
    print(get_gdal_config('GDAL_CACHEMAX'))
    with rasterio.open('s3://path/to/cog') as src:
        pass  # Do stuff with the COG

# Really slow
with rasterio.Env(GDAL_CACHEMAX=512) as env:
    # Prints 512 (in bytes?)
    print(get_gdal_config('GDAL_CACHEMAX'))
    with rasterio.open('s3://path/to/cog') as src:
        pass  # Do stuff with the COG


It seems like rasterio is setting the GDAL raster block cache to 512 bytes and this is causing the slow read. I don't really understand the internals of rasterio but it looks to be using GDALSetCacheMax() (which only accepts bytes) and is passing it 512. I might be misunderstanding the problem though, it could be something else slowing things down but that is the only change I am making.

Any input is appreciated.

Thanks,

Angus
