Rasterio and GDAL_CACHEMAX
Angus Dickey
Does GDAL_CACHEMAX have to be set in bytes when using rasterio? I see in the docs there is an example using MBs, but it seems to cause rasterio to set a very small cache size when I use it.

For example, accessing a COG in S3 using rasterio 1.1.8:

    import rasterio
    from rasterio.env import get_gdal_config

    # No problem here
    with rasterio.Env() as env:
        # Prints 851132006 (5% of my system RAM in bytes)
        print(get_gdal_config('GDAL_CACHEMAX'))
        with rasterio.open('s3://path/to/cog') as src:
            pass  # Do stuff with the COG

    # No problem here either
    with rasterio.Env(GDAL_CACHEMAX=536870912) as env:
        # Prints 536870912
        print(get_gdal_config('GDAL_CACHEMAX'))
        with rasterio.open('s3://path/to/cog') as src:
            pass  # Do stuff with the COG

    # Really slow
    with rasterio.Env(GDAL_CACHEMAX=512) as env:
        # Prints 512 (in bytes?)
        print(get_gdal_config('GDAL_CACHEMAX'))
        with rasterio.open('s3://path/to/cog') as src:
            pass  # Do stuff with the COG

It seems like rasterio is setting the GDAL raster block cache to 512 bytes, and this is causing the slow read. I don't really understand the internals of rasterio, but it looks to be using GDALSetCacheMax() (which only accepts bytes) and passing it 512.

I might be misunderstanding the problem, though. Something else could be slowing things down, but that is the only change I am making.

Any input is appreciated.

Thanks,
Angus
Sean Gillies
Hi Angus,

On Mon, Oct 26, 2020 at 6:04 PM Angus Dickey <angus@...> wrote:

Yes, GDAL_CACHEMAX passed to `Env()` must be in bytes (since https://github.com/mapbox/rasterio/pull/1042/files).

Sean Gillies
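For anyone landing on this thread later, here is a minimal sketch of one way to avoid the unit mix-up: do the megabyte-to-byte conversion explicitly before handing the value to `Env()`. The names `BYTES_PER_MB`, `cachemax_bytes`, and `read_first_band` are hypothetical helpers made up for this illustration, not part of rasterio, and the sketch assumes 1 MB means 1024 * 1024 bytes.

```python
BYTES_PER_MB = 1024 * 1024  # assumption: binary megabytes


def cachemax_bytes(megabytes):
    """Convert a cache size in megabytes to the byte count Env() expects."""
    return int(megabytes * BYTES_PER_MB)


def read_first_band(path, cache_megabytes=512):
    """Hypothetical helper: read band 1 with an explicit block cache size."""
    import rasterio  # imported here so the conversion helper works without it

    # Pass bytes, not megabytes, as described in the reply above.
    with rasterio.Env(GDAL_CACHEMAX=cachemax_bytes(cache_megabytes)):
        with rasterio.open(path) as src:
            return src.read(1)


# 512 MB expressed in bytes, the same value used in the original question.
print(cachemax_bytes(512))
```

With this, `read_first_band('s3://path/to/cog', cache_megabytes=512)` would request a 512 MB cache rather than a 512 byte one.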
Angus Dickey
Sean,

Awesome, thanks for the response. There are a couple of places in the docs where the example sets the cache in MBs:

https://rasterio.readthedocs.io/en/latest/api/rasterio.env.html?highlight=GDAL_CACHEMAX
https://rasterio.readthedocs.io/en/latest/topics/switch.html?highlight=GDAL_CACHEMAX

Not a big deal, but it might send people down the wrong path.

Thanks again,
Angus