Re: How to speed up rasterio.dataset.sample()
Sean Gillies
Hi,

The sample method is not optimized. It reads data from GDAL's block cache or from disk for each coordinate. Unless your coordinates are specially sorted, you will very likely start getting block cache misses after some number of coordinates, and from then on you'll be reading blocks from disk over and over again for every coordinate.

If you increase GDAL_CACHEMAX to be >= the size of your raster, you should see a pretty good speedup. That's more or less what you would see by copying the dataset into memory using MemoryFile.

Sorting your input coordinates by x and y so that fewer disk reads are required would be another way to speed up processing. If rasterio were ever to optimize sampling, that's what we would investigate.
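As a rough illustration of the sorting idea, here is a minimal sketch, not anything rasterio does itself. The helper name `sample_in_sorted_order` is made up for this example, and sorting by y and then x is an assumption that suits rasters stored in strips; for a tiled raster, ordering by block would probably do better. It only relies on the dataset having a `sample(xy)` method that yields one value per coordinate in the order given.

```python
def sample_in_sorted_order(dataset, coords):
    """Sample `dataset` at `coords`, visiting the points sorted by (y, x).

    `dataset` is any object with a rasterio-style `sample(xy)` method.
    Results are returned in the original input order.
    This helper is hypothetical and is not part of rasterio.
    """
    coords = list(coords)

    # Indices that order the points by y first, then x, so that
    # consecutive reads tend to land in the same rows of blocks.
    order = sorted(range(len(coords)), key=lambda i: (coords[i][1], coords[i][0]))
    sorted_coords = [coords[i] for i in order]

    # Sample in sorted order, then put each value back at the
    # position its coordinate had in the input.
    results = [None] * len(coords)
    for i, value in zip(order, dataset.sample(sorted_coords)):
        results[i] = value
    return results


# Possible usage, assuming rasterio is installed and "example.tif" exists
# (the file name and the cache size are placeholders):
#
#   import rasterio
#   with rasterio.Env(GDAL_CACHEMAX=1024):  # small values are read as megabytes
#       with rasterio.open("example.tif") as src:
#           values = sample_in_sorted_order(src, [(x1, y1), (x2, y2)])
```

Raising GDAL_CACHEMAX and sorting are independent, so it is worth timing each on your own data before combining them.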
On Wed, Dec 22, 2021 at 8:29 AM <aleksandar.ilic@...> wrote:
--
Sean Gillies