Re: Speed up reading rasters


Carlos García Rodríguez
 

I would also like to add that the first tiled read is not necessarily slow... 


On Mon, Apr 27, 2020 at 20:06, Carlos García Rodríguez via groups.io <carlogarro=gmail.com@groups.io> wrote:
So, do you think it would be a good idea to increase the cache memory? If so, how do I do it? I have plenty of RAM, so that should not be a problem. On the other hand, I checked whether there is any relation between tile proximity and read time and didn't find one. You can see the position of each tile in the image.
On Mon, Apr 27, 2020 at 17:20, Sean Gillies <sean.gillies@...> wrote:
Hi,

On Mon, Apr 27, 2020 at 2:36 AM <carlogarro@...> wrote:
Hello, thank you so much for your recommendation; it sped things up 5x. Very useful. Now I am having a problem that I do not understand.

I have the following script, where I access 10 random tiles of my raster. train_data is a [4822, 2] array of pixel positions in the raster.
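import time

import numpy as np
import rasterio
from rasterio.windows import Window

# train_data: [4822, 2] array of pixel positions, defined elsewhere
for i in range(10):
    idx = np.random.randint(4822)
    x_idx = train_data[idx][1]
    y_idx = train_data[idx][0]
    window = Window(y_idx, x_idx, 224, 224)
    start_time = time.time()
    # Timing includes opening the dataset as well as the windowed read
    with rasterio.open('./sentinel2_tiled.tif') as src:
        sentinel2 = src.read(window=window)
    end_time = (time.time() - start_time)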

I do not understand why the loading times for a window differ so much, as can be seen in the following image. Do you have an explanation?



 Thank you once more!

I can't say for sure about the time differences because I don't know much about your data files or your computer. However, know this: GDAL's I/O system caches blocks of raster data in memory. The size of the cache is generally 5% of your computer's memory, and windowed reads may or may not be served directly from the cache, depending on their size and adjacency to previously read data.
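
To illustrate how window position alone can change the amount of data that has to be fetched, here is a minimal sketch that counts the internal tiles a window overlaps. The helper blocks_touched and the 256x256 tile size are assumptions for illustration only; they are not part of rasterio, and the real tile size of a file can be checked with src.block_shapes.

```python
def blocks_touched(col_off, row_off, width, height, block_size=256):
    """Return (block_row, block_col) indices of the tiles a window overlaps.

    block_size=256 is an assumed square tile size, not read from any file.
    """
    first_col = col_off // block_size
    last_col = (col_off + width - 1) // block_size
    first_row = row_off // block_size
    last_row = (row_off + height - 1) // block_size
    return [
        (r, c)
        for r in range(first_row, last_row + 1)
        for c in range(first_col, last_col + 1)
    ]


# A 224x224 window that fits inside one tile
aligned = blocks_touched(0, 0, 224, 224)
# The same size window straddling a tile corner
straddling = blocks_touched(200, 200, 224, 224)
print(len(aligned), len(straddling))  # prints: 1 4
```

Under these assumptions, a 224x224 window needs between one and four tiles depending on where it falls, and each of those tiles may or may not already be in the cache. The cache size itself is controlled by GDAL's GDAL_CACHEMAX configuration option, which in rasterio can be passed as a keyword argument to rasterio.Env.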

--
Sean Gillies
