Representing many raster files as one big file


rasterio@...
 

Hi all,

I'm new to rasterio, and so far it's working very nicely for reading some GeoTIFF files (specifically, the EU DEM elevation data set). I have a question I hope someone can answer: is it possible to take the 27 contiguous TIFF files that make up this data set and represent them as if they were one big file?

Currently I have to load each file in a loop to perform my analysis. Ideally, I would like to give all of the file names and their coordinates to some function that then lets me fetch data from anywhere in the whole dataset using a single `dataset.read` with my window settings. That would save me from handling the cases where the window I'm looking at straddles the edge between tiles. It should also work without requiring the whole dataset to be in memory, since it's huge (tens of GB).

Is this already possible in rasterio?

Cheers,


Sean


Dion Häfner <dion.haefner@...>
 

Hey Sean,

As far as I know, this is not possible with "just" rasterio, but there is one thing you can do.

GDAL (the library rasterio is built on) supports building virtual datasets (https://gdal.org/programs/gdalbuildvrt.html), so you should be able to run

$ gdalbuildvrt dem.vrt *.tif

and then read this VRT file with rasterio.open (which should work out of the box).
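
The two steps above can be sketched from Python as follows. This is only a rough illustration, assuming the `gdalbuildvrt` executable is on the PATH and rasterio is installed; the helper names `vrt_command` and `read_window` are hypothetical and not part of rasterio or GDAL.

```python
import glob


def vrt_command(vrt_path, tile_pattern):
    """Return the gdalbuildvrt argument list for all tiles matching a glob pattern.

    Expanding the pattern in Python avoids relying on the shell to expand *.tif.
    """
    tiles = sorted(glob.glob(tile_pattern))
    return ["gdalbuildvrt", vrt_path, *tiles]


def read_window(path, col_off, row_off, width, height, band=1):
    """Read one band of a pixel window from any dataset rasterio can open.

    Only the requested window is read, so a VRT covering many tiles
    does not have to fit in memory.
    """
    # Imported lazily so vrt_command stays usable without rasterio installed.
    import rasterio
    from rasterio.windows import Window

    with rasterio.open(path) as dataset:
        window = Window(col_off, row_off, width, height)
        return dataset.read(band, window=window)
```

With these in place, running `subprocess.run(vrt_command("dem.vrt", "*.tif"), check=True)` once should build the VRT, after which something like `read_window("dem.vrt", 0, 0, 512, 512)` would return a 512 x 512 array. The offsets and sizes here are placeholders; a window that crosses a tile boundary is requested in exactly the same way.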

Good luck!
Dion

On 03/12/2019 15.09, rasterio via Groups.Io wrote:


rasterio@...
 

Hi Dion,

Thanks a lot, that worked!

Cheers,


Sean

On 2019-12-03 15:31, Dion Häfner wrote: