Creating 12GB Create very large empty geotif with Rasterio without using large amounts of memory
Hobart, Geordie (NRCan/RNCan)
Hi there. I was wondering if there was a way to create a multiband 12 GB geotif file with rasterio without consuming a large amount of memory. I’m working on a shared system where resources are limited and requesting a large amount of memory for this job will take a long time to be scheduled. I’ve come across some older solutions using other python libraries but would prefer to stick with rasterio if possible. Thanks so much. Geordie Hobart CFS, NRCan / RNCan
From: main@rasterio.groups.io <main@rasterio.groups.io>
On Behalf Of Sean Gillies via groups.io
Sent: June 21, 2021 12:42 To: main@rasterio.groups.io Subject: Re: [rasterio] Shape of resulting merged image is not what I expect
Hi,
I'm happy report that you won't have this problem with version 1.2.5, which is coming soon to PyPI.
As a part of the work to fix the bug noted at https://github.com/mapbox/rasterio/blob/maint-1.2/CHANGES.txt#L10, we replaced a call to math.ceil() to a call of round() in a key place and now the output shape of merge is what you expect.
On Thu, Jun 10, 2021 at 9:02 AM <juanjo.gomeznavarro@...> wrote:
-- Sean Gillies |
|
yon.davies@...
Hey Hobart,
I've had success using Rasterio's windowed reading and writing to create a huge raster in chunks. You could do something like this: # Create blank raster that will hold the data. Make sure the dst_profile has compression to make this small with rio.open(r1, "w", **dst_profile) as dst:
dst.write(np.zeros(shape=(x, y, z), height, width), dtype=dst_dtype))
for j, window in enumerate(block_windows):# Get block windows from this blank raster with rio.open(r1, "r") as src:
block_windows = list(src.block_windows()) dst.write(stacked, window=window[1]) The block_windows are automatically created and usually, but you can play around and create rasterio.windows.Window() with a size of your choosing. Creating the initial blank raster will require that you have enough memory to hold the np.zeros, which may not be true. In that case, you can create the initial blank raster on the disk using gdal_create in command line or in Python: import gdal
import osr
driver = gdal.GetDriverByName('GTiff')
ds = driver.Create(outfn, xsize, ysize, nbands, dtype, options=['COMPRESS=LZW', 'TILED=YES']) |
|
Reading & writing with windows can be done with rioxarray (rasterio + xarray):
import rioxarray rds = rioxarray.open_rasterio("input.tif", lock=False) rds.rio.to_raster("output.tif", windowed=True) You can also use dask to read/write in parallel:
Hope this helps, Alan |
|
Hobart, Geordie (NRCan/RNCan)
Thank you Alan. Good to know. Kind regards. Geordie
From: main@rasterio.groups.io <main@rasterio.groups.io>
On Behalf Of Alan Snow
Sent: June 25, 2021 09:06 To: main@rasterio.groups.io Subject: Re: [rasterio] Creating 12GB Create very large empty geotif with Rasterio without using large amounts of memory
Reading & writing with windows can be done with rioxarray (rasterio + xarray):
|
|
Hobart, Geordie (NRCan/RNCan)
Thanks Yon. I want to avoid making the 12 GB numpy zero array and the associated memory demand so solution two is the best bet. Greatly appreciated. Geordie
From: main@rasterio.groups.io <main@rasterio.groups.io>
On Behalf Of yon.davies@...
Sent: June 25, 2021 08:44 To: main@rasterio.groups.io Subject: Re: [rasterio] Creating 12GB Create very large empty geotif with Rasterio without using large amounts of memory
Hey Hobart, with rio.open(r1, "w", **dst_profile) as dst: dst.write(np.zeros(shape=(x, y, z), height, width), dtype=dst_dtype)) with rio.open(r1, "r") as src: for j, window in enumerate(block_windows): import gdal import osr driver = gdal.GetDriverByName('GTiff')
|
|
Luke
You don't need to pre-populate a dataset with a massive empty array, you can just open it in write mode and write chunks of data via windowed writing:
profile = {
'driver': 'GTiff',
'tiled': True,
'height': huge_number,
'width': huge_number,
'count': 1,
'dtype': 'float32',
'blockxsize': 256,
'blockysize': 256,
'compress': 'LZW'
}
with rasterio.open('massive.tif', 'w', **profile) as dst:
for window in dst.block_windows():
dst.write(small_data_chunk, window=window, indexes=1) |
|
Hobart, Geordie (NRCan/RNCan)
Thanks Luke. And everyone else for all great the suggestions. Much appreciated.
From: main@rasterio.groups.io <main@rasterio.groups.io>
On Behalf Of Luke
Sent: June 25, 2021 18:10 To: main@rasterio.groups.io Subject: Re: [rasterio] Creating 12GB Create very large empty geotif with Rasterio without using large amounts of memory
You don't need to pre-populate a dataset with a massive empty array, you can just open it in write mode and write chunks of data via windowed writing: profile = { 'driver': 'GTiff', 'tiled': True, 'height': huge_number, 'width': huge_number, 'count': 1, 'dtype': 'float32', 'blockxsize': 256, 'blockysize': 256, 'compress': 'LZW' }
with rasterio.open('massive.tif', 'w', **profile) as dst: for window in dst.block_windows(): dst.write(small_data_chunk, window=window, indexes=1) |
|