Date   

Re: No geoKeys generated when writing a geotiff

Sean Gillies
 

Hi,

On Tue, Jul 6, 2021 at 10:23 AM devsantiago <santiago@...> wrote:

Hello,
I'm sorry if this is question is an easy one, I'm a beginner with rasterio. I have a python script that reads in a dataset, calculates an overlay layer and then outputs the calculated layer. This part works as expected. I then need to take this output geotiff and read it in for display inside a web application. Inside my web application, I need to extract data such as the geotiff's coordinate system , resolution and origin. Currently, I have some other geotiffs that were not generated using rasterio  and I'm able to fetch the needed information by extracting the geoKeys with npm's geotiff package : https://www.npmjs.com/package/geotiff. However, when reading my geotifffs generated with rasterio, no geoKeys are found. Below you can find my code used to open the existing dataset and the open and create options I'm using. I'm wondering if there is something that I'm missing to be able to correctly generate these geoKeys. Any help will be greatly appreciated!

 

def layers(directory, gis_data, maptype, mapname):
    h = gis_data.height
    w = gis_data.width
    c = gis_data.crs
    t = gis_data.transform
    output_folder = directory + 'output'
    
    if not os.path.exists(output_folder):
        os.makedirs(output_folder)
    
    mapname = rio.open(
        output_folder + '/' + mapname + '.tiff',
        'w',
        driver = 'GTiff',
        height = h,
        width = w,
        count = 1,
        dtype = np.uint8,
        crs = c,
        transform = t,
profile='GeoTIFF',
geotiff_keys_flavour='STANDARD',
georef_sources='INTERNAL', ) mapname.write(maptype, 1) mapname.close()

Can you try with 

    geotiff_keys_flavor='STANDARD'

and see what happens? According to https://gdal.org/drivers/raster/gtiff.html#creation-options there is no "u" in the option name.

--
Sean Gillies


Re: How are calculate_default_transform and aligned_target different?

Sean Gillies
 

Hi Amaury,

Rasterio's calculate_default_transform is only a thin wrapper around GDALSuggestedWarpOutput2: https://gdal.org/api/gdal_alg.html#_CPPv424GDALSuggestedWarpOutput212GDALDatasetH19GDALTransformerFuncPvPdPiPiPdi. The behavior you see in GDAL is implemented in the gdalwarp program and is only partially implemented in rasterio's rio-warp command. So, to get the same behavior, you would need to port some logic from C (the gdalwap_lib.cpp file, if I remember correctly) to Python.

We'd like to change rasterio's warp API to align better with GDAL https://github.com/mapbox/rasterio/issues/1990 but no work on this has started yet.

On Tue, Jul 6, 2021 at 4:20 AM <amaury.dehecq@...> wrote:

Hi all,

on the same topic, is there an easy way to get the destination parameters (transform, width, height) exactly as GDAL would do, considering the user might provide any of, or a combination of, the following: dst_bounds, dst_width/height, dst_res.

I understand that if dst_width/height or resolution are provided, the output transformation and width/height can easily be calculated using calculate_default_transform (and aligned_target if the resolution is to be strictly enforced). But what if dst_bounds are also set?

Let's consider the following cases:

- dst_bounds is set alone -> GDAL output strictly enforces the output bounds but keeps the output resolution as close as possible to the input resolution

- dst_bounds and res are set -> GDAL output enforces the output resolution and use the bounds that are as close as possible to the requested bounds

- dst_bounds and res are set+ tap option -> same as before but output bounds is a multiple of the resolution

- dst_bounds and width/height are set -> GDAL outputs shape and bounds are strictly enforced, the resolution is set accordingly.

Currently, I cannot find an easy way to do this with rasterio. At least a specific, non trivial order of rasterio functions have to be called, to get exactly this behavior. There could possibly be a specific rasterio function to do this.

Thanks in advance!

Amaury

 

_,_._,_



--
Sean Gillies


No geoKeys generated when writing a geotiff

devsantiago
 

Hello,
I'm sorry if this is question is an easy one, I'm a beginner with rasterio. I have a python script that reads in a dataset, calculates an overlay layer and then outputs the calculated layer. This part works as expected. I then need to take this output geotiff and read it in for display inside a web application. Inside my web application, I need to extract data such as the geotiff's coordinate system , resolution and origin. Currently, I have some other geotiffs that were not generated using rasterio  and I'm able to fetch the needed information by extracting the geoKeys with npm's geotiff package : https://www.npmjs.com/package/geotiff. However, when reading my geotifffs generated with rasterio, no geoKeys are found. Below you can find my code used to open the existing dataset and the open and create options I'm using. I'm wondering if there is something that I'm missing to be able to correctly generate these geoKeys. Any help will be greatly appreciated!

 

def layers(directory, gis_data, maptype, mapname):
    h = gis_data.height
    w = gis_data.width
    c = gis_data.crs
    t = gis_data.transform
    output_folder = directory + 'output'
    
    if not os.path.exists(output_folder):
        os.makedirs(output_folder)
    
    mapname = rio.open(
        output_folder + '/' + mapname + '.tiff',
        'w',
        driver = 'GTiff',
        height = h,
        width = w,
        count = 1,
        dtype = np.uint8,
        crs = c,
        transform = t,
profile='GeoTIFF',
geotiff_keys_flavour='STANDARD',
georef_sources='INTERNAL', ) mapname.write(maptype, 1) mapname.close()


Re: Default nodata value

amaury.dehecq@...
 

Hi Idan,

thanks, that's already quite useful. This is typically the kind of default values I encounter when using GDAL in the command line indeed.

This does not match the values I find when using rasterio's reproject, when nodata is set to None, which would be, e.g.

- uint8: 255 -> ok
- uint16: 65535 -> ok

- int16: 32767 -> ok
- uint32: 999999 -> not the same

- float32: 1e20 -> not the same

- float64: 1e20 -> not the same

Amaury

 


Re: Default nodata value

Idan Miara
 

Hi Amaury,

I'm not aware of any "default" nodata value per dtype in gdal.
Though, you might find the defaults in gdal_calc useful:

Idan


On Tue, 6 Jul 2021 at 15:31, <amaury.dehecq@...> wrote:

Hi,

what is the default nodata used by GDAL, and hence rasterio, for any given dtype?
For example, rasterio.warp.reproject states:
dst_nodata (int or float, optional) – The nodata value used to initialize the destination; it will remain in all areas not covered by the reprojected source. Defaults to the nodata value of the destination image (if set), the value of src_nodata, or 0 (GDAL default).

But I couldn't find anywhere in rasterio's or GDAL's documentation what was the default value for any given dtype.

Thanks,

Amaury


Default nodata value

amaury.dehecq@...
 

Hi,

what is the default nodata used by GDAL, and hence rasterio, for any given dtype?
For example, rasterio.warp.reproject states:
dst_nodata (int or float, optional) – The nodata value used to initialize the destination; it will remain in all areas not covered by the reprojected source. Defaults to the nodata value of the destination image (if set), the value of src_nodata, or 0 (GDAL default).

But I couldn't find anywhere in rasterio's or GDAL's documentation what was the default value for any given dtype.

Thanks,

Amaury


Re: How are calculate_default_transform and aligned_target different?

amaury.dehecq@...
 

Hi all,

on the same topic, is there an easy way to get the destination parameters (transform, width, height) exactly as GDAL would do, considering the user might provide any of, or a combination of, the following: dst_bounds, dst_width/height, dst_res.

I understand that if dst_width/height or resolution are provided, the output transformation and width/height can easily be calculated using calculate_default_transform (and aligned_target if the resolution is to be strictly enforced). But what if dst_bounds are also set?

Let's consider the following cases:

- dst_bounds is set alone -> GDAL output strictly enforces the output bounds but keeps the output resolution as close as possible to the input resolution

- dst_bounds and res are set -> GDAL output enforces the output resolution and use the bounds that are as close as possible to the requested bounds

- dst_bounds and res are set+ tap option -> same as before but output bounds is a multiple of the resolution

- dst_bounds and width/height are set -> GDAL outputs shape and bounds are strictly enforced, the resolution is set accordingly.

Currently, I cannot find an easy way to do this with rasterio. At least a specific, non trivial order of rasterio functions have to be called, to get exactly this behavior. There could possibly be a specific rasterio function to do this.

Thanks in advance!

Amaury

 


Re: Open EHdr raster with hdr header file from memory

Luke
 

Replacing your FTP code with direct access using the /vsicurl/ virtual filesystem:

import rasterio

vrt = '''
<VRTDataset rasterXSize="6935" rasterYSize="3351">
<SRS>EPSG:4326</SRS>
<GeoTransform>-124.729583333333,0.00833333333333333,0,52.8704166666666,0,-0.00833333333333333</GeoTransform>
<VRTRasterBand dataType="Int16" band="1" subClass="VRTRawRasterBand">
<SourceFilename relativetoVRT="0">{}/{}/{}</SourceFilename>
<NoDataValue>-9999</NoDataValue>
<ImageOffset>0</ImageOffset>
<PixelOffset>2</PixelOffset>
<LineOffset>13870</LineOffset>
<ByteOrder>MSB</ByteOrder>
</VRTRasterBand>
</VRTDataset>
'''
""
vsi_path = '/vsigzip//vsitar//vsicurl'
ftp_path = 'ftp://sidads.colorado.edu/DATASETS/NOAA/G02158/masked/2021/01_Jan/SNODAS_20210101.tar'
file_path = 'us_ssmv11050lL00T0024TTNATS2021010105DP000.dat.gz'
with rasterio.open(vrt.format(vsi_path, ftp_path, file_path)) as src:
print(src.profile)


Re: Open EHdr raster with hdr header file from memory

Luke
 

You could use a raw raster VRT combined with the /vsitar and /vsigzip virtual filesystems (you could probably use /vsicurl as well to replace your FTP code, but I couldn't get that to work immediately so gave up).


import rasterio

vrt = '''
<VRTDataset rasterXSize="6935" rasterYSize="3351">
    <SRS>EPSG:4326</SRS>
    <GeoTransform>-124.729583333333,0.00833333333333333,0,52.8704166666666,0,-0.00833333333333333</GeoTransform>
    <VRTRasterBand dataType="Int16" band="1" subClass="VRTRawRasterBand">
        <SourceFilename relativetoVRT="0">/vsigzip//vsitar/{}</SourceFilename>
        <NoDataValue>-9999</NoDataValue>
        <ImageOffset>0</ImageOffset>
        <PixelOffset>2</PixelOffset>
        <LineOffset>13870</LineOffset>
        <ByteOrder>MSB</ByteOrder>
    </VRTRasterBand>
</VRTDataset>
'''
file_path = '/path/to/SNODAS_20210101.tar/us_ssmv11050lL00T0024TTNATS2021010105DP000.dat.gz'
with rasterio.open(vrt.format(file_path)) as src:
    print(src.profile)

On Tue., 29 Jun. 2021, 08:27 , <fkluibenschaedl@...> wrote:

Hi All,

I am processing SNODAS data archives that store each variable in a single-band ESRI BIL format raster. rasterio.open() works like a charm when a) fp references a file on disk and b) when a hdr (header) file exists in that same location. (I am attaching a working set of files for reference).

I am curious if there is a way to load the raster from memory though. Since each eraster within the archive aforementioned archive is compressed using gzip, The goal is to download the dataset, unpack the .bil file, open it with the matching header information, and store it as e.g. GeoTIFF for further processing. Below is some code that (obviously) fails when it passes a tuple to the rasterio.open fp argument.

Looking forward to your thoughts on this. Thanks in advance,

Florian

import gzip
import tarfile
import rasterio

from ftplib import FTP
from io import BytesIO
from io import StringIO

header_values = {'byteorder': 'M',
                 'layout': 'bil',
                 'nbands': 1,
                 'nbits': 16,  # i.e. 2 bytes indicated in meta info text
                 'ncols': 6935,
                 'nodata_value': -9999,
                 'nrows': 3351,
                 'pixeltype': 'signedint',
                 'ulxmap': -124.729583333333,
                 'ulymap': 52.8704166666666,
                 'xdim': 0.00833333333333333,
                 'ydim': 0.00833333333333333
                 }


HOST = "sidads.colorado.edu"
ftp_path = 'DATASETS/NOAA/G02158/masked/2021/01_Jan/SNODAS_20210101.tar'
data_file_name = 'us_ssmv11034tS__T0001TTNATS2021010105HP001.dat.gz'

dl_file = BytesIO()
ftp = FTP(HOST)
ftp.login()
# ftp.login('anonymous',password)
ftp.retrbinary('RETR ' + ftp_path, dl_file.write)
dl_file.seek(0)


with tarfile.open(fileobj=dl_file, mode='r:') as tar_handle:
    gz_data_file = tar_handle.extractfile(member=data_file_name)
    with gzip.open(gz_data_file) as data_file:
        with StringIO() as hdr_file:
            for field, value in header_values.items():
                hdr_file.write(f"{field} {value}\n")
            with rasterio.open((data_file, hdr_file), driver="EHdr") as src_ds:
                out_profile = src_ds.profile
                out_profile.update(driver="GTiff")
                out_profile.update(compress="lzw")
                with rasterio.open('out.tiff', "w", **out_profile) as out_ds:
                    out_ds.write(src_ds.read(1), 1)


Re: Help with rasterio.transfrom.xy

Sean Gillies
 

Hi,

The xy method always returns projected coordinates. If you want long and lat, you must reproject the xy values using https://rasterio.readthedocs.io/en/latest/api/rasterio.warp.html?#rasterio.warp.transform.


On Tue, Jun 29, 2021 at 11:55 AM Cooney,Tom (ECCC) <Tom.Cooney@...> wrote:

Hi,

We are working on a process that extracts values from raster data and returns them as GeoJSON geometries. Our use cases are point, line, and polygon. 


For the line use case, we are trying to return the values and coordinates for each raster cell intersected by the line. We would also like to have the coordinates returned in geographic coordinates. To do this we are trying to use rasterio.transform.xy.

 

The problem we are encountering is that rasterio.transform.xy is not returning the coordinates in lat/long but rather in their native projection. We think this is caused by using the incorrect transform (Affine.affine) parameter.

 

Our setup is below. Note in this case shapes is a GeoJSON LineString:

```
with rasterio.open(raster_path) as src:

       out_image, out_transform = rasterio.mask.mask(src, shapes, crop=True)

       with MemoryFile() as memfile:

               with memfile.open(driver="GTiff", height=out_image.shape[1], width=out_image.shape[2], count=1,

                                      dtype=rasterio.float64, transform=out_transform) as dataset:

                        dataset.write(out_image)

                        ds = dataset.read()

                        xs_to_mod = list(range(0, ds.shape[1]))

                        ys_to_mod = list(range(0, ds.shape[2]))

                        xs, ys = rasterio.transform.xy(out_transform, xs_to_mod, ys_to_mod, offset='center')

```
Thank you

,_

--
Sean Gillies


Re: Proposal: Allow other cloud object store providers

Sean Gillies
 

Hi Ashley,

On Fri, Jun 25, 2021 at 6:07 PM <ashley.sommer@...> wrote:

Thanks for your reply, Sean.

> are you proposing that potentially any cloud platform would be made first class in rasterio, concretely, as in GDAL and as with rasterio/S3 today?

Yes, thats right.

> What would you think if rasterio were to take the opposite approach and require users to write ~5 lines of code themselves to adapt output of, say, keystonev3 and swiftclient to a standard interface in rasterio

Yeah, I actually had the same thought too, but wasn't sure if it would be received well.

That is actually kind-of how you can do it in rasterio already. You can manually create a `keystonev3` or `swiftclient` auth session, and populate it with your credentials. You can then manually create a rasterio `SwiftClient`, and give it that auth session. Then pass the pre-configured `SwiftClient` into `session.Env()` and rasterio will use that as the Cloud Session. Its a bit of boilerplate code, but it works across all of the existing `Session` subclasses for the different cloud platforms already.

Should that be the "standard" way of doing it? Could it be cleaner? Would AWS S3 still be a special case?

Yes, I think that should be the standard way. AWS would have to remain a special case until we deprecate the special parameters properly, but we can already do what you outlined above for AWS. 

My end goal is to be able to use OpenStack Swift ObjectStore as a storage backend for an opendatacube project. Opendatacube only supports AWS S3 for now, because it relies on rasterio's "first-class" interface to S3. I was told, if I want to get other cloud providers working natively in opendatacube, we need them to be fully supported by rasterio first (as you mentioned, they're already fully supported by GDAL).

Ashley Sommer

I'm not familiar with opendatacube, but I think it should be possible to use Swift now without modifying either opendatacube or rasterio. GDAL already has support for 4 different authentication mechanisms (see https://gdal.org/user/virtual_file_systems.html#vsiswift) and as far as I know all of those options can be set using similarly named environmental variables, not only through the GDAL API as rasterio does here: https://github.com/mapbox/rasterio/blob/db03b66e81b489d3f5f01c9edfb6fc720250a2c1/rasterio/env.py#L233-L246. For example, one could call rasterio.session.SwiftSession(...).get_credential_options() and then update os.environ with the result.

I hope this is a useful suggestion and not a red herring,

--
Sean Gillies


Help with rasterio.transfrom.xy

Cooney,Tom (ECCC)
 

Hi,

We are working on a process that extracts values from raster data and returns them as GeoJSON geometries. Our use cases are point, line, and polygon. 


For the line use case, we are trying to return the values and coordinates for each raster cell intersected by the line. We would also like to have the coordinates returned in geographic coordinates. To do this we are trying to use rasterio.transform.xy.

 

The problem we are encountering is that rasterio.transform.xy is not returning the coordinates in lat/long but rather in their native projection. We think this is caused by using the incorrect transform (Affine.affine) parameter.

 

Our setup is below. Note in this case shapes is a GeoJSON LineString:

```
with rasterio.open(raster_path) as src:

       out_image, out_transform = rasterio.mask.mask(src, shapes, crop=True)

       with MemoryFile() as memfile:

               with memfile.open(driver="GTiff", height=out_image.shape[1], width=out_image.shape[2], count=1,

                                      dtype=rasterio.float64, transform=out_transform) as dataset:

                        dataset.write(out_image)

                        ds = dataset.read()

                        xs_to_mod = list(range(0, ds.shape[1]))

                        ys_to_mod = list(range(0, ds.shape[2]))

                        xs, ys = rasterio.transform.xy(out_transform, xs_to_mod, ys_to_mod, offset='center')

```
Thank you


Open EHdr raster with hdr header file from memory

fkluibenschaedl@...
 

Hi All,

I am processing SNODAS data archives that store each variable in a single-band ESRI BIL format raster. rasterio.open() works like a charm when a) fp references a file on disk and b) when a hdr (header) file exists in that same location. (I am attaching a working set of files for reference).

I am curious if there is a way to load the raster from memory though. Since each eraster within the archive aforementioned archive is compressed using gzip, The goal is to download the dataset, unpack the .bil file, open it with the matching header information, and store it as e.g. GeoTIFF for further processing. Below is some code that (obviously) fails when it passes a tuple to the rasterio.open fp argument.

Looking forward to your thoughts on this. Thanks in advance,

Florian

import gzip
import tarfile
import rasterio

from ftplib import FTP
from io import BytesIO
from io import StringIO

header_values = {'byteorder': 'M',
                 'layout': 'bil',
                 'nbands': 1,
                 'nbits': 16,  # i.e. 2 bytes indicated in meta info text
                 'ncols': 6935,
                 'nodata_value': -9999,
                 'nrows': 3351,
                 'pixeltype': 'signedint',
                 'ulxmap': -124.729583333333,
                 'ulymap': 52.8704166666666,
                 'xdim': 0.00833333333333333,
                 'ydim': 0.00833333333333333
                 }


HOST = "sidads.colorado.edu"
ftp_path = 'DATASETS/NOAA/G02158/masked/2021/01_Jan/SNODAS_20210101.tar'
data_file_name = 'us_ssmv11034tS__T0001TTNATS2021010105HP001.dat.gz'

dl_file = BytesIO()
ftp = FTP(HOST)
ftp.login()
# ftp.login('anonymous',password)
ftp.retrbinary('RETR ' + ftp_path, dl_file.write)
dl_file.seek(0)


with tarfile.open(fileobj=dl_file, mode='r:') as tar_handle:
    gz_data_file = tar_handle.extractfile(member=data_file_name)
    with gzip.open(gz_data_file) as data_file:
        with StringIO() as hdr_file:
            for field, value in header_values.items():
                hdr_file.write(f"{field} {value}\n")
            with rasterio.open((data_file, hdr_file), driver="EHdr") as src_ds:
                out_profile = src_ds.profile
                out_profile.update(driver="GTiff")
                out_profile.update(compress="lzw")
                with rasterio.open('out.tiff', "w", **out_profile) as out_ds:
                    out_ds.write(src_ds.read(1), 1)


Re: Which version of rasterio support works with numpy==1.19.2

Sean Gillies
 

Hi,

Rasterio is compatible with the "oldest supported numpy" version defined in https://github.com/scipy/oldest-supported-numpy/blob/master/setup.cfg. This means that rasterio wheels on PyPI are built using that version of numpy. They remain forward compatible with newer versions.

On Mon, Jun 28, 2021 at 7:26 AM Souhail Aboulfadile <1s.aboulfadile@...> wrote:
I am using tensorflow 2.5 which requires a specific version of  NumPy 1.19.2.
I am also using rasterio. Which version of rasterio that requires the same version of numpy ? 



--
Sean Gillies


Which version of rasterio support works with numpy==1.19.2

1s.aboulfadile@...
 

I am using tensorflow 2.5 which requires a specific version of  NumPy 1.19.2.
I am also using rasterio. Which version of rasterio that requires the same version of numpy ? 


Re: Creating 12GB Create very large empty geotif with Rasterio without using large amounts of memory

Hobart, Geordie (NRCan/RNCan)
 

Thanks Luke. And everyone else for all great the suggestions. Much appreciated.

 

From: main@rasterio.groups.io <main@rasterio.groups.io> On Behalf Of Luke
Sent: June 25, 2021 18:10
To: main@rasterio.groups.io
Subject: Re: [rasterio] Creating 12GB Create very large empty geotif with Rasterio without using large amounts of memory

 

You don't need to pre-populate a dataset with a massive empty array, you can just open it in write mode and write chunks of data via windowed writing:

profile = {

    'driver': 'GTiff',

    'tiled': True,

    'height': huge_number,

    'width': huge_number,

    'count': 1,

    'dtype': 'float32',

    'blockxsize': 256,

    'blockysize': 256,

    'compress': 'LZW'

}

 

with rasterio.open('massive.tif', 'w', **profile) as dst:

    for window in dst.block_windows():

        dst.write(small_data_chunk, window=window, indexes=1)


Re: Creating 12GB Create very large empty geotif with Rasterio without using large amounts of memory

Luke
 

You don't need to pre-populate a dataset with a massive empty array, you can just open it in write mode and write chunks of data via windowed writing:

profile = {
    'driver': 'GTiff',
    'tiled': True,
    'height': huge_number,
    'width': huge_number,
    'count': 1,
    'dtype': 'float32',
    'blockxsize': 256,
    'blockysize': 256,
    'compress': 'LZW'
}
 
with rasterio.open('massive.tif', 'w', **profile) as dst:
    for window in dst.block_windows():
        dst.write(small_data_chunk, window=window, indexes=1)



Re: Proposal: Allow other cloud object store providers

ashley.sommer@...
 

Thanks for your reply, Sean.

> are you proposing that potentially any cloud platform would be made first class in rasterio, concretely, as in GDAL and as with rasterio/S3 today?

Yes, thats right.

> What would you think if rasterio were to take the opposite approach and require users to write ~5 lines of code themselves to adapt output of, say, keystonev3 and swiftclient to a standard interface in rasterio

Yeah, I actually had the same thought too, but wasn't sure if it would be received well.

That is actually kind-of how you can do it in rasterio already. You can manually create a `keystonev3` or `swiftclient` auth session, and populate it with your credentials. You can then manually create a rasterio `SwiftClient`, and give it that auth session. Then pass the pre-configured `SwiftClient` into `session.Env()` and rasterio will use that as the Cloud Session. Its a bit of boilerplate code, but it works across all of the existing `Session` subclasses for the different cloud platforms already.

Should that be the "standard" way of doing it? Could it be cleaner? Would AWS S3 still be a special case?

My end goal is to be able to use OpenStack Swift ObjectStore as a storage backend for an opendatacube project. Opendatacube only supports AWS S3 for now, because it relies on rasterio's "first-class" interface to S3. I was told, if I want to get other cloud providers working natively in opendatacube, we need them to be fully supported by rasterio first (as you mentioned, they're already fully supported by GDAL).

Ashley Sommer


Re: Creating 12GB Create very large empty geotif with Rasterio without using large amounts of memory

Hobart, Geordie (NRCan/RNCan)
 

Thanks Yon.

I want to avoid making the 12 GB numpy zero array and the associated memory demand so solution two is the best bet.

Greatly appreciated.

Geordie

 

From: main@rasterio.groups.io <main@rasterio.groups.io> On Behalf Of yon.davies@...
Sent: June 25, 2021 08:44
To: main@rasterio.groups.io
Subject: Re: [rasterio] Creating 12GB Create very large empty geotif with Rasterio without using large amounts of memory

 

Hey Hobart,

I've had success using Rasterio's windowed reading and writing to create a huge raster in chunks. You could do something like this:

# Create blank raster that will hold the data. Make sure the dst_profile has compression to make this small

with rio.open(r1, "w", **dst_profile) as dst:

       dst.write(np.zeros(shape=(x, y, z), height, width), dtype=dst_dtype))

# Get block windows from this blank raster

with rio.open(r1, "r") as src:
      block_windows = list(src.block_windows())

      for j, window in enumerate(block_windows):
            dst.write(stacked, window=window[1])

The block_windows are automatically created and usually, but you can play around and create rasterio.windows.Window() with a size of your choosing.

Creating the initial blank raster will require that you have enough memory to hold the np.zeros, which may not be true. In that case, you can create the initial blank raster on the disk using gdal_create in command line or in Python:

import gdal

import osr

driver = gdal.GetDriverByName('GTiff')
ds = driver.Create(outfn, xsize, ysize, nbands, dtype, options=['COMPRESS=LZW', 'TILED=YES'])


      


Re: Creating 12GB Create very large empty geotif with Rasterio without using large amounts of memory

Hobart, Geordie (NRCan/RNCan)
 

Thank you Alan. Good to know. Kind regards. Geordie

 

From: main@rasterio.groups.io <main@rasterio.groups.io> On Behalf Of Alan Snow
Sent: June 25, 2021 09:06
To: main@rasterio.groups.io
Subject: Re: [rasterio] Creating 12GB Create very large empty geotif with Rasterio without using large amounts of memory

 

Reading & writing with windows can be done with rioxarray (rasterio + xarray):

import rioxarray

rds = rioxarray.open_rasterio("input.tif")
rds.rio.to_raster("output.tif", windowed=True)

You can also use dask to read/write in parallel:


Hope this helps,
Alan

21 - 40 of 828