Date   
Issue when using rasterio dataset inside Class with multiprocessing

luoyntech@...
 

I have a simple class which has a member variable of rasterio dataset. Inside the class, there is also a function wrapped by a python multiprocessing call. See below



This code is supposed to print 'Worker 0', 'Worker 1' and 'Worker 2'. However, when I actually ran this code, it printed nothing but exited normally.




I then tried to comment out the line reading the tif image using rasterio, which looks like this


This time it printed out text as I expected.




Is there any possible reason that causes this issue? Thanks!

Float precision on writing grid

Betman, Maarten
 

I'm trying to transform xyz data to a raster grid using rasterio

I've loaded the elevation data into a numpy array, the precision of the z values is 2 decimals and the array has dtype float64.

after loading the xyz data and putting elevation data in a grid the array zi has the following properties and example value in the array:

[in]  zi.shape

[out] (3796, 3557)

 

[in]  zi.dtype

[out] dtype('float64')

 

[in]  zi[1000,1000]

[out] -5.27

I'm trying to write the gridded data to an ESRI Ascii file using rasterio and the AAIGrid driver.

import rasterio

from rasterio.transform import from_origin

from fiona.crs import from_epsg

 

transform = from_origin(xmin, ymax, 0.5, 0.5)

 

new_dataset = rasterio.open('test1.asc', 'w', driver='AAIGrid',

                        height = zi.shape[0], width = zi.shape[1],

                        count=1, dtype=str(zi.dtype)

                        crs=from_epsg(32750),

                        transform=transform,

                        nodata = -9999)

 

new_dataset.write(zi, 1)

new_dataset.close()

The z grid is written to the ESRI Ascii file successfully, only the "precision" of the values is now 19 decimals. This makes the files unnecessarily large and slows down programs using the gridded files.

Header ESRI Ascii file and first 2 values:

ncols        3557

nrows        3796

xllcorner    765005.500000000000

yllcorner    9430016.000000000000

cellsize     0.500000000000

NODATA_value -9999

-0.029999999329447746277 -0.029999999329447746277

I've tried changing the datatype to float32 and added argument "precision = 2" to rasterio.open. Is their a way I can get the number of decimals down?

Re: Float precision on writing grid

Matthew Perry
 

You can control the precision of the AAIGrid output using the GDAL creation options DECIMAL_PRECISION (or SIGNIFICANT_DIGITS). You can specify these as case-insensitive kwargs like `rasterio.open(..., decimal_precision=3)`

Re: Issue when using rasterio dataset inside Class with multiprocessing

Matthew Perry
 

An open Rasterio dataset object should not be passed between multiple processes or threads; the underlying GDALDataset is not thread safe. Additionally, the dataset's lifecycle should be made explicit - either by explicitly calling .close or opening as a context manager (recommended).

It's not clear what your intention is with the `worker` function but I can see two ways to approach it, depending on your goal

if each process simply needs access to the array of data, I would read all of the data in __init__ and close out the dataset before invoking any parallel workers. Then you're just sharing a numpy array instead of a stateful dataset object.

    def __init__(self):
        with rasterio.open('/Users/mperry/work/rasterio/tests/data/RGB.byte.tif') as src:
            data = src.read()
 

if you need to read different parts of the dataset from each process, you should pass the dataset path and open/close the a new dataset within each thread. You can't share a dataset object between threads/procs but you can create multiple datasets pointing to the same resource.

Hope this helps.

Open a raster file with read and write permissions in rasterio

Amine Aboufirass
 

I have already asked this question in GIS stack exchange  and stack overflow but received no response:
 
I am trying to write a function which will help me construct a `rasterio` memory file and return the dataset reader object so I can then play around with it as needed. I am trying to avoid `with` statements because I would like to allow the user to read and write as they deem necessary. This is what I have so far:
 
 
    from rasterio import mask, features, warp
    from rasterio.io import MemoryFile
    import rasterio
    import numpy as np
 
    def create_memory_file(data, west_bound, north_bound, cellsize, driver='AAIGrid'):
        #data is a numpy array
        dtype = data.dtype
        shape = data.shape
        transform = rasterio.transform.from_origin(west_bound, north_bound, cellsize, cellsize)
        memfile = MemoryFile()
        dataset = memfile.open(driver=driver, width= shape[1], height = shape[0], transform=transform, count=1, dtype = dtype)
        dataset.write(data,1)
    
        return dataset
    def close_memory_file(memfile):
        memfile.close()
    
    
    data = np.array([[1,2,3], [4,5,6], [7,8,9]])
    memfile = create_memory_file(data, 0, 2, 0.5)
    memfile.read(1)
 
The last line throws an error:
 
    Traceback (most recent call last):
      File "D:/11202750-002_RA2CE/Basis/common.py", line 214, in <module>
        memfile.read(1)
      File "rasterio\_io.pyx", line 209, in rasterio._io.DatasetReaderBase.read
    rasterio.errors.UnsupportedOperation: not readable
 
The memfile apparently only has write permissions because of the following:
 
    >>> memfile
    <open BufferedDatasetWriter name='/vsimem/3c0572dd-a295-4f2b-a65d-22a404dd5d0f.' mode='w'>
    >>> memfile.mode
    'w'
 
I have tried various combinations of adding `mode=r+`, `permissions=r+` to my `memfile.open()` call but it won't allow this.
 
How can I add the read/write mode at the level of the `open` statement?
 
 
  [1]: https://gis.stackexchange.com/questions/297562/open-a-raster-file-with-read-and-write-permissions-in-rasterio

Re: Open a raster file with read and write permissions in rasterio

Sean Gillies
 

Hi,

On Tue, Oct 2, 2018 at 8:14 AM Amine Aboufirass <amine.aboufirass@...> wrote:
I have already asked this question in GIS stack exchange  and stack overflow but received no response:
 
I am trying to write a function which will help me construct a `rasterio` memory file and return the dataset reader object so I can then play around with it as needed. I am trying to avoid `with` statements because I would like to allow the user to read and write as they deem necessary. This is what I have so far:
 
 
    from rasterio import mask, features, warp
    from rasterio.io import MemoryFile
    import rasterio
    import numpy as np
 
    def create_memory_file(data, west_bound, north_bound, cellsize, driver='AAIGrid'):
        #data is a numpy array
        dtype = data.dtype
        shape = data.shape
        transform = rasterio.transform.from_origin(west_bound, north_bound, cellsize, cellsize)
        memfile = MemoryFile()
        dataset = memfile.open(driver=driver, width= shape[1], height = shape[0], transform=transform, count=1, dtype = dtype)
        dataset.write(data,1)
    
        return dataset
    def close_memory_file(memfile):
        memfile.close()
    
    
    data = np.array([[1,2,3], [4,5,6], [7,8,9]])
    memfile = create_memory_file(data, 0, 2, 0.5)
    memfile.read(1)
 
The last line throws an error:
 
    Traceback (most recent call last):
      File "D:/11202750-002_RA2CE/Basis/common.py", line 214, in <module>
        memfile.read(1)
      File "rasterio\_io.pyx", line 209, in rasterio._io.DatasetReaderBase.read
    rasterio.errors.UnsupportedOperation: not readable
 
The memfile apparently only has write permissions because of the following:
 
    >>> memfile
    <open BufferedDatasetWriter name='/vsimem/3c0572dd-a295-4f2b-a65d-22a404dd5d0f.' mode='w'>
    >>> memfile.mode
    'w'
 
I have tried various combinations of adding `mode=r+`, `permissions=r+` to my `memfile.open()` call but it won't allow this.
 
How can I add the read/write mode at the level of the `open` statement?
 
 

Currently, the datasets in a MemoryFile can only be opened in read or write modes. Instances created with no initial bytes have their mode fixed to "w" and instances created with initial bytes have their mode fixed to "r". I could explore changing these to "w+" and "r+" instead for the next version of Rasterio.

That said, few data formats can be used in this way. The ascii grid format in your example, for one, cannot be updated in place by GDAL or Rasterio. It has a "no" in the "Creation" column of the table in https://www.gdal.org/formats_list.html and like other such formats is read-only or write-on-copy only.

--
Sean Gillies

Rasterio 1.0.8

Sean Gillies
 

Hi all,

Rasterio 1.0.8 has been uploaded to PyPI. Thanks to Denis Rykov, we have Python 3.4 wheels for manylinux1 again. And datasets stored in MemoryFile bytes are now opened in r+ or w+ mode as requested by Amine Aboufirass earlier today.

--
Sean Gillies

GDAL version policy

Denis Rykov
 

Hello folks,

What is the policy of the project about GDAL version?
The latest stable release of GDAL is 2.3.2 but recent wheels of rasterio  are built onto 2.3.1.

Re: GDAL version policy

Sean Gillies
 

Hi Denis,

On Wed, Oct 3, 2018 at 1:50 AM Denis Rykov <rykovd@...> wrote:
Hello folks,

What is the policy of the project about GDAL version?
The latest stable release of GDAL is 2.3.2 but recent wheels of rasterio  are built onto 2.3.1

Good question! My intent for the wheels is to make it easy for people to try Rasterio and to be able to easily provision CI servers for testing Python software that requires Rasterio. There isn't really any policy yet except that I do keep the GDAL and other libraries versions under revision control so that they don't regress and try to keep them not too far out of date.

I will happily accept PRs at https://github.com/sgillies/rasterio-wheels to update libraries to the latest stable versions. As to format drivers, I would like to minimize drivers to keep the wheels under 20MB and keep wheel build times to under 20 mins.

--
Sean Gillies

Re: GDAL version policy

Denis Rykov
 

Thank you Sean!

I made a PR at rasterio-wheels repository which updates GDAL to the latest stable version.

Besides this we discovered that we need a rasterio wheels built against a master version of GDAL. What do you advise us in this case? Does it make sense having such wheels on PyPI?

Windowed reads and coordinates

Koshy Thomas
 

Hi all

I'm trying to create vector features out of raster files using the `features.shapes` module and I'm running into a weird issue where when I run a windowed read on a `.tif`, the Y coordinates of the polygons are thrown off while the X coordinates seem correct. If I run the same logic without a windowed read (against the whole tif), it seems to work correctly. Any ideas? I'm attaching the .tif and the code screenshot is below:

Re: Windowed reads and coordinates

Koshy Thomas
 

Turns out the transform object needs to actually be

transform = rasterio.windows.transform(window, dataset.transform)

for windowed reads. 

Rasterio 1.0.9

Sean Gillies
 

Hi all.

Rasterio 1.0.9 is on PyPI now. It has a bunch of bug fixes that are especially important if you are deploying systems on AWS or using the WarpedVRT class.


Thanks for the bug reports, analysis, and patience, everybody!

--
Sean Gillies

Any way to get/set the raster band scale and offset?

Kris Vanhoof
 

Hi! I am using rasterio to convert a number of geotiffs that were originally
created using GDAL.  To preserve the image metadata I'm currently using a combination of
the 'profile', 'descriptions' and 'units' properties of DatasetBase as well as 'tags()'.

However, it is unclear to me how to get and set the 'Scale' and 'Offset' value for a raster band.

Since there is a 'units' property in 'DatasetBase', I would have expected there to be a 'scales' and 'offsets'.

Likewise, in 'DatasetWriterBase' there is a 'set_band_unit()', but no 'set_band_offset()' or 'set_band_scale()'.

The GDAL functions that I would normally use are:

GDALGetRasterUnitType()
GDALGetRasterScale()
GDALGetRasterOffset()

GDALSetRasterUnitType()
GDALSetRasterScale()
GDALSetRasterOffset()

Is there any way to access these values using rasterio?

Re: Any way to get/set the raster band scale and offset?

Sean Gillies
 

Hi Kris,

Raster scale and offset don't surface in Rasterio versions 1.0.9 and older only because no user has spoken up them. I'd be happy to add these for 1.0.10.


On Mon, Oct 29, 2018 at 4:22 PM Kris Vanhoof <kris.vanhoof@...> wrote:
Hi! I am using rasterio to convert a number of geotiffs that were originally
created using GDAL.  To preserve the image metadata I'm currently using a combination of
the 'profile', 'descriptions' and 'units' properties of DatasetBase as well as 'tags()'.

However, it is unclear to me how to get and set the 'Scale' and 'Offset' value for a raster band.

Since there is a 'units' property in 'DatasetBase', I would have expected there to be a 'scales' and 'offsets'.

Likewise, in 'DatasetWriterBase' there is a 'set_band_unit()', but no 'set_band_offset()' or 'set_band_scale()'.

The GDAL functions that I would normally use are:

GDALGetRasterUnitType()
GDALGetRasterScale()
GDALGetRasterOffset()

GDALSetRasterUnitType()
GDALSetRasterScale()
GDALSetRasterOffset()

Is there any way to access these values using rasterio?



--
Sean Gillies

Re: Any way to get/set the raster band scale and offset?

Kris Vanhoof
 

That would be great, thanks a lot!

Make _boundless_vrt_doc public

Denis Rykov
 

Hi,

We are going to use rasterio.vrt._boundless_vrt_doc for making a XML VRT out of dataset in our library.
But we are little afraid because the name of this function tells that it is a private one.
Are there any reasons for this? What do you think about to make this function public?

Re: Make _boundless_vrt_doc public

Sean Gillies
 

Hi Denis,

I'm not prepared to make it public yet. I'm not confident that the signature of the method is finished and I believe we should have a discussion on the dev group about VRTs and how much support we'll have for them in Rasterio's public API.

For now, I recommend copying it into your own project.


On Wed, Oct 31, 2018 at 9:39 AM Denis Rykov <rykovd@...> wrote:
Hi,

We are going to use rasterio.vrt._boundless_vrt_doc for making a XML VRT out of dataset in our library.
But we are little afraid because the name of this function tells that it is a private one.
Are there any reasons for this? What do you think about to make this function public?



--
Sean Gillies

Unclear how to implement image alignment based on GCPs

mmweene15@...
 

I am trying to replicated the gdal functionality for performing image alignment based on GCPs. The equivalent gdal commands are the following:

`gdal_translate -of GTiff -gcp GCP1 -gcp GCP2... src_raster dst_raster`
`gdalwarp -s_srs S_SRS -t_srs T_SRS -tps dst_raster dst_raster2`

This would create a new raster, `dst_raster2` that has been reprojected according to the specified GCPs. It is unclear, based on the documentation, how to achieve this using the rasterio.warp module. I have tried using the rasterio.warp.reproject function to no avail. I've included a snippet of code below. The resulting image does not match the image produced by gdal. Any guidance would be appreciated. Note that the variables are descriptively named.

with rasterio.open(target_image_path, 'r') as f:
target_profile = f.profile
target_image = image_tools.rasterio_read_bands(f)
print(target_profile)

destination = np.empty(target_image.shape, dtype=target_profile['dtype'])
resolution = (target_profile['transform'].a, target_profile['transform'].e)
transform, width, height = rasterio_warp.calculate_default_transform(src_crs,
dst_crs,
gcps=gcps,
resolution=resolution,
height=target_profile['height'],
width=target_profile['width'])

rasterio_warp.reproject(source=target_image,
destination=destination,
gcps=gcps,
src_crs=src_crs,
dst_crs=src_crs,
# src_transform=target_profile['transform'],
dst_transform=target_profile['transform'],
resampling=getattr(enums.Resampling, resampling_method),
num_threads=num_threads)

Re: Unclear how to implement image alignment based on GCPs

Sean Gillies
 

Hi,

I'm having trouble understanding the intent of code. It looks like you are calling reproject() with source and destination arrays of the same shape, with the same coordinate reference system on each side of the reprojection. I'm not what we should expect in this situation; we are not currently testing for this. Can you write back and show your GCPs?