Date   

Re: Handle to underlying GDAL dataset

Sean Gillies
 

Hi Søren,

It's not possible for a GDAL dataset handle to be shared between rasterio and GDAL's Python bindings. I discourage people from combining these modules. Unless one is extremely careful and understands the internals of each, it's a great way to create perplexing bugs.

Rasterio's dataset.read() works at a higher level than GDAL's DatasetReadAsArray(). The latter is effectively the C function that rasterio uses internally. Rasterio has data type and nodata checking overhead that GDAL's function lacks and perhaps some logic that could be more optimized. Is the overhead really 30%? Would you be willing to use Python's timeit to race the last two lines of your code against each other? I would expect the difference to diminish as the size of the window grows.


On Tue, May 4, 2021 at 7:15 AM soren.rasmussen via groups.io <soren.rasmussen=alexandra.dk@groups.io> wrote:
Hi!

I would like to get a handle for the underlying GDAL dataset behind a RasterIO dataset.
Is this possible somehow?
 
The reason I want to do this is that, in some cases, GDAL operations are significantly faster. I have a case, where reading a subset of channels in a window with the just-released GDAL 3.3.0 is approximately 25% faster than RasterIO.
It would be nice to be able to use GDAL operations without having to open the dataset with both RasterIO and GDAL


from osgeo import gdal, gdal_array

rio_ds = rasterio.open(path)

gdal_ds = gdal.Open(path)
bands = [1,2,3]
resampling = rasterio.enums.Resampling.bilinear
win = ds.window(*bbox)
xoff, yoff, xsize, ysize = map(lambda x: int(round(x)), win.col_off, win.row_off, win.width, win.height)
rio_img = rio_ds.read(window=win, indexes=bands, out_shape=out_shape, resampling=resampling)
gdal_img = gdal_array.DatasetReadAsArray(gdal_ds, xoff=xoff, yoff=yoff, win_xsize=xsize, win_ysize=ysize, buf_xsize=out_shape[1], buf_ysize=out_shape[0], resample_alg=resampling, band_list=bands)
(Note that band_list is new in GDAL 3.3.0)

Best, Søren



--
Sean Gillies


Handle to underlying GDAL dataset

soren.rasmussen@...
 

Hi!

I would like to get a handle for the underlying GDAL dataset behind a RasterIO dataset.
Is this possible somehow?
 
The reason I want to do this is that, in some cases, GDAL operations are significantly faster. I have a case, where reading a subset of channels in a window with the just-released GDAL 3.3.0 is approximately 25% faster than RasterIO.
It would be nice to be able to use GDAL operations without having to open the dataset with both RasterIO and GDAL


from osgeo import gdal, gdal_array

rio_ds = rasterio.open(path)

gdal_ds = gdal.Open(path)
bands = [1,2,3]
resampling = rasterio.enums.Resampling.bilinear
win = ds.window(*bbox)
xoff, yoff, xsize, ysize = map(lambda x: int(round(x)), win.col_off, win.row_off, win.width, win.height)
rio_img = rio_ds.read(window=win, indexes=bands, out_shape=out_shape, resampling=resampling)
gdal_img = gdal_array.DatasetReadAsArray(gdal_ds, xoff=xoff, yoff=yoff, win_xsize=xsize, win_ysize=ysize, buf_xsize=out_shape[1], buf_ysize=out_shape[0], resample_alg=resampling, band_list=bands)
(Note that band_list is new in GDAL 3.3.0)

Best, Søren


Re: Need Help

Sean Gillies
 

Hello,

I don't know about MSAVI2, but once you have a numpy array of 0 and 1, you can mask another array using https://numpy.org/doc/stable/reference/generated/numpy.where.html. Rasterio only reads and writes datasets, it doesn't have built in analysis functions. All that is left up to the user and numpy.

For band names, if you create a new dataset in "w" mode or open an existing dataset in "r+" mode, you can pass a sequence of strings as shown in https://github.com/mapbox/rasterio/blob/master/tests/test_descriptions.py#L14.

On Sat, May 1, 2021 at 7:05 AM <serignemansour.diene@...> wrote:
First : I want to know how could i change the name of raster images’s band? Because i have a multispectral image with 8 band and the name is none when i do image.descrptions. 

Second : 
On this multispectral image, I want to compute an MSAVI2 spectral index and extract all non-ground pixels to mask another image.

MSAVI2 allows to mask the ground by giving it values 0 or 1.

Best regards

--
Sean Gillies


Need Help

serignemansour.diene@...
 

First : I want to know how could i change the name of raster images’s band? Because i have a multispectral image with 8 band and the name is none when i do image.descrptions. 

Second : 
On this multispectral image, I want to compute an MSAVI2 spectral index and extract all non-ground pixels to mask another image.

MSAVI2 allows to mask the ground by giving it values 0 or 1.

Best regards


snippet for minimum bounding snapped window?

Gregory, Matthew
 

Hi all,

Often I'm given unsnapped coordinates and want to find the minimum bounding snapped box based on a rasterio Dataset. I'm guessing someone here has a better way of doing it or I'm missing a method somewhere.

Here's a function with some comments:

def get_minimum_snapped_extent(src, bounds):
# Get window from bounds (assume bounds is sequence of
# left, bottom, right, top
w1 = src.window(*bounds)

# Round to include top-left corner (op="floor" is default)
w2 = w1.round_offsets()

# Union these to get a new window
w3 = rasterio.windows.union(w1, w2)

# Round shape to buffer to bottom-right corner
w4 = w3.round_shape(op="ceil")

# Return the coordinates associated with this window
return src.window_bounds(w4)

Obviously, I don't need all the temp variables, but is there a more succinct way of doing this?

thanks, matt


Rasterio 1.2.2

Sean Gillies
 

Hi all,

Rasterio 1.2.2 source distribution and wheels for selected platforms are on PyPI now. The wheels include GDAL 3.2.2 and PROJ 7.2.1. See 

--
Sean Gillies


Re: Help Saving .Tif File after Mask

Simon Tarr
 
Edited

Hi Luke, thanks for the reply. Your code has helped me move past my previous error.
I had to delete the indexes argument when calling `write' but apart from that, it works like a charm.

Thank you for your help!


On Wed, Apr 7, 2021 at 02:41 AM, Luke wrote:
with rasterio.open('/data/my_output.tif', 'w', 
               driver='GTiff',
               height=masked_raster.shape[1], 
               width=masked_raster.shape[2], 
               count=1, 
               dtype=rasterio.ubyte, 
               crs=file.crs, 
               transform=transform
              ) as outfile:
            outfile.write(masked_raster, indexes=1) 


Re: Using rasterio to mask/crop a raster results in AttributeError

Luke
 

You can also use json:

import json
poly = json.loads(obj.job_loc.geojson)

 

 


Re: Help Saving .Tif File after Mask

Luke
 
Edited

rasterio.mask.mask returns a two element tuple - element 0 is the array, element 1 is the transform.  Try something like (untested):

# Mask Raster
masked_raster, transform = rasterio.mask.mask(file, poly, crop=True, filled=True, invert=False)
 
with rasterio.open('/data/my_output.tif', 'w', 
               driver='GTiff',
               height=masked_raster.shape[1], 
               width=masked_raster.shape[2], 
               count=1, 
               dtype=rasterio.ubyte, 
               crs=file.crs, 
               transform=transform
              ) as outfile:
            outfile.write(masked_raster, indexes=1) 
 


Help Saving .Tif File after Mask

Simon Tarr
 

I have just used rasterio.mask.mask() to crop a larger raster using a Django PolygonField() polygon:

# Open large raster file
file = rasterio.open('/data/corine2018_100m/test.tif') print(file) <open DatasetReader name='/data/corine2018_100m/test.tif' mode='r'>

# Dummy Django model object
obj = newJob.objects.create(job_loc="SRID=4326;POLYGON ((0.9063720703125029 52.32023207609735, 0.8239746093749998 52.10819209746323, 1.170043945312496 52.14191683166823, 1.170043945312496 52.31351619974807, 0.9063720703125029 52.32023207609735))")poly = ast.literal_eval(obj.job_loc.geojson)

# Grab PolygonField(), serialise to GeoJSON, put into a list
poly = ast.literal_eval(obj.job_loc.geojson)
poly = [poly]
print(poly)
[{'type': 'Polygon', 'coordinates': [[[0.906372070312503, 52.32023207609735], [0.823974609375, 52.10819209746323], [1.170043945312496, 52.14191683166823], [1.170043945312496, 52.31351619974807], [0.906372070312503, 52.32023207609735]]]}]

# Mask Raster
masked_raster = rasterio.mask.mask(file, poly, crop=True, filled=True, invert=False)
print(masked_raster)
(array([[[ 0, 0, 0, ..., 0, 0, 0],
[ 0, 0, 0, ..., 0, 0, 0],
[ 0, 0, 0, ..., 0, 0, 0],
...,
[ 0, 12, 12, ..., 0, 0, 0],
[ 0, 12, 12, ..., 0, 0, 0],
[ 0, 0, 0, ..., 0, 0, 0]]], dtype=uint8), Affine(0.0012806021160224688, 0.0, 0.8235730268502728,
0.0, -0.0012806021160261063, 52.32090505556345))

I now need to save the object masked_raster to disk. I have tried with the following code:

with rasterio.open('/data/my_output.tif', 'w', 
driver='GTiff',
height=masked_raster[0].shape[1],
width=masked_raster[0].shape[2],
count=1,
dtype=rasterio.ubyte,
crs=file.crs,
transform= # Not sure what this should map to!
) as outfile:
outfile.write(x, indexes=1)

But I get the error:

 File "rasterio/_io.pyx", line 1338, in rasterio._io.DatasetWriterBase.write
rasterio.errors.InvalidArrayError: Positional argument arr must be an array-like object

I'm clearly passing the wrong data from masked_raster and I don't understand what I'm supposed to input for the transform argument.
Any help would be gratefully received!

 


Re: Using rasterio to mask/crop a raster results in AttributeError

Simon Tarr
 

Thank you. I used the below to remove the quotes:

import ast
poly = ast.literal_eval(obj.job_loc.geojson)


Re: Using rasterio to mask/crop a raster results in AttributeError

Yves Moisan
 

Hi,


You have a len = 1 list the only element of which is a string.  That's what the error message tells you, essentially.  Get rid of the quotes and you'l have a one-item list the which is your dict :


 var2 = [{ "type": "Polygon", "coordinates": [ [ [ 0.906372070312503, 52.320232076097348 ], [ 0.823974609375, 52.108192097463231 ], [ 1.170043945312496, 52.141916831668233 ], [ 1.170043945312496, 52.313516199748072 ], [ 0.906372070312503, 52.320232076097348 ] ] ] }]
>>> type(var2)
<class 'list'>
>>> type(var2[0])
<class 'dict'>

Cheers,

I'm trying to use rasterio to mask a raster. According to the mask documentation, I need to load a raster file in 'r' mode, which I have done like so:

file = rasterio.open('/data/corine2018_100m/test.tif')
print(file)
<open DatasetReader name='/data/corine2018_100m/test.tif' mode='r'>

I also need a shapes which is an iterable GeoJSON-like object. I have GeoJSON serialised a Polygon object from a Django PolygonField() and added it to a list (my interpretation of the documentation's 'iterable object'):

obj = newJob.objects.create(job_loc="SRID=4326;POLYGON ((0.9063720703125029 52.32023207609735, 0.8239746093749998 52.10819209746323, 1.170043945312496 52.14191683166823, 1.170043945312496 52.31351619974807, 0.9063720703125029 52.32023207609735))")

poly = [obj.job_loc.geojson]
print(poly)
['{ "type": "Polygon", "coordinates": [ [ [ 0.906372070312503, 52.320232076097348 ], [ 0.823974609375, 52.108192097463231 ], [ 1.170043945312496, 52.141916831668233 ], [ 1.170043945312496, 52.313516199748072 ], [ 0.906372070312503, 52.320232076097348 ] ] ] }']

print(type(poly))
<class 'list'>

I then attempt the mask with:

masked_band, masked_transform = rasterio.mask.mask(file, poly, crop=True)

However, I get the error:

AttributeError: 'str' object has no attribute 'get'

I have tried following this (i.e. adding my JSON to a list) but I still get the error. Can someone help me mask this raster?



Using rasterio to mask/crop a raster results in AttributeError

Simon Tarr
 

I'm trying to use rasterio to mask a raster. According to the mask documentation, I need to load a raster file in 'r' mode, which I have done like so:

file = rasterio.open('/data/corine2018_100m/test.tif')
print(file)
<open DatasetReader name='/data/corine2018_100m/test.tif' mode='r'>

I also need a shapes which is an iterable GeoJSON-like object. I have GeoJSON serialised a Polygon object from a Django PolygonField() and added it to a list (my interpretation of the documentation's 'iterable object'):

obj = newJob.objects.create(job_loc="SRID=4326;POLYGON ((0.9063720703125029 52.32023207609735, 0.8239746093749998 52.10819209746323, 1.170043945312496 52.14191683166823, 1.170043945312496 52.31351619974807, 0.9063720703125029 52.32023207609735))")

poly = [obj.job_loc.geojson]
print(poly)
['{ "type": "Polygon", "coordinates": [ [ [ 0.906372070312503, 52.320232076097348 ], [ 0.823974609375, 52.108192097463231 ], [ 1.170043945312496, 52.141916831668233 ], [ 1.170043945312496, 52.313516199748072 ], [ 0.906372070312503, 52.320232076097348 ] ] ] }']

print(type(poly))
<class 'list'>

I then attempt the mask with:

masked_band, masked_transform = rasterio.mask.mask(file, poly, crop=True)

However, I get the error:

AttributeError: 'str' object has no attribute 'get'

I have tried following this (i.e. adding my JSON to a list) but I still get the error. Can someone help me mask this raster?



Is it possible to build a .vrt file from multiple files with Rasterio?

Pierrick Rambaud
 

I would like to build a vrt file from multiple dataset. I know the gdalbuildvrt but i don't find how I'm supposed to do the same using rasterio vrt object.

Does anyone have a simple example ?

Is it even possible ? 

 

My use case is very simple, when I export a big image from GEE it's tiled in smaller .tif. instead of merging them all and get a 10 Gb .tif, I would like to create a .vrt to gather them.


Re: Clarification on usage with QGIS

Ari Meyer
 

That's good to hear, Christina -- thanks!
Ari

On Thu, Mar 25, 2021 at 6:59 PM Ratcliff, Christina (A&F, Waite Campus) <christina.ratcliff@...> wrote:
Hi Ari,

Yes, I've also confirmed  my plugin & rasterio runs in QGIS 3.18 without any problems.

I have similar GDAL & GDAL backwards compatibility packages installed and haven't experienced issues with QGIS so it should be fine. I'm not a big user of Grass or SAGA, so can't comment on them.

Christina


Re: Clarification on usage with QGIS

Ratcliff, Christina (A&F, Waite Campus)
 

Hi Ari,

Yes, I've also confirmed  my plugin & rasterio runs in QGIS 3.18 without any problems.

I have similar GDAL & GDAL backwards compatibility packages installed and haven't experienced issues with QGIS so it should be fine. I'm not a big user of Grass or SAGA, so can't comment on them.

Christina


Re: Clarification on usage with QGIS

Ari Meyer
 

Hi Christina,

FYI, I just installed QGIS 3.18.1 and added rasterio.  Here's the relevant snippet of the installer output:

gdal (3.1.4-3)
The GDAL/OGR library and commandline tools
Required by: grass, python3-rasterio, gpsbabel, python3-gdal, python3-fiona, qgis-common

gdal111dll (1.11.3-1)
The GDAL/OGR 1.11 DLL (backward compability package)
Required by: liblas

gdal204dll (2.4.3-1)
The GDAL/OGR 2.4 DLL (backward compability package)
Required by: saga-ltr

...

liblas (1.8.0-1)
The libLAS commandline utilities
Required by: grass

So it appears that QGIS actually installed 3 versions of GDAL based on the requirements we have (including GRASS and SAGA).  Do you think this could be an issue?  So far we haven't seen any actual problems because of this.

Thanks,
Ari

On Tue, Mar 2, 2021 at 8:31 PM Ari Meyer via groups.io <ari.meyer=gmail.com@groups.io> wrote:
Much appreciated, Christina.  Will try this out.

Cheers!
Ari


Re: Rasterio read outshape resample does not give correct array

Loïc Dutrieux
 

With count data, like population, it only makes sense to aggregate using a sum resampling function (like you did with block_reduce).
Using rasterio you can perform a decimated read (setting out_shape) using resampling=Resampling.sum. I see from the docs that it requires GDAL >= 3.1 though (https://rasterio.readthedocs.io/en/latest/api/rasterio.enums.html#rasterio.enums.Resampling)
It can give unexpected results if the file has precomputed overviews, then no matter which resampling method you specify, rasterio won't be able to ignore them (unless built against GDAL >= 3.3 which is very unlikely at the moment).

cheers,
Loïc
________________________________________
From: main@rasterio.groups.io <main@rasterio.groups.io> on behalf of tinak.contact@gmail.com <tinak.contact@gmail.com>
Sent: 17 March 2021 07:06:04
To: main@rasterio.groups.io
Subject: Re: [rasterio] Rasterio read outshape resample does not give correct array

[Edited Message Follows]
[Reason: added gdal option ]

Thank you very much, I found scikit block_reduce<https://urldefense.com/v3/__https://scikit-image.org/docs/dev/api/skimage.measure.html*block-reduce__;Iw!!DOxrgLBm!SYQfUIkp3oT6I4nkx5rhk0VhOIXLuJc03ZvgBB-xB81wfd0QZgYJdbZuqh9ZDbxz4GMJQqQ$> which did the trick, but I was wondering if there are any gdal or rasterio functions who could
do the same. I found from (here<https://urldefense.com/v3/__https://gis.stackexchange.com/questions/152661/downsampling-geotiff-using-summation-gdal-numpy__;!!DOxrgLBm!SYQfUIkp3oT6I4nkx5rhk0VhOIXLuJc03ZvgBB-xB81wfd0QZgYJdbZuqh9ZDbxz3Jieug0$>) and ( here<https://urldefense.com/v3/__https://gdal.org/programs/gdalwarp.html__;!!DOxrgLBm!SYQfUIkp3oT6I4nkx5rhk0VhOIXLuJc03ZvgBB-xB81wfd0QZgYJdbZuqh9ZDbxzYF_dZ3o$>) that
GDAL 3.1 or later does have -r sum or -r average
as interpolation method.
However, if I apply the gdal translate command I am not getting the correct numbers. Could this be a problem with my gdal?


Solution with scikit block_reduce :<https://urldefense.com/v3/__https://scikit-image.org/docs/dev/api/skimage.measure.html*block-reduce__;Iw!!DOxrgLBm!SYQfUIkp3oT6I4nkx5rhk0VhOIXLuJc03ZvgBB-xB81wfd0QZgYJdbZuqh9ZDbxz4GMJQqQ$>

from skimage.measure import block_reduce
import matplotlib.pyplot as plt
factor = 2
with rasterio.open("mwi_ppp_2020_UTM.tif") as dataset:
data = dataset.read(1)
print('population file', data[np.where(data>0)].sum())
# give zero to no dataval
data=np.where(data<=0,0, data)

# reduce by factor
x=block_reduce(data, block_size=(factor,factor), func=np.sum)

# mask out no data vals
x_masked=np.ma.masked_equal(x,0)
plt.imshow(x_masked)
plt.colorbar()
print('population resampled',x_masked.sum())
[cid:attach_0_166D0C1E101B79CD_3895@groups.io]


Using gdal translate -r average

(decrease by factor 10)
import subprocess
subprocess.call("""gdal_translate -tr 0.0083 0.0083 -r average\
/Users/admin/Downloads/mwi_ppp_2020.tif gdal_test.tif""", shell=True)

Population after 206120.61


Re: Rasterio read outshape resample does not give correct array

tinak.contact@...
 
Edited

Thank you very much, I found  scikit block_reduce which did the trick, but I was wondering if there are any gdal or rasterio functions who could 
do the same. I found from (here) and ( here) that

GDAL 3.1 or later   does have     -r sum     or -r average

as interpolation method.
However, if I apply the gdal translate command I am not getting the correct numbers. Could this be a problem with my gdal?


 Solution with scikit block_reduce :

from skimage.measure import block_reduce
import matplotlib.pyplot as plt
factor = 2
with rasterio.open("mwi_ppp_2020_UTM.tif") as dataset:
    data = dataset.read(1)
    print('population file', data[np.where(data>0)].sum())
    # give zero to no dataval 
    data=np.where(data<=0,0, data) 
    
     #  reduce by factor 
    x=block_reduce(data, block_size=(factor,factor), func=np.sum)
   
    # mask out no data vals
    x_masked=np.ma.masked_equal(x,0) 
    plt.imshow(x_masked)
    plt.colorbar()
print('population resampled',x_masked.sum())



Using gdal translate -r average 

(decrease by factor 10) 
import subprocess
subprocess.call("""gdal_translate  -tr 0.0083  0.0083  -r average\
                /Users/admin/Downloads/mwi_ppp_2020.tif gdal_test.tif""", shell=True)
Population after 206120.61


Re: Rasterio read outshape resample does not give correct array

Sean Gillies
 

Hi,

The "average" resampling algorithm should give you better results than "bilinear". I'm not sure it is guaranteed to preserve the sum, especially in the case where the original dataset's width and height are not multiples of 2.

Documentation on the resampling algorithms can be found here: https://gdal.org/programs/gdal_translate.html#cmdoption-gdal_translate-r. Rasterio uses these same options.


On Tue, Mar 16, 2021 at 4:05 PM <tinak.contact@...> wrote:

[Edited Message Follows]

Hi all, 

I have a population grid and want to decrease the resolution by a factor 2. So I would like to get a raster that has 2 times less rows and columns, a resolution that is 2 times as large as the original one, while retaining the same overall number of population.
 
This should be a straightforward operation, but the population numbers change incorrectly if I follow the documentation. When the resampled array is written to file, the numbers are incorrect.  
 
Shouldn't the resample method yield an array with resampled cell values to get the same number of total population? Any tips and hints would be appreciated!

factor = 2

with rasterio.open("mwi_ppp_2020.tif") as dataset:

    arr = dataset.read(1,masked=True)
    #Show total population of Malawi 
    print(arr.sum())

    data = dataset.read(1, masked=True, resampling=Resampling.bilinear,
        out_shape=(
            int(dataset.height * factor),
            int(dataset.width * factor)))

     #Show total population of array 
    print(data.sum())

     # scale image transform
    transform = dataset.transform * dataset.transform.scale(
            (dataset.width / data.shape[1]),


            (dataset.height / data.shape[0])
                                )
_

--
Sean Gillies

1 - 20 of 764