concurrent.futures.ThreadPoolExecutor for s3 COG reads fails - CPLReleaseMutex: Error
dweber.consulting@...
This could be similar to https://github.com/mapbox/rasterio/issues/1686
In an AWS-Lambda python 3.7 runtime, there is an error like: CPLReleaseMutex: Error = 1 (Operation not permitted)
[WARNING] 2021-05-07T17:38:33.315Z 3e8b4b55-9066-426c-9319-540e39e17709
CPLE_AppDefined in HTTP response code on https://bucket-xxx-us-west-2.s3.amazonaws.com/prefix/geotiff.tif: 0
2021-05-07 17:38:33,366 | ERROR | '/vsis3/bucket-xxx-us-west-2/prefix/geotiff.tif' not recognized as a supported file format.
The python concurrency that wraps the rasterio/gdal reader is a common pattern like: dataset = []
with concurrent.futures.ThreadPoolExecutor() as executor:
future_to_task = {}
for task in peril_queries:
future = executor.submit(extract_geotiff_data, task)
future_to_task[future] = task
for future in concurrent.futures.as_completed(future_to_task):
try:
data = future.result()
dataset.extend(data)
except Exception as exc:
task = future_to_task[future]
LOGGER.error("Failed Task: %s", task)
raise
Where the 'extract_geotiff_data' function is basically given a task to read an s3-COG, and that function encapsulates everything to setup a rasterio.Env with gdal env-var settings, opening the s3 object and reading data from it. What is causing the `CPLReleaseMutex: Error` and how is it debugged and avoided? TIA, Darren
|
||
|
||
Re: Handle to underlying GDAL dataset
Sean Gillies
Hi Søren, It's not possible for a GDAL dataset handle to be shared between rasterio and GDAL's Python bindings. I discourage people from combining these modules. Unless one is extremely careful and understands the internals of each, it's a great way to create perplexing bugs. Rasterio's dataset.read() works at a higher level than GDAL's DatasetReadAsArray(). The latter is effectively the C function that rasterio uses internally. Rasterio has data type and nodata checking overhead that GDAL's function lacks and perhaps some logic that could be more optimized. Is the overhead really 30%? Would you be willing to use Python's timeit to race the last two lines of your code against each other? I would expect the difference to diminish as the size of the window grows.
On Tue, May 4, 2021 at 7:15 AM soren.rasmussen via groups.io <soren.rasmussen=alexandra.dk@groups.io> wrote:
--
Sean Gillies
|
||
|
||
Handle to underlying GDAL dataset
soren.rasmussen@...
Hi!
I would like to get a handle for the underlying GDAL dataset behind a RasterIO dataset. Is this possible somehow? The reason I want to do this is that, in some cases, GDAL operations are significantly faster. I have a case, where reading a subset of channels in a window with the just-released GDAL 3.3.0 is approximately 25% faster than RasterIO.
It would be nice to be able to use GDAL operations without having to open the dataset with both RasterIO and GDAL
(Note that band_list is new in GDAL 3.3.0) Best, Søren
|
||
|
||
Re: Need Help
Sean Gillies
Hello, I don't know about MSAVI2, but once you have a numpy array of 0 and 1, you can mask another array using https://numpy.org/doc/stable/reference/generated/numpy.where.html. Rasterio only reads and writes datasets, it doesn't have built in analysis functions. All that is left up to the user and numpy. For band names, if you create a new dataset in "w" mode or open an existing dataset in "r+" mode, you can pass a sequence of strings as shown in https://github.com/mapbox/rasterio/blob/master/tests/test_descriptions.py#L14.
On Sat, May 1, 2021 at 7:05 AM <serignemansour.diene@...> wrote: First : I want to know how could i change the name of raster images’s band? Because i have a multispectral image with 8 band and the name is none when i do image.descrptions. --
Sean Gillies
|
||
|
||
Need Help
Serigne Mansour DIENE
First : I want to know how could i change the name of raster images’s band? Because i have a multispectral image with 8 band and the name is none when i do image.descrptions.
Second : On this multispectral image, I want to compute an MSAVI2 spectral index and extract all non-ground pixels to mask another image. MSAVI2 allows to mask the ground by giving it values 0 or 1. Best regards
|
||
|
||
snippet for minimum bounding snapped window?
Gregory, Matthew
Hi all,
Often I'm given unsnapped coordinates and want to find the minimum bounding snapped box based on a rasterio Dataset. I'm guessing someone here has a better way of doing it or I'm missing a method somewhere. Here's a function with some comments: def get_minimum_snapped_extent(src, bounds): # Get window from bounds (assume bounds is sequence of # left, bottom, right, top w1 = src.window(*bounds) # Round to include top-left corner (op="floor" is default) w2 = w1.round_offsets() # Union these to get a new window w3 = rasterio.windows.union(w1, w2) # Round shape to buffer to bottom-right corner w4 = w3.round_shape(op="ceil") # Return the coordinates associated with this window return src.window_bounds(w4) Obviously, I don't need all the temp variables, but is there a more succinct way of doing this? thanks, matt
|
||
|
||
Rasterio 1.2.2
Sean Gillies
Hi all, Rasterio 1.2.2 source distribution and wheels for selected platforms are on PyPI now. The wheels include GDAL 3.2.2 and PROJ 7.2.1. See https://github.com/mapbox/rasterio/blob/master/CHANGES.txt#L13 for a list of the changes. Sean Gillies
|
||
|
||
Re: Help Saving .Tif File after Mask
Hi Luke, thanks for the reply. Your code has helped me move past my previous error.
toggle quoted messageShow quoted text
I had to delete the indexes argument when calling `write' but apart from that, it works like a charm. Thank you for your help!
On Wed, Apr 7, 2021 at 02:41 AM, Luke wrote:
|
||
|
||
Re: Using rasterio to mask/crop a raster results in AttributeError
Luke
You can also use json:
|
||
|
||
Re: Help Saving .Tif File after Mask
rasterio.mask.mask returns a two element tuple - element 0 is the array, element 1 is the transform. Try something like (untested):
# Mask Raster masked_raster, transform = rasterio.mask.mask(file, poly, crop=True, filled=True, invert=False)
with rasterio.open('/data/my_output.tif', 'w',
driver='GTiff',
height=masked_raster.shape[1],
width=masked_raster.shape[2],
count=1,
dtype=rasterio.ubyte,
crs=file.crs,
transform=transform
) as outfile:
outfile.write(masked_raster, indexes=1)
|
||
|
||
Help Saving .Tif File after Mask
Simon Tarr <si.tarr@...>
I have just used rasterio.mask.mask() to crop a larger raster using a Django PolygonField() polygon:
I now need to save the object masked_raster to disk. I have tried with the following code:
I'm clearly passing the wrong data from masked_raster and I don't understand what I'm supposed to input for the transform argument. Any help would be gratefully received!
|
||
|
||
Re: Using rasterio to mask/crop a raster results in AttributeError
Simon Tarr <si.tarr@...>
Thank you. I used the below to remove the quotes:
import ast poly = ast.literal_eval(obj.job_loc.geojson)
|
||
|
||
Re: Using rasterio to mask/crop a raster results in AttributeError
Hi,
You have a len = 1 list the only element of which is a string. That's what the error message tells you, essentially. Get rid of the quotes and you'l have a one-item list the which is your dict :
var2 = [{ "type": "Polygon", "coordinates": [ [ [
0.906372070312503, 52.320232076097348 ], [ 0.823974609375,
52.108192097463231 ], [ 1.170043945312496, 52.141916831668233 ], [
1.170043945312496, 52.313516199748072 ], [ 0.906372070312503,
52.320232076097348 ] ] ] }] Cheers,
|
||
|
||
Using rasterio to mask/crop a raster results in AttributeError
si.tarr@...
I'm trying to use
I also need a
I then attempt the mask with:
However, I get the error:
I have tried following this (i.e. adding my JSON to a list) but I still get the error. Can someone help me mask this raster?
|
||
|
||
Is it possible to build a .vrt file from multiple files with Rasterio?
I would like to build a vrt file from multiple dataset. I know the Does anyone have a simple example ? Is it even possible ?
My use case is very simple, when I export a big image from GEE it's tiled in smaller .tif. instead of merging them all and get a 10 Gb .tif, I would like to create a .vrt to gather them.
|
||
|
||
Re: Clarification on usage with QGIS
Ari Meyer
That's good to hear, Christina -- thanks! Ari
On Thu, Mar 25, 2021 at 6:59 PM Ratcliff, Christina (A&F, Waite Campus) <christina.ratcliff@...> wrote: Hi Ari,
|
||
|
||
Re: Clarification on usage with QGIS
Ratcliff, Christina (A&F, Waite Campus)
Hi Ari,
Yes, I've also confirmed my plugin & rasterio runs in QGIS 3.18 without any problems. I have similar GDAL & GDAL backwards compatibility packages installed and haven't experienced issues with QGIS so it should be fine. I'm not a big user of Grass or SAGA, so can't comment on them. Christina
|
||
|
||
Re: Clarification on usage with QGIS
Ari Meyer
Hi Christina, FYI, I just installed QGIS 3.18.1 and added rasterio. Here's the relevant snippet of the installer output: gdal (3.1.4-3) The GDAL/OGR library and commandline tools Required by: grass, python3-rasterio, gpsbabel, python3-gdal, python3-fiona, qgis-common gdal111dll (1.11.3-1) The GDAL/OGR 1.11 DLL (backward compability package) Required by: liblas gdal204dll (2.4.3-1) The GDAL/OGR 2.4 DLL (backward compability package) Required by: saga-ltr ... liblas (1.8.0-1) The libLAS commandline utilities Required by: grass So it appears that QGIS actually installed 3 versions of GDAL based on the requirements we have (including GRASS and SAGA). Do you think this could be an issue? So far we haven't seen any actual problems because of this. Thanks, Ari
Much appreciated, Christina. Will try this out.
|
||
|
||
Re: Rasterio read outshape resample does not give correct array
Loïc Dutrieux
With count data, like population, it only makes sense to aggregate using a sum resampling function (like you did with block_reduce).
Using rasterio you can perform a decimated read (setting out_shape) using resampling=Resampling.sum. I see from the docs that it requires GDAL >= 3.1 though (https://rasterio.readthedocs.io/en/latest/api/rasterio.enums.html#rasterio.enums.Resampling) It can give unexpected results if the file has precomputed overviews, then no matter which resampling method you specify, rasterio won't be able to ignore them (unless built against GDAL >= 3.3 which is very unlikely at the moment). cheers, Loïc ________________________________________ From: main@rasterio.groups.io <main@rasterio.groups.io> on behalf of tinak.contact@... <tinak.contact@...> Sent: 17 March 2021 07:06:04 To: main@rasterio.groups.io Subject: Re: [rasterio] Rasterio read outshape resample does not give correct array [Edited Message Follows] [Reason: added gdal option ] Thank you very much, I found scikit block_reduce<https://urldefense.com/v3/__https://scikit-image.org/docs/dev/api/skimage.measure.html*block-reduce__;Iw!!DOxrgLBm!SYQfUIkp3oT6I4nkx5rhk0VhOIXLuJc03ZvgBB-xB81wfd0QZgYJdbZuqh9ZDbxz4GMJQqQ$> which did the trick, but I was wondering if there are any gdal or rasterio functions who could do the same. I found from (here<https://urldefense.com/v3/__https://gis.stackexchange.com/questions/152661/downsampling-geotiff-using-summation-gdal-numpy__;!!DOxrgLBm!SYQfUIkp3oT6I4nkx5rhk0VhOIXLuJc03ZvgBB-xB81wfd0QZgYJdbZuqh9ZDbxz3Jieug0$>) and ( here<https://urldefense.com/v3/__https://gdal.org/programs/gdalwarp.html__;!!DOxrgLBm!SYQfUIkp3oT6I4nkx5rhk0VhOIXLuJc03ZvgBB-xB81wfd0QZgYJdbZuqh9ZDbxzYF_dZ3o$>) that GDAL 3.1 or later does have -r sum or -r average as interpolation method. However, if I apply the gdal translate command I am not getting the correct numbers. Could this be a problem with my gdal? Solution with scikit block_reduce :<https://urldefense.com/v3/__https://scikit-image.org/docs/dev/api/skimage.measure.html*block-reduce__;Iw!!DOxrgLBm!SYQfUIkp3oT6I4nkx5rhk0VhOIXLuJc03ZvgBB-xB81wfd0QZgYJdbZuqh9ZDbxz4GMJQqQ$> from skimage.measure import block_reduce import matplotlib.pyplot as plt factor = 2 with rasterio.open("mwi_ppp_2020_UTM.tif") as dataset: data = dataset.read(1) print('population file', data[np.where(data>0)].sum()) # give zero to no dataval data=np.where(data<=0,0, data) # reduce by factor x=block_reduce(data, block_size=(factor,factor), func=np.sum) # mask out no data vals x_masked=np.ma.masked_equal(x,0) plt.imshow(x_masked) plt.colorbar() print('population resampled',x_masked.sum()) [cid:attach_0_166D0C1E101B79CD_3895@groups.io] Using gdal translate -r average (decrease by factor 10) import subprocess subprocess.call("""gdal_translate -tr 0.0083 0.0083 -r average\ /Users/admin/Downloads/mwi_ppp_2020.tif gdal_test.tif""", shell=True) Population after 206120.61
|
||
|
||
Re: Rasterio read outshape resample does not give correct array
Thank you very much, I found scikit
block_reduce which did the trick, but I was wondering if there are any gdal or rasterio functions who could do the same. I found from (here) and ( here) that GDAL 3.1 or later does have
as interpolation method. |
||
|