Date   

Re: Tracking down a segmentation fault

Angus Dickey
 

Thanks to both Sean and Even for their responses.

I ended up compiling GDAL 3.3.0 (to include the debug symbols), then building rasterio on top of that. The backtrace (full bt is attached) now provides some more information:

#0  GDALDatasetPool::_CloseDatasetIfZeroRefCount (this=0x7fffb03c4940, pszFileName=pszFileName@entry=0x7fffa85946f0 "/vsis3/bucket/path/01.cog.tiff", pszOwner=pszOwner@entry=0x7fffa83c6eb0 "0x7fffa84f38c0") at gdalproxypool.cpp:378
#1  0x00007fffcb5034a2 in GDALDatasetPool::CloseDatasetIfZeroRefCount (pszFileName=0x7fffa85946f0 "/vsis3/bucket/path/01.cog.tiff", eAccess=eAccess@entry=GA_ReadOnly, pszOwner=pszOwner@entry=0x7fffa83c6eb0 "0x7fffa84f38c0")at gdalproxypool.cpp:520

Seems the issue is here in GDAL. I am not sure if this is a bug or am am doing something "off label". For context I am reading a VRT stored in S3 that is made up of a series of COGS.

Should I be moving this out of the rasterio list to GDAL? It might not be easy to replicate with GDAL alone though.

Thanks again for any thoughts,

Angus

 


Re: Question about code fragment in features.geometry_window()

Sean Gillies
 

Hi,

On Tue, Aug 17, 2021 at 11:35 AM Shinji Suzuki <shinji.suzuki@...> wrote:
Oops, I overlooked the fact that min/max was computed after (x,y) gets
transformed.
Please disregard point 2.


On Wed, Aug 18, 2021 at 2:22 AM Shinji Suzuki via groups.io
<shinji.suzuki=gmail.com@groups.io> wrote:
>
> Hi.
> Starting this line, bounds extended by (pad_x, pad_y) are computed and
> later get transformed.
> https://github.com/mapbox/rasterio/blob/master/rasterio/features.py#L451
>
> What  I don't understand are;
> 1. 'pad_x' is subtracted from 'bottom'. Shouldn't 'pad_y' be the one
> that should be related to 'bottom'?
> 2. Same value is computed twice.(e.g. left - pad_x). Since the
> computed values are used later only for obtaining max and min. There
> are no point in yielding a value multiple times.
>
> If my understanding is correct that this code is for computing the
> bounding box after transformation, should not this be changed as
> follows?
>
> diff --git a/rasterio/features.py b/rasterio/features.py
> index 768f6429..e0764e63 100644
> --- a/rasterio/features.py
> +++ b/rasterio/features.py
> @@ -451,12 +451,12 @@ def geometry_window(
>      xs = [
>          x
>          for (left, bottom, right, top) in all_bounds
> -        for x in (left - pad_x, right + pad_x, right + pad_x, left - pad_x)
> +        for x in (left - pad_x, right + pad_x, right - pad_x, left + pad_x)
>      ]
>      ys = [
>          y
>          for (left, bottom, right, top) in all_bounds
> -        for y in (top + pad_y, top + pad_y, bottom - pad_x, bottom - pad_x)
> +        for y in (top + pad_y, top - pad_y, bottom - pad_y, bottom + pad_y)
>      ]
>
>      rows1, cols1 = rowcol(
>
> Thank you for reading.

Indeed I believe you have found a bug! We're tracking it at https://github.com/mapbox/rasterio/issues/2266.

Thanks,

--
Sean Gillies


Re: Tracking down a segmentation fault

Even Rouault
 



I did a shallow search of the GDAL issue tracker and found https://github.com/OSGeo/gdal/issues/3189.

Likely unrelated: the above issue was specific to VRT pansharpening.

-- 
http://www.spatialys.com
My software is free, but my time generally not.


Re: Tracking down a segmentation fault

Sean Gillies
 

Hi Angus,

On Wed, Aug 11, 2021 at 9:41 PM Angus Dickey <angus@...> wrote:
We have a Chalice app that uses Rasterio to read COGs from an AWS S3 bucket, do some processing, and deliver the result as XYZ tiles.

Recently we upgraded the rasterio version and started seeing segmentation faults killing the chalice development server. Rasterio 1.2.3 and lower work fine, but 1.2.4 and up segfaults after a small number of requests.

The app uses a python 3.6 virtual environment and the rasterio manylinux wheels (which are awesome BTW). The backtrace (from gdb) seems to show the problem is in libgdal but doesn't provide much to go on:

Thread 13 "python" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fff9e5d8700 (LWP 3887)]
0x00007fffad92e0b3 in ?? () from /home/user/project/env/lib/python3.6/site-packages/rasterio/../rasterio.libs/libgdal-41ff680c.so.29.0.0
(gdb) bt
#0  0x00007fffad92e0b3 in ?? () from /home/user/project/env/lib/python3.6/site-packages/rasterio/../rasterio.libs/libgdal-41ff680c.so.29.0.0
#1  0x00007fffad92e552 in ?? () from /home/user/project/env/lib/python3.6/site-packages/rasterio/../rasterio.libs/libgdal-41ff680c.so.29.0.0
#2  0x00007fffad92e5b1 in GDALProxyPoolDataset::~GDALProxyPoolDataset() ()
   from /home/user/project/env/lib/python3.6/site-packages/rasterio/../rasterio.libs/libgdal-41ff680c.so.29.0.0
#3  0x00007fffad92e679 in GDALProxyPoolDataset::~GDALProxyPoolDataset() ()
   from /home/user/project/env/lib/python3.6/site-packages/rasterio/../rasterio.libs/libgdal-41ff680c.so.29.0.0
#4  0x00007fffad8d0fcf in GDALDataset::ReleaseRef() () from /home/user/project/env/lib/python3.6/site-packages/rasterio/../rasterio.libs/libgdal-41ff680c.so.29.0.0
#5  0x00007fffad86448c in VRTSimpleSource::~VRTSimpleSource() () from /home/user/project/env/lib/python3.6/site-packages/rasterio/../rasterio.libs/libgdal-41ff680c.so.29.0.0
#6  0x00007fffad864569 in VRTComplexSource::~VRTComplexSource() () from /home/user/project/env/lib/python3.6/site-packages/rasterio/../rasterio.libs/libgdal-41ff680c.so.29.0.0
#7  0x00007fffad863e8a in VRTSourcedRasterBand::CloseDependentDatasets() ()
   from /home/user/project/env/lib/python3.6/site-packages/rasterio/../rasterio.libs/libgdal-41ff680c.so.29.0.0
#8  0x00007fffad863eeb in VRTSourcedRasterBand::~VRTSourcedRasterBand() ()
   from /home/user/project/env/lib/python3.6/site-packages/rasterio/../rasterio.libs/libgdal-41ff680c.so.29.0.0
#9  0x00007fffad863f59 in VRTSourcedRasterBand::~VRTSourcedRasterBand() ()
   from /home/user/project/env/lib/python3.6/site-packages/rasterio/../rasterio.libs/libgdal-41ff680c.so.29.0.0
#10 0x00007fffad8d67ef in GDALDataset::~GDALDataset() () from /home/user/project/env/lib/python3.6/site-packages/rasterio/../rasterio.libs/libgdal-41ff680c.so.29.0.0
#11 0x00007fffad8409f2 in VRTDataset::~VRTDataset() () from /home/user/project/env/lib/python3.6/site-packages/rasterio/../rasterio.libs/libgdal-41ff680c.so.29.0.0
#12 0x00007fffad840a99 in VRTDataset::~VRTDataset() () from /home/user/project/env/lib/python3.6/site-packages/rasterio/../rasterio.libs/libgdal-41ff680c.so.29.0.0
#13 0x00007fffad92dd11 in ?? () from /home/user/project/env/lib/python3.6/site-packages/rasterio/../rasterio.libs/libgdal-41ff680c.so.29.0.0
#14 0x00007fffad92e311 in ?? () from /home/user/project/env/lib/python3.6/site-packages/rasterio/../rasterio.libs/libgdal-41ff680c.so.29.0.0
#15 0x00007fffad92e62c in GDALProxyPoolDataset::~GDALProxyPoolDataset() ()
   from /home/user/project/env/lib/python3.6/site-packages/rasterio/../rasterio.libs/libgdal-41ff680c.so.29.0.0
#16 0x00007fffad92e679 in GDALProxyPoolDataset::~GDALProxyPoolDataset() ()
   from /home/user/project/env/lib/python3.6/site-packages/rasterio/../rasterio.libs/libgdal-41ff680c.so.29.0.0
#17 0x00007fffad8d0fcf in GDALDataset::ReleaseRef() () from /home/user/project/env/lib/python3.6/site-packages/rasterio/../rasterio.libs/libgdal-41ff680c.so.29.0.0
#18 0x00007fffad86448c in VRTSimpleSource::~VRTSimpleSource() () from /home/user/project/env/lib/python3.6/site-packages/rasterio/../rasterio.libs/libgdal-41ff680c.so.29.0.0
#19 0x00007fffad864569 in VRTComplexSource::~VRTComplexSource() () from /home/user/project/env/lib/python3.6/site-packages/rasterio/../rasterio.libs/libgdal-41ff680c.so.29.0.0
#20 0x00007fffad863e8a in VRTSourcedRasterBand::CloseDependentDatasets() ()
   from /home/user/project/env/lib/python3.6/site-packages/rasterio/../rasterio.libs/libgdal-41ff680c.so.29.0.0
#21 0x00007fffad863eeb in VRTSourcedRasterBand::~VRTSourcedRasterBand() ()
   from /home/user/project/env/lib/python3.6/site-packages/rasterio/../rasterio.libs/libgdal-41ff680c.so.29.0.0
#22 0x00007fffad863f59 in VRTSourcedRasterBand::~VRTSourcedRasterBand() ()
   from /home/user/project/env/lib/python3.6/site-packages/rasterio/../rasterio.libs/libgdal-41ff680c.so.29.0.0
#23 0x00007fffad8d67ef in GDALDataset::~GDALDataset() () from /home/user/project/env/lib/python3.6/site-packages/rasterio/../rasterio.libs/libgdal-41ff680c.so.29.0.0
#24 0x00007fffad8409f2 in VRTDataset::~VRTDataset() () from /home/user/project/env/lib/python3.6/site-packages/rasterio/../rasterio.libs/libgdal-41ff680c.so.29.0.0
#25 0x00007fffad840a99 in VRTDataset::~VRTDataset() () from /home/user/project/env/lib/python3.6/site-packages/rasterio/../rasterio.libs/libgdal-41ff680c.so.29.0.0
#26 0x00007fffaec0ac99 in ?? () from /home/user/project/env/lib/python3.6/site-packages/rasterio/_base.cpython-36m-x86_64-linux-gnu.so
#27 0x00007fffaec21216 in ?? () from /home/user/project/env/lib/python3.6/site-packages/rasterio/_base.cpython-36m-x86_64-linux-gnu.so
#28 0x00007fffaec2d5d0 in ?? () from /home/user/project/env/lib/python3.6/site-packages/rasterio/_base.cpython-36m-x86_64-linux-gnu.so
#29 0x0000000000566bbc in _PyCFunction_FastCallDict ()
#30 0x00000000005a4cd1 in _PyObject_FastCallDict ()
#31 0x00000000005a4fb8 in PyObject_CallFunctionObjArgs ()
#32 0x000000000050d698 in _PyEval_EvalFrameDefault ()
#33 0x00000000005095c8 in ?? ()
#34 0x000000000050a2fd in ?? ()
#35 0x000000000050beb4 in _PyEval_EvalFrameDefault ()
#36 0x00000000005095c8 in ?? ()
#37 0x000000000050a2fd in ?? ()
#38 0x000000000050beb4 in _PyEval_EvalFrameDefault ()
#39 0x0000000000507be4 in ?? ()
#40 0x0000000000588e5c in ?? ()
#41 0x000000000059fd0e in PyObject_Call ()
#42 0x000000000050d256 in _PyEval_EvalFrameDefault ()
#43 0x00000000005095c8 in ?? ()
#44 0x000000000050a2fd in ?? ()
#45 0x000000000050beb4 in _PyEval_EvalFrameDefault ()
#46 0x0000000000507be4 in ?? ()
#47 0x0000000000509900 in ?? ()
#48 0x000000000050a2fd in ?? ()
#49 0x000000000050beb4 in _PyEval_EvalFrameDefault ()
#50 0x0000000000507be4 in ?? ()
#51 0x0000000000509900 in ?? ()
#52 0x000000000050a2fd in ?? ()
#53 0x000000000050beb4 in _PyEval_EvalFrameDefault ()
#54 0x00000000005095c8 in ?? ()
#55 0x000000000050a2fd in ?? ()
#56 0x000000000050beb4 in _PyEval_EvalFrameDefault ()
#57 0x0000000000508cd5 in _PyFunction_FastCallDict ()
#58 0x0000000000594a01 in ?? ()
#59 0x000000000054a971 in ?? ()
#60 0x00000000005a9dac in _PyObject_FastCallKeywords ()
#61 0x000000000050a433 in ?? ()
#62 0x000000000050beb4 in _PyEval_EvalFrameDefault ()
#63 0x0000000000507be4 in ?? ()
#64 0x0000000000508ec2 in _PyFunction_FastCallDict ()
#65 0x0000000000594a01 in ?? ()
#66 0x000000000054a971 in ?? ()
#67 0x00000000005a9dac in _PyObject_FastCallKeywords ()
#68 0x000000000050a433 in ?? ()
#69 0x000000000050beb4 in _PyEval_EvalFrameDefault ()
#70 0x0000000000508cd5 in _PyFunction_FastCallDict ()
#71 0x0000000000594a01 in ?? ()
#72 0x000000000054a971 in ?? ()
#73 0x00000000005a9dac in _PyObject_FastCallKeywords ()
#74 0x000000000050a433 in ?? ()
#75 0x000000000050beb4 in _PyEval_EvalFrameDefault ()
#76 0x0000000000507be4 in ?? ()
#77 0x0000000000509900 in ?? ()
#78 0x000000000050a2fd in ?? ()
#79 0x000000000050cc96 in _PyEval_EvalFrameDefault ()
#80 0x00000000005095c8 in ?? ()
#81 0x000000000050a2fd in ?? ()
#82 0x000000000050beb4 in _PyEval_EvalFrameDefault ()
#83 0x00000000005095c8 in ?? ()
#84 0x000000000050a2fd in ?? ()
#85 0x000000000050beb4 in _PyEval_EvalFrameDefault ()
#86 0x00000000005095c8 in ?? ()
#87 0x000000000050a2fd in ?? ()
#88 0x000000000050beb4 in _PyEval_EvalFrameDefault ()
#89 0x00000000005095c8 in ?? ()
#90 0x000000000050a2fd in ?? ()
#91 0x000000000050beb4 in _PyEval_EvalFrameDefault ()
#92 0x0000000000507be4 in ?? ()
#93 0x0000000000508ec2 in _PyFunction_FastCallDict ()
#94 0x0000000000594a01 in ?? ()
#95 0x0000000000549e8f in ?? ()
#96 0x00000000005515c1 in ?? ()
#97 0x00000000005a48ec in _PyObject_FastCallDict ()
#98 0x00000000005f03bc in ?? ()
#99 0x00000000005a9dac in _PyObject_FastCallKeywords ()
#100 0x000000000050a433 in ?? ()
#101 0x000000000050beb4 in _PyEval_EvalFrameDefault ()
#102 0x00000000005095c8 in ?? ()
#103 0x000000000050a2fd in ?? ()
#104 0x000000000050beb4 in _PyEval_EvalFrameDefault ()
#105 0x0000000000508cd5 in _PyFunction_FastCallDict ()
#106 0x0000000000594a01 in ?? ()
#107 0x000000000059fd0e in PyObject_Call ()
#108 0x000000000050d256 in _PyEval_EvalFrameDefault ()
#109 0x00000000005095c8 in ?? ()
#110 0x000000000050a2fd in ?? ()
#111 0x000000000050beb4 in _PyEval_EvalFrameDefault ()
#112 0x00000000005095c8 in ?? ()
#113 0x000000000050a2fd in ?? ()
#114 0x000000000050beb4 in _PyEval_EvalFrameDefault ()
#115 0x0000000000508cd5 in _PyFunction_FastCallDict ()
#116 0x0000000000594a01 in ?? ()
#117 0x000000000059fd0e in PyObject_Call ()
#118 0x00000000005e12f2 in ?? ()
#119 0x0000000000631b94 in ?? ()
#120 0x00007ffff77ca6db in start_thread (arg=0x7fff9e5d8700) at pthread_create.c:463
#121 0x00007ffff7b0371f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95


Is there a location where the libgdal version for each resterio release is documented? I presume 1.2.4 uses a newer GDAl version.

Any tips for tracking down what the problem might be in rasterio/GDAL?

Thanks for any help,

Angus


Rasterio's 1.2.3 wheels include GDAL 3.2.2. The 1.2.4 wheels include GDAL 3.3.0. We really should write these down in the readme.

I did a shallow search of the GDAL issue tracker and found https://github.com/OSGeo/gdal/issues/3189. There may be a lead in there or in another issue I didn't find.

--
Sean Gillies


Re: WarpedVRT and mixed data types

Even Rouault
 


In the second case I am surprised that the same exception is not raised. It may be because GDALCreateWarpedVRT (called by rasterio's WarpedVRT constructor) makes a mistake in determining the data types of the source bands and assigns "int32" to all of them.

The warping kernel only supports one single working data type for all bands. The  GDALWarpResolveWorkingDataType() function

https://github.com/OSGeo/gdal/blob/c6f0ea7d8a4da08e0d633eb381677c23f3934be3/gdal/alg/gdalwarper.cpp#L1439

will select the "largest" data type from input bands.

Even

--

http://www.spatialys.com
My software is free, but my time generally not.


Re: WarpedVRT and mixed data types

Sean Gillies
 

Hi,

Thanks for bringing this to the list! More response below.

On Wed, Jul 7, 2021 at 9:01 AM Thomas Maschler <thomas.maschler@...> wrote:

Hi,

I noticed some odd behavior when using WarpedVRT in combination with a VRT with mixed Data Types. 

When trying to open the mixed data type raster using rasterio, it throws an error due to no support for mixed data types. However, when trying to open the same dataset as WarpedVRT, I do not get such error. Instead it returns an all-zero array.

I would either expect to get the same error message, receive an array with all data cast to the largest data type or receive an array with mixed data types.

 

import rasterio
import subprocess
from affine import Affine

from rasterio.vrt import WarpedVRT



# First raster saved as uint16
a = np.array([[0,1,2,3,4]]).astype("uint16")

with rasterio.open("uint16.tif", "w", width=5, height=1, count=1, dtype=rasterio.uint16, nodata=0) as dst:

    dst.write(a, 1)

# Second raster saved as int32
b = np.array([[1,2,3,4,0]]).astype("int32")

with rasterio.open("int32.tif", "w", width=5, height=1, count=1, dtype=rasterio.int32, nodata=0) as dst:

    dst.write(b, 1)

# create a VRT combining both datasets into a two band raster using gdalbuildvrt

subprocess.run(["gdalbuildvrt", "-separate", "mixed_datatype.tif", "uint16.tif", "int32.tif"], capture_output=True)

# Try to open the two band raster using rasterio

with rasterio.open("mixed_datatype.tif") as src:

    print(src.profile)

    print(src.read())  # This throws an error due to mixed data types

{'driver': 'VRT', 'dtype': 'uint16', 'nodata': 0.0, 'width': 5, 'height': 1, 'count': 2, 'crs': None, 'transform': Affine(1.0, 0.0, 0.0, 0.0, 1.0, 0.0), 'tiled': False}

Traceback (most recent call last):

  File "<input>", line 3, in <module>

  File "rasterio/_io.pyx", line 278, in rasterio._io.DatasetReaderBase.read

ValueError: more than one 'dtype' found

 

# Try to open the two band raster using WarpedVRT

with rasterio.open("mixed_datatype.tif") as src:

    with WarpedVRT(src, crs="EPSG:4326", transform=transform, width=5, height=1) as vrt:

        print(vrt.profile)

        print(vrt.read())  # This does not throw an error but returns an array with only zeros

    

{'driver': 'VRT', 'dtype': 'int32', 'nodata': 0.0, 'width': 5, 'height': 1, 'count': 2, 'crs': CRS.from_epsg(4326), 'transform': Affine(1.0, 0.0, 0.0,

       0.0, -1.0, 0.0), 'tiled': False}

[[[0 0 0 0 0]]

 [[0 0 0 0 0]]]


Rasterio has no support right now for datasets with mixed data types. This certainly does pose a problem for using VRTs such as the one above.

In the first case, an exception is raised because read() can't return an array of mixed types. This doesn't exist in numpy. And also because GDAL's GDALDatasetRasterIO only accepts a single data type: https://gdal.org/api/raster_c_api.html#_CPPv419GDALDatasetRasterIO12GDALDatasetH10GDALRWFlagiiiiPvii12GDALDataTypeiPiiii.

In the second case I am surprised that the same exception is not raised. It may be because GDALCreateWarpedVRT (called by rasterio's WarpedVRT constructor) makes a mistake in determining the data types of the source bands and assigns "int32" to all of them. To check, I ran your code on my laptop (rasterio 1.2.6 and GDAL 3.3.0) with the addition of lines to print out src.dtypes and vrt.dtypes. I *do* get the same exception in both cases. Are you using different versions of rasterio and GDAL? There were a number of VRT-related fixes in 3.3.0.

--
Sean Gillies


Re: Understanding use of epsilon in transform.rowcol

Sean Gillies
 

Hi,

Sorry for the late reply. I've just returned from vacation.

We've found that we need to account for float imprecision when inverting the forward geotransform. Specifically, we intend (with op=floor, the default) that points on the left edge of pixels in column 0 to compute as within column 0 and we want points on the boundary between column 0 and column 1 to compute as within column 1. Now, this is only an internal detail. We may yet determine that there is a better way to deal with the precision issues and I suspect there probably is a better way.


On Wed, Jul 14, 2021 at 7:16 AM <tom.russell@...> wrote:
Hi, I'm trying to understand the use of epsilon in rasterio.transform.rowcol

Specifically here - https://github.com/groutr/rasterio/blob/95d77c5e8553405a9dee373ae815072cf91d5cd1/rasterio/transform.py#L240 - where, as I read it, a point to be transformed from world coordinates to raster row/column indices is nudged by epsilon before being transformed, in the direction of the interior of the raster cell.

Why do we need to translate it that small amount before transforming? My guess is that it avoids later rounding or off-by-one errors, but I'm struggling to come up with an example or small test case.

Am I on the right lines? Is there good background material (or keywords to search on) that I'm missing for this kind of thing?

Thanks in advance, I hope this is the right forum for these questions.

--
Sean Gillies


Re: Question about code fragment in features.geometry_window()

Shinji Suzuki
 

Oops, I overlooked the fact that min/max was computed after (x,y) gets
transformed.
Please disregard point 2.


On Wed, Aug 18, 2021 at 2:22 AM Shinji Suzuki via groups.io
<shinji.suzuki=gmail.com@groups.io> wrote:

Hi.
Starting this line, bounds extended by (pad_x, pad_y) are computed and
later get transformed.
https://github.com/mapbox/rasterio/blob/master/rasterio/features.py#L451

What I don't understand are;
1. 'pad_x' is subtracted from 'bottom'. Shouldn't 'pad_y' be the one
that should be related to 'bottom'?
2. Same value is computed twice.(e.g. left - pad_x). Since the
computed values are used later only for obtaining max and min. There
are no point in yielding a value multiple times.

If my understanding is correct that this code is for computing the
bounding box after transformation, should not this be changed as
follows?

diff --git a/rasterio/features.py b/rasterio/features.py
index 768f6429..e0764e63 100644
--- a/rasterio/features.py
+++ b/rasterio/features.py
@@ -451,12 +451,12 @@ def geometry_window(
xs = [
x
for (left, bottom, right, top) in all_bounds
- for x in (left - pad_x, right + pad_x, right + pad_x, left - pad_x)
+ for x in (left - pad_x, right + pad_x, right - pad_x, left + pad_x)
]
ys = [
y
for (left, bottom, right, top) in all_bounds
- for y in (top + pad_y, top + pad_y, bottom - pad_x, bottom - pad_x)
+ for y in (top + pad_y, top - pad_y, bottom - pad_y, bottom + pad_y)
]

rows1, cols1 = rowcol(

Thank you for reading.





Question about code fragment in features.geometry_window()

Shinji Suzuki
 

Hi.
Starting this line, bounds extended by (pad_x, pad_y) are computed and
later get transformed.
https://github.com/mapbox/rasterio/blob/master/rasterio/features.py#L451

What I don't understand are;
1. 'pad_x' is subtracted from 'bottom'. Shouldn't 'pad_y' be the one
that should be related to 'bottom'?
2. Same value is computed twice.(e.g. left - pad_x). Since the
computed values are used later only for obtaining max and min. There
are no point in yielding a value multiple times.

If my understanding is correct that this code is for computing the
bounding box after transformation, should not this be changed as
follows?

diff --git a/rasterio/features.py b/rasterio/features.py
index 768f6429..e0764e63 100644
--- a/rasterio/features.py
+++ b/rasterio/features.py
@@ -451,12 +451,12 @@ def geometry_window(
xs = [
x
for (left, bottom, right, top) in all_bounds
- for x in (left - pad_x, right + pad_x, right + pad_x, left - pad_x)
+ for x in (left - pad_x, right + pad_x, right - pad_x, left + pad_x)
]
ys = [
y
for (left, bottom, right, top) in all_bounds
- for y in (top + pad_y, top + pad_y, bottom - pad_x, bottom - pad_x)
+ for y in (top + pad_y, top - pad_y, bottom - pad_y, bottom + pad_y)
]

rows1, cols1 = rowcol(

Thank you for reading.


Re: Read data from a sequence of TIF files

Sean Gillies
 

Hi Shaunak,

A virtual mosaic using GDAL's VRT format https://gdal.org/drivers/raster/vrt.html is the best-tested approach for doing this. Rasterio can open and operate on such VRT XML files.


On Mon, Aug 16, 2021 at 4:58 PM <shaunakde@...> wrote:

Hello,

I was trying to use the Global Surface Water dataset found here: https://global-surface-water.appspot.com/download 

The data is delivered as a sequence of TIF files tiles as 10x10 degree granules. For example the granule starting at 10N and 10E is here: https://storage.googleapis.com/global-surface-water/downloads2020/occurrence/occurrence_120E_40Sv1_3_2020.tif 

I was wondering what the best way to read all the data intersecting an AOI was (without having to manually merge all the data)

Shaunak


--
Sean Gillies


Re: Use GCPs to reproject a ndarray

Sean Gillies
 

Hi,

On Fri, Aug 6, 2021 at 9:20 PM <shaunakde@...> wrote:
Hello All, 

I have GCPs associated with a GeoTIFF as follows:

GroundControlPoint(row=0.0, col=0.0, x=14.674978814299008, y=69.61799429448102, z=-3.898888826370239e-05, id='1', info=''),
 GroundControlPoint(row=0.0, col=299.55555555555554, x=14.689285403048489, y=69.61991548229533, z=-3.908667713403702e-05, id='2', info=''),

and the CRS associated is "EPSG: 4326". I want to write a NumPy array with dimensions (12120, 2696) as a geotiff. I was trying to use rasterio.warp.reproject to accomplish this as follows:

warp.reproject(source=numpy_array_with_daya, gcps=gcps, src_crs=gcp_crs, dst_crs=rio.crs.CRS.from_string('EPSG:4326'))

I run into this error:

AttributeError: 'NoneType' object has no attribute 'xoff'

I experimented a bit with the reproject function, but haven't been able to fully understand how to use it well. One of the things I tried was to specify "dst_transform" (computed from the calculate_default_transform function etc. and had a bit more luck producing an output at all, but I don't think I fully understand how this function works. Any help or pointers to documentation would be greatly appreciated.


If you're specifying dst_transform, then you are using the reproject function correctly. We don't have specific documentation for this use case, but it is continuously tested: https://github.com/mapbox/rasterio/blob/02d65cfd16decc2c923413df9f8b67433699d64c/tests/test_warp.py#L1508.

Can you try calculating a dst_transform using https://rasterio.readthedocs.io/en/stable/api/rasterio.transform.html#rasterio.transform.from_gcps ? The usage would be like this:

warp.reproject(source=numpy_array_with_daya, gcps=gcps, src_crs=gcp_crs, dst_crs="EPSG:4326", dst_transform=transform.from_gcps(gcps))

--
Sean Gillies


Re: Create a Dataset from an array

Sean Gillies
 

Hi,

What you're asking for is possible, though not well abstracted by rasterio. It takes some extra GDAL configuration which I will try to explain.

GDAL's MEM format driver can read from an array that exists in memory. See https://gdal.org/drivers/raster/mem.html#dataset-name-format. So reading an existing numpy array at a very low level in GDAL is a matter of describing the numpy array's address and layout to GDAL. That information can be had from an array's "array interface". For example:

>>> import numpy
>>> arr = numpy.array(range(9), dtype="uint8").reshape(3,3)
>>> arr.__array_interface__
{'data': (22236512, False), 'strides': None, 'descr': [('', '|u1')], 'typestr': '|u1', 'shape': (3, 3), 'version': 3}

The GDAL dataset connection name for this is

>>> descr = "MEM:::DATAPOINTER=22236512,PIXELS=3,LINES=3,BANDS=1,DATATYPE=Byte" 

You can open this with rasterio in "r+" mode and attach a CRS and geo transform.

>>> with rasterio.open(descr, "r+") as dataset:
...     dataset.crs = "EPSG:4326"
...     dataset.transform = Affine.scale(2, -2) * Affine.identity()
...     print(dataset.profile)
...     print(dataset.read())
...
{'driver': 'MEM', 'dtype': 'uint8', 'nodata': None, 'width': 3, 'height': 3, 'count': 1, 'crs': CRS.from_epsg(4326), 'transform': Affine(2.0, 0.0, 0.0,
       0.0, -2.0, 0.0), 'tiled': False}
[[[0 1 2]
  [3 4 5]
  [6 7 8]]]

I don't have any such code in production so can't say anything about the performance or reliability. Obviously you have to be careful with the data pointer to avoid crashes.

On Thu, Aug 5, 2021 at 3:39 PM <ayr035@...> wrote:
Hi all,

I'm struggling to find a good way to create a dataset object from a numpy array. Does such a workflow exist in rasterio? I'm trying to avoid any unnecessary memory copies as this numpy array could be quite large and I definitely need to avoid writing to disk if at all possible.

I'm trying to do some pre-processing on rasters before a merge and need to wrap an array in a dataset-like object that supports resampling, windowed reading, etc for rasterio.merge. I saw an InMemoryRaster class which seems like it could work, but unfortunately, the read method of InMemoryRaster lacks API compatibility with Dataset.read. Is there a solution that is escaping me? In my ideal world, something like DatasetReader.from_array(arr, **arr_profile) where a dataset could be created directly from an array would be incredibly useful. Are there existing solutions to accomplish this


--
Sean Gillies


Read data from a sequence of TIF files

Shaunak De
 

Hello,

I was trying to use the Global Surface Water dataset found here: https://global-surface-water.appspot.com/download 

The data is delivered as a sequence of TIF files tiles as 10x10 degree granules. For example the granule starting at 10N and 10E is here: https://storage.googleapis.com/global-surface-water/downloads2020/occurrence/occurrence_120E_40Sv1_3_2020.tif 

I was wondering what the best way to read all the data intersecting an AOI was (without having to manually merge all the data)

Shaunak


Tracking down a segmentation fault

Angus Dickey
 

We have a Chalice app that uses Rasterio to read COGs from an AWS S3 bucket, do some processing, and deliver the result as XYZ tiles.

Recently we upgraded the rasterio version and started seeing segmentation faults killing the chalice development server. Rasterio 1.2.3 and lower work fine, but 1.2.4 and up segfaults after a small number of requests.

The app uses a python 3.6 virtual environment and the rasterio manylinux wheels (which are awesome BTW). The backtrace (from gdb) seems to show the problem is in libgdal but doesn't provide much to go on:

Thread 13 "python" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fff9e5d8700 (LWP 3887)]
0x00007fffad92e0b3 in ?? () from /home/user/project/env/lib/python3.6/site-packages/rasterio/../rasterio.libs/libgdal-41ff680c.so.29.0.0
(gdb) bt
#0  0x00007fffad92e0b3 in ?? () from /home/user/project/env/lib/python3.6/site-packages/rasterio/../rasterio.libs/libgdal-41ff680c.so.29.0.0
#1  0x00007fffad92e552 in ?? () from /home/user/project/env/lib/python3.6/site-packages/rasterio/../rasterio.libs/libgdal-41ff680c.so.29.0.0
#2  0x00007fffad92e5b1 in GDALProxyPoolDataset::~GDALProxyPoolDataset() ()
   from /home/user/project/env/lib/python3.6/site-packages/rasterio/../rasterio.libs/libgdal-41ff680c.so.29.0.0
#3  0x00007fffad92e679 in GDALProxyPoolDataset::~GDALProxyPoolDataset() ()
   from /home/user/project/env/lib/python3.6/site-packages/rasterio/../rasterio.libs/libgdal-41ff680c.so.29.0.0
#4  0x00007fffad8d0fcf in GDALDataset::ReleaseRef() () from /home/user/project/env/lib/python3.6/site-packages/rasterio/../rasterio.libs/libgdal-41ff680c.so.29.0.0
#5  0x00007fffad86448c in VRTSimpleSource::~VRTSimpleSource() () from /home/user/project/env/lib/python3.6/site-packages/rasterio/../rasterio.libs/libgdal-41ff680c.so.29.0.0
#6  0x00007fffad864569 in VRTComplexSource::~VRTComplexSource() () from /home/user/project/env/lib/python3.6/site-packages/rasterio/../rasterio.libs/libgdal-41ff680c.so.29.0.0
#7  0x00007fffad863e8a in VRTSourcedRasterBand::CloseDependentDatasets() ()
   from /home/user/project/env/lib/python3.6/site-packages/rasterio/../rasterio.libs/libgdal-41ff680c.so.29.0.0
#8  0x00007fffad863eeb in VRTSourcedRasterBand::~VRTSourcedRasterBand() ()
   from /home/user/project/env/lib/python3.6/site-packages/rasterio/../rasterio.libs/libgdal-41ff680c.so.29.0.0
#9  0x00007fffad863f59 in VRTSourcedRasterBand::~VRTSourcedRasterBand() ()
   from /home/user/project/env/lib/python3.6/site-packages/rasterio/../rasterio.libs/libgdal-41ff680c.so.29.0.0
#10 0x00007fffad8d67ef in GDALDataset::~GDALDataset() () from /home/user/project/env/lib/python3.6/site-packages/rasterio/../rasterio.libs/libgdal-41ff680c.so.29.0.0
#11 0x00007fffad8409f2 in VRTDataset::~VRTDataset() () from /home/user/project/env/lib/python3.6/site-packages/rasterio/../rasterio.libs/libgdal-41ff680c.so.29.0.0
#12 0x00007fffad840a99 in VRTDataset::~VRTDataset() () from /home/user/project/env/lib/python3.6/site-packages/rasterio/../rasterio.libs/libgdal-41ff680c.so.29.0.0
#13 0x00007fffad92dd11 in ?? () from /home/user/project/env/lib/python3.6/site-packages/rasterio/../rasterio.libs/libgdal-41ff680c.so.29.0.0
#14 0x00007fffad92e311 in ?? () from /home/user/project/env/lib/python3.6/site-packages/rasterio/../rasterio.libs/libgdal-41ff680c.so.29.0.0
#15 0x00007fffad92e62c in GDALProxyPoolDataset::~GDALProxyPoolDataset() ()
   from /home/user/project/env/lib/python3.6/site-packages/rasterio/../rasterio.libs/libgdal-41ff680c.so.29.0.0
#16 0x00007fffad92e679 in GDALProxyPoolDataset::~GDALProxyPoolDataset() ()
   from /home/user/project/env/lib/python3.6/site-packages/rasterio/../rasterio.libs/libgdal-41ff680c.so.29.0.0
#17 0x00007fffad8d0fcf in GDALDataset::ReleaseRef() () from /home/user/project/env/lib/python3.6/site-packages/rasterio/../rasterio.libs/libgdal-41ff680c.so.29.0.0
#18 0x00007fffad86448c in VRTSimpleSource::~VRTSimpleSource() () from /home/user/project/env/lib/python3.6/site-packages/rasterio/../rasterio.libs/libgdal-41ff680c.so.29.0.0
#19 0x00007fffad864569 in VRTComplexSource::~VRTComplexSource() () from /home/user/project/env/lib/python3.6/site-packages/rasterio/../rasterio.libs/libgdal-41ff680c.so.29.0.0
#20 0x00007fffad863e8a in VRTSourcedRasterBand::CloseDependentDatasets() ()
   from /home/user/project/env/lib/python3.6/site-packages/rasterio/../rasterio.libs/libgdal-41ff680c.so.29.0.0
#21 0x00007fffad863eeb in VRTSourcedRasterBand::~VRTSourcedRasterBand() ()
   from /home/user/project/env/lib/python3.6/site-packages/rasterio/../rasterio.libs/libgdal-41ff680c.so.29.0.0
#22 0x00007fffad863f59 in VRTSourcedRasterBand::~VRTSourcedRasterBand() ()
   from /home/user/project/env/lib/python3.6/site-packages/rasterio/../rasterio.libs/libgdal-41ff680c.so.29.0.0
#23 0x00007fffad8d67ef in GDALDataset::~GDALDataset() () from /home/user/project/env/lib/python3.6/site-packages/rasterio/../rasterio.libs/libgdal-41ff680c.so.29.0.0
#24 0x00007fffad8409f2 in VRTDataset::~VRTDataset() () from /home/user/project/env/lib/python3.6/site-packages/rasterio/../rasterio.libs/libgdal-41ff680c.so.29.0.0
#25 0x00007fffad840a99 in VRTDataset::~VRTDataset() () from /home/user/project/env/lib/python3.6/site-packages/rasterio/../rasterio.libs/libgdal-41ff680c.so.29.0.0
#26 0x00007fffaec0ac99 in ?? () from /home/user/project/env/lib/python3.6/site-packages/rasterio/_base.cpython-36m-x86_64-linux-gnu.so
#27 0x00007fffaec21216 in ?? () from /home/user/project/env/lib/python3.6/site-packages/rasterio/_base.cpython-36m-x86_64-linux-gnu.so
#28 0x00007fffaec2d5d0 in ?? () from /home/user/project/env/lib/python3.6/site-packages/rasterio/_base.cpython-36m-x86_64-linux-gnu.so
#29 0x0000000000566bbc in _PyCFunction_FastCallDict ()
#30 0x00000000005a4cd1 in _PyObject_FastCallDict ()
#31 0x00000000005a4fb8 in PyObject_CallFunctionObjArgs ()
#32 0x000000000050d698 in _PyEval_EvalFrameDefault ()
#33 0x00000000005095c8 in ?? ()
#34 0x000000000050a2fd in ?? ()
#35 0x000000000050beb4 in _PyEval_EvalFrameDefault ()
#36 0x00000000005095c8 in ?? ()
#37 0x000000000050a2fd in ?? ()
#38 0x000000000050beb4 in _PyEval_EvalFrameDefault ()
#39 0x0000000000507be4 in ?? ()
#40 0x0000000000588e5c in ?? ()
#41 0x000000000059fd0e in PyObject_Call ()
#42 0x000000000050d256 in _PyEval_EvalFrameDefault ()
#43 0x00000000005095c8 in ?? ()
#44 0x000000000050a2fd in ?? ()
#45 0x000000000050beb4 in _PyEval_EvalFrameDefault ()
#46 0x0000000000507be4 in ?? ()
#47 0x0000000000509900 in ?? ()
#48 0x000000000050a2fd in ?? ()
#49 0x000000000050beb4 in _PyEval_EvalFrameDefault ()
#50 0x0000000000507be4 in ?? ()
#51 0x0000000000509900 in ?? ()
#52 0x000000000050a2fd in ?? ()
#53 0x000000000050beb4 in _PyEval_EvalFrameDefault ()
#54 0x00000000005095c8 in ?? ()
#55 0x000000000050a2fd in ?? ()
#56 0x000000000050beb4 in _PyEval_EvalFrameDefault ()
#57 0x0000000000508cd5 in _PyFunction_FastCallDict ()
#58 0x0000000000594a01 in ?? ()
#59 0x000000000054a971 in ?? ()
#60 0x00000000005a9dac in _PyObject_FastCallKeywords ()
#61 0x000000000050a433 in ?? ()
#62 0x000000000050beb4 in _PyEval_EvalFrameDefault ()
#63 0x0000000000507be4 in ?? ()
#64 0x0000000000508ec2 in _PyFunction_FastCallDict ()
#65 0x0000000000594a01 in ?? ()
#66 0x000000000054a971 in ?? ()
#67 0x00000000005a9dac in _PyObject_FastCallKeywords ()
#68 0x000000000050a433 in ?? ()
#69 0x000000000050beb4 in _PyEval_EvalFrameDefault ()
#70 0x0000000000508cd5 in _PyFunction_FastCallDict ()
#71 0x0000000000594a01 in ?? ()
#72 0x000000000054a971 in ?? ()
#73 0x00000000005a9dac in _PyObject_FastCallKeywords ()
#74 0x000000000050a433 in ?? ()
#75 0x000000000050beb4 in _PyEval_EvalFrameDefault ()
#76 0x0000000000507be4 in ?? ()
#77 0x0000000000509900 in ?? ()
#78 0x000000000050a2fd in ?? ()
#79 0x000000000050cc96 in _PyEval_EvalFrameDefault ()
#80 0x00000000005095c8 in ?? ()
#81 0x000000000050a2fd in ?? ()
#82 0x000000000050beb4 in _PyEval_EvalFrameDefault ()
#83 0x00000000005095c8 in ?? ()
#84 0x000000000050a2fd in ?? ()
#85 0x000000000050beb4 in _PyEval_EvalFrameDefault ()
#86 0x00000000005095c8 in ?? ()
#87 0x000000000050a2fd in ?? ()
#88 0x000000000050beb4 in _PyEval_EvalFrameDefault ()
#89 0x00000000005095c8 in ?? ()
#90 0x000000000050a2fd in ?? ()
#91 0x000000000050beb4 in _PyEval_EvalFrameDefault ()
#92 0x0000000000507be4 in ?? ()
#93 0x0000000000508ec2 in _PyFunction_FastCallDict ()
#94 0x0000000000594a01 in ?? ()
#95 0x0000000000549e8f in ?? ()
#96 0x00000000005515c1 in ?? ()
#97 0x00000000005a48ec in _PyObject_FastCallDict ()
#98 0x00000000005f03bc in ?? ()
#99 0x00000000005a9dac in _PyObject_FastCallKeywords ()
#100 0x000000000050a433 in ?? ()
#101 0x000000000050beb4 in _PyEval_EvalFrameDefault ()
#102 0x00000000005095c8 in ?? ()
#103 0x000000000050a2fd in ?? ()
#104 0x000000000050beb4 in _PyEval_EvalFrameDefault ()
#105 0x0000000000508cd5 in _PyFunction_FastCallDict ()
#106 0x0000000000594a01 in ?? ()
#107 0x000000000059fd0e in PyObject_Call ()
#108 0x000000000050d256 in _PyEval_EvalFrameDefault ()
#109 0x00000000005095c8 in ?? ()
#110 0x000000000050a2fd in ?? ()
#111 0x000000000050beb4 in _PyEval_EvalFrameDefault ()
#112 0x00000000005095c8 in ?? ()
#113 0x000000000050a2fd in ?? ()
#114 0x000000000050beb4 in _PyEval_EvalFrameDefault ()
#115 0x0000000000508cd5 in _PyFunction_FastCallDict ()
#116 0x0000000000594a01 in ?? ()
#117 0x000000000059fd0e in PyObject_Call ()
#118 0x00000000005e12f2 in ?? ()
#119 0x0000000000631b94 in ?? ()
#120 0x00007ffff77ca6db in start_thread (arg=0x7fff9e5d8700) at pthread_create.c:463
#121 0x00007ffff7b0371f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95


Is there a location where the libgdal version for each resterio release is documented? I presume 1.2.4 uses a newer GDAl version.

Any tips for tracking down what the problem might be in rasterio/GDAL?

Thanks for any help,

Angus


Use GCPs to reproject a ndarray

Shaunak De
 

Hello All, 

I have GCPs associated with a GeoTIFF as follows:

GroundControlPoint(row=0.0, col=0.0, x=14.674978814299008, y=69.61799429448102, z=-3.898888826370239e-05, id='1', info=''),
 GroundControlPoint(row=0.0, col=299.55555555555554, x=14.689285403048489, y=69.61991548229533, z=-3.908667713403702e-05, id='2', info=''),

and the CRS associated is "EPSG: 4326". I want to write a NumPy array with dimensions (12120, 2696) as a geotiff. I was trying to use rasterio.warp.reproject to accomplish this as follows:

warp.reproject(source=numpy_array_with_daya, gcps=gcps, src_crs=gcp_crs, dst_crs=rio.crs.CRS.from_string('EPSG:4326'))

I run into this error:

AttributeError: 'NoneType' object has no attribute 'xoff'

I experimented a bit with the reproject function, but haven't been able to fully understand how to use it well. One of the things I tried was to specify "dst_transform" (computed from the calculate_default_transform function etc. and had a bit more luck producing an output at all, but I don't think I fully understand how this function works. Any help or pointers to documentation would be greatly appreciated.


Create a Dataset from an array

ayr035@...
 

Hi all,

I'm struggling to find a good way to create a dataset object from a numpy array. Does such a workflow exist in rasterio? I'm trying to avoid any unnecessary memory copies as this numpy array could be quite large and I definitely need to avoid writing to disk if at all possible.

I'm trying to do some pre-processing on rasters before a merge and need to wrap an array in a dataset-like object that supports resampling, windowed reading, etc for rasterio.merge. I saw an InMemoryRaster class which seems like it could work, but unfortunately, the read method of InMemoryRaster lacks API compatibility with Dataset.read. Is there a solution that is escaping me? In my ideal world, something like DatasetReader.from_array(arr, **arr_profile) where a dataset could be created directly from an array would be incredibly useful. Are there existing solutions to accomplish this


rio-mbtiles 1.6.0

Sean Gillies
 

Hi all,

Version 1.6.0 of rio-mbtiles is on PyPI now. There have been no changes since 1.6a2.

Share and enjoy,

--
Sean Gillies


rio-mbtiles 1.6a2

Sean Gillies
 

Hi all,

I've uploaded early pre-releases of rio-mbtiles 1.6 to PyPI. See https://github.com/mapbox/rio-mbtiles/blob/main/CHANGES.txt#L4 for detailed changes. In a nutshell, the new features are

* Support for 8-bit input and output
* Automatic alpha channel creation for RGB input when the --rgba option is used
* An option to keep empty tiles with no valid data

I'd love to hear reports about the pre-release.

--
Sean Gillies


Re: Creating WarpedVRT with multiple files

Sean Gillies
 

Hello,

On Thu, Jul 22, 2021 at 9:09 AM <remi.desgrange@...> wrote:
Hi,

It may be silly, but It does not seem possible to create a WarpedVRT object with multiple files ? Or is it ?

Thanks.

Rasterio's WarpedVRT class uses GDALCreateWarpedVRT (https://gdal.org/api/gdalwarp_cpp.html#_CPPv419GDALCreateWarpedVRT12GDALDatasetHiiPdP15GDALWarpOptions) and thus can wrap only a single dataset.

--
Sean Gillies


Creating WarpedVRT with multiple files

remi.desgrange@...
 

Hi,

It may be silly, but It does not seem possible to create a WarpedVRT object with multiple files ? Or is it ?

Thanks.

41 - 60 of 885