Re: Rasterio.features.geometry_window error when geojson has GeometryCollection
filthyhands@...
Thanks again Brendan, very helpful.
|
|
Re: Rasterio.features.geometry_window error when geojson has GeometryCollection
Brendan Ward
The target release date that will include this fix is 5/1/2020 (https://github.com/mapbox/rasterio/issues/1911)
If you need changes sooner, you have a few options: 1) clone the repo and checkout the maint-1.1 branch, and rebuild the Cython files. This can be a bit involved depending on your OS / env; you'll need a compiler, GDAL, Cython, etc. (https://github.com/mapbox/rasterio/blob/master/CONTRIBUTING.rst#development-environment) 2) coerce each GeometryCollection to FeatureCollection 3) use shapely to union each GeometryCollection into a single geometry object before passing into bounds() 4) calculate bounds on each individual geometry within a GeometryCollection, then union the bounds. Here's how it was done in the Cython file (https://github.com/mapbox/rasterio/pull/1915/files#diff-83bc1acea1f8d435fd6c4798a58fc072R411-R426); you can do the same thing on your end.
|
|
Re: MemoryFile loses Profile information
ciaran.evans@...
I think the problem is that I'm trying to do the conversion in memory, not on disk. If I've read your latest reply correctly, I'd convert to JPEG2000 on disk and then copy to S3? Can I do the conversion via rasterio.shutil.copy in memory?
|
|
Re: Rasterio.features.geometry_window error when geojson has GeometryCollection
filthyhands@...
How can I patch my install of Rasterio to incorporate your changes?
|
|
Re: MemoryFile loses Profile information
Sean Gillies
Hi, On Wed, Apr 29, 2020 at 9:40 AM Guillaume Lostis <g.lostis@...> wrote:
I've only now remembered (not being a regular jp2 user) that JPEG2000 is a create-copy format (see https://gdal.org/drivers/raster/index.html) and as such is not suited for uses cases like with rasterio.open("file.jp2", "w", driver="JP2OpenJPEG", ...) as dataset: dataset.write(data) Rasterio tries to abstract over the differences between "create" and "create-copy" formats, but might falter in some cases. We're testing JPEG and PNG in the test suite, but not JPEG2000. If you're constructing a dataset from the bands of multiple other datasets, you should probably be using GeoTIFF as a format, it's well suited for this. And then convert to JPEG2000 before uploading to S3 using rasterio.shutil.copy, which calls on GDAL's GDALCreateCopy and is mostly equivalent to gdal_translate. Sean Gillies
|
|
Re: Rasterio.features.geometry_window error when geojson has GeometryCollection
filthyhands@...
That's great Brendan, much appreciated.
|
|
Re: MemoryFile loses Profile information
Even Rouault
On mercredi 29 avril 2020 08:40:07 CEST Guillaume Lostis wrote: > Like Ciaran I'm not very familiar with GDAL's python bindings, but I've > tried running a little snippet which I believe does the same thing as what > was done with `rasterio`: it writes a JP2 to a `vsimem` in-memory file, and > then writes the bytes to a file on disk. > > ```python > from osgeo import gdal > > mem, out = "/vsimem/B01.jp2", "new_B01.jp2" > gdal.Translate(mem, "B01.jp2") > > # Taken from > https://lists.osgeo.org/pipermail/gdal-dev/2016-August/045030.html f = > gdal.VSIFOpenL(mem, "rb") > gdal.VSIFSeekL(f, 0, 2) # seek to end > size = gdal.VSIFTellL(f) > gdal.VSIFSeekL(f, 0, 0) # seek to beginning > data = gdal.VSIFReadL(1, size, f) > gdal.Unlink(mem) > > with open(out, "wb") as ff: > ff.write(data) > ``` > > When I look at the `gdalinfo` of `new_B01.jp2`, it has a CRS and a > transform, so I am not able to reproduce the behavior we have with > `rasterio`. > > (Side note: Interestingly enough, the pixel values of `new_B01.jp2` have > slightly changed with respect to the original `B01.jp2` file, so there is > some data loss somewhere in the process. But maybe that is expected of the > JP2 format and could be avoided by passing extra arguments to > `gdal.Translate`?
By default, the JP2OpenJPEG driver uses a lossy compression (my personal take on JPEG2000 is that it has very limited interested when used in its lossless profile. Better use more conventional algorithms that are way faster). See https://gdal.org/drivers/raster/jp2openjpeg.html#lossless-compression for options to pass to get a lossless JPEG2000 as output.
So all in all that should be something like
gdal.Translate(mem, "B01.jp2", format = "JP2OpenJPEG", # in case you have several JP2K drivers available options = ["REVERSIBLE=YES", "QUALITY=100"])
-- Spatialys - Geospatial professional services http://www.spatialys.com
|
|
Re: Rasterio.features.geometry_window error when geojson has GeometryCollection
Brendan Ward
This looks like an oversight when we implemented bounds().
Rasterio currently supports FeatureCollection or individual geometry objects in this operation. Logged here: https://github.com/mapbox/rasterio/issues/1914 Working on a quick fix now...
|
|
Re: MemoryFile loses Profile information
Guillaume Lostis
Like Ciaran I'm not very familiar with GDAL's python bindings, but I've tried running a little snippet which I believe does the same thing as what was done with
When I look at the (Side note: Interestingly enough, the pixel values of Guillaume
|
|
Re: MemoryFile loses Profile information
ciaran.evans@...
Would you say it's worth raising an issue on GDAL then? I'm also not familiar with the Python GDAL bindings, Rasterio is my window in that haha.
|
|
Re: Read using multithreading
Sean Gillies
Hi, On Wed, Apr 29, 2020 at 2:54 AM Carlos García Rodríguez <carlogarro@...> wrote:
Dataset files can be accessed from multiple threads, but the dataset objects returned by rasterio.open can only be used by a single thread at a time. This is a constraint that we get from GDAL. You could try creating a set of dataset objects and then distribute them to the worker threads, making sure not to share them. Sean Gillies
|
|
Re: MemoryFile loses Profile information
Sean Gillies
On Wed, Apr 29, 2020 at 1:38 AM <ciaran.evans@...> wrote:
Since there's no JP2 specific code in MemoryFile and things work with the GeoTIFF driver, I suspect there's a subtle bug involving the vsimem system and one of the JP2 drivers. It would be useful to try to reproduce this with GDAL's Python bindings, but I don't have them installed and won't be able to try that. Sean Gillies
|
|
Rasterio.features.geometry_window error when geojson has GeometryCollection
filthyhands@...
I'm attempting to isolate areas in a rasterio raster object, using Fiona to open a geojson and then window the parts I want to look at. Where a geojson entry is a geometry collection rasterio.features.geometry_window throws a key error. It's expecting each entry in the geojson to hold a 'coordinates' key but in this geojson the 'coordinates' are inside a list a level below 'geometryCollection' A workaround could be to dismantle the geometry collections into separate objects. It would work but it's not ideal as the collections are there to group various styles of object together.
Linestring: [{'type': 'LineString', 'coordinates': [(144.937247997, -26.3759745069999, 0.0), Geometry Collection: {'type': 'GeometryCollection',
'geometries': [{'type': 'LineString',
'coordinates': [(145.619783254, -38.5831492689999, 0.0),
--------------------------------------------------------------------------- KeyError Traceback (most recent call last) <ipython-input-183-b9dff7a7db24> in <module> 10 11 # isolate a pipeline area ---> 12 pipes = rasterio.features.geometry_window(win, shapes, pad_x = 0.001, pad_y=0.001) 13 14 # print(shapes) ~\Anaconda3\envs\sat\lib\site-packages\rasterio\features.py in geometry_window(dataset, shapes, pad_x, pad_y, north_up, rotated, pixel_precision) 419 420 if not rotated: --> 421 all_bounds = [bounds(shape, north_up=north_up) for shape in shapes] 422 lefts, bottoms, rights, tops = zip(*all_bounds) 423 ~\Anaconda3\envs\sat\lib\site-packages\rasterio\features.py in <listcomp>(.0) 419 420 if not rotated: --> 421 all_bounds = [bounds(shape, north_up=north_up) for shape in shapes] 422 lefts, bottoms, rights, tops = zip(*all_bounds) 423 ~\Anaconda3\envs\sat\lib\site-packages\rasterio\features.py in bounds(geometry, north_up, transform) 374 375 geom = geometry.get('geometry') or geometry --> 376 return _bounds(geom, north_up=north_up, transform=transform) 377 378 rasterio\_features.pyx in rasterio._features._bounds() KeyError: 'coordinates'
|
|
Read using multithreading
Carlos García Rodríguez
Hello, I have an application that calls rasterio to open rasters in its main process. This application allows me to parallelize the process, so it will call the function where rasterio open is many times in parallel.
|
|
Re: Speed up reading rasters
Carlos García Rodríguez
I have already fixed it. One of the main problems was that the tiled tiff was configured with size 256 and I was reading with windows of sizes 224. From here comes the disparity of the times of reading.
Also, I found faster to read pixel-based instead of band based. But it depends on the application. Thank you all for your help!
|
|
Re: MemoryFile loses Profile information
ciaran.evans@...
As with Guillaume's answer, it's a Sentinel 2 band and has no side-car file, when I write straight to a file, the georeferencing is maintained.
|
|
Re: MemoryFile loses Profile information
Guillaume Lostis
Hi, I guess the B01.jp2 file mentioned is a Sentinel-2 band, which is standalone, there is no side-car file attached to it. I have tried downloading a single B01.jp2 file from a SAFE and running Vincent Sarago's snippet on it and I get the same result, the CRS and transform are lost when the jp2 is written through a MemoryFile. Guillaume Lostis
|
|
Re: MemoryFile loses Profile information
Sean Gillies
Hi, On Tue, Apr 28, 2020 at 8:02 AM <ciaran.evans@...> wrote:
To confirm: you're right, there's no need to seek after writing using mem.write() because the dataset API doesn't change the MemoryFile's stream position. It remains at 0, the beginning of the file. Is it possible that your B01.jp2 has auxiliary files? If so, they can be lost because MemoryFile.read() only returns bytes from the primary file and will overlook auxiliaries. For example, see the code below using rasterio's test data. $ rio insp tests/data/RGB.byte.tif >>> from rasterio.io import MemoryFile >>> profile = src.profile >>> del profile["tiled"] >>> del profile["interleave"] >>> profile["driver"] = "PNG" >>> with MemoryFile(filename="lolwut.png") as memfile: ... with memfile.open(**profile) as dataset_1: ... dataset_1.write(src.read()) ... print(dataset_1.files) ... print(dataset_1.profile) ... with memfile.open() as dataset_2: ... print(dataset_2.files) ... print(dataset_2.profile) ... with open("/tmp/lolwut.png", "wb") as f: ... f.write(memfile.read()) ... with rasterio.open("/tmp/lolwut.png") as dataset_3: ... print(dataset_3.files) ... print(dataset_3.profile) ... [] {'driver': 'PNG', 'dtype': 'uint8', 'nodata': 0.0, 'width': 791, 'height': 718, 'count': 3, 'crs': CRS.from_epsg(32618), 'transform': Affine(300.0379266750948, 0.0, 101985.0, 0.0, -300.041782729805, 2826915.0), 'tiled': False} ['/vsimem/lolwut.png', '/vsimem/lolwut.png.aux.xml'] {'driver': 'PNG', 'dtype': 'uint8', 'nodata': 0.0, 'width': 791, 'height': 718, 'count': 3, 'crs': CRS.from_epsg(32618), 'transform': Affine(300.0379266750948, 0.0, 101985.0, 0.0, -300.041782729805, 2826915.0), 'tiled': False, 'interleave': 'pixel'} 775558 ['/tmp/lolwut.png'] {'driver': 'PNG', 'dtype': 'uint8', 'nodata': 0.0, 'width': 791, 'height': 718, 'count': 3, 'crs': None, 'transform': Affine(1.0, 0.0, 0.0, 0.0, 1.0, 0.0), 'tiled': False, 'interleave': 'pixel'} dataset_1.files is an empty list because no files are written to the /vsimem virtual filesystem until dataset_1 is closed. dataset_2.files show two files: the primary /vsimem/lolwut.png file and its auxiliary '/vsimem/lolwut.png.aux.xml' file, containing the georeferencing information. dataset_3.files shows only one file, the auxiliary file has been lost by memfile.read(). MemoryFile isn't any good for multiple-file formats or multiple-file format variants. It's fine for a profile of GeoTIFFs that keep their georeferencing and masks and overviews in a single file. Sean Gillies
|
|
Re: MemoryFile loses Profile information
ciaran.evans@...
I tried this with GDAL 2.4.2 and GDAL 3.0 too If anyone can point me to where I might look further to diagnose this, I can create an issue with some useful info, whether that's in Rasterio/GDAL :)
|
|
Re: Speed up reading rasters
Carlos García Rodríguez
I would also like to add that the first tiled read is not necessarily slow...
El lun., 27 abr. 2020 20:06, Carlos García Rodríguez via groups.io <carlogarro=gmail.com@groups.io> escribió:
|
|