Re: MemoryFile loses Profile information
ciaran.evans@...
Revised: Don't require the
For reference, doing:
results in a correctly geocoded GeoTiff. Appears it is an issue with
|
|
Re: Speed up reading rasters
Sean Gillies
Hi, On Mon, Apr 27, 2020 at 2:36 AM <carlogarro@...> wrote: Hello, thank you so much for your recommendation, it speed it up x5. Very useful. Now I am having a problem that i do not understand. I can't say for sure about the time differences because I don't know much about your data files or your computer. However, know this: GDAL's I/O system caches blocks of raster data in memory, the size of the cache is generally 5% of your computers memory, and windowed reads may or may not be served directly from the cache depending on their size and adjacency to previously read data. -- Sean Gillies
|
|
Re: MemoryFile loses Profile information
ciaran.evans@...
On further inspection, when I switched to saving my outputs as GeoTIFF, the resultant images are both geo-coded fine (Disk vs. In Memory to S3). I'm guessing something is going wrong when Any suggestions on next steps?
|
|
MemoryFile loses Profile information
ciaran.evans@...
Hi there,
I'm currently re-writing a process that was stacking .jp2 bands into one image and writing the resultant stack to disk. I'm moving this up to AWS so want to use MemoryFile's instead. My code looks something like: s3_r = boto3.resource('s3') If I download that image and open it in QGIS, the image has no geo-coding at all. If I process the same image, but write to disk like: profile = source_image.profile.copy() I get a geo-coded image and it appears where I'd expect.
|
|
Re: Speed up reading rasters
Carlos García Rodríguez
Hello, thank you so much for your recommendation, it speed it up x5. Very useful. Now I am having a problem that i do not understand.
I have the following script, where i access 10 random tiles of my raster. train_data is a vector [4822,2] of pixels position in the raster. for i in range(10): idx = np.random.randint(4822) x_idx = train_data[idx][1] y_idx = train_data[idx][0] window = Window(y_idx, x_idx, 224, 224) start_time = time.time() with rasterio.open('./sentinel2_tiled.tif') as src: sentinel2 = src.read(window=window) end_time = (time.time() - start_time) I do not understand why the times of loading a window are so different, as can be seen in the following image. Do you have some explanation? Thank you once more!
|
|
Re: Speed up reading rasters
Even Rouault
On dimanche 26 avril 2020 03:04:56 CEST carlogarro@... wrote: > Hello, I need to read many huge datasets and the speed time is very > important to avoid a bottleneck. I have to read a tiff file that has 20 > bands, and a window of 224,224. Now I am doing like this, and it takes > approx 0.8seconds. > > with rasterio.open('./sentinel.tif') as src: > sentinel1_1 = src.read(window=window) > > What I realized is that if I try to read only one of the bands the required > time is approx the same, but when reading a tiff of only one band the > amount of time is 10 times shorter. > > Can I do something to speed it up? Maybe read bands in parallel, I don't > really know.
If you have control on how the creation of the TIFF file, make sure it uses Band interleaving instead of Pixel interleaving
For example, with gdal_translate can be done with -co INTERLEAVE=BAND
If the file is not tiled, adding tiling might also help.
I see in https://rasterio.readthedocs.io/en/latest/topics/profiles.html a pure rasterio way of creating such file
-- Spatialys - Geospatial professional services http://www.spatialys.com
|
|
Speed up reading rasters
Carlos García Rodríguez
Hello, I need to read many huge datasets and the speed time is very important to avoid a bottleneck.
I have to read a tiff file that has 20 bands, and a window of 224,224. Now I am doing like this, and it takes approx 0.8seconds. with rasterio.open('./sentinel.tif') as src: sentinel1_1 = src.read(window=window) What I realized is that if I try to read only one of the bands the required time is approx the same, but when reading a tiff of only one band the amount of time is 10 times shorter. Can I do something to speed it up? Maybe read bands in parallel, I don't really know. I appreciate your help. Thank you.
|
|
Re: Why does shape not return the number of channels
Sean Gillies
Correct, the dataset's shape is not the same as the shape of dataset.read().
-- Sean Gillies
|
|
Re: Issue with bounds2raster (contextily) using rasterio
geoterraimage1@...
You can try the following: from rasterio.crs import CRS
raster = rio.open( path,
"w",
driver="GTiff",
height=h,
width=w,
count=b,
dtype=str(Z.dtype.name),
crs=CRS.from_user_input('EPSG:3857'),
transform=transform,
)
|
|
Re: CRS & EPSG issues
geoterraimage1@...
So my issue is resolved. The issue was caused by having a GDAL installation on my windows machine, as well as all the GDAL paths set up. This was conflicting with the GDAL setup in each conda environment. I uninstalled all GDAL core and bindings from Windows itself, and then only installed GDAL in the Anaconda environment. No system variables were set after the GDAL was installed in the environment.
|
|
Issue with bounds2raster (contextily) using rasterio
Tony
Hi,
I've been using bounds2raster in contextily to save some raster tiles from OSM, but my code has started throwing up this error. >8--------------------------------------------------------
File "rasterio\_crs.pyx", line 327, in rasterio._crs._CRS.from_user_input CRSError: The WKT could not be parsed. OGR Error code 6
>8-------------------------------------------------------- It worked before, so I am not sure what has changed. I've looked around online for similar problems and haven't found anything. I am using an Anaconda installation on Windows and have rasterio 1.0.21 and contextily 1.0rc2. Contexily calls rasterio with the following in the tile.py file: >8--------------------------------------------------------
raster = rio.open( path,
"w",
driver="GTiff",
height=h,
width=w,
count=b,
dtype=str(Z.dtype.name),
crs="epsg:3857",
transform=transform,
)
>8--------------------------------------------------------The full stack trace is: >8--------------------------------------------------------
img, ext = ctx.bounds2raster(w, s, e, n, 'test.tif', ll=True) File "C:\ProgramData\Anaconda3\lib\site-packages\contextily\tile.py", line 117, in bounds2raster
transform=transform,
File "C:\ProgramData\Anaconda3\lib\site-packages\rasterio\env.py", line 423, in wrapper
return f(*args, **kwds)
File "C:\ProgramData\Anaconda3\lib\site-packages\rasterio\__init__.py", line 225, in open
**kwargs)
File "rasterio\_io.pyx", line 1182, in rasterio._io.DatasetWriterBase.__init__
File "rasterio\_io.pyx", line 1222, in rasterio._io.DatasetWriterBase._set_crs
File "C:\ProgramData\Anaconda3\lib\site-packages\rasterio\crs.py", line 434, in from_user_input
obj._crs = _CRS.from_user_input(value, morph_from_esri_dialect=morph_from_esri_dialect)
File "rasterio\_crs.pyx", line 327, in rasterio._crs._CRS.from_user_input
CRSError: The WKT could not be parsed. OGR Error code 6
>8--------------------------------------------------------
Do you have any ideas for me to try?Thanks in advance, Tony
|
|
Re: CRS & EPSG issues
geoterraimage1@...
On my windows PC, I am running Anaconda to handle my environments. My guess is that the issue resides in the GDAL installation on windows, that was performed prior to installing anaconda. In terms of Conda environments and GDAL/Rasterio, should we treat each environment with its own GDAL installation and own routes to GDAL_DATA path's? I appreciate the assistance, but how can I ensure correct environment mappings of PROJ to the proj.db that up to date with the correct db schema in the proj.db file? Is there a good rule of thumb to not stuff this up in future?
|
|
Re: CRS & EPSG issues
Even Rouault
> ERROR 1: PROJ: proj_create_from_database: cannot build geodeticCRS 4326: > SQLite error on SELECT name, ellipsoid_auth_name, ellipsoid_code, > prime_meridian_auth_name, prime_meridian_code, area_of_use_auth_name, > area_of_use_code, publication_date, deprecated FROM geodetic_datum WHERE > auth_name = ? AND code = ?: no such column: publication_date Traceback
Smells like you have PROJ 6.x.y version pointing to the proj.db of another PROJ 6.z.t version, since there were a few database schema changes. Make sure your PROJ_LIB is correctly set.
-- Spatialys - Geospatial professional services http://www.spatialys.com
|
|
CRS & EPSG issues
geoterraimage1@...
(flask-cog-terracotta) PS E:\00_UBUNTU_DRIVE\WATER_MONITORING_APPLICATION> python
Python 3.7.6 | packaged by conda-forge | (default, Mar 23 2020, 22:22:21) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import rasterio
>>> rasterio.__version__
'1.1.3'
>>> from rasterio.crs import CRS
>>> crs = CRS.from_user_input("EPSG:4326")
ERROR 1: PROJ: proj_create_from_database: cannot build geodeticCRS 4326: SQLite error on SELECT name, ellipsoid_auth_name, ellipsoid_code, prime_meridian_auth_name, prime_meridian_code, area_of_use_auth_name, area_of_use_code, publication_date, deprecated FROM geodetic_datum WHERE auth_name = ? AND code = ?: no such column: publication_date
Traceback (most recent call last):
File "rasterio\_crs.pyx", line 363, in rasterio._crs._CRS.from_user_input
File "rasterio\_err.pyx", line 194, in rasterio._err.exc_wrap_ogrerr
rasterio._err.CPLE_BaseError: OGR Error code 6
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Users\jens.hiestermann\.conda\envs\flask-cog-terracotta\lib\site-packages\rasterio\crs.py", line 459, in from_user_input
obj._crs = _CRS.from_user_input(value, morph_from_esri_dialect=morph_from_esri_dialect)
File "rasterio\_crs.pyx", line 367, in rasterio._crs._CRS.from_user_input
rasterio.errors.CRSError: The WKT could not be parsed. OGR Error code 6
>>> crs = CRS.from_user_input("+init=epsg:4326")
Warning 1: +init=epsg:XXXX syntax is deprecated. It might return a CRS with a non-EPSG compliant axis order.
>>>
Please assist. I am having trouble trying to locate the cause of the issue... is GDAL, PROJ, Rasterio, or the combination of them in an Anaconda environment. I am consistently plagued with this error msg: ERROR 1: PROJ: proj_create_from_database: cannot build geodeticCRS 4326: SQLite error on SELECT name, ellipsoid_auth_name, ellipsoid_code, prime_meridian_auth_name, prime_meridian_code, area_of_use_auth_name, area_of_use_code, publication_date, deprecated FROM geodetic_datum WHERE auth_name = ? AND code = ?: no such column: publication_date
I have set a system variable to find the projlib proj.db in the GDAL folder but still running into the same error.
|
|
rio.plot.show with colorbar?
himat15@...
How can I add a colorbar after using rio.plot.show?
I've tried a bunch of things but have gotten various errors Here's one way I tried: fig, ax = plt.subplots(figsize = (16, 16))
retted = rio.plot.show(ds, ax=ax, cmap='Greys_r')
fig.colorbar(retted, ax=ax)
plt.title("Original")
plt.show() This has error: AttributeError: 'AxesSubplot' object has no attribute 'get_array'
|
|
Why does shape not return the number of channels
himat15@...
I was pretty surprised when I realized that my bug was due to ds.shape not showing the number of channels in its shape i.e. with rio.open(f, 'r') as ds:
ds_arr = ds.read()
print(ds.profile) # {'driver': 'GTiff', 'dtype': 'float32', 'nodata': 0.0, 'width': 890, 'height': 3080, 'count': 3, 'crs': CRS.from_epsg(32629), 'transform': Affine(15.0, 0.0, 232060.0,
0.0, -15.0, 1438430.0), 'tiled': False, 'interleave': 'pixel'} print(ds.shape) # (3080, 890)
print(ds_arr.shape) # (3, 3080, 890) I assume this is the expected behavior, but was just taken aback by this.
|
|
Re: Inverted axis in numpy
Armstrong Manuvakola Ezequias Ngolo
Hi Gabriel,
I am not sure if I understood what you meant!!
Were you expecting to have the width in the first axis?
I think this behaviour (height first) is normal in rasterio and in many libraries used to manipulate images!!
But if you really wanna change the axis you can use something like numpy.transpose (recommended) or numpy.rollaxis
Good luck!!!
De: main@rasterio.groups.io <main@rasterio.groups.io> em nome de gabriel@... <gabriel@...>
Enviado: 11 de abril de 2020 22:11 Para: main@rasterio.groups.io <main@rasterio.groups.io> Assunto: [rasterio] Inverted axis in numpy Hi everyone. I started using rasterio a few days ago to read some DSM/DTM and process them using numba.
|
|
Inverted axis in numpy
gabriel@...
Hi everyone. I started using rasterio a few days ago to read some DSM/DTM and process them using numba.
|
|
Re: Asyncio + Rasterio for slow network requests?
Dion Häfner
Hey Sean,
toggle quoted messageShow quoted text
Sorry, I should have been clearer. As it stands, my statement is false: GDAL is of course designed to be thread-safe, so doing concurrent reads in different threads *should* work. But in our experience, it doesn't, to the point that we have given up on threads entirely. Relevant issues from last year: https://github.com/mapbox/rasterio/issues/1686 https://github.com/OSGeo/gdal/issues/1960 https://github.com/OSGeo/gdal/issues/1244 Even though GDAL#1244 was closed as fixed, we still observed the problem, so I suspect there is another race condition somewhere within GDAL. Anyway, this wasn't meant as a general statement, just a personal word of advice. To me, multiprocessing seems like a saner alternative at the moment, but YMMV. Best, Dion
On 30/03/2020 23.38, Sean Gillies via Groups.Io wrote:
Hi Kyle, Dion: --
Dion Häfner PhD Student Niels Bohr Institute Physics of Ice, Climate and Earth University of Copenhagen Tagensvej 16, DK-2200 Copenhagen, DENMARK _.~"~._.~"~._.~"~._.~"~._.~"~._.~"~._.~"~._
|
|
Re: Asyncio + Rasterio for slow network requests?
Sean Gillies
Hi Kyle, Dion: On Mon, Mar 30, 2020 at 1:41 PM <kylebarron2@...> wrote: Sorry for the slow response. As Vincent noted, just moving back to GDAL 2.4 made the process ~8x faster, from 1.7s to read to ~200ms to read each source tile. Thank you for the details.
Dion, can you say a little more about reads not being thread-safe? It's intended that we can call GDAL's RasterIO functions in different threads concurrently as long as we don't share dataset handles between threads. If we observe otherwise, then there is a GDAL bug that we can fix. There is an additional consideration for VRTs explained in https://gdal.org/drivers/raster/vrt.html#multi-threading-issues. If we have multiple VRTs, used in different threads, pointing to the same URLs, we need to take an extra step to prevent GDAL from accidentally sharing those non-VRT dataset handles between the threads. Sean Gillies
|
|