Re: Reading NetCDF file as inMemoryFile
Even Rouault
On Friday 20 December 2019 09:45:45 CET vincent.sarago@gmail.com wrote:
> Yes it is set to yes,

You were responding to "userfaultfd support: yes"? Hmm, then I'm not sure. Are you running in a container? Maybe there are some restrictions by default. Dunno. But you should see GDAL error messages if userfaultfd system calls fail at runtime. There are quite a lot of them in https://github.com/OSGeo/gdal/blob/master/gdal/port/cpl_userfaultfd.cpp

And fall back to the magic solution of open source projects (the reason why we all use open source, right ;-) ?): take your favorite debugger and break at https://github.com/OSGeo/gdal/blob/master/gdal/frmts/netcdf/netcdfdataset.cpp#L7274 and follow what happens then... It should normally go to the call to nc_open_mem()

--
Spatialys - Geospatial professional services
http://www.spatialys.com
|
|
Re: Reading NetCDF file as inMemoryFile
vincent.sarago@...
Yes it is set to yes,
here the full log https://gist.github.com/vincentsarago/36473e6322336e84cf928ef445db64cc
|
|
Re: Reading NetCDF file as inMemoryFile
vincent.sarago@...
Then I get "not recognized as a supported file format."
```
>>> f = open("/local/OR_ABI-L1b-RadF-M6C04_G16_s20193221600287_e20193221609595_c20193221610025.nc", "rb")
>>> with MemoryFile(f) as mem:
...     with mem.open(driver="netCDF") as mem_dst:
...         print(mem_dst.meta)
...
Traceback (most recent call last):
  File "rasterio/_base.pyx", line 216, in rasterio._base.DatasetBase.__init__
  File "rasterio/_shim.pyx", line 78, in rasterio._shim.open_dataset
  File "rasterio/_err.pyx", line 205, in rasterio._err.exc_wrap_pointer
rasterio._err.CPLE_OpenFailedError: '/vsimem/808d1a21-9154-43a9-a7a8-459294b33fe4.' not recognized as a supported file format.
```
Same when using
```
with open("/local/OR_ABI-L1b-RadF-M6C04_G16_s20193221600287_e20193221609595_c20193221610025.nc", "rb") as f:
    with rasterio.open(f, driver="netCDF") as src_dst:

Traceback (most recent call last):
  File "rasterio/_base.pyx", line 216, in rasterio._base.DatasetBase.__init__
  File "rasterio/_shim.pyx", line 78, in rasterio._shim.open_dataset
  File "rasterio/_err.pyx", line 205, in rasterio._err.exc_wrap_pointer
rasterio._err.CPLE_OpenFailedError: '/vsimem/b370432d-9a53-4c40-9315-bc9ad408589d.' not recognized as a supported file format.
```
|
|
Re: Reading NetCDF file as inMemoryFile
Even Rouault
On Friday 20 December 2019 07:22:11 CET vincent.sarago@gmail.com wrote:
> $ more /proc/version

Can you check for the following too?

userfaultfd support: yes

to check that you actually built against sufficiently recent kernel headers.

--
Spatialys - Geospatial professional services
http://www.spatialys.com
|
|
Re: Reading NetCDF file as inMemoryFile
Sean Gillies
Vincent,

Can you try naming the driver when you open the in-memory dataset? Something like

    with MemoryFile(...) as memfile:
        with memfile.open(driver="netCDF") as dataset:
            ...
On Fri, Dec 20, 2019 at 8:23 AM <vincent.sarago@...> wrote:
-- Sean Gillies
|
|
Re: Reading NetCDF file as inMemoryFile
Thanks for your answer Even, and be assured that I always read the manual before asking a question :-)
```
$ more /proc/version
Linux version 4.9.184-linuxkit (root@a8c33e955a82) (gcc version 8.3.0 (Alpine 8.3.0) ) #1 SMP Tue Jul 2 22:58:16 UTC 2019
...
# From gdal /configure
NetCDF support: yes
NetCDF has netcdf_mem.h: yes
```
...
# NetCDF install
ENV NETCDF_VERSION=4.6.3
# NetCDF
RUN mkdir /tmp/netcdf \
&& curl -sfL https://github.com/Unidata/netcdf-c/archive/v$NETCDF_VERSION.tar.gz | tar zxf - -C /tmp/netcdf --strip-components=1
RUN cd /tmp/netcdf \
&& CPPFLAGS="-I${PREFIX}/include" LDFLAGS="-L${PREFIX}/lib" \
./configure \
--with-default-chunk-size=67108864 \
--with-chunk-cache-size=67108864 \
--prefix=$PREFIX \
--disable-static \
--enable-netcdf4 \
--enable-dap \
--with-pic \
&& make -j $(nproc) --silent && make install && make clean \
&& rm -rf /tmp/netcdf
My configuration meets the current spec from the docs, and it still falls back to HDF5.
|
|
Re: Reading NetCDF file as inMemoryFile
Even Rouault
On Friday 20 December 2019 05:03:36 CET vincent.sarago@gmail.com wrote:
> I'm seeking some advice here,

RTF(antastic)M :-) https://gdal.org/drivers/raster/netcdf.html#vsi-virtual-file-system-api-support

"Since GDAL 2.4, and with Linux kernel >=4.3 and libnetcdf >=4.5, read operations on /vsi file systems are supported."

When building GDAL, you must see "NetCDF has netcdf_mem.h: yes" in the summary output of ./configure

If you don't meet those requirements, /vsimem/ access on netCDF4/HDF5 files will fall back to the HDF5 driver, which has proper support for pluggable I/O since libhdf5 allows it.

The netCDF library has no pluggable I/O layer, hence the GDAL support uses the netCDF in-memory file API combined with the Linux userfaultfd mechanism to populate the in-memory mapping with data. That said, for a file hosted in /vsimem/ (as opposed to /vsi network file systems), we could probably improve that to avoid any Linux specificities.

Even

--
Spatialys - Geospatial professional services
http://www.spatialys.com
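The kernel requirement quoted above can be checked from Python before attempting an in-memory open. This is only a minimal sketch, assuming the kernel release string starts with the usual `major.minor` form. The helper names `parse_kernel_release` and `kernel_meets_minimum` are hypothetical, and passing this check says nothing about whether GDAL was built with `netcdf_mem.h` or whether a container restricts userfaultfd.

```python
import platform
import re


def parse_kernel_release(release):
    """Return (major, minor) parsed from a release string such as '4.9.184-linuxkit'.

    Returns None when the string does not start with 'major.minor'.
    """
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def kernel_meets_minimum(release=None, minimum=(4, 3)):
    """Compare a kernel release against the >=4.3 requirement quoted from the GDAL docs."""
    if release is None:
        # On Linux, platform.release() reports the running kernel release.
        release = platform.release()
    version = parse_kernel_release(release)
    return version is not None and version >= minimum
```

With the kernel reported elsewhere in this thread, `kernel_meets_minimum("4.9.184-linuxkit")` is true, so the kernel version alone would not explain a fallback to HDF5.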
|
|
Reading NetCDF file as inMemoryFile
vincent.sarago@...
I'm seeking some advice here,
I'm trying to reduce the memory/disk usage of one of my scripts, where I translate a netCDF file to COG. Ideally I'd love not to save any file to disk. The problem I'm facing is that when opening the file from disk everything works fine and the file is recognized as `netCDF`, but when I load the same file in memory it is then recognized as HDF5...

```
import rasterio
from rasterio.io import MemoryFile

src_path = "OR_ABI-L1b-RadF-M6C04_G16_s20193221600287_e20193221609595_c20193221610025.nc"
with rasterio.open(src_path) as src_dst:
    print(src_dst.name)
    print(src_dst.meta)

OR_ABI-L1b-RadF-M6C04_G16_s20193221600287_e20193221609595_c20193221610025.nc
{'driver': 'netCDF', 'dtype': 'float_', 'nodata': None, 'width': 512, 'height': 512, 'count': 0, 'crs': None, 'transform': Affine(1.0, 0.0, 0.0,
       0.0, 1.0, 0.0)}

with open(src_path, "rb") as f:
    with rasterio.open(f) as src_dst:
        with src_dst.open() as mem:
            print(mem.name)
            print(mem.meta)

/vsimem/8ee5dd37-9f49-47c9-bce7-a6732c72d4b5.nc
{'driver': 'HDF5', 'dtype': 'float_', 'nodata': None, 'width': 512, 'height': 512, 'count': 0, 'crs': None, 'transform': Affine(1.0, 0.0, 0.0,
       0.0, 1.0, 0.0)}
```
The problem with HDF5 is that it seems to lose any geographical information and thus is not usable for the process I'm doing.

Thanks for your help
|
|
Re: Define invalid regions of VRT file in a way that allows windowing
Sean Gillies
Hi Sean, You're correct, mask() does require a dataset object (no numpy array) and will read the entire dataset. Can you use multiple, smaller, VRTs? Using a shapefile is fine. If you want to remove the dependence on fiona, you could serialize the shapes to JSON and use Python's json module to decode them.
On Tue, Dec 10, 2019 at 10:21 AM rasterio via Groups.Io <rasterio=attackllama.com@groups.io> wrote:
> Hi all,

--
Sean Gillies
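The JSON suggestion above could look roughly like this. It is a minimal sketch, assuming the shapes are GeoJSON-like geometry dicts (the form `rasterio.mask.mask` accepts). The helper names `shapes_to_json` and `shapes_from_json` are hypothetical, and the polygon is made up for illustration.

```python
import json


def shapes_to_json(shapes):
    """Serialize a list of GeoJSON-like geometry dicts to a JSON string."""
    return json.dumps(list(shapes))


def shapes_from_json(text):
    """Decode a JSON string back into a list of geometry dicts.

    Coordinate tuples come back as lists, which is how GeoJSON itself
    represents positions.
    """
    return json.loads(text)


# Illustrative polygon only; real shapes would be exported once from the shapefile.
example = [
    {
        "type": "Polygon",
        "coordinates": [[(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 0.0)]],
    }
]
restored = shapes_from_json(shapes_to_json(example))
```

The decoded list could then be passed wherever a fiona-derived `shapes` list was used, so fiona is only needed for the one-time export.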
|
|
Rasterio 1.1.2
Sean Gillies
Hi all,

An sdist and manylinux1 and macosx wheels for Python 2.7 and 3.5-3.8 are on PyPI now. Please note that these wheels contain the base PROJ datum grids (version 1.8) and are somewhat larger than previous wheels.

Sean Gillies
|
|
Re: exporting single banc with a colormap to RGB.tif
Sean Gillies
Hi, The library doesn't automatically convert single band color-mapped data to 3-band RGB data. You'll need to construct a new output array filled with values from the colormap and then write that to the output file. Numpy will help: make a new 2D array and replace index values with an [R, G, B] array from the colormap, then use numpy.moveaxis to turn this into a 3-band array that rasterio can write.
On Fri, Dec 6, 2019 at 7:25 PM Eyal Saiet <ejsaiet@...> wrote:
--
Sean Gillies
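The steps described above can be sketched as follows. This assumes the band holds unsigned 8-bit indexes and that the colormap is a dict mapping each index to an (R, G, B, A) tuple, which is the form `dataset.colormap(1)` returns. The function name `colormap_to_rgb` is hypothetical, and it uses a single lookup table rather than replacing values one at a time.

```python
import numpy as np


def colormap_to_rgb(indexed, colormap):
    """Expand a 2D array of colormap indexes into a (3, rows, cols) uint8 array."""
    # 256-entry lookup table; indexes missing from the colormap stay black.
    lut = np.zeros((256, 3), dtype=np.uint8)
    for index, rgba in colormap.items():
        lut[index] = rgba[:3]  # keep R, G, B and drop alpha
    rgb = lut[indexed]  # fancy indexing gives shape (rows, cols, 3)
    return np.moveaxis(rgb, -1, 0)  # band-first layout: (3, rows, cols)
```

The result is band-first, so it could be written in one call to a dataset opened with `count=3` and `dtype='uint8'`.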
|
|
Define invalid regions of VRT file in a way that allows windowing
rasterio@...
Hi all,
I'm using a VRT file to represent a bunch of individual map tiles as one big image (following the help of Dion Haefner via this list - thanks again!). I noticed that some of the data is invalid but not marked as such (via the invalid data value). I'd like to define these regions as invalid using a file (e.g. a shapefile) in a way that would allow me to call `dataset.read(..., masked=True)` and get a masked array that masks out both data that has the dataset's invalid data value, and data that I manually define to be invalid in my file.

I found the examples on the site using Fiona to load a shapefile (https://rasterio.readthedocs.io/en/stable/topics/masking-by-shapefile.html) and use it to mask some data. I figured that would be useful, so I built a shapefile with my invalid regions as polygons. I'm now having trouble reading from my VRT file with these shapes masked out. I've got the following code:

    import rasterio
    import rasterio.mask
    import fiona

    with rasterio.open(VRT_FILE) as dataset, fiona.open(SHAPE_FILE, "r") as shapefile:
        shapes = [feature["geometry"] for feature in shapefile]
        masked_data = rasterio.mask.mask(dataset, shapes, invert=True)

... but this throws the error `MemoryError: Unable to allocate array with shape (216640, 282250) and data type bool`. It seems the `rasterio.mask.mask` method tries to read the WHOLE dataset, and apparently has no way to define a window. Until now (before playing with shapefiles) I've been reading the dataset with `dataset.read(... window=Window(...))`, which lets me define a window to avoid reading the whole (huge) dataset at once. I'd like to do this for my shape-masked data too.

I have two questions:

- Is my approach to defining these invalid regions - i.e. with a shapefile - reasonable? Or is there a better way, ideally one that doesn't involve loading, manually masking and then writing a huge file back to disk?
- If my approach is indeed reasonable, then is there a way to create a `DatasetReader`-like object that supports `.read(...)` with a window parameter, using data masked from my shapefile?

Best wishes,
Sean
|
|
Re: rasterio vsicurl issues with docker-compose
Frankly, I always think giving more context will help, especially with the Python bindings logging GDAL errors to stderr, where you have to figure out how to do that. However, in my case I guess it was pretty much user error.
|
|
exporting single banc with a colormap to RGB.tif
Eyal Saiet
Hello,

I could not figure out how to export a single band with a colormap as an RGB.tif

A snippet of my code:

    with rasterio.open(new_f_cmp, 'w', **raster_cmp_p) as dst:
        dst.write(file_np_man_cs_int8.data, 1)
        dst.write_colormap(1, f_dic)
        cmap = dst.colormap(1)

--
Eyal Saiet
The mind is not a vessel to be filled, but a fire to be kindled. Plutarch
|
|
Including PROJ datum grids in rasterio wheels, yes or no?
Sean Gillies
Hi all,

I have built new wheels for rasterio 1.1.1 here which, for the first time, contain the base PROJ datum grids (version 1.8). This increases the size of the wheels by several MB and brings the installed size up to 52MB (on a Mac):

    $ du rasterio
    10720 rasterio/proj_data
    30092 rasterio/.dylibs
    164 rasterio/rio
    7444 rasterio/gdal_data
    52212 rasterio

My question to the user community is: would you rather have heavier wheels with the base datum grids or lighter wheels without? If consensus is for even more batteries included, I will give the new wheels a unique build tag, upload them to PyPI and then delete the existing 1.1.1 wheels. Some of you may notice warnings from your build tools about hash changes.

Sean Gillies
|
|
Re: rasterio vsicurl issues with docker-compose
Sean Gillies
Thank you for following up. Getting a resolution, when we can, is super important for a forum like this. In this case, the error rasterio._err.CPLE_OpenFailedError: '/vsicurl/http://oin-hotosm.s3.amazonaws.com/59c66c5223c8440011d7b1e4/0/7ad397c0-bba2-4f98-a08a-931ec3a6e943.tif' does not exist in the file system, and is not recognized as a supported dataset name.
On Thu, Dec 5, 2019 at 7:06 PM Madhav Desetty <madhav@...> wrote:
> I found my problem. I had extra double quote from a copy paste around for this env var CPL_VSIL_CURL_ALLOWED_EXTENSIONS=".TIF,.ovr,.jp2,.tif" which I was setting via .env file and referencing it in the docker-compose.yml file. Once I removed the quotes for the curl allowed extensions, it started working again.

--
Sean Gillies
|
|
Re: rasterio vsicurl issues with docker-compose
I found my problem. I had extra double quotes, left over from a copy-paste, around the value of this env var: CPL_VSIL_CURL_ALLOWED_EXTENSIONS=".TIF,.ovr,.jp2,.tif", which I was setting via a .env file and referencing in the docker-compose.yml file. Once I removed the quotes around the curl allowed extensions, it started working again.
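This kind of mistake is easy to miss, presumably because the quote characters end up as part of the value that GDAL reads. A small sketch for spotting it at startup follows. `find_quoted_env_values` is a hypothetical helper, not part of rasterio or GDAL.

```python
import os


def find_quoted_env_values(environ=None):
    """Return sorted names of variables whose value is wrapped in literal quotes."""
    if environ is None:
        environ = os.environ
    suspicious = []
    for name, value in environ.items():
        # A value that both starts and ends with the same quote character
        # most likely carried its quotes over from a config file.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            suspicious.append(name)
    return sorted(suspicious)
```

Calling it with no arguments inspects the current process environment, so a container entrypoint could log the returned names before opening any `/vsicurl/` dataset.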
|
|
New cp38 wheels for rasterio 1.1.1
Sean Gillies
Hi all, Manylinux1 and macosx wheels for Python 3.8 (tagged cp38) are on PyPI alongside the wheels for versions 3.7 and earlier that we uploaded on 2019-11-04. I hope you find them useful! -- Sean Gillies
|
|
Re: need help installing rasterio with WebP
tgertin@...
Thank you for looking into this. I am running these commands using the conda-forge channel:
```
conda create --name rasterio_w_webp_test1 python=3.7
conda config --add channels conda-forge
conda install gdal libgdal
conda install rasterio
pip install rio-cogeo
```
Then I am using the rio cogeo command that uses WebP:
```
rio cogeo create -p webp input.tif output.tif
```
but I get the following error:
```
  File "rasterio/shutil.pyx", line 139, in rasterio.shutil.copy
  File "rasterio/_err.pyx", line 205, in rasterio._err.exc_wrap_pointer
rasterio._err.CPLE_AppDefinedError: Cannot create TIFF file due to missing codec for WEBP.
```
so I think something didn't install correctly. I don't need to use conda, but it seems like a good option because it uses virtual environments.
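One way to check whether an installed GDAL build has the codec is to look at the compression values the GTiff driver advertises in its creation option list. This is a sketch under stated assumptions: the XML below is a shortened, made-up sample in the shape GDAL uses for `DMD_CREATIONOPTIONLIST`, and `advertised_compressions` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET


def advertised_compressions(creation_option_xml):
    """Return the set of COMPRESS values listed in a GTiff creation option list."""
    root = ET.fromstring(creation_option_xml)
    values = set()
    for option in root.iter("Option"):
        if option.get("name") == "COMPRESS":
            values.update(value.text for value in option.iter("Value"))
    return values


# Shortened, illustrative sample; a real list would come from
# gdal.GetDriverByName("GTiff").GetMetadataItem("DMD_CREATIONOPTIONLIST").
sample = (
    "<CreationOptionList>"
    "<Option name='COMPRESS' type='string-select'>"
    "<Value>NONE</Value><Value>LZW</Value><Value>DEFLATE</Value>"
    "</Option>"
    "<Option name='TILED' type='boolean'/>"
    "</CreationOptionList>"
)
has_webp = "WEBP" in advertised_compressions(sample)
```

If WEBP is absent from the real list, the error above is what one would expect, and the fix would have to come from a GDAL build that includes the codec.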
|
|
Re: need help installing rasterio with WebP
Howard Butler
Bitner recently added WebP to linux GDAL Conda Forge builds https://github.com/conda-forge/gdal-feedstock/pull/346 I have no idea when that will make it into Anaconda's channel though. Howard
|
|