
Re: Reading NetCDF file as inMemoryFile

Even Rouault
 

On Friday 20 December 2019 09:45:45 CET vincent.sarago@gmail.com wrote:
> Yes it is set to yes,

You were responding to "userfaultfd support: yes" ?

Hum, then I'm not sure. Are you running in a container ? Maybe there are
some restrictions by default. Dunno. But you should see GDAL error messages
if userfaultfd system calls fail at runtime. There are quite a lot of them in
https://github.com/OSGeo/gdal/blob/master/gdal/port/cpl_userfaultfd.cpp

And fall back to the magic solution of open source projects (the reason why
we all use open source, right ;-) ?): take your favorite debugger and break at
https://github.com/OSGeo/gdal/blob/master/gdal/frmts/netcdf/netcdfdataset.cpp#L7274
and follow what happens then...
It should normally go to the call to nc_open_mem()

--
Spatialys - Geospatial professional services
http://www.spatialys.com


Re: Reading NetCDF file as inMemoryFile

vincent.sarago@...
 

Yes it is set to yes,

Here is the full log: https://gist.github.com/vincentsarago/36473e6322336e84cf928ef445db64cc


Re: Reading NetCDF file as inMemoryFile

vincent.sarago@...
 

Then I get "not recognized as a supported file format."
```
>>> f = open("/local/OR_ABI-L1b-RadF-M6C04_G16_s20193221600287_e20193221609595_c20193221610025.nc", "rb")
>>> with MemoryFile(f) as mem:
...     with mem.open(driver="netCDF") as mem_dst:
...             print(mem_dst.meta)
...
Traceback (most recent call last):
  File "rasterio/_base.pyx", line 216, in rasterio._base.DatasetBase.__init__
  File "rasterio/_shim.pyx", line 78, in rasterio._shim.open_dataset
  File "rasterio/_err.pyx", line 205, in rasterio._err.exc_wrap_pointer
rasterio._err.CPLE_OpenFailedError: '/vsimem/808d1a21-9154-43a9-a7a8-459294b33fe4.' not recognized as a supported file format.
```

same when using

```
with open("/local/OR_ABI-L1b-RadF-M6C04_G16_s20193221600287_e20193221609595_c20193221610025.nc", "rb") as f:
     with rasterio.open(f, driver="netCDF") as src_dst:

Traceback (most recent call last):
  File "rasterio/_base.pyx", line 216, in rasterio._base.DatasetBase.__init__
  File "rasterio/_shim.pyx", line 78, in rasterio._shim.open_dataset
  File "rasterio/_err.pyx", line 205, in rasterio._err.exc_wrap_pointer
rasterio._err.CPLE_OpenFailedError: '/vsimem/b370432d-9a53-4c40-9315-bc9ad408589d.' not recognized as a supported file format.
```


Re: Reading NetCDF file as inMemoryFile

Even Rouault
 

On Friday 20 December 2019 07:22:11 CET vincent.sarago@gmail.com wrote:
> $ more /proc/version
> Linux version 4.9.184-linuxkit (root@a8c33e955a82) (gcc version 8.3.0
> (Alpine 8.3.0) ) #1 SMP Tue Jul 2 22:58:16 UTC 2019 ...
> # From gdal /configure
> NetCDF support: yes
> NetCDF has netcdf_mem.h: yes

Can you check for the following too ?
userfaultfd support: yes
to check that you actually built against sufficiently recent kernel headers.


--
Spatialys - Geospatial professional services
http://www.spatialys.com


Re: Reading NetCDF file as inMemoryFile

Sean Gillies
 

Vincent,

Can you try naming the driver when you open the in-memory dataset? Something like

with MemoryFile(...) as memfile:
    with memfile.open(driver="netCDF") as dataset:
        ....


On Fri, Dec 20, 2019 at 8:23 AM <vincent.sarago@...> wrote:

[Edited Message Follows]

Thanks for your answer Even, and be assured that I always read the manual before asking a question :-)


```
$ more /proc/version
Linux version 4.9.184-linuxkit (root@a8c33e955a82) (gcc version 8.3.0 (Alpine 8.3.0) ) #1 SMP Tue Jul 2 22:58:16 UTC 2019
...
# From gdal /configure
  NetCDF support:            yes
  NetCDF has netcdf_mem.h:   yes
...
# NetCDF install
ENV NETCDF_VERSION=4.6.3

# NetCDF
RUN mkdir /tmp/netcdf \
&& curl -sfL https://github.com/Unidata/netcdf-c/archive/v$NETCDF_VERSION.tar.gz | tar zxf - -C /tmp/netcdf --strip-components=1

RUN cd /tmp/netcdf \
&& CPPFLAGS="-I${PREFIX}/include" LDFLAGS="-L${PREFIX}/lib" \
./configure \
--with-default-chunk-size=67108864 \
--with-chunk-cache-size=67108864 \
--prefix=$PREFIX \
--disable-static \
--enable-netcdf4 \
--enable-dap \
--with-pic \
&& make -j $(nproc) --silent && make install && make clean \
&& rm -rf /tmp/netcdf
```

My configuration meets the current requirements from the docs, but it still falls back to HDF5.



--
Sean Gillies


Re: Reading NetCDF file as inMemoryFile

vincent.sarago@...
 
Edited

Thanks for your answer Even, and be assured that I always read the manual before asking a question :-)


```
$ more /proc/version
Linux version 4.9.184-linuxkit (root@a8c33e955a82) (gcc version 8.3.0 (Alpine 8.3.0) ) #1 SMP Tue Jul 2 22:58:16 UTC 2019
...
# From gdal /configure
  NetCDF support:            yes
  NetCDF has netcdf_mem.h:   yes
...
# NetCDF install
ENV NETCDF_VERSION=4.6.3

# NetCDF
RUN mkdir /tmp/netcdf \
&& curl -sfL https://github.com/Unidata/netcdf-c/archive/v$NETCDF_VERSION.tar.gz | tar zxf - -C /tmp/netcdf --strip-components=1

RUN cd /tmp/netcdf \
&& CPPFLAGS="-I${PREFIX}/include" LDFLAGS="-L${PREFIX}/lib" \
./configure \
--with-default-chunk-size=67108864 \
--with-chunk-cache-size=67108864 \
--prefix=$PREFIX \
--disable-static \
--enable-netcdf4 \
--enable-dap \
--with-pic \
&& make -j $(nproc) --silent && make install && make clean \
&& rm -rf /tmp/netcdf
```

My configuration meets the current requirements from the docs, but it still falls back to HDF5.


Re: Reading NetCDF file as inMemoryFile

Even Rouault
 

On Friday 20 December 2019 05:03:36 CET vincent.sarago@gmail.com wrote:
> I'm seeking some advice here,
>
> I'm trying to reduce the memory/disk usage of one of my scripts, where I try
> to translate a netCDF file to COG. Ideally I'd love not to save any file to
> disk.
>
> The problem I'm facing is that when opening the file from disk everything works
> fine and the file is recognized as `netCDF`, but when I load the same file in
> memory it is then recognized as HDF5...

RTF(antastic)M :-)

https://gdal.org/drivers/raster/netcdf.html#vsi-virtual-file-system-api-support

"Since GDAL 2.4, and with Linux kernel >=4.3 and libnetcdf >=4.5, read operations on /vsi file systems are supported."

When building GDAL, you must see "NetCDF has netcdf_mem.h: yes" in the summary output of ./configure

If you don't meet those requirements, /vsimem/ access on netCDF4/HDF5 files will
fall back to the HDF5 driver, which has proper support for pluggable I/O since libhdf5 allows it.

The netCDF library has no pluggable I/O layer, hence the GDAL support uses the netCDF
in-memory file API combined with the Linux userfaultfd mechanism to populate the
in-memory mapping with data. That said, for a file hosted in /vsimem/ (as
opposed to the /vsi network file systems), we could probably improve that to avoid
any Linux-specific mechanism.
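
The three version minimums quoted from the documentation above can be checked with a few lines of Python. This is only a sketch: `version_tuple` and `meets_netcdf_vsi_requirements` are hypothetical helper names, and the GDAL version passed in is a placeholder rather than a value from this thread. Passing this check is not sufficient on its own, since the ./configure summary must also report netcdf_mem.h and userfaultfd support.

```python
import re


def version_tuple(text):
    """Return the leading dotted integers of a version string as a tuple."""
    match = re.match(r"(\d+(?:\.\d+)*)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def meets_netcdf_vsi_requirements(gdal_version, kernel_release, libnetcdf_version):
    """Check the documented minimums: GDAL 2.4, Linux kernel 4.3, libnetcdf 4.5."""
    return (
        version_tuple(gdal_version) >= (2, 4)
        and version_tuple(kernel_release) >= (4, 3)
        and version_tuple(libnetcdf_version) >= (4, 5)
    )


# The kernel and libnetcdf versions are the ones reported in this thread;
# the GDAL version "3.0.2" is a placeholder, not taken from the thread.
print(meets_netcdf_vsi_requirements("3.0.2", "4.9.184-linuxkit", "4.6.3"))
```

On a Linux host, `platform.release()` from the standard library returns a kernel release string in the form used above.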

Even

--
Spatialys - Geospatial professional services
http://www.spatialys.com


Reading NetCDF file as inMemoryFile

vincent.sarago@...
 

I'm seeking some advice here, 

I'm trying to reduce the memory/disk usage of one of my scripts, where I try to translate a netCDF file to COG. Ideally I'd love not to save any file to disk.

The problem I'm facing is that when opening the file from disk everything works fine and the file is recognized as `netCDF`, but when I load the same file in memory it is then recognized as HDF5...

```
import rasterio
from rasterio.io import MemoryFile
 
src_path = "OR_ABI-L1b-RadF-M6C04_G16_s20193221600287_e20193221609595_c20193221610025.nc"
 
with rasterio.open(src_path) as src_dst:
    print(src_dst.name)
    print(src_dst.meta)
 
OR_ABI-L1b-RadF-M6C04_G16_s20193221600287_e20193221609595_c20193221610025.nc
{'driver': 'netCDF', 'dtype': 'float_', 'nodata': None, 'width': 512, 'height': 512, 'count': 0, 'crs': None, 'transform': Affine(1.0, 0.0, 0.0,
       0.0, 1.0, 0.0)}
 
with open(src_path, "rb") as f:
    with rasterio.open(f) as src_dst:
        with src_dst.open() as mem:
            print(mem.name)
            print(mem.meta)
 
/vsimem/8ee5dd37-9f49-47c9-bce7-a6732c72d4b5.nc
{'driver': 'HDF5', 'dtype': 'float_', 'nodata': None, 'width': 512, 'height': 512, 'count': 0, 'crs': None, 'transform': Affine(1.0, 0.0, 0.0,
       0.0, 1.0, 0.0)}
```


The problem with HDF5 is that it seems to lose all geographical information and thus is not usable for the process I'm doing.

thanks for your help 


Re: Define invalid regions of VRT file in a way that allows windowing

Sean Gillies
 

Hi Sean,

You're correct, mask() does require a dataset object (no numpy array) and will read the entire dataset. Can you use multiple, smaller, VRTs? Using a shapefile is fine. If you want to remove the dependence on fiona, you could serialize the shapes to JSON and use Python's json module to decode them.
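
The JSON route mentioned above could look roughly like this. It is a minimal sketch using only the standard library: the polygon coordinates, the file name and the helper names `save_shapes` and `load_shapes` are made up for illustration.

```python
import json
import os
import tempfile

# Hypothetical invalid-region polygon in GeoJSON form; the coordinates are made up.
shapes = [
    {
        "type": "Polygon",
        "coordinates": [[[0.0, 0.0], [0.0, 10.0], [10.0, 10.0], [10.0, 0.0], [0.0, 0.0]]],
    }
]


def save_shapes(shapes, path):
    """Write a list of GeoJSON-like geometry mappings to a JSON file."""
    with open(path, "w") as f:
        json.dump(shapes, f)


def load_shapes(path):
    """Read the geometry mappings back using only the standard library."""
    with open(path) as f:
        return json.load(f)


path = os.path.join(tempfile.mkdtemp(), "invalid_regions.json")
save_shapes(shapes, path)
loaded = load_shapes(path)
print(loaded[0]["type"])
```

The decoded list holds GeoJSON-like dicts, which is the form `rasterio.mask.mask` accepts for its `shapes` argument, so fiona is only needed once, when exporting the shapefile.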

On Tue, Dec 10, 2019 at 10:21 AM rasterio via Groups.Io <rasterio=attackllama.com@groups.io> wrote:
Hi all,

I'm using a VRT file to represent a bunch of individual map tiles as one
big image (following the help of Dion Haefner via this list - thanks
again!). I noticed that some of the data is invalid but not marked as
such (via the invalid data value). I'd like to define these regions as
invalid using a file (e.g. a shapefile) in a way that would allow me to
call `dataset.read(..., masked=True)` and get a masked array that masks
out both data that has the dataset's invalid data value, and data that I
manually define to be invalid in my file.

I found the examples on the site using Fiona to load a shapefile
(https://rasterio.readthedocs.io/en/stable/topics/masking-by-shapefile.html)
and use it to mask some data. I figured that would be useful, so I built
a shapefile with my invalid regions as polygons. I'm now having trouble
reading from my VRT file with these shapes masked out. I've got the
following code:

import rasterio
import rasterio.mask  # needed for rasterio.mask.mask
import fiona

with rasterio.open(VRT_FILE) as dataset, fiona.open(SHAPE_FILE, "r") as shapefile:
    shapes = [feature["geometry"] for feature in shapefile]
    masked_data = rasterio.mask.mask(dataset, shapes, invert=True)
    ...

but this throws the error `MemoryError: Unable to allocate array with
shape (216640, 282250) and data type bool`.

It seems the `rasterio.mask.mask` method tries to read the WHOLE
dataset, and apparently has no way to define a window. Until now (before
playing with shapefiles) I've been reading the dataset with
`dataset.read(... window=Window(...))`, which lets me define a window to
avoid reading the whole (huge) dataset at once. I'd like to do this for
my shape-masked data too.

I have two questions:

  - Is my approach to defining these invalid regions - i.e. with a
shapefile - reasonable? Or is there a better way, ideally one that
doesn't involve loading, manually masking and then writing a huge file
back to disk?
  - If my approach is indeed reasonable, then is there a way to create a
`DatasetReader`-like object that supports `.read(...)` with a window
parameter, using data masked from my shapefile?

Best wishes,


Sean


--
Sean Gillies


Rasterio 1.1.2

Sean Gillies
 

Hi all,

An sdist and manylinux1 and macosx wheels for Python 2.7 and 3.5-3.8 are on PyPI now. Please note that these wheels contain the base PROJ datum grids (version 1.8) and are somewhat larger than previous wheels.

--
Sean Gillies


Re: exporting single band with a colormap to RGB.tif

Sean Gillies
 

Hi,

The library doesn't automatically convert single band color-mapped data to 3-band RGB data. You'll need to construct a new output array filled with values from the colormap and then write that to the output file. Numpy will help: make a new 2D array and replace index values with an [R, G, B] array from the colormap, then use numpy.moveaxis to turn this into a 3-band array that rasterio can write.
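
A minimal NumPy sketch of that approach follows. It assumes 8-bit index values and a colormap dict of the form returned by `dataset.colormap(1)` (index mapped to an RGBA tuple); the function name `colormap_to_rgb` and the sample data are hypothetical.

```python
import numpy as np


def colormap_to_rgb(indexed, colormap):
    """Expand a 2D array of colormap indexes into a (3, rows, cols) uint8 array."""
    # Lookup table with one RGB row per possible uint8 index value.
    lut = np.zeros((256, 3), dtype=np.uint8)
    for index, color in colormap.items():
        lut[index] = color[:3]  # drop alpha if the entry is RGBA
    rgb = lut[indexed]  # fancy indexing gives shape (rows, cols, 3)
    return np.moveaxis(rgb, -1, 0)  # band-first layout, shape (3, rows, cols)


# Hypothetical 2x2 index array and colormap, only for illustration.
indexed = np.array([[0, 1], [1, 2]], dtype=np.uint8)
colormap = {0: (0, 0, 0, 255), 1: (255, 0, 0, 255), 2: (0, 0, 255, 255)}
rgb = colormap_to_rgb(indexed, colormap)
print(rgb.shape)
```

The resulting array can be written with `dst.write(rgb)` to a dataset opened with `count=3` and a `uint8` dtype.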

On Fri, Dec 6, 2019 at 7:25 PM Eyal Saiet <ejsaiet@...> wrote:
Hello,
I could not figure out how to export a single band with a colormap as an RGB.tif
A snippet of my code:
with rasterio.open(new_f_cmp,'w',**raster_cmp_p) as dst:
    dst.write(file_np_man_cs_int8.data,1)
    dst.write_colormap(1, f_dic)
    cmap=dst.colormap(1)

thanks


--


Eyal Saiet


The mind is not a vessel to be filled, but a fire to be kindled. Plutarch


--
Sean Gillies


Define invalid regions of VRT file in a way that allows windowing

rasterio@...
 

Hi all,

I'm using a VRT file to represent a bunch of individual map tiles as one big image (following the help of Dion Haefner via this list - thanks again!). I noticed that some of the data is invalid but not marked as such (via the invalid data value). I'd like to define these regions as invalid using a file (e.g. a shapefile) in a way that would allow me to call `dataset.read(..., masked=True)` and get a masked array that masks out both data that has the dataset's invalid data value, and data that I manually define to be invalid in my file.

I found the examples on the site using Fiona to load a shapefile (https://rasterio.readthedocs.io/en/stable/topics/masking-by-shapefile.html) and use it to mask some data. I figured that would be useful, so I built a shapefile with my invalid regions as polygons. I'm now having trouble reading from my VRT file with these shapes masked out. I've got the following code:

import rasterio
import rasterio.mask  # needed for rasterio.mask.mask
import fiona

with rasterio.open(VRT_FILE) as dataset, fiona.open(SHAPE_FILE, "r") as shapefile:
    shapes = [feature["geometry"] for feature in shapefile]
    masked_data = rasterio.mask.mask(dataset, shapes, invert=True)
    ...

but this throws the error `MemoryError: Unable to allocate array with shape (216640, 282250) and data type bool`.

It seems the `rasterio.mask.mask` method tries to read the WHOLE dataset, and apparently has no way to define a window. Until now (before playing with shapefiles) I've been reading the dataset with `dataset.read(... window=Window(...))`, which lets me define a window to avoid reading the whole (huge) dataset at once. I'd like to do this for my shape-masked data too.

I have two questions:

- Is my approach to defining these invalid regions - i.e. with a shapefile - reasonable? Or is there a better way, ideally one that doesn't involve loading, manually masking and then writing a huge file back to disk?
- If my approach is indeed reasonable, then is there a way to create a `DatasetReader`-like object that supports `.read(...)` with a window parameter, using data masked from my shapefile?

Best wishes,


Sean


Re: rasterio vsicurl issues with docker-compose

Madhav Desetty
 

Frankly, I always think giving more context will help, especially with the Python bindings, where you have to figure out how to get GDAL errors logged to stderr. However, in my case I guess it was pretty much user error.


exporting single band with a colormap to RGB.tif

Eyal Saiet
 

Hello,
I could not figure out how to export a single band with a colormap as an RGB.tif
A snippet of my code:
with rasterio.open(new_f_cmp,'w',**raster_cmp_p) as dst:
    dst.write(file_np_man_cs_int8.data,1)
    dst.write_colormap(1, f_dic)
    cmap=dst.colormap(1)

thanks


--


Eyal Saiet


The mind is not a vessel to be filled, but a fire to be kindled. Plutarch


Including PROJ datum grids in rasterio wheels, yes or no?

Sean Gillies
 

Hi all,

I have built new wheels for rasterio 1.1.1 here


which, for the first time, contain the base PROJ datum grids (version 1.8). This increases the size of the wheels by several MB and brings the installed size up to 52MB (on a Mac):

$ du rasterio
10720   rasterio/proj_data
30092   rasterio/.dylibs
164     rasterio/rio
7444    rasterio/gdal_data
52212   rasterio

My question to the user community is: would you rather have heavier wheels with the base datum grids or lighter wheels without?

If consensus is for even more batteries included, I will give the new wheels a unique build tag, upload them to PyPI and then delete the existing 1.1.1 wheels. Some of you may notice warnings from your build tools about hash changes.

--
Sean Gillies


Re: rasterio vsicurl issues with docker-compose

Sean Gillies
 

Thank you for following up. Getting a resolution, when we can, is super important for a forum like this.

In this case, the error

rasterio._err.CPLE_OpenFailedError: '/vsicurl/http://oin-hotosm.s3.amazonaws.com/59c66c5223c8440011d7b1e4/0/7ad397c0-bba2-4f98-a08a-931ec3a6e943.tif' does not exist in the file system, and is not recognized as a supported dataset name.

occurs because GDAL was configured to disregard .tif resources. It's actually an accurate error message. If anything is missing, it's the context. I'm curious to know if you (or anybody else) feel that the current set of GDAL configuration options should surface in the exception. In this case, it might have revealed the problem in seconds.

On Thu, Dec 5, 2019 at 7:06 PM Madhav Desetty <madhav@...> wrote:
I found my problem. I had an extra pair of double quotes, left over from a copy-paste, around the value of this env var: CPL_VSIL_CURL_ALLOWED_EXTENSIONS=".TIF,.ovr,.jp2,.tif". I was setting it via a .env file and referencing it in the docker-compose.yml file. Once I removed the quotes around the allowed extensions, it started working again.


--
Sean Gillies


Re: rasterio vsicurl issues with docker-compose

Madhav Desetty
 

I found my problem. I had an extra pair of double quotes, left over from a copy-paste, around the value of this env var: CPL_VSIL_CURL_ALLOWED_EXTENSIONS=".TIF,.ovr,.jp2,.tif". I was setting it via a .env file and referencing it in the docker-compose.yml file. Once I removed the quotes around the allowed extensions, it started working again.
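
To illustrate why the stray quotes mattered, here is a small sketch. It only approximates an extension allow-list check (comma-separated, case-insensitive suffix match) and is not GDAL's actual code; `allowed_by_extension` and `strip_stray_quotes` are hypothetical helpers.

```python
def allowed_by_extension(filename, allowed_extensions):
    """Approximate an extension allow-list check: comma-separated, case-insensitive."""
    extensions = [ext.strip() for ext in allowed_extensions.split(",")]
    return any(filename.lower().endswith(ext.lower()) for ext in extensions if ext)


def strip_stray_quotes(value):
    """Remove literal double quotes that leaked in from a .env file."""
    return value.strip().strip('"')


raw = '".TIF,.ovr,.jp2,.tif"'  # value as read from the .env file, quotes included
print(allowed_by_extension("image.tif", raw))
print(allowed_by_extension("image.tif", strip_stray_quotes(raw)))
```

With the quotes kept, the first and last entries become `".TIF` and `.tif"`, so no entry matches a name ending in `.tif`; once they are stripped, the match succeeds.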


New cp38 wheels for rasterio 1.1.1

Sean Gillies
 

Hi all,

Manylinux1 and macosx wheels for Python 3.8 (tagged cp38) are on PyPI alongside the wheels for versions 3.7 and earlier that we uploaded on 2019-11-04. I hope you find them useful!

--
Sean Gillies


Re: need help installing rasterio with WebP

tgertin@...
 

Thank you for looking into this. I am running these commands using the conda-forge channel:

```
conda create --name rasterio_w_webp_test1 python=3.7
 
conda config --add channels conda-forge
 
conda install gdal libgdal
 
conda install rasterio
 
pip install rio-cogeo
```

Then I am using the rio cogeo command that uses WebP:
```
rio cogeo create -p webp input.tif output.tif
```

but I get the following error:
```
File "rasterio/shutil.pyx", line 139, in rasterio.shutil.copy
  File "rasterio/_err.pyx", line 205, in rasterio._err.exc_wrap_pointer
rasterio._err.CPLE_AppDefinedError: Cannot create TIFF file due to missing codec for WEBP.
```

so I think something didn't install correctly. I don't need to use conda, but it seems like a good option because it uses virtual environments.



Re: need help installing rasterio with WebP

Howard Butler
 



On Dec 5, 2019, at 10:09 AM, Sean Gillies <sean.gillies@...> wrote:

> Hi Tom,
>
> I'm not an Anaconda user and don't know how to build packages for it. I think you'll want to find a conda channel that has rasterio packages with WebP support built in. I think that's most likely to be the conda-forge channel. If it doesn't have WebP support, I suggest requesting it at https://github.com/conda-forge/gdal-feedstock.

Bitner recently added WebP to the Linux GDAL conda-forge builds (https://github.com/conda-forge/gdal-feedstock/pull/346). I have no idea when that will make it into Anaconda's channel, though.

Howard
