Hi Erick,
I'm not familiar with Lustre and only slightly familiar with the details of GDAL's netCDF driver. I think, since the problem manifests with gdalinfo as well as rasterio programs, that the best source of help will be the gdal-dev list: https://lists.osgeo.org/mailman/listinfo/gdal-dev/. There have been recent discussions related to HDF5 and netCDF4 there and GDAL's developer, Even Rouault, will probably have some insights. I hate to redirect you to another email list, but gdal-dev seems to be the best place to ask in this case. When you get help there, I'll make sure to follow up here.
Hi,
My team and I have been having trouble reading netCDF files on a `LUSTRE` filesystem with the `ncdump` tool, because of an HDF5 1.10.1 issue in `ubuntu:bionic` (see for example: https://github.com/ALPSCore/ALPSCore/issues/410).
As a workaround, we installed `netcdf-bin` in `bionic` from the `ubuntu:xenial` repositories. Although `ncdump` now reads the netCDF file successfully, this did not (quite) fix `gdalinfo` and `rasterio`. For example:
These commands work:
ncdump /LUSTRE/MADMEX/dir_test_mount/madmex_003_37_-32_1996-01-01.nc | head -n 20
gdalinfo /LUSTRE/MADMEX/dir_test_mount/madmex_003_37_-32_1996-01-01.nc | head -n 20
gdalinfo -sd 1 /LUSTRE/MADMEX/dir_test_mount/madmex_003_37_-32_1996-01-01.nc | head -n 20
but neither of the next lines works:
gdalinfo netcdf:/LUSTRE/MADMEX/dir_test_mount/madmex_003_37_-32_1996-01-01.nc:blue_mean
gdalinfo NetCDF:/LUSTRE/MADMEX/dir_test_mount/madmex_003_37_-32_1996-01-01.nc:blue_mean
```
ERROR 4: NetCDF:/LUSTRE/MADMEX/dir_test_mount/madmex_003_37_-32_1996-01-01.nc:blue_mean: No such file or directory
gdalinfo failed - unable to open 'NetCDF:/LUSTRE/MADMEX/dir_test_mount/madmex_003_37_-32_1996-01-01.nc:blue_mean'.
```
This ultimately leads to an error with `rio insp` for both of the following lines:
rio insp netcdf:/LUSTRE/MADMEX/dir_test_mount/madmex_003_37_-32_1996-01-01.nc:blue_mean
rio insp NetCDF:/LUSTRE/MADMEX/dir_test_mount/madmex_003_37_-32_1996-01-01.nc:blue_mean
```
ERROR:root:Exception caught during processing
Traceback (most recent call last):
  File "rasterio/_base.pyx", line 214, in rasterio._base.DatasetBase.__init__
  File "rasterio/_shim.pyx", line 64, in rasterio._shim.open_dataset
  File "rasterio/_err.pyx", line 205, in rasterio._err.exc_wrap_pointer
rasterio._err.CPLE_OpenFailedError: netcdf:/LUSTRE/MADMEX/dir_test_mount/madmex_003_37_-32_1996-01-01.nc:blue_mean: No such file or directory

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/rasterio/rio/insp.py", line 77, in insp
    with rasterio.open(input, mode) as src:
  File "/usr/local/lib/python3.6/dist-packages/rasterio/env.py", line 423, in wrapper
    return f(*args, **kwds)
  File "/usr/local/lib/python3.6/dist-packages/rasterio/__init__.py", line 216, in open
    s = DatasetReader(path, driver=driver, **kwargs)
  File "rasterio/_base.pyx", line 216, in rasterio._base.DatasetBase.__init__
rasterio.errors.RasterioIOError: netcdf:/LUSTRE/MADMEX/dir_test_mount/madmex_003_37_-32_1996-01-01.nc:blue_mean: No such file or directory
Aborted!
```
But this works for both `gdalinfo` and `rio insp` tools:
```
gdalinfo hdf5:/LUSTRE/MADMEX/dir_test_mount/madmex_003_37_-32_1996-01-01.nc://blue_mean
Driver: HDF5Image/HDF5 Dataset
Files: /LUSTRE/MADMEX/dir_test_mount/madmex_003_37_-32_1996-01-01.nc
Size is 1667, 1667
Coordinate System is `'
Metadata:
  Conventions=CF-1.6, ACDD-1.3
  date_created=2019-04-26T21:07:47.595416
  geospatial_bounds=POLYGON ((-98.8618335626467 19.8093029971201,-98.8722811078025 19.3587577175966,-98.3949028300917 19.3481841431337,-98.382861955109 19.7986884105649,-98.8618335626467 19.8093029971201))
  geospatial_bounds_crs=EPSG:4326
...
```
```
rio insp hdf5:/LUSTRE/MADMEX/dir_test_mount/madmex_003_37_-32_1996-01-01.nc://blue_mean
/usr/local/lib/python3.6/dist-packages/rasterio/__init__.py:216: NotGeoreferencedWarning: Dataset has no geotransform set. The identity matrix may be returned.
  s = DatasetReader(path, driver=driver, **kwargs)
Rasterio 1.0.23 Interactive Inspector (Python 3.6.7)
Type "src.meta", "src.read(1)", or "help(src)" for more information.
>>> src.meta
{'driver': 'HDF5Image', 'dtype': 'int16', 'nodata': None, 'width': 1667, 'height': 1667, 'count': 1, 'crs': None, 'transform': Affine(1.0, 0.0, 0.0, 0.0, 1.0, 0.0)}
>>> a=src.read()
>>> a
array([[[632, 603, 586, ..., 505, 483, 471],
        [567, 538, 536, ..., 468, 478, 523],
        [614, 580, 537, ..., 540, 633, 675],
        ...,
        [828, 810, 804, ..., 275, 268, 299],
        [857, 844, 823, ..., 310, 290, 307],
        [840, 854, 836, ..., 320, 288, 294]]], dtype=int16)
```
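The two naming schemes used above can be sketched in Python as follows. This is only an illustration, not part of rasterio or GDAL: the helper names `subdataset_names` and `open_first_working` are hypothetical, and the opener is passed in as a parameter so that the sketch runs without rasterio installed. Note that, as the output above shows, the `hdf5:` route returns the data but no CRS and only an identity transform.

```python
def subdataset_names(path, variable):
    """Return the two GDAL subdataset names used in this thread for one variable."""
    return [
        "netcdf:{}:{}".format(path, variable),   # netCDF driver syntax
        "hdf5:{}://{}".format(path, variable),   # HDF5 driver syntax
    ]


def open_first_working(path, variable, opener, errors=(Exception,)):
    """Try each name in order and return (name, dataset) for the first that opens.

    `opener` is any callable taking a dataset name, for example rasterio.open.
    `errors` lists the exception types that mean "try the next name".
    """
    last_error = None
    for name in subdataset_names(path, variable):
        try:
            return name, opener(name)
        except errors as exc:
            last_error = exc
    # Every candidate failed: re-raise the most recent error.
    raise last_error
```

With rasterio available, a call could look like `open_first_working(path, "blue_mean", rasterio.open, errors=(rasterio.errors.RasterioIOError,))`, where `RasterioIOError` is the exception shown in the traceback above.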
This appears to be related to the LUSTRE filesystem, because if we copy our NetCDF file to `/root`, then this works:
```
gdalinfo netcdf:/root/madmex_003_37_-32_1996-01-01.nc:blue_mean
Driver: netCDF/Network Common Data Format
Files: /root/madmex_003_37_-32_1996-01-01.nc
Size is 1667, 1667
Coordinate System is:
PROJCS["unnamed",
    GEOGCS["WGS 84",
...
```
```
rio insp netcdf:/root/madmex_003_37_-32_1996-01-01.nc:blue_mean
Rasterio 1.0.23 Interactive Inspector (Python 3.6.7)
Type "src.meta", "src.read(1)", or "help(src)" for more information.
>>> src.meta
{'driver': 'netCDF', 'dtype': 'int16', 'nodata': -32767.0, 'width': 1667, 'height': 1667, 'count': 1, 'crs': CRS.from_wkt('PROJCS["unnamed",GEOGCS["WGS 84",DATUM["unknown",SPHEROID["WGS84",6378137,6556752.3141]],PRIMEM["Greenwich",0],UNIT["degree",0.0174532925199433]],PROJECTION["Lambert_Conformal_Conic_2SP"],PARAMETER["standard_parallel_1",17.5],PARAMETER["standard_parallel_2",29.5],PARAMETER["latitude_of_origin",12],PARAMETER["central_meridian",-102],PARAMETER["false_easting",2500000],PARAMETER["false_northing",0]]'), 'transform': Affine(30.0, 0.0, 2827530.0, 0.0, -30.0, 876410.0)}
>>> a=src.read()
>>> a
array([[[632, 603, 586, ..., 505, 483, 471],
        [567, 538, 536, ..., 468, 478, 523],
        [614, 580, 537, ..., 540, 633, 675],
        ...,
        [828, 810, 804, ..., 275, 268, 299],
        [857, 844, 823, ..., 310, 290, 307],
        [840, 854, 836, ..., 320, 288, 294]]], dtype=int16)
```
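Since the copy outside LUSTRE opens correctly with the `netcdf:` prefix, one stop-gap is to stage the file on local storage before opening it. The sketch below is a minimal illustration under that assumption; `stage_locally` is a hypothetical helper, and it also assumes that the scratch directory (by default whatever `tempfile` picks) is not itself on LUSTRE.

```python
import os
import shutil
import tempfile


def stage_locally(path, scratch_dir=None):
    """Copy `path` into a local scratch directory and return the new path."""
    if scratch_dir is None:
        # Assumption: the default temporary directory is on a local filesystem.
        scratch_dir = tempfile.mkdtemp(prefix="nc_stage_")
    os.makedirs(scratch_dir, exist_ok=True)
    local_path = os.path.join(scratch_dir, os.path.basename(path))
    shutil.copy2(path, local_path)  # copies the contents and file metadata
    return local_path
```

The returned path can then be used to build the subdataset name, for example `"netcdf:{}:blue_mean".format(stage_locally(path))`. This costs one full copy per file, so it is a diagnostic aid or a temporary workaround rather than a fix.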
We are using the `ubuntu:bionic` Docker image, so you can reproduce this error with the following lines:
```
sudo docker run -v /LUSTRE/:/LUSTRE/ \
    -v /LUSTRE/MADMEX/dir_test_mount:/temporal \
    --name mounting_volume_test_bionic \
    --hostname antares3-datacube \
    -dit ubuntu:bionic /bin/bash
```
Enter the Docker container:
```
sudo docker exec -it mounting_volume_test_bionic bash
```
and execute:
```
apt-get update
export DEBIAN_FRONTEND=noninteractive && echo "America/Mexico_City" > /etc/timezone && apt-get install -y tzdata
apt-get update && apt-get install -y \
    wget curl \
    openssh-server \
    openssl \
    sudo \
    nano \
    software-properties-common \
    git \
    vim \
    vim-gtk \
    htop \
    build-essential \
    libssl-dev \
    libffi-dev \
    cmake \
    python3-dev \
    python3-pip \
    python3-setuptools \
    ca-certificates \
    postgresql-client \
    libudunits2-dev \
    nodejs && pip3 install --upgrade pip

#install netcdf:
echo -e 'deb http://security.ubuntu.com/ubuntu/ xenial-security universe\ndeb http://archive.ubuntu.com/ubuntu/ xenial universe' >> /etc/apt/sources.list
apt update
apt install -t xenial-security libcurl3-gnutls
apt install -y -t xenial netcdf-bin
ncdump /LUSTRE/MADMEX/madmex_003_37_-32_1996-01-01.nc | head -n 30

#Install spatial libraries
add-apt-repository -y ppa:ubuntugis/ubuntugis-unstable && apt-get -qq update
apt-get install -y \
    ncview \
    libproj-dev \
    libgeos-dev \
    gdal-bin \
    libgdal-dev
pip install numpy && pip install GDAL==$(gdal-config --version) --global-option=build_ext --global-option='-I/usr/include/gdal' && pip install rasterio
export LC_ALL=C.UTF-8
export LANG=C.UTF-8
export GDAL_DATA=/usr/share/gdal/
```
We hope you can help us.
Cheers,
Erick
|
Hi Sean,
Thank you for your answer. Even replied that LUSTRE is causing this behaviour and that my best option is to build GDAL and libnetcdf against a libhdf5 version that plays well with LUSTRE. We didn't have these problems in Ubuntu xenial, so we will discuss whether using a different version of libhdf5 is an option.
Thank you,
Erick
----- Original message -----
From: "Sean Gillies" <sean.gillies@...>
To: main@rasterio.groups.io
Sent: Wednesday, June 5, 2019 13:25:55
Subject: Re: [rasterio] gdalinfo's and rasterio's reading problem in LUSTRE FS with NetCDF file (ubuntu:bionic)
Hi Erick,
I'm not familiar with Lustre and only slightly familiar with the details of GDAL's netCDF driver. I think, since the problem manifests with gdalinfo as well as rasterio programs, that the best source of help will be the gdal-dev list: https://lists.osgeo.org/mailman/listinfo/gdal-dev/. There have been recent discussions related to HDF5 and netCDF4 there and GDAL's developer, Even Rouault, will probably have some insights. I hate to redirect you to another email list, but gdal-dev seems to be the best place to ask in this case. When you get help there, I'll make sure to follow up here.
On Wed, Jun 5, 2019 at 9:22 AM <epalacios@...> wrote:
-- Sean Gillies