returning subset of raster data via API


 

Hi all: as part of working on OGC API - Coverages support in pygeoapi, I'm writing the equivalent of what WCS GetCoverage would do; that is, clip by bbox and subset by bands if requested.

An initial implementation can be found in this branch.

At the moment, the data is returned as JSON-ified ndarrays. I would like to return the data in its native format. What would be a general approach here? Would one use a MemoryFile, write to it, and then read it back to the caller?

Thanks

..Tom


Sean Gillies
 

Hi Tom,

On Fri, Jan 17, 2020 at 8:35 AM Tom Kralidis <tom.kralidis@...> wrote:


When you say that you want to return the data in its native format, do you mean a stream of bytes that contains, for example, a GeoTIFF? The MemoryFile class would serve you well in that case if you want to avoid writing to disk; one of Python's temporary file objects would do just as well if you don't mind, or actually want, a temp file on disk. You would be limited to single-file formats with MemoryFile, because our abstraction doesn't cover multiple files. It seems to me that there would be a related problem for multi-file formats on the serialization end, whether you used MemoryFile or not: how do you put multiple files in a single stream of bytes?

--
Sean Gillies


 

(sorry, just getting back to this!)

Thanks Sean. Indeed the MemoryFile strategy works as expected:

import sys

import rasterio
from rasterio.io import MemoryFile

with rasterio.open(sys.argv[1]) as src:
    with MemoryFile() as memfile:
        profile = src.profile
        with memfile.open(**profile) as dest:
            dest.write(src.read())

Are there any known issues with reading or subsetting a NetCDF file and writing it back via MemoryFile? Running the same script against a NetCDF file fails with this traceback:

(pygeoapi) (base) tomkralidis@aruba:~/Dev/pygeoapi/pygeoapi-tomkralidis$ python dd.py tests/data/CMIP5_rcp8.5_annual_abs_latlon1x1_PCP_pctl25_P1Y.nc 
Traceback (most recent call last):
  File "rasterio/_io.pyx", line 1135, in rasterio._io.DatasetWriterBase.__init__
  File "rasterio/_err.pyx", line 205, in rasterio._err.exc_wrap_pointer
rasterio._err.CPLE_OpenFailedError: Unable to create netCDF file /vsimem/fa663a74-d548-494d-a2ab-880bdff831c4/fa663a74-d548-494d-a2ab-880bdff831c4. (Error code 2): No such file or directory .

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "dd.py", line 9, in <module>
    with memfile.open(**profile) as dest:
  File "/Users/tomkralidis/opt/miniconda3/lib/python3.7/site-packages/rasterio/env.py", line 382, in wrapper
    return f(*args, **kwds)
  File "/Users/tomkralidis/opt/miniconda3/lib/python3.7/site-packages/rasterio/io.py", line 136, in open
    nodata=nodata, sharing=sharing, **kwargs)
  File "rasterio/_io.pyx", line 1139, in rasterio._io.DatasetWriterBase.__init__
rasterio.errors.RasterioIOError: Unable to create netCDF file /vsimem/fa663a74-d548-494d-a2ab-880bdff831c4/fa663a74-d548-494d-a2ab-880bdff831c4. (Error code 2): No such file or directory .

Versions:

>>> import rasterio
>>> rasterio.__version__
'1.1.5'
>>> rasterio.gdal_version()
'3.0.4'

Thanks

..Tom


Sean Gillies
 

Hi Tom,

On Mon, Aug 3, 2020 at 5:32 PM Tom Kralidis <tom.kralidis@...> wrote:


I don't use NetCDF files with GDAL often and have never tried to write one to /vsimem/. It could be that there is a combination of creation or config options required that I don't know about. I see some mention of platform requirements in https://github.com/OSGeo/gdal/pull/786 but am not sure what to make of that.

--
Sean Gillies


 

Per https://gdal.org/user/virtual_file_systems.html#drivers-supporting-virtual-file-systems, and as confirmed by Even R, GDAL's NetCDF driver cannot write to /vsi* file systems. A possible workaround is to fall back to a temporary file on disk when writing NetCDF.

Thanks again for the clarifications.


Sean Gillies
 

On Wed, Aug 5, 2020 at 5:41 AM Tom Kralidis <tom.kralidis@...> wrote:



You're welcome. It's unfortunate that we don't have a more informative message from GDAL that we can raise to users. "No such file or directory" is too opaque.
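
Until GDAL reports something clearer, one option on the application side is to check the driver before attempting an in-memory write. The sketch below is only an illustration: the exception class, the function name, and the contents of the driver set are all hypothetical, and the set lists only the one driver identified in this thread.

```python
# Assumption: only the driver reported in this thread is listed here;
# this is not an authoritative list of drivers lacking /vsimem/ write support.
DRIVERS_WITHOUT_VSIMEM_WRITE = frozenset({"netCDF"})


class UnsupportedMemoryDriverError(ValueError):
    """Raised when a driver is known not to support in-memory writes."""


def check_memoryfile_driver(driver):
    """Raise a descriptive error before an in-memory write is attempted."""
    if driver in DRIVERS_WITHOUT_VSIMEM_WRITE:
        raise UnsupportedMemoryDriverError(
            "The {!r} driver cannot write to an in-memory (/vsimem/) file; "
            "write to a temporary file on disk instead.".format(driver)
        )
```

Calling `check_memoryfile_driver(profile["driver"])` before `memfile.open(**profile)` would replace the opaque "No such file or directory" with a message that points at the actual cause.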

--
Sean Gillies