Re: Reading from S3


hughes.lloyd@...
 

I am trying to read a GeoTIFF from a private AWS S3 bucket. I have configured GDAL and the appropriate files ~/.aws/config and ~/.aws/credentials. I am using a non-standard AWS region as well, so I needed to set the AWS_S3_ENDPOINT environment variable.

I am able to read the GeoTIFF information using both gdalinfo and rio:

$ gdalinfo /vsis3/s1-image-dataset/test.tif
Driver: GTiff/GeoTIFF
Files: /vsis3/s1-image-dataset/test.tif
Size is 33959, 38507
Coordinate System is:
PROJCS["WGS 84 / UTM zone 17N",
....

and using rio:

$ rio info s3://s1-image-dataset/test.tif
{"bounds": [689299.5634174921, 2622862.3065700093, 1028889.5634174921, 3007932.3065700093], "colorinterp": ["gray"], "compress": "deflate", "count": 1, "crs": "EPSG:32617", "descriptions": [null], "driver": "GTiff" ....

However, when I try to read it in a script using the rasterio Python API the I received the following error:

CPLE_OpenFailedError: '/vsis3/s1-image-dataset/test.tif' not recognized as a supported file format.

The code I am using which produced the issues is

import rasterio
path = "s3://s1-image-dataset/test.tif"
with rasterio.Env(AWS_S3_ENDPOINT='s3.<my region>.amazonaws.com'):
    with rasterio.open(path) as f:
        img = f.read()

This is using Python 3.7, rasterio 1.0.25, and GDAL 2.4.2

The problem only occurs when running this in a Jupyter Notebook (Pangeo to be precise) and it appears that Rasterio exits the environment prematurely

---------------------------------------------------------------------------
CPLE_OpenFailedError                      Traceback (most recent call last)
rasterio/_base.pyx in rasterio._base.DatasetBase.__init__()

rasterio/_shim.pyx in rasterio._shim.open_dataset()

rasterio/_err.pyx in rasterio._err.exc_wrap_pointer()

CPLE_OpenFailedError: '/vsis3/s1-image-dataset/test.tif' does not exist in the file system, and is not recognized as a supported dataset name.





Join {main@rasterio.groups.io to automatically receive all group messages.