Re: Reading from S3


Sean Gillies
 

Hi,

The following log message catches my eye:
env: AWS_S3_ENDPOINT="us-west-1"
If that is set in your notebook's environment, it will override the value you pass to Env() in your program, and it looks to be incorrect.
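
A minimal sketch of how you might check for this before opening the dataset, assuming the stray value lives in the notebook process's environment (os.environ). The helper name drop_conflicting_endpoint and the endpoint string below are hypothetical placeholders, not part of rasterio; substitute your real endpoint.

```python
import os

ENDPOINT_VAR = "AWS_S3_ENDPOINT"


def drop_conflicting_endpoint(expected, environ=os.environ):
    """Remove AWS_S3_ENDPOINT from `environ` when it differs from `expected`.

    Returns the removed value, or None if nothing was removed.
    (Hypothetical helper for illustration; not a rasterio function.)
    """
    current = environ.get(ENDPOINT_VAR)
    if current is None or current == expected:
        return None
    del environ[ENDPOINT_VAR]
    return current


# Demonstration on a plain dict standing in for the notebook's environment.
# The endpoint below is a placeholder, not a real region.
fake_environ = {"AWS_S3_ENDPOINT": "us-west-1"}
removed = drop_conflicting_endpoint("s3.example-region.amazonaws.com", fake_environ)
print("removed:", removed)
```

In the notebook you would call drop_conflicting_endpoint(endpoint) with the default os.environ and then enter rasterio.Env(AWS_S3_ENDPOINT=endpoint) as in your program. Whether that alone is enough depends on where the incorrect value was set in the first place, so treat it as a diagnostic step rather than a guaranteed fix.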

On Thu, Sep 5, 2019 at 8:17 AM <hughes.lloyd@...> wrote:

I am trying to read a GeoTIFF from a private AWS S3 bucket. I have configured GDAL and the appropriate files ~/.aws/config and ~/.aws/credentials. I am using a non-standard AWS region as well, so I needed to set the AWS_S3_ENDPOINT environment variable.

I am able to read the GeoTIFF information using both gdalinfo and rio:

$ gdalinfo /vsis3/s1-image-dataset/test.tif
Driver: GTiff/GeoTIFF
Files: /vsis3/s1-image-dataset/test.tif
Size is 33959, 38507
Coordinate System is:
PROJCS["WGS 84 / UTM zone 17N",
....

and using rio:

$ rio info s3://s1-image-dataset/test.tif
{"bounds": [689299.5634174921, 2622862.3065700093, 1028889.5634174921, 3007932.3065700093], "colorinterp": ["gray"], "compress": "deflate", "count": 1, "crs": "EPSG:32617", "descriptions": [null], "driver": "GTiff" ....

However, when I try to read it in a script using the rasterio Python API, I receive the following error:

CPLE_OpenFailedError: '/vsis3/s1-image-dataset/test.tif' not recognized as a supported file format.

The code that produces the error is:

import rasterio
path = "s3://s1-image-dataset/test.tif"
with rasterio.Env(AWS_S3_ENDPOINT='s3.<my region>.amazonaws.com'):
    with rasterio.open(path) as f:
        img = f.read()

This is using Python 3.7, rasterio 1.0.25, and GDAL 2.4.2.

The problem only occurs when running this in a Jupyter Notebook (Pangeo, to be precise), and it appears that Rasterio exits the environment prematurely:

DEBUG:rasterio.env:Entering env context: <rasterio.env.Env object at 0x7f97fb41d898>
DEBUG:rasterio.env:Starting outermost env
DEBUG:rasterio.env:No GDAL environment exists
DEBUG:rasterio.env:New GDAL environment <rasterio._env.GDALEnv object at 0x7f97fb41d908> created
DEBUG:rasterio._env:GDAL_DATA found in environment: '/srv/conda/envs/notebook/share/gdal'.
DEBUG:rasterio._env:PROJ_LIB found in environment: '/srv/conda/envs/notebook/share/proj'.
DEBUG:rasterio._env:Started GDALEnv <rasterio._env.GDALEnv object at 0x7f97fb41d908>.
DEBUG:rasterio.env:Entered env context: <rasterio.env.Env object at 0x7f97fb41d898>
DEBUG:rasterio.env:Got a copy of environment <rasterio._env.GDALEnv object at 0x7f97fb41d908> options
DEBUG:rasterio.env:Entering env context: <rasterio.env.Env object at 0x7f97fb3c5898>
DEBUG:rasterio.env:Got a copy of environment <rasterio._env.GDALEnv object at 0x7f97fb41d908> options
DEBUG:rasterio.env:Entered env context: <rasterio.env.Env object at 0x7f97fb3c5898>
DEBUG:rasterio._base:Sharing flag: 32
DEBUG:rasterio.env:Exiting env context: <rasterio.env.Env object at 0x7f97fb3c5898>
DEBUG:rasterio.env:Cleared existing <rasterio._env.GDALEnv object at 0x7f97fb41d908> options
DEBUG:rasterio._env:Stopped GDALEnv <rasterio._env.GDALEnv object at 0x7f97fb41d908>.
DEBUG:rasterio.env:No GDAL environment exists
DEBUG:rasterio.env:New GDAL environment <rasterio._env.GDALEnv object at 0x7f97fb41d908> created
DEBUG:rasterio._env:GDAL_DATA found in environment: '/srv/conda/envs/notebook/share/gdal'.
DEBUG:rasterio._env:PROJ_LIB found in environment: '/srv/conda/envs/notebook/share/proj'.
DEBUG:rasterio._env:Started GDALEnv <rasterio._env.GDALEnv object at 0x7f97fb41d908>.
DEBUG:rasterio.env:Exited env context: <rasterio.env.Env object at 0x7f97fb3c5898>
DEBUG:rasterio.env:Exiting env context: <rasterio.env.Env object at 0x7f97fb41d898>
DEBUG:rasterio.env:Cleared existing <rasterio._env.GDALEnv object at 0x7f97fb41d908> options
DEBUG:rasterio._env:Stopped GDALEnv <rasterio._env.GDALEnv object at 0x7f97fb41d908>.
DEBUG:rasterio.env:Exiting outermost env
DEBUG:rasterio.env:Exited env context: <rasterio.env.Env object at 0x7f97fb41d898>
env: AWS_ACCESS_KEY_ID="XXXXXX"
env: AWS_SECRET_ACCESS_KEY="XXXXXXXX"
env: AWS_S3_ENDPOINT="us-west-1"
---------------------------------------------------------------------------
CPLE_OpenFailedError                      Traceback (most recent call last)
rasterio/_base.pyx in rasterio._base.DatasetBase.__init__()

rasterio/_shim.pyx in rasterio._shim.open_dataset()

rasterio/_err.pyx in rasterio._err.exc_wrap_pointer()

CPLE_OpenFailedError: '/vsis3/s1-image-dataset/test.tif' does not exist in the file system, and is not recognized as a supported dataset name.

--
Sean Gillies
