Re: Handling non-AWS S3 services

Guillaume Lostis


Yes, I will open a PR tomorrow, once I add a test for this argument.

Guillaume Lostis

On Wed, 11 Sep 2019 at 18:47, Sean Gillies <sean.gillies@...> wrote:
Hi Guillaume,

On Wed, Sep 11, 2019 at 8:25 AM Guillaume Lostis <g.lostis@...> wrote:

Hi all,

I've been working with a non-AWS S3 file storage lately, so I had to tackle the question of how to use rasterio with it. The service (OpenIO) is S3-compatible, so it works with AWS libraries such as boto3 or awscli.

I've managed to make it work with rasterio, but in a manner that doesn't really satisfy me. I'm writing this message to ask if you would agree to make some changes to AWSSession in order to better handle non-AWS S3 providers.

Here is some context on the problem: since the endpoint_url of the service is different from GDAL's default, I currently need to write code along the lines of:

import rasterio

with rasterio.Env(profile_name="openio", AWS_S3_ENDPOINT=""):
    with rasterio.open("s3://bucket/file.tiff") as src:
        ...

The code works, but I don't like the fact that I have to use a mix of rasterio-esque arguments to use an AWSSession and some GDAL-esque arguments to patch a missing argument in the AWSSession.

The nice thing about AWSSession is that it uses a boto3.Session, which in turn reads my ~/.aws/config and ~/.aws/credentials files in which I've saved my OpenIO credentials and region name under a profile named openio (this way I can easily switch between AWS and OpenIO buckets).
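
To illustrate the profile layout mentioned above, here is a minimal sketch that parses a credentials-style file with the standard library's configparser. boto3 does its own parsing, so this only shows the file format; the key values are placeholders and the read_profile helper is hypothetical.

```python
import configparser

# Example contents of a ~/.aws/credentials-style file.
# All key values below are placeholders, not real credentials.
CREDENTIALS_TEXT = """
[default]
aws_access_key_id = AWS_KEY_PLACEHOLDER
aws_secret_access_key = AWS_SECRET_PLACEHOLDER

[openio]
aws_access_key_id = OPENIO_KEY_PLACEHOLDER
aws_secret_access_key = OPENIO_SECRET_PLACEHOLDER
"""


def read_profile(text: str, profile_name: str) -> dict:
    """Return the key/value pairs stored under one named profile (hypothetical helper)."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return dict(parser[profile_name])


# Selecting the "openio" profile picks up only that section's keys.
creds = read_profile(CREDENTIALS_TEXT, "openio")
```

Switching between providers then only requires changing the profile name; the region for a profile lives in ~/.aws/config under a "[profile openio]" section.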

The not-so-nice thing is that boto3.Session objects do not handle the specification of a custom endpoint_url. This is intentional and is done because a Session is made to talk to different services (EC2, S3, ...), which have different URLs (more info in the first few comments of this issue). A boto3.Session.client, however, accepts a custom endpoint_url. For example, to have boto3 work with OpenIO, I do the following:

import boto3

session = boto3.Session(profile_name="openio")
client = session.client("s3", endpoint_url="")
# use the client to retrieve files, etc.

From what I've understood by reading the code, AWSSession uses a boto3.Session only to handle the credentials retrieval part, and then stores them in a _creds attribute. After that, the boto3.Session is not used for anything else. Since a boto3.Session cannot handle the retrieval of a custom endpoint_url, would it be acceptable to add an endpoint_url argument to the AWSSession? I have tested this patch and it does what I want, because I can run the following code (which, IMO, is nicer than the first one):

import rasterio

session = rasterio.session.AWSSession(profile_name="openio", endpoint_url="")
with rasterio.Env(session=session):
    with rasterio.open("s3://bucket/file.tiff") as src:
        ...
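
As a rough sketch of the idea (this is not rasterio's actual code, and the EndpointAwareSession class is hypothetical), a session object could carry the optional endpoint and translate it into GDAL's AWS_S3_ENDPOINT configuration option alongside the credentials. The host used in the usage line is a placeholder.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class EndpointAwareSession:
    """Hypothetical stand-in for a session object; not rasterio's AWSSession."""

    access_key: str
    secret_key: str
    region_name: Optional[str] = None
    endpoint_url: Optional[str] = None  # the proposed extra argument

    def get_credential_options(self) -> Dict[str, str]:
        """Return GDAL configuration options derived from this session."""
        options = {
            "AWS_ACCESS_KEY_ID": self.access_key,
            "AWS_SECRET_ACCESS_KEY": self.secret_key,
        }
        if self.region_name:
            options["AWS_REGION"] = self.region_name
        if self.endpoint_url:
            # Set the endpoint only when one was given, so that GDAL's
            # default applies for plain AWS usage.
            options["AWS_S3_ENDPOINT"] = self.endpoint_url
        return options


# Placeholder credentials and host, for illustration only.
session_sketch = EndpointAwareSession(
    "KEY_PLACEHOLDER", "SECRET_PLACEHOLDER", endpoint_url="storage.example.com"
)
gdal_options = session_sketch.get_credential_options()
```

With this shape, the endpoint travels with the session instead of being passed separately as a GDAL option to rasterio.Env.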

What do you think of this? Is it in the scope of rasterio's AWSSession, or not?


Guillaume Lostis

Yes, I think this is well within rasterio's scope. Let's do it. Can you submit a PR?

Sean Gillies
