Unable to read COG via /vsicurl using rasterio (but can access via GDAL)


henry@...
 

Hello! I am trying to access the Harmonized Landsat Sentinel cloud-optimized GeoTIFFs and am struggling to read the files using rasterio. I am running Ubuntu 20.04 and have installed GDAL 3.4.0 from the ubuntugis-unstable PPA and rasterio 1.2.10 from PyPI in a Python virtual environment.

This `gdalinfo` call behaves as expected:

```sh
gdalinfo /vsicurl/https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/HLSS30.020/HLS.S30.T13TEK.2021261T175011.v2.0/HLS.S30.T13TEK.2021261T175011.v2.0.B04.tif \
  --config CPL_VSIL_CURL_USE_HEAD FALSE \
  --config CPL_CURL_VERBOSE YES \
  --config GDAL_HTTP_COOKIEJAR /tmp/cookies.txt

```

But reading the raster using rasterio in a Python session does not:

```python
import rasterio as rio
url = 'https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/HLSS30.020/HLS.S30.T13TEK.2021261T175011.v2.0/HLS.S30.T13TEK.2021261T175011.v2.0.B04.tif'
with rio.Env(CPL_CURL_VERBOSE=True, GDAL_HTTP_COOKIEJAR='/tmp/cookies.txt', GDAL_DISABLE_READDIR_ON_OPEN=True, CPL_VSIL_CURL_ALLOWED_EXTENSIONS='TIF'):
    r = rio.open(url)

```
The tail of the error message with curl output looks like this:
```

< HTTP/1.1 303 See Other
< Content-Type: application/json
< Content-Length: 0
< Connection: keep-alive
< Server: CloudFront
< Date: Wed, 23 Feb 2022 21:03:31 GMT
< x-amzn-Remapped-x-amzn-RequestId: 984f04ce-4a4f-468f-82aa-d20cab3e1b7b
< x-amzn-Remapped-Content-Length: 0
< x-amzn-Remapped-Connection: keep-alive
< X-Request-Id: NHQIHdHSrm4e-4RWn2R0RiEGoHSoFTicTYI4nH62Q4-TzKX3X66xlA==
< x-amz-apigw-id: OA4djHpePHcFqoQ=
< Cache-Control: private, max-age=3540
< x-amzn-Remapped-Server: Server
< Location: https://d1nklfio7vscoe.cloudfront.net/s3-2d2df3a34830d5223d1e9547cd713408/lp-prod-protected.s3.us-west-2.amazonaws.com/HLSS30.020/HLS.S30.T13TEK.2021261T175011.v2.0/HLS.S30.T13TEK.2021261T175011.v2.0.B04.tif?A-userid=hrodman1&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=ASIAZLX6ZES42JF7SMUS%252F20220223%252Fus-west-2%252Fs3%252Faws4_request&X-Amz-Date=20220223T210331Z&X-Amz-Expires=3600&X-Amz-Security-Token=FwoGZXIvYXdzEDYaDPTQu3TsL8Js%252Fk0KxiK3AUa2TuxCD%252Fd1SRCht5WQihvNjeg1F8uQ2Dy%252FY1RJN%252Fayv5ZVAiXkTnYDfJiOgZpDiMw7gWI5fBntcpiz7m5a4yvzfVGucMaCMjlj4%252BdFaqJeOLgpSowuYWw%252B6H5f36uSGgKF1pQw7eeVnuRQ0j3Llp%252BXX1hyP2ymnL5HO6huutM%252Bd6BD%252Bu1ynCZmJqNYoVQqysmTXdp%252Fs2TKIW0R7agT4O1h21SIbZdZHZ%252F9hgtHzLQeCoIHp3HLuyijwtqQBjItdskFBhJKRIDVWU1dG7szzpdLHKN2KjzB%252BeiTo3YuFHXU4aENqpTBawLYBypq&X-Amz-SignedHeaders=host&X-Amz-Signature=35a9a4bc828179c5c565c8bcd9f62575ac006da0494a08a8bf925fa65d1c2549
< X-Amzn-Trace-Id: Root=1-6216a123-370f1e0a2e8065ed6ca302f9;Sampled=0
< x-amzn-Remapped-Date: Wed, 23 Feb 2022 21:03:31 GMT
< X-Content-Type-Options: nosniff
< X-Frame-Options: SAMEORIGIN
< X-XSS-Protection: 1; mode=block
< Strict-Transport-Security: max-age=31536000
< X-Forwarded-For: 75.134.137.166
< x-amzn-RequestId: abb6a9c7-2e15-4809-953d-2e8d16bf5f71
< X-Cache: Miss from cloudfront
< Via: 1.1 3dc94622fb840cab73b3ddb08a5c9680.cloudfront.net (CloudFront)
< X-Amz-Cf-Pop: MSP50-C1
< X-Amz-Cf-Id: NHQIHdHSrm4e-4RWn2R0RiEGoHSoFTicTYI4nH62Q4-TzKX3X66xlA==
* Failed writing header
* stopped the pause stream!
* Closing connection 0
Traceback (most recent call last):
  File "rasterio/_base.pyx", line 261, in rasterio._base.DatasetBase.__init__
  File "rasterio/_shim.pyx", line 78, in rasterio._shim.open_dataset
  File "rasterio/_err.pyx", line 216, in rasterio._err.exc_wrap_pointer
rasterio._err.CPLE_HttpResponseError: HTTP response code: 303

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "hls.py", line 6, in <module>
    r = rio.open(url)
  File "/root/Envs/sequoia/lib/python3.8/site-packages/rasterio/env.py", line 437, in wrapper
    return f(*args, **kwds)
  File "/root/Envs/sequoia/lib/python3.8/site-packages/rasterio/__init__.py", line 220, in open
    s = DatasetReader(path, driver=driver, sharing=sharing, **kwargs)
  File "rasterio/_base.pyx", line 263, in rasterio._base.DatasetBase.__init__
rasterio.errors.RasterioIOError: HTTP response code: 303

```
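
One difference between the two calls above is that the `gdalinfo` command sets `CPL_VSIL_CURL_USE_HEAD` to `FALSE`, while the rasterio session does not set it and adds other options instead. A minimal sketch of passing the same three options through `rasterio.Env` follows. Whether this alone resolves the 303 is untested here, and the helper names `gdalinfo_equivalent_options` and `read_profile` are hypothetical, introduced only for illustration.

```python
def gdalinfo_equivalent_options(cookie_path="/tmp/cookies.txt"):
    """Return the same GDAL config options that the gdalinfo call passes via --config."""
    return {
        # Skip HEAD requests; the gdalinfo call sets this, the rasterio call did not.
        "CPL_VSIL_CURL_USE_HEAD": "FALSE",
        # Keep the verbose curl log so both runs can be compared line by line.
        "CPL_CURL_VERBOSE": "YES",
        # File where curl stores cookies received during the redirect chain.
        "GDAL_HTTP_COOKIEJAR": cookie_path,
    }


def read_profile(url, cookie_path="/tmp/cookies.txt"):
    """Open `url` with the options above and return the dataset profile."""
    # Imported here so the options helper can be inspected without rasterio installed.
    import rasterio

    with rasterio.Env(**gdalinfo_equivalent_options(cookie_path)):
        with rasterio.open(url) as src:
            return src.profile
```

Calling `read_profile(url)` with the URL from the failing example would show whether the two tools still behave differently once they receive identical options.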

When I install GDAL and rasterio from conda-forge into a conda environment, everything works (without those GDAL flags)! Many of the system dependencies have later versions in the conda environment (for example, curl is 7.81.0 in conda vs 7.68.0 without conda).
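
Since the two environments differ in library versions, it may help to confirm which GDAL build rasterio is actually linked against in each one. rasterio wheels from PyPI generally bundle their own GDAL rather than using the system copy, so the version reported below may not match the GDAL 3.4.0 that `gdalinfo` uses. This is a sketch for comparison, not a confirmed diagnosis; `describe_rasterio_build` is a hypothetical helper name.

```python
import importlib


def describe_rasterio_build():
    """Return rasterio's version, the GDAL version it is linked against, and its install path."""
    try:
        rasterio = importlib.import_module("rasterio")
    except ImportError:
        # rasterio is not installed in this environment.
        return {"rasterio": None, "gdal": None, "path": None}
    return {
        "rasterio": rasterio.__version__,
        "gdal": rasterio.__gdal_version__,
        "path": rasterio.__file__,
    }


if __name__ == "__main__":
    for key, value in describe_rasterio_build().items():
        print(f"{key}: {value}")
```

Running this in both the virtual environment and the conda environment, and comparing the output with `gdalinfo --version`, would show whether the failing setup uses a different GDAL than the one tested on the command line.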

This works in the conda environment:

```python
import rasterio as rio

url = 'https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/HLSS30.020/HLS.S30.T13TEK.2021261T175011.v2.0/HLS.S30.T13TEK.2021261T175011.v2.0.B04.tif'
r = rio.open(url)
```

Does anyone have a clue about how to get rasterio working in this context without the conda installation?
