rasterio vsicurl issues with docker-compose
I am having a strange rasterio issue: if I build a rasterio Docker image with osgeo/gdal:ubuntu-small-latest as the base image and run it with docker-compose up, rio fails to open any HTTP-based COG file. It works fine when the same TIF file is copied locally and accessed from a mounted drive, and gdalinfo succeeds for the COG file on S3 via its HTTP URL. When I run the image without docker-compose, using plain docker commands (docker build && docker run), rio also works for the COG file on S3 via its HTTP URL. It seems that some environment setting is not being set or initialized when the image is run via docker-compose, and I am probably missing something obvious. Has anyone run into this before, and are there any tips for debugging why the file open fails under Docker Compose? I set the GDAL CPL_DEBUG and CPL_LOG_ERRORS environment variables to ON, and the logs are still minimal.
OGRCT: PROJ >= 4.8.0 features enabled
OGRCT: Using locale-safe proj version
1.1.1
$ rio info -v http://oin-hotosm.s3.amazonaws.com/59c66c5223c8440011d7b1e4/0/7ad397c0-bba2-4f98-a08a-931ec3a6e943.tif
OGRCT: PROJ >= 4.8.0 features enabled
OGRCT: Using locale-safe proj version
Traceback (most recent call last):
  File "rasterio/_base.pyx", line 216, in rasterio._base.DatasetBase.__init__
  File "rasterio/_shim.pyx", line 67, in rasterio._shim.open_dataset
  File "rasterio/_err.pyx", line 205, in rasterio._err.exc_wrap_pointer
rasterio._err.CPLE_OpenFailedError: '/vsicurl/http://oin-hotosm.s3.amazonaws.com/59c66c5223c8440011d7b1e4/0/7ad397c0-bba2-4f98-a08a-931ec3a6e943.tif' does not exist in the file system, and is not recognized as a supported dataset name.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/usr/local/bin/rio", line 11, in <module>
    sys.exit(main_group())
  File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/click/decorators.py", line 17, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/rasterio/rio/info.py", line 66, in info
    with ctx.obj['env'], rasterio.open(input) as src:
  File "/usr/local/lib/python3.6/dist-packages/rasterio/env.py", line 445, in wrapper
    return f(*args, **kwds)
  File "/usr/local/lib/python3.6/dist-packages/rasterio/__init__.py", line 219, in open
    s = DatasetReader(path, driver=driver, sharing=sharing, **kwargs)
  File "rasterio/_base.pyx", line 218, in rasterio._base.DatasetBase.__init__
rasterio.errors.RasterioIOError: '/vsicurl/http://oin-hotosm.s3.amazonaws.com/59c66c5223c8440011d7b1e4/0/7ad397c0-bba2-4f98-a08a-931ec3a6e943.tif' does not exist in the file system, and is not recognized as a supported dataset name.
$ gdalinfo http://oin-hotosm.s3.amazonaws.com/59c66c5223c8440011d7b1e4/0/7ad397c0-bba2-4f98-a08a-931ec3a6e943.tif
HTTP: Fetch(http://oin-hotosm.s3.amazonaws.com/59c66c5223c8440011d7b1e4/0/7ad397c0-bba2-4f98-a08a-931ec3a6e943.tif)
HTTP: libcurl/7.58.0 GnuTLS/3.5.18 zlib/1.2.11 libidn2/2.0.4 libpsl/0.19.1 (+libidn2/2.0.4) nghttp2/1.30.0 librtmp/2.3
GDAL: GDALOpen(/vsimem/http_1/7ad397c0-bba2-4f98-a08a-931ec3a6e943.tif, this=0x55e917f3afa0) succeeds as GTiff.
GDAL: GDALOpen(http://oin-hotosm.s3.amazonaws.com/59c66c5223c8440011d7b1e4/0/7ad397c0-bba2-4f98-a08a-931ec3a6e943.tif, this=0x55e917f3afa0) succeeds as HTTP.
Driver: GTiff/GeoTIFF
GDAL: GDALDefaultOverviews::OverviewScan()
MDReaderPleiades: Not a Pleiades product
MDReaderPleiades: Not a Pleiades product
Files: none associated
Size is 49402, 28398
Coordinate System is:
PROJCRS["WGS 84 / UTM zone 20N",
    BASEGEOGCRS["WGS 84",
        DATUM["World Geodetic System 1984",
            ELLIPSOID["WGS 84",6378137,298.257223563,
                LENGTHUNIT["metre",1]]],
        PRIMEM["Greenwich",0,
            ANGLEUNIT["degree",0.0174532925199433]],
        ID["EPSG",4326]],
    CONVERSION["UTM zone 20N",
        METHOD["Transverse Mercator",
            ID["EPSG",9807]],
        PARAMETER["Latitude of natural origin",0,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8801]],
        PARAMETER["Longitude of natural origin",-63,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8802]],
        PARAMETER["Scale factor at natural origin",0.9996,
            SCALEUNIT["unity",1],
            ID["EPSG",8805]],
        PARAMETER["False easting",500000,
            LENGTHUNIT["metre",1],
            ID["EPSG",8806]],
        PARAMETER["False northing",0,
            LENGTHUNIT["metre",1],
            ID["EPSG",8807]]],
    CS[Cartesian,2],
        AXIS["(E)",east,
            ORDER[1],
            LENGTHUNIT["metre",1]],
        AXIS["(N)",north,
            ORDER[2],
            LENGTHUNIT["metre",1]],
    USAGE[
        SCOPE["unknown"],
        AREA["World - N hemisphere - 66°W to 60°W - by country"],
        BBOX[0,-66,84,-60]],
    ID["EPSG",32620]]
Data axis to CRS axis mapping: 1,2
Origin = (494088.931940000038594,1993386.902620000066236)
Pixel Size = (0.027070000000000,-0.027070000000000)
Metadata:
  AREA_OR_POINT=Area
  TIFFTAG_SOFTWARE=pix4dmapper
Image Structure Metadata:
  COMPRESSION=YCbCr JPEG
  INTERLEAVE=PIXEL
  SOURCE_COLOR_SPACE=YCbCr
GTiff: ScanDirectories()
GTiff: Opened 24701x14199 overview.
GTiff: Opened 12351x7100 overview.
GTiff: Opened 6176x3550 overview.
GTiff: Opened 3088x1775 overview.
GTiff: Opened 1544x888 overview.
GTiff: Opened 772x444 overview.
GTiff: Opened 386x222 overview.
Corner Coordinates:
Upper Left  ( 494088.932, 1993386.903) ( 63d 3'21.05"W, 18d 1'44.14"N)
Lower Left  ( 494088.932, 1992618.169) ( 63d 3'21.04"W, 18d 1'19.13"N)
Upper Right ( 495426.244, 1993386.903) ( 63d 2'35.56"W, 18d 1'44.15"N)
Lower Right ( 495426.244, 1992618.169) ( 63d 2'35.56"W, 18d 1'19.14"N)
Center      ( 494757.588, 1993002.536) ( 63d 2'58.30"W, 18d 1'31.64"N)
Band 1 Block=512x512 Type=Byte, ColorInterp=Red
  NoData Value=-10000
  Overviews: 24701x14199, 12351x7100, 6176x3550, 3088x1775, 1544x888, 772x444, 386x222
Band 2 Block=512x512 Type=Byte, ColorInterp=Green
  NoData Value=-10000
  Overviews: 24701x14199, 12351x7100, 6176x3550, 3088x1775, 1544x888, 772x444, 386x222
Band 3 Block=512x512 Type=Byte, ColorInterp=Blue
  NoData Value=-10000
  Overviews: 24701x14199, 12351x7100, 6176x3550, 3088x1775, 1544x888, 772x444, 386x222
GDAL: GDALClose(http://oin-hotosm.s3.amazonaws.com/59c66c5223c8440011d7b1e4/0/7ad397c0-bba2-4f98-a08a-931ec3a6e943.tif, this=0x55e917f3afa0)
GDAL: In GDALDestroy - unloading GDAL shared library.
Sean Gillies
Hi,

There's a big, but not obvious, difference between

1. rio info https://example.com/file.tif

and

2. gdalinfo https://example.com/file.tif

In the first case, rasterio dispatches https (or http) URLs to vsicurl. In the second case, gdalinfo does not. Rather, it downloads the entire file to a temp location and opens it locally, not using vsicurl. I've forgotten where this is documented on the GDAL site. To compare rasterio and gdalinfo, you must use `gdalinfo /vsicurl/https://example.com/file.tif`.

I think we can get to the bottom of this, starting with running the following command:

CPL_CURL_VERBOSE=YES rio -vvv info http://oin-hotosm.s3.amazonaws.com/59c66c5223c8440011d7b1e4/0/7ad397c0-bba2-4f98-a08a-931ec3a6e943.tif

Can you do that and see if any HTTP connection problems are revealed?

On Tue, Dec 3, 2019 at 9:27 AM Madhav Desetty <madhav@...> wrote:
--
Sean Gillies
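The like-for-like comparison described in this reply can be sketched in Python. This is only an illustration: `to_vsicurl` and `gdalinfo_command` are hypothetical helper names, not rasterio or GDAL functions, and the prefixing rule is a simplification of what rasterio does when it hands an http(s) URL to GDAL.

```python
from urllib.parse import urlparse


def to_vsicurl(url):
    """Return a GDAL /vsicurl/ path for an http(s) URL (hypothetical helper)."""
    scheme = urlparse(url).scheme.lower()
    if scheme not in ("http", "https"):
        raise ValueError("not an http(s) URL: %r" % url)
    # Assumption: prefixing the URL is enough to mirror what rasterio asks GDAL to open.
    return "/vsicurl/" + url


def gdalinfo_command(url):
    """Build the gdalinfo argument list that goes through vsicurl, like rio does."""
    return ["gdalinfo", to_vsicurl(url)]


if __name__ == "__main__":
    # Prints the command to run by hand inside the container.
    print(" ".join(gdalinfo_command("https://example.com/file.tif")))
```

Running the printed command in the docker-compose container should exercise the same code path as `rio info`, so both tools would be expected to fail or succeed together.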
This is the log I get when I run the rio command with verbose curl. I am not seeing any specific HTTP connection errors.
$ CPL_CURL_VERBOSE=YES rio -vvv info http://oin-hotosm.s3.amazonaws.com/59c66c5223c8440011d7b1e4/0/7ad397c0-bba2-4f98-a08a-931ec3a6e943.tif
OGRCT: PROJ >= 4.8.0 features enabled
OGRCT: Using locale-safe proj version
DEBUG:botocore.hooks:Changing event name from creating-client-class.iot-data to creating-client-class.iot-data-plane
DEBUG:botocore.hooks:Changing event name from before-call.apigateway to before-call.api-gateway
DEBUG:botocore.hooks:Changing event name from request-created.machinelearning.Predict to request-created.machine-learning.Predict
DEBUG:botocore.hooks:Changing event name from before-parameter-build.autoscaling.CreateLaunchConfiguration to before-parameter-build.auto-scaling.CreateLaunchConfiguration
DEBUG:botocore.hooks:Changing event name from before-parameter-build.route53 to before-parameter-build.route-53
DEBUG:botocore.hooks:Changing event name from request-created.cloudsearchdomain.Search to request-created.cloudsearch-domain.Search
DEBUG:botocore.hooks:Changing event name from docs.*.autoscaling.CreateLaunchConfiguration.complete-section to docs.*.auto-scaling.CreateLaunchConfiguration.complete-section
DEBUG:botocore.hooks:Changing event name from before-parameter-build.logs.CreateExportTask to before-parameter-build.cloudwatch-logs.CreateExportTask
DEBUG:botocore.hooks:Changing event name from docs.*.logs.CreateExportTask.complete-section to docs.*.cloudwatch-logs.CreateExportTask.complete-section
DEBUG:botocore.hooks:Changing event name from before-parameter-build.cloudsearchdomain.Search to before-parameter-build.cloudsearch-domain.Search
DEBUG:botocore.hooks:Changing event name from docs.*.cloudsearchdomain.Search.complete-section to docs.*.cloudsearch-domain.Search.complete-section
DEBUG:botocore.credentials:Looking for credentials via: env
INFO:botocore.credentials:Found credentials in environment variables.
DEBUG:rasterio.env:Entering env context: <rasterio.env.Env object at 0x7f1a7cf62208>
DEBUG:rasterio.env:Starting outermost env
DEBUG:rasterio.env:No GDAL environment exists
DEBUG:rasterio.env:New GDAL environment <rasterio._env.GDALEnv object at 0x7f1a7d6ad048> created
DEBUG:rasterio._env:GDAL_DATA found in environment: '/usr/local/lib/python3.6/dist-packages/rasterio/gdal_data'.
DEBUG:rasterio._env:PROJ data files are available at built-in paths
DEBUG:rasterio._env:Started GDALEnv <rasterio._env.GDALEnv object at 0x7f1a7d6ad048>.
DEBUG:rasterio.env:Entered env context: <rasterio.env.Env object at 0x7f1a7cf62208>
DEBUG:rasterio.env:Got a copy of environment <rasterio._env.GDALEnv object at 0x7f1a7d6ad048> options
DEBUG:rasterio.env:Entering env context: <rasterio.env.Env object at 0x7f1a7ccf2320>
DEBUG:rasterio.env:Got a copy of environment <rasterio._env.GDALEnv object at 0x7f1a7d6ad048> options
DEBUG:rasterio.env:Entered env context: <rasterio.env.Env object at 0x7f1a7ccf2320>
DEBUG:rasterio._base:Sharing flag: 0
DEBUG:rasterio._env:CPLE_None in CPLError: `/vsicurl/http://oin-hotosm.s3.amazonaws.com/59c66c5223c8440011d7b1e4/0/7ad397c0-bba2-4f98-a08a-931ec3a6e943.tif' does not exist in the file system, and is not recognized as a supported dataset name.
DEBUG:rasterio.env:Exiting env context: <rasterio.env.Env object at 0x7f1a7ccf2320>
DEBUG:rasterio.env:Cleared existing <rasterio._env.GDALEnv object at 0x7f1a7d6ad048> options
DEBUG:rasterio._env:Stopped GDALEnv <rasterio._env.GDALEnv object at 0x7f1a7d6ad048>.
DEBUG:rasterio.env:No GDAL environment exists
DEBUG:rasterio.env:New GDAL environment <rasterio._env.GDALEnv object at 0x7f1a7d6ad048> created
DEBUG:rasterio._env:GDAL_DATA found in environment: '/usr/local/lib/python3.6/dist-packages/rasterio/gdal_data'.
DEBUG:rasterio._env:PROJ data files are available at built-in paths
DEBUG:rasterio._env:Started GDALEnv <rasterio._env.GDALEnv object at 0x7f1a7d6ad048>.
DEBUG:rasterio.env:Exited env context: <rasterio.env.Env object at 0x7f1a7ccf2320>
DEBUG:rasterio.env:Exiting env context: <rasterio.env.Env object at 0x7f1a7cf62208>
DEBUG:rasterio.env:Cleared existing <rasterio._env.GDALEnv object at 0x7f1a7d6ad048> options
DEBUG:rasterio._env:Stopped GDALEnv <rasterio._env.GDALEnv object at 0x7f1a7d6ad048>.
DEBUG:rasterio.env:Exiting outermost env
DEBUG:rasterio.env:Exited env context: <rasterio.env.Env object at 0x7f1a7cf62208>
Traceback (most recent call last):
  File "rasterio/_base.pyx", line 216, in rasterio._base.DatasetBase.__init__
  File "rasterio/_shim.pyx", line 67, in rasterio._shim.open_dataset
  File "rasterio/_err.pyx", line 205, in rasterio._err.exc_wrap_pointer
rasterio._err.CPLE_OpenFailedError: '/vsicurl/http://oin-hotosm.s3.amazonaws.com/59c66c5223c8440011d7b1e4/0/7ad397c0-bba2-4f98-a08a-931ec3a6e943.tif' does not exist in the file system, and is not recognized as a supported dataset name.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/usr/local/bin/rio", line 11, in <module>
    sys.exit(main_group())
  File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/click/decorators.py", line 17, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/rasterio/rio/info.py", line 66, in info
    with ctx.obj['env'], rasterio.open(input) as src:
  File "/usr/local/lib/python3.6/dist-packages/rasterio/env.py", line 445, in wrapper
    return f(*args, **kwds)
  File "/usr/local/lib/python3.6/dist-packages/rasterio/__init__.py", line 219, in open
    s = DatasetReader(path, driver=driver, sharing=sharing, **kwargs)
  File "rasterio/_base.pyx", line 218, in rasterio._base.DatasetBase.__init__
rasterio.errors.RasterioIOError: '/vsicurl/http://oin-hotosm.s3.amazonaws.com/59c66c5223c8440011d7b1e4/0/7ad397c0-bba2-4f98-a08a-931ec3a6e943.tif' does not exist in the file system, and is not recognized as a supported dataset name.
Sean Gillies
GDAL's verbose HTTP "logs" are printed directly to stderr and won't be captured by the Python logger. Can you try to record them somehow? I think we'll see some clues there.

I don't want to speculate, but maybe docker compose's networking system lacks or disables features we need?

On Tue, Dec 3, 2019 at 1:52 PM Madhav Desetty <madhav@...> wrote:
--
Sean Gillies
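One way to record GDAL's stderr output, as requested in this reply, is to run the command in a child process and capture its stderr stream. The sketch below is a minimal illustration: `run_and_capture` is a hypothetical helper, and the demo uses a stand-in child process so that it runs without rio or GDAL installed.

```python
import os
import subprocess
import sys


def run_and_capture(cmd, extra_env=None):
    """Run cmd with extra environment variables; return (returncode, stdout, stderr)."""
    env = dict(os.environ)
    env.update(extra_env or {})
    proc = subprocess.run(
        cmd,
        env=env,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,  # output written straight to stderr by the child lands here
        universal_newlines=True,  # decode to str; spelled this way so it also works on Python 3.6
    )
    return proc.returncode, proc.stdout, proc.stderr


if __name__ == "__main__":
    # Stand-in child process that writes to stderr, in place of the real rio command.
    demo = [sys.executable, "-c", "import sys; sys.stderr.write('written to stderr\\n')"]
    code, out, err = run_and_capture(demo, {"CPL_CURL_VERBOSE": "YES"})
    print("exit code:", code)
    print("captured stderr:", repr(err))
```

For the case in this thread, the command list would presumably be `["rio", "-vvv", "info", url]` with the same `CPL_CURL_VERBOSE` setting, and the captured stderr text could then be written to a file and posted.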
I found my problem. I had an extra pair of double quotes, left over from a copy-paste, around the value of this env var: CPL_VSIL_CURL_ALLOWED_EXTENSIONS=".TIF,.ovr,.jp2,.tif". I was setting it via a .env file and referencing it in the docker-compose.yml file. Once I removed the quotes around the curl allowed extensions, it started working again.
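Why the stray quotes break the .tif case can be illustrated with a simplified model. The sketch below assumes the variable's value is split on commas and each entry is compared, case-insensitively, against the end of the file name. `extension_allowed` and `strip_wrapping_quotes` are hypothetical helpers written for illustration, not GDAL's implementation, and the sketch assumes the quotes reached the process literally, as reported above.

```python
def extension_allowed(filename, allowed):
    """Simplified model: True if filename ends with one of the comma-separated entries."""
    entries = [entry for entry in allowed.split(",") if entry]
    name = filename.lower()
    return any(name.endswith(entry.lower()) for entry in entries)


def strip_wrapping_quotes(value):
    """Remove one pair of matching quotes wrapped around a value, if present."""
    if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
        return value[1:-1]
    return value


if __name__ == "__main__":
    url = "http://oin-hotosm.s3.amazonaws.com/59c66c5223c8440011d7b1e4/0/7ad397c0-bba2-4f98-a08a-931ec3a6e943.tif"
    quoted = '".TIF,.ovr,.jp2,.tif"'  # value as it arrived with literal quotes
    # With the quotes kept, the first and last entries become '".TIF' and '.tif"'.
    print(extension_allowed(url, quoted))
    print(extension_allowed(url, strip_wrapping_quotes(quoted)))
```

Under this model the quoted value rejects the .tif URL because neither `".TIF` nor `.tif"` is a suffix of the file name, while the unquoted value accepts it. Printing such variables with `repr()` inside the container is a quick way to spot this kind of stray character.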
Sean Gillies
Thank you for following up. Getting a resolution, when we can, is super important for a forum like this.

In this case, the error

rasterio._err.CPLE_OpenFailedError: '/vsicurl/http://oin-hotosm.s3.amazonaws.com/59c66c5223c8440011d7b1e4/0/7ad397c0-bba2-4f98-a08a-931ec3a6e943.tif' does not exist in the file system, and is not recognized as a supported dataset name.

On Thu, Dec 5, 2019 at 7:06 PM Madhav Desetty <madhav@...> wrote:
--
Sean Gillies
Frankly, I always think giving more context will help, especially since the Python bindings log GDAL errors to stderr and you have to figure out how to capture them. However, in my case I guess it was pretty much user error.