Re: Asyncio + Rasterio for slow network requests?


Dion Häfner
 

Hey Sean,

Sorry, I should have been clearer.

As it stands, my statement is false: GDAL is of course designed to be thread-safe, so doing concurrent reads in different threads *should* work. But in our experience, it doesn't, to the point that we have given up on threads entirely.

Relevant issues from last year:

https://github.com/mapbox/rasterio/issues/1686

https://github.com/OSGeo/gdal/issues/1960

https://github.com/OSGeo/gdal/issues/1244

Even though GDAL#1244 was closed as fixed, we still observed the problem, so I suspect there is another race condition somewhere within GDAL.

Anyway, this wasn't meant as a general statement, just a personal word of advice. To me, multiprocessing seems like a saner alternative at the moment, but YMMV.
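
A minimal sketch of the multiprocessing route, assuming rasterio is installed and every worker process opens its own dataset, so no GDAL handle is shared across workers. The names `read_window` and `read_windows_in_processes` are made up for illustration, and the `open_fn` hook is an assumption added only so the reader can be swapped out; windows are plain `((row_start, row_stop), (col_start, col_stop))` tuples.

```python
from concurrent.futures import ProcessPoolExecutor


def read_window(path, window, open_fn=None):
    """Open the dataset inside the calling process, read band 1 for one window, close it."""
    if open_fn is None:
        # Imported lazily so that each worker process sets up GDAL on its own.
        import rasterio
        open_fn = rasterio.open
    with open_fn(path) as src:
        return src.read(1, window=window)


def read_windows_in_processes(path, windows, max_workers=4):
    """Read each window in a worker process; results come back in input order."""
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(read_window, path, w) for w in windows]
        return [f.result() for f in futures]
```

On platforms that start workers with "spawn", the call to `read_windows_in_processes` should sit under an `if __name__ == "__main__":` guard.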

Best,
Dion

On 30/03/2020 23.38, Sean Gillies via Groups.Io wrote:
Hi Kyle, Dion:
On Mon, Mar 30, 2020 at 1:41 PM <kylebarron2@gmail.com> wrote:
Sorry for the slow response. As Vincent noted, just moving back to
GDAL 2.4 made the process ~8x faster, from 1.7s to read to ~200ms to
read each source tile.

> A constant time regardless of the amount of overlap suggests to
me that your source files may lack the proper tiling.
According to the AWS NAIP docs
<https://docs.opendata.aws/naip/readme.html>,
the COG sources were created with
    gdal_translate -b 1 -b 2 -b 3 -of GTiff -co tiled=yes -co BLOCKXSIZE=512 -co BLOCKYSIZE=512 -co COMPRESS=DEFLATE -co PREDICTOR=2 src_dataset dst_dataset
    gdaladdo -r average -ro src_dataset 2 4 8 16 32 64
    gdal_translate -b 1 -b 2 -b 3 -of GTiff -co TILED=YES -co BLOCKXSIZE=512 -co BLOCKYSIZE=512 -co COMPRESS=JPEG -co JPEG_QUALITY=85 -co PHOTOMETRIC=YCBCR -co COPY_SRC_OVERVIEWS=YES --config GDAL_TIFF_OVR_BLOCKSIZE 512 src_dataset dst_dataset
Thank you for the details.

> asyncio's run_in_executor does the exact same thing as using a
thread pool
That makes sense, and I ultimately expected to not be able to make
progress since it's GDAL making the low level requests.

> Usually, reading a tile from S3 takes something like 10-100ms if you do it right.
Moving back to GDAL 2.4 got around these speeds.

> At the moment, GDAL reads are not thread-safe!
That's really great to keep in mind! Means I'll probably shy away
from attempting concurrency with GDAL in general.
Dion, can you say a little more about reads not being thread-safe?
It's intended that we can call GDAL's RasterIO functions in different threads concurrently as long as we don't share dataset handles between threads. If we observe otherwise, then there is a GDAL bug that we can fix.
There is an additional consideration for VRTs explained in https://gdal.org/drivers/raster/vrt.html#multi-threading-issues. If we have multiple VRTs, used in different threads, pointing to the same URLs, we need to take an extra step to prevent GDAL from accidentally sharing those non-VRT dataset handles between the threads.
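
The intended usage described above (concurrent reads are fine as long as no dataset handle crosses a thread boundary) could be sketched roughly as follows. This assumes rasterio is installed; `read_windows_in_threads` and the `open_fn` hook are hypothetical names for illustration, and the sketch does not cover the extra step needed for VRTs.

```python
from concurrent.futures import ThreadPoolExecutor


def read_windows_in_threads(path, windows, open_fn=None, max_workers=4):
    """Read band 1 for each window, opening a fresh dataset handle per task."""
    if open_fn is None:
        import rasterio
        open_fn = rasterio.open

    def task(window):
        # The handle is opened, used, and closed inside this one task,
        # so it is never shared with another thread.
        with open_fn(path) as src:
            return src.read(1, window=window)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves the order of the input windows.
        return list(pool.map(task, windows))
```

Opening a handle per task costs more than reusing one, but it keeps the "one handle, one thread" rule obvious.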
--
Sean Gillies
--

Dion Häfner
PhD Student

Niels Bohr Institute
Physics of Ice, Climate and Earth
University of Copenhagen
Tagensvej 16, DK-2200 Copenhagen, DENMARK

_.~"~._.~"~._.~"~._.~"~._.~"~._.~"~._.~"~._
