tempfile.NamedTemporaryFile behaving as /vsimem and eating all the machine memory
vincent.sarago@...
While working on https://github.com/cogeotiff/rio-cogeo/pull/75 we noticed strange behavior with the `vsimem` driver (this could be a GDAL bug, TBH).

1. When using `tempfile.NamedTemporaryFile()`, Rasterio uses the `vsimem` driver. Here I was expecting Rasterio/GDAL to behave as if the `tempfile` were a regular file.
2. When closing a `vsimem` file (`MemoryFile` or `tempfile`), we observe a huge memory surge when working with big images.

Code: https://github.com/cogeotiff/rio-cogeo/pull/75#issuecomment-482745580
Tested on Mac OS and Linux with Python 3.7 (GDAL 2.4 and 2.3).

Thanks
Sean Gillies
Hi Vincent,

This is expected (if not well-documented) behavior. tempfile.NamedTemporaryFile() returns an open Python file object, not a filename. GDAL can't use a Python file object, so in that case rasterio.open reads all the bytes from the file object, copies them to the vsimem filesystem, and opens that vsimem file.

I think what you want to do is pass the name of the temp file object to GDAL. Like this:

    import tempfile
    import rasterio

    with tempfile.NamedTemporaryFile() as temp:
        with rasterio.open(temp.name) as dataset:
            print(dataset)

No copy in the vsimem filesystem will be made.
On Tue, Apr 16, 2019 at 6:55 AM <vincent.sarago@...> wrote:
> While working on https://github.com/cogeotiff/rio-cogeo/pull/75 we noticed strange behavior with the `vsimem` driver (this could be a GDAL bug, TBH).

--
Sean Gillies
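
For readers who want to guard against accidentally passing a file object, the distinction described above (file object versus filename) can be sketched with the standard library alone. `to_filename` is a hypothetical helper written for this illustration, not part of rasterio or GDAL; it only assumes that a named temporary file exposes its path through `.name`.

```python
import os
import tempfile


def to_filename(file_or_path):
    """Return a filesystem path for a path string or a named file object.

    Hypothetical helper for illustration only; not a rasterio API.
    """
    if isinstance(file_or_path, (str, os.PathLike)):
        return os.fspath(file_or_path)
    # File objects backed by a real file expose their path as `.name`.
    name = getattr(file_or_path, "name", None)
    if isinstance(name, str) and os.path.exists(name):
        return name
    raise TypeError("expected a path or a file object backed by a named file")


with tempfile.NamedTemporaryFile(suffix=".tif") as temp:
    path = to_filename(temp)
    # `path` is a plain string and could be handed to rasterio.open(path)
    # in place of the file object `temp`.
    print(type(temp).__name__, "->", path)
```

Purely in-memory objects such as `io.BytesIO` have no usable name, so the helper raises `TypeError` for them rather than guessing.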
vincent.sarago@...
Thanks Sean, this is really helpful, and I love the `temp.name` solution.
About the second point, do you have any idea why the `/vsimem` driver needs so much memory when exiting/closing? Should I raise this on the GDAL list?
Sean Gillies
Vincent,

At https://github.com/mapbox/rasterio/blob/master/rasterio/__init__.py#L191, a big GeoTIFF is created in RAM. Then at https://github.com/mapbox/rasterio/blob/master/rasterio/__init__.py#L199 that GeoTIFF is read into memory *again* so that it can be written to the Python file object. There will be two copies in memory. It's terribly inefficient, but I don't want to spend the time to optimize this case when I should be documenting the limitations instead.
On Tue, Apr 16, 2019 at 12:47 PM <vincent.sarago@...> wrote:
> Thanks Sean, this is really helpful, and I love the `temp.name` solution.

--
Sean Gillies
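
One way to sidestep the double copy described above, offered only as a sketch, is to write the dataset to a filename and, if the bytes must still end up in a Python file object, stream them from disk in fixed-size chunks. The example below uses only the standard library; `copy_file_to_fileobj` and `CHUNK_SIZE` are hypothetical names chosen for this illustration, and the zero-filled file stands in for a GeoTIFF that would have been written by name.

```python
import os
import shutil
import tempfile

CHUNK_SIZE = 1024 * 1024  # 1 MiB per read; an arbitrary choice for this sketch


def copy_file_to_fileobj(src_path, dst_fileobj, chunk_size=CHUNK_SIZE):
    """Stream the file at src_path into an open binary file object in chunks.

    Hypothetical helper for illustration only; not a rasterio API.
    """
    with open(src_path, "rb") as src:
        # copyfileobj reads and writes at most chunk_size bytes at a time.
        shutil.copyfileobj(src, dst_fileobj, chunk_size)


with tempfile.TemporaryDirectory() as tmpdir:
    src_path = os.path.join(tmpdir, "big.tif")
    # Stand-in for a dataset written by name rather than through a file object.
    with open(src_path, "wb") as f:
        f.write(b"\x00" * (3 * CHUNK_SIZE + 17))

    with tempfile.TemporaryFile() as target:
        copy_file_to_fileobj(src_path, target)
        print("bytes copied:", target.tell())
```

With this approach only one chunk of the file needs to be held in Python memory at a time, at the cost of an extra pass over the file on disk.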