
MemoryFile workflow - should closing a dataset close the memfile?


Dion Häfner <dion.haefner@...>
 

You can use [contextlib.ExitStack](https://docs.python.org/3/library/contextlib.html#contextlib.ExitStack) in situations like these to save indentation levels (or enter multiple contexts in a comprehension).
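As a rough sketch of what that could look like: the `fake_raster` helper below is a hypothetical stand-in for your own functions that yield a MemoryFile-backed dataset (it is not part of rasterio), and `events` only exists to show the order in which things get opened and closed. Only `ExitStack` and `contextmanager` are real standard-library names here.

```python
from contextlib import ExitStack, contextmanager

events = []  # records open/close order, for illustration only


@contextmanager
def fake_raster(name):
    """Hypothetical stand-in for a function that yields an in-memory dataset."""
    events.append(f"open {name}")
    try:
        yield name
    finally:
        events.append(f"close {name}")


def run_pipeline(ds1):
    # One indentation level, however many stages are chained.
    with ExitStack() as stack:
        ds2 = stack.enter_context(fake_raster(f"step1({ds1})"))
        ds3 = stack.enter_context(fake_raster(f"step2({ds2})"))
        # Contexts can also be entered inside a comprehension.
        extras = [stack.enter_context(fake_raster(n)) for n in ("a", "b")]
        # Leaving the with block closes everything in reverse order of entry.
        return ds3, extras
```

With real datasets you would do all the work inside the `with ExitStack()` block, since everything entered on the stack is closed as soon as the block exits.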

On 29/07/2019 09.45, ronipay via Groups.Io wrote:

Thanks for the reply Luke!

A context manager would work, but it's ergonomically annoying, no? I have several functions I would like to pipe, each returning a raster. In that case I will end up with a lot of nested with statements:

with do_something1(ds1) as ds2:
    with do_something2(ds2) as ds3:
        with ...

My current solution is to replace MemoryFile with NamedTemporaryFile, which solves the memory leak at the expense of some speed (but it's not terrible when you have an SSD).


ronipay@...
 

Thanks for the reply Luke!

A context manager would work, but it's ergonomically annoying, no? I have several functions I would like to pipe, each returning a raster. In that case I will end up with a lot of nested with statements:

with do_something1(ds1) as ds2:
    with do_something2(ds2) as ds3:
        with ...

My current solution is to replace MemoryFile with NamedTemporaryFile, which solves the memory leak at the expense of some speed (but it's not terrible when you have an SSD).


Luke
 

You could use a context manager to clean it up automatically:

from contextlib import contextmanager

import rasterio
from rasterio import MemoryFile


@contextmanager
def mem_raster(data, **profile):
    with MemoryFile() as memfile:
        # Write the array into the in-memory file, then close the writer
        with memfile.open(**profile) as dataset_writer:
            dataset_writer.write(data)

        # Reopen for reading and hand the dataset to the caller
        with memfile.open() as dataset_reader:
            yield dataset_reader


# Setup: get data from somewhere, copy or create a profile, etc.

with mem_raster(data, **profile) as ds:
    do_something_with(ds)

# The memfile is cleaned up after exiting the with block


ronipay@...
 

Hey all,

I have a general workflow which causes a memory leak. I have several functions which receive one or more rasters, perform some operations, and return a MemoryFile-based raster, on which I perform other operations, which may also return a memory-based raster, etc.

Closing this raster (or deleting the object) does not free the MemoryFile memory. So for now I have to propagate the memfile to all consumer functions, which is annoying, not ergonomic, and prone to mistakes.

I'm familiar with rioxarray which might be useful, but we have a lot of code based on pure rasterio.

What is the best way to deal with this? Is it possible to create a specialized DatasetReader/DatasetWriter which closes the memory file on __del__?

Thanks