An open Rasterio dataset object should not be passed between multiple processes or threads; the underlying GDALDataset is not thread-safe. Additionally, the dataset's lifecycle should be made explicit, either by calling `.close()` explicitly or by opening the dataset in a context manager (recommended).
It's not clear what your intention is with the `worker` function, but I can see two ways to approach it, depending on your goal:
If each process simply needs access to the array of data, I would read all of the data in `__init__` and close the dataset before invoking any parallel workers. Then you're just sharing a NumPy array instead of a stateful dataset object.
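    def __init__(self):
        # Read everything up front; the dataset is closed when the `with` block exits.
        with rasterio.open('/Users/mperry/work/rasterio/tests/data/RGB.byte.tif') as src:
            self.data = src.read()  # keep the array on the instance so workers can use it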
If you need to read different parts of the dataset from each process, you should pass the dataset path and open and close a new dataset within each process or thread. You can't share a dataset object between threads or processes, but you can create multiple datasets pointing to the same resource.
Hope this helps.