Re: rio stack with compression
I don't understand the issue entirely, but as Even Rouault explains in https://lists.osgeo.org/pipermail/gdal-dev/2015-June/041917.html, a "naive" use of the write() method of a Rasterio dataset (and rio-stack does use it naively) can result in redundant blocks of data being written to the GeoTIFF file. The workarounds are to specify band interleaving (as you discovered) or to increase GDAL_CACHEMAX so that the entire output file can be kept in memory until it is written. Remember: a dataset's write() method writes data to the GDAL block cache, and a block is written from that cache to the GeoTIFF file only when it is bumped out by new data coming in or when the dataset object is closed.
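To make the cache behaviour above concrete, here is a minimal toy model in Python. It is only a sketch and not GDAL's actual implementation: the function name `redundant_block_writes`, the LRU eviction policy, and the idea of counting blocks rather than bytes are all assumptions made for illustration. It mimics a writer that fills the output one whole band at a time and counts how often a block is flushed to the file more than once.

```python
from collections import OrderedDict


def redundant_block_writes(n_bands, n_tiles, cache_blocks, interleave):
    """Toy model: count block flushes beyond the first for each file block.

    Hypothetical simplification: the cache holds at most `cache_blocks`
    dirty blocks and evicts the least recently used one when full.
    """
    cache = OrderedDict()  # dirty blocks currently held in memory
    seen = set()           # every distinct file block touched
    flushes = 0            # blocks pushed out to the file so far

    # "Naive" order: write all tiles of band 0, then band 1, and so on.
    for band in range(n_bands):
        for tile in range(n_tiles):
            # Pixel interleaving: one file block holds all bands of a tile.
            # Band interleaving: each (band, tile) pair is its own block.
            key = tile if interleave == "pixel" else (band, tile)
            seen.add(key)
            if key in cache:
                cache.move_to_end(key)  # still in memory, no flush needed
            else:
                cache[key] = True
                if len(cache) > cache_blocks:
                    cache.popitem(last=False)  # bumped out, so flushed
                    flushes += 1

    flushes += len(cache)  # closing the dataset flushes what is left
    return flushes - len(seen)


# 3 bands, 100 tiles, room for only 10 blocks in the cache
print(redundant_block_writes(3, 100, 10, "pixel"))   # small cache, pixel interleave
print(redundant_block_writes(3, 100, 100, "pixel"))  # cache holds the whole file
print(redundant_block_writes(3, 100, 10, "band"))    # band interleave
```

In this model, only the small-cache, pixel-interleaved case rewrites blocks. With compression, a rewritten block may not fit where the earlier copy was stored, which is how the redundant blocks described in Even's post can inflate the file. In Rasterio itself, the two workarounds correspond to passing `interleave="band"` as a creation option, or opening the dataset inside `rasterio.Env(GDAL_CACHEMAX=...)` with a value large enough for the output.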
These are the issues that I need to bring up in Rasterio's documentation.
Hope this helps!
On Thu, Sep 6, 2018 at 5:37 AM Luke Pinner <lukepinnerau@...> wrote:
I stacked some singleband rasters into a multiband raster using `rio stack` with a compress=deflate creation option. The compressed output filesize was approx 1.4x the sum of the uncompressed input filesizes. Specifying --co interleave=band when running `rio stack` got the output filesize down to 0.25x the uncompressed input (per issue #70). Running `gdalbuildvrt` to stack them, then `rio convert` or `gdal_translate` on the VRT with the same creation options as the original rio stack command also results in an output raster of around 0.25x the uncompressed input, but with default pixel interleaving.