Megatiles (COG) support for Rasterio

Madhav Desetty
 

Has anyone attempted to support COG megatiles in rasterio? By megatiles I mean large image mosaics (gigabytes or terabytes of data) that are too big to sensibly distribute as a single COG. There was some discussion during the FOSS4G-NA 2018 BoF session, where the Planet folks apparently used a naming scheme that a tile service can use to figure out what to serve, without having to rely on any external database or key. I am not sure how that was done. I was wondering whether we could do this at a much lower level, in rasterio or even GDAL, where you create a .vrt and GDAL handles the reading, something along these lines:
...
<SimpleSource>
      <SourceFilename relativeToVRT="0">/vsigs/gcsbucket-1/mosaic_C01R01.tif</SourceFilename>
      <SourceBand>1</SourceBand>
      <SourceProperties RasterXSize="26005" RasterYSize="26010" DataType="Byte" BlockXSize="512" BlockYSize="512" />
      <SrcRect xOff="0" yOff="0" xSize="26005" ySize="26010" />
      <DstRect xOff="0" yOff="123155.263157894" xSize="13686.8421052633" ySize="13689.4736842106" />
</SimpleSource>
...
Maybe there are better ways. I was curious whether anyone has tackled a similar issue.
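
For illustration, here is a minimal sketch of how such a .vrt could be generated from a tile naming scheme alone. It assumes a regular grid of equally sized tiles named `<prefix>_C<col>R<row>.tif` (this reading of the `C01R01` pattern is a guess), places tiles at full resolution without the scaling shown in the DstRect above, and leaves out the SRS and GeoTransform elements a real VRT needs. The function names are hypothetical. In practice gdalbuildvrt is the usual tool for this.

```python
import xml.etree.ElementTree as ET


def simple_source(filename, band, tile_w, tile_h, dst_x, dst_y):
    """Build one <SimpleSource> that places a whole tile at (dst_x, dst_y)."""
    src = ET.Element("SimpleSource")
    name = ET.SubElement(src, "SourceFilename", relativeToVRT="0")
    name.text = filename
    ET.SubElement(src, "SourceBand").text = str(band)
    ET.SubElement(src, "SrcRect", xOff="0", yOff="0",
                  xSize=str(tile_w), ySize=str(tile_h))
    # No resampling here: the destination rectangle has the tile's own size.
    ET.SubElement(src, "DstRect", xOff=str(dst_x), yOff=str(dst_y),
                  xSize=str(tile_w), ySize=str(tile_h))
    return src


def mosaic_vrt(cols, rows, tile_w, tile_h, prefix):
    """Return VRT XML for a cols x rows grid of single-band Byte tiles."""
    root = ET.Element("VRTDataset",
                      rasterXSize=str(cols * tile_w),
                      rasterYSize=str(rows * tile_h))
    band = ET.SubElement(root, "VRTRasterBand", dataType="Byte", band="1")
    for r in range(rows):
        for c in range(cols):
            # Assumed naming scheme: 1-based column and row, zero padded.
            filename = f"{prefix}_C{c + 1:02d}R{r + 1:02d}.tif"
            band.append(simple_source(filename, 1, tile_w, tile_h,
                                      c * tile_w, r * tile_h))
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(mosaic_vrt(2, 2, 26005, 26010, "/vsigs/gcsbucket-1/mosaic"))
```

Because the file name is derived from the grid position, nothing outside the naming scheme is needed to find the tile that covers a given pixel.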

vincent.sarago@...
 

Hi,
This is a really interesting topic. A couple of weeks ago I tried to get a better understanding of how this could work with Rasterio (and rio-tiler).

I was under the impression that rasterio (or rio-tiler) was doing something wrong when doing partial reads from remote COGs. Here is a question I asked Even (GDAL guru): https://lists.osgeo.org/pipermail/gdal-dev/2018-October/049199.html

I'll try to find more time in the coming weeks to set up some tests and narrow down what I was doing wrong.

Vincent 

Guy Doulberg
 

Hi

We are creating small COGs of up to 100 MB and mosaicking them together using a VRT.

Then we build another layer of overviews on top of the mosaic, which is a COG BigTIFF.

We are using rasterio.read with windows.

By doing that:
1. Each of the tiles (COGs) is not too big and is easy to work with
2. If you want the original resolution, you go directly to the tile itself
3. If you want a larger area, you go to the overviews with lower resolution

This entire setup is backed by Azure Blob Storage.

Hope that helps.

Madhav Desetty
 

Hi Guy,
Thanks for the response. So when you call rasterio.read, are you passing in the .vrt file, or does your application decide which file to use (the .vrt vs. the COG BigTIFF overview)? I am presuming GDAL handles reading the BigTIFF overview of the mosaic associated with the .vrt.

Guy Doulberg
 

I pass the VRT file to rasterio's read.

The read uses the tiles or the overviews depending on the out_shape you pass to it.
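
A minimal sketch of such a read, assuming rasterio is installed. The function names, the path argument, and the decimation factor are illustrative and are not the actual application code. When out_shape is smaller than the window, GDAL can satisfy the request from an overview level instead of the full-resolution tiles.

```python
import math


def decimated_shape(count, window_height, window_width, factor):
    """Shape of the output array when a window is read at 1/factor resolution."""
    return (count,
            math.ceil(window_height / factor),
            math.ceil(window_width / factor))


def read_decimated(vrt_path, col_off, row_off, width, height, factor):
    """Read a window of the mosaic, decimated by `factor` in each direction."""
    # Imported here so decimated_shape stays usable without rasterio installed.
    import rasterio
    from rasterio.windows import Window

    window = Window(col_off, row_off, width, height)
    with rasterio.open(vrt_path) as src:
        out_shape = decimated_shape(src.count, height, width, factor)
        # factor == 1 reads the source tiles at native resolution;
        # larger factors let GDAL pick a matching overview if one exists.
        return src.read(window=window, out_shape=out_shape)
```

For example, `read_decimated("mosaic.vrt", 0, 0, 4096, 4096, 8)` would return an array with 512 rows and 512 columns per band, where `mosaic.vrt` is a placeholder path.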



On Wed, 2 Jan 2019 at 17:39, <madhav@...> wrote:

Hi Guy,
Thanks for the response. So when you call rasterio.read, are you passing in the .vrt file, or does your application decide which file to use (the .vrt vs. the COG BigTIFF overview)? I am presuming GDAL handles reading the BigTIFF overview of the mosaic associated with the .vrt.

Madhav Desetty
 

OK, cool. Do you mind sharing the gdaladdo command you used to generate the COG overview file? I am presuming you used gdaladdo.

Guy Doulberg
 

Hi,

I'll gladly share.

Actually I am using rasterio's build_overviews; you can find the code at the following link:
https://github.com/satellogic/telluric/blob/master/telluric/util/raster_utils.py#L363

To tell you the truth, I suspect that just running `gdaladdo` is faster, so I have been thinking about switching to it. It works on this like on any other raster.
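
For reference, a minimal sketch of building overviews with rasterio, following the pattern in the rasterio documentation rather than the telluric code linked above. The helper names, the 512 block size, and the choice of average resampling are assumptions. This only adds overviews and does not by itself rearrange the file into a COG layout.

```python
def overview_factors(width, height, blocksize=512):
    """Power-of-two decimation factors, down to a level that fits in one block."""
    factors = []
    factor = 1
    while max(width, height) / factor > blocksize:
        factor *= 2
        factors.append(factor)
    return factors


def add_overviews(path, blocksize=512):
    """Build averaged overviews on the dataset at `path`, opened for update."""
    # Imported here so overview_factors stays usable without rasterio installed.
    import rasterio
    from rasterio.enums import Resampling

    with rasterio.open(path, "r+") as dst:
        factors = overview_factors(dst.width, dst.height, blocksize)
        dst.build_overviews(factors, Resampling.average)
        # Record the resampling method alongside the overviews.
        dst.update_tags(ns="rio_overview", resampling="average")
    return factors
```

For a 26005 x 26010 raster this yields the factors 2, 4, 8, 16, 32, 64, which would correspond to something like `gdaladdo -r average <file> 2 4 8 16 32 64` on the command line.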

Guy

Madhav Desetty
 

Ah, OK. I was not sure whether rasterio or gdaladdo would create Cloud Optimized GeoTIFFs by default when you build the overviews. The docs don't seem to make that very clear. It sounds like they do. I will give it a spin with rasterio and also see if I can get the same results with gdaladdo.