How to extract DATA_ENCODING creation option type from an input grib2 file for use in the output grib2 file


Shane Mill - NOAA Affiliate
 

Hey everyone,

I know that grib2 isn't commonly used by this community so I apologize that I keep bringing it up... You can probably tell that I am super fond of it! Anyways, all kidding aside, I am very impressed with the abilities of Rasterio and how responsive this community has been.

In a previous topic, Sean and I discussed that creation options for a specific driver can be assigned when opening a grib file in writing mode.
e.g. "with rasterio.open('file.grb2', 'w', driver='GRIB', dtype='float64', ..., DATA_ENCODING='COMPLEX_PACKING', SPATIAL_DIFFERENCING_ORDER=2) as ds:"

The grib2 specific creation options can be found here:
https://www.gdal.org/frmt_grib.html
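A minimal sketch of that pattern, for anyone following along. The helper names (grib2_write_profile, write_grib2) and the single-band float64 layout are assumptions made for illustration; only the driver name and the two creation options come from the call above.

```python
def grib2_write_profile(width, height, crs, transform,
                        data_encoding="COMPLEX_PACKING",
                        spatial_differencing_order=2):
    """Build keyword arguments for rasterio.open(path, 'w', **profile).

    Hypothetical helper: the band count and dtype are illustrative choices.
    """
    return {
        "driver": "GRIB",
        "dtype": "float64",
        "count": 1,
        "width": width,
        "height": height,
        "crs": crs,
        "transform": transform,
        # Driver-specific creation options are passed as extra keywords.
        "DATA_ENCODING": data_encoding,
        "SPATIAL_DIFFERENCING_ORDER": spatial_differencing_order,
    }


def write_grib2(path, data, profile):
    """Write a 2D array as band 1 using the profile built above."""
    import rasterio  # imported here so the profile helper has no dependencies

    with rasterio.open(path, "w", **profile) as dst:
        dst.write(data, 1)
```

The encoding values are still hardcoded as defaults here, which is exactly the limitation described next.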

This works well for creating a grib2 file encoded in a way that can be read by gdalinfo, toolsUI, and other grib2 readers. What I need to figure out now is how to extract DATA_ENCODING and SPATIAL_DIFFERENCING_ORDER from an input grib2 file so that these creation options don't need to be hardcoded when creating a new grib2 file. For example, say you have model data in an input grib2 file, and you create a new grib2 file containing a parameter derived from two bands of the input file. I would want DATA_ENCODING to be the same for both the input and output grib2 files.

Many of the creation options are tracked in the metadata, but as far as I can tell, DATA_ENCODING is not. I'm wondering if it actually is somewhere and I'm just missing it. 

At this link (https://github.com/mapbox/rasterio/issues/405), Sean, you indicated that the GDAL API doesn't track creation options, but that you were able to develop this in Rasterio. Going back to https://www.gdal.org/frmt_grib.html, it looks like all of the Product Identification and Definition options are tracked in src.meta, src.profile, or src.tags(band#). Data Encoding doesn't appear to be.

I just wanted to see if you (or anyone else) had any feedback on this, or know of any way to extract the type of data encoding from an input file.

Thanks as always!
Shane




Sean Gillies
 

Hi Shane,

Before Rasterio 1.0, the project did record creation options in a custom tag (metadata) namespace. We discovered problems when converting from GeoTIFF to other formats: creation options are very format specific and few of the GeoTIFF ones make any sense for other formats. It turned out that if you, for example, translated from GeoTIFF to something else and then back to GeoTIFF, you could end up with recorded creation options that didn't reflect what existed in the data. We decided to remove this feature in Rasterio 1.0 and leave it up to developers to do it for themselves if they need it.

I see what you mean about the GRIB driver's DATA_ENCODING option. I don't see a good way to get this from the GDAL API. You might have to use a native GRIB API, or begin to store the value of this creation option in your application or in a custom namespace in your dataset metadata.
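One application-side way to do that bookkeeping is a small JSON file written next to each output. This is only a sketch under assumptions: the ".creation_options.json" suffix and the function names are made up for illustration and are not part of Rasterio or GDAL.

```python
import json
from pathlib import Path


def sidecar_path(dataset_path):
    """Return the (hypothetical) sidecar file path for a dataset."""
    return Path(str(dataset_path) + ".creation_options.json")


def save_creation_options(dataset_path, **options):
    """Record the creation options used when the dataset was written."""
    sidecar_path(dataset_path).write_text(json.dumps(options, indent=2))


def load_creation_options(dataset_path, default=None):
    """Read recorded options back, or fall back to a default mapping."""
    path = sidecar_path(dataset_path)
    if not path.exists():
        return dict(default or {})
    return json.loads(path.read_text())
```

The loaded dictionary can then be unpacked into rasterio.open(..., 'w', **options) for the derived file. This only helps for files your own application wrote; it cannot recover the encoding of a grib2 file that arrived from elsewhere.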


On Tue, Apr 2, 2019 at 3:22 PM Shane Mill - NOAA Affiliate via Groups.Io <shane.mill=noaa.gov@groups.io> wrote:






--
Sean Gillies


Shane Mill - NOAA Affiliate
 

Hey Sean,

Thanks for the response, I can definitely see why that would be problematic, especially for converting between formats. That makes perfect sense to me. It's unfortunate that it isn't originally provided in the GDAL API, but I'll look at other ways of getting around this with a native GRIB API or within the application.

Again, thanks as always for the context and background. It's very much appreciated.

Shane


Shane Mill - NOAA Affiliate
 

In case anyone ends up needing to do this: I ended up using ecCodes. You can retrieve and set the 'packingType' key, which corresponds to DATA_ENCODING, with codes_get and codes_set.
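A rough sketch of the reading side, assuming the ecCodes Python bindings are installed. The mapping from packingType values to GDAL DATA_ENCODING values, the use of the 'orderOfSpatialDifferencing' key, and the function names are assumptions for illustration and should be checked against the ecCodes and GDAL documentation for your versions.

```python
# Assumed correspondence between ecCodes packingType values and GDAL's
# DATA_ENCODING creation option values; verify before relying on it.
PACKING_TO_ENCODING = {
    "grid_simple": "SIMPLE_PACKING",
    "grid_complex": "COMPLEX_PACKING",
    "grid_complex_spatial_differencing": "COMPLEX_PACKING",
    "grid_ieee": "IEEE_FLOATING_POINT",
    "grid_png": "PNG",
    "grid_jpeg": "JPEG2000",
}


def creation_options_from_packing(packing_type, differencing_order=None):
    """Translate a packingType (and optional order) into creation options."""
    encoding = PACKING_TO_ENCODING.get(packing_type)
    if encoding is None:
        raise ValueError(f"no known DATA_ENCODING for {packing_type!r}")
    options = {"DATA_ENCODING": encoding}
    if (packing_type == "grid_complex_spatial_differencing"
            and differencing_order is not None):
        options["SPATIAL_DIFFERENCING_ORDER"] = int(differencing_order)
    return options


def read_packing(path):
    """Return (packingType, differencing order or None) for the first message."""
    import eccodes  # imported here so the mapping above has no dependencies

    with open(path, "rb") as f:
        gid = eccodes.codes_grib_new_from_file(f)
        if gid is None:
            raise ValueError("no GRIB message found in file")
        try:
            packing_type = eccodes.codes_get(gid, "packingType")
            order = None
            if packing_type == "grid_complex_spatial_differencing":
                # Assumed key name for the differencing order.
                order = eccodes.codes_get(gid, "orderOfSpatialDifferencing")
            return packing_type, order
        finally:
            eccodes.codes_release(gid)
```

Something like creation_options_from_packing(*read_packing('input.grb2')) could then be unpacked into the rasterio.open call for the output file. Note that this inspects only the first message; other messages in the same file may be packed differently.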

Thanks,
-Shane


 

Thanks Shane/Sean. On the pygeoapi side, I also stumbled on this issue and was able to allow for user-specific options in configuration and pass them on to rasterio accordingly. Definitely not dynamic, but it worked for me, at least from the perspective of consistent (enough) production of GRIB2.
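As a rough illustration of that idea (the configuration layout and function name below are hypothetical and are not pygeoapi's actual schema), the configured options can simply be merged into the keyword arguments handed to rasterio.open:

```python
# Hypothetical configuration layout, for illustration only.
EXAMPLE_CONFIG = {
    "format": {
        "name": "GRIB",
        "options": {
            "DATA_ENCODING": "COMPLEX_PACKING",
            "SPATIAL_DIFFERENCING_ORDER": 2,
        },
    }
}


def open_kwargs_from_config(config, base_profile):
    """Merge user-configured driver options into a copy of a base profile."""
    fmt = config.get("format", {})
    kwargs = dict(base_profile)  # leave the caller's profile untouched
    kwargs["driver"] = fmt.get("name", kwargs.get("driver"))
    kwargs.update(fmt.get("options", {}))
    return kwargs
```

The result would be used as rasterio.open(path, 'w', **kwargs), so every output is produced with the same configured encoding regardless of how the input was packed.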

Cheers

..Tom