MODIS (HDF4) data with MemoryFile


Hussain Hassam
 

Hello,

 

I posted the message below in the Spatial Community #mapbox Slack channel, but I have yet to receive a response. I am hoping to get some information about this before I consider it a bug and post it to GitHub issues.

 

We have been trying to open HDF4 files (MODIS data) directly from a source (e.g. https://e4ftl01.cr.usgs.gov//DP131/MOLT/MOD09A1.061/2021.06.26/MOD09A1.A2021177.h14v00.061.2021186035450.hdf) with rasterio (in Python). The first issue is that the source requires authentication. We solved this by using requests to download the HDF file into a buffer (a Python io.BytesIO object, which we seek back to the start after writing to it). Our thought was to then load this into rasterio using rasterio.io.MemoryFile. When we try to open the MemoryFile, we get an error:

 

CPLE_OpenFailedError: '/vsimem/c604e9d5-cc4b-42bc-a34a-de4e541355cc/c604e9d5-cc4b-42bc-a34a-de4e541355cc.hdf' not recognized as a supported file format.
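
For readers following along, the approach described above might look roughly like the sketch below. It is not the poster's actual code: the function names, the chunk size, the pre-authenticated `session` argument, and the `ext=".hdf"` argument are all assumptions made for illustration. rasterio is imported lazily so the download half can be exercised without it.

```python
import io


def fetch_to_buffer(url, session, chunk_size=1 << 20):
    """Download `url` into an in-memory buffer and rewind it.

    `session` is assumed to be an already-authenticated requests.Session;
    how it authenticates is outside the scope of this sketch.
    """
    buffer = io.BytesIO()
    response = session.get(url, stream=True)
    response.raise_for_status()
    for chunk in response.iter_content(chunk_size=chunk_size):
        buffer.write(chunk)
    buffer.seek(0)  # rewind so readers start at the first byte
    return buffer


def open_in_memory(buffer):
    """Try to open the buffered bytes through GDAL's /vsimem/ file system.

    For HDF4 content this is the step reported to fail with
    CPLE_OpenFailedError.
    """
    from rasterio.io import MemoryFile  # lazy import; requires rasterio

    # ext=".hdf" is an assumption, chosen to match the path in the error.
    with MemoryFile(buffer.getvalue(), ext=".hdf") as memfile:
        with memfile.open() as dataset:
            return dataset.profile
```

With a GeoTIFF in the buffer, `open_in_memory` would be expected to succeed; with an HDF4 file it is the call that produces the error quoted above.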

 

The odd part is that if we first write the io.BytesIO object to a file on disk (example.hdf) and then call rasterio.open('example.hdf'), everything works as expected. However, we are trying to avoid any sort of disk I/O to keep the process streamlined.

 

Thank you for your time,

Hussain Hassam


Sean Gillies
 

Hi Hussain,

Thanks for posting here. I don't participate in that Slack site and so I don't see any rasterio questions there.

Can you show the exact code you are using?

I see in GDAL's tests and in an announcement https://lists.osgeo.org/pipermail/gdal-dev/2018-August/048934.html that in-memory HDF5 is supported. It may be possible that in-memory HDF4 is not supported by GDAL. I am not a regular HDF user and so I am not certain whether that should be expected to work or not.

On Fri, Jul 9, 2021 at 4:29 PM Hussain Hassam via groups.io <hhassam=hatfieldgroup.com@groups.io> wrote:


--
Sean Gillies


Even Rouault
 

> It may be possible that in-memory HDF4 is not supported by GDAL.

Neither in-memory files nor any other /vsi virtual file system will work with HDF4. The underlying library has no abstraction for I/O and uses the plain C standard library API.
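
Since the HDF4 library needs a real file system path, one way to keep the rest of a pipeline in memory is to spill the bytes to a temporary file only for the duration of the read. The following is a minimal sketch under that assumption; the helper name `hdf4_on_disk` and its parameters are hypothetical and not part of rasterio or GDAL.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def hdf4_on_disk(data, suffix=".hdf", dir=None):
    """Yield the path of a temporary file holding `data` (bytes).

    The file is fully written and closed before the path is yielded, so a
    library that opens files by name can read it. It is removed on exit.
    """
    fd, path = tempfile.mkstemp(suffix=suffix, dir=dir)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        yield path
    finally:
        os.remove(path)


# Hypothetical usage, assuming rasterio is installed and `buffer` is an
# io.BytesIO holding the downloaded HDF4 file:
#
#     import rasterio
#     with hdf4_on_disk(buffer.getvalue()) as path:
#         with rasterio.open(path) as dataset:
#             print(dataset.subdatasets)
```

On Linux, passing `dir="/dev/shm"` (a RAM-backed tmpfs on most distributions) may keep the bytes off physical disk while still giving the library a real path. That depends on the host configuration and is an assumption, not something confirmed in this thread.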


--
http://www.spatialys.com
My software is free, but my time generally not.


Hussain Hassam
 

Ahh, I see. Thank you Sean & Even for your quick responses. We will proceed with our backup plan of storing the data to disk or a bucket and using it from there!

Best Regards,
Hussain

-----Original Message-----
From: main@rasterio.groups.io <main@rasterio.groups.io> On Behalf Of Even Rouault via groups.io
Sent: July 9, 2021 3:49 PM
To: main@rasterio.groups.io
Subject: Re: [rasterio] MODIS (HDF4) data with MemoryFile

