Date   

Re: opening file with forced CRS

Alan Snow
 

This should be fixed in 1.1.5 IIRC: #1248


opening file with forced CRS

 

Hi all: as part of using rasterio for OGC API - Coverages work in pygeoapi, a client can pass a bbox parameter as a shortcut to spatially subset a coverage using lat/long coordinates. The idea here, then, is that lat/long coordinates would be reprojected into the data source's native coordinates and then the coverage would be subsetted accordingly.

Having said this, using example files in https://dd.weather.gc.ca/model_hrdps/continental/grib2/06/002, I'm unable to derive the native crs from rasterio in Python:

>>> import rasterio
>>> ds = rasterio.open('CMC_hrdps_continental_HGT_ISBL_0985_ps2.5km_2020102706_P002-00.grib2')
>>> ds.bounds
BoundingBox(left=-0.5, bottom=1455.5, right=2575.5, top=-0.5)
>>> ds.crs
>>> ds.crs is None
True
>>> 

However, when running through rio info, I am able to see bounds and CRS:

$ rio info CMC_hrdps_continental_HGT_ISBL_0985_ps2.5km_2020102706_P002-00.grib2
WARNING:rasterio._env:CPLE_AppDefined in Unable to perform coordinate transformations, so the correct projected geotransform could not be deduced from the lat/long control points.  Defaulting to ungeoreferenced.
{"bounds": [-2099127.4944969374, -5739388.521499627, 4340872.505503062, -2099388.5214996273], "colorinterp": ["undefined"], "count": 1, "crs": "PROJCS[\"unnamed\",GEOGCS[\"Coordinate System imported from GRIB file\",DATUM[\"unknown\",SPHEROID[\"Sphere\",6371229,0]],PRIMEM[\"Greenwich\",0],UNIT[\"degree\",0.0174532925199433]],PROJECTION[\"Polar_Stereographic\"],PARAMETER[\"latitude_of_origin\",60],PARAMETER[\"central_meridian\",252],PARAMETER[\"scale_factor\",1],PARAMETER[\"false_easting\",0],PARAMETER[\"false_northing\",0],UNIT[\"Metre\",1]]", "descriptions": ["98500[Pa] ISBL=\"Isobaric surface\""], "driver": "GRIB", "dtype": "float64", "height": 1456, "indexes": [1], "lnglat": [-92.0404517295466, 52.147885433427845], "mask_flags": [["nodata"]], "nodata": 9999.0, "res": [2500.0, 2500.0], "shape": [1456, 2576], "tiled": false, "transform": [2500.0, 0.0, -2099127.4944969374, 0.0, -2500.0, -2099388.5214996273, 0.0, 0.0, 1.0], "units": [null], "width": 2576}

I need the crs value to be able to reproject the incoming lat/long bbox into native coordinates to subset accordingly. Is there a way to set a given dataset's CRS through code (in readonly mode)?

Thanks

..Tom


Re: Rasterio and GDAL_CACHEMAX

Sean Gillies
 

Hi Angus,

On Mon, Oct 26, 2020 at 6:04 PM Angus Dickey <angus@...> wrote:
Does GDAL_CACHEMAX have to be set in bytes when using rasterio? I see in the docs there is an example using MBs but it seems to be causing rasterio to set a very small cache size when I use it. For example, accessing a COG in S3 using rasterio 1.1.8:

# No problem here
with rasterio.Env() as env:
    # Prints 851132006 (5% of my system RAM in bytes)
    print(get_gdal_config('GDAL_CACHEMAX'))
    with rasterio.open('s3://path/to/cog') as src:
        # Do stuff with the COG

# No problem here either
with rasterio.Env(GDAL_CACHEMAX=536870912) as env:
    # Prints  536870912
    print(get_gdal_config('GDAL_CACHEMAX'))
    with rasterio.open('s3://path/to/cog') as src:
        # Do stuff with the COG

# Really slow
with rasterio.Env(GDAL_CACHEMAX=512) as env:
    # Prints 512 (in bytes?)
    print(get_gdal_config('GDAL_CACHEMAX'))
    with rasterio.open('s3://path/to/cog') as src:
        # Do stuff with the COG


It seems like rasterio is setting the GDAL raster block cache to 512 bytes and this is causing the slow read. I don't really understand the internals of rasterio but it looks to be using GDALSetCacheMax() (which only accepts bytes) and is passing it 512. I might be misunderstanding the problem though, it could be something else slowing things down but that is the only change I am making.

Any input is appreciated.

Thanks,

Angus

Yes, GDAL_CACHEMAX passed to `Env()` must be in bytes (since https://github.com/mapbox/rasterio/pull/1042/files).

--
Sean Gillies


Rasterio and GDAL_CACHEMAX

Angus Dickey
 

Does GDAL_CACHEMAX have to be set in bytes when using rasterio? I see in the docs there is an example using MBs but it seems to be causing rasterio to set a very small cache size when I use it. For example, accessing a COG in S3 using rasterio 1.1.8:

# No problem here
with rasterio.Env() as env:
    # Prints 851132006 (5% of my system RAM in bytes)
    print(get_gdal_config('GDAL_CACHEMAX'))
    with rasterio.open('s3://path/to/cog') as src:
        # Do stuff with the COG

# No problem here either
with rasterio.Env(GDAL_CACHEMAX=536870912) as env:
    # Prints  536870912
    print(get_gdal_config('GDAL_CACHEMAX'))
    with rasterio.open('s3://path/to/cog') as src:
        # Do stuff with the COG

# Really slow
with rasterio.Env(GDAL_CACHEMAX=512) as env:
    # Prints 512 (in bytes?)
    print(get_gdal_config('GDAL_CACHEMAX'))
    with rasterio.open('s3://path/to/cog') as src:
        # Do stuff with the COG


It seems like rasterio is setting the GDAL raster block cache to 512 bytes and this is causing the slow read. I don't really understand the internals of rasterio but it looks to be using GDALSetCacheMax() (which only accepts bytes) and is passing it 512. I might be misunderstanding the problem though, it could be something else slowing things down but that is the only change I am making.

Any input is appreciated.

Thanks,

Angus


rasterio 1.1.8-1 wheels published to PyPI

Sean Gillies
 

Hi all,

We found a bug in GDAL versions before 3.2.0 and have patched GDAL 2.4.4 and made new 1.1.8 binary wheels with a "-1" build tag. See https://pypi.org/project/rasterio/1.1.8/#files. If you want this fix and have already installed 1.1.8 wheels, run the following command.

    pip install -I --force-reinstall rasterio==1.1.8

Big thanks to Drew Bollinger for reporting the problem in https://github.com/mapbox/rasterio/issues/2022 and to Even Rouault for the quick fix.

--
Sean Gillies


Re: Best way to sample all data in a raster

Alan Snow
 

I believe that this is what you are looking for: https://rasterio.readthedocs.io/en/latest/topics/resampling.html


Best way to sample all data in a raster

himat15@...
 

I have a large raster file, and I want to visualize it. But opening it as a dataset causes my program to crash due to running out of memory.

Since my only purpose is to visualize the data, the data I get can just be a subsample of the overall data (imagine taking every other pixel).

Is there a built-in way to do this though while reading the dataset, or do I have to separately generate the list of (x,y) coordinates I want in an alternating fashion, and then pass that to  rasterio.sample.sample_gen? https://rasterio.readthedocs.io/en/latest/api/rasterio.sample.html

Is there no way to directly read a random or alternating sample of the data from a direct dataset.read() call?


Merge with mean

felix.kessler1@...
 

Hey Guys,

I am trying to use rasterio.merge.merge to merge ~15 Datasets. This is working with the standard methods, but I want to calculate the mean of the datasets for each pixel.

 

Currently I am trying to use the following function:
def nanmean(old_data, new_data, old_nodata = -10000, new_nodata=-10000, index=None, roff=None, coff=None):
    old_data[:] = np.nanmean([old_data, new_data], axis = 0)

It seems (my output) that the mean calculation is not ignoring the no data values. I also tried to use np.where to set NaN's where the -10000 is in both datasets.

Any help would be great!


rasterio 1.1.8

Sean Gillies
 

Hi all,

Here are the changes in version 1.1.8, which is on PyPI now.


Using certifi means that users of the manylinux1 wheels no longer need set the CURL_CA_BUNDLE environment variable to make HTTPS requests for COG blocks in their favorite cloud. We can all thank Alan Snow for this idea.

Regards,

--
Sean Gillies


cligj 0.7.0

Sean Gillies
 

Hi all,

Version 0.7.0 is on PyPI now. There are no changes other than warnings about the changes coming in 1.0.0: https://github.com/mapbox/cligj/blob/master/CHANGES.txt#L4.

Regards,

--
Sean Gillies


Re: Padding rasterio windows

ashnair0007@...
 

Thanks. Setting the boundless flag fixed the problem.


Re: Padding rasterio windows

Luke
 
Edited

Have you tried keeping your windows all the same size (including the edge windows) and specifying `boundless=True` and a `fill_value=some appropriate value` to fill out your windows in your ` `dataset.read` call?

https://rasterio.readthedocs.io/en/latest/api/rasterio.io.html#rasterio.io.DatasetReader.read

i.e. (untested)

dataset.read(window=window, boundless=True, fill_value=dataset.nodata)


Padding rasterio windows

ashnair0007@...
 

I need to tile a large geotiff into equally sized tiles (with/without overlap depending on user). While I've been able to do that, a problem arises in the edge windows wherein the edge windows have a shorter length/width. I understand why this happens but I would like the edge windows to have the same size as well. 

One way to do this might be get all the edge windows, pad them and reproject them so that padding doesn't mess with the geo-referencing.  Is this an appropriate approach?


cligj 0.6b1

Sean Gillies
 

Hi all,

Another rasterio-related announcement: I'm working on releasing version 0.6.0 of cligj next week. Here's the ticket about it https://github.com/mapbox/cligj/issues/28. This version fixes a bug and also announces a future change for the API. See https://github.com/mapbox/cligj/blob/master/CHANGES.txt for details.

Regards,

--
Sean Gillies


rio-mbtiles 1.5b2

Sean Gillies
 

Hi all,

I'm working on releasing version 1.5.0 of rio-mbtiles by the end of October and published a second beta today. See https://github.com/mapbox/rio-mbtiles/issues/61 and https://github.com/mapbox/rio-mbtiles/blob/master/CHANGES.txt if you're interested in this rasterio CLI plugin. Your feedback will be much appreciated!

Regards,

--
Sean Gillies


Re: Issues Setting tags for GRIB messsages when outputting to GRIB

Shane Mill - NOAA Affiliate
 

That was the issue. For anyone who outputs data to GRIB. Make sure you set the:

GRIB_PDS_TEMPLATE_NUMBERS
GRIB_IDS
GRIB_DISCIPLINE


Re: Issues Setting tags for GRIB messsages when outputting to GRIB

Shane Mill - NOAA Affiliate
 

Noticing that I may need to set the GRIB_DISCIPLINE. If I am able to do that through rasterio, I will be good to go. I will update this message momentarily


Issues Setting tags for GRIB messsages when outputting to GRIB

Shane Mill - NOAA Affiliate
 

Hello,

When writing data to GRIB files, I've noticed sometimes an inconsistency with setting the GRIB_PDS_TEMPLATE_NUMBERS. This parameter drives the parameter name for the GRIB message.

Below is a snippet of the code:

with rasterio.open('/mnt/data-api/output.grb', 'r+', driver='GRIB',DATA_ENCODING='SIMPLE_PACKING') as ds:
    for idx,t in enumerate(tags,start=0):
       ds.update_tags(idx+1,GRIB_PDS_TEMPLATE_NUMBERS=t['GRIB_PDS_TEMPLATE_NUMBERS'])
       ds.update_tags(idx+1,GRIB_IDS=t['GRIB_IDS'])

tags is a prepopulated list containing the "tags" for each grib message. I am simply updating the tags in the new GRIB file to contain the correct GRIB_PDS_TEMPLATE_NUMBERS and GRIB_IDS.

The problem- Sometimes, the tag is not properly set. 

An example: For the Grib parameterName "Soil Temperature", this is the corresponding tag:

{'GRIB_COMMENT': 'Soil temperature [C]', 'GRIB_DISCIPLINE': '2(Land_Surface)', 'GRIB_ELEMENT': 'TSOIL', 'GRIB_FORECAST_SECONDS': '0 sec', 'GRIB_IDS': 'CENTER=7(US-NCEP) SUBCENTER=0 MASTER_TABLE=2 LOCAL_TABLE=1 SIGNF_REF_TIME=1(Start_of_Forecast) REF_TIME=2020-09-29T00:00:00Z PROD_STATUS=0(Operational) TYPE=1(Forecast)', 'GRIB_PDS_PDTN': '0', 'GRIB_PDS_TEMPLATE_ASSEMBLED_VALUES': '0 2 2 0 81 0 0 1 0 106 0 1 106 2 200', 'GRIB_PDS_TEMPLATE_NUMBERS': '0 2 2 0 81 0 0 0 1 0 0 0 0 106 0 0 0 0 1 106 2 0 0 0 200', 'GRIB_REF_TIME': '1601337600 sec UTC', 'GRIB_SHORT_NAME': '1-2-DBLL', 'GRIB_UNIT': '[C]', 'GRIB_VALID_TIME': '1601337600 sec UTC'}

However, the outputted GRIB file has:

{'GRIB_COMMENT': 'Potential temperature [C]', 'GRIB_DISCIPLINE': '0(Meteorological)', 'GRIB_ELEMENT': 'POT', 'GRIB_FORECAST_SECONDS': '0 sec', 'GRIB_IDS': 'CENTER=7(US-NCEP) SUBCENTER=0 MASTER_TABLE=2 LOCAL_TABLE=0 SIGNF_REF_TIME=1(Start_of_Forecast) REF_TIME=2020-09-29T00:00:00Z PROD_STATUS=0(Operational) TYPE=1(Forecast)', 'GRIB_PDS_PDTN': '0', 'GRIB_PDS_TEMPLATE_ASSEMBLED_VALUES': '0 2 2 0 81 0 0 1 0 106 0 1 106 2 200', 'GRIB_PDS_TEMPLATE_NUMBERS': '0 2 2 0 81 0 0 0 1 0 0 0 0 106 0 0 0 0 1 106 2 0 0 0 200', 'GRIB_REF_TIME': '1601337600 sec UTC', 'GRIB_SHORT_NAME': '1-2-DBLL', 'GRIB_UNIT': '[C]', 'GRIB_VALID_TIME': '1601337600 sec UTC'}

Notice, it was intended to set to "Soil Temperature", but the output GRIB has "Potential Temperature". Do you know of a way to overcome this?










Rasterio 1.1.7

Sean Gillies
 

Hi all,

Version 1.1.7 is on PyPI now. Here are the changes: https://github.com/mapbox/rasterio/blob/master/CHANGES.txt#L10-L23. Thank you, Alan and Vincent!

--
Sean Gillies


Rasterio 1.1.6

Sean Gillies
 

Hi all,

Wheels for 1.1.6 are on PyPI now. There are a number of significant bug fixes: https://github.com/mapbox/rasterio/blob/master/CHANGES.txt#L10.

The manylinux1 and macosx wheels include GDAL 2.4.4, still, but with two new patches. I'm waiting for more comments in another thread before switching to GDAL 3.1 and PROJ 7. We are not yet building wheels for Python 3.9 or 3.10, sorry. That depends on some updates to the build infrastructure that I have no time for at the moment.

Thanks for all the great bug reports!

--
Sean Gillies

141 - 160 of 764