Rasterio read outshape resample does not give correct array
Hi all,
I have a population grid and want to decrease the resolution by a factor 2. So I would like to get a raster that has 2 times less rows and columns, a resolution that is 2 times as large as the original one, while retaining the same overall number of population.
This should be a straightforward operation, but the population numbers change incorrectly if I follow the documentation. When the resampled array is written to file, the numbers are incorrect.
Shouldn't the resample method yield an array with resampled cell values to get the same number of total population? Any tips and hints would be appreciated!


Hobart, Geordie (NRCan/RNCan)
Hello. I think that your problem is a little more complex than just resampling. The bilinear interpolation averages the surrounding cells to create a smooth image. You need to sum the four surrounding cells, and populate the new grid with the result. Not sure exactly how to do it but a clever google should find what you need. Good Luck.
From: main@rasterio.groups.io <main@rasterio.groups.io>
On Behalf Of tinak.contact@...
Sent: March 16, 2021 14:26 To: main@rasterio.groups.io Subject: [rasterio] Rasterio read outshape resample does not give correct array
[Edited Message Follows] Hi all, I have a population grid and want to decrease the resolution by a factor 2. So I would like to get a raster that has 2 times less rows and columns, a resolution that is 2 times as large as the original one, while retaining the same overall number of population.
This should be a straightforward operation, but the population numbers change incorrectly if I follow the documentation. When the resampled array is written to file, the numbers are incorrect. I have a population grid and want to decrease the resolution by a factor 2. So I would like to get a raster that has 2 times less rows and columns, a resolution that is 2 times as large as the original one, while retaining the same number of population.
Shouldn't the resample method yield an array with resampled cell values to get the same number of total population? Any tips and hints would be appreciated!


Sean Gillies
Hi, The "average" resampling algorithm should give you better results than "bilinear". I'm not sure it is guaranteed to preserve the sum, especially in the case where the original dataset's width and height are not multiples of 2. Documentation on the resampling algorithms can be found here: https://gdal.org/programs/gdal_translate.html#cmdoptiongdal_translater. Rasterio uses these same options. On Tue, Mar 16, 2021 at 4:05 PM <tinak.contact@...> wrote:

Sean Gillies 

Thank you very much, I found scikit
block_reduce which did the trick, but I was wondering if there are any gdal or rasterio functions who could do the same. I found from (here) and ( here) that GDAL 3.1 or later does have
as interpolation method. 

Loïc Dutrieux
With count data, like population, it only makes sense to aggregate using a sum resampling function (like you did with block_reduce).
Using rasterio you can perform a decimated read (setting out_shape) using resampling=Resampling.sum. I see from the docs that it requires GDAL >= 3.1 though (https://rasterio.readthedocs.io/en/latest/api/rasterio.enums.html#rasterio.enums.Resampling) It can give unexpected results if the file has precomputed overviews, then no matter which resampling method you specify, rasterio won't be able to ignore them (unless built against GDAL >= 3.3 which is very unlikely at the moment). cheers, Loïc ________________________________________ From: main@rasterio.groups.io <main@rasterio.groups.io> on behalf of tinak.contact@... <tinak.contact@...> Sent: 17 March 2021 07:06:04 To: main@rasterio.groups.io Subject: Re: [rasterio] Rasterio read outshape resample does not give correct array [Edited Message Follows] [Reason: added gdal option ] Thank you very much, I found scikit block_reduce<https://urldefense.com/v3/__https://scikitimage.org/docs/dev/api/skimage.measure.html*blockreduce__;Iw!!DOxrgLBm!SYQfUIkp3oT6I4nkx5rhk0VhOIXLuJc03ZvgBBxB81wfd0QZgYJdbZuqh9ZDbxz4GMJQqQ$> which did the trick, but I was wondering if there are any gdal or rasterio functions who could do the same. I found from (here<https://urldefense.com/v3/__https://gis.stackexchange.com/questions/152661/downsamplinggeotiffusingsummationgdalnumpy__;!!DOxrgLBm!SYQfUIkp3oT6I4nkx5rhk0VhOIXLuJc03ZvgBBxB81wfd0QZgYJdbZuqh9ZDbxz3Jieug0$>) and ( here<https://urldefense.com/v3/__https://gdal.org/programs/gdalwarp.html__;!!DOxrgLBm!SYQfUIkp3oT6I4nkx5rhk0VhOIXLuJc03ZvgBBxB81wfd0QZgYJdbZuqh9ZDbxzYF_dZ3o$>) that GDAL 3.1 or later does have r sum or r average as interpolation method. However, if I apply the gdal translate command I am not getting the correct numbers. Could this be a problem with my gdal? Solution with scikit block_reduce :<https://urldefense.com/v3/__https://scikitimage.org/docs/dev/api/skimage.measure.html*blockreduce__;Iw!!DOxrgLBm!SYQfUIkp3oT6I4nkx5rhk0VhOIXLuJc03ZvgBBxB81wfd0QZgYJdbZuqh9ZDbxz4GMJQqQ$> from skimage.measure import block_reduce import matplotlib.pyplot as plt factor = 2 with rasterio.open("mwi_ppp_2020_UTM.tif") as dataset: data = dataset.read(1) print('population file', data[np.where(data>0)].sum()) # give zero to no dataval data=np.where(data<=0,0, data) # reduce by factor x=block_reduce(data, block_size=(factor,factor), func=np.sum) # mask out no data vals x_masked=np.ma.masked_equal(x,0) plt.imshow(x_masked) plt.colorbar() print('population resampled',x_masked.sum()) [cid:attach_0_166D0C1E101B79CD_3895@groups.io] Using gdal translate r average (decrease by factor 10) import subprocess subprocess.call("""gdal_translate tr 0.0083 0.0083 r average\ /Users/admin/Downloads/mwi_ppp_2020.tif gdal_test.tif""", shell=True) Population after 206120.61 
