Date   

problem with window sizes when parallelizing a funciton

javier lopatin
 

Hi all,
I'm currently using rasterio to process time series analysis of data. Because the raster tiles are too big, I'm trying to implement a windowed function with parallel processing. I tried your tutorial on concurrent processing (
https://github.com/mapbox/rasterio/blob/master/docs/topics/concurrency.rst) and it works beautify with my own functions. I'm just wondering why it is only working with window sizes of 128 (profile.update(blockxsize=128,blockysize=128tiled=True)​), just like in the example. If I change these values to e.g. 200 or 500 it does not work anymore. I'm currently trying with a 3,000 X 3,000 raster size. Because loops are slow, I assume that increasing a bit the window size could be helpful. The error message that I receive if I change blockxsize is:

Traceback (most recent call last):
  File "TSA.py", line 252, in <module>
    main(args.inputImage, args.outputImage, args.j)
  File "TSA.py", line 225, in main
    parallel_process(infile, outfile, n_jobs)
  File "TSA.py", line 187, in parallel_process
    with rasterio.open(outfile, "w", **profile) as dst:
  File "/home/javier/miniconda3/lib/python3.6/site-packages/rasterio/env.py", line 398, in wrapper
    return f(*args, **kwds)
  File "/home/javier/miniconda3/lib/python3.6/site-packages/rasterio/__init__.py", line 226, in open
    **kwargs)
  File "rasterio/_io.pyx", line 1129, in rasterio._io.DatasetWriterBase.__init__
  File "rasterio/_err.pyx", line 194, in rasterio._err.exc_wrap_pointer
rasterio._err.CPLE_AppDefinedError: _TIFFVSetField:/home/javier/Documents/SF_delta/Sentinel/TSA/X-004_Y-001/2015-2019_001-365_LEVEL4_TSA_SEN2L_EVI_C0_S0_FAVG_TY_C95T_FBY_TSA.tif: Bad value 500 for "TileWidth" tag
 

The used function is below (see the whole script at 
https://github.com/JavierLopatin/Python-Remote-Sensing-Scripts/blob/master/TSA.py):

def parallel_process(infile, outfile, n_jobs):
    """
    Process infile block-by-block with parallel processing
    and write to a new file.
    """
    from tqdm import tqdm # progress bar
 
    with rasterio.Env():
 
        with rasterio.open(infile) as src:
 
            # Create a destination dataset based on source params. The
            # destination will be tiled, and we'll process the tiles
            # concurrently.
            profile = src.profile
            profile.update(blockxsize=128, blockysize=128,
                           count=6, dtype='float64', tiled=True)
 
            with rasterio.open(outfile, "w", **profile) as dst:
 
                # Materialize a list of destination block windows
                # that we will use in several statements below.
                windows = [window for ij, window in dst.block_windows()]
 
                # This generator comprehension gives us raster data
                # arrays for each window. Later we will zip a mapping
                # of it with the windows list to get (window, result)
                # pairs.
                data_gen = (src.read(window=window) for window in windows)
 
                with concurrent.futures.ProcessPoolExecutor(
                    max_workers=n_jobs
                ) as executor:
 
                    # We map the TSA() function over the raster
                    # data generator, zip the resulting iterator with
                    # the windows list, and as pairs come back we
                    # write data to the destination dataset.
                    for window, result in zip(
                        tqdm(windows), executor.map(TSA, data_gen)
                    ):
                        dst.write(result, window=window)

Hope you guys can help. 

Cheers, Javier


Re: Python rasterio for saveing GeoTIFF files and read in ArcGIS or QGIS

Gabriel Cotlier
 

Dear Sean,
Thank you very much for your help.
I appreciate it.
Regards
Gabriel


On Monday, May 20, 2019, Sean Gillies <sean.gillies@...> wrote:
Thank you for providing a complete example of your code and the ArcMap error message. Without those details, it can be hard to provide help.

When you open the files for writing, there is one thing missing: the *transform* keyword argument that determines the spatial extent of the dataset. I think that in your case, you will have success if you copy the value from the source datasets:

filename = "20180308_133037_1024_3B_AnalyticMS.tif"

# Copy the transform and crs objects from the source dataset.
with rasterio.open(filename) as src:
    dst_transform = src.transform
    dst_crs = src.crs

# Pass the copied values to rasterio.open().
new_dataset = rasterio.open(
       'ReflectanceB1.tif',
       'w',
       driver='GTiff',
       height=band_blue_reflectance.shape[0],
       width=band_blue_reflectance.shape[1],
       count=1,
       dtype=band_blue_reflectance.dtype,
       crs=dst_crs,
       transform=dst_transform
)    
    
new_dataset.write(band_blue_reflectance, 1)
new_dataset.close()


On Sat, May 18, 2019 at 12:39 PM <gabiklm01@...> wrote:
After following the steps in the tutorial of the Planet website (https://developers.planet.com/tutorials/convert-planetscope-imagery-from-radiance-to-reflectance/), I want to save each individual band of a RapidEye image Level 3B as a separate GeoTIFF layer file to be able to open it in any other GIS-software such as QGIS, ArcGIS other. I could not succeed in saving each band separately as GeoTIFF layer file. Here is the code:
 
```
#!/usr/bin/env python
# coding: utf-8
 
import rasterio
import numpy as np
 
filename = "20180308_133037_1024_3B_AnalyticMS.tif"
 
# Load red and NIR bands - note all PlanetScope 4-band images have band order BGRN
with rasterio.open(filename) as src:
    band_blue_radiance = src.read(1)
    
with rasterio.open(filename) as src:
    band_green_radiance = src.read(2)
 
with rasterio.open(filename) as src:
    band_red_radiance = src.read(3)
 
with rasterio.open(filename) as src:
    band_nir_radiance = src.read(4)
 
from xml.dom import minidom
 
xmldoc = minidom.parse("20180308_133037_1024_3B_AnalyticMS_metadata.xml")
nodes = xmldoc.getElementsByTagName("ps:bandSpecificMetadata")
 
# XML parser refers to bands by numbers 1-4
coeffs = {}
for node in nodes:
    bn = node.getElementsByTagName("ps:bandNumber")[0].firstChild.data
    if bn in ['1', '2', '3', '4']:
        i = int(bn)
        value = node.getElementsByTagName("ps:reflectanceCoefficient")[0].firstChild.data
        coeffs[i] = float(value)
 
print "Conversion coefficients:", coeffs
 
# Multiply the Digital Number (DN) values in each band by the TOA reflectance coefficients
band_blue_reflectance = band_blue_radiance * coeffs[1]
band_green_reflectance = band_green_radiance * coeffs[2]
band_red_reflectance = band_red_radiance * coeffs[3]
band_nir_reflectance = band_nir_radiance * coeffs[4]
 
import numpy as np
print "Red band radiance is from {} to {}".format(np.amin(band_red_radiance), np.amax(band_red_radiance))
print "Red band reflectance is from {} to {}".format(np.amin(band_red_reflectance), np.amax(band_red_reflectance))
 
print "Before Scaling, red band reflectance is from {} to {}".format(np.amin(band_red_reflectance), np.amax(band_red_reflectance))
 
# Here we include a fixed scaling factor. This is common practice.
scale = 10000
blue_ref_scaled = scale * band_blue_reflectance
green_ref_scaled = scale * band_green_reflectance
red_ref_scaled = scale * band_red_reflectance
nir_ref_scaled = scale * band_nir_reflectance
 
print "After Scaling, red band reflectance is from {} to {}".format(np.amin(red_ref_scaled), np.amax(red_ref_scaled))
```
Here I tried unsuccessfully to save each individual band as GeotiFF file.
 
I have tried, any idea how to do the save of each band correctly to further open in QGIS orArGIS for calculating NDVI there?
 
```
dst_crs='EPSG:32720'
 
##  saving bands individualy  
new_dataset = rasterio.open(
       'ReflectanceB1.tif',
       'w',
       driver='GTiff',
       height=band_blue_reflectance.shape[0],
       width=band_blue_reflectance.shape[1],
       count=1,
       dtype=band_blue_reflectance.dtype,
       crs=dst_crs,
)    
    
new_dataset.write(band_blue_reflectance, 1)
new_dataset.close()
   
   
new_dataset = rasterio.open(
       'ReflectanceB2.tif',
       'w',
       driver='GTiff',
       height=band_red_reflectance.shape[0],
       width=band_red_reflectance.shape[1],
       count=1,
       dtype=band_red_reflectance.dtype,
       crs=dst_crs,
)    
    
new_dataset.write(band_red_reflectance, 1)
new_dataset.close()
 
new_dataset = rasterio.open(
       'ReflectanceB3.tif',
       'w',
       driver='GTiff',
       height=band_red_reflectance.shape[0],
       width=band_red_reflectance.shape[1],
       count=1,
       dtype=band_red_reflectance.dtype,
       crs=dst_crs,
)    
    
new_dataset.write(band_red_reflectance, 1)
new_dataset.close()    
 
 
new_dataset = rasterio.open(
       'ReflectanceB4.tif',
       'w',
       driver='GTiff',
       height=band_red_reflectance.shape[0],
       width=band_red_reflectance.shape[1],
       count=1,
       dtype=band_red_reflectance.dtype,
       crs=dst_crs,
)    
    
new_dataset.write(band_red_reflectance, 1)
new_dataset.close()
```
I'm using as input image "20180308_133037_1024_3B_AnalyticMS.tif" instead of "20180308_133037_1024_3B_AnalyticMS_SR.tif" don't know if is the problem
 
I'm getting resulting GeoTIFF file but when I open it in ArcMap I get this error
 


The GeoTIFF band are there and can be seen at ArcMap screen...   but the error appear anyway... might be related to coordinate systems to be defined possibly while saving in rasterio? And need to have it correctly projected for clipping with shapefiles.



--
Sean Gillies


Re: Python rasterio for saveing GeoTIFF files and read in ArcGIS or QGIS

Sean Gillies
 

Thank you for providing a complete example of your code and the ArcMap error message. Without those details, it can be hard to provide help.

When you open the files for writing, there is one thing missing: the *transform* keyword argument that determines the spatial extent of the dataset. I think that in your case, you will have success if you copy the value from the source datasets:

filename = "20180308_133037_1024_3B_AnalyticMS.tif"

# Copy the transform and crs objects from the source dataset.
with rasterio.open(filename) as src:
    dst_transform = src.transform
    dst_crs = src.crs

# Pass the copied values to rasterio.open().
new_dataset = rasterio.open(
       'ReflectanceB1.tif',
       'w',
       driver='GTiff',
       height=band_blue_reflectance.shape[0],
       width=band_blue_reflectance.shape[1],
       count=1,
       dtype=band_blue_reflectance.dtype,
       crs=dst_crs,
       transform=dst_transform
)    
    
new_dataset.write(band_blue_reflectance, 1)
new_dataset.close()


On Sat, May 18, 2019 at 12:39 PM <gabiklm01@...> wrote:
After following the steps in the tutorial of the Planet website (https://developers.planet.com/tutorials/convert-planetscope-imagery-from-radiance-to-reflectance/), I want to save each individual band of a RapidEye image Level 3B as a separate GeoTIFF layer file to be able to open it in any other GIS-software such as QGIS, ArcGIS other. I could not succeed in saving each band separately as GeoTIFF layer file. Here is the code:
 
```
#!/usr/bin/env python
# coding: utf-8
 
import rasterio
import numpy as np
 
filename = "20180308_133037_1024_3B_AnalyticMS.tif"
 
# Load red and NIR bands - note all PlanetScope 4-band images have band order BGRN
with rasterio.open(filename) as src:
    band_blue_radiance = src.read(1)
    
with rasterio.open(filename) as src:
    band_green_radiance = src.read(2)
 
with rasterio.open(filename) as src:
    band_red_radiance = src.read(3)
 
with rasterio.open(filename) as src:
    band_nir_radiance = src.read(4)
 
from xml.dom import minidom
 
xmldoc = minidom.parse("20180308_133037_1024_3B_AnalyticMS_metadata.xml")
nodes = xmldoc.getElementsByTagName("ps:bandSpecificMetadata")
 
# XML parser refers to bands by numbers 1-4
coeffs = {}
for node in nodes:
    bn = node.getElementsByTagName("ps:bandNumber")[0].firstChild.data
    if bn in ['1', '2', '3', '4']:
        i = int(bn)
        value = node.getElementsByTagName("ps:reflectanceCoefficient")[0].firstChild.data
        coeffs[i] = float(value)
 
print "Conversion coefficients:", coeffs
 
# Multiply the Digital Number (DN) values in each band by the TOA reflectance coefficients
band_blue_reflectance = band_blue_radiance * coeffs[1]
band_green_reflectance = band_green_radiance * coeffs[2]
band_red_reflectance = band_red_radiance * coeffs[3]
band_nir_reflectance = band_nir_radiance * coeffs[4]
 
import numpy as np
print "Red band radiance is from {} to {}".format(np.amin(band_red_radiance), np.amax(band_red_radiance))
print "Red band reflectance is from {} to {}".format(np.amin(band_red_reflectance), np.amax(band_red_reflectance))
 
print "Before Scaling, red band reflectance is from {} to {}".format(np.amin(band_red_reflectance), np.amax(band_red_reflectance))
 
# Here we include a fixed scaling factor. This is common practice.
scale = 10000
blue_ref_scaled = scale * band_blue_reflectance
green_ref_scaled = scale * band_green_reflectance
red_ref_scaled = scale * band_red_reflectance
nir_ref_scaled = scale * band_nir_reflectance
 
print "After Scaling, red band reflectance is from {} to {}".format(np.amin(red_ref_scaled), np.amax(red_ref_scaled))
```
Here I tried unsuccessfully to save each individual band as GeotiFF file.
 
I have tried, any idea how to do the save of each band correctly to further open in QGIS orArGIS for calculating NDVI there?
 
```
dst_crs='EPSG:32720'
 
##  saving bands individualy  
new_dataset = rasterio.open(
       'ReflectanceB1.tif',
       'w',
       driver='GTiff',
       height=band_blue_reflectance.shape[0],
       width=band_blue_reflectance.shape[1],
       count=1,
       dtype=band_blue_reflectance.dtype,
       crs=dst_crs,
)    
    
new_dataset.write(band_blue_reflectance, 1)
new_dataset.close()
   
   
new_dataset = rasterio.open(
       'ReflectanceB2.tif',
       'w',
       driver='GTiff',
       height=band_red_reflectance.shape[0],
       width=band_red_reflectance.shape[1],
       count=1,
       dtype=band_red_reflectance.dtype,
       crs=dst_crs,
)    
    
new_dataset.write(band_red_reflectance, 1)
new_dataset.close()
 
new_dataset = rasterio.open(
       'ReflectanceB3.tif',
       'w',
       driver='GTiff',
       height=band_red_reflectance.shape[0],
       width=band_red_reflectance.shape[1],
       count=1,
       dtype=band_red_reflectance.dtype,
       crs=dst_crs,
)    
    
new_dataset.write(band_red_reflectance, 1)
new_dataset.close()    
 
 
new_dataset = rasterio.open(
       'ReflectanceB4.tif',
       'w',
       driver='GTiff',
       height=band_red_reflectance.shape[0],
       width=band_red_reflectance.shape[1],
       count=1,
       dtype=band_red_reflectance.dtype,
       crs=dst_crs,
)    
    
new_dataset.write(band_red_reflectance, 1)
new_dataset.close()
```
I'm using as input image "20180308_133037_1024_3B_AnalyticMS.tif" instead of "20180308_133037_1024_3B_AnalyticMS_SR.tif" don't know if is the problem
 
I'm getting resulting GeoTIFF file but when I open it in ArcMap I get this error
 


The GeoTIFF band are there and can be seen at ArcMap screen...   but the error appear anyway... might be related to coordinate systems to be defined possibly while saving in rasterio? And need to have it correctly projected for clipping with shapefiles.



--
Sean Gillies


Python rasterio for saveing GeoTIFF files and read in ArcGIS or QGIS

Gabriel Cotlier
 

After following the steps in the tutorial of the Planet website (https://developers.planet.com/tutorials/convert-planetscope-imagery-from-radiance-to-reflectance/), I want to save each individual band of a RapidEye image Level 3B as a separate GeoTIFF layer file to be able to open it in any other GIS-software such as QGIS, ArcGIS other. I could not succeed in saving each band separately as GeoTIFF layer file. Here is the code:
 
```
#!/usr/bin/env python
# coding: utf-8
 
import rasterio
import numpy as np
 
filename = "20180308_133037_1024_3B_AnalyticMS.tif"
 
# Load red and NIR bands - note all PlanetScope 4-band images have band order BGRN
with rasterio.open(filename) as src:
    band_blue_radiance = src.read(1)
    
with rasterio.open(filename) as src:
    band_green_radiance = src.read(2)
 
with rasterio.open(filename) as src:
    band_red_radiance = src.read(3)
 
with rasterio.open(filename) as src:
    band_nir_radiance = src.read(4)
 
from xml.dom import minidom
 
xmldoc = minidom.parse("20180308_133037_1024_3B_AnalyticMS_metadata.xml")
nodes = xmldoc.getElementsByTagName("ps:bandSpecificMetadata")
 
# XML parser refers to bands by numbers 1-4
coeffs = {}
for node in nodes:
    bn = node.getElementsByTagName("ps:bandNumber")[0].firstChild.data
    if bn in ['1', '2', '3', '4']:
        i = int(bn)
        value = node.getElementsByTagName("ps:reflectanceCoefficient")[0].firstChild.data
        coeffs[i] = float(value)
 
print "Conversion coefficients:", coeffs
 
# Multiply the Digital Number (DN) values in each band by the TOA reflectance coefficients
band_blue_reflectance = band_blue_radiance * coeffs[1]
band_green_reflectance = band_green_radiance * coeffs[2]
band_red_reflectance = band_red_radiance * coeffs[3]
band_nir_reflectance = band_nir_radiance * coeffs[4]
 
import numpy as np
print "Red band radiance is from {} to {}".format(np.amin(band_red_radiance), np.amax(band_red_radiance))
print "Red band reflectance is from {} to {}".format(np.amin(band_red_reflectance), np.amax(band_red_reflectance))
 
print "Before Scaling, red band reflectance is from {} to {}".format(np.amin(band_red_reflectance), np.amax(band_red_reflectance))
 
# Here we include a fixed scaling factor. This is common practice.
scale = 10000
blue_ref_scaled = scale * band_blue_reflectance
green_ref_scaled = scale * band_green_reflectance
red_ref_scaled = scale * band_red_reflectance
nir_ref_scaled = scale * band_nir_reflectance
 
print "After Scaling, red band reflectance is from {} to {}".format(np.amin(red_ref_scaled), np.amax(red_ref_scaled))
```
Here I tried unsuccessfully to save each individual band as GeotiFF file.
 
I have tried, any idea how to do the save of each band correctly to further open in QGIS orArGIS for calculating NDVI there?
 
```
dst_crs='EPSG:32720'
 
##  saving bands individualy  
new_dataset = rasterio.open(
       'ReflectanceB1.tif',
       'w',
       driver='GTiff',
       height=band_blue_reflectance.shape[0],
       width=band_blue_reflectance.shape[1],
       count=1,
       dtype=band_blue_reflectance.dtype,
       crs=dst_crs,
)    
    
new_dataset.write(band_blue_reflectance, 1)
new_dataset.close()
   
   
new_dataset = rasterio.open(
       'ReflectanceB2.tif',
       'w',
       driver='GTiff',
       height=band_red_reflectance.shape[0],
       width=band_red_reflectance.shape[1],
       count=1,
       dtype=band_red_reflectance.dtype,
       crs=dst_crs,
)    
    
new_dataset.write(band_red_reflectance, 1)
new_dataset.close()
 
new_dataset = rasterio.open(
       'ReflectanceB3.tif',
       'w',
       driver='GTiff',
       height=band_red_reflectance.shape[0],
       width=band_red_reflectance.shape[1],
       count=1,
       dtype=band_red_reflectance.dtype,
       crs=dst_crs,
)    
    
new_dataset.write(band_red_reflectance, 1)
new_dataset.close()    
 
 
new_dataset = rasterio.open(
       'ReflectanceB4.tif',
       'w',
       driver='GTiff',
       height=band_red_reflectance.shape[0],
       width=band_red_reflectance.shape[1],
       count=1,
       dtype=band_red_reflectance.dtype,
       crs=dst_crs,
)    
    
new_dataset.write(band_red_reflectance, 1)
new_dataset.close()
```
I'm using as input image "20180308_133037_1024_3B_AnalyticMS.tif" instead of "20180308_133037_1024_3B_AnalyticMS_SR.tif" don't know if is the problem
 
I'm getting resulting GeoTIFF file but when I open it in ArcMap I get this error
 


The GeoTIFF band are there and can be seen at ArcMap screen...   but the error appear anyway... might be related to coordinate systems to be defined possibly while saving in rasterio? And need to have it correctly projected for clipping with shapefiles.


save

Gabriel Cotlier
 

After following the steps in the tutorial of the Planet website (https://developers.planet.com/tutorials/convert-planetscope-imagery-from-radiance-to-reflectance/), I want to save each individual band of a RapidEye image Level 3B as a separate GeoTIFF layer file to be able to open it in any other GIS-software such as QGIS, ArcGIS other. I could not succeed in saving each band separately as GeoTIFF layer file. Here is the code:
 
```
#!/usr/bin/env python
# coding: utf-8
 
import rasterio
import numpy as np
 
filename = "20180308_133037_1024_3B_AnalyticMS.tif"
 
# Load red and NIR bands - note all PlanetScope 4-band images have band order BGRN
with rasterio.open(filename) as src:
    band_blue_radiance = src.read(1)
    
with rasterio.open(filename) as src:
    band_green_radiance = src.read(2)
 
with rasterio.open(filename) as src:
    band_red_radiance = src.read(3)
 
with rasterio.open(filename) as src:
    band_nir_radiance = src.read(4)
 
from xml.dom import minidom
 
xmldoc = minidom.parse("20180308_133037_1024_3B_AnalyticMS_metadata.xml")
nodes = xmldoc.getElementsByTagName("ps:bandSpecificMetadata")
 
# XML parser refers to bands by numbers 1-4
coeffs = {}
for node in nodes:
    bn = node.getElementsByTagName("ps:bandNumber")[0].firstChild.data
    if bn in ['1', '2', '3', '4']:
        i = int(bn)
        value = node.getElementsByTagName("ps:reflectanceCoefficient")[0].firstChild.data
        coeffs[i] = float(value)
 
print "Conversion coefficients:", coeffs
 
# Multiply the Digital Number (DN) values in each band by the TOA reflectance coefficients
band_blue_reflectance = band_blue_radiance * coeffs[1]
band_green_reflectance = band_green_radiance * coeffs[2]
band_red_reflectance = band_red_radiance * coeffs[3]
band_nir_reflectance = band_nir_radiance * coeffs[4]
 
import numpy as np
print "Red band radiance is from {} to {}".format(np.amin(band_red_radiance), np.amax(band_red_radiance))
print "Red band reflectance is from {} to {}".format(np.amin(band_red_reflectance), np.amax(band_red_reflectance))
 
print "Before Scaling, red band reflectance is from {} to {}".format(np.amin(band_red_reflectance), np.amax(band_red_reflectance))
 
# Here we include a fixed scaling factor. This is common practice.
scale = 10000
blue_ref_scaled = scale * band_blue_reflectance
green_ref_scaled = scale * band_green_reflectance
red_ref_scaled = scale * band_red_reflectance
nir_ref_scaled = scale * band_nir_reflectance
 
print "After Scaling, red band reflectance is from {} to {}".format(np.amin(red_ref_scaled), np.amax(red_ref_scaled))
```
Here I tried unsuccessfully to save each individual band as GeotiFF file.
 
I have tried, any idea how to do the save of each band correctly to further open in QGIS orArGIS for calculating NDVI there?
 
```
dst_crs='EPSG:32720'
 
##  saving bands individualy  
new_dataset = rasterio.open(
       'ReflectanceB1.tif',
       'w',
       driver='GTiff',
       height=band_blue_reflectance.shape[0],
       width=band_blue_reflectance.shape[1],
       count=1,
       dtype=band_blue_reflectance.dtype,
       crs=dst_crs,
)    
    
new_dataset.write(band_blue_reflectance, 1)
new_dataset.close()
   
   
new_dataset = rasterio.open(
       'ReflectanceB2.tif',
       'w',
       driver='GTiff',
       height=band_red_reflectance.shape[0],
       width=band_red_reflectance.shape[1],
       count=1,
       dtype=band_red_reflectance.dtype,
       crs=dst_crs,
)    
    
new_dataset.write(band_red_reflectance, 1)
new_dataset.close()
 
new_dataset = rasterio.open(
       'ReflectanceB3.tif',
       'w',
       driver='GTiff',
       height=band_red_reflectance.shape[0],
       width=band_red_reflectance.shape[1],
       count=1,
       dtype=band_red_reflectance.dtype,
       crs=dst_crs,
)    
    
new_dataset.write(band_red_reflectance, 1)
new_dataset.close()    
 
 
new_dataset = rasterio.open(
       'ReflectanceB4.tif',
       'w',
       driver='GTiff',
       height=band_red_reflectance.shape[0],
       width=band_red_reflectance.shape[1],
       count=1,
       dtype=band_red_reflectance.dtype,
       crs=dst_crs,
)    
    
new_dataset.write(band_red_reflectance, 1)
new_dataset.close()
```
I'm using as input image "20180308_133037_1024_3B_AnalyticMS.tif" instead of "20180308_133037_1024_3B_AnalyticMS_SR.tif" don't know if is the problem
 
I'm getting resulting GeoTIFF file but when I open it in ArcMap I get this error
 
[![enter image description here][1]][1]
 
 
  [1]: https://i.stack.imgur.com/zJ73V.jpg
 
The GeoTIFF band are there and can be seen at ArcMap screen...   but the error appear anyway... might be related to coordinate systems to be defined possibly while saving in rasterio? And need to have it correctly projected for clipping with shapefiles.


Rasterio 1.0.23

Sean Gillies
 

Hi all,

Rasterio 1.0.23 is on PyPI now. It has a number of important bug fixes:
- rio-calc output is divided into chunks of user-specified size to constrain the
  amount of memory used (#1668).
- Attempts to set attributes of datasets opened in "r" mode now raise a custom
  DatasetAttributeError. This exception derives from both RasterioError and
  NotImplementedError, which maintains backwards compatibility (#1676).
- Block sizes are no longer guarded when creating untiled datasets (#1689).
- CRS objects are now hashable and equivalent CRS objects have the same hash
  value (#1684).
- Allow AWS regions to be specified no matter the signing of requests (#1670).
- Add links to API documentation from the Python quickstart guide.
- Use "CRS.from_epsg({})" instead of "CRS.from_dict(init='epsg:{}')" as the
  representation for CRS objects that are completely described by an EPSG code.
- Use GDAL's string parsing to get metadata item keys and values, which
  accommodates uncommon "KEY:VALUE" forms.
If you're a rio-calc user, be sure to update the snuggs package as well. 1.4.6 has several very important bug fixes.
Thank you for the excellent bug reports,
--
Sean Gillies


Re: Can't read in a Landsat geotiff and use multiple threads to write out different windows at the same time

Ryan Avery
 

Thanks Sean, very helpful and makes sense.


On Tue, May 7, 2019 at 9:30 AM Sean Gillies <sean.gillies@...> wrote:
The VRT_SHARED_SOURCE option only affects the connection pool within the VRT driver code and that's not relevant to your code as far as I can see.

A single opened rasterio dataset cannot be safely used by multiple threads, concurrently. There is more about this limitation at https://trac.osgeo.org/gdal/wiki/FAQMiscellaneous#IstheGDALlibrarythread-safe.

Each of your threads will need exclusive use of a dataset object. The simplest way to achieve this is to call rasterio.open(..., sharing=False) in new threads to get a new dataset and then close it before the thread is joined. Another way is to create a suitably-sized pool of dataset objects (also using the sharing=False option) within your application and then assign these to your threads on a as-needed basis.

I hope this helps,

On Mon, May 6, 2019 at 7:02 PM <ravery@...> wrote:

I moved this github issue over here since it seems like more of a usage/help question. Below is the issue with a reproducible example if you have a Landsat tif.
https://github.com/mapbox/rasterio/issues/1681

The program fails at different windows when writing out windows with threads. This issue is related to this issue https://github.com/mapnik/node-mapnik/issues/437#issuecomment-103806098

I tried setting VRT_SHARED_SOURCE=0 with the rasterion.Env() context manager to allow threads to not step on each other but it didn't do anything.

Have others encountered this issue and found a solution?



--
Sean Gillies



--
Ryan Avery
Graduate Student, WAVES Lab
Department of Geography
University of California, Santa Barbara


Re: Can't read in a Landsat geotiff and use multiple threads to write out different windows at the same time

Sean Gillies
 

The VRT_SHARED_SOURCE option only affects the connection pool within the VRT driver code and that's not relevant to your code as far as I can see.

A single opened rasterio dataset cannot be safely used by multiple threads, concurrently. There is more about this limitation at https://trac.osgeo.org/gdal/wiki/FAQMiscellaneous#IstheGDALlibrarythread-safe.

Each of your threads will need exclusive use of a dataset object. The simplest way to achieve this is to call rasterio.open(..., sharing=False) in new threads to get a new dataset and then close it before the thread is joined. Another way is to create a suitably-sized pool of dataset objects (also using the sharing=False option) within your application and then assign these to your threads on a as-needed basis.

I hope this helps,

On Mon, May 6, 2019 at 7:02 PM <ravery@...> wrote:

I moved this github issue over here since it seems like more of a usage/help question. Below is the issue with a reproducible example if you have a Landsat tif.
https://github.com/mapbox/rasterio/issues/1681

The program fails at different windows when writing out windows with threads. This issue is related to this issue https://github.com/mapnik/node-mapnik/issues/437#issuecomment-103806098

I tried setting VRT_SHARED_SOURCE=0 with the rasterion.Env() context manager to allow threads to not step on each other but it didn't do anything.

Have others encountered this issue and found a solution?



--
Sean Gillies


Can't read in a Landsat geotiff and use multiple threads to write out different windows at the same time

Ryan Avery
 

I moved this github issue over here since it seems like more of a usage/help question. Below is the issue with a reproducible example if you have a Landsat tif.
https://github.com/mapbox/rasterio/issues/1681

The program fails at different windows when writing out windows with threads. This issue is related to this issue https://github.com/mapnik/node-mapnik/issues/437#issuecomment-103806098

I tried setting VRT_SHARED_SOURCE=0 with the rasterion.Env() context manager to allow threads to not step on each other but it didn't do anything.

Have others encountered this issue and found a solution?


Re: NameError: name 'Window' is not defined

quinsen.joel@...
 

Thanks for both your help! I see the import statement in the docs now, cheers!

-Quin


Re: NameError: name 'Window' is not defined

Sean Gillies
 

Thanks for the help, rickD!

In https://github.com/mapbox/rasterio/commit/eebf54b74fa0f718cb8baaa44aa20616e405b4e4 I added the missing Window class import. Read the Docs should be updated in a couple minutes. I think this should help.


On Mon, May 6, 2019 at 1:38 PM <debboutr@...> wrote:

you just need the class in your namespace, try:

`from rasterio.windows import Window`

I would agree it's not easy from the docs to know where to go and find this.

cheers

 

~rickD



--
Sean Gillies


Re: NameError: name 'Window' is not defined

debboutr@...
 

you just need the class in your namespace, try:

`from rasterio.windows import Window`

I would agree it's not easy from the docs to know where to go and find this.

cheers

 

~rickD


Re: NameError: name 'Window' is not defined

Sean Gillies
 

Hi Quin,

The Window class is defined in the rasterio.windows module and you must explicitly import it into your program's code like this:

    from rasterio.windows import Window

Example code in our docs might obscure this need, if you see any docs that don't show import of the Window class, let me know and I'll fix them.

On Mon, May 6, 2019 at 1:23 PM <quinsen.joel@...> wrote:
Hello. I'm new to this forum, so let me know if there are any problems with my question here.

I am trying to use rasterio's windowed reading ability and I am hoping to use the following structure as shown on rasterios documentation page (https://rasterio.readthedocs.io/en/latest/topics/windowed-rw.html):
 
 
```
with rasterio.open('file.tif') as src:
   window = src.read(1, window=Window(0, 0, new_height, new_width)) 
```
 
 
However, when I try to do this I get:
 
```NameError: name 'Window' is not defined```
 
Shouldn't Window() be imported with rasterio? Is this a problem with the version of rasterio I have downloaded or am I missing something really easy here?
 
```
print (rasterio.__version__)
1.0.7
```
 
Thank you for your guidance,
Quin



--
Sean Gillies


NameError: name 'Window' is not defined

quinsen.joel@...
 

Hello. I'm new to this forum, so let me know if there are any problems with my question here.

I am trying to use rasterio's windowed reading ability and I am hoping to use the following structure as shown on rasterios documentation page (https://rasterio.readthedocs.io/en/latest/topics/windowed-rw.html):
 
 
```
with rasterio.open('file.tif') as src:
   window = src.read(1, window=Window(0, 0, new_height, new_width)) 
```
 
 
However, when I try to do this I get:
 
```NameError: name 'Window' is not defined```
 
Shouldn't Window() be imported with rasterio? Is this a problem with the version of rasterio I have downloaded or am I missing something really easy here?
 
```
print (rasterio.__version__)
1.0.7
```
 
Thank you for your guidance,
Quin


Re: WarpedVRT and resampling ?

vincent.sarago@...
 

Hi Sean, sorry for not being really clear. 

> Can you explain what processes produced those images? What program produced the blurry one and what program produced the non-blurry one? Which resampling algorithms were used? 

Those two image were produced by `rio-tiler`. One on linux (blurry) the other on Mac OS (both with wheels, but also tried with gdal sources). The resampling algorithm that produced blurry lines are non-nearest: cubic, bilinear, cubic spline.  

>  if you use the WarpedVRT, you should define its extents so that you don't need to use any resampling at all when you read from it. Also, I don't fully understand the edge effects when you resample in read().

in https://github.com/cogeotiff/rio-tiler/blob/master/rio_tiler/utils.py#L296-L321, what we do first is to calculated the bounds of the VRT (mercator tiles) and then do (decimated) reads. 

The interesting part is the bilinear resampling introduce the blurry lines only when using it on the `WarpedVRT` params and not on the `read`. 

here is a gist showing how to reproduce it: https://gist.github.com/vincentsarago/0b229741c3261a9a447c7ab5dbd9eb05


Re: WarpedVRT and resampling ?

Sean Gillies
 

Hi Vincent,

Can you explain what processes produced those images? What program produced the blurry one and what program produced the non-blurry one? Which resampling algorithms were used?

Yes, you can introduce resampling in two places. I think it is not a good idea to do so: if you use the WarpedVRT, you should define its extents so that you don't need to use any resampling at all when you read from it. Also, I don't fully understand the edge effects when you resample in read().


On Fri, Apr 19, 2019 at 9:37 PM <vincent.sarago@...> wrote:
This was previously discussed on multiple Issues and PR: 
https://github.com/mapbox/rasterio/issues/1206
https://github.com/mapbox/rasterio/pull/1238
https://github.com/mapbox/rasterio/pull/1239

While working with https://github.com/cogeotiff/rio-tiler on different data format (Int16, Float ...) I've encountered some strange behavior when using non-nearest resampling algorithm. 

## Contexte 

When working with `WarpedVRT` you can choose the resampling algorithm twice: 
```
with WarpedVRT(src, resampling=enums.Resampling.nearest) as vrt:
    arr = vrt.read(out_shape=(1, 512, 512, resampling=enums.Resampling.nearest))
```
Sean try to explain this in https://github.com/mapbox/rasterio/pull/1238#issuecomment-353123776

rio-tiler code: https://github.com/cogeotiff/rio-tiler/blob/master/rio_tiler/utils.py#L296-L321

## Behavior 

On the images attached we can see that there are `blurry` lines on the bottom. Those lines appear on the bottom of the tile returned by `rio-tiler` (bottom of the WarpedVRT returned value). 

Observation: 
- Only visible when working on Linux (https://github.com/RemotePixel/amazonlinux-gdal/tree/gdal2.4.0, or with rasterio wheels) 
- Only with Int16 data with negative NoData 
- Not visible when using gdalwarp directly
- Visible only when not using `resampling=enums.Resampling.nearest` in WarpedVRT params

I'm not 100% to understand what is going on and if I should or should not use `enums.Resampling.nearest`  as default resampling in WarpedVRT params so I'm looking for guidance here. 

Thanks,

Vincent 



--
Sean Gillies


WarpedVRT and resampling ?

vincent.sarago@...
 

This was previously discussed on multiple Issues and PR: 
https://github.com/mapbox/rasterio/issues/1206
https://github.com/mapbox/rasterio/pull/1238
https://github.com/mapbox/rasterio/pull/1239

While working with https://github.com/cogeotiff/rio-tiler on different data format (Int16, Float ...) I've encountered some strange behavior when using non-nearest resampling algorithm. 

## Contexte 

When working with `WarpedVRT` you can choose the resampling algorithm twice: 
```
with WarpedVRT(src, resampling=enums.Resampling.nearest) as vrt:
    arr = vrt.read(out_shape=(1, 512, 512, resampling=enums.Resampling.nearest))
```
Sean try to explain this in https://github.com/mapbox/rasterio/pull/1238#issuecomment-353123776

rio-tiler code: https://github.com/cogeotiff/rio-tiler/blob/master/rio_tiler/utils.py#L296-L321

## Behavior 

On the images attached we can see that there are `blurry` lines on the bottom. Those lines appear on the bottom of the tile returned by `rio-tiler` (bottom of the WarpedVRT returned value). 

Observation: 
- Only visible when working on Linux (https://github.com/RemotePixel/amazonlinux-gdal/tree/gdal2.4.0, or with rasterio wheels) 
- Only with Int16 data with negative NoData 
- Not visible when using gdalwarp directly
- Visible only when not using `resampling=enums.Resampling.nearest` in WarpedVRT params

I'm not 100% to understand what is going on and if I should or should not use `enums.Resampling.nearest`  as default resampling in WarpedVRT params so I'm looking for guidance here. 

Thanks,

Vincent 


Re: tempfile.NamedTemporaryFile behaving as /vsimem and eating all the machine memory

Sean Gillies
 

Vincent.

At https://github.com/mapbox/rasterio/blob/master/rasterio/__init__.py#L191, a big GeoTIFF is created in RAM. Then at https://github.com/mapbox/rasterio/blob/master/rasterio/__init__.py#L199 that GeoTIFF is read into memory *again* so that it can be written to the Python file object. There will be two copies in memory. It's terribly inefficient, but I don't want to spend the time to optimize this case when I should be documenting the limitations instead.

On Tue, Apr 16, 2019 at 12:47 PM <vincent.sarago@...> wrote:
Thanks Sean this is really helpful and love the `temp.name` solution. 

About the second point, do you have any idea why `/vsimem` driver need so much memory when exiting/closing ? Should I raise this to the gdal list? 



--
Sean Gillies


Re: tempfile.NamedTemporaryFile behaving as /vsimem and eating all the machine memory

vincent.sarago@...
 

Thanks Sean this is really helpful and love the `temp.name` solution. 

About the second point, do you have any idea why `/vsimem` driver need so much memory when exiting/closing ? Should I raise this to the gdal list? 


Re: tempfile.NamedTemporaryFile behaving as /vsimem and eating all the machine memory

Sean Gillies
 

Hi Vincent,

This is expected (if not well-documented) behavior. tempfile.NamedTemporaryFile() returns an open Python file object, not a filename. GDAL can't use a Python file object, so in that case rasterio.open reads all the bytes from the file object, copies them to the vsimem filesystem, and opens that vsimem file.

I think what you want do do is pass the name of the temp file object to GDAL. Like this:

with tempfile.NamedTemporaryFile() as temp:
    with rasterio.open(temp.name) as dataset:
        print(dataset)

No copy in the vsimem filesystem will be made.

On Tue, Apr 16, 2019 at 6:55 AM <vincent.sarago@...> wrote:
While working on https://github.com/cogeotiff/rio-cogeo/pull/75 we noticed strange behaviors with `vsimem` driver (this could be a GDAL but TBH). 

1. When using `
tempfile.NamedTemporaryFile()` Rasterio uses `vsimem` driver

with tempfile.NamedTemporaryFile() as tmpfile:

   print(tmpfile)

   with rasterio.open(tmpfile, "w", **meta) as tmp_dst:

       print(tmp_dst)

<tempfile._TemporaryFileWrapper object at 0x1151bb2e8>

<open DatasetWriter name='/vsimem/d26d4650-3010-4d39-8b0c-3d947b94f1d5.' mode='w+'>

Here I was expecting Rasterio/GDAL to behave as `tempfile` was a regular file.

2. When closing
 a `vsimem`  (`MemoryFile` or `tempfile`) we observe a huge memory surge when working with big images.

code: 
https://github.com/cogeotiff/rio-cogeo/pull/75#issuecomment-482745580



Tested on Mac OS and linux with python 3.7 (gdal 2.4 and 2.3) 

Thanks 



--
Sean Gillies

481 - 500 of 698