Dataset.sample() allocates only 0 values to the coordinates


aleksandar.ilic@...
 

The expected behaviour, shown here for a smaller dataset sampled with UTM coordinates, is the following:
 
Dataset: 104 bands in total
Shape after sampling with coordinates: (8000, 104)

The following code is the critical part:
# This code is supposed to extract the data faster
import rasterio
import rasterio.io
import geopandas as gpd
import pandas as pd

src = rasterio.open(S1_S2_stack)
img = src.read()       # load all bands of the original input file into a NumPy array stack
profile = src.profile  # copy the profile of the original GeoTIFF input file
with rasterio.io.MemoryFile() as memfile:
    # write every band into an in-memory copy of the file
    with memfile.open(**profile) as dst:
        for i in range(0, src.count):
            dst.write(img[i], i+1)
    dataset = memfile.open()
 
train_pts = gpd.read_file(training_points) # 1000 points per class
train_pts = train_pts[['GRIDCODE', 'UTM_E', 'UTM_N', 'geometry']]  # These are the attributes in our point dataset
train_pts.index = range(len(train_pts))
coords = [(x,y) for x, y in zip(train_pts.UTM_E, train_pts.UTM_N)] 

train_pts['Raster Value'] = [x for x in dataset.sample(coords)]   # all band values are saved as a list in the Raster Value column 
 
train_pts[bands] = pd.DataFrame(train_pts['Raster Value'].tolist(), index= train_pts.index)  
train_pts = train_pts.drop(['Raster Value'], axis=1)  # Remove Raster Value column
train_pts.head()

It should assign to each coordinate the pixel values of all images and their respective bands, as shown below.


If I apply this approach to my own dataset, no values are assigned to the coordinates; the result is zero everywhere.
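For context, sampling by coordinate depends on mapping each (x, y) pair to a pixel row and column through the raster's affine transform. The sketch below shows that mapping in plain Python for a north-up raster. The function name `world_to_pixel` and the example grid are hypothetical; with rasterio, the six coefficients would presumably come from `dataset.transform`.

```python
import math


def world_to_pixel(x, y, transform):
    """Map a map coordinate to (row, col) for a north-up raster.

    `transform` holds the six affine coefficients (a, b, c, d, e, f) with
    x = a*col + b*row + c and y = d*col + e*row + f.
    This sketch assumes no rotation, i.e. b == 0 and d == 0.
    """
    a, b, c, d, e, f = transform
    if b != 0 or d != 0:
        raise ValueError("rotated rasters are not handled in this sketch")
    col = math.floor((x - c) / a)
    row = math.floor((y - f) / e)
    return row, col


# Hypothetical 10 m grid whose upper-left corner is at (500000, 4000000)
transform = (10.0, 0.0, 500000.0, 0.0, -10.0, 4000000.0)

print(world_to_pixel(500015.0, 3999975.0, transform))  # UTM-like point
print(world_to_pixel(20.0, 45.0, transform))            # lon/lat-like point
```

If the points are in a different coordinate system than the raster, the resulting row and column land far outside the image, which is one situation in which no real pixel values can be read.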

Quick summary of my dataset:

  • Preprocessed Sentinel 1, coregistered a stack of ~100 images, and performed stack averaging (minimum) to use as a mask later on, so that only data where all dates are available are extracted.
  • Preprocessed Sentinel 2, Landsat 7 & 8
  • Collocated Sentinel 1 & 2 and Landsat 7 & 8 together with the stack average minimum mask, and applied a land/sea mask to remove areas where Sentinel 1 data are not available.
  • Exported to GeoTIFF / BigTIFF file
  • Imported into the Jupyter notebook
  • Opened everything successfully and visualized it
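Because the preprocessing steps above apply masks, it may help to count how many sampled points are zero in every band. This is a minimal sketch in plain Python; `summarize_zero_rows` and the example values are hypothetical, and the input stands in for the per-point band values returned by sampling.

```python
def summarize_zero_rows(samples):
    """Return (number of points that are zero in every band, total points)."""
    rows = [list(values) for values in samples]
    all_zero = sum(1 for row in rows if all(v == 0 for v in row))
    return all_zero, len(rows)


# Hypothetical sampled values: one inner list of band values per point
samples = [[0, 0, 0], [12, 0, 7], [0, 0, 0], [3, 4, 5]]

zero, total = summarize_zero_rows(samples)
print(f"{zero} of {total} points are zero in every band")
```

If only some points are zero, masked pixels are a plausible explanation; if every point is zero, a coordinate or CRS problem seems more likely. Both are assumptions to verify, not conclusions.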

All values assigned to the training points are 0 and I don't understand why. I have used dataset.xy(dataset.height // 2, dataset.width // 2) to validate the coordinate format in case longitude and latitude were swapped.
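A complementary check is whether the training points fall inside the raster extent at all. The sketch below is plain Python with hypothetical names and numbers; with rasterio, the bounds tuple would presumably be `dataset.bounds` (left, bottom, right, top), and it is also worth comparing `dataset.crs` with `train_pts.crs`.

```python
def count_points_outside(coords, bounds):
    """Count (x, y) pairs outside bounds = (left, bottom, right, top).

    Returns the count and up to five offending points for inspection.
    """
    left, bottom, right, top = bounds
    outside = [
        (x, y)
        for x, y in coords
        if not (left <= x <= right and bottom <= y <= top)
    ]
    return len(outside), outside[:5]


# Hypothetical raster extent in UTM metres and three example points
bounds = (500000.0, 3999000.0, 501000.0, 4000000.0)
coords = [(500500.0, 3999500.0), (20.5, 44.8), (500010.0, 3999990.0)]

n_outside, examples = count_points_outside(coords, bounds)
print(n_outside, examples)
```

If most or all points are reported as outside, the points and the raster are probably not in the same coordinate reference system, and reprojecting the points before sampling would be the next thing to try.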

The dataset is big: around 100 Sentinel 1 images and 35 Sentinel 2 and Landsat 7 & 8 images, with all their bands plus additional NDVI and NDMI bands.

 
