Sampling strategies

In the previous vignettes we discuss how to simulate spatio-temporal data, which in this scenario corresponds to the “truth” (the process we are interested in reconstructing).

# Set study area spatial extent
xlim=20
ylim=20

spatial_field <- define_pattern_s(xlim, ylim, autocorr = TRUE, range = 5)

# Determine duration of time series
time_df <- set_timeline(2, "week")
# Set time series pattern
time_series <- define_pattern_t(time_df,weekly_amplitude = 2, weekly_noise = 1)
#> Warning: One quarter or less of data, ignoring arguments for seasonal and annual
#> patterns

truth <- simulate_spacetime(temporal_pattern = time_series,
                            field = spatial_field,
                            pattern_adherence = 0)

Here we will focus on how to sampling from that truth, mimicking the observation process (how data are collected and represented, including sampling error). There are 3 types of sampling designs allowed.

A regular grid (regular=TRUE)
Random sampling where pixels are randomly selected (regular=FALSE and balance=FALSE)
A “spatially balanced” design based on the method implemented in MBHdesign, which changes the inclusion probabilities to try to avoid clustering of samples (regular=FALSE and balance=TRUE).

Point sampling

To sample points from the study area, we use the function sample_area_pointily. Inside the function we can specify from which data we want to sample (field), the number of samplings (n_samples), the sampling error (sampling_error) and the type of sampling design as specified above. Below we show examples for each of the 3 sampling methods.

#random sampling
point_sample1 <- sample_area_pointily(field= truth, n_samples=10, sampling_error=0.1, regular = FALSE, balanced = FALSE)
#> Joining with `by = join_by(x, y)`

p1<-ggplot(truth, aes(x = x, y = y, fill = value)) +
  geom_raster() +
  labs(fill = "Value", title = "Regular = FALSE, Balanced = FALSE") +
  xlab("X coordinates")+
  ylab("Y coordinates") +
  geom_point(data=point_sample1, aes(x=x,y=y)) + facet_wrap(~date)+
  coord_equal(expand=FALSE, ylim=c(0,ylim), xlim=c(0,xlim))+
  scale_fill_viridis_c()+
  theme_bw()

p1

#balanced sampling
point_sample2 <- sample_area_pointily(truth, 10, 0.1, regular = FALSE, balanced = TRUE)
#> No inclusion.probs supplied, assuming uniform

p2<-ggplot(truth, aes(x = x, y = y, fill = value)) +
  geom_raster() +
  labs(fill = "Value", title = "Regular = FALSE, Balanced = TRUE") +
  xlab("X coordinates")+
  ylab("Y coordinates") +
  geom_point(data=point_sample2, aes(x=x,y=y))+
  facet_wrap(~date)+
  coord_equal(expand=FALSE, ylim=c(0,ylim), xlim=c(0,xlim))+
  scale_fill_viridis_c()+
  theme_bw()

p2

#regular sampling
point_sample3 <- sample_area_pointily(truth,10, 0.1, regular = TRUE, balanced = FALSE)
#> Joining with `by = join_by(x, y)`

p3<-ggplot(truth, aes(x = x, y = y, fill = value)) +
  geom_raster() +
  labs(fill = "Value", title = "Regular = TRUE") +
  xlab("X coordinates")+
  ylab("Y coordinates") +
  geom_point(data=point_sample3, aes(x=x,y=y)) +
  coord_equal(expand=FALSE, ylim=c(0,ylim), xlim=c(0,xlim))+
  facet_wrap(~date)+
  scale_fill_viridis_c()+
  theme_bw()

p3

Area sampling

When using sample_area_areally rather than sampling a point, we are sampling an area. The values in final table are the average per area (with associated error) rather than the point value. While the function is very similar to sample_area_pointly, there are 2 additional parameters: x_range and y_range. These correspond to the distance from the centroid to the perimeter of the area on the x axis and y axis respectability.

To avoid overlap between areas, the centroids are resampled each time areas overlap. Please keep this in mind when selecting the number of points to sample and the area size in relation to the total area.


#set x and y area extent
y_range=3
x_range=3

#regular false and balance false
area_sample1 <- sample_area_areally(field=truth, n_samples =4,x_range =  x_range,y_range = y_range, sampling_error = 0.1, regular = FALSE, balanced = FALSE)

a1<-ggplot(truth, aes(x = x, y = y, fill = value)) +
  geom_raster(alpha=0.8,hjust = 0, vjust = 0) +
  labs(fill = "Value", title = "Regular = FALSE, Balanced = FALSE") +
  xlab("X coordinates")+
  ylab("Y coordinates") +
  geom_rect(data=area_sample1, mapping=aes(xmin=x-x_range, xmax=x+x_range, ymin=y-y_range, ymax=y+y_range, fill=area_value), color="black")+
  geom_point(data=area_sample1, aes(x=x,y=y)) +
  coord_equal(expand=FALSE, ylim=c(0,ylim), xlim=c(0,xlim))+
    facet_wrap(~date)+
  scale_fill_viridis_c()+
  theme_bw()

a1

#regular false and balance true
area_sample2 <- sample_area_areally(field = truth, n_samples = 5,x_range = x_range,y_range = y_range, sampling_error = 0.1, regular = FALSE, balanced = TRUE)
#> No inclusion.probs supplied, assuming uniform
#> No inclusion.probs supplied, assuming uniform
#> No inclusion.probs supplied, assuming uniform
#> No inclusion.probs supplied, assuming uniform
#> No inclusion.probs supplied, assuming uniform

a2<-ggplot(truth, aes(x = x, y = y, fill = value)) +
  geom_raster(alpha=0.8,hjust = 0, vjust = 0) +
  labs(fill = "Value", title = "Regular = FALSE, Balanced = TRUE") +
  xlab("X coordinates")+
  ylab("Y coordinates") +
  geom_rect(data=area_sample2, mapping=aes(xmin=x-x_range, xmax=x+x_range, ymin=y-y_range, ymax=y+y_range, fill=area_value), color="black") +
  geom_point(data=area_sample2, aes(x=x,y=y))+
  coord_equal(expand=FALSE, ylim=c(0,ylim), xlim=c(0,xlim))+
  facet_wrap(~date)+
  scale_fill_viridis_c()+
  theme_bw()

a2

#regular true
area_sample3 <- sample_area_areally(field = truth,n_samples = 10,x_range = x_range,y_range = y_range, sampling_error = 0.1, regular = TRUE, balanced = FALSE)

a3<-ggplot(truth, aes(x = x, y = y, fill = value)) +
 geom_raster(alpha=0.8,hjust = 0, vjust = 0) +
  labs(fill = "Value", title = "Regular = TRUE") +
  xlab("X coordinates")+
  ylab("Y coordinates") +
  geom_rect(data=area_sample3, mapping=aes(xmin=x-x_range, xmax=x+x_range, ymin=y-y_range, ymax=y+y_range, fill=area_value), color="black") +
  geom_point(data=area_sample3, aes(x=x,y=y)) +
  coord_equal(expand=FALSE, ylim=c(0,ylim), xlim=c(0,xlim))+
  facet_wrap(~date)+
  scale_fill_viridis_c()+
  theme_bw()

a3

Sampling simulation

In this vignette

Sampling strategies

Point sampling

Area sampling