Title: | Access and Analyze eBird Status and Trends Data Products |
---|---|
Description: | Tools for accessing and analyzing eBird Status and Trends Data Products (<https://science.ebird.org/en/status-and-trends>). eBird (<https://ebird.org/home>) is a global database of bird observations collected by members of the public. eBird Status and Trends uses these data to model global bird distributions, abundances, and population trends at a high spatial and temporal resolution. |
Authors: | Matthew Strimas-Mackey [aut, cre], Shawn Ligocki [aut], Tom Auer [aut], Daniel Fink [aut], Cornell Lab of Ornithology [cph] |
Maintainer: | Matthew Strimas-Mackey <[email protected]> |
License: | GPL-3 |
Version: | 3.2022.4 |
Built: | 2024-10-28 13:29:43 UTC |
Source: | https://github.com/ebird/ebirdst |
Given a set of points in space and (optionally) time, define a regular grid with given dimensions, and return the grid cell index for each point.
assign_to_grid( points, coords = NULL, is_lonlat = FALSE, res, jitter_grid = TRUE, grid_definition = NULL )
points |
data frame; points with spatial coordinates |
coords |
character; names of the spatial and temporal coordinates in the
input dataframe. Only provide these names if you want to overwrite the
default coordinate names: |
is_lonlat |
logical; if the points are in unprojected, lon-lat
coordinates. In this case, the input data frame should have columns
|
res |
numeric; resolution of the grid in the |
jitter_grid |
logical; whether to jitter the location of the origin of the grid to introduce some randomness. |
grid_definition |
list; object defining the grid via the |
Data frame with the indices of the space-only and spacetime grid
cells. This data frame will have a grid_definition attribute that can be
used to reconstruct the grid.
    set.seed(1)

    # generate some example points
    points_xyt <- data.frame(x = runif(100), y = runif(100), t = rnorm(100))
    # assign to grid
    cells <- assign_to_grid(points_xyt, res = c(0.1, 0.1, 0.5))

    # assign a second set of points to the same grid
    assign_to_grid(points_xyt, grid_definition = attr(cells, "grid_definition"))

    # assign lon-lat points to a 10km space-only grid
    points_ll <- data.frame(longitude = runif(100, min = -180, max = 180),
                            latitude = runif(100, min = -90, max = 90))
    assign_to_grid(points_ll, res = c(10000, 10000), is_lonlat = TRUE)

    # overwrite default coordinate names, 5km by 1 week grid
    points_names <- data.frame(lon = runif(100, min = -180, max = 180),
                               lat = runif(100, min = -90, max = 90),
                               day = sample.int(365, size = 100))
    assign_to_grid(points_names, res = c(5000, 5000, 7),
                   coords = c("lon", "lat", "day"),
                   is_lonlat = TRUE)
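As a rough illustration of the idea behind grid assignment, and not the package's actual implementation, the base R sketch below finds the cell containing each point by integer division of each coordinate by the cell size. The helper name cell_index() and the fixed origin at zero are assumptions made for this example; assign_to_grid() additionally handles projection of lon-lat points and jittering of the grid origin.

```r
# Hypothetical helper: index of the grid cell containing each value along
# one dimension, for a grid with the given origin and cell size.
cell_index <- function(x, origin, res) {
  floor((x - origin) / res) + 1
}

pts <- data.frame(x = c(0.05, 0.15, 0.95), y = c(0.05, 0.05, 0.55))
res <- c(0.1, 0.1)

# column and row index of the cell containing each point
col_id <- cell_index(pts$x, origin = 0, res = res[1])
row_id <- cell_index(pts$y, origin = 0, res = res[2])

# combine the two indices into a single space-only cell label
cell_id <- paste(col_id, row_id, sep = "-")
print(cell_id)
```

Points that share a label fall in the same cell, which is the property that grid-based subsampling relies on.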
Given binary observed and predicted responses, estimate the Matthews correlation coefficient (MCC) and the F1 score.
calculate_mcc_f1(observed, predicted)
observed |
logical or 0/1; the observed binary response. |
predicted |
logical or 0/1; the predicted binary response. This will typically need to be generated by applying a threshold to the continuous predicted response. |
A list with two elements: mcc and f1.
    obs <- c(rep(1L, 1000L), rep(0L, 10000L))
    pred <- c(rbeta(300L, 12, 2), rbeta(700L, 3, 4), rbeta(10000L, 2, 3))
    calculate_mcc_f1(obs > 0, pred > 0.5)
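For reference, both metrics can be computed by hand from the four cells of the confusion matrix. The sketch below uses the standard textbook definitions in base R on a small made-up example; it is not taken from the package source, which may treat edge cases such as zero denominators differently.

```r
# observed and predicted binary responses (made-up values)
obs  <- c(TRUE, TRUE, TRUE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE)
pred <- c(TRUE, TRUE, TRUE, FALSE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE)

# confusion matrix counts, stored as doubles so that the products below
# cannot overflow the integer range on large datasets
tp <- as.numeric(sum(obs & pred))    # true positives
fp <- as.numeric(sum(!obs & pred))   # false positives
fn <- as.numeric(sum(obs & !pred))   # false negatives
tn <- as.numeric(sum(!obs & !pred))  # true negatives

# Matthews correlation coefficient
mcc <- (tp * tn - fp * fn) /
  sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))

# F1 score: harmonic mean of precision and recall
f1 <- 2 * tp / (2 * tp + fp + fn)

print(c(mcc = mcc, f1 = f1))
```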
Get the Status and Trends week that a date falls into
date_to_st_week(dates, version = 2022)
dates |
a vector of dates. |
version |
One of |
An integer vector of week numbers from 1 to 52.
    d <- as.Date(c("2016-04-08", "2018-12-31", "2014-01-01", "2018-09-04"))
    date_to_st_week(d)
Identify and return the path to the default download directory for eBird
Status and Trends data products. This directory can be defined by setting the
environment variable EBIRDST_DATA_DIR, otherwise the directory returned by
tools::R_user_dir("ebirdst", which = "data") will be used.
ebirdst_data_dir()
The path to the data download directory.
ebirdst_data_dir()
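The lookup order described for this function can be sketched in base R as follows. The function name data_dir_sketch() is hypothetical and the body is only an approximation of the documented behavior, not the package source. Note that the example sets and then unsets EBIRDST_DATA_DIR in the current session.

```r
# Hypothetical re-implementation of the documented lookup order
data_dir_sketch <- function() {
  env_dir <- Sys.getenv("EBIRDST_DATA_DIR", unset = "")
  if (nzchar(env_dir)) {
    # the environment variable takes precedence when it is set
    return(env_dir)
  }
  # otherwise fall back to the per-user data directory for the package
  tools::R_user_dir("ebirdst", which = "data")
}

# with the environment variable set
Sys.setenv(EBIRDST_DATA_DIR = file.path(tempdir(), "ebirdst-data"))
from_env <- data_dir_sketch()

# with the environment variable unset
Sys.unsetenv("EBIRDST_DATA_DIR")
from_default <- data_dir_sketch()

print(c(from_env, from_default))
```

To make a custom directory persist across sessions, the variable would typically be set in an .Renviron file rather than with Sys.setenv().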
In addition to the species-specific data products, the eBird Status data products include two products providing estimates of weekly data coverage at 3 km spatial resolution: site selection probability and spatial coverage. This function downloads these data products in raster GeoTIFF format.
ebirdst_download_data_coverage( path = ebirdst_data_dir(), pattern = NULL, dry_run = FALSE, force = FALSE, show_progress = TRUE )
path |
character; directory to download the data to. All downloaded
files will be placed in a sub-directory of this directory named for the
data version year, e.g. "2020" for the 2020 Status Data Products. Each
species' data package will then appear in a directory named with the eBird
species code. Defaults to a persistent data directory, which can be found
by calling |
pattern |
character; regular expression pattern to supply to
str_detect() to filter files to download. This
filter will be applied in addition to any of the |
dry_run |
logical; whether to do a dry run, just listing files that will
be downloaded. This can be useful when testing the use of |
force |
logical; if the data have already been downloaded, should a fresh copy be downloaded anyway. |
show_progress |
logical; whether to print download progress information. |
Path to the folder containing the downloaded data coverage products.
    ## Not run:
    # download all data coverage products
    ebirdst_download_data_coverage()

    # download just the spatial coverage products
    ebirdst_download_data_coverage(pattern = "spatial-coverage")

    # download a single week of data coverage products
    ebirdst_download_data_coverage(pattern = "01-04")

    # download all weeks in April
    ebirdst_download_data_coverage(pattern = "04-")
    ## End(Not run)
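To see how the patterns in these examples select files, the sketch below applies them to a handful of made-up file names using base R's grepl(), which stands in for str_detect() here. The file names are hypothetical and chosen only to show the matching behavior; the real file names in the data package may be structured differently, so a dry run is the reliable way to check a pattern.

```r
# Hypothetical file names, for illustration only; real names may differ
files <- c("spatial-coverage_01-04.tif",
           "spatial-coverage_04-05.tif",
           "selection-probability_01-04.tif",
           "selection-probability_04-12.tif")

# all files for one product
spatial <- files[grepl("spatial-coverage", files)]

# both products for the week of January 4
jan_04 <- files[grepl("01-04", files)]

# the trailing hyphen restricts the match to "04" used as a month
# (for example "04-05"), so "01-04" is not selected
april <- files[grepl("04-", files)]

print(april)
```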
Download eBird Status Data Products for a single species, or for an example
species. Downloading Status and Trends data requires an access key; consult
set_ebirdst_access_key() for instructions on how to obtain and store this
key. The example data consist of the results for Yellow-bellied Sapsucker
subset to Michigan and are much smaller than the full dataset, making these
data quicker to download and process. Only the low resolution (27 km) data
are available for the example data. In addition, the example data are
accessible without an access key.
ebirdst_download_status( species, path = ebirdst_data_dir(), download_abundance = TRUE, download_occurrence = FALSE, download_count = FALSE, download_ranges = FALSE, download_regional = FALSE, download_pis = FALSE, download_ppms = FALSE, download_all = FALSE, pattern = NULL, dry_run = FALSE, force = FALSE, show_progress = TRUE )
species |
character; a single species given as a scientific name, common
name or six-letter species code (e.g. "woothr"). The full list of valid
species is in the ebirdst_runs data frame included in this package. To
download the example dataset, use |
path |
character; directory to download the data to. All downloaded
files will be placed in a sub-directory of this directory named for the
data version year, e.g. "2020" for the 2020 Status Data Products. Each
species' data package will then appear in a directory named with the eBird
species code. Defaults to a persistent data directory, which can be found
by calling |
download_abundance |
logical; whether to download estimates of abundance and proportion of population. |
download_occurrence |
logical; whether to download estimates of occurrence. |
download_count |
logical; whether to download estimates of count. |
download_ranges |
logical; whether to download the range polygons. |
download_regional |
logical; whether to download the regional summary stats, e.g. percent of population in regions. |
download_pis |
logical; whether to download spatial estimates of predictor importance. |
download_ppms |
logical; whether to download spatial predictive performance metrics. |
download_all |
logical; download all files in the data package.
Equivalent to setting all the |
pattern |
character; regular expression pattern to supply to
str_detect() to filter files to download. This
filter will be applied in addition to any of the |
dry_run |
logical; whether to do a dry run, just listing files that will
be downloaded. This can be useful when testing the use of |
force |
logical; if the data have already been downloaded, should a fresh copy be downloaded anyway. |
show_progress |
logical; whether to print download progress information. |
The complete data package for each species contains a large number
of files, all of which are cataloged in the vignettes. Most users will only
require a small subset of these files, so by default this function only
downloads the most commonly used files: GeoTIFFs providing estimates of
relative abundance and proportion of population. For those interested in
additional data products, the arguments starting with download_ control
the download of these other products. The pattern argument provides even
finer grained control over what gets downloaded.
Path to the folder containing the downloaded data package for the
given species. If dry_run = TRUE, a list of files to download will be
returned.
    ## Not run:
    # download the example data
    ebirdst_download_status("yebsap-example")

    # download the data package for wood thrush
    ebirdst_download_status("woothr")

    # use pattern to only download low resolution (27 km) geotiff data
    # dry_run can be used to see what files will be downloaded
    ebirdst_download_status("lobcur", pattern = "_27km_", dry_run = TRUE)
    # use pattern to only download high resolution (3 km) weekly abundance data
    ebirdst_download_status("lobcur", pattern = "abundance_median_3km",
                            dry_run = TRUE)
    ## End(Not run)
Download eBird Trends Data Products for a set of species, or for an example
species. Downloading Status and Trends data requires an access key; consult
set_ebirdst_access_key() for instructions on how to obtain and store this
key. The example data consist of the results for Yellow-bellied Sapsucker
subset to Michigan and are much smaller than the full dataset, making these
data quicker to download and process. The example data are accessible without
an access key.
ebirdst_download_trends( species, path = ebirdst_data_dir(), force = FALSE, show_progress = TRUE )
species |
character; one or more species given as scientific names,
common names or six-letter species codes (e.g. "woothr"). The full list of
valid species can be viewed in the ebirdst_runs data frame included in
this package; species with trends estimates are indicated by the
|
path |
character; directory to download the data to. All downloaded
files will be placed in a sub-directory of this directory named for the
data version year, e.g. "2020" for the 2020 Status Data Products. Each
species' data package will then appear in a directory named with the eBird
species code. Defaults to a persistent data directory, which can be found
by calling |
force |
logical; if the data have already been downloaded, should a fresh copy be downloaded anyway. |
show_progress |
logical; whether to print download progress information. |
Character vector of paths to the folders containing the downloaded
data packages for the given species. The trends data will be in the
trends/ subdirectory.
    ## Not run:
    # download the example data
    ebirdst_download_trends("yebsap-example")

    # download the data package for wood thrush
    ebirdst_download_trends("woothr")

    # multiple species can be downloaded at once
    ebirdst_download_trends(c("Sage Thrasher", "Abert's Towhee"))
    ## End(Not run)
Generate the color palettes used for the eBird Status and Trends relative abundance and trends maps.
ebirdst_palettes( n, type = c("weekly", "breeding", "nonbreeding", "migration", "prebreeding_migration", "postbreeding_migration", "year_round", "trends") )
n |
integer; the number of colors to be in the palette. |
type |
character; the type of color palette: "weekly" for the weekly relative abundance, "trends" for the trends color palette, and a season name for the seasonal relative abundance. Note that for trends a diverging palette is returned, while all other palettes are sequential. |
A character vector of hex color codes.
    # breeding season color palette
    ebirdst_palettes(10, type = "breeding")
Details on the eBird Status and Trends predictor variables or, for variables all derived from the same dataset, details on the dataset.
ebirdst_predictor_descriptions
A data frame with 37 rows and 4 columns
dataset
: dataset name.
predictor
: predictor name or, if multiple variables are derived from
this dataset, the pattern used to generate the names.
description
: detailed description of the dataset or variable.
reference
: a reference to consult for further information on the dataset.
A data frame of the predictors used in the eBird Status and Trends models.
These include effort variables (e.g. distance traveled, number of observers,
etc.) in addition to variables describing the environment (e.g. elevation,
land cover, water cover, etc.). The environmental variables are derived by
summarizing remotely sensed datasets (described in
ebirdst_predictor_descriptions) over a 3 km diameter neighborhood around
each checklist. For categorical datasets, two variables are generated for
each class describing the percent cover (pland) and edge density (ed).
ebirdst_predictors
A data frame with 150 rows and 4 columns:
predictor
: predictor name.
dataset
: dataset name, which can be cross referenced in
ebirdst_predictor_descriptions for further details.
class
: class number or name for categorical variables.
label
: descriptive labels for each predictor variable.
A dataset listing the species for which eBird Status and Trends Data Products are available, with additional information relevant to both the Status and Trends results for each species.
ebirdst_runs
A data frame with 27 variables:
species_code
: alphanumeric eBird species code uniquely identifying the
species
scientific_name
: scientific name.
common_name
: English common name.
is_resident
: classifies this species as a resident or a migrant.
breeding_quality
: breeding season quality.
breeding_start
: breeding season start date.
breeding_end
: breeding season end date.
nonbreeding_quality
: non-breeding season quality.
nonbreeding_start
: non-breeding season start date.
nonbreeding_end
: non-breeding season end date.
postbreeding_migration_quality
: post-breeding season quality.
postbreeding_migration_start
: post-breeding season start date.
postbreeding_migration_end
: post-breeding season end date.
prebreeding_migration_quality
: pre-breeding season quality.
prebreeding_migration_start
: pre-breeding season start date.
prebreeding_migration_end
: pre-breeding season end date.
resident_quality
: resident quality.
resident_start
: for resident species, the year-round start date.
resident_end
: for resident species, the year-round end date.
has_trends
: whether or not this species has trends estimates.
trends_season
: season that the trend was estimated for: breeding,
nonbreeding, or resident.
trends_region
: the geographic region that the trend model was run for.
Note that broadly distributed species (e.g. Barn Swallow) will only have
trend estimates for a regional subset of their full range.
trends_start_year
: start year of the trend time period.
trends_end_year
: end year of the trend time period.
trends_start_date
: start date (MM-DD
format) of the season for which the trend was estimated.
trends_end_date
: end date (MM-DD
format) of the season for which the trend was estimated.
rsquared
: R-squared value comparing the actual and estimated trends from the simulations.
beta0
: the intercept of a linear model fitting actual vs. estimated
trends (actual ~ estimated) for the simulations. Positive values of beta0
indicate that the models are systematically underestimating the simulated
trend for this species.
For the Status Data Products, the dates defining the boundaries of the seasons are provided in addition to a quality rating from 0-3 for each season. These dates and quality ratings are assigned through a process of expert review. Note that missing dates imply that the season failed expert review for that species.
Trends Data Products are only available for a subset of species, indicated by
the has_trends variable, and for each species the trend is estimated for a
single season. The two predictive performance metrics (rsquared and beta0)
are based on a comparison of actual and estimated percent per year
trends for a suite of simulations (see Fink et al. 2023 for further details).
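The two metrics correspond to standard quantities from a simple linear regression of actual on estimated trends. The base R sketch below shows how such values can be obtained; the trend values are made up for illustration and do not come from any eBird simulation.

```r
# Made-up "estimated" and "actual" trends in percent per year
estimated <- c(-4, -2, 0, 2, 4)
actual    <- c(-3.8, -1.4, 1.1, 3.2, 5.9)

# linear model of actual vs. estimated trends
fit <- lm(actual ~ estimated)

# intercept: positive here, so on average the actual trend is higher
# than the estimated trend (the estimates are too low)
beta0 <- unname(coef(fit)["(Intercept)"])

# proportion of variance in the actual trends explained by the estimates
rsquared <- summary(fit)$r.squared

print(c(beta0 = beta0, rsquared = rsquared))
```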
The trends regions are defined as follows:
aus_nz
: Australia and New Zealand
iberia
: Spain and Portugal
india_se_asia
: India, Nepal, Bhutan, Sri Lanka, Thailand, Cambodia,
Malaysia, Brunei, Singapore, and Philippines
japan
: Japan
north_america
: North America including Mexico, Central America, and the
Caribbean, but excluding Nunavut, Northwest Territories, and Hawaii
south_africa
: South Africa, Lesotho, and Eswatini
south_america
: Colombia, Ecuador, Peru, Chile, Argentina, and Uruguay
taiwan
: Taiwan
turkey_plus
: Turkey, Cyprus, Israel, Palestine, Greece, Armenia, and
Georgia
Identify the version of the eBird Status and Trends Data Products that this version of the R package works with. Versions are defined by the year that all model estimates are made for. In addition, the release date and end date for access of this version of the data are provided. Note that after the given access end date you will no longer be able to download this version of the data and will be required to update the R package and transition to using a newer data version.
ebirdst_version()
A list with three components: version_year is the year the model
estimates are made for in this version of the data, release_year is the
year this version of the data were released, and access_end_date is the
last date that users will be able to download this version of the data.
ebirdst_version()
Given a vector of species codes, common names, and/or scientific names, return a vector of 6-letter eBird species codes. This function will only look up codes for species for which eBird Status and Trends results exist.
get_species(x)
x |
character; vector of species codes, common names, and/or scientific names. |
A character vector of eBird species codes.
get_species(c("Black-capped Chickadee", "Poecile gambeli", "carchi"))
This helper function can be used to get the path to a data package for a given species.
get_species_path(species, path = ebirdst_data_dir(), check_downloaded = TRUE)
species |
character; a single species given as a scientific name, common
name or six-letter species code (e.g. "woothr"). The full list of valid
species is in the ebirdst_runs data frame included in this package. To
download the example dataset, use |
path |
character; directory to download the data to. All downloaded
files will be placed in a sub-directory of this directory named for the
data version year, e.g. "2020" for the 2020 Status Data Products. Each
species' data package will then appear in a directory named with the eBird
species code. Defaults to a persistent data directory, which can be found
by calling |
check_downloaded |
logical; raise an error if no data have been downloaded for this species. |
The path to the data package directory.
    ## Not run:
    # get the path
    path <- get_species_path("yebsap-example")

    # get the path to the full data package for yellow-bellied sapsucker
    # common name, scientific name, or species code can be used
    path <- get_species_path("Yellow-bellied Sapsucker")
    path <- get_species_path("Sphyrapicus varius")
    path <- get_species_path("yebsap")
    ## End(Not run)
Sample observation data on a spacetime grid to reduce spatiotemporal bias.
grid_sample( x, coords = c("longitude", "latitude", "day_of_year"), is_lonlat = TRUE, res = c(3000, 3000, 7), jitter_grid = TRUE, sample_size_per_cell = 1, cell_sample_prop = 0.75, keep_cell_id = FALSE, grid_definition = NULL )
grid_sample_stratified( x, coords = c("longitude", "latitude", "day_of_year"), is_lonlat = TRUE, unified_grid = FALSE, keep_cell_id = FALSE, by_year = TRUE, case_control = TRUE, obs_column = "obs", sample_by = NULL, min_detection_probability = 0, maximum_ss = NULL, jitter_columns = NULL, jitter_sd = 0.1, ... )
x |
data frame; observations to sample, including at least the columns defining the location in space and time. Additional columns can be included such as features that will later be used in model training. |
coords |
character; names of the spatial and temporal coordinates. By
default the spatial coordinates should be |
is_lonlat |
logical; if the points are in unprojected, lon-lat coordinates. In this case, the points will be projected to an equal area Eckert IV CRS prior to grid assignment. |
res |
numeric; resolution of the spatiotemporal grid in the x, y, and time dimensions. Unprojected locations are projected to an equal area coordinate system prior to sampling, and resolution should therefore be provided in units of meters. The temporal resolution should be in the native units of the time coordinate in the input data frame, typically it will be a number of days. |
jitter_grid |
logical; whether to jitter the location of the origin of the grid to introduce some randomness. |
sample_size_per_cell |
integer; number of observations to sample from each grid cell. |
cell_sample_prop |
proportion |
keep_cell_id |
logical; whether to retain a unique cell identifier,
stored in column named |
grid_definition |
list defining the spatiotemporal sampling grid as
returned by |
unified_grid |
logical; whether a single, unified spatiotemporal
sampling grid should be defined and used for all observations in |
by_year |
logical; whether the sampling should be done by year, i.e.
sampling N observations per grid cell per year, rather than across years,
i.e. N observations per grid cell regardless of year. If using sampling by
year, the input data frame |
case_control |
logical; whether to apply case control sampling whereby presence and absence are sampled independently. |
obs_column |
character; if |
sample_by |
character; additional columns in |
min_detection_probability |
proportion |
maximum_ss |
integer; the maximum sample size in the final dataset. If
the grid sampling yields more than this number of observations,
|
jitter_columns |
character; if detections are oversampled to achieve the
minimum detection probability, some observations will be duplicated, and it
can be desirable to slightly "jitter" the values of model training features
for these duplicated observations. This argument defines the column names
in |
jitter_sd |
numeric; strength of the jittering in units of standard
deviations, see |
... |
additional arguments defining the spatiotemporal grid; passed to
|
grid_sample_stratified() performs stratified case control sampling,
independently sampling from strata defined by, for example, year and
detection/non-detection. Within each stratum, grid_sample() is used to
sample the observations on a spatiotemporal grid. In addition, if case
control sampling is turned on, the detections are oversampled to increase the
frequency of detections in the dataset.
The sampling grid is defined, and assignment of locations to cells occurs, in
assign_to_grid(). Consult the help for that function for further details on
how the grid is generated and locations are assigned. Note that by providing
2-element vectors to both coords and res the time component of the grid
can be ignored and spatial-only subsampling is performed.
A data frame of the spatiotemporally sampled data.
    set.seed(1)

    # generate some example observations
    n_obs <- 10000
    checklists <- data.frame(longitude = rnorm(n_obs, sd = 0.1),
                             latitude = rnorm(n_obs, sd = 0.1),
                             day_of_year = sample.int(28, n_obs, replace = TRUE),
                             year = NA_integer_,
                             obs = rpois(n_obs, lambda = 0.1),
                             forest_cover = runif(n_obs),
                             island = as.integer(runif(n_obs) > 0.95))
    # add a year column, giving more data to recent years
    checklists$year <- sample(seq(2016, 2020), size = n_obs, replace = TRUE,
                              prob = seq(0.3, 0.7, length.out = 5))
    # create several rare islands
    checklists$island[sample.int(nrow(checklists), 9)] <- 2:10

    # basic spatiotemporal grid sampling
    sampled <- grid_sample(checklists)

    # plot original data and grid sampled data
    par(mar = c(0, 0, 0, 0))
    plot(checklists[, c("longitude", "latitude")],
         pch = 19, cex = 0.3, col = "#00000033",
         axes = FALSE)
    points(sampled[, c("longitude", "latitude")],
           pch = 19, cex = 0.3, col = "red")

    # case control sampling stratified by year and island
    # return a maximum of 1000 checklists
    sampled_cc <- grid_sample_stratified(checklists, sample_by = "island",
                                         maximum_ss = 1000)

    # case control sampling increases the prevalence of detections
    mean(checklists$obs > 0)
    mean(sampled$obs > 0)
    mean(sampled_cc$obs > 0)

    # stratifying by island ensures all levels are retained, even rare ones
    table(checklists$island)
    # normal grid sampling loses rare island levels
    table(sampled$island)
    # stratified grid sampling retains at least one observation from each level
    table(sampled_cc$island)
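For readers who want to see the core idea without the package, the following base R sketch keeps one randomly chosen observation per occupied spacetime cell. It is a simplified illustration under stated assumptions: coordinates are already in planar units, the grid origin is fixed at zero, and there is no equivalent of cell_sample_prop or case control sampling. The cell sizes are arbitrary and the code is not the implementation used by grid_sample().

```r
set.seed(1)

# example observations on the unit square over four weeks
obs <- data.frame(x = runif(500), y = runif(500),
                  day = sample.int(28, 500, replace = TRUE))

# label each observation with its spacetime cell
# (0.25 x 0.25 spatial units by 7 days)
cell <- paste(floor(obs$x / 0.25),
              floor(obs$y / 0.25),
              floor((obs$day - 1) / 7),
              sep = "-")

# group row indices by cell, then draw one row index per occupied cell;
# indexing with sample.int() avoids the sample() pitfall for length-1 input
rows_by_cell <- split(seq_len(nrow(obs)), cell)
keep <- vapply(rows_by_cell,
               function(i) i[sample.int(length(i), 1)],
               integer(1))
sampled <- obs[keep, ]

# one retained observation per occupied cell
nrow(sampled)
```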
Load the configuration file for an eBird Status run. This configuration file is mostly for internal use and contains a variety of parameters used in the modeling process.
load_config(species, path = ebirdst_data_dir())
species |
character; the species to load data for, given as a scientific
name, common name or six-letter species code (e.g. "woothr"). The full list
of valid species is in the ebirdst_runs data frame included in this
package. To download the example dataset, use |
path |
character; directory to download the data to. All downloaded
files will be placed in a sub-directory of this directory named for the
data version year, e.g. "2020" for the 2020 Status Data Products. Each
species' data package will then appear in a directory named with the eBird
species code. Defaults to a persistent data directory, which can be found
by calling |
A list with the run configuration parameters.
    ## Not run:
    # download example data if hasn't already been downloaded
    ebirdst_download_status("yebsap-example")

    # load configuration parameters
    p <- load_config("yebsap-example")
    ## End(Not run)
The data coverage products are packaged as individual GeoTIFF files for each
product for each week of the year. This function loads one of the available
data products for one or more weeks into R as a
SpatRaster object. Note that data must be downloaded
using ebirdst_download_data_coverage()
prior to loading it using this
function.
load_data_coverage( product = c("spatial-coverage", "selection-probability"), weeks = NULL, path = ebirdst_data_dir() )
product |
character; data coverage raster product to load: spatial coverage or site selection probability. |
weeks |
character; one or more weeks (expressed in |
path |
character; directory to download the data to. All downloaded
files will be placed in a sub-directory of this directory named for the
data version year, e.g. "2020" for the 2020 Status Data Products. Each
species' data package will then appear in a directory named with the eBird
species code. Defaults to a persistent data directory, which can be found
by calling |
In addition to the species-specific data products, the eBird Status data products include two products providing estimates of weekly data coverage at 3 km spatial resolution:
spatial-coverage: a spatially smoothed estimate of the proportion of the area that was covered by eBird checklists for the given week.
selection-probability: a modeled estimate of the probability that the given location and habitat was sampled by eBird data in the given week.
A SpatRaster with between 1 and 52 layers for the given product for the
given weeks, where the layer names are the dates (YYYY-MM-DD format) of the
midpoint of each week.
## Not run: 
# download data coverage products if they haven't already been downloaded
ebirdst_download_data_coverage()

# load a single week of site selection probability data
load_data_coverage("selection-probability", weeks = "01-04")

# load two weeks of spatial coverage data
load_data_coverage("spatial-coverage", weeks = c("01-04", "01-11"))

## End(Not run)
Get the map parameters used on the eBird Status and Trends website to optimally display the full annual cycle data. This includes bins for the abundance data, a projection, and an extent to map. The extent is the spatial extent of non-zero data across the full annual cycle and the projection is optimized for this extent.
load_fac_map_parameters(species, path = ebirdst_data_dir())
species |
character; the species to load data for, given as a scientific
name, common name or six-letter species code (e.g. "woothr"). The full list
of valid species is in the ebirdst_runs data frame included in this
package. To download the example dataset, use |
path |
character; directory to download the data to. All downloaded
files will be placed in a sub-directory of this directory named for the
data version year, e.g. "2020" for the 2020 Status Data Products. Each
species' data package will then appear in a directory named with the eBird
species code. Defaults to a persistent data directory, which can be found
by calling |
A list containing elements:
custom_projection: a custom projection optimized for the given species' full annual cycle
fa_extent: a SpatExtent object storing the spatial extent of non-zero data for the given species in the custom projection
res: a numeric vector with 2 elements giving the target resolution of raster in the custom projection
fa_extent_sinu: the extent in sinusoidal projection
weekly_bins/weekly_labels: weekly abundance bins and labels for the full annual cycle
seasonal_bins/seasonal_labels: seasonal abundance bins and labels for the full annual cycle
## Not run: 
# download example data if it hasn't already been downloaded
ebirdst_download_status("yebsap-example")

# load map parameters
load_fac_map_parameters("yebsap-example")

## End(Not run)
The eBird Status models estimate the relative importance of each
environmental predictor used in the model. These predictor importance (PI)
data are converted to ranks (with a rank of 1 being the most important)
relative to the full suite of environmental predictors. The ranks are
summarized to a 27 km resolution raster grid for each predictor, where the
cell values are the average across all models in the ensemble contributing to
that cell. These data are available in raster format provided
download_pis = TRUE was used when calling ebirdst_download_status(). PI
estimates are available separately for the occurrence and count sub-models,
and only the 30 most important predictors are distributed. Use
list_available_pis() to see which predictors have PI data.
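The rank conversion described above can be illustrated with a small base R sketch. The importance scores and predictor names below are made up for illustration, and the code is not the package's internal implementation; it only shows how scores become ranks (1 = most important) that are then averaged across models.

```r
# Hypothetical importance scores: rows are predictors, columns are two
# ensemble models contributing to the same grid cell
importance <- matrix(
  c(0.50, 0.40,
    0.30, 0.45,
    0.20, 0.15),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("elevation", "forest_cover", "water_cover"),
                  c("model_1", "model_2"))
)

# rank within each model; negate so that rank 1 is the most important
ranks <- apply(importance, 2, function(x) rank(-x))

# average rank across the models contributing to the cell
mean_rank <- rowMeans(ranks)
print(mean_rank)
```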
load_pi( species, predictor, response = c("occurrence", "count"), path = ebirdst_data_dir() ) list_available_pis(species, path = ebirdst_data_dir())
species |
character; the species to load data for, given as a scientific
name, common name or six-letter species code (e.g. "woothr"). The full list
of valid species is in the ebirdst_runs data frame included in this
package. To download the example dataset, use |
predictor |
character; the predictor that the PI data should be loaded
for. The list of predictors that PI data are available for varies by
species, use |
response |
character; the model (occurrence or count) that the PI data should be loaded for. |
path |
character; directory to download the data to. All downloaded
files will be placed in a sub-directory of this directory named for the
data version year, e.g. "2020" for the 2020 Status Data Products. Each
species' data package will then appear in a directory named with the eBird
species code. Defaults to a persistent data directory, which can be found
by calling |
A SpatRaster object with the PI ranks for the given predictor. For migrants,
the estimates are weekly and the raster will have 52 layers, where the layer
names are the dates (MM-DD format) of the midpoint of each week. For
residents, a single year round layer is returned.
list_available_pis() returns a data frame listing the top 30 predictors for
which PI rasters can be loaded. In addition to the predictor names, the mean
range-wide rank (rangewide_rank) is given as well as the integer rank (rank)
relative to the other 29 predictors.
list_available_pis(): list the predictors that have PI information for this
species.
## Not run: 
# download example data if it hasn't already been downloaded
ebirdst_download_status("yebsap-example", download_pis = TRUE)

# identify the top predictor
top_preds <- list_available_pis("yebsap-example")
print(top_preds[1, ])

# load predictor importance raster of top predictor for occurrence
load_pi("yebsap-example", top_preds$predictor[1])

## End(Not run)
eBird Status models are evaluated against a test set of eBird data not used
during model training and a suite of predictive performance metrics (PPMs)
are calculated. The PPMs for each base model are summarized to a 27 km
resolution raster grid, where the cell values are the average across all
models in the ensemble contributing to that cell. These data are available in
raster format provided download_ppms = TRUE was used when calling
ebirdst_download_status().
load_ppm( species, ppm = c("binary_f1", "binary_pr_auc", "occ_bernoulli_dev", "count_spearman", "log_count_pearson", "abd_poisson_dev", "abd_spearman", "log_abd_pearson"), path = ebirdst_data_dir() )
species |
character; the species to load data for, given as a scientific
name, common name or six-letter species code (e.g. "woothr"). The full list
of valid species is in the ebirdst_runs data frame included in this
package. To download the example dataset, use |
ppm |
character; the name of a single metric to load data for. See Details for definitions of each metric. |
path |
character; directory to download the data to. All downloaded
files will be placed in a sub-directory of this directory named for the
data version year, e.g. "2020" for the 2020 Status Data Products. Each
species' data package will then appear in a directory named with the eBird
species code. Defaults to a persistent data directory, which can be found
by calling |
Eight predictive performance metrics are provided:
binary_f1: F1-score comparing the model predictions converted to binary with the observed detection/non-detection for the test checklists.
binary_pr_auc: the area under the precision-recall curve generated by comparing the model predictions converted to binary with the observed detection/non-detection for the test checklists.
occ_bernoulli_dev: Bernoulli deviance comparing the predicted occurrence with the observed detection/non-detection for the test checklists.
count_spearman: Spearman's rank correlation coefficient comparing the predicted count with the observed count for the subset of test checklists on which the species was detected.
log_count_pearson: Pearson correlation coefficient comparing the logarithm of the predicted count with the logarithm of the observed count for the subset of test checklists on which the species was detected.
abd_poisson_dev: Poisson deviance comparing the predicted relative abundance with the observed count for the full set of test checklists.
abd_spearman: Spearman's rank correlation coefficient comparing the predicted relative abundance with the observed count for the full set of test checklists.
log_abd_pearson: Pearson correlation coefficient comparing the logarithm of the predicted relative abundance with the logarithm of the observed count for the full set of test checklists.
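To make two of these metrics concrete, the following base R sketch computes an F1-score and a Spearman correlation on a tiny made-up test set. The data, the variable names, and the 0.5 threshold used to convert predictions to binary are all assumptions for illustration; the thresholding used by the eBird Status models is not described here.

```r
# Made-up test checklists: observed counts and model predictions
obs_count <- c(0, 0, 3, 1, 0, 2, 0, 5)
pred_occ  <- c(0.1, 0.6, 0.8, 0.4, 0.2, 0.7, 0.3, 0.9)
pred_abd  <- c(0.1, 0.5, 2.5, 0.6, 0.1, 1.8, 0.2, 4.0)

# observed detection/non-detection and binary predictions
obs_det  <- obs_count > 0
pred_det <- pred_occ >= 0.5  # assumed threshold, for illustration only

# F1-score from true positives, false positives, and false negatives
tp <- sum(pred_det & obs_det)
fp <- sum(pred_det & !obs_det)
fn <- sum(!pred_det & obs_det)
f1 <- 2 * tp / (2 * tp + fp + fn)

# Spearman's rank correlation between predicted relative abundance and
# observed count across the full set of test checklists
abd_spearman <- cor(pred_abd, obs_count, method = "spearman")

print(c(f1 = f1, abd_spearman = abd_spearman))
```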
A SpatRaster object with the PPM data. For migrants, rasters are weekly with
52 layers, where the layer names are the dates (MM-DD format) of the midpoint
of each week. For residents, a single year round layer is returned.
## Not run: 
# download example data if it hasn't already been downloaded
ebirdst_download_status("yebsap-example", download_ppms = TRUE)

# load area under the precision-recall curve PPM raster
load_ppm("yebsap-example", ppm = "binary_pr_auc")

## End(Not run)
Range polygons are defined as the boundaries of non-zero seasonal relative
abundance estimates, which are then (optionally) smoothed to produce more
aesthetically pleasing polygons using the smoothr package.
load_ranges( species, resolution = c("9km", "27km"), smoothed = TRUE, path = ebirdst_data_dir() )
species |
character; the species to load data for, given as a scientific
name, common name or six-letter species code (e.g. "woothr"). The full list
of valid species is in the ebirdst_runs data frame included in this
package. To download the example dataset, use |
resolution |
character; the raster resolution from which the range polygons were derived. |
smoothed |
logical; whether smoothed or unsmoothed ranges should be loaded. |
path |
character; directory to download the data to. All downloaded
files will be placed in a sub-directory of this directory named for the
data version year, e.g. "2020" for the 2020 Status Data Products. Each
species' data package will then appear in a directory named with the eBird
species code. Defaults to a persistent data directory, which can be found
by calling |
An sf object containing the seasonal range boundaries, with each season
provided as a different feature.
## Not run: 
# download example data if it hasn't already been downloaded
ebirdst_download_status("yebsap-example")

# load smoothed ranges
# note that only 27 km data are provided for the example data
ranges <- load_ranges("yebsap-example", resolution = "27km")

## End(Not run)
Each of the eBird Status raster products is packaged as a GeoTIFF file
representing predictions on a regular grid. The core products are occurrence,
count, relative abundance, and proportion of population. This function loads
one of the available data products into R as a
SpatRaster object. Note that data must be downloaded
using ebirdst_download_status()
prior to loading it using this function.
load_raster( species, product = c("abundance", "count", "occurrence", "proportion-population"), period = c("weekly", "seasonal", "full-year"), metric = NULL, resolution = c("3km", "9km", "27km"), path = ebirdst_data_dir() )
species |
character; the species to load data for, given as a scientific
name, common name or six-letter species code (e.g. "woothr"). The full list
of valid species is in the ebirdst_runs data frame included in this
package. To download the example dataset, use |
product |
character; eBird Status raster product to load: occurrence, count, relative abundance, or proportion of population. See Details for a detailed explanation of each of these products. |
period |
character; temporal period of the estimation. The eBird Status models make predictions for each week of the year; however, as a convenience, data are also provided summarized at the seasonal or annual ("full-year") level. |
metric |
character; by default, the weekly products provide estimates of
the median value ( |
resolution |
character; the resolution of the raster data to load. The default is to load the native 3 km resolution data; however, for some applications 9 km or 27 km data may be suitable. |
path |
character; directory to download the data to. All downloaded
files will be placed in a sub-directory of this directory named for the
data version year, e.g. "2020" for the 2020 Status Data Products. Each
species' data package will then appear in a directory named with the eBird
species code. Defaults to a persistent data directory, which can be found
by calling |
The core eBird Status data products provide weekly estimates across a regular spatial grid. They are packaged as rasters with 52 layers, each corresponding to estimates for a week of the year, and we refer to them as "cubes" (e.g. the "relative abundance cube"). All estimates are the median expected value for a standard 2 km, 1 hour eBird Traveling Count by an expert eBird observer at the optimal time of day and for optimal weather conditions to observe the given species. These products are:
occurrence: the expected probability (0-1) of occurrence of a species.
count: the expected count of a species, conditional on its occurrence at the given location.
abundance: the expected relative abundance of a species, computed as the product of the probability of occurrence and the count conditional on occurrence.
proportion-population: the proportion of the total relative abundance within each cell. This is a derived product calculated by dividing each cell value in the relative abundance raster by the total abundance summed across all cells.
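The relationships between these products can be sketched with plain matrices standing in for raster layers. The cell values below are invented, and a real workflow would operate on the SpatRaster objects returned by load_raster() rather than on matrices.

```r
# Made-up 2 x 2 grids standing in for one week of raster data
occurrence <- matrix(c(0.0, 0.2, 0.5, 0.8), nrow = 2)
count      <- matrix(c(0.0, 1.0, 2.0, 5.0), nrow = 2)

# relative abundance: probability of occurrence times count given occurrence
abundance <- occurrence * count

# proportion of population: each cell divided by the total across all cells
prop_pop <- abundance / sum(abundance, na.rm = TRUE)

print(abundance)
print(prop_pop)
```

By construction, the proportion of population values sum to 1 across all cells with data.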
In addition to these weekly data cubes, this function provides access to data
summarized over different periods. Seasonal cubes are produced by taking the
cell-wise mean or max across the weeks within each season. The boundary dates
for each season are species specific and are available in ebirdst_runs, and
if a season failed review no associated layer will be included in the cube.
In addition, full-year summaries provide the mean or max across all weeks of
the year that fall within a season that passed review. Note that this is not
necessarily all 52 weeks of the year. For example, if the estimates for the
non-breeding season failed expert review for a given species, the full-year
summary for that species will not include the weeks that would fall within
the non-breeding season.
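A minimal sketch of the cell-wise summaries described above, using a simulated array in place of a weekly cube. The week indices chosen for each season are hypothetical; actual season boundaries are species specific and come from ebirdst_runs.

```r
set.seed(1)

# simulated weekly cube: 2 x 2 cells by 52 weeks
weekly <- array(runif(2 * 2 * 52), dim = c(2, 2, 52))

# hypothetical week indices for a season that passed review
breeding_weeks <- 20:30

# seasonal summaries: cell-wise mean and max across the weeks in the season
seasonal_mean <- apply(weekly[, , breeding_weeks], c(1, 2), mean)
seasonal_max  <- apply(weekly[, , breeding_weeks], c(1, 2), max)

# full-year summary: only weeks falling within seasons that passed review,
# here a hypothetical set that leaves out part of the year
passed_weeks <- c(20:30, 40:52)
full_year_mean <- apply(weekly[, , passed_weeks], c(1, 2), mean)

print(seasonal_mean)
print(full_year_mean)
```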
For the weekly cubes, a SpatRaster with 52 layers for the given product,
where the layer names are the dates (YYYY-MM-DD format) of the midpoint of
each week. Seasonal cubes will have up to four layers named with the
corresponding season. The full-year products will have a single layer.
## Not run: 
# download example data if it hasn't already been downloaded
ebirdst_download_status("yebsap-example")

# weekly relative abundance
# note that only 27 km data are available for the example data
abd_weekly <- load_raster("yebsap-example", "abundance", resolution = "27km")

# the weeks for each layer are stored in the layer names
names(abd_weekly)
# they can be converted to date objects with as.Date
as.Date(names(abd_weekly))

# max seasonal abundance
abd_seasonal <- load_raster("yebsap-example", "abundance",
                            period = "seasonal", metric = "max",
                            resolution = "27km")
# available seasons in stack
names(abd_seasonal)
# subset to just breeding season abundance
abd_seasonal[["breeding"]]

## End(Not run)
Load seasonal summary statistics for regions consisting of countries and states/provinces.
load_regional_stats(species, path = ebirdst_data_dir())
species |
character; the species to load data for, given as a scientific
name, common name or six-letter species code (e.g. "woothr"). The full list
of valid species is in the ebirdst_runs data frame included in this
package. To download the example dataset, use |
path |
character; directory to download the data to. All downloaded
files will be placed in a sub-directory of this directory named for the
data version year, e.g. "2020" for the 2020 Status Data Products. Each
species' data package will then appear in a directory named with the eBird
species code. Defaults to a persistent data directory, which can be found
by calling |
A data frame containing regional summary statistics with columns:
species_code: alphanumeric eBird species code.
region_type: country for countries or state for states, provinces, or other sub-national regions.
region_code: alphanumeric code for the region.
region_name: English name of the region.
season: name of the season that the summary statistics were calculated for.
abundance_mean: mean relative abundance in the region.
total_pop_percent: proportion of the seasonal modeled population falling within the region.
range_percent_occupied: the proportion of the region occupied by the species during the given season.
range_total_percent: the proportion of the species' seasonal range falling within the region.
range_days_occupation: number of days of the season that the region was occupied by this species.
## Not run: 
# download example data if it hasn't already been downloaded
ebirdst_download_status("yebsap-example")

# load regional summary statistics
regional <- load_regional_stats("yebsap-example")

## End(Not run)
Load the relative abundance trend estimates for a single species or a set of
species. Trends are estimated on a 27 km by 27 km grid for a single season
per species (breeding, non-breeding, or resident). Note that data must be
downloaded using ebirdst_download_trends()
prior to loading it using this
function.
load_trends(species, fold_estimates = FALSE, path = ebirdst_data_dir())
species |
character; one or more species given as scientific names,
common names or six-letter species codes (e.g. "woothr"). The full list of
valid species can be viewed in the ebirdst_runs data frame included in
this package; species with trends estimates are indicated by the
|
fold_estimates |
logical; by default, the trends summarized across the
100-fold ensemble are returned; however, by setting |
path |
character; directory to download the data to. All downloaded
files will be placed in a sub-directory of this directory named for the
data version year, e.g. "2020" for the 2020 Status Data Products. Each
species' data package will then appear in a directory named with the eBird
species code. Defaults to a persistent data directory, which can be found
by calling |
The trends in relative abundance are estimated using a double machine
learning model. To quantify uncertainty, an ensemble of 100 estimates is made
at each location, each based on a random subsample of eBird data. The
estimated trend is the median across the ensemble, and the bounds of the 80%
confidence interval are the 10th and 90th percentiles across the ensemble.
To access estimates from the individual folds making up the ensemble use
fold_estimates = TRUE. These fold-level estimates can be used to quantify
uncertainty, for example, when calculating the trend for a given region. For
further details on the methodology used to estimate trends consult Fink et
al. 2023.
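The ensemble summary described above can be sketched in base R for a single grid cell. The fold-level values are simulated, so the numbers carry no meaning; the point is only how a median and an 80% interval are derived from 100 estimates.

```r
set.seed(1)

# simulated percent-per-year estimates for one grid cell, one per fold
fold_ppy <- rnorm(100, mean = -1.2, sd = 0.8)

# the trend estimate is the median across the ensemble
estimate <- median(fold_ppy)

# the 80% interval is bounded by the 10th and 90th percentiles
ci <- quantile(fold_ppy, probs = c(0.10, 0.90), names = FALSE)

print(c(estimate = estimate, lower = ci[1], upper = ci[2]))
```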
A data frame containing the trends estimates for a set of species. The following columns are included:
species_code: the alphanumeric eBird species code uniquely identifying the species.
season: season that the trend was estimated for: breeding, nonbreeding, or resident.
start_year/end_year: the start and end years of the trend time period.
start_date/end_date: the start and end dates (MM-DD format) of the season for which the trend was estimated.
srd_id: unique integer identifier for the grid cell.
longitude/latitude: longitude and latitude of the grid cell center.
abd: relative abundance estimate for the middle of the trend time period (e.g. 2014 for a 2007-2021 trend).
abd_ppy: the median estimated percent per year change in relative abundance.
abd_ppy_lower/abd_ppy_upper: the 80% confidence interval for the estimated percent per year change in relative abundance.
abd_ppy_nonzero: a logical (TRUE/FALSE) value indicating if the 80% confidence limits overlap zero (FALSE) or don't overlap zero (TRUE).
abd_trend: the median estimated cumulative change in relative abundance over the trend time period.
abd_trend_lower/abd_trend_upper: the 80% confidence interval for the estimated cumulative change in relative abundance over the trend time period.
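As an illustration of how abd_ppy_nonzero relates to the interval columns, the sketch below recomputes the flag from made-up interval bounds. The data frame is invented and only borrows the column names listed above.

```r
# made-up trend estimates for four grid cells
trends <- data.frame(
  srd_id = 1:4,
  abd_ppy = c(-2.0, 0.3, 1.5, -0.1),
  abd_ppy_lower = c(-3.1, -0.4, 0.2, -0.9),
  abd_ppy_upper = c(-0.8, 1.1, 2.9, 0.6)
)

# the interval excludes zero when it lies entirely above or entirely below it
nonzero <- trends$abd_ppy_lower > 0 | trends$abd_ppy_upper < 0

print(data.frame(srd_id = trends$srd_id, abd_ppy_nonzero = nonzero))
```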
If fold_estimates = TRUE, a data frame of fold-level trend estimates is
returned with the following columns:
species_code: the alphanumeric eBird species code uniquely identifying the species.
season: season that the trend was estimated for: breeding, nonbreeding, or resident.
srd_id: unique integer identifier for the grid cell.
abd: relative abundance estimate for the middle of the trend time period (e.g. 2014 for a 2007-2021 trend).
abd_ppy: the estimated percent per year change in relative abundance.
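One way fold-level estimates might be used to put an interval around a regional trend is sketched below with simulated data. Two things here are assumptions rather than documented behavior: the fold identifier column (called fold) is hypothetical, since it is not among the columns listed above, and weighting each cell's trend by its relative abundance is one plausible choice of regional average, not necessarily the recommended one.

```r
set.seed(1)
n_cells <- 5
n_folds <- 100

# simulated fold-level estimates for the cells of a hypothetical region
folds <- data.frame(
  fold = rep(seq_len(n_folds), each = n_cells),  # hypothetical column name
  srd_id = rep(seq_len(n_cells), times = n_folds),
  abd = rep(c(0.5, 1.0, 2.0, 0.2, 0.1), times = n_folds),
  abd_ppy = rnorm(n_cells * n_folds, mean = -1, sd = 1)
)

# abundance-weighted mean trend within each fold (assumed weighting)
by_fold <- sapply(split(folds, folds$fold), function(d) {
  sum(d$abd * d$abd_ppy) / sum(d$abd)
})

# summarize across folds: median with 10th and 90th percentiles
regional <- c(median = median(by_fold), quantile(by_fold, c(0.1, 0.9)))
print(regional)
```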
Fink, D., Johnston, A., Strimas-Mackey, M., Auer, T., Hochachka, W. M., Ligocki, S., Oldham Jaromczyk, L., Robinson, O., Wood, C., Kelling, S., & Rodewald, A. D. (2023). A Double machine learning trend model for citizen science data. Methods in Ecology and Evolution, 00, 1–14. https://doi.org/10.1111/2041-210X.14186
## Not run: 
# download example trends data if it hasn't already been downloaded
ebirdst_download_trends("yebsap-example")

# load trends
trends <- load_trends("yebsap-example")

# load fold-level estimates
trends_folds <- load_trends("yebsap-example", fold_estimates = TRUE)

## End(Not run)
The eBird trends data are stored in a tabular format, where each row gives
the trend estimate for a single cell in a 27 km x 27 km equal area grid. For
many applications, an explicitly spatial format is more useful. This function
uses the cell center coordinates to convert the tabular trend estimates to a
terra SpatRaster object.
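The idea of placing tabular estimates onto a grid by their cell centers can be sketched in base R with a plain matrix. This is not how rasterize_trends() is implemented; the x and y columns are hypothetical cell centers assumed to already be in the grid's projected coordinates (in km), and the result is a matrix rather than a SpatRaster.

```r
# made-up trend estimates with hypothetical projected cell centers (km)
trends <- data.frame(
  x = c(0, 27, 54, 0, 54),
  y = c(27, 27, 27, 0, 0),
  abd_ppy = c(-1.5, 0.2, 0.8, -0.3, 1.1)
)

res <- 27  # cell size in km

# regular sequences of cell centers; the first row holds the largest y
xs <- seq(min(trends$x), max(trends$x), by = res)
ys <- seq(max(trends$y), min(trends$y), by = -res)

# fill a matrix by matching each row to its grid row and column;
# cells without an estimate stay NA
grid <- matrix(NA_real_, nrow = length(ys), ncol = length(xs))
grid[cbind(match(trends$y, ys), match(trends$x, xs))] <- trends$abd_ppy

print(grid)
```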
rasterize_trends( trends, layers = c("abd_ppy", "abd_ppy_lower", "abd_ppy_upper"), trim = TRUE )
trends |
data frame; trends data for a single species as returned by
|
layers |
character; column names in the trends data frame to rasterize. These columns will become layers in the raster that is created. |
trim |
logical; flag indicating if the returned raster should be trimmed
to remove outer rows and columns that are NA. If |
A SpatRaster object.
## Not run: 
# download example trends data if it hasn't already been downloaded
ebirdst_download_trends("yebsap-example")

# load trends
trends <- load_trends("yebsap-example")

# rasterize percent per year trend
rasterize_trends(trends, "abd_ppy")

## End(Not run)
Accessing eBird Status and Trends data requires an access key, which can be
obtained by visiting https://ebird.org/st/request. This key must be stored as
the environment variable EBIRDST_KEY in order for ebirdst_download_status()
and ebirdst_download_trends() to use it. The easiest approach is to store the
key in your .Renviron file so it can always be accessed in your R sessions.
Use this function to set EBIRDST_KEY in your .Renviron file provided that it
is located in the standard location in your home directory. It is also
possible to manually edit the .Renviron file. The access key is specific to
you and should never be shared or made publicly accessible.
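For a quick check of whether the key is visible to the current session, base R's environment variable functions can be used, as in the sketch below. The value "XXXXXX" is a placeholder, and setting the variable with Sys.setenv() only lasts for the current session; it does not edit .Renviron.

```r
# set the key for the current R session only ("XXXXXX" is a placeholder)
Sys.setenv(EBIRDST_KEY = "XXXXXX")

# read it back; Sys.getenv() returns "" when the variable is not set
key <- Sys.getenv("EBIRDST_KEY")
key_is_set <- nzchar(key)

if (key_is_set) {
  message("EBIRDST_KEY is set for this session")
} else {
  message("EBIRDST_KEY is not set")
}
```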
set_ebirdst_access_key(key, overwrite = FALSE)
key |
character; API key obtained by filling out the form at https://ebird.org/st/request. |
overwrite |
logical; should the existing |
Edits .Renviron, then returns the path to this file invisibly.
## Not run: 
# save the api key, replace XXXXXX with your actual key
set_ebirdst_access_key("XXXXXX")

## End(Not run)