A site index file records when and where specific ARUs were deployed.
This function cleans a file (csv, xlsx) or data frame in preparation
for adding these details to the output of clean_metadata(). It can be used
to fill in missing information by date, such as GPS coordinates
(longitude/latitude) and site ids.
Usage
clean_site_index(
  site_index,
  name_aru_id = "aru_id",
  name_site_id = "site_id",
  name_date_time = "date",
  name_coords = c("longitude", "latitude"),
  name_extra = NULL,
  resolve_overlaps = TRUE,
  quiet = FALSE
)
Arguments
- site_index
(Spatial) Data frame or file path. Site index data to clean. If a file path, it must point to a local csv or xlsx file.
- name_aru_id
Character. Name of the column that contains ARU ids. Default "aru_id".
- name_site_id
Character. Name of the column that contains site ids. Default "site_id".
- name_date_time
Character. Name of the column that contains dates or date/times. Can be a vector of two names if there are both 'start' and 'end' columns. Can be NULL to ignore dates. Default "date".
- name_coords
Character. Names of the columns that contain longitude and latitude (in that order). Ignored if site_index is spatial. Default c("longitude", "latitude").
- name_extra
Character. Column names for extra data to include. If a named vector, will rename the columns (see examples). Default NULL.
- resolve_overlaps
Logical. Whether or not to resolve date overlaps by shifting the start/end dates to noon (default TRUE). This assumes that ARUs are generally not deployed/removed at midnight (the official start/end of a day), so noon is used as an approximation for when an ARU was deployed or removed. If possible, use specific deployment times to avoid this issue.
- quiet
Logical. Whether to suppress progress messages and other non-essential updates. Default FALSE.
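To illustrate the idea behind resolve_overlaps, here is a minimal base R sketch. It is not the package's implementation: the data frame `deploy`, the helper `to_noon()`, and the half-open interval used for matching are all assumptions made for illustration. One ARU is removed from a site on the same day it is set out at the next, so the two date ranges share a day; shifting both boundaries to noon separates them.

```r
# Hypothetical deployments: ARU "A1" moves from site P01 to P02 on 2020-05-10
deploy <- data.frame(
  aru_id  = c("A1", "A1"),
  site_id = c("P01", "P02"),
  start   = as.Date(c("2020-05-01", "2020-05-10")),
  end     = as.Date(c("2020-05-10", "2020-05-20"))
)

# Treated as whole days, 2020-05-10 belongs to both deployments
overlap_days <- deploy$end[1] >= deploy$start[2]

# Shift both boundaries to noon (UTC is only a neutral label here)
to_noon <- function(d) as.POSIXct(paste(format(d), "12:00:00"), tz = "UTC")
deploy$start_noon <- to_noon(deploy$start)
deploy$end_noon   <- to_noon(deploy$end)

# A recording made at 08:00 on the changeover day now falls in one range only
# (assumed matching rule: start <= recording < end)
rec <- as.POSIXct("2020-05-10 08:00:00", tz = "UTC")
matched <- deploy$site_id[rec >= deploy$start_noon & rec < deploy$end_noon]
matched
```

A morning recording on the changeover day is attributed to the earlier site and an afternoon one to the later site. This is only an approximation, which is why supplying real deployment times is preferable.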
Details
Note that times are assumed to be in 'local' time, so no timezone is used: any timezone present is removed and replaced with UTC. This allows sites from different timezones to be processed at the same time.
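As a rough illustration of what dropping a timezone while keeping local time means, the base R sketch below relabels a timestamp as UTC without changing its clock reading. The timestamp and variable names are hypothetical, and whether the package does this the same way internally is an assumption; lubridate::force_tz() offers an equivalent operation.

```r
# A hypothetical timestamp recorded with an explicit timezone
t_local <- as.POSIXct("2020-05-10 06:30:00", tz = "America/Toronto")

# Keep the wall-clock reading and relabel it as UTC:
# the clock time stays 06:30, only the timezone label changes
t_utc_label <- as.POSIXct(format(t_local, "%Y-%m-%d %H:%M:%S"), tz = "UTC")

format(t_utc_label, "%Y-%m-%d %H:%M:%S")
attr(t_utc_label, "tzone")
```

Because every site keeps its own clock reading under a single shared label, recordings from different timezones can sit in one column and still be compared against local deployment times.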
Examples
# Start and end dates in separate columns
s <- clean_site_index(example_sites,
  name_aru_id = "ARU",
  name_site_id = "Sites",
  name_date_time = c("Date_set_out", "Date_removed"),
  name_coords = c("lon", "lat")
)
#> There are overlapping date ranges
#> • Shifting start/end times to 'noon'
#> • Skip this with `resolve_overlaps = FALSE`
# Include extra data, renamed via a named vector
s <- clean_site_index(example_sites,
  name_aru_id = "ARU",
  name_site_id = "Sites",
  name_date_time = c("Date_set_out", "Date_removed"),
  name_coords = c("lon", "lat"),
  name_extra = c("plot" = "Plots")
)
#> There are overlapping date ranges
#> • Shifting start/end times to 'noon'
#> • Skip this with `resolve_overlaps = FALSE`
# Without dates
eg <- dplyr::select(example_sites, -Date_set_out, -Date_removed)
s <- clean_site_index(eg,
  name_aru_id = "ARU",
  name_site_id = "Sites",
  name_date_time = NULL,
  name_coords = c("lon", "lat"),
  name_extra = c("plot" = "Plots")
)