
A site index file records when and where specific ARUs were deployed. This function cleans a file (csv, xlsx) or data frame in preparation for adding these details to the output of clean_metadata(). It can be used to fill in missing information by date, such as GPS longitude/latitude and site ids.

Usage

clean_site_index(
  site_index,
  name_aru_id = "aru_id",
  name_site_id = "site_id",
  name_date_time = "date",
  name_coords = c("longitude", "latitude"),
  name_extra = NULL,
  resolve_overlaps = TRUE,
  quiet = FALSE
)

Arguments

site_index

(Spatial) Data frame or file path. Site index data to clean. If a file path, it must point to a local csv or xlsx file.

name_aru_id

Character. Name of the column that contains ARU ids. Default "aru_id".

name_site_id

Character. Name of the column that contains site ids. Default "site_id".

name_date_time

Character. Name of the column that contains dates or date/times. Can be a vector of two names if there are both 'start' and 'end' columns. Can be NULL to ignore dates. Default "date".

name_coords

Character. Column names that contain longitude and latitude (in that order). Ignored if site_index is spatial. Default c("longitude", "latitude").

name_extra

Character. Column names for extra data to include. If a named vector, will rename the columns (see examples). Default NULL.

resolve_overlaps

Logical. Whether or not to resolve date overlaps by shifting the start/end dates to noon (default TRUE). This assumes that ARUs are generally not deployed/removed at midnight (the official start/end of a day) and so noon is used as an approximation for when an ARU was deployed or removed. If possible, use specific deployment times to avoid this issue.

quiet

Logical. Whether to suppress progress messages and other non-essential updates. Default FALSE.
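
The noon shift described under resolve_overlaps can be illustrated with a minimal base R sketch. This is not the package's implementation; the data frame `sites` and the helpers `to_noon()` and `site_at()` are hypothetical names used only for illustration, and the choice of an inclusive start and exclusive end is an assumption.

```r
# Hypothetical site index with dates only (no deployment times).
# ARU "A1" is moved from site "S1" to site "S2" on 2020-05-10,
# so that day belongs to both date ranges.
sites <- data.frame(
  aru_id = c("A1", "A1"),
  site_id = c("S1", "S2"),
  date_start = as.Date(c("2020-05-01", "2020-05-10")),
  date_end = as.Date(c("2020-05-10", "2020-05-20")),
  stringsAsFactors = FALSE
)

# Shift a date to noon of that day (UTC is used only as a label here).
to_noon <- function(d) {
  as.POSIXct(paste(format(d), "12:00:00"), tz = "UTC")
}

sites$date_time_start <- to_noon(sites$date_start)
sites$date_time_end <- to_noon(sites$date_end)

# Look up the site for a recording time.
# Assumption: start is inclusive and end is exclusive.
site_at <- function(time) {
  time <- as.POSIXct(time, tz = "UTC")
  sites$site_id[time >= sites$date_time_start & time < sites$date_time_end]
}

site_at("2020-05-10 08:00:00") # morning of the move
site_at("2020-05-10 15:00:00") # afternoon of the move
```

With the ranges meeting at noon, the morning recording resolves to a single site (S1) and the afternoon recording to the other (S2), rather than both matching the shared date.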

Value

Standardized site index data frame

Details

Note that times are assumed to be in 'local' time and no timezone is applied. If a timezone is present, it is removed and replaced with UTC. This allows sites from different timezones to be processed at the same time.
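
As a rough illustration of what removing a timezone and replacing it with UTC means, the following base R sketch keeps the clock time and only changes the timezone label. It is an assumption about the behaviour described above, not the package's code; `strip_tz()` and the example timestamp are hypothetical.

```r
# A hypothetical recording time stamped in a local timezone.
recorded <- as.POSIXct("2020-05-01 06:30:00", tz = "America/Toronto")

# Keep the clock time, drop the timezone, and label the result as UTC.
# No conversion between timezones takes place.
strip_tz <- function(x) {
  as.POSIXct(format(x, "%Y-%m-%d %H:%M:%S"), tz = "UTC")
}

local_as_utc <- strip_tz(recorded)
format(local_as_utc, "%Y-%m-%d %H:%M:%S %Z")
```

The clock time stays at 06:30, so recordings from sites in different timezones can be compared against site index dates on the same 'local' footing.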

Examples


# Start and end dates in separate columns
s <- clean_site_index(example_sites,
  name_aru_id = "ARU",
  name_site_id = "Sites",
  name_date_time = c("Date_set_out", "Date_removed"),
  name_coords = c("lon", "lat")
)
#> There are overlapping date ranges
#>  Shifting start/end times to 'noon'
#>  Skip this with `resolve_overlaps = FALSE`

# Include the extra column "Plots", renamed via a named vector
s <- clean_site_index(example_sites,
  name_aru_id = "ARU",
  name_site_id = "Sites",
  name_date_time = c("Date_set_out", "Date_removed"),
  name_coords = c("lon", "lat"),
  name_extra = c("plot" = "Plots")
)
#> There are overlapping date ranges
#>  Shifting start/end times to 'noon'
#>  Skip this with `resolve_overlaps = FALSE`

# Without dates
eg <- dplyr::select(example_sites, -Date_set_out, -Date_removed)
s <- clean_site_index(eg,
  name_aru_id = "ARU",
  name_site_id = "Sites",
  name_date_time = NULL,
  name_coords = c("lon", "lat"),
  name_extra = c("plot" = "Plots")
)