Here we’ll cover some workflows using spatial data. For a general workflow, it’s best to read “Getting started with ARUtools” first.
We’ll start with our metadata data frame.
library(ARUtools)
library(sf)    # for st_as_sf()
library(dplyr) # for filter()

m <- clean_metadata(project_files = example_files)
#> Extracting ARU info...
#> Extracting Dates and Times...
m
#> # A tibble: 42 × 11
#> file_name type path aru_id manufacturer model aru_type site_id tz_offset
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 P01_1_20200… wav a_BA… BARLT… Frontier La… BAR-… BARLT P01_1 -0400
#> 2 P01_1_20200… wav a_BA… BARLT… Frontier La… BAR-… BARLT P01_1 -0400
#> 3 P02_1_20200… wav a_S4… S4A01… Wildlife Ac… Song… SongMet… P02_1 NA
#> 4 P02_1_20200… wav a_S4… S4A01… Wildlife Ac… Song… SongMet… P02_1 NA
#> 5 P03_1_20200… wav a_BA… BARLT… Frontier La… BAR-… BARLT P03_1 -0400
#> 6 P04_1_20200… wav a_BA… BARLT… Frontier La… BAR-… BARLT P04_1 -0400
#> 7 P04_1_20200… wav a_BA… BARLT… Frontier La… BAR-… BARLT P04_1 -0400
#> 8 P05_1_20200… wav a_BA… BARLT… Frontier La… BAR-… BARLT P05_1 -0400
#> 9 P06_1_20200… wav a_BA… BARLT… Frontier La… BAR-… BARLT P06_1 -0400
#> 10 P07_1_20200… wav a_S4… S4A01… Wildlife Ac… Song… SongMet… P07_1 NA
#> # ℹ 32 more rows
#> # ℹ 2 more variables: date_time <dttm>, date <date>
This data frame isn’t spatial yet because we don’t know where the sites are located, so our next step is to get the site coordinates.
Let’s assume we have a spatial data frame containing our sites and where they are located.
s <- st_as_sf(example_sites, coords = c("lon", "lat"), crs = 4326)
s
#> Simple feature collection with 10 features and 6 fields
#> Geometry type: POINT
#> Dimension: XY
#> Bounding box: xmin: -91.38 ymin: 45 xmax: -84.45 ymax: 52.68
#> Geodetic CRS: WGS 84
#> Sites Date_set_out Date_removed ARU Plots Subplot
#> 1 P01_1 2020-05-01 2020-05-03 BARLT10962 Plot1 a
#> 2 P02_1 2020-05-03 2020-05-05 S4A01234 Plot1 a
#> 3 P03_1 2020-05-05 2020-05-06 BARLT10962 Plot2 a
#> 4 P04_1 2020-05-05 2020-05-07 BARLT11111 Plot2 a
#> 5 P05_1 2020-05-06 2020-05-07 BARLT10962 Plot3 b
#> 6 P06_1 2020-05-08 2020-05-09 BARLT10962 Plot1 a
#> 7 P07_1 2020-05-08 2020-05-10 S4A01234 Plot1 a
#> 8 P08_1 2020-05-10 2020-05-11 BARLT10962 Plot2 a
#> 9 P09_1 2020-05-10 2020-05-11 S4A02222 Plot2 a
#> 10 P10_1 2020-05-10 2020-05-11 S4A03333 Plot3 b
#> geometry
#> 1 POINT (-85.03 50.01)
#> 2 POINT (-87.45 52.68)
#> 3 POINT (-90.38 48.99)
#> 4 POINT (-85.53 45)
#> 5 POINT (-88.45 51.05)
#> 6 POINT (-90.08 52)
#> 7 POINT (-86.03 50.45)
#> 8 POINT (-84.45 48.999)
#> 9 POINT (-91.38 45)
#> 10 POINT (-90 50.01)
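If your own site list is a plain data frame, the same conversion applies. The following is a minimal sketch using a made-up two-row table (`toy_sites` and its values are hypothetical, not part of the package's example data). Note that `coords` is given in x, y order, so longitude comes first.

```r
library(sf)

# Hypothetical site table; names and coordinates are made up for illustration
toy_sites <- data.frame(
  site = c("A", "B"),
  lon = c(-85.03, -87.45),
  lat = c(50.01, 52.68)
)

# coords is given in x, y order: longitude first, then latitude
toy_sf <- st_as_sf(toy_sites, coords = c("lon", "lat"), crs = 4326)

st_crs(toy_sf)$epsg     # EPSG code of the assigned CRS
st_coordinates(toy_sf)  # matrix with columns X (longitude) and Y (latitude)
```

Checking the CRS and coordinates right after conversion is a cheap way to catch swapped longitude/latitude columns before joining anything.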
As in a non-spatial workflow, we’ll clean up this site index so we can add the sites to our metadata.
sites <- clean_site_index(s,
  name_aru_id = "ARU",
  name_site_id = "Sites",
  name_date_time = c("Date_set_out", "Date_removed")
)
#> There are overlapping date ranges
#> • Shifting start/end times to 'noon'
#> • Skip this with `resolve_overlaps = FALSE`
sites
#> Simple feature collection with 10 features and 6 fields
#> Geometry type: POINT
#> Dimension: XY
#> Bounding box: xmin: -91.38 ymin: 45 xmax: -84.45 ymax: 52.68
#> Geodetic CRS: WGS 84
#> # A tibble: 10 × 7
#> site_id aru_id date_time_start date_time_end date_start date_end
#> * <chr> <chr> <dttm> <dttm> <date> <date>
#> 1 P01_1 BARLT1… 2020-05-01 12:00:00 2020-05-03 12:00:00 2020-05-01 2020-05-03
#> 2 P02_1 S4A012… 2020-05-03 12:00:00 2020-05-05 12:00:00 2020-05-03 2020-05-05
#> 3 P03_1 BARLT1… 2020-05-05 12:00:00 2020-05-06 12:00:00 2020-05-05 2020-05-06
#> 4 P04_1 BARLT1… 2020-05-05 12:00:00 2020-05-07 12:00:00 2020-05-05 2020-05-07
#> 5 P05_1 BARLT1… 2020-05-06 12:00:00 2020-05-07 12:00:00 2020-05-06 2020-05-07
#> 6 P06_1 BARLT1… 2020-05-08 12:00:00 2020-05-09 12:00:00 2020-05-08 2020-05-09
#> 7 P07_1 S4A012… 2020-05-08 12:00:00 2020-05-10 12:00:00 2020-05-08 2020-05-10
#> 8 P08_1 BARLT1… 2020-05-10 12:00:00 2020-05-11 12:00:00 2020-05-10 2020-05-11
#> 9 P09_1 S4A022… 2020-05-10 12:00:00 2020-05-11 12:00:00 2020-05-10 2020-05-11
#> 10 P10_1 S4A033… 2020-05-10 12:00:00 2020-05-11 12:00:00 2020-05-10 2020-05-11
#> # ℹ 1 more variable: geometry <POINT [°]>
Note that we still have a spatial data set.
Now let’s add this site-related information to our metadata.
m <- add_sites(m, sites)
#> Joining by columns `date_time_start` and `date_time_end`
m
#> Simple feature collection with 42 features and 11 fields
#> Geometry type: POINT
#> Dimension: XY
#> Bounding box: xmin: -91.38 ymin: 45 xmax: -84.45 ymax: 52.68
#> Geodetic CRS: WGS 84
#> # A tibble: 42 × 12
#> file_name type path aru_id manufacturer model aru_type site_id tz_offset
#> * <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 P01_1_20200… wav a_BA… BARLT… Frontier La… BAR-… BARLT P01_1 -0400
#> 2 P01_1_20200… wav a_BA… BARLT… Frontier La… BAR-… BARLT P01_1 -0400
#> 3 P02_1_20200… wav a_S4… S4A01… Wildlife Ac… Song… SongMet… P02_1 NA
#> 4 P02_1_20200… wav a_S4… S4A01… Wildlife Ac… Song… SongMet… P02_1 NA
#> 5 P03_1_20200… wav a_BA… BARLT… Frontier La… BAR-… BARLT P03_1 -0400
#> 6 P04_1_20200… wav a_BA… BARLT… Frontier La… BAR-… BARLT P04_1 -0400
#> 7 P04_1_20200… wav a_BA… BARLT… Frontier La… BAR-… BARLT P04_1 -0400
#> 8 P05_1_20200… wav a_BA… BARLT… Frontier La… BAR-… BARLT P05_1 -0400
#> 9 P06_1_20200… wav a_BA… BARLT… Frontier La… BAR-… BARLT P06_1 -0400
#> 10 P07_1_20200… wav a_S4… S4A01… Wildlife Ac… Song… SongMet… P07_1 NA
#> # ℹ 32 more rows
#> # ℹ 3 more variables: date_time <dttm>, date <date>, geometry <POINT [°]>
Again, our output is a spatial data set.
Let’s continue by adding the time to sunrise and sunset (the t2sr and t2ss columns).
m <- calc_sun(m)
m
#> Simple feature collection with 42 features and 14 fields
#> Geometry type: POINT
#> Dimension: XY
#> Bounding box: xmin: -91.38 ymin: 45 xmax: -84.45 ymax: 52.68
#> Geodetic CRS: WGS 84
#> # A tibble: 42 × 15
#> file_name type path aru_id manufacturer model aru_type site_id tz_offset
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 P01_1_20200… wav a_BA… BARLT… Frontier La… BAR-… BARLT P01_1 -0400
#> 2 P01_1_20200… wav a_BA… BARLT… Frontier La… BAR-… BARLT P01_1 -0400
#> 3 P02_1_20200… wav a_S4… S4A01… Wildlife Ac… Song… SongMet… P02_1 NA
#> 4 P02_1_20200… wav a_S4… S4A01… Wildlife Ac… Song… SongMet… P02_1 NA
#> 5 P03_1_20200… wav a_BA… BARLT… Frontier La… BAR-… BARLT P03_1 -0400
#> 6 P04_1_20200… wav a_BA… BARLT… Frontier La… BAR-… BARLT P04_1 -0400
#> 7 P04_1_20200… wav a_BA… BARLT… Frontier La… BAR-… BARLT P04_1 -0400
#> 8 P05_1_20200… wav a_BA… BARLT… Frontier La… BAR-… BARLT P05_1 -0400
#> 9 P06_1_20200… wav a_BA… BARLT… Frontier La… BAR-… BARLT P06_1 -0400
#> 10 P07_1_20200… wav a_S4… S4A01… Wildlife Ac… Song… SongMet… P07_1 NA
#> # ℹ 32 more rows
#> # ℹ 6 more variables: date_time <dttm>, date <date>, tz <chr>, t2sr <dbl>,
#> # t2ss <dbl>, geometry <POINT [°]>
All done! And we’ve retained a spatial data set the entire way.
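If you want to confirm that an object is still spatial, or you need a plain data frame for a later step, a sketch along these lines may help. It uses a hypothetical toy object rather than the metadata above, and the column names `longitude` and `latitude` are simply chosen for illustration.

```r
library(sf)

# Hypothetical spatial data set standing in for the metadata
toy <- st_as_sf(
  data.frame(site = c("A", "B"), lon = c(-85.03, -87.45), lat = c(50.01, 52.68)),
  coords = c("lon", "lat"), crs = 4326
)

inherits(toy, "sf") # TRUE while the object is still spatial

# Convert back to a plain data frame, keeping the coordinates as columns
xy <- st_coordinates(toy)
plain <- st_drop_geometry(toy)
plain$longitude <- xy[, "X"]
plain$latitude <- xy[, "Y"]

inherits(plain, "sf") # FALSE once the geometry has been dropped
```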
Problems
However, spatial data sets can sometimes be trickier to use.
For example, sf spatial data sets cannot have missing coordinates. This means that if you try to add an incomplete list of sites with the add_sites() function, you’ll get a warning and a non-spatial data frame back.
m <- clean_metadata(project_files = example_files)
#> Extracting ARU info...
#> Extracting Dates and Times...
sites <- st_as_sf(example_sites, coords = c("lon", "lat"), crs = 4326) |>
  clean_site_index(
    name_aru_id = "ARU",
    name_site_id = "Sites",
    name_date_time = c("Date_set_out", "Date_removed")
  )
#> There are overlapping date ranges
#> • Shifting start/end times to 'noon'
#> • Skip this with `resolve_overlaps = FALSE`
sites <- sites[-1, ] # Omit that first site
m <- add_sites(m, sites)
#> Joining by columns `date_time_start` and `date_time_end`
#> Identified possible problems with metadata extraction:
#> ✖ Not all files were matched to a site reference (6/42)
#> • Consider adjusting the `by` argument
#> Warning in add_sites(m, sites): Cannot have missing coordinates in spatial data frames
#> • Returning non-spatial data frame
m
#> # A tibble: 42 × 13
#> file_name type path aru_id manufacturer model aru_type site_id tz_offset
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 P01_1_20200… wav a_BA… BARLT… Frontier La… BAR-… BARLT P01_1 -0400
#> 2 P01_1_20200… wav a_BA… BARLT… Frontier La… BAR-… BARLT P01_1 -0400
#> 3 P02_1_20200… wav a_S4… S4A01… Wildlife Ac… Song… SongMet… P02_1 NA
#> 4 P02_1_20200… wav a_S4… S4A01… Wildlife Ac… Song… SongMet… P02_1 NA
#> 5 P03_1_20200… wav a_BA… BARLT… Frontier La… BAR-… BARLT P03_1 -0400
#> 6 P04_1_20200… wav a_BA… BARLT… Frontier La… BAR-… BARLT P04_1 -0400
#> 7 P04_1_20200… wav a_BA… BARLT… Frontier La… BAR-… BARLT P04_1 -0400
#> 8 P05_1_20200… wav a_BA… BARLT… Frontier La… BAR-… BARLT P05_1 -0400
#> 9 P06_1_20200… wav a_BA… BARLT… Frontier La… BAR-… BARLT P06_1 -0400
#> 10 P07_1_20200… wav a_S4… S4A01… Wildlife Ac… Song… SongMet… P07_1 NA
#> # ℹ 32 more rows
#> # ℹ 4 more variables: date_time <dttm>, date <date>, longitude <dbl>,
#> # latitude <dbl>
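The underlying restriction can be seen with sf alone. This is a small sketch on a hypothetical toy data frame; it shows how `st_as_sf()` reacts to missing coordinates by default, and it is only an assumption that this is the behaviour add_sites() is working around when it falls back to a non-spatial data frame.

```r
library(sf)

# Hypothetical site table where the second site has no coordinates
toy <- data.frame(
  site = c("A", "B"),
  lon = c(-85.03, NA),
  lat = c(50.01, NA)
)

# With the default na.fail = TRUE, missing coordinates raise an error,
# which we capture here as a message instead of stopping
result <- tryCatch(
  st_as_sf(toy, coords = c("lon", "lat"), crs = 4326),
  error = function(e) conditionMessage(e)
)
result
```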
To resolve this, either add the missing site information or omit the unmatched files before joining.
m <- clean_metadata(project_files = example_files) |>
  filter(date > "2020-05-03") # Filter out recordings that don't match a site
#> Extracting ARU info...
#> Extracting Dates and Times...
m <- add_sites(m, sites)
#> Joining by columns `date_time_start` and `date_time_end`
m
#> Simple feature collection with 36 features and 11 fields
#> Geometry type: POINT
#> Dimension: XY
#> Bounding box: xmin: -91.38 ymin: 45 xmax: -84.45 ymax: 52.68
#> Geodetic CRS: WGS 84
#> # A tibble: 36 × 12
#> file_name type path aru_id manufacturer model aru_type site_id tz_offset
#> * <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 P02_1_20200… wav a_S4… S4A01… Wildlife Ac… Song… SongMet… P02_1 NA
#> 2 P02_1_20200… wav a_S4… S4A01… Wildlife Ac… Song… SongMet… P02_1 NA
#> 3 P03_1_20200… wav a_BA… BARLT… Frontier La… BAR-… BARLT P03_1 -0400
#> 4 P04_1_20200… wav a_BA… BARLT… Frontier La… BAR-… BARLT P04_1 -0400
#> 5 P04_1_20200… wav a_BA… BARLT… Frontier La… BAR-… BARLT P04_1 -0400
#> 6 P05_1_20200… wav a_BA… BARLT… Frontier La… BAR-… BARLT P05_1 -0400
#> 7 P06_1_20200… wav a_BA… BARLT… Frontier La… BAR-… BARLT P06_1 -0400
#> 8 P07_1_20200… wav a_S4… S4A01… Wildlife Ac… Song… SongMet… P07_1 NA
#> 9 P07_1_20200… wav a_S4… S4A01… Wildlife Ac… Song… SongMet… P07_1 NA
#> 10 P08_1_20200… wav a_BA… BARLT… Frontier La… BAR-… BARLT P08_1 -0400
#> # ℹ 26 more rows
#> # ℹ 3 more variables: date_time <dttm>, date <date>, geometry <POINT [°]>
Fixed!
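A variation on the second option is to work from the non-spatial result instead: find the rows without coordinates, set them aside, and convert the rest back to a spatial data set. The sketch below uses a hypothetical stand-in (`toy_m`) with the same `longitude` and `latitude` column names that appeared in the non-spatial output above; it is an illustration under those assumptions rather than a built-in ARUtools step.

```r
library(sf)
library(dplyr)

# Hypothetical stand-in for the non-spatial data frame returned by add_sites()
toy_m <- tibble(
  file_name = c("a.wav", "b.wav", "c.wav"),
  longitude = c(-85.03, NA, -87.45),
  latitude = c(50.01, NA, 52.68)
)

# Files that were not matched to a site have no coordinates
unmatched <- filter(toy_m, is.na(longitude) | is.na(latitude))
matched <- filter(toy_m, !is.na(longitude), !is.na(latitude))

# Only the complete rows can be converted to a spatial data set
matched_sf <- st_as_sf(matched, coords = c("longitude", "latitude"), crs = 4326)

unmatched$file_name # files to investigate or drop
nrow(matched_sf)    # number of files kept
```

Keeping `unmatched` around makes it easy to see which recordings still need site information.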