
Here we’ll cover some workflows using spatial data. For a general workflow, take a look at the “Getting started with ARUtools” vignette first.

We’ll start with our metadata data frame.

library(ARUtools)
library(dplyr) # for filter()
library(sf)    # for st_as_sf()

m <- clean_metadata(project_files = example_files)
#> Extracting ARU info...
#> Extracting Dates and Times...
m
#> # A tibble: 42 × 11
#>    file_name    type  path  aru_id manufacturer model aru_type site_id tz_offset
#>    <chr>        <chr> <chr> <chr>  <chr>        <chr> <chr>    <chr>   <chr>    
#>  1 P01_1_20200… wav   a_BA… BARLT… Frontier La… BAR-… BARLT    P01_1   -0400    
#>  2 P01_1_20200… wav   a_BA… BARLT… Frontier La… BAR-… BARLT    P01_1   -0400    
#>  3 P02_1_20200… wav   a_S4… S4A01… Wildlife Ac… Song… SongMet… P02_1   NA       
#>  4 P02_1_20200… wav   a_S4… S4A01… Wildlife Ac… Song… SongMet… P02_1   NA       
#>  5 P03_1_20200… wav   a_BA… BARLT… Frontier La… BAR-… BARLT    P03_1   -0400    
#>  6 P04_1_20200… wav   a_BA… BARLT… Frontier La… BAR-… BARLT    P04_1   -0400    
#>  7 P04_1_20200… wav   a_BA… BARLT… Frontier La… BAR-… BARLT    P04_1   -0400    
#>  8 P05_1_20200… wav   a_BA… BARLT… Frontier La… BAR-… BARLT    P05_1   -0400    
#>  9 P06_1_20200… wav   a_BA… BARLT… Frontier La… BAR-… BARLT    P06_1   -0400    
#> 10 P07_1_20200… wav   a_S4… S4A01… Wildlife Ac… Song… SongMet… P07_1   NA       
#> # ℹ 32 more rows
#> # ℹ 2 more variables: date_time <dttm>, date <date>

This isn’t spatial yet because we don’t actually know where the sites are located. Our next step is to get the site coordinates.

Let’s assume we have a spatial data frame containing our sites and where they are located.

s <- st_as_sf(example_sites, coords = c("lon", "lat"), crs = 4326)
s
#> Simple feature collection with 10 features and 6 fields
#> Geometry type: POINT
#> Dimension:     XY
#> Bounding box:  xmin: -91.38 ymin: 45 xmax: -84.45 ymax: 52.68
#> Geodetic CRS:  WGS 84
#>    Sites Date_set_out Date_removed        ARU Plots Subplot
#> 1  P01_1   2020-05-01   2020-05-03 BARLT10962 Plot1       a
#> 2  P02_1   2020-05-03   2020-05-05   S4A01234 Plot1       a
#> 3  P03_1   2020-05-05   2020-05-06 BARLT10962 Plot2       a
#> 4  P04_1   2020-05-05   2020-05-07 BARLT11111 Plot2       a
#> 5  P05_1   2020-05-06   2020-05-07 BARLT10962 Plot3       b
#> 6  P06_1   2020-05-08   2020-05-09 BARLT10962 Plot1       a
#> 7  P07_1   2020-05-08   2020-05-10   S4A01234 Plot1       a
#> 8  P08_1   2020-05-10   2020-05-11 BARLT10962 Plot2       a
#> 9  P09_1   2020-05-10   2020-05-11   S4A02222 Plot2       a
#> 10 P10_1   2020-05-10   2020-05-11   S4A03333 Plot3       b
#>                 geometry
#> 1   POINT (-85.03 50.01)
#> 2   POINT (-87.45 52.68)
#> 3   POINT (-90.38 48.99)
#> 4      POINT (-85.53 45)
#> 5   POINT (-88.45 51.05)
#> 6      POINT (-90.08 52)
#> 7   POINT (-86.03 50.45)
#> 8  POINT (-84.45 48.999)
#> 9      POINT (-91.38 45)
#> 10     POINT (-90 50.01)

Similar to a non-spatial workflow, we’ll clean up this list so we can add these sites to our metadata.

sites <- clean_site_index(s,
  name_aru_id = "ARU",
  name_site_id = "Sites",
  name_date_time = c("Date_set_out", "Date_removed")
)
#> There are overlapping date ranges
#>  Shifting start/end times to 'noon'
#>  Skip this with `resolve_overlaps = FALSE`
sites
#> Simple feature collection with 10 features and 6 fields
#> Geometry type: POINT
#> Dimension:     XY
#> Bounding box:  xmin: -91.38 ymin: 45 xmax: -84.45 ymax: 52.68
#> Geodetic CRS:  WGS 84
#> # A tibble: 10 × 7
#>    site_id aru_id  date_time_start     date_time_end       date_start date_end  
#>  * <chr>   <chr>   <dttm>              <dttm>              <date>     <date>    
#>  1 P01_1   BARLT1… 2020-05-01 12:00:00 2020-05-03 12:00:00 2020-05-01 2020-05-03
#>  2 P02_1   S4A012… 2020-05-03 12:00:00 2020-05-05 12:00:00 2020-05-03 2020-05-05
#>  3 P03_1   BARLT1… 2020-05-05 12:00:00 2020-05-06 12:00:00 2020-05-05 2020-05-06
#>  4 P04_1   BARLT1… 2020-05-05 12:00:00 2020-05-07 12:00:00 2020-05-05 2020-05-07
#>  5 P05_1   BARLT1… 2020-05-06 12:00:00 2020-05-07 12:00:00 2020-05-06 2020-05-07
#>  6 P06_1   BARLT1… 2020-05-08 12:00:00 2020-05-09 12:00:00 2020-05-08 2020-05-09
#>  7 P07_1   S4A012… 2020-05-08 12:00:00 2020-05-10 12:00:00 2020-05-08 2020-05-10
#>  8 P08_1   BARLT1… 2020-05-10 12:00:00 2020-05-11 12:00:00 2020-05-10 2020-05-11
#>  9 P09_1   S4A022… 2020-05-10 12:00:00 2020-05-11 12:00:00 2020-05-10 2020-05-11
#> 10 P10_1   S4A033… 2020-05-10 12:00:00 2020-05-11 12:00:00 2020-05-10 2020-05-11
#> # ℹ 1 more variable: geometry <POINT [°]>

Note that we still have a spatial data set.
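If you want to confirm this in a script rather than by reading the printed header, one option is to check the object's class and coordinate reference system. The sketch below uses a hypothetical two-site table (`toy_sites`) in place of `example_sites`; only the sf calls are real, the data are made up for illustration.

```r
library(sf)

# Hypothetical two-site table standing in for example_sites
toy_sites <- data.frame(
  Sites = c("P01_1", "P02_1"),
  lon = c(-85.03, -87.45),
  lat = c(50.01, 52.68)
)

# Same conversion as above: lon/lat columns become a POINT geometry in WGS 84
toy_sf <- st_as_sf(toy_sites, coords = c("lon", "lat"), crs = 4326)

# TRUE when the object still carries the sf class
is_spatial <- inherits(toy_sf, "sf")

# EPSG code of the coordinate reference system
epsg <- st_crs(toy_sf)$epsg
```

The same two checks should work on `sites` or `m` at any point in the workflow.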

Now let’s add this site-related information to our metadata.

m <- add_sites(m, sites)
#> Joining by columns `date_time_start` and `date_time_end`
m
#> Simple feature collection with 42 features and 11 fields
#> Geometry type: POINT
#> Dimension:     XY
#> Bounding box:  xmin: -91.38 ymin: 45 xmax: -84.45 ymax: 52.68
#> Geodetic CRS:  WGS 84
#> # A tibble: 42 × 12
#>    file_name    type  path  aru_id manufacturer model aru_type site_id tz_offset
#>  * <chr>        <chr> <chr> <chr>  <chr>        <chr> <chr>    <chr>   <chr>    
#>  1 P01_1_20200… wav   a_BA… BARLT… Frontier La… BAR-… BARLT    P01_1   -0400    
#>  2 P01_1_20200… wav   a_BA… BARLT… Frontier La… BAR-… BARLT    P01_1   -0400    
#>  3 P02_1_20200… wav   a_S4… S4A01… Wildlife Ac… Song… SongMet… P02_1   NA       
#>  4 P02_1_20200… wav   a_S4… S4A01… Wildlife Ac… Song… SongMet… P02_1   NA       
#>  5 P03_1_20200… wav   a_BA… BARLT… Frontier La… BAR-… BARLT    P03_1   -0400    
#>  6 P04_1_20200… wav   a_BA… BARLT… Frontier La… BAR-… BARLT    P04_1   -0400    
#>  7 P04_1_20200… wav   a_BA… BARLT… Frontier La… BAR-… BARLT    P04_1   -0400    
#>  8 P05_1_20200… wav   a_BA… BARLT… Frontier La… BAR-… BARLT    P05_1   -0400    
#>  9 P06_1_20200… wav   a_BA… BARLT… Frontier La… BAR-… BARLT    P06_1   -0400    
#> 10 P07_1_20200… wav   a_S4… S4A01… Wildlife Ac… Song… SongMet… P07_1   NA       
#> # ℹ 32 more rows
#> # ℹ 3 more variables: date_time <dttm>, date <date>, geometry <POINT [°]>

Again, our output is a spatial data set.

Let’s continue by calculating the time to sunrise and sunset for each recording.

m <- calc_sun(m)
m
#> Simple feature collection with 42 features and 14 fields
#> Geometry type: POINT
#> Dimension:     XY
#> Bounding box:  xmin: -91.38 ymin: 45 xmax: -84.45 ymax: 52.68
#> Geodetic CRS:  WGS 84
#> # A tibble: 42 × 15
#>    file_name    type  path  aru_id manufacturer model aru_type site_id tz_offset
#>    <chr>        <chr> <chr> <chr>  <chr>        <chr> <chr>    <chr>   <chr>    
#>  1 P01_1_20200… wav   a_BA… BARLT… Frontier La… BAR-… BARLT    P01_1   -0400    
#>  2 P01_1_20200… wav   a_BA… BARLT… Frontier La… BAR-… BARLT    P01_1   -0400    
#>  3 P02_1_20200… wav   a_S4… S4A01… Wildlife Ac… Song… SongMet… P02_1   NA       
#>  4 P02_1_20200… wav   a_S4… S4A01… Wildlife Ac… Song… SongMet… P02_1   NA       
#>  5 P03_1_20200… wav   a_BA… BARLT… Frontier La… BAR-… BARLT    P03_1   -0400    
#>  6 P04_1_20200… wav   a_BA… BARLT… Frontier La… BAR-… BARLT    P04_1   -0400    
#>  7 P04_1_20200… wav   a_BA… BARLT… Frontier La… BAR-… BARLT    P04_1   -0400    
#>  8 P05_1_20200… wav   a_BA… BARLT… Frontier La… BAR-… BARLT    P05_1   -0400    
#>  9 P06_1_20200… wav   a_BA… BARLT… Frontier La… BAR-… BARLT    P06_1   -0400    
#> 10 P07_1_20200… wav   a_S4… S4A01… Wildlife Ac… Song… SongMet… P07_1   NA       
#> # ℹ 32 more rows
#> # ℹ 6 more variables: date_time <dttm>, date <date>, tz <chr>, t2sr <dbl>,
#> #   t2ss <dbl>, geometry <POINT [°]>

All done! And we’ve retained a spatial data set the entire way.
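If a later step needs plain longitude and latitude columns (for example, writing to a CSV file), the coordinates can be pulled back out of the geometry. This is a minimal sketch using sf directly on a hypothetical two-row object (`toy_m`); it is not an ARUtools function, and the column names `longitude` and `latitude` are chosen here for illustration.

```r
library(sf)

# Hypothetical stand-in for the spatial metadata
toy_m <- st_as_sf(
  data.frame(
    file_name = c("a.wav", "b.wav"),
    lon = c(-85.03, -87.45),
    lat = c(50.01, 52.68)
  ),
  coords = c("lon", "lat"), crs = 4326
)

# st_coordinates() returns a matrix with X and Y columns for POINT geometries
xy <- st_coordinates(toy_m)

# Drop the geometry column, then attach the coordinates as ordinary columns
flat <- st_drop_geometry(toy_m)
flat$longitude <- unname(xy[, "X"])
flat$latitude <- unname(xy[, "Y"])
```

After this, `flat` is a regular data frame and no longer prints with the spatial header.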

Problems

However, sometimes spatial data sets might be trickier to use.

For example, sf spatial data sets cannot have missing coordinates. This means that if you use add_sites() with an incomplete list of sites, you’ll get a warning and a non-spatial data frame back.

m <- clean_metadata(project_files = example_files)
#> Extracting ARU info...
#> Extracting Dates and Times...

sites <- st_as_sf(example_sites, coords = c("lon", "lat"), crs = 4326) |>
  clean_site_index(
    name_aru_id = "ARU",
    name_site_id = "Sites",
    name_date_time = c("Date_set_out", "Date_removed")
  )
#> There are overlapping date ranges
#>  Shifting start/end times to 'noon'
#>  Skip this with `resolve_overlaps = FALSE`

sites <- sites[-1, ] # Omit that first site

m <- add_sites(m, sites)
#> Joining by columns `date_time_start` and `date_time_end`
#> Identified possible problems with metadata extraction:
#>  Not all files were matched to a site reference (6/42)
#>  Consider adjusting the `by` argument
#> Warning in add_sites(m, sites): Cannot have missing coordinates in spatial data frames
#>  Returning non-spatial data frame
m
#> # A tibble: 42 × 13
#>    file_name    type  path  aru_id manufacturer model aru_type site_id tz_offset
#>    <chr>        <chr> <chr> <chr>  <chr>        <chr> <chr>    <chr>   <chr>    
#>  1 P01_1_20200… wav   a_BA… BARLT… Frontier La… BAR-… BARLT    P01_1   -0400    
#>  2 P01_1_20200… wav   a_BA… BARLT… Frontier La… BAR-… BARLT    P01_1   -0400    
#>  3 P02_1_20200… wav   a_S4… S4A01… Wildlife Ac… Song… SongMet… P02_1   NA       
#>  4 P02_1_20200… wav   a_S4… S4A01… Wildlife Ac… Song… SongMet… P02_1   NA       
#>  5 P03_1_20200… wav   a_BA… BARLT… Frontier La… BAR-… BARLT    P03_1   -0400    
#>  6 P04_1_20200… wav   a_BA… BARLT… Frontier La… BAR-… BARLT    P04_1   -0400    
#>  7 P04_1_20200… wav   a_BA… BARLT… Frontier La… BAR-… BARLT    P04_1   -0400    
#>  8 P05_1_20200… wav   a_BA… BARLT… Frontier La… BAR-… BARLT    P05_1   -0400    
#>  9 P06_1_20200… wav   a_BA… BARLT… Frontier La… BAR-… BARLT    P06_1   -0400    
#> 10 P07_1_20200… wav   a_S4… S4A01… Wildlife Ac… Song… SongMet… P07_1   NA       
#> # ℹ 32 more rows
#> # ℹ 4 more variables: date_time <dttm>, date <date>, longitude <dbl>,
#> #   latitude <dbl>
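The underlying restriction can be seen with sf on its own. In this small sketch, a hypothetical table (`toy_sites`) has one site without coordinates, and the conversion to a spatial object is wrapped in tryCatch() so that the failure is captured instead of stopping the script.

```r
library(sf)

# Hypothetical sites table where the second site has no coordinates
toy_sites <- data.frame(
  site_id = c("A", "B"),
  lon = c(-85.03, NA),
  lat = c(50.01, NA)
)

# By default st_as_sf() refuses coordinates that contain missing values
result <- tryCatch(
  st_as_sf(toy_sites, coords = c("lon", "lat"), crs = 4326),
  error = function(e) e
)

# TRUE here, because the conversion raised an error
failed <- inherits(result, "error")
```

This is presumably why add_sites() falls back to a regular data frame with longitude and latitude columns when some files have no matching site.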

To resolve the warning, either add the missing site information or omit the unmatched files before joining.

m <- clean_metadata(project_files = example_files) |>
  filter(date > "2020-05-03") # Filter out recordings that don't match a site
#> Extracting ARU info...
#> Extracting Dates and Times...

m <- add_sites(m, sites)
#> Joining by columns `date_time_start` and `date_time_end`
m
#> Simple feature collection with 36 features and 11 fields
#> Geometry type: POINT
#> Dimension:     XY
#> Bounding box:  xmin: -91.38 ymin: 45 xmax: -84.45 ymax: 52.68
#> Geodetic CRS:  WGS 84
#> # A tibble: 36 × 12
#>    file_name    type  path  aru_id manufacturer model aru_type site_id tz_offset
#>  * <chr>        <chr> <chr> <chr>  <chr>        <chr> <chr>    <chr>   <chr>    
#>  1 P02_1_20200… wav   a_S4… S4A01… Wildlife Ac… Song… SongMet… P02_1   NA       
#>  2 P02_1_20200… wav   a_S4… S4A01… Wildlife Ac… Song… SongMet… P02_1   NA       
#>  3 P03_1_20200… wav   a_BA… BARLT… Frontier La… BAR-… BARLT    P03_1   -0400    
#>  4 P04_1_20200… wav   a_BA… BARLT… Frontier La… BAR-… BARLT    P04_1   -0400    
#>  5 P04_1_20200… wav   a_BA… BARLT… Frontier La… BAR-… BARLT    P04_1   -0400    
#>  6 P05_1_20200… wav   a_BA… BARLT… Frontier La… BAR-… BARLT    P05_1   -0400    
#>  7 P06_1_20200… wav   a_BA… BARLT… Frontier La… BAR-… BARLT    P06_1   -0400    
#>  8 P07_1_20200… wav   a_S4… S4A01… Wildlife Ac… Song… SongMet… P07_1   NA       
#>  9 P07_1_20200… wav   a_S4… S4A01… Wildlife Ac… Song… SongMet… P07_1   NA       
#> 10 P08_1_20200… wav   a_BA… BARLT… Frontier La… BAR-… BARLT    P08_1   -0400    
#> # ℹ 26 more rows
#> # ℹ 3 more variables: date_time <dttm>, date <date>, geometry <POINT [°]>

Fixed!
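Another possible route, if you have already ended up with the non-spatial output, is to drop the rows without coordinates and convert what remains back to a spatial object. The sketch below assumes a data frame with `longitude` and `latitude` columns like the one returned above; `toy_m` and its file names are hypothetical.

```r
library(sf)

# Hypothetical non-spatial metadata; the second file matched no site
toy_m <- data.frame(
  file_name = c("a.wav", "b.wav", "c.wav"),
  longitude = c(-87.45, NA, -90.38),
  latitude = c(52.68, NA, 48.99)
)

# Keep only the rows that have both coordinates
matched <- toy_m[!is.na(toy_m$longitude) & !is.na(toy_m$latitude), ]

# Convert the complete rows back to an sf object in WGS 84
m_sf <- st_as_sf(matched, coords = c("longitude", "latitude"), crs = 4326)
```

Note that this discards the unmatched files, so it is worth checking which rows were dropped before continuing.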