improve data cleaning code for BirdLife data #59

Merged 6 commits on Aug 20, 2024
1 change: 1 addition & 0 deletions .Rbuildignore
@@ -51,6 +51,7 @@ frc-data-birds-part-5.Rout
frc-data-birds-part-6.Rout
frc-data-mammals.Rout
frc-data-reptiles.Rout
iucn-species-list.csv
^aoh.R$
^customization.R$
codecov.yml
3 changes: 3 additions & 0 deletions .github/workflows/documentation.yaml
@@ -93,6 +93,9 @@ jobs:
run: |
result <- urlchecker::url_check()
result <- result[!startsWith(result$URL, "https://doi.org/"), , drop = FALSE]
result <- result[!startsWith(result$URL, "https://land.copernicus.eu"), , drop = FALSE]
result <- result[!startsWith(result$URL, "https://www.iucnredlist.org"), , drop = FALSE]
result <- result[!startsWith(result$URL, "https://lpdaac.usgs.gov"), , drop = FALSE]
if (nrow(result) > 0) {
print(result)
stop("Invalid URLs detected")
1 change: 1 addition & 0 deletions .gitignore
@@ -50,6 +50,7 @@ frc-data-birds-part-5.Rout
frc-data-birds-part-6.Rout
frc-data-mammals.Rout
frc-data-reptiles.Rout
iucn-species-list.csv

# system files
.directory
4 changes: 2 additions & 2 deletions DESCRIPTION
@@ -1,6 +1,6 @@
Package: aoh
Type: Package
-Version: 0.0.2.13
+Version: 0.0.2.14
Title: Create Area of Habitat Data
Description: Create Area of Habitat data to characterize species distributions.
Data are produced following procedures outlined by Brooks et al. (2019)
@@ -73,7 +73,7 @@ SystemRequirements: GDAL (>= 3.0.2) (optional), PROJ (>= 7.2.0) (optional)
URL: https://prioritizr.github.io/aoh/, https://github.com/prioritizr/aoh
BugReports: https://github.com/prioritizr/aoh/issues
VignetteBuilder: knitr
-RoxygenNote: 7.3.1
+RoxygenNote: 7.3.2
Collate:
'internal.R'
'calc_spp_frc_data.R'
16 changes: 12 additions & 4 deletions Makefile
@@ -38,7 +38,9 @@ prep_lumbierres_habitat_data: inst/scripts/lumbierres-habitat-data.R
R CMD BATCH --no-restore --no-save inst/scripts/lumbierres-habitat-data.R

# process aoh data
-aoh_global_data: aoh_amphibians aoh_mammals aoh_reptiles aoh_birds
+aoh_global_data: aoh_amphibians aoh_mammals aoh_reptiles aoh_birds aoh_mammals

aoh_mammals: aoh_mammals_land aoh_mammals_land_freshwater aoh_mammals_land_marine

aoh_amphibians:
R CMD BATCH --no-restore --no-save '--args amphibians' inst/scripts/aoh-data.R aoh-data-amphibians.Rout
@@ -51,8 +53,14 @@ aoh_birds:
R CMD BATCH --no-restore --no-save '--args birds-part-5' inst/scripts/aoh-data.R aoh-data-birds-part-5.Rout
R CMD BATCH --no-restore --no-save '--args birds-part-6' inst/scripts/aoh-data.R aoh-data-birds-part-6.Rout

aoh_mammals:
R CMD BATCH --no-restore --no-save '--args mammals' inst/scripts/aoh-data.R aoh-data-mammals.Rout
aoh_mammals_land:
R CMD BATCH --no-restore --no-save '--args mammals-land' inst/scripts/aoh-data.R aoh-data-mammals-land.Rout

aoh_mammals_land_freshwater:
R CMD BATCH --no-restore --no-save '--args mammals-land-freshwater' inst/scripts/aoh-data.R aoh-data-mammals-land-freshwater.Rout

aoh_mammals_land_marine:
R CMD BATCH --no-restore --no-save '--args mammals-land-marine' inst/scripts/aoh-data.R aoh-data-mammals-land-marine.Rout

aoh_reptiles:
R CMD BATCH --no-restore --no-save '--args reptiles' inst/scripts/aoh-data.R aoh-data-reptiles.Rout
@@ -132,4 +140,4 @@ purl_vigns:
R --slave -e "lapply(dir('vignettes', '^.*\\\\.Rmd$$'), function(x) knitr::purl(file.path('vignettes', x), gsub('.Rmd', '.R', x, fixed = TRUE)))"
rm -f Rplots.pdf

-.PHONY: initc vigns clean data docs readme site test check checkwb build purl_vigns install man spellcheck examples prep_habitat_data prep_elevation_data aoh_reptiles aoh_mammals aoh_birds aoh_amphibians aoh_global_data frc_reptiles frc_mammals frc_birds frc_amphibians frc_global_data
+.PHONY: initc vigns clean data docs readme site test check checkwb build purl_vigns install man spellcheck examples prep_habitat_data prep_elevation_data aoh_reptiles aoh_mammals aoh_mammals_land aoh_mammals_land_freshwater aoh_mammals_land_marine aoh_birds aoh_amphibians aoh_global_data frc_reptiles frc_mammals frc_birds frc_amphibians frc_global_data
11 changes: 11 additions & 0 deletions NEWS.md
@@ -1,3 +1,14 @@
# aoh 0.0.2.14

- Update `create_spp_info_data()` to make data cleaning functionality more
robust for the BirdLife species' range dataset.
- Update built-in helper script for processing area of habitat data to
include (i) mammal species with terrestrial and freshwater distributions and
(ii) mammal species with terrestrial and marine distributions
(see `inst/scripts/aoh-data.R`)
- New built-in helper script to download all species identifiers from the
IUCN Red List (see `inst/scripts/iucn-species-list.R`)

# aoh 0.0.2.13

- Update `read_spp_range_data()` and `create_spp_info_data()` to fix
22 changes: 16 additions & 6 deletions R/clean_spp_range_data.R
@@ -402,28 +402,38 @@ clean_spp_range_data <- function(x,
# step 6: convert MULTISURFACE to MULTIPOLYGON
x <- sf::st_set_precision(x, geometry_precision)
idx <- which(vapply(sf::st_geometry(x), inherits, logical(1), "MULTISURFACE"))
-if (length(idx) > 0) { # nocov start
+# nocov start
+if (length(idx) > 0) {
g <- sf::st_geometry(x)
-g2 <- lapply(g[idx], sf::st_cast, "MULTIPOLYGON")
+g2 <- g[idx]
+g2 <- lapply(g2, sf::st_cast, "MULTIPOLYGON")
g2 <- lapply(g2, sf::st_buffer, 0)
g2 <- lapply(g2, sf::st_make_valid)
for (i in seq_along(idx)) {
g[[idx[[i]]]] <- g2[[i]]
}
x <- sf::st_set_geometry(x, g)
rm(g, g2)
-} # nocov end
+}
+# nocov end

# force construction of object, this seems to be needed for some reason
# that I do not understand, otherwise st_collection_extract() throws
# an error
x <- x[seq_len(nrow(x)), , drop = FALSE]
x <- suppressWarnings(sf::st_collection_extract(x, "POLYGON"))
invisible(gc())

# step 7: fix any potential geometry issues
x <- st_repair_geometry(x, geometry_precision)
invisible(gc())

# step 8: wrap geometries to dateline
x <- sf::st_set_precision(x, geometry_precision)
-x <- suppressWarnings(sf::st_wrap_dateline(x,
-options = c("WRAPDATELINE=YES", "DATELINEOFFSET=180"))
+x <- suppressWarnings(
+sf::st_wrap_dateline(
+x,
+options = c("WRAPDATELINE=YES", "DATELINEOFFSET=180")
+)
+)
invisible(gc())

2 changes: 1 addition & 1 deletion R/st_repair_geometry.R
@@ -153,7 +153,7 @@ st_repair_geometry <- function(x, geometry_precision = 1e5) {
requireNamespace("prepr", quietly = TRUE),
msg = paste(
"the \"prepr\" package needs to be installed, use: \n",
-"remotes::install_github(\"dickoa/prepr\")"
+"remotes::install_github(\"prioritizr/prepr\")"
)
)
### find geometries to repair
9 changes: 8 additions & 1 deletion README.Rmd
@@ -26,7 +26,14 @@ knitr::opts_chunk$set(
[![Coverage Status](https://img.shields.io/codecov/c/github/prioritizr/aoh?label=Coverage)](https://app.codecov.io/gh/prioritizr/aoh/branch/master)

```{r, include = FALSE}
# load developmental version of package
devtools::load_all()

# check if being prepared for website
## see https://github.com/r-lib/pkgdown/blob/main/R/pkgdown.R
in_pkgdown <- function() {
identical(Sys.getenv("IN_PKGDOWN"), "true")
}
```

### Overview
@@ -180,7 +187,7 @@ print(spp_aoh_rasters)

Finally, let's create some maps to compare the range data with the Area of habitat data.

```{r "map", message = FALSE, warning = FALSE, results = "hide", dpi = 200, fig.width = 5.5, fig.height = 4, out.width = ifelse(isTRUE(knitr::is_html_output(excludes = c("markdown"))), "60%", "90%")}
```{r "map", message = FALSE, warning = FALSE, results = "hide", dpi = 200, fig.width = 5.5, fig.height = 4, out.width = ifelse(isTRUE(in_pkgdown()), "60%", "90%")}
# create maps
## N.B. you might need to install the ggmap package
map <-
4 changes: 2 additions & 2 deletions README.md
@@ -354,7 +354,7 @@ map <-
print(map)
```

<img src="man/figures/README-map-1.png" width="60%" style="display: block; margin: auto;" />
<img src="man/figures/README-map-1.png" width="90%" style="display: block; margin: auto;" />

### Citation

@@ -366,7 +366,7 @@ produce Area of Habitat data.
relevant data using:

Hanson JO (2024) aoh: Create Area of Habitat Data. R package version
-0.0.2.12. Available at https://github.com/prioritizr/aoh.
+0.0.2.14. Available at https://github.com/prioritizr/aoh.

IUCN [insert year] IUCN Red List of Threatened Species. Version
[insert version]. Available at www.iucnredlist.org.
3 changes: 3 additions & 0 deletions _pkgdown.yml
@@ -1,8 +1,11 @@
url: https://prioritizr.github.io/aoh

authors:
Jeffrey O Hanson:
href: http://jeffrey-hanson.com

template:
bootstrap: 5
params:
bootswatch: flatly

130 changes: 47 additions & 83 deletions docs/404.html

Some generated files are not rendered by default.