From 3a5abaaab0145c41f84f4f74c6e66c495eb224a4 Mon Sep 17 00:00:00 2001 From: Jakob Date: Tue, 16 Jan 2024 16:05:55 +0100 Subject: [PATCH] Write subsection about analysis of distributions --- .../_freeze/report/execute-results/html.json | 6 +- .../_freeze/report/execute-results/tex.json | 4 +- .../report/figure-pdf/fig-patches-nrw-1.pdf | Bin 54288 -> 54288 bytes results/appendix/package-docs/docs.qmd | 278 ++++++++++-------- results/methods/distribution-analysis.qmd | 20 ++ results/references.bib | 17 +- results/report.qmd | 20 +- 7 files changed, 222 insertions(+), 123 deletions(-) create mode 100644 results/methods/distribution-analysis.qmd diff --git a/results/_freeze/report/execute-results/html.json b/results/_freeze/report/execute-results/html.json index 58ef6a2..d512c4f 100644 --- a/results/_freeze/report/execute-results/html.json +++ b/results/_freeze/report/execute-results/html.json @@ -1,9 +1,9 @@ { - "hash": "e1408acaec76e2d97b0a0348b71ffcbc", + "hash": "c19154abeddbb8545c54869ff7d335bf", "result": { - "markdown": "---\ntitle: \"Forest Data Analysis Report\"\noutput:\n pdf_document:\n latex_engine: xelatex\ntoc: true\ntoc-depth: 2\ntoc-title: Contents\nnumber-sections: true\nnumber-depth: 3\ndate: today\nauthor: Jakob Danel and Frederick Bruch\nbibliography: references.bib\nexecute-dir: .. \nprefer-html: true\n---\n\n\n# Introduction\n\nThis report documents the analysis of forest data for different tree species.\n\n# Methods\n\n## Data acquisition\n\nOur primary objective is to identify patches where one tree species exhibits a high level of dominance, striving to capture monocultural stands within the diverse forests of Nordrhein-Westfalia (NRW). Recognizing the practical challenges of finding true monocultures, we aim to identify patches where one species is highly dominant, enabling meaningful comparisons across different species.\n\nThe study is framed within the NRW region due to the availability of an easily accessible dataset. 
Our focus includes four prominent tree species in NRW: oak, beech, spruce, and pine, representing the most prevalent species in the region. To ensure the validity of our findings, we derive three patches for each species, thereby confirming that observed variables are characteristic of a particular species rather than a specific patch. Each patch is carefully selected to encompass an area of approximately 50-100 hectares and contain between 5,000 and 10,000 trees. Striking a balance between relevance and manageability, these patches avoid excessive size to enhance the likelihood of capturing varied species mixes and ensure compatibility with local hardware.\n\nSpecific Goals:\n\n1. Retrieve patches with highly dominant tree species.\n2. Minimize or eliminate the presence of human-made structures within the selected patches.\n\nTo achieve our goals, we utilized the waldmonitor dataset [@welle2014] and the map provided by [@Blickensdoerfer2022], both indicating dominant tree species in NRW. We identified patches of feasible size where both sources predicted the presence of a specific species. Further validation involved examining sentinel images of these forest regions to assess the evenness of structures, leaf color distribution, and the absence of significant human-made structures such as roads or buildings. The subsequent preprocessing steps, detailed in the following subsection, involved refining our selected patches and deriving relevant variables, such as tree distribution and density, to ensure that the chosen areas align with the desired research domains.\n\n## Preprocessing\n::: {.cell}\n\n:::\n\n\nIn this research study, the management and processing of a large dataset are crucial considerations. The dataset's substantial size necessitates careful maintenance to ensure efficient handling. Furthermore, the data should be easily processable and editable to facilitate necessary corrections and precalculations within the context of our research objectives. 
To achieve our goals, we have implemented a framework that automatically derives data based on a shapefile delineating the areas of interest. The processed data and results of precalculations are stored in a straightforward manner to enhance accessibility. Additionally, we have designed functions that establish a user-friendly interface, enabling the execution of algorithms on subsets of the data, such as distinct species. These interfaces are not only directly callable by users but can also be integrated into other functions to automate processes. The overarching aim is to streamline the entire preprocessing workflow using a single script, leveraging only the shapefile as a basis. This subsection details the accomplishments of our R package in realizing these goals, outlining the preprocessing steps undertaken and justifying their necessity in the context of our research.\n\nThe data are stored in a data subdirectory of the root directory in the format `species/location-name/tile-name`. To automate the matching of areas of interest with the catalog from the Land NRW[^1], we utilize the intersecting tool developed by Heisig[^2]. This tool allows for the automatic retrieval and placement of data downloaded from the Land NRW catalog. To enhance data accessibility, we have devised an object that incorporates species, location name, and tile name (the NRW internal identifier) for each area. This object facilitates the specification of the area to be processed. Additionally, we have defined an initialization function that downloads all tiles, returning a list of tile location objects for subsequent processing. A pivotal component of the package's preprocessing functionality is the map function, which iterates over a list of tile locations (effectively the entire dataset) and accepts a processing function as an argument. 
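The map function described here can be illustrated with a minimal base-R sketch. It is not the package's implementation: the names `map_tile_locations_sketch` and `flag_path_sketch`, the hidden flag-file naming, and the plain lists standing in for tile location objects are all assumptions made for illustration.

```r
# Hypothetical sketch: names and the flag-file layout are assumptions,
# not the lfa package's actual code.
flag_path_sketch <- function(flag_name, flag_dir) {
  # A hidden file marks a processing step as completed.
  file.path(flag_dir, paste0(".", flag_name, "_flag"))
}

map_tile_locations_sketch <- function(tile_locations, map_function,
                                      check_flag = NULL,
                                      flag_dir = tempdir(), ...) {
  if (!is.null(check_flag)) {
    flag <- flag_path_sketch(check_flag, flag_dir)
    if (file.exists(flag)) {
      message("No further processing: flag ", check_flag, " is set!")
      return(invisible(NULL))
    }
  }
  # Apply the processing function to every tile location in turn.
  results <- lapply(tile_locations, function(loc) map_function(loc, ...))
  # Record completion so that a second run can skip this step.
  if (!is.null(check_flag)) {
    file.create(flag)
  }
  results
}

# Plain lists stand in for tile location objects (species, area name, tile name).
locations <- list(
  list(species = "beech", name = "billerbeck", tile = "tile_a"),
  list(species = "oak", name = "hamm", tile = "tile_b")
)
describe <- function(loc) paste(loc$species, loc$name, loc$tile, sep = "/")

flag_dir <- file.path(tempdir(), "lfa_sketch_flags")
dir.create(flag_dir, showWarnings = FALSE)
unlink(flag_path_sketch("describe", flag_dir))  # start from a clean state

first <- map_tile_locations_sketch(locations, describe,
                                   check_flag = "describe", flag_dir = flag_dir)
second <- map_tile_locations_sketch(locations, describe,
                                    check_flag = "describe", flag_dir = flag_dir)
```

In this sketch the first call processes both locations and writes the flag file, while the second call finds the flag and returns `NULL` without doing any work, which is one way to make a long preprocessing script safe to re-run.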
The following paragraphs outline the specific preprocessing steps employed, all of which are implemented within the mapping function.\n\nTo keep memory usage manageable, each downloaded tile (one area can span multiple tiles) is split into smaller chunks. We use a chunk size of 50 m x 50 m, which divides each original 1 km x 1 km file into 400 chunks (20 x 20). These chunks are stored in our directory structure: each chunk is placed in a directory named after its tile and uses an id as its filename. Implementation-wise, the `lidR::catalog_retile` function was instrumental in achieving this segmentation. The resulting smaller chunks allow for efficient iteration during subsequent preprocessing steps.\n\nThe next phase involves reducing our data to the actual size by intersecting the tiles with the defined area of interest. Using the `lidR::merge_spatial` function, we intersect the area derived from the shapefile, removing all point cloud items outside this region. Due to our tile-wise approach, empty tiles may arise, and in such cases, those tiles are simply deleted.\n\nFollowing the size reduction of our dataset, the next step involves correcting the `z` values. The `z` values in the data are originally relative to the ellipsoid used for referencing, but we require them to be relative to the ground. To achieve this, we utilize the `lidR::tin` function, which interpolates a triangulated irregular network (TIN) between all ground points (classified by the data provider) and calculates the z value relative to this surface.\n\nSubsequently, we aim to perform segmentation for each distinct tree, marking each item of the point cloud with a tree ID. We employ the algorithm described by @li2012, using parameters `li2012(dt1 = 2, dt2 = 3, R = 2, Zu = 10, hmin = 5, speed_up = 12)`. 
The meanings of these parameters are elucidated in Li et al.'s work [@li2012].\n\nFinally, the last preprocessing step involves individual tree detection, seeking a single `POINT` object for each tree. The `lidR::lmf` function, a local maximum filter for individual tree detection, is used for this purpose [@popescu2004]. The results are stored in GeoPackage files within our data structure.\n\nSee @sec-appendix-preprocessing for the implementation of the preprocessing.\n\n[^1]: https://www.opengeodata.nrw.de/produkte/geobasis/hm/3dm_l_las/3dm_l_las/, last visited 7th Dec 2023\n[^2]: https://github.com/joheisig/GEDIcalibratoR, last visited 7th Dec 2023\n\n\n\n# Results\n::: {.cell}\n\n:::\n\n## Researched areas\n\n::: {.cell}\n\n```{.r .cell-code code-fold=\"true\"}\nlibrary(ggplot2)\nsf::sf_use_s2(FALSE)\npatches <- sf::read_sf(\"research_areas.shp\") |> sf::st_centroid()\n\nde <- sf::read_sf(\"results/results/states_de/Bundesländer_2017_mit_Einwohnerzahl.shp\") # Source: https://hub.arcgis.com/datasets/esri-de-content::bundesl%C3%A4nder-2017-mit-einwohnerzahl/explore?location=51.099647%2C10.454033%2C7.43\nnrw <- de[5,] |> sf::st_geometry()\n\n\nggplot() + geom_sf(data = nrw) + \n geom_sf(data = patches, mapping = aes(col = species))\n```\n\n::: {.cell-output-display}\n![Locations of the different patches with the dominant species for that patch. The patch centroids are displayed on a basemap showing the borders of NRW.](report_files/figure-html/fig-patches-nrw-1.png){#fig-patches-nrw width=672}\n:::\n:::\nWe drew three patches for each species from different regions (see @tbl-summary-researched-areas). We downloaded the LiDAR data for those patches and ran all preprocessing steps as described. We then checked, using derived parameters (e.g. tree heights, tree distributions or tree density), that all patches contain valid forest data. In that step we discovered that some forest clearance had recently taken place in one patch. 
This patch was removed from the dataset and replaced with a new one. \n\nIn our research, drawing patches evenly distributed across Nordrhein-Westfalia is inherently constrained by natural factors. Consequently, the patches for oak and pine predominantly originate from the Münsterland region, as illustrated in [@fig-patches-nrw]. For spruce, the patches were derived from Sauerland, reflecting the prevalence of spruce forests in this specific region within NRW, as corroborated by Welle et al. [@welle2014] and Blickensdörfer et al. [@Blickensdoerfer2022]. Beech patches, on the other hand, were generated from diverse locations within NRW. Across all patches, no human-made objects were identified, with the exception of small paths for pedestrians and forestry vehicles.\n\nThe total area and number of detections for each of the four species are as follows. Beech covers 69.79 hectares with a total of 5,954 detections, oak spans 63.23 hectares with 5,354 detections, pine extends across 72.86 hectares with 8,912 detections, and spruce encompasses 57.94 hectares with 8,619 detections. Both the number of detections and the corresponding area exhibit a relatively uniform distribution across the diverse patches, as summarized in @tbl-summary-researched-areas. \n\nWith the selected dataset described, we intentionally chose three patches for each of the four species that exhibit a practical and usable size for our research objectives. 
These carefully chosen patches align with the conditions essential for our study, providing comprehensive and representative data for in-depth analysis and meaningful insights into the characteristics of each tree species within the specified areas.\n\n\n::: {#tbl-summary-researched-areas .cell tbl-cap='Summary of researched patches grouped by species, with their location, area and the amount of detected trees.'}\n\n```{.r .cell-code code-fold=\"true\"}\nshp <- sf::read_sf(\"research_areas.shp\")\ntable <- lfa::lfa_get_all_areas()\n\nsf::sf_use_s2(FALSE)\nfor (row in 1:nrow(table)) {\n area <-\n dplyr::filter(shp, shp$species == table[row, \"specie\"] &\n shp$name == table[row, \"area\"])\n area_size <- area |> sf::st_area()\n point <- area |> sf::st_centroid() |> sf::st_coordinates()\n table[row,\"point\"] <- paste0(\"(\",round(point[1], digits = 4),\", \",round(point[2],digits = 4),\")\")\n \n table[row, \"area_size\"] = round(area_size,digits = 2) #paste0(round(area_size,digits = 2), \" m²\")\n \n amount_det <- nrow(lfa::lfa_get_detection_area(table[row, \"specie\"], table[row, \"area\"]))\n if(is.null(amount_det)){\n cat(nrow(lfa::lfa_get_detection_area(table[row, \"specie\"], table[row, \"area\"])),table[row, \"specie\"],table[row, \"area\"])\n }\n table[row, \"amount_detections\"] = amount_det\n \n # table[row, \"specie\"] <- lfa::lfa_capitalize_first_char(table[row,\"specie\"])\n table[row, \"area\"] <- lfa::lfa_capitalize_first_char(table[row,\"area\"])\n }\ntable$area <- gsub(\"_\", \" \", table$area)\ntable$area <- gsub(\"ue\", \"ü\", table$area)\ntable = table[,!names(table) %in% c(\"specie\")]\n\nknitr::kable(table, \"html\", col.names = c(\"Patch Name\",\"Location\",\"Area size (m²)\",\"Amount tree detections\" ), caption = NULL, digits = 2, escape = TRUE) |>\n kableExtra::kable_styling(\n bootstrap_options = c(\"striped\", \"hold_position\", \"bordered\",\"responsive\"),\n stripe_index = c(1:3,7:9),\n full_width = FALSE\n ) |>\n 
kableExtra::pack_rows(\"Beech\", 1, 3) |>\n kableExtra::pack_rows(\"Oak\", 4, 6) |>\n kableExtra::pack_rows(\"Pine\", 7, 9) |>\n kableExtra::pack_rows(\"Spruce\", 10, 12) |>\n kableExtra::column_spec(1, bold = TRUE)\n```\n\n::: {.cell-output-display}\n`````{=html}\n\n \n \n \n \n \n \n \n \n\n \n\n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n\n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n\n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n\n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n\n
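<table>
<thead>
<tr><th>Patch Name</th><th>Location</th><th>Area size (m²)</th><th>Amount tree detections</th></tr>
</thead>
<tbody>
<tr><td colspan="4"><strong>Beech</strong></td></tr>
<tr><td>Bielefeld brackwede</td><td>(8.5244, 51.9902)</td><td>161410.57</td><td>1443</td></tr>
<tr><td>Billerbeck</td><td>(7.3273, 51.9987)</td><td>185887.25</td><td>1732</td></tr>
<tr><td>Wülfenrath</td><td>(7.0769, 51.2917)</td><td>350621.21</td><td>2779</td></tr>
<tr><td colspan="4"><strong>Oak</strong></td></tr>
<tr><td>Hamm</td><td>(7.8618, 51.6639)</td><td>269397.22</td><td>2441</td></tr>
<tr><td>Münster</td><td>(7.6187, 51.9174)</td><td>164116.61</td><td>1270</td></tr>
<tr><td>Rinkerode</td><td>(7.6744, 51.8598)</td><td>198811.09</td><td>1643</td></tr>
<tr><td colspan="4"><strong>Pine</strong></td></tr>
<tr><td>Greffen</td><td>(8.1697, 51.9913)</td><td>49418.81</td><td>513</td></tr>
<tr><td>Mesum</td><td>(7.5403, 52.2573)</td><td>405072.85</td><td>5031</td></tr>
<tr><td>Telgte</td><td>(7.7816, 52.0024)</td><td>274132.34</td><td>3368</td></tr>
<tr><td colspan="4"><strong>Spruce</strong></td></tr>
<tr><td>Brilon</td><td>(8.5352, 51.4084)</td><td>211478.20</td><td>3342</td></tr>
<tr><td>Oberhundem</td><td>(8.1861, 51.0909)</td><td>151895.53</td><td>2471</td></tr>
<tr><td>Osterwald</td><td>(8.3721, 51.2151)</td><td>216026.43</td><td>2806</td></tr>
</tbody>
</table>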
\n\n`````\n:::\n:::\n\n\n\n\n\n\n\n\n|specie |area | density (1/m²)|\n|:------|:-------------------|---------:|\n|beech |bielefeld_brackwede | 0.0089399|\n|beech |billerbeck | 0.0093175|\n|beech |wuelfenrath | 0.0079259|\n|oak |hamm | 0.0090610|\n|oak |muenster | 0.0077384|\n|oak |rinkerode | 0.0082641|\n|pine |greffen | 0.0103807|\n|pine |mesum | 0.0124200|\n|pine |telgte | 0.0122860|\n|spruce |brilon | 0.0158030|\n|spruce |oberhundem | 0.0162678|\n|spruce |osterwald | 0.0129892|\n\n# References\n\n::: {#refs}\n:::\n\n# Appendix\n## Script which can be used to do all preprocessing {#sec-appendix-preprocessing}\n\n::: {.cell}\n\n:::\n\n\nLoad the file with the research areas\n::: {.cell}\n\n```{.r .cell-code}\nsf <- sf::read_sf(here::here(\"research_areas.shp\"))\nprint(sf)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\nSimple feature collection with 12 features and 3 fields\nGeometry type: POLYGON\nDimension: XY\nBounding box: xmin: 7.071625 ymin: 51.0895 xmax: 8.539877 ymax: 52.25983\nGeodetic CRS: WGS 84\n# A tibble: 12 × 4\n id species name geometry\n \n 1 1 oak rinkerode ((7.678922 51.85789, 7.675446 51.85752, 7.…\n 2 2 oak hamm ((7.858955 51.66699, 7.866444 51.66462, 7.…\n 3 3 oak muenster ((7.618908 51.9154, 7.617384 51.9172, 7.61…\n 4 4 pine greffen ((8.168691 51.98965, 8.167178 51.99075, 8.…\n 5 5 pine telgte ((7.779728 52.00662, 7.781616 52.00662, 7.…\n 6 6 pine mesum ((7.534424 52.25499, 7.53378 52.25983, 7.5…\n 7 7 beech bielefeld_brackwede ((8.524749 51.9921, 8.528418 51.99079, 8.5…\n 8 8 beech wuelfenrath ((7.071625 51.29256, 7.072311 51.29334, 7.…\n 9 9 beech billerbeck ((7.324729 51.99783, 7.323548 51.99923, 7.…\n10 11 spruce brilon ((8.532195 51.41029, 8.535027 51.41064, 8.…\n11 12 spruce osterwald ((8.369328 51.21693, 8.371238 51.21718, 8.…\n12 10 spruce oberhundem ((8.18082 51.08999, 8.180868 51.09143, 8.1…\n```\n:::\n:::\n\n\nInit the project\n::: {.cell}\n\n```{.r .cell-code}\nlibrary(lfa)\nsf::sf_use_s2(FALSE)\nlocations <- 
lfa_init(\"research_areas.shp\")\n```\n:::\n\nDo all of the preprocessing steps\n::: {.cell}\n\n```{.r .cell-code}\nlfa_map_tile_locations(locations,retile,check_flag = \"retile\")\n```\n\n::: {.cell-output .cell-output-stdout}\n```\nNo further processing: flag retile is set!Function is already computed, no further computings here\n```\n:::\n\n::: {.cell-output .cell-output-stdout}\n```\nNULL\n```\n:::\n\n```{.r .cell-code}\nlfa_map_tile_locations(locations, lfa_intersect_areas, ctg = NULL, areas_sf = sf,check_flag = \"intersect\")\n```\n\n::: {.cell-output .cell-output-stdout}\n```\nNo further processing: flag intersect is set!Function is already computed, no further computings here\n```\n:::\n\n::: {.cell-output .cell-output-stdout}\n```\nNULL\n```\n:::\n\n```{.r .cell-code}\nlfa_map_tile_locations(locations, lfa_ground_correction, ctg = NULL,check_flag = \"z_correction\")\n```\n\n::: {.cell-output .cell-output-stdout}\n```\nNo further processing: flag z_correction is set!Function is already computed, no further computings here\n```\n:::\n\n::: {.cell-output .cell-output-stdout}\n```\nNULL\n```\n:::\n\n```{.r .cell-code}\nlfa_map_tile_locations(locations, lfa_segmentation, ctg = NULL,check_flag = \"segmentation\")\n```\n\n::: {.cell-output .cell-output-stdout}\n```\nNo further processing: flag segmentation is set!Function is already computed, no further computings here\n```\n:::\n\n::: {.cell-output .cell-output-stdout}\n```\nNULL\n```\n:::\n\n```{.r .cell-code}\nlfa_map_tile_locations(locations, lfa_detection, catalog = NULL, write_to_file = TRUE,check_flag = \"detection\")\n```\n\n::: {.cell-output .cell-output-stdout}\n```\nNo further processing: flag detection is set!Function is already computed, no further computings here\n```\n:::\n\n::: {.cell-output .cell-output-stdout}\n```\nNULL\n```\n:::\n:::\n\n\n\n## Documentation\n### `lfa_capitalize_first_char`\n\nCapitalize First Character of a String\n\n\n#### Arguments\n\nArgument |Description\n------------- 
|----------------\n`input_string` | A single-character string to be processed.\n\n\n#### Concept\n\nString Manipulation\n\n\n#### Description\n\nThis function takes a string as input and returns the same string with the\n first character capitalized. If the first character is already capitalized,\n the function does nothing. If the first character is not from the alphabet,\n an error is thrown.\n\n\n#### Details\n\nThis function performs the following steps:\n \n\n* Checks if the input is a single-character string. \n\n* Verifies if the first character is from the alphabet (A-Z or a-z). \n\n* If the first character is not already capitalized, it capitalizes it. \n\n* Returns the modified string.\n\n\n#### Keyword\n\nalphabet\n\n\n#### Note\n\nThis function is case-sensitive and assumes ASCII characters.\n\n\n#### References\n\nNone\n\n\n#### Seealso\n\nThis function is related to the basic string manipulation functions in base R.\n\n\n#### Value\n\nA modified string with the first character capitalized if it is\n not already. If the first character is already capitalized, the original\n string is returned.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Capitalize the first character of a string\ncapitalize_first_char(\"hello\") # Returns \"Hello\"\ncapitalize_first_char(\"World\") # Returns \"World\"\n\n# Error example (non-alphabetic first character)\ncapitalize_first_char(\"123abc\") # Throws an error\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_capitalize_first_char(input_string)\n```\n:::\n\n\n\n### `lfa_check_flag`\n\nCheck if a flag is set, indicating the completion of a specific process.\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`flag_name` | A character string specifying the name of the flag file. 
It should be a descriptive and unique identifier for the process being checked.\n\n\n#### Description\n\nThis function checks for the existence of a hidden flag file at a specified location within the working directory. If the flag file is found, a message is printed, and the function returns `TRUE` to indicate that the associated processing step has already been completed. If the flag file is not found, the function returns `FALSE` , indicating that further processing can proceed.\n\n\n#### Value\n\nA logical value indicating whether the flag is set ( `TRUE` ) or not ( `FALSE` ).\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Check if the flag for a process named \"data_processing\" is set\nlfa_check_flag(\"data_processing\")\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_check_flag(flag_name)\n```\n:::\n\n\n\n### `lfa_create_tile_location_objects`\n\nCreate tile location objects\n\n\n#### Author\n\nJakob Danel\n\n\n#### Description\n\nThis function traverses a directory structure to find LAZ files and creates\n tile location objects for each file. The function looks into the the `data` \n directory of the repository/working directory. It then creates `tile_location` \n objects based on the folder structure. 
The folder structure should not be\n touched by hand, but created by `lfa_init_data_structure()` which builds the\n structure based on a shape file.\n\n\n#### Seealso\n\n[`tile_location`](#tilelocation)\n\n\n#### Value\n\nA vector containing tile location objects.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_create_tile_location_objects()\n\nlfa_create_tile_location_objects()\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_create_tile_location_objects()\n```\n:::\n\n\n\n### `lfa_detection`\n\nPerform tree detection on a lidar catalog and optionally save the results to a file.\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`catalog` | A lidar catalog containing point cloud data. If set to NULL, the function attempts to read the catalog from the specified tile location.\n`tile_location` | An object specifying the location of the lidar tile. If catalog is NULL, the function attempts to read the catalog from this tile location.\n`write_to_file` | A logical value indicating whether to save the detected tree information to a file. Default is TRUE.\n\n\n#### Description\n\nThis function utilizes lidar data to detect trees within a specified catalog. The detected tree information can be optionally saved to a file in the GeoPackage format. 
The function uses parallel processing to enhance efficiency.\n\n\n#### Value\n\nA sf style data frame containing information about the detected trees.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Perform tree detection on a catalog and save the results to a file\nlfa_detection(catalog = my_catalog, tile_location = my_tile_location, write_to_file = TRUE)\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_detection(catalog, tile_location, write_to_file = TRUE)\n```\n:::\n\n\n\n### `lfa_download_areas`\n\nDownload areas based on spatial features\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`sf_areas` | Spatial features representing areas to be downloaded. It must include columns like \"species\" \"name\" See details for more information.\n\n\n#### Author\n\nJakob Danel\n\n\n#### Description\n\nThis function initiates the data structure and downloads areas based on spatial features.\n\n\n#### Details\n\nThe input data frame, `sf_areas` , must have the following columns:\n \n\n* \"species\": The species associated with the area. \n\n* \"name\": The name of the area. 
\n \n The function uses the `lfa_init_data_structure` function to set up the data structure\n and then iterates through the rows of `sf_areas` to download each specified area.\n\n\n#### Value\n\nNone\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_download_areas(sf_areas)\n\n\n# Example spatial features data frame\nsf_areas <- data.frame(\nspecies = c(\"SpeciesA\", \"SpeciesB\"),\nname = c(\"Area1\", \"Area2\"),\n# Must include also other attributes specialized to sf objects\n# such as geometry, for processing of the download\n)\n\nlfa_download_areas(sf_areas)\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_download_areas(sf_areas)\n```\n:::\n\n\n\n### `lfa_download`\n\nDownload an las file from the state NRW from a specific location\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`species` | The species of the tree which is observed at this location\n`name` | The name of the area that is observed\n`location` | An sf object, which holds the location information for the area where the tile should be downloaded from.\n\n\n#### Description\n\nIt will download the file and save it to data/ list(list(\"html\"), list(list(\"\"))) / list(list(\"html\"), list(list(\"\"))) with the name of the tile\n\n\n#### Value\n\nThe LASCatalog object of the downloaded file\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_download(species, name, location)\n```\n:::\n\n\n\n### `lfa_get_detection_area`\n\nGet Detection for an area\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`species` | A character string specifying the target species.\n`name` | A character string specifying the name of the tile.\n\n\n#### Description\n\nRetrieves the tree detection information for a specified species and tile.\n\n\n#### Details\n\nThis function reads tree detection data from geopackage files within the specified tile location for a given species. 
It then combines the data into a single SF data frame and returns it. The function assumes that the tree detection files follow a naming convention with the pattern \"_detection.gpkg\".\n\n\n#### Keyword\n\nspatial\n\n\n#### References\n\nThis function is part of the LiDAR Forest Analysis (LFA) package.\n\n\n#### Seealso\n\n[`get_tile_dir`](#gettiledir)\n\n\n#### Value\n\nA Simple Features (SF) data frame containing tree detection information for the specified species and tile.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Retrieve tree detection data for species \"example_species\" in tile \"example_tile\"\ntrees_data <- lfa_get_detection_tile_location(\"example_species\", \"example_tile\")\n\n# Example usage:\ntrees_data <- lfa_get_detection_tile_location(\"example_species\", \"example_tile\")\n\n# No trees found scenario:\nempty_data <- lfa_get_detection_tile_location(\"nonexistent_species\", \"nonexistent_tile\")\n# The result will be an empty data frame if no trees are found for the specified species and tile.\n\n# Error handling:\n# In case of invalid inputs, the function may throw errors. 
Ensure correct species and tile names are provided.\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_get_detection_area(species, name)\n```\n:::\n\n\n\n### `lfa_get_detections_species`\n\nRetrieve detections for a specific species.\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`species` | A character string specifying the target species.\n\n\n#### Description\n\nThis function retrieves detection data for a given species from multiple areas.\n\n\n#### Details\n\nThe function looks for detection data in the \"data\" directory for the specified species.\n It then iterates through each subdirectory (representing different areas) and consolidates the\n detection data into a single data frame.\n\n\n#### Value\n\nA data frame containing detection information for the specified species in different areas.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Example usage:\ndetections_data <- lfa_get_detections_species(\"example_species\")\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_get_detections_species(species)\n```\n:::\n\n\n\n### `lfa_get_detections`\n\nRetrieve aggregated detection data for multiple species.\n\n\n#### Concept\n\ndata retrieval functions\n\n\n#### Description\n\nThis function obtains aggregated detection data for multiple species by iterating\n through the list of species obtained from [`lfa_get_species`](#lfagetspecies) . 
For each\n species, it calls [`lfa_get_detections_species`](#lfagetdetectionsspecies) to retrieve the\n corresponding detection data and aggregates the results into a single data frame.\n The resulting data frame includes columns for the species, tree detection data,\n and the area in which the detections occurred.\n\n\n#### Keyword\n\naggregation\n\n\n#### Seealso\n\n[`lfa_get_species`](#lfagetspecies) , [`lfa_get_detections_species`](#lfagetdetectionsspecies) \n \n Other data retrieval functions:\n [`lfa_get_species`](#lfagetspecies)\n\n\n#### Value\n\nA data frame containing aggregated detection data for multiple species.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_get_detections()\n\n# Retrieve aggregated detection data for multiple species\ndetections_data <- lfa_get_detections()\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_get_detections()\n```\n:::\n\n\n\n### `lfa_get_flag_path`\n\nGet the path to a flag file indicating the completion of a specific process.\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`flag_name` | A character string specifying the name of the flag file. It should be a descriptive and unique identifier for the process being flagged.\n\n\n#### Description\n\nThis function constructs and returns the path to a hidden flag file, which serves as an indicator that a particular processing step has been completed. 
The flag file is created in a designated location within the working directory.\n\n\n#### Value\n\nA character string representing the absolute path to the hidden flag file.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Get the flag path for a process named \"data_processing\"\nlfa_get_flag_path(\"data_processing\")\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_get_flag_path(flag_name)\n```\n:::\n\n\n\n### `lfa_get_species`\n\nGet a list of species from the data directory.\n\n\n#### Concept\n\ndata retrieval functions\n\n\n#### Description\n\nThis function retrieves a list of species by scanning the \"data\" directory\n located in the current working directory.\n\n\n#### Keyword\n\ndata\n\n\n#### References\n\nThis function relies on the [`list.dirs`](#list.dirs) function for directory listing.\n\n\n#### Seealso\n\n[`list.dirs`](#list.dirs) \n \n Other data retrieval functions:\n [`lfa_get_detections`](#lfagetdetections)\n\n\n#### Value\n\nA character vector containing the names of species found in the \"data\" directory.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Retrieve the list of species\nspecies_list <- lfa_get_species()\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_get_species()\n```\n:::\n\n\n\n### `lfa_ground_correction`\n\nCorrect the point clouds for correct ground imagery\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`ctg` | An LASCatalog object. If not null, it will perform the actions on this object, if NULL inferring the catalog from the tile_location\n`tile_location` | A tile_location type object holding the information about the location of the cataog. This is used to save the catalog after processing too.\n\n\n#### Author\n\nJakob Danel\n\n\n#### Description\n\nThis function is needed to correct the Z value of the point cloud, relative to the real\n ground height. 
After applying this function to your catalog, the Z values can be read as the\n real elevation above the ground. At the moment the function uses the `tin()` function from\n the `lidR` package. NOTE: The operation is in place and cannot be reverted; the old values\n of the point cloud will be deleted!\n\n\n#### Value\n\nA catalog with the corrected z values. The catalog is always stored at tile_location and\n holds only the transformed values.\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_ground_correction(ctg, tile_location)\n```\n:::\n\n\n\n### `lfa_init_data_structure`\n\nInitialize data structure for species and areas\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`sf_species` | A data frame with information about species and associated areas.\n\n\n#### Description\n\nThis function initializes the data structure for storing species and associated areas.\n\n\n#### Details\n\nThe input data frame, `sf_species` , should have at least the following columns:\n \n\n* \"species\": The names of the species for which the data structure needs to be initialized. \n\n* \"name\": The names of the associated areas. \n \n The function creates directories based on the species and area information provided in\n the `sf_species` data frame. 
It checks whether the directories already exist and creates\n them if they don't.\n\n\n#### Value\n\nNone\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Example species data frame\nsf_species <- data.frame(\nspecies = c(\"SpeciesA\", \"SpeciesB\"),\nname = c(\"Area1\", \"Area2\"),\n# Other necessary columns\n)\n\nlfa_init_data_structure(sf_species)\n\n# Example species data frame\nsf_species <- data.frame(\nspecies = c(\"SpeciesA\", \"SpeciesB\"),\nname = c(\"Area1\", \"Area2\"),\n# Other necessary columns\n)\n\nlfa_init_data_structure(sf_species)\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_init_data_structure(sf_species)\n```\n:::\n\n\n\n### `lfa_init`\n\nInitialize LFA (LiDAR forest analysis) data processing\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`sf_file` | A character string specifying the path to the shapefile containing spatial features of research areas.\n\n\n#### Description\n\nThis function initializes the LFA data processing by reading a shapefile containing\n spatial features of research areas, downloading the specified areas, and creating\n tile location objects for each area.\n\n\n#### Details\n\nThis function reads a shapefile ( `sf_file` ) using the `sf` package, which should\n contain information about research areas. It then calls the `lfa_download_areas` \n function to download the specified areas and `lfa_create_tile_location_objects` \n to create tile location objects based on Lidar data files in those areas. The\n shapefile MUST follow the following requirements:\n \n\n* Each geometry must be a single object of type polygon \n\n* Each entry must have the following attributes: \n\n* species: A string describing the tree species of the area. 
\n\n* name: A string describing the location of the area.\n\n\n#### Value\n\nA vector containing tile location objects.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Initialize LFA processing with the default shapefile\nlfa_init()\n\n# Initialize LFA processing with a custom shapefile\nlfa_init(\"custom_areas.shp\")\n\n# Example usage with the default shapefile\nlfa_init()\n\n# Example usage with a custom shapefile\nlfa_init(\"custom_areas.shp\")\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_init(sf_file = \"research_areas.shp\")\n```\n:::\n\n\n\n### `lfa_intersect_areas`\n\nIntersect Lidar Catalog with Spatial Features\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`ctg` | A LAScatalog object representing the Lidar data to be processed.\n`tile_location` | A tile location object representing the specific area of interest.\n`areas_sf` | Spatial features defining areas.\n\n\n#### Description\n\nThis function intersects a Lidar catalog with a specific area defined by spatial features.\n\n\n#### Details\n\nThe function intersects the Lidar catalog specified by `ctg` with a specific area defined by\n the `tile_location` object and `areas_sf` . It removes points outside the specified area and\n returns a modified LAScatalog object.\n \n The specified area is identified based on the `species` and `name` attributes in the\n `tile_location` object. If a matching area is not found in `areas_sf` , the function\n stops with an error.\n \n The function then transforms the spatial reference of the identified area to match that of\n the Lidar catalog using `sf::st_transform` .\n \n The processing is applied to each chunk in the catalog using the `identify_area` function,\n which merges spatial information and filters out points that are not classified as inside\n the identified area. 
After processing, the function writes the modified LAS files back to\n the original file locations, removing points outside the specified area.\n \n If an error occurs during the processing of a chunk, a warning is issued, and the function\n continues processing the next chunks. If no points are found after filtering, a warning is\n issued, and NULL is returned.\n\n\n#### Seealso\n\nOther functions in the Lidar forest analysis (LFA) package.\n\n\n#### Value\n\nA modified LAScatalog object with points outside the specified area removed.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Example usage\nlfa_intersect_areas(ctg, tile_location, areas_sf)\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_intersect_areas(ctg, tile_location, areas_sf)\n```\n:::\n\n\n\n### `lfa_load_ctg_if_not_present`\n\nLoad the catalog if it is not present\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`ctg` | Catalog object. Can be NULL\n`tile_location` | The location to look for the catalog tiles, if they are not present\n\n\n#### Description\n\nThis function checks if the catalog is `NULL` . 
If it is it will load the\n catalog from the `tile_location`\n\n\n#### Value\n\nThe provided ctg object if not null, else the catalog for the tiles\n of the tile_location.\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_load_ctg_if_not_present(ctg, tile_location)\n```\n:::\n\n\n\n### `lfa_map_tile_locations`\n\nMap Function Over Tile Locations\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`tile_locations` | A list of tile location objects.\n`map_function` | The mapping function to be applied to each tile location.\n`...` | Additional arguments to be passed to the mapping function.\n\n\n#### Description\n\nThis function applies a specified mapping function to each tile location in a list.\n\n\n#### Details\n\nThis function iterates over each tile location in the provided list ( `tile_locations` )\n and applies the specified mapping function ( `map_function` ) to each tile location.\n The mapping function should accept a tile location object as its first argument, and\n additional arguments can be passed using the ellipsis ( `...` ) syntax.\n \n This function is useful for performing operations on multiple tile locations concurrently,\n such as loading Lidar data, processing areas, or other tasks that involve tile locations.\n\n\n#### Seealso\n\nThe mapping function provided should be compatible with the structure and requirements\n of the tile locations and the specific task being performed.\n\n\n#### Value\n\nNone\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Example usage\nlfa_map_tile_locations(tile_locations, my_mapping_function, param1 = \"value\")\n\n# Example usage\nlfa_map_tile_locations(tile_locations, my_mapping_function, param1 = \"value\")\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_map_tile_locations(tile_locations, map_function, check_flag = NULL, ...)\n```\n:::\n\n\n\n### `lfa_merge_and_save`\n\nMerge and Save Text Files in a Directory\n\n\n#### Arguments\n\nArgument 
|Description\n------------- |----------------\n`input_directory` | The path to the input directory containing text files.\n`output_name` | The name for the output file where the merged content will be saved.\n\n\n#### Description\n\nThis function takes an input directory and an output name as arguments.\n It merges the textual content of all files in the specified directory into\n a single string, with each file's content separated by a newline character.\n The merged content is then saved into a file named after the output name\n in the same directory. After the merging is complete, all input files are\n deleted.\n\n\n#### Details\n\nThis function reads the content of each text file in the specified input directory\n and concatenates them into a single string. Each file's content is separated by a newline\n character. The merged content is then saved into a file named after the output name\n in the same directory. Finally, all input files are deleted from the directory.\n\n\n#### Seealso\n\n[`readLines`](#readlines) , [`writeLines`](#writelines) , [`file.remove`](#file.remove)\n\n\n#### Value\n\nThis function does not explicitly return any value. 
It prints a message\n indicating the successful completion of the merging and saving process.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Merge text files in the \"data_files\" directory and save the result in \"merged_output\"\nlfa_merge_and_save(\"data_files\", \"merged_output\")\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_merge_and_save(input_directory, output_name)\n```\n:::\n\n\n\n### `lfa_rd_to_qmd`\n\nConvert Rd File to Markdown\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`rdfile` | The path to the Rd file or a parsed Rd object.\n`outfile` | The path to the output Markdown file (including the file extension).\n`append` | Logical, indicating whether to append to an existing file (default is FALSE).\n\n\n#### Description\n\nIMPORTANT NOTE: \n This function is nearly identical to the `Rd2md::Rd2markdown` function from the `Rd2md` \n package. We needed to implement our own version of it for several reasons:\n \n\n* The algorithm uses hardcoded header sizes (h1 and h2 in the original), which do not fit our use of the Markdown output. \n\n* We needed to add some Quarto Markdown specifics, e.g. to make sure that the examples are not run. \n\n* We want to exclude certain tags from our implementation.\n\n\n#### Details\n\nFor that reason we copied the method, made changes as needed, and added this custom documentation.\n \n This function converts an Rd (R documentation) file to Markdown format (.md) and\n saves the converted file at the specified location. The function allows appending\n to an existing file or creating a new one. 
The resulting Markdown file includes\n sections for the function's name, title, and additional content such as examples,\n usage, arguments, and other sections present in the Rd file.\n \n The function performs the following steps:\n \n\n* Parses the Rd file using the Rd2md package. \n\n* Creates a Markdown file with sections for the function's name, title, and additional content. \n\n* Appends the content to an existing file if `append` is set to TRUE. \n\n* Saves the resulting Markdown file at the specified location.\n\n\n#### Seealso\n\n[`Rd2md::parseRd`](#rd2md::parserd)\n\n\n#### Value\n\nThis function does not explicitly return any value. It saves the converted Markdown file\n at the specified location as described in the details section.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Convert Rd file to Markdown and save it\nlfa_rd_to_qmd(\"path/to/your/file.Rd\", \"path/to/your/output/file.md\")\n\n# Convert Rd file to Markdown and append to an existing file\nlfa_rd_to_qmd(\"path/to/your/file.Rd\", \"path/to/existing/output/file.md\", append = TRUE)\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_rd_to_qmd(rdfile, outfile, append = FALSE)\n```\n:::\n\n\n\n### `lfa_rd_to_results`\n\nConvert Rd Files to Markdown and Merge Results\n\n\n#### Description\n\nThis function converts all Rd (R documentation) files in the \"man\" directory\n to Markdown format (.qmd) and saves the converted files in the \"results/appendix/package-docs\" directory.\n It then merges the converted Markdown files into a single string and saves\n the merged content into a file named \"docs.qmd\" in the \"results/appendix/package-docs\" directory.\n\n\n#### Details\n\nThe function performs the following steps:\n \n\n* Removes any existing \"docs.qmd\" file in the \"results/appendix/package-docs\" directory. \n\n* Finds all Rd files in the \"man\" directory. \n\n* Converts each Rd file to Markdown format (.qmd) using the `lfa_rd_to_qmd` function. 
\n\n* Saves the converted Markdown files in the \"results/appendix/package-docs\" directory. \n\n* Merges the content of all converted Markdown files into a single string. \n\n* Saves the merged content into a file named \"docs.qmd\" in the \"results/appendix/package-docs\" directory.\n\n\n#### Seealso\n\n[`lfa_rd_to_qmd`](#lfardtoqmd) , [`lfa_merge_and_save`](#lfamergeandsave)\n\n\n#### Value\n\nThis function does not explicitly return any value. It performs the conversion,\n merging, and saving operations as described in the details section.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Convert Rd files to Markdown and merge the results\nlfa_rd_to_results()\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_rd_to_results()\n```\n:::\n\n\n\n### `lfa_segmentation`\n\nSegment the elements of a point cloud by trees\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`ctg` | A LAScatalog object. If not NULL, the actions are performed on this object; if NULL, the catalog is inferred from the tile_location\n`tile_location` | A tile_location type object holding the information about the location of the catalog. This is also used to save the catalog after processing.\n\n\n#### Author\n\nJakob Danel\n\n\n#### Description\n\nThis function tries to divide the whole point cloud into individual trees.\n For this purpose, it assigns a `treeID` to each point in every chunk of\n the catalog. 
The algorithm uses the `li2012` implementation with the\n following parameters: `li2012(dt1 = 2, dt2 = 3, R = 2, Zu = 10, hmin = 5, speed_up = 12)` \n NOTE: The operation is in place and cannot be reverted; the old values\n of the point cloud will be deleted!\n\n\n#### Value\n\nA catalog where each chunk has additional `treeID` values indicating the tree each point belongs to.\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_segmentation(ctg, tile_location)\n```\n:::\n\n\n\n### `lfa_set_flag`\n\nSet a flag to indicate the completion of a specific process.\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`flag_name` | A character string specifying the name of the flag file. It should be a descriptive and unique identifier for the process being flagged.\n\n\n#### Description\n\nThis function creates a hidden flag file at a specified location within the working directory to indicate that a particular processing step has been completed. If the flag file already exists, a warning is issued.\n\n\n#### Value\n\nThis function does not have a formal return value.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Set the flag for a process named \"data_processing\"\nlfa_set_flag(\"data_processing\")\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_set_flag(flag_name)\n```\n:::\n\n\n\n", + "markdown": "---\ntitle: \"Forest Data Analysis Report\"\noutput:\n pdf_document:\n latex_engine: xelatex\ntoc: true\ntoc-depth: 2\ntoc-title: Contents\nnumber-sections: true\nnumber-depth: 3\ndate: today\nauthor:\n - name: Jakob Danel\n email: jakob.danel@uni-muenster.de\n url: https://github.com/jakobdanel\n affiliations:\n - name: Universität Münster\n city: Münster\n country: Germany\n - name: Frederick Bruch\n email: f_bruc03@uni-muenster.de\n url: https://www.uni-muenster.de/Geoinformatics/institute/staff/index.php/351/Frederick_Bruch\n affiliations:\n - name: Universität Münster\n city: Münster\n country: 
Germany\nbibliography: references.bib\nexecute-dir: .. \nprefer-html: true\n---\n\n\n# Introduction\n\nThis report documents the analysis of forest data for different tree species.\n\n# Methods\n\n## Data acquisition\n\nOur primary objective is to identify patches where one tree species exhibits a high level of dominance, striving to capture monocultural stands within the diverse forests of Nordrhein-Westfalia (NRW). Recognizing the practical challenges of finding true monocultures, we aim to identify patches where one species is highly dominant, enabling meaningful comparisons across different species.\n\nThe study is framed within the NRW region due to the availability of an easily accessible dataset. Our focus includes four prominent tree species in NRW: oak, beech, spruce, and pine, representing the most prevalent species in the region. To ensure the validity of our findings, we derive three patches for each species, thereby confirming that observed variables are characteristic of a particular species rather than a specific patch. Each patch is carefully selected to encompass an area of approximately 50-100 hectares and contain between 5,000 and 10,000 trees. Striking a balance between relevance and manageability, these patches avoid excessive size to enhance the likelihood of capturing varied species mixes and ensure compatibility with local hardware.\n\nSpecific Goals:\n\n1. Retrieve patches with highly dominant tree species.\n2. Minimize or eliminate the presence of human-made structures within the selected patches.\n\nTo achieve our goals, we utilized the waldmonitor dataset [@welle2014] and the map provided by [@Blickensdoerfer2022], both indicating dominant tree species in NRW. We identified patches of feasible size where both sources predicted the presence of a specific species. 
Further validation involved examining sentinel images of these forest regions to assess the evenness of structures, leaf color distribution, and the absence of significant human-made structures such as roads or buildings. The subsequent preprocessing steps, detailed in the following subsection, involved refining our selected patches and deriving relevant variables, such as tree distribution and density, to ensure that the chosen areas align with the desired research domains.\n\n## Preprocessing\n::: {.cell}\n\n:::\n\n\nIn this research study, the management and processing of a large dataset are crucial considerations. The dataset's substantial size necessitates careful maintenance to ensure efficient handling. Furthermore, the data should be easily processable and editable to facilitate necessary corrections and precalculations within the context of our research objectives. To achieve our goals, we have implemented a framework that automatically derives data based on a shapefile, delineating areas of interest. The processed data and results of precalculations are stored in a straightforward manner to enhance accessibility. Additionally, we have designed functions that establish a user-friendly interface, enabling the execution of algorithms on subsets of the data, such as distinct species. These interfaces are not only directly callable by users but can also be integrated into other functions to automate processes. The overarching aim is to streamline the entire preprocessing workflow using a single script, leveraging only the shapefile as a basis. This subsection details the accomplishments of our R-package in realizing these goals, outlining the preprocessing steps undertaken and justifying their necessity in the context of our research.\n\nThe data are stored in a data subdirectory of the root directory in the format `species/location-name/tile-name`. 
To automate the matching of areas of interest with the catalog from the Land NRW[^1], we utilize the intersecting tool developed by Heisig[^2]. This tool allows for the automatic retrieval and placement of data downloaded from the Land NRW catalog. To enhance data accessibility, we have devised an object that incorporates species, location name, and tile name (the NRW internal identifier) for each area. This object facilitates the specification of the area to be processed. Additionally, we have defined an initialization function that downloads all tiles, returning a list of tile location objects for subsequent processing. A pivotal component of the package's preprocessing functionality is the map function, which iterates over a list of tile locations (effectively the entire dataset) and accepts a processing function as an argument. The following paragraphs outline the specific preprocessing steps employed, all of which are applied through the mapping function.\n\nTo keep memory usage manageable, each of the tiles (one area can span multiple tiles) has been split into smaller chunks. We used a chunk size of 50 m x 50 m, which divides each original 1 km x 1 km file into 400 chunks. These chunks are stored in our directory structure, each in a directory named after its tile name and with an id as the filename. Implementation-wise, the `lidR::catalog_retile` function was used to achieve this segmentation. The resulting smaller chunks allow for efficient iteration during subsequent preprocessing steps.\n\nThe next phase involves reducing our data to the actual size by intersecting the tiles with the defined area of interest. Using the `lidR::merge_spatial` function, we intersect the area derived from the shapefile, removing all point cloud items outside this region. 
Due to our tile-wise approach, empty tiles may arise, and in such cases, those tiles are simply deleted.\n\nFollowing the size reduction of our dataset, the next step involves correcting the `z` values. The `z` values in the data are originally relative to the ellipsoid used for referencing, but we require them to be relative to the ground. To achieve this, we utilize the `lidR::tin` function, which builds a triangulated irregular network (TIN) from all ground points (classified by the data provider) and interpolates the ground height from it; the `z` values are then calculated relative to this surface.\n\nSubsequently, we perform a segmentation into individual trees, marking each item of the point cloud with a tree ID. We employ the algorithm described by @li2012, using parameters `li2012(dt1 = 2, dt2 = 3, R = 2, Zu = 10, hmin = 5, speed_up = 12)`. The meanings of these parameters are explained in Li et al.'s work [@li2012].\n\nFinally, the last preprocessing step involves individual tree detection, seeking a single `POINT` object for each tree. The `lidR::lmf` function, which detects tree tops using a local maximum filter, is utilized for this purpose [@popescu2004]. The results are stored in GeoPackage files within our data structure.\n\nSee @sec-appendix-preprocessing for the implementation of the preprocessing.\n\n[^1]: https://www.opengeodata.nrw.de/produkte/geobasis/hm/3dm_l_las/3dm_l_las/, last visited 7th Dec 2023\n[^2]: https://github.com/joheisig/GEDIcalibratoR, last visited 7th Dec 2023\n\n## Analysis of different distributions\n\nAnalysis of data distributions is a critical aspect of our research, with a focus on comparing two or more distributions. Our objective extends beyond evaluating the disparities between species; we also aim to assess differences within a species. 
To gain a comprehensive understanding of the data, we employ various visualization techniques, including histograms, QQ-Plots (Quantile-Quantile Plots), density functions, and box plots.\n\nIn tandem with visualizations, descriptive statistics, such as means, standard errors, and quantiles, are used to describe the central tendency and variability of the data.\n\nFor a more quantitative analysis of distribution dissimilarity, statistical tests are employed. The Kullback-Leibler (KL) divergence serves as a measure to compare the similarity of a set of distributions. To compute it, each distribution is converted into its density function, with the standard error serving as the bandwidth. Because the KL divergence is asymmetric, it is calculated for each pair of distributions. For two distributions $P$ and $Q$, the KL divergence is defined as follows [@kullback1951kullback]:\n\n$$\nD_{KL}(P \\, \\| \\, Q) = \\sum_i P(i) \\log\\left(\\frac{P(i)}{Q(i)}\\right)\n$$\n\nTo obtain a symmetric score, the Jensen-Shannon Divergence (JSD) is utilized [@briet2009properties], expressed by the formula:\n\n$$\nD_{JS}(P \\, \\| \\, Q) = \\frac{1}{2} D_{KL}(P \\, \\| \\, M) + \\frac{1}{2} D_{KL}(Q \\, \\| \\, M)\n$$\nHere, $M = \\frac{1}{2} (P + Q)$ is the mixture of the two distributions. The JSD provides a balanced measure of dissimilarity between distributions.\n\nAdditionally, the Kolmogorov-Smirnov Test is implemented to assess whether two distributions significantly differ from each other. 
This statistical test offers a formal evaluation of the dissimilarity between empirical distribution functions.\n\n\n# Results\n::: {.cell}\n\n:::\n\n## Researched areas\n\n::: {.cell}\n\n```{.r .cell-code code-fold=\"true\"}\nlibrary(ggplot2)\nsf::sf_use_s2(FALSE)\npatches <- sf::read_sf(\"research_areas.shp\") |> sf::st_centroid()\n\nde <- sf::read_sf(\"results/results/states_de/Bundesländer_2017_mit_Einwohnerzahl.shp\") # Source: https://hub.arcgis.com/datasets/esri-de-content::bundesl%C3%A4nder-2017-mit-einwohnerzahl/explore?location=51.099647%2C10.454033%2C7.43\nnrw <- de[5,] |> sf::st_geometry()\n\n\nggplot() + geom_sf(data = nrw) + \n geom_sf(data = patches, mapping = aes(col = species))\n```\n\n::: {.cell-output-display}\n![Locations of the different patches with the dominant species for that patch. The patch centroids are displayed on a basemap showing the borders of NRW.](report_files/figure-html/fig-patches-nrw-1.png){#fig-patches-nrw width=672}\n:::\n:::\nWe drew three patches for each species from different regions (see @tbl-summary-researched-areas). We downloaded the LiDAR data for those patches and ran all preprocessing steps as described. We then checked, using certain derived parameters (e.g. tree heights, tree distributions or tree density), that all patches contain valid forest data. In that step we discovered that some forest clearance had recently taken place in one patch. This patch was removed from the dataset and replaced with a new one. \n\nIn our research, drawing patches evenly distributed across Nordrhein-Westfalia is inherently constrained by natural factors. Consequently, the patches for oak and pine predominantly originate from the Münsterland region, as illustrated in [@fig-patches-nrw]. For spruce, the patches were derived from Sauerland, reflecting the prevalence of spruce forests in this specific region within NRW, as corroborated by Welle et al. [@welle2014] and Blickensdörfer et al. [@Blickensdoerfer2022]. 
Beech patches, on the other hand, were generated from diverse locations within NRW. Across all patches, no human-made objects were identified, with the exception of small paths for pedestrians and forestry vehicles.\n\nThe total area and number of detections for each of the four species are as follows. Beech covers 69.79 hectares with a total of 5,954 detections, oak spans 63.23 hectares with 5,354 detections, pine extends across 72.86 hectares with 8,912 detections, and spruce encompasses 57.94 hectares with 8,619 detections. Both the number of detections and the corresponding area exhibit a relatively uniform distribution across the diverse patches, as summarized in @tbl-summary-researched-areas. \n\nIn summary, we intentionally chose three patches for each of the four species, each of a practical and usable size for our research objectives. These carefully chosen patches align with the conditions essential for our study, providing comprehensive and representative data for in-depth analysis and meaningful insights into the characteristics of each tree species within the specified areas.\n\n\n::: {#tbl-summary-researched-areas .cell tbl-cap='Summary of researched patches grouped by species, with their location, area and the number of detected trees.'}\n\n```{.r .cell-code code-fold=\"true\"}\nshp <- sf::read_sf(\"research_areas.shp\")\ntable <- lfa::lfa_get_all_areas()\n\nsf::sf_use_s2(FALSE)\nfor (row in 1:nrow(table)) {\n area <-\n dplyr::filter(shp, shp$species == table[row, \"specie\"] &\n shp$name == table[row, \"area\"])\n area_size <- area |> sf::st_area()\n point <- area |> sf::st_centroid() |> sf::st_coordinates()\n table[row,\"point\"] <- paste0(\"(\",round(point[1], digits = 4),\", \",round(point[2],digits = 4),\")\")\n \n table[row, \"area_size\"] = round(area_size,digits = 2) #paste0(round(area_size,digits = 2), \" m²\")\n \n amount_det <- nrow(lfa::lfa_get_detection_area(table[row, \"specie\"], table[row, \"area\"]))\n 
if(is.null(amount_det)){\n cat(nrow(lfa::lfa_get_detection_area(table[row, \"specie\"], table[row, \"area\"])),table[row, \"specie\"],table[row, \"area\"])\n }\n table[row, \"amount_detections\"] = amount_det\n \n # table[row, \"specie\"] <- lfa::lfa_capitalize_first_char(table[row,\"specie\"])\n table[row, \"area\"] <- lfa::lfa_capitalize_first_char(table[row,\"area\"])\n }\ntable$area <- gsub(\"_\", \" \", table$area)\ntable$area <- gsub(\"ue\", \"ü\", table$area)\ntable = table[,!names(table) %in% c(\"specie\")]\n\nknitr::kable(table, \"html\", col.names = c(\"Patch Name\",\"Location\",\"Area size (m²)\",\"Amount tree detections\" ), caption = NULL, digits = 2, escape = TRUE) |>\n kableExtra::kable_styling(\n bootstrap_options = c(\"striped\", \"hold_position\", \"bordered\",\"responsive\"),\n stripe_index = c(1:3,7:9),\n full_width = FALSE\n ) |>\n kableExtra::pack_rows(\"Beech\", 1, 3) |>\n kableExtra::pack_rows(\"Oak\", 4, 6) |>\n kableExtra::pack_rows(\"Pine\", 7, 9) |>\n kableExtra::pack_rows(\"Spruce\", 10, 12) |>\n kableExtra::column_spec(1, bold = TRUE)\n```\n\n::: {.cell-output-display}\n`````{=html}\n\n \n \n \n \n \n \n \n \n\n \n\n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n\n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n\n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n\n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n\n
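<table>
<thead><tr><th>Patch Name</th><th>Location</th><th>Area size (m²)</th><th>Amount tree detections</th></tr></thead>
<tbody>
<tr><td colspan='4'><strong>Beech</strong></td></tr>
<tr><td>Bielefeld brackwede</td><td>(8.5244, 51.9902)</td><td>161410.57</td><td>1443</td></tr>
<tr><td>Billerbeck</td><td>(7.3273, 51.9987)</td><td>185887.25</td><td>1732</td></tr>
<tr><td>Wülfenrath</td><td>(7.0769, 51.2917)</td><td>350621.21</td><td>2779</td></tr>
<tr><td colspan='4'><strong>Oak</strong></td></tr>
<tr><td>Hamm</td><td>(7.8618, 51.6639)</td><td>269397.22</td><td>2441</td></tr>
<tr><td>Münster</td><td>(7.6187, 51.9174)</td><td>164116.61</td><td>1270</td></tr>
<tr><td>Rinkerode</td><td>(7.6744, 51.8598)</td><td>198811.09</td><td>1643</td></tr>
<tr><td colspan='4'><strong>Pine</strong></td></tr>
<tr><td>Greffen</td><td>(8.1697, 51.9913)</td><td>49418.81</td><td>513</td></tr>
<tr><td>Mesum</td><td>(7.5403, 52.2573)</td><td>405072.85</td><td>5031</td></tr>
<tr><td>Telgte</td><td>(7.7816, 52.0024)</td><td>274132.34</td><td>3368</td></tr>
<tr><td colspan='4'><strong>Spruce</strong></td></tr>
<tr><td>Brilon</td><td>(8.5352, 51.4084)</td><td>211478.20</td><td>3342</td></tr>
<tr><td>Oberhundem</td><td>(8.1861, 51.0909)</td><td>151895.53</td><td>2471</td></tr>
<tr><td>Osterwald</td><td>(8.3721, 51.2151)</td><td>216026.43</td><td>2806</td></tr>
</tbody>
</table>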
\n\n`````\n:::\n:::\n\n\n\n\n\n\n\n\n|specie |area | density (1/m²)|\n|:------|:-------------------|---------:|\n|beech |bielefeld_brackwede | 0.0089399|\n|beech |billerbeck | 0.0093175|\n|beech |wuelfenrath | 0.0079259|\n|oak |hamm | 0.0090610|\n|oak |muenster | 0.0077384|\n|oak |rinkerode | 0.0082641|\n|pine |greffen | 0.0103807|\n|pine |mesum | 0.0124200|\n|pine |telgte | 0.0122860|\n|spruce |brilon | 0.0158030|\n|spruce |oberhundem | 0.0162678|\n|spruce |osterwald | 0.0129892|\n\n\n::: {.cell}\n\n```{.r .cell-code}\ndata <- lfa::lfa_get_detections()\ntest_result <- lfa::lfa_run_test_symmetric(data,\"Z\",\"specie\",lfa::lfa_kld)\n```\n\n::: {.cell-output .cell-output-stderr}\n```\nWarning in p/q: longer object length is not a multiple of shorter object length\n\nWarning in p/q: longer object length is not a multiple of shorter object length\n```\n:::\n\n::: {.cell-output .cell-output-stderr}\n```\nWarning in p * log(p/q): longer object length is not a multiple of shorter\nobject length\n```\n:::\n\n::: {.cell-output .cell-output-stderr}\n```\nWarning in p/q: longer object length is not a multiple of shorter object length\n```\n:::\n\n::: {.cell-output .cell-output-stderr}\n```\nWarning in p * log(p/q): longer object length is not a multiple of shorter\nobject length\n```\n:::\n\n::: {.cell-output .cell-output-stderr}\n```\nWarning in p/q: longer object length is not a multiple of shorter object length\n```\n:::\n\n::: {.cell-output .cell-output-stderr}\n```\nWarning in p * log(p/q): longer object length is not a multiple of shorter\nobject length\n```\n:::\n\n::: {.cell-output .cell-output-stderr}\n```\nWarning in p/q: longer object length is not a multiple of shorter object length\n```\n:::\n\n::: {.cell-output .cell-output-stderr}\n```\nWarning in p * log(p/q): longer object length is not a multiple of shorter\nobject length\n```\n:::\n\n::: {.cell-output .cell-output-stderr}\n```\nWarning in p/q: longer object length is not a multiple of shorter object 
length\n```\n:::\n\n```{.r .cell-code}\nlfa::lfa_generate_result_table_tests(test_result, caption = \"This is a table\")\n```\n\n::: {.cell-output-display}\n`````{=html}\n\n\n \n \n \n \n \n \n \n \n \n\n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n\n
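<table><caption>This is a table</caption>
<thead><tr><th></th><th>Beech</th><th>Oak</th><th>Pine</th><th>Spruce</th></tr></thead>
<tbody><tr><th>Beech</th><td>0</td><td>12399</td><td>56763</td><td>24078</td></tr>
<tr><th>Oak</th><td>NA</td><td>0</td><td>41647</td><td>11186</td></tr>
<tr><th>Pine</th><td>NA</td><td>NA</td><td>0</td><td>-22695</td></tr>
<tr><th>Spruce</th><td>NA</td><td>NA</td><td>NA</td><td>0</td></tr></tbody></table>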
\n\n`````\n:::\n:::\n\n\n\n# References\n\n::: {#refs}\n:::\n\n# Appendix\n## Script which can be used to do all preprocessing {#sec-appendix-preprocessing}\n\n::: {.cell}\n\n:::\n\n\nLoad the file with the research areas\n::: {.cell}\n\n```{.r .cell-code}\nsf <- sf::read_sf(here::here(\"research_areas.shp\"))\nprint(sf)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\nSimple feature collection with 12 features and 3 fields\nGeometry type: POLYGON\nDimension: XY\nBounding box: xmin: 7.071625 ymin: 51.0895 xmax: 8.539877 ymax: 52.25983\nGeodetic CRS: WGS 84\n# A tibble: 12 × 4\n id species name geometry\n \n 1 1 oak rinkerode ((7.678922 51.85789, 7.675446 51.85752, 7.…\n 2 2 oak hamm ((7.858955 51.66699, 7.866444 51.66462, 7.…\n 3 3 oak muenster ((7.618908 51.9154, 7.617384 51.9172, 7.61…\n 4 4 pine greffen ((8.168691 51.98965, 8.167178 51.99075, 8.…\n 5 5 pine telgte ((7.779728 52.00662, 7.781616 52.00662, 7.…\n 6 6 pine mesum ((7.534424 52.25499, 7.53378 52.25983, 7.5…\n 7 7 beech bielefeld_brackwede ((8.524749 51.9921, 8.528418 51.99079, 8.5…\n 8 8 beech wuelfenrath ((7.071625 51.29256, 7.072311 51.29334, 7.…\n 9 9 beech billerbeck ((7.324729 51.99783, 7.323548 51.99923, 7.…\n10 11 spruce brilon ((8.532195 51.41029, 8.535027 51.41064, 8.…\n11 12 spruce osterwald ((8.369328 51.21693, 8.371238 51.21718, 8.…\n12 10 spruce oberhundem ((8.18082 51.08999, 8.180868 51.09143, 8.1…\n```\n:::\n:::\n\n\nInitialize the project\n::: {.cell}\n\n```{.r .cell-code}\nlibrary(lfa)\nsf::sf_use_s2(FALSE)\nlocations <- lfa_init(\"research_areas.shp\")\n```\n:::\n\nRun all of the preprocessing steps\n::: {.cell}\n\n```{.r .cell-code}\nlfa_map_tile_locations(locations,retile,check_flag = \"retile\")\n```\n\n::: {.cell-output .cell-output-stdout}\n```\nNo further processing: flag retile is set!Function is already computed, no further computings here\n```\n:::\n\n::: {.cell-output .cell-output-stdout}\n```\nNULL\n```\n:::\n\n```{.r .cell-code}\nlfa_map_tile_locations(locations, 
lfa_intersect_areas, ctg = NULL, areas_sf = sf,check_flag = \"intersect\")\n```\n\n::: {.cell-output .cell-output-stdout}\n```\nNo further processing: flag intersect is set!Function is already computed, no further computings here\n```\n:::\n\n::: {.cell-output .cell-output-stdout}\n```\nNULL\n```\n:::\n\n```{.r .cell-code}\nlfa_map_tile_locations(locations, lfa_ground_correction, ctg = NULL,check_flag = \"z_correction\")\n```\n\n::: {.cell-output .cell-output-stdout}\n```\nNo further processing: flag z_correction is set!Function is already computed, no further computings here\n```\n:::\n\n::: {.cell-output .cell-output-stdout}\n```\nNULL\n```\n:::\n\n```{.r .cell-code}\nlfa_map_tile_locations(locations, lfa_segmentation, ctg = NULL,check_flag = \"segmentation\")\n```\n\n::: {.cell-output .cell-output-stdout}\n```\nNo further processing: flag segmentation is set!Function is already computed, no further computings here\n```\n:::\n\n::: {.cell-output .cell-output-stdout}\n```\nNULL\n```\n:::\n\n```{.r .cell-code}\nlfa_map_tile_locations(locations, lfa_detection, catalog = NULL, write_to_file = TRUE,check_flag = \"detection\")\n```\n\n::: {.cell-output .cell-output-stdout}\n```\nNo further processing: flag detection is set!Function is already computed, no further computings here\n```\n:::\n\n::: {.cell-output .cell-output-stdout}\n```\nNULL\n```\n:::\n:::\n\n\n\n## Documentation\n### `create_qq_plots`\n\nCreate QQ-Plots for combinations of groups in a data frame\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`data` | A data frame containing the data.\n`value_column` | The name of the column containing the values for the QQ-Plot.\n`category_column1` | The name of the first column containing categorical variables for grouping.\n`category_column2` | The name of the second column containing categorical variables for grouping.\n`title` | An optional title for the plot. 
If not provided, a default title is generated based on the data frame name.\n\n\n#### Description\n\nThis function generates QQ-Plots using ggplot2 based on the specified data frame and columns.\n\n\n#### Details\n\nThe function creates QQ-Plots for each combination of unique values in `category_column1` and `category_column2` \n comparing the quantiles of one category against another.\n\n\n#### Value\n\nA ggplot object representing the QQ-Plots arranged in an n x n grid.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Assuming you have a data frame 'your_data' with columns 'value', 'category1', and 'category2'\ncreate_qq_plots(your_data, \"value\", \"category1\", \"category2\", title = \"QQ-Plots\")\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\ncreate_qq_plots(\n data,\n value_column,\n category_column1,\n category_column2,\n title = NULL\n)\n```\n:::\n\n\n\n### `lfa_capitalize_first_char`\n\nCapitalize First Character of a String\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`input_string` | A single-character string to be processed.\n\n\n#### Concept\n\nString Manipulation\n\n\n#### Description\n\nThis function takes a string as input and returns the same string with the\n first character capitalized. If the first character is already capitalized,\n the function does nothing. If the first character is not from the alphabet,\n an error is thrown.\n\n\n#### Details\n\nThis function performs the following steps:\n \n\n* Checks if the input is a single-character string. \n\n* Verifies if the first character is from the alphabet (A-Z or a-z). \n\n* If the first character is not already capitalized, it capitalizes it. 
\n\n* Returns the modified string.\n\n\n#### Keyword\n\nalphabet\n\n\n#### Note\n\nThis function is case-sensitive and assumes ASCII characters.\n\n\n#### References\n\nNone\n\n\n#### Seealso\n\nThis function is related to the basic string manipulation functions in base R.\n\n\n#### Value\n\nA modified string with the first character capitalized if it is\n not already. If the first character is already capitalized, the original\n string is returned.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Capitalize the first character of a string\nlfa_capitalize_first_char(\"hello\") # Returns \"Hello\"\nlfa_capitalize_first_char(\"World\") # Returns \"World\"\n\n# Error example (non-alphabetic first character)\nlfa_capitalize_first_char(\"123abc\") # Throws an error\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_capitalize_first_char(input_string)\n```\n:::\n\n\n\n### `lfa_check_flag`\n\nCheck if a flag is set, indicating the completion of a specific process.\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`flag_name` | A character string specifying the name of the flag file. It should be a descriptive and unique identifier for the process being checked.\n\n\n#### Description\n\nThis function checks for the existence of a hidden flag file at a specified location within the working directory. If the flag file is found, a message is printed, and the function returns `TRUE` to indicate that the associated processing step has already been completed. 
If the flag file is not found, the function returns `FALSE` , indicating that further processing can proceed.\n\n\n#### Value\n\nA logical value indicating whether the flag is set ( `TRUE` ) or not ( `FALSE` ).\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Check if the flag for a process named \"data_processing\" is set\nlfa_check_flag(\"data_processing\")\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_check_flag(flag_name)\n```\n:::\n\n\n\n### `lfa_combine_sf_obj`\n\nCombine Spatial Feature Objects from Multiple GeoPackage Files\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`paths` | A character vector containing file paths to GeoPackage files with neighbor information.\n`area_infos` | A data frame or list containing information about the corresponding detection areas, including \"area\" and \"specie\" columns.\n\n\n#### Description\n\nThis function reads spatial feature objects (sf) from multiple GeoPackage files and combines them into a single sf object.\n Each GeoPackage file is assumed to contain neighbor information for a specific detection area, and the resulting sf object\n includes additional columns indicating the corresponding area and species information.\n\n\n#### Value\n\nA combined sf object with additional columns for area and specie information.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Assuming paths and area_infos are defined\ncombined_sf <- lfa_combine_sf_obj(paths, area_infos)\n\n# Print the combined sf object\nprint(combined_sf)\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_combine_sf_obj(paths, area_infos)\n```\n:::\n\n\n\n### `lfa_create_boxplot`\n\nCreate a box plot from a data frame\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`data` | A data frame containing the data.\n`value_column` | The name of the column containing the values for the box plot.\n`category_column1` | The name of the column 
containing the first categorical variable.\n`category_column2` | The name of the column containing the second categorical variable.\n`title` | An optional title for the plot. If not provided, a default title is generated based on the data frame name.\n\n\n#### Description\n\nThis function generates a box plot using ggplot2 based on the specified data frame and columns.\n\n\n#### Details\n\nThe function creates a box plot where the x-axis is based on the second categorical variable,\n the y-axis is based on the specified value column, and the box plots are colored based on the first\n categorical variable. The grouping of box plots is done based on the unique values in the second categorical variable.\n\n\n#### Value\n\nA ggplot object representing the box plot.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Assuming you have a data frame 'your_data' with columns 'value', 'category1', and 'category2'\nlfa_create_boxplot(your_data, \"value\", \"category1\", \"category2\")\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_create_boxplot(\n data,\n value_column,\n category_column1,\n category_column2,\n title = NULL\n)\n```\n:::\n\n\n\n### `lfa_create_density_plots`\n\nCreate density plots for groups in a data frame\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`data` | A data frame containing the data.\n`value_column` | The name of the column containing the values for the density plot.\n`category_column1` | The name of the column containing the categorical variable for grouping.\n`category_column2` | The name of the column containing the categorical variable for arranging plots.\n`title` | An optional title for the plot. If not provided, a default title is generated based on the data frame name.\n`xlims` | Optional limits for the x-axis. Should be a numeric vector with two elements (lower and upper bounds).\n`ylims` | Optional limits for the y-axis. 
Should be a numeric vector with two elements (lower and upper bounds).\n\n\n#### Description\n\nThis function generates density plots using ggplot2 based on the specified data frame and columns.\n\n\n#### Details\n\nThe function creates density plots where the x-axis is based on the specified value column,\n and the density plots are colored based on the first categorical variable. The arrangement of plots\n is done based on the unique values in the second categorical variable. The plots are arranged in a 2x2 grid.\n\n\n#### Value\n\nA ggplot object representing the density plots arranged in a 2x2 grid.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Assuming you have a data frame 'your_data' with columns 'value', 'category1', and 'category2'\nlfa_create_density_plots(your_data, \"value\", \"category1\", \"category2\", title = \"Density Plots\", xlims = c(0, 10), ylims = c(0, 0.5))\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_create_density_plots(\n data,\n value_column,\n category_column1,\n category_column2,\n title = NULL,\n xlims = NULL,\n ylims = NULL\n)\n```\n:::\n\n\n\n### `lfa_create_stacked_distributions_plot`\n\nCreate a stacked distribution plot for tree detections, visualizing the distribution\n of a specified variable on the x-axis, differentiated by another variable.\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`trees` | A data frame containing tree detection data.\n`x_value` | A character string specifying the column name used for finding the values on the x-axis of the histogram.\n`fill_value` | A character string specifying the column name by which the data are differentiated in the plot.\n`bin` | An integer specifying the number of bins for the histogram. Default is 100.\n`ylab` | A character string specifying the y-axis label. Default is \"Amount trees.\"\n`xlim` | A numeric vector of length 2 specifying the x-axis limits. 
Default is c(0, 100).\n`ylim` | A numeric vector of length 2 specifying the y-axis limits. Default is c(0, 1000).\n`title` | The title of the plot.\n\n\n#### Description\n\nThis function generates a stacked distribution plot using the ggplot2 package,\n providing a visual representation of the distribution of a specified variable\n ( `x_value` ) on the x-axis, with differentiation based on another variable\n ( `fill_value` ). The data for the plot are derived from the provided `trees` \n data frame.\n\n\n#### Keyword\n\ndata\n\n\n#### Seealso\n\n[`ggplot2::geom_histogram`](#ggplot2::geomhistogram) , [`ggplot2::facet_wrap`](#ggplot2::facetwrap) ,\n [`ggplot2::ylab`](#ggplot2::ylab) , [`ggplot2::scale_fill_brewer`](#ggplot2::scalefillbrewer) ,\n [`ggplot2::coord_cartesian`](#ggplot2::coordcartesian)\n\n\n#### Value\n\nA ggplot object representing the stacked distribution plot.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Create a stacked distribution plot for variable \"Z,\" differentiated by \"area\"\ntrees <- lfa_get_detections()\nlfa_create_stacked_distributions_plot(trees, \"Z\", \"area\")\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_create_stacked_distributions_plot(\n trees,\n x_value,\n fill_value,\n bin = 100,\n ylab = \"Amount trees\",\n xlim = c(0, 100),\n ylim = c(0, 1000),\n title =\n \"Histograms of height distributions between species 'beech', 'oak', 'pine' and 'spruce' divided by the different areas of Interest\"\n)\n```\n:::\n\n\n\n### `lfa_create_stacked_histogram`\n\nCreate a stacked histogram for tree detections, summing up the values for each species.\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`trees` | A data frame containing tree detection data.\n`x_value` | A character string specifying the column name used for finding the values on the x-axis of the histogram.\n`fill_value` | A character string specifying the column name by which the data are differentiated in the 
plot.\n`bin` | An integer specifying the number of bins for the histogram. Default is 30.\n`ylab` | A character string specifying the y-axis label. Default is \"Frequency.\"\n`xlim` | A numeric vector of length 2 specifying the x-axis limits. Default is c(0, 100).\n`ylim` | A numeric vector of length 2 specifying the y-axis limits. Default is NULL.\n\n\n#### Description\n\nThis function generates a stacked histogram using the ggplot2 package,\n summing up the values for each species and visualizing the distribution of\n a specified variable ( `x_value` ) on the x-axis, differentiated by another\n variable ( `fill_value` ). The data for the plot are derived from the provided\n `trees` data frame.\n\n\n#### Keyword\n\ndata\n\n\n#### Seealso\n\n[`ggplot2::geom_histogram`](#ggplot2::geomhistogram) , [`ggplot2::ylab`](#ggplot2::ylab) ,\n [`ggplot2::scale_fill_brewer`](#ggplot2::scalefillbrewer) , [`ggplot2::coord_cartesian`](#ggplot2::coordcartesian)\n\n\n#### Value\n\nA ggplot object representing the stacked histogram.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Create a stacked histogram for variable \"Z,\" differentiated by \"area\"\ntrees <- lfa_get_detections()\nlfa_create_stacked_histogram(trees, \"Z\", \"area\")\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_create_stacked_histogram(\n trees,\n x_value,\n fill_value,\n bin = 30,\n ylab = \"Frequency\",\n xlim = c(0, 100),\n ylim = NULL\n)\n```\n:::\n\n\n\n### `lfa_create_tile_location_objects`\n\nCreate tile location objects\n\n\n#### Author\n\nJakob Danel\n\n\n#### Description\n\nThis function traverses a directory structure to find LAZ files and creates\n tile location objects for each file. The function looks into the `data` \n directory of the repository/working directory. It then creates `tile_location` \n objects based on the folder structure. 
The folder structure should not be\n touched by hand, but created by `lfa_init_data_structure()` which builds the\n structure based on a shape file.\n\n\n#### Seealso\n\n[`tile_location`](#tilelocation)\n\n\n#### Value\n\nA vector containing tile location objects.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_create_tile_location_objects()\n\nlfa_create_tile_location_objects()\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_create_tile_location_objects()\n```\n:::\n\n\n\n### `lfa_detection`\n\nPerform tree detection on a lidar catalog and optionally save the results to a file.\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`catalog` | A lidar catalog containing point cloud data. If set to NULL, the function attempts to read the catalog from the specified tile location.\n`tile_location` | An object specifying the location of the lidar tile. If catalog is NULL, the function attempts to read the catalog from this tile location.\n`write_to_file` | A logical value indicating whether to save the detected tree information to a file. Default is TRUE.\n\n\n#### Description\n\nThis function utilizes lidar data to detect trees within a specified catalog. The detected tree information can be optionally saved to a file in the GeoPackage format. 
The function uses parallel processing to enhance efficiency.\n\n\n#### Value\n\nAn sf-style data frame containing information about the detected trees.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Perform tree detection on a catalog and save the results to a file\nlfa_detection(catalog = my_catalog, tile_location = my_tile_location, write_to_file = TRUE)\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_detection(catalog, tile_location, write_to_file = TRUE)\n```\n:::\n\n\n\n### `lfa_download_areas`\n\nDownload areas based on spatial features\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`sf_areas` | Spatial features representing areas to be downloaded. It must include the columns \"species\" and \"name\". See details for more information.\n\n\n#### Author\n\nJakob Danel\n\n\n#### Description\n\nThis function initiates the data structure and downloads areas based on spatial features.\n\n\n#### Details\n\nThe input data frame, `sf_areas` , must have the following columns:\n \n\n* \"species\": The species associated with the area. \n\n* \"name\": The name of the area. 
\n \n The function uses the `lfa_init_data_structure` function to set up the data structure\n and then iterates through the rows of `sf_areas` to download each specified area.\n\n\n#### Value\n\nNone\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_download_areas(sf_areas)\n\n\n# Example spatial features data frame\nsf_areas <- data.frame(\nspecies = c(\"SpeciesA\", \"SpeciesB\"),\nname = c(\"Area1\", \"Area2\"),\n# Must include also other attributes specialized to sf objects\n# such as geometry, for processing of the download\n)\n\nlfa_download_areas(sf_areas)\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_download_areas(sf_areas)\n```\n:::\n\n\n\n### `lfa_download`\n\nDownload a LAS file for a specific location from the state of NRW\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`species` | The species of the tree which is observed at this location\n`name` | The name of the area that is observed\n`location` | An sf object, which holds the location information for the area where the tile should be downloaded from.\n\n\n#### Description\n\nIt downloads the file and saves it to `data/<species>/<name>` under the name of the tile\n\n\n#### Value\n\nThe LASCatalog object of the downloaded file\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_download(species, name, location)\n```\n:::\n\n\n\n### `lfa_find_n_nearest_trees`\n\nFind n Nearest Trees\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`trees` | A sf object containing tree coordinates.\n`n` | The number of nearest trees to find for each tree (default is 100).\n\n\n#### Description\n\nThis function calculates the distances to the n nearest trees for each tree in the input dataset.\n\n\n#### Value\n\nA data frame with additional columns representing the distances to the n nearest trees.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Load 
tree data using lfa_get_detections() (not provided)\ntree_data <- lfa_get_detections()\n\n# Filter tree data for a specific species and area\ntree_data = tree_data[tree_data$specie == \"pine\" & tree_data$area == \"greffen\", ]\n\n# Find the 100 nearest trees for each tree in the filtered dataset\ntree_data <- lfa_find_n_nearest_trees(tree_data)\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_find_n_nearest_trees(trees, n = 100)\n```\n:::\n\n\n\n### `lfa_generate_result_table_tests`\n\nGenerate Result Table for Tests\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`table` | A data frame representing the result table.\n\n\n#### Description\n\nThis function generates a result table for tests using the knitr::kable function.\n\n\n#### Details\n\nThis function uses the knitr::kable function to create a formatted table, making it suitable for HTML output.\n The input table is expected to be a data frame with test results, and the resulting table will have capitalized\n row and column names with lines between columns and rows.\n\n\n#### Value\n\nA formatted table suitable for HTML output with lines between columns and rows.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Generate a result table for tests\nresult_table <- data.frame(\nTest1 = c(0.05, 0.10, 0.03),\nTest2 = c(0.02, 0.08, 0.01),\nTest3 = c(0.08, 0.12, 0.05)\n)\nformatted_table <- lfa_generate_result_table_tests(result_table)\nprint(formatted_table)\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_generate_result_table_tests(table, caption = \"Table Caption\")\n```\n:::\n\n\n\n### `lfa_get_all_areas`\n\nRetrieve a data frame containing all species and corresponding areas.\n\n\n#### Description\n\nThis function scans the \"data\" directory within the current working directory to\n obtain a list of species. It then iterates through each species to retrieve the list\n of areas associated with that species. 
The resulting data frame contains two columns:\n \"specie\" representing the species and \"area\" representing the corresponding area.\n\n\n#### Keyword\n\ndata\n\n\n#### Seealso\n\n[`list.dirs`](#list.dirs)\n\n\n#### Value\n\nA data frame with columns \"specie\" and \"area\" containing information about\n all species and their associated areas.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Retrieve a data frame with information about all species and areas\nall_areas_df <- lfa_get_all_areas()\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_get_all_areas()\n```\n:::\n\n\n\n### `lfa_get_detection_area`\n\nGet Detection for an area\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`species` | A character string specifying the target species.\n`name` | A character string specifying the name of the tile.\n\n\n#### Description\n\nRetrieves the tree detection information for a specified species and tile.\n\n\n#### Details\n\nThis function reads tree detection data from geopackage files within the specified tile location for a given species. It then combines the data into a single SF data frame and returns it. 
The function assumes that the tree detection files follow a naming convention with the pattern \"_detection.gpkg\".\n\n\n#### Keyword\n\nspatial\n\n\n#### References\n\nThis function is part of the LiDAR Forest Analysis (LFA) package.\n\n\n#### Seealso\n\n[`get_tile_dir`](#gettiledir)\n\n\n#### Value\n\nA Simple Features (SF) data frame containing tree detection information for the specified species and tile.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Retrieve tree detection data for species \"example_species\" in tile \"example_tile\"\ntrees_data <- lfa_get_detection_area(\"example_species\", \"example_tile\")\n\n# Example usage:\ntrees_data <- lfa_get_detection_area(\"example_species\", \"example_tile\")\n\n# No trees found scenario:\nempty_data <- lfa_get_detection_area(\"nonexistent_species\", \"nonexistent_tile\")\n# The result will be an empty data frame if no trees are found for the specified species and tile.\n\n# Error handling:\n# In case of invalid inputs, the function may throw errors. 
Ensure correct species and tile names are provided.\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_get_detection_area(species, name)\n```\n:::\n\n\n\n### `lfa_get_detections_species`\n\nRetrieve detections for a specific species.\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`species` | A character string specifying the target species.\n\n\n#### Description\n\nThis function retrieves detection data for a given species from multiple areas.\n\n\n#### Details\n\nThe function looks for detection data in the \"data\" directory for the specified species.\n It then iterates through each subdirectory (representing different areas) and consolidates the\n detection data into a single data frame.\n\n\n#### Value\n\nA data frame containing detection information for the specified species in different areas.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Example usage:\ndetections_data <- lfa_get_detections_species(\"example_species\")\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_get_detections_species(species)\n```\n:::\n\n\n\n### `lfa_get_detections`\n\nRetrieve aggregated detection data for multiple species.\n\n\n#### Concept\n\ndata retrieval functions\n\n\n#### Description\n\nThis function obtains aggregated detection data for multiple species by iterating\n through the list of species obtained from [`lfa_get_species`](#lfagetspecies) . 
For each\n species, it calls [`lfa_get_detections_species`](#lfagetdetectionsspecies) to retrieve the\n corresponding detection data and aggregates the results into a single data frame.\n The resulting data frame includes columns for the species, tree detection data,\n and the area in which the detections occurred.\n\n\n#### Keyword\n\naggregation\n\n\n#### Seealso\n\n[`lfa_get_species`](#lfagetspecies) , [`lfa_get_detections_species`](#lfagetdetectionsspecies) \n \n Other data retrieval functions:\n [`lfa_get_species`](#lfagetspecies)\n\n\n#### Value\n\nA data frame containing aggregated detection data for multiple species.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_get_detections()\n\n# Retrieve aggregated detection data for multiple species\ndetections_data <- lfa_get_detections()\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_get_detections()\n```\n:::\n\n\n\n### `lfa_get_flag_path`\n\nGet the path to a flag file indicating the completion of a specific process.\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`flag_name` | A character string specifying the name of the flag file. It should be a descriptive and unique identifier for the process being flagged.\n\n\n#### Description\n\nThis function constructs and returns the path to a hidden flag file, which serves as an indicator that a particular processing step has been completed. 
The flag file is created in a designated location within the working directory.\n\n\n#### Value\n\nA character string representing the absolute path to the hidden flag file.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Get the flag path for a process named \"data_processing\"\nlfa_get_flag_path(\"data_processing\")\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_get_flag_path(flag_name)\n```\n:::\n\n\n\n### `lfa_get_neighbor_paths`\n\nGet Paths to Neighbor GeoPackage Files\n\n\n#### Description\n\nThis function retrieves the file paths to GeoPackage files containing neighbor information for each detection area.\n The GeoPackage files are assumed to be named \"neighbours.gpkg\" and organized in a directory structure under the \"data\" folder.\n\n\n#### Value\n\nA character vector containing file paths to GeoPackage files for each detection area's neighbors.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Get paths to neighbor GeoPackage files for all areas\npaths <- lfa_get_neighbor_paths()\n\n# Print the obtained file paths\nprint(paths)\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_get_neighbor_paths()\n```\n:::\n\n\n\n### `lfa_get_species`\n\nGet a list of species from the data directory.\n\n\n#### Concept\n\ndata retrieval functions\n\n\n#### Description\n\nThis function retrieves a list of species by scanning the \"data\" directory\n located in the current working directory.\n\n\n#### Keyword\n\ndata\n\n\n#### References\n\nThis function relies on the [`list.dirs`](#list.dirs) function for directory listing.\n\n\n#### Seealso\n\n[`list.dirs`](#list.dirs) \n \n Other data retrieval functions:\n [`lfa_get_detections`](#lfagetdetections)\n\n\n#### Value\n\nA character vector containing the names of species found in the \"data\" directory.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Retrieve the list of species\nspecies_list <- lfa_get_species()\n```\n:::\n\n\n#### Usage\n\n::: 
{.cell}\n\n```{.r .cell-code}\nlfa_get_species()\n```\n:::\n\n\n\n### `lfa_ground_correction`\n\nCorrect the point clouds for correct ground imagery\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`ctg` | An LASCatalog object. If not NULL, the actions are performed on this object; if NULL, the catalog is inferred from the tile_location.\n`tile_location` | A tile_location type object holding the information about the location of the catalog. This is also used to save the catalog after processing.\n\n\n#### Author\n\nJakob Danel\n\n\n#### Description\n\nThis function is needed to correct the Z value of the point cloud, relative to the real\n ground height. After applying this function to your catalog, the Z values can be read as the\n real elevation above the ground. At the moment the function uses the `tin()` function from\n the `lidR` package. NOTE : The operation is in place and cannot be reverted; the old values\n of the point cloud will be deleted!\n\n\n#### Value\n\nA catalog with the corrected z values. The catalog is always stored at tile_location and\n holds only the transformed values.\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_ground_correction(ctg, tile_location)\n```\n:::\n\n\n\n### `lfa_init_data_structure`\n\nInitialize data structure for species and areas\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`sf_species` | A data frame with information about species and associated areas.\n\n\n#### Description\n\nThis function initializes the data structure for storing species and associated areas.\n\n\n#### Details\n\nThe input data frame, `sf_species` , should have at least the following columns:\n \n\n* \"species\": The names of the species for which the data structure needs to be initialized. \n\n* \"name\": The names of the associated areas. \n \n The function creates directories based on the species and area information provided in\n the `sf_species` data frame. 
It checks whether the directories already exist and creates\n them if they don't.\n\n\n#### Value\n\nNone\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Example species data frame\nsf_species <- data.frame(\nspecies = c(\"SpeciesA\", \"SpeciesB\"),\nname = c(\"Area1\", \"Area2\"),\n# Other necessary columns\n)\n\nlfa_init_data_structure(sf_species)\n\n# Example species data frame\nsf_species <- data.frame(\nspecies = c(\"SpeciesA\", \"SpeciesB\"),\nname = c(\"Area1\", \"Area2\"),\n# Other necessary columns\n)\n\nlfa_init_data_structure(sf_species)\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_init_data_structure(sf_species)\n```\n:::\n\n\n\n### `lfa_init`\n\nInitialize LFA (LiDAR forest analysis) data processing\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`sf_file` | A character string specifying the path to the shapefile containing spatial features of research areas.\n\n\n#### Description\n\nThis function initializes the LFA data processing by reading a shapefile containing\n spatial features of research areas, downloading the specified areas, and creating\n tile location objects for each area.\n\n\n#### Details\n\nThis function reads a shapefile ( `sf_file` ) using the `sf` package, which should\n contain information about research areas. It then calls the `lfa_download_areas` \n function to download the specified areas and `lfa_create_tile_location_objects` \n to create tile location objects based on Lidar data files in those areas. The\n shapefile MUST follow the following requirements:\n \n\n* Each geometry must be a single object of type polygon \n\n* Each entry must have the following attributes: \n\n* species: A string describing the tree species of the area. 
\n\n* name: A string describing the location of the area.\n\n\n#### Value\n\nA vector containing tile location objects.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Initialize LFA processing with the default shapefile\nlfa_init()\n\n# Initialize LFA processing with a custom shapefile\nlfa_init(\"custom_areas.shp\")\n\n# Example usage with the default shapefile\nlfa_init()\n\n# Example usage with a custom shapefile\nlfa_init(\"custom_areas.shp\")\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_init(sf_file = \"research_areas.shp\")\n```\n:::\n\n\n\n### `lfa_intersect_areas`\n\nIntersect Lidar Catalog with Spatial Features\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`ctg` | A LAScatalog object representing the Lidar data to be processed.\n`tile_location` | A tile location object representing the specific area of interest.\n`areas_sf` | Spatial features defining areas.\n\n\n#### Description\n\nThis function intersects a Lidar catalog with a specific area defined by spatial features.\n\n\n#### Details\n\nThe function intersects the Lidar catalog specified by `ctg` with a specific area defined by\n the `tile_location` object and `areas_sf` . It removes points outside the specified area and\n returns a modified LAScatalog object.\n \n The specified area is identified based on the `species` and `name` attributes in the\n `tile_location` object. If a matching area is not found in `areas_sf` , the function\n stops with an error.\n \n The function then transforms the spatial reference of the identified area to match that of\n the Lidar catalog using `sf::st_transform` .\n \n The processing is applied to each chunk in the catalog using the `identify_area` function,\n which merges spatial information and filters out points that are not classified as inside\n the identified area. 
After processing, the function writes the modified LAS files back to\n the original file locations, removing points outside the specified area.\n \n If an error occurs during the processing of a chunk, a warning is issued, and the function\n continues processing the next chunks. If no points are found after filtering, a warning is\n issued, and NULL is returned.\n\n\n#### Seealso\n\nOther functions in the Lidar forest analysis (LFA) package.\n\n\n#### Value\n\nA modified LAScatalog object with points outside the specified area removed.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Example usage\nlfa_intersect_areas(ctg, tile_location, areas_sf)\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_intersect_areas(ctg, tile_location, areas_sf)\n```\n:::\n\n\n\n### `lfa_jsd`\n\nJensen-Shannon Divergence Calculation\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`p` | A numeric vector representing the probability distribution P.\n`q` | A numeric vector representing the probability distribution Q.\n`epsilon` | A small positive constant added to both P and Q to avoid logarithm of zero. 
Default is 1e-10.\n\n\n#### Description\n\nThis function calculates the Jensen-Shannon Divergence (JSD) between two probability distributions P and Q.\n\n\n#### Details\n\nThe JSD is computed using the Kullback-Leibler Divergence (KLD) as follows:\n `sum((p * log((p + epsilon) / (m + epsilon)) + q * log((q + epsilon) / (m + epsilon))) / 2)` \n where `m = (p + q) / 2` .\n\n\n#### Seealso\n\n[`kld`](#kld) , [`sum`](#sum) , [`log`](#log)\n\n\n#### Value\n\nA numeric value representing the Jensen-Shannon Divergence between P and Q.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Calculate JSD between two probability distributions\np_distribution <- c(0.2, 0.3, 0.5)\nq_distribution <- c(0.1, 0, 0.9)\njsd_result <- lfa_jsd(p_distribution, q_distribution)\nprint(jsd_result)\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_jsd(p, q, epsilon = 1e-10)\n```\n:::\n\n\n\n### `lfa_kld`\n\nKullback-Leibler Divergence Calculation\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`p` | A numeric vector representing the probability distribution P.\n`q` | A numeric vector representing the probability distribution Q.\n`epsilon` | A small positive constant added to both P and Q to avoid logarithm of zero. 
Default is 1e-10.\n\n\n#### Description\n\nThis function calculates the Kullback-Leibler Divergence (KLD) between two probability distributions P and Q.\n\n\n#### Details\n\nThe KLD is computed using the formula:\n `sum(p * log((p + epsilon) / (q + epsilon)))` \n This avoids issues when the denominator (Q) contains zero probabilities.\n\n\n#### Seealso\n\n[`sum`](#sum) , [`log`](#log)\n\n\n#### Value\n\nA numeric value representing the Kullback-Leibler Divergence between P and Q.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Calculate KLD between two probability distributions\np_distribution <- c(0.2, 0.3, 0.5)\nq_distribution <- c(0.1, 0, 0.9)\nkld_result <- lfa_kld(p_distribution, q_distribution)\nprint(kld_result)\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_kld(p, q, epsilon = 1e-10)\n```\n:::\n\n\n\n### `lfa_ks_test`\n\nKolmogorov-Smirnov Test Wrapper Function\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`x` | A numeric vector representing the first sample.\n`y` | A numeric vector representing the second sample.\n`output_variable` | A character string specifying the output variable to extract from the ks.test result. Default is \"p.value\". Other possible values include \"statistic\" and \"alternative\".\n`...` | Additional arguments to be passed to the ks.test function.\n\n\n#### Description\n\nThis function serves as a wrapper for the Kolmogorov-Smirnov (KS) test between two samples.\n\n\n#### Details\n\nThe function uses the ks.test function to perform a two-sample KS test and returns the specified output variable.\n The default output variable is the p-value. 
Other possible output variables include \"statistic\" and \"alternative\".\n\n\n#### Seealso\n\n[`ks.test`](#ks.test)\n\n\n#### Value\n\nA numeric value representing the specified output variable from the KS test result.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Perform KS test and extract the p-value\nresult <- lfa_ks_test(sample1, sample2)\nprint(result)\n\n# Perform KS test and extract the test statistic\nresult_statistic <- lfa_ks_test(sample1, sample2, output_variable = \"statistic\")\nprint(result_statistic)\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_ks_test(x, y, output_variable = \"p.value\", ...)\n```\n:::\n\n\n\n### `lfa_load_ctg_if_not_present`\n\nLoading the catalog if it is not present\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`ctg` | Catalog object. Can be NULL\n`tile_location` | The location to look for the catalog tiles, if they are not present\n\n\n#### Description\n\nThis function checks if the catalog is `NULL` . 
If it is, it loads the\n catalog from the `tile_location` .\n\n\n#### Value\n\nThe provided ctg object if not null, else the catalog for the tiles\n of the tile_location.\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_load_ctg_if_not_present(ctg, tile_location)\n```\n:::\n\n\n\n### `lfa_map_tile_locations`\n\nMap Function Over Tile Locations\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`tile_locations` | A list of tile location objects.\n`map_function` | The mapping function to be applied to each tile location.\n`...` | Additional arguments to be passed to the mapping function.\n\n\n#### Description\n\nThis function applies a specified mapping function to each tile location in a list.\n\n\n#### Details\n\nThis function iterates over each tile location in the provided list ( `tile_locations` )\n and applies the specified mapping function ( `map_function` ) to each tile location.\n The mapping function should accept a tile location object as its first argument, and\n additional arguments can be passed using the ellipsis ( `...` ) syntax.\n \n This function is useful for performing operations on multiple tile locations concurrently,\n such as loading Lidar data, processing areas, or other tasks that involve tile locations.\n\n\n#### Seealso\n\nThe mapping function provided should be compatible with the structure and requirements\n of the tile locations and the specific task being performed.\n\n\n#### Value\n\nNone\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Example usage\nlfa_map_tile_locations(tile_locations, my_mapping_function, param1 = \"value\")\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_map_tile_locations(tile_locations, map_function, check_flag = NULL, ...)\n```\n:::\n\n\n\n### `lfa_merge_and_save`\n\nMerge and Save Text Files in a Directory\n\n\n#### Arguments\n\nArgument 
|Description\n------------- |----------------\n`input_directory` | The path to the input directory containing text files.\n`output_name` | The name for the output file where the merged content will be saved.\n\n\n#### Description\n\nThis function takes an input directory and an output name as arguments.\n It merges the textual content of all files in the specified directory into\n a single string, with each file's content separated by a newline character.\n The merged content is then saved into a file named after the output name\n in the same directory. After the merging is complete, all input files are\n deleted.\n\n\n#### Details\n\nThis function reads the content of each text file in the specified input directory\n and concatenates them into a single string. Each file's content is separated by a newline\n character. The merged content is then saved into a file named after the output name\n in the same directory. Finally, all input files are deleted from the directory.\n\n\n#### Seealso\n\n[`readLines`](#readlines) , [`writeLines`](#writelines) , [`file.remove`](#file.remove)\n\n\n#### Value\n\nThis function does not explicitly return any value. 
It prints a message\n indicating the successful completion of the merging and saving process.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Merge text files in the \"data_files\" directory and save the result in \"merged_output\"\nlfa_merge_and_save(\"data_files\", \"merged_output\")\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_merge_and_save(input_directory, output_name)\n```\n:::\n\n\n\n### `lfa_random_forest`\n\nRandom Forest Classifier with Leave-One-Out Cross-Validation\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`tree_data` | A data frame containing the tree data, including the response variable (\"specie\") and predictor variables.\n`excluded_input_columns` | A character vector specifying columns to be excluded from predictor variables.\n`response_variable` | The response variable to be predicted (default is \"specie\").\n`seed` | An integer to set the seed for reproducibility (default is 123).\n`...` | Additional parameters to be passed to the randomForest function.\n\n\n#### Description\n\nThis function performs a random forest classification using leave-one-out cross-validation for each area in the input tree data.\n It returns a list containing various results, including predicted species, confusion matrix, accuracy, and the formula used for modeling.\n\n\n#### Value\n\nA list containing the following elements:\n \n\n* `predicted_species_absolute` : A data frame with observed and predicted species for each area. \n\n* `predicted_species_relative` : A data frame with the relative predictions per species and area, normalized by the total predictions in each area. \n\n* `confusion_matrix` : A confusion matrix showing the counts of predicted vs. observed species. 
\n\n* `accuracy` : The accuracy of the model, calculated as the sum of diagonal elements in the confusion matrix divided by the total count. \n\n* `formula` : The formula used for modeling.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Assuming tree_data is defined\nresults <- lfa_random_forest(tree_data, excluded_input_columns = c(\"column1\", \"column2\"))\n\n# Print the list of results\nprint(results)\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_random_forest(\n tree_data,\n excluded_input_columns,\n response_variable = \"specie\",\n seed = 123,\n ...\n)\n```\n:::\n\n\n\n### `lfa_rd_to_qmd`\n\nConvert Rd File to Markdown\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`rdfile` | The path to the Rd file or a parsed Rd object.\n`outfile` | The path to the output Markdown file (including the file extension).\n`append` | Logical, indicating whether to append to an existing file (default is FALSE).\n\n\n#### Description\n\nIMPORTANT NOTE: \n This function is nearly identical to the `Rd2md::Rd2markdown` function from the `Rd2md` \n package. We needed to implement our own version of it for several reasons:\n \n\n* The algorithm uses hardcoded header sizes (h1 and h2 in the original), which is not feasible for our use of the markdown. \n\n* We needed to add some Quarto Markdown specifics, e.g. to make sure that the examples will not be run. \n\n* We want to exclude certain tags from our implementation.\n\n\n#### Details\n\nFor that reason we copied the method, made changes as needed, and added this custom documentation.\n \n This function converts an Rd (R documentation) file to Markdown format (.md) and\n saves the converted file at the specified location. The function allows appending\n to an existing file or creating a new one. 
The resulting Markdown file includes\n sections for the function's name, title, and additional content such as examples,\n usage, arguments, and other sections present in the Rd file.\n \n The function performs the following steps:\n \n\n* Parses the Rd file using the Rd2md package. \n\n* Creates a Markdown file with sections for the function's name, title, and additional content. \n\n* Appends the content to an existing file if `append` is set to TRUE. \n\n* Saves the resulting Markdown file at the specified location.\n\n\n#### Seealso\n\n[`Rd2md::parseRd`](#rd2md::parserd)\n\n\n#### Value\n\nThis function does not explicitly return any value. It saves the converted Markdown file\n at the specified location as described in the details section.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Convert Rd file to Markdown and save it\nlfa_rd_to_qmd(\"path/to/your/file.Rd\", \"path/to/your/output/file.md\")\n\n# Convert Rd file to Markdown and append to an existing file\nlfa_rd_to_qmd(\"path/to/your/file.Rd\", \"path/to/existing/output/file.md\", append = TRUE)\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_rd_to_qmd(rdfile, outfile, append = FALSE)\n```\n:::\n\n\n\n### `lfa_rd_to_results`\n\nConvert Rd Files to Markdown and Merge Results\n\n\n#### Description\n\nThis function converts all Rd (R documentation) files in the \"man\" directory\n to Markdown format (.qmd) and saves the converted files in the \"results/appendix/package-docs\" directory.\n It then merges the converted Markdown files into a single string and saves\n the merged content into a file named \"docs.qmd\" in the \"results/appendix/package-docs\" directory.\n\n\n#### Details\n\nThe function performs the following steps:\n \n\n* Removes any existing \"docs.qmd\" file in the \"results/appendix/package-docs\" directory. \n\n* Finds all Rd files in the \"man\" directory. \n\n* Converts each Rd file to Markdown format (.qmd) using the `lfa_rd_to_qmd` function. 
\n\n* Saves the converted Markdown files in the \"results/appendix/package-docs\" directory. \n\n* Merges the content of all converted Markdown files into a single string. \n\n* Saves the merged content into a file named \"docs.qmd\" in the \"results/appendix/package-docs\" directory.\n\n\n#### Seealso\n\n[`lfa_rd_to_qmd`](#lfardtoqmd) , [`lfa_merge_and_save`](#lfamergeandsave)\n\n\n#### Value\n\nThis function does not explicitly return any value. It performs the conversion,\n merging, and saving operations as described in the details section.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Convert Rd files to Markdown and merge the results\nlfa_rd_to_results()\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_rd_to_results()\n```\n:::\n\n\n\n### `lfa_read_area_as_catalog`\n\nRead LiDAR data from a specified species and location as a catalog.\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`specie` | A character string specifying the species of interest.\n`location_name` | A character string specifying the name of the location.\n\n\n#### Description\n\nThis function constructs the file path based on the specified `specie` and `location_name` ,\n lists the directories at that path, and reads the LiDAR data into a `lidR::LAScatalog` .\n\n\n#### Value\n\nA `lidR::LAScatalog` object containing the LiDAR data from the specified location and species.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_read_area_as_catalog(\"beech\", \"location1\")\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_read_area_as_catalog(specie, location_name)\n```\n:::\n\n\n\n### `lfa_run_test_asymmetric`\n\nAsymmetric Pairwise Test for Categories\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`data` | A data frame containing the relevant columns.\n`data_column` | A character string specifying the column containing the numerical data.\n`category_column` | A character string 
specifying the column containing the categorical variable.\n`test_function` | A function used to perform the pairwise test between two sets of data. It should accept two vectors of numeric data and additional parameters specified by `...` . The function should return a numeric value representing the test result.\n`...` | Additional parameters to be passed to the `test_function` .\n\n\n#### Description\n\nThis function performs an asymmetric pairwise test for categories using a user-defined `test_function` .\n\n\n#### Details\n\nThe function calculates the test results for each unique combination of categories using the specified\n `test_function` . The resulting table is asymmetric, containing the test results for comparisons\n from the rows to the columns.\n\n\n#### Seealso\n\n[`outer`](#outer) , [`Vectorize`](#vectorize)\n\n\n#### Value\n\nA data frame representing the results of the asymmetric pairwise tests between categories.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Define a custom test function\ncustom_test_function <- function(x, y) {\n# Your test logic here\n# Return a numeric result\nreturn(mean(x) - mean(y))\n}\n\n# Perform an asymmetric pairwise test\nresult <- lfa_run_test_asymmetric(your_data, \"numeric_column\", \"category_column\", custom_test_function)\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_run_test_asymmetric(data, data_column, category_column, test_function, ...)\n```\n:::\n\n\n\n### `lfa_run_test_symmetric`\n\nSymmetric Pairwise Test for Categories\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`data` | A data frame containing the relevant columns.\n`data_column` | A character string specifying the column containing the numerical data.\n`category_column` | A character string specifying the column containing the categorical variable.\n`test_function` | A function used to perform the pairwise test between two sets of data. 
It should accept two vectors of numeric data and additional parameters specified by `...` . The function should return a numeric value representing the test result.\n`...` | Additional parameters to be passed to the `test_function` .\n\n\n#### Description\n\nThis function performs a symmetric pairwise test for categories using a user-defined `test_function` .\n\n\n#### Details\n\nThe function calculates the test results for each unique combination of categories using the specified\n `test_function` . The resulting table is symmetric, containing the test results for comparisons\n from the rows to the columns. The upper triangle of the matrix is filled with `NA` to avoid duplicate results.\n\n\n#### Seealso\n\n[`outer`](#outer) , [`Vectorize`](#vectorize)\n\n\n#### Value\n\nA data frame representing the results of the symmetric pairwise tests between categories.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Define a custom test function\ncustom_test_function <- function(x, y) {\n# Your test logic here\n# Return a numeric result\nreturn(mean(x) - mean(y))\n}\n\n# Perform a symmetric pairwise test\nresult <- lfa_run_test_symmetric(your_data, \"numeric_column\", \"category_column\", custom_test_function)\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_run_test_symmetric(data, data_column, category_column, test_function, ...)\n```\n:::\n\n\n\n### `lfa_save_all_neighbours`\n\nSave Neighbors for All Areas\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`n` | The number of nearest trees to find for each tree (default is 100).\n\n\n#### Description\n\nThis function iterates through all detection areas, finds the n nearest trees for each tree,\n and saves the result to a GeoPackage file for each area.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Save neighbors for all areas with default value (n=100)\nlfa_save_all_neighbours()\n\n# Save neighbors for all areas with a specific value of n (e.g., 
n=50)\nlfa_save_all_neighbours(n = 50)\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_save_all_neighbours(n = 100)\n```\n:::\n\n\n\n### `lfa_segmentation`\n\nSegment the elements of a point cloud by trees\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`ctg` | A LAScatalog object. If not NULL, the actions are performed on this object; if NULL, the catalog is inferred from the tile_location.\n`tile_location` | A tile_location type object holding the information about the location of the catalog. This is also used to save the catalog after processing.\n\n\n#### Author\n\nJakob Danel\n\n\n#### Description\n\nThis function tries to divide the whole point cloud into individual trees.\n To do so, it assigns a `treeID` to each point in every chunk of the catalog.\n The algorithm uses the `li2012` implementation with the\n following parameters: `li2012(dt1 = 2, dt2 = 3, R = 2, Zu = 10, hmin = 5, speed_up = 12)` \n NOTE : The operation is in place and cannot be reverted; the old values\n of the point cloud will be deleted!\n\n\n#### Value\n\nA catalog where each chunk has additional `treeID` values indicating the tree each point belongs to.\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_segmentation(ctg, tile_location)\n```\n:::\n\n\n\n### `lfa_set_flag`\n\nSet a flag to indicate the completion of a specific process.\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`flag_name` | A character string specifying the name of the flag file. It should be a descriptive and unique identifier for the process being flagged.\n\n\n#### Description\n\nThis function creates a hidden flag file at a specified location within the working directory to indicate that a particular processing step has been completed. 
If the flag file already exists, a warning is issued.\n\n\n#### Value\n\nThis function does not have a formal return value.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Set the flag for a process named \"data_processing\"\nlfa_set_flag(\"data_processing\")\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_set_flag(flag_name)\n```\n:::\n\n\n\n", "supporting": [ - "report_files/figure-html" + "report_files" ], "filters": [ "rmarkdown/pagebreak.lua" diff --git a/results/_freeze/report/execute-results/tex.json b/results/_freeze/report/execute-results/tex.json index 751984e..b2c2fe5 100644 --- a/results/_freeze/report/execute-results/tex.json +++ b/results/_freeze/report/execute-results/tex.json @@ -1,7 +1,7 @@ { - "hash": "e1408acaec76e2d97b0a0348b71ffcbc", + "hash": "b7fee1a3103561f53a1d4910d3f0d0b4", "result": { - "markdown": "---\ntitle: \"Forest Data Analysis Report\"\noutput:\n pdf_document:\n latex_engine: xelatex\ntoc: true\ntoc-depth: 2\ntoc-title: Contents\nnumber-sections: true\nnumber-depth: 3\ndate: today\nauthor: Jakob Danel and Frederick Bruch\nbibliography: references.bib\nexecute-dir: .. \nprefer-html: true\n---\n\n\n# Introduction\n\nThis report documents the analysis of forest data for different tree species.\n\n# Methods\n\n## Data acquisition\n\nOur primary objective is to identify patches where one tree species exhibits a high level of dominance, striving to capture monocultural stands within the diverse forests of Nordrhein-Westfalia (NRW). Recognizing the practical challenges of finding true monocultures, we aim to identify patches where one species is highly dominant, enabling meaningful comparisons across different species.\n\nThe study is framed within the NRW region due to the availability of an easily accessible dataset. Our focus includes four prominent tree species in NRW: oak, beech, spruce, and pine, representing the most prevalent species in the region. 
To ensure the validity of our findings, we derive three patches for each species, thereby confirming that observed variables are characteristic of a particular species rather than a specific patch. Each patch is carefully selected to encompass an area of approximately 50-100 hectares and contain between 5,000 and 10,000 trees. Striking a balance between relevance and manageability, these patches avoid excessive size to enhance the likelihood of capturing varied species mixes and ensure compatibility with local hardware.\n\nSpecific Goals:\n\n1. Retrieve patches with highly dominant tree species.\n2. Minimize or eliminate the presence of human-made structures within the selected patches.\n\nTo achieve our goals, we utilized the waldmonitor dataset [@welle2014] and the map provided by [@Blickensdoerfer2022], both indicating dominant tree species in NRW. We identified patches of feasible size where both sources predicted the presence of a specific species. Further validation involved examining sentinel images of these forest regions to assess the evenness of structures, leaf color distribution, and the absence of significant human-made structures such as roads or buildings. The subsequent preprocessing steps, detailed in the following subsection, involved refining our selected patches and deriving relevant variables, such as tree distribution and density, to ensure that the chosen areas align with the desired research domains.\n\n## Preprocessing\n::: {.cell}\n\n:::\n\n\nIn this research study, the management and processing of a large dataset are crucial considerations. The dataset's substantial size necessitates careful maintenance to ensure efficient handling. Furthermore, the data should be easily processable and editable to facilitate necessary corrections and precalculations within the context of our research objectives. To achieve our goals, we have implemented a framework that automatically derives data based on a shapefile, delineating areas of interest. 
The processed data and results of precalculations are stored in a straightforward manner to enhance accessibility. Additionally, we have designed functions that establish a user-friendly interface, enabling the execution of algorithms on subsets of the data, such as distinct species. These interfaces are not only directly callable by users but can also be integrated into other functions to automate processes. The overarching aim is to streamline the entire preprocessing workflow using a single script, leveraging only the shapefile as a basis. This subsection details the accomplishments of our R-package in realizing these goals, outlining the preprocessing steps undertaken and justifying their necessity in the context of our research.\n\nThe data are stored in a data subdirectory of the root directory in the format `species/location-name/tile-name`. To automate the matching of areas of interest with the catalog from the Land NRW[^1], we utilize the intersecting tool developed by Heisig[^2]. This tool allows for the automatic retrieval and placement of data downloaded from the Land NRW catalog. To enhance data accessibility, we have devised an object that incorporates species, location name, and tile name (the NRW internal identifier) for each area. This object facilitates the specification of the area to be processed. Additionally, we have defined an initialization function that downloads all tiles, returning a list of tile location objects for subsequent processing. A pivotal component of the package's preprocessing functionality is the map function, which iterates over a list of tile locations (effectively the entire dataset) and accepts a processing function as an argument. 
The subsequent paragraph outlines the specific preprocessing steps employed, all of which are implemented within the mapping function.\n\nTo facilitate memory-handling capabilities, each of the tiles, where one area can span multiple tiles, has been split into manageable chunks. 
We employed a 50x50m size for each tile, resulting in the division of original 1km x 1km files into 400 tiles. These tiles are stored in our directory structure, with each tile housed in a directory named after its tile name and assigned an id as the filename. Implementation-wise, the `lidR::catalog_retile` function was instrumental in achieving this segmentation. The resulting smaller chunks allow for efficient iteration during subsequent preprocessing steps.\n\nThe next phase involves reducing our data to the actual size by intersecting the tiles with the defined area of interest. Using the `lidR::merge_spatial` function, we intersect the area derived from the shapefile, removing all point cloud items outside this region. Due to our tile-wise approach, empty tiles may arise, and in such cases, those tiles are simply deleted.\n\nFollowing the size reduction to our dataset, the next step involves correcting the `z` values. The `z` values in the data are originally relative to the ellipsoid used for referencing, but we require them to be relative to the ground. To achieve this, we utilize the `lidR::tin` function, which interpolates a triangulated irregular network (TIN) between all ground points (classified by the data provider) and calculates the z value based on this structure.\n\nSubsequently, we aim to perform segmentation for each distinct tree, marking each item of the point cloud with a tree ID. We employ the algorithm described by @li2012, using parameters `li2012(dt1 = 2, dt2 = 3, R = 2, Zu = 10, hmin = 5, speed_up = 12)`. The meanings of these parameters are elucidated in Li et al.'s work [@li2012].\n\nFinally, the last preprocessing step involves individual tree detection, seeking a single `POINT` object for each tree. The `lidR::lmf` function, a tree detection method based on a local maximum filter, is utilized for this purpose [@popescu2004]. 
The results are stored in GeoPackage files within our data structure.\n\nSee @sec-appendix-preprocessing for the implementation of the preprocessing.\n\n[^1]: https://www.opengeodata.nrw.de/produkte/geobasis/hm/3dm_l_las/3dm_l_las/, last visited 7th Dec 2023\n[^2]: https://github.com/joheisig/GEDIcalibratoR, last visited 7th Dec 2023\n\n\n\n# Results\n::: {.cell}\n\n:::\n\n## Researched areas\n\n::: {.cell}\n\n```{.r .cell-code code-fold=\"true\"}\nlibrary(ggplot2)\nsf::sf_use_s2(FALSE)\npatches <- sf::read_sf(\"research_areas.shp\") |> sf::st_centroid()\n\nde <- sf::read_sf(\"results/results/states_de/Bundesländer_2017_mit_Einwohnerzahl.shp\") # Source: https://hub.arcgis.com/datasets/esri-de-content::bundesl%C3%A4nder-2017-mit-einwohnerzahl/explore?location=51.099647%2C10.454033%2C7.43\nnrw <- de[5,] |> sf::st_geometry()\n\n\nggplot() + geom_sf(data = nrw) + \n geom_sf(data = patches, mapping = aes(col = species))\n```\n\n::: {.cell-output-display}\n![Locations of the different patches with the dominant species for that patch. The patch centroids are displayed on a basemap showing the borders of NRW.](report_files/figure-pdf/fig-patches-nrw-1.pdf){#fig-patches-nrw fig-pos='H'}\n:::\n:::\nWe drew three patches for each species from different regions (see @tbl-summary-researched-areas). We downloaded the LiDAR data for those patches and ran all preprocessing steps as described. We then checked, using derived parameters (e.g. tree heights, tree distributions or tree density), that all patches contain valid forest data. In that step we discovered that forest clearance had recently taken place in one patch. This patch was removed from the dataset and replaced with a new one. \n\nIn our research, drawing patches evenly distributed across Nordrhein-Westfalen is inherently constrained by natural factors. Consequently, the patches for oak and pine predominantly originate from the Münsterland region, as illustrated in [@fig-patches-nrw]. 
For spruce, the patches were derived from Sauerland, reflecting the prevalence of spruce forests in this specific region within NRW, as corroborated by Welle et al. [@welle2014] and Blickensdörfer et al. [@Blickensdoerfer2022]. Beech patches, on the other hand, were generated from diverse locations within NRW. Across all patches, no human-made objects were identified, with the exception of small paths for pedestrians and forestry vehicles.\n\nThe total area and number of detections for each of the four species are as follows. Beech covers 69.79 hectares with a total of 5,954 detections, oak spans 63.23 hectares with 5,354 detections, pine extends across 72.86 hectares with 8,912 detections, and spruce encompasses 57.94 hectares with 8,619 detections. Both the number of detections and the corresponding area exhibit a relatively uniform distribution across the diverse patches, as summarized in @tbl-summary-researched-areas. \n\nWith the selected dataset described, we intentionally chose three patches for each of the four species that exhibit a practical and usable size for our research objectives. 
These carefully chosen patches align with the conditions essential for our study, providing comprehensive and representative data for in-depth analysis and meaningful insights into the characteristics of each tree species within the specified areas.\n\n\n::: {#tbl-summary-researched-areas .cell tbl-cap='Summary of researched patches grouped by species, with their location, area and the amount of detected trees.'}\n\n```{.r .cell-code code-fold=\"true\"}\nshp <- sf::read_sf(\"research_areas.shp\")\ntable <- lfa::lfa_get_all_areas()\n\nsf::sf_use_s2(FALSE)\nfor (row in 1:nrow(table)) {\n area <-\n dplyr::filter(shp, shp$species == table[row, \"specie\"] &\n shp$name == table[row, \"area\"])\n area_size <- area |> sf::st_area()\n point <- area |> sf::st_centroid() |> sf::st_coordinates()\n table[row,\"point\"] <- paste0(\"(\",round(point[1], digits = 4),\", \",round(point[2],digits = 4),\")\")\n \n table[row, \"area_size\"] = round(area_size,digits = 2) #paste0(round(area_size,digits = 2), \" m²\")\n \n amount_det <- nrow(lfa::lfa_get_detection_area(table[row, \"specie\"], table[row, \"area\"]))\n if(is.null(amount_det)){\n cat(nrow(lfa::lfa_get_detection_area(table[row, \"specie\"], table[row, \"area\"])),table[row, \"specie\"],table[row, \"area\"])\n }\n table[row, \"amount_detections\"] = amount_det\n \n # table[row, \"specie\"] <- lfa::lfa_capitalize_first_char(table[row,\"specie\"])\n table[row, \"area\"] <- lfa::lfa_capitalize_first_char(table[row,\"area\"])\n }\ntable$area <- gsub(\"_\", \" \", table$area)\ntable$area <- gsub(\"ue\", \"ü\", table$area)\ntable = table[,!names(table) %in% c(\"specie\")]\n\nknitr::kable(table, \"html\", col.names = c(\"Patch Name\",\"Location\",\"Area size (m²)\",\"Amount tree detections\" ), caption = NULL, digits = 2, escape = TRUE) |>\n kableExtra::kable_styling(\n bootstrap_options = c(\"striped\", \"hold_position\", \"bordered\",\"responsive\"),\n stripe_index = c(1:3,7:9),\n full_width = FALSE\n ) |>\n 
kableExtra::pack_rows(\"Beech\", 1, 3) |>\n kableExtra::pack_rows(\"Oak\", 4, 6) |>\n kableExtra::pack_rows(\"Pine\", 7, 9) |>\n kableExtra::pack_rows(\"Spruce\", 10, 12) |>\n kableExtra::column_spec(1, bold = TRUE)\n```\n\n::: {.cell-output-display}\n`````{=html}\n\n \n \n \n \n \n \n \n \n\n \n\n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n\n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n\n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n\n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n\n
Patch Name Location Area size (m²) Amount tree detections
Beech
Bielefeld brackwede (8.5244, 51.9902) 161410.57 1443
Billerbeck (7.3273, 51.9987) 185887.25 1732
Wülfenrath (7.0769, 51.2917) 350621.21 2779
Oak
Hamm (7.8618, 51.6639) 269397.22 2441
Münster (7.6187, 51.9174) 164116.61 1270
Rinkerode (7.6744, 51.8598) 198811.09 1643
Pine
Greffen (8.1697, 51.9913) 49418.81 513
Mesum (7.5403, 52.2573) 405072.85 5031
Telgte (7.7816, 52.0024) 274132.34 3368
Spruce
Brilon (8.5352, 51.4084) 211478.20 3342
Oberhundem (8.1861, 51.0909) 151895.53 2471
Osterwald (8.3721, 51.2151) 216026.43 2806
\n\n`````\n:::\n:::\n\n\n\n\n\n\n\n\n|specie |area | density (1/m²)|\n|:------|:-------------------|---------:|\n|beech |bielefeld_brackwede | 0.0089399|\n|beech |billerbeck | 0.0093175|\n|beech |wuelfenrath | 0.0079259|\n|oak |hamm | 0.0090610|\n|oak |muenster | 0.0077384|\n|oak |rinkerode | 0.0082641|\n|pine |greffen | 0.0103807|\n|pine |mesum | 0.0124200|\n|pine |telgte | 0.0122860|\n|spruce |brilon | 0.0158030|\n|spruce |oberhundem | 0.0162678|\n|spruce |osterwald | 0.0129892|\n\n# References\n\n::: {#refs}\n:::\n\n# Appendix\n## Script which can be used to do all preprocessing {#sec-appendix-preprocessing}\n\n::: {.cell}\n\n:::\n\n\nLoad the file with the research areas\n::: {.cell}\n\n```{.r .cell-code}\nsf <- sf::read_sf(here::here(\"research_areas.shp\"))\nprint(sf)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\nSimple feature collection with 12 features and 3 fields\nGeometry type: POLYGON\nDimension: XY\nBounding box: xmin: 7.071625 ymin: 51.0895 xmax: 8.539877 ymax: 52.25983\nGeodetic CRS: WGS 84\n# A tibble: 12 x 4\n id species name geometry\n \n 1 1 oak rinkerode ((7.678922 51.85789, 7.675446 51.85752, 7.~\n 2 2 oak hamm ((7.858955 51.66699, 7.866444 51.66462, 7.~\n 3 3 oak muenster ((7.618908 51.9154, 7.617384 51.9172, 7.61~\n 4 4 pine greffen ((8.168691 51.98965, 8.167178 51.99075, 8.~\n 5 5 pine telgte ((7.779728 52.00662, 7.781616 52.00662, 7.~\n 6 6 pine mesum ((7.534424 52.25499, 7.53378 52.25983, 7.5~\n 7 7 beech bielefeld_brackwede ((8.524749 51.9921, 8.528418 51.99079, 8.5~\n 8 8 beech wuelfenrath ((7.071625 51.29256, 7.072311 51.29334, 7.~\n 9 9 beech billerbeck ((7.324729 51.99783, 7.323548 51.99923, 7.~\n10 11 spruce brilon ((8.532195 51.41029, 8.535027 51.41064, 8.~\n11 12 spruce osterwald ((8.369328 51.21693, 8.371238 51.21718, 8.~\n12 10 spruce oberhundem ((8.18082 51.08999, 8.180868 51.09143, 8.1~\n```\n:::\n:::\n\n\nInit the project\n::: {.cell}\n\n```{.r .cell-code}\nlibrary(lfa)\nsf::sf_use_s2(FALSE)\nlocations <- 
lfa_init(\"research_areas.shp\")\n```\n:::\n\nDo all of the preprocessing steps\n::: {.cell}\n\n```{.r .cell-code}\nlfa_map_tile_locations(locations,retile,check_flag = \"retile\")\n```\n\n::: {.cell-output .cell-output-stdout}\n```\nNo further processing: flag retile is set!Function is already computed, no further computings here\n```\n:::\n\n::: {.cell-output .cell-output-stdout}\n```\nNULL\n```\n:::\n\n```{.r .cell-code}\nlfa_map_tile_locations(locations, lfa_intersect_areas, ctg = NULL, areas_sf = sf,check_flag = \"intersect\")\n```\n\n::: {.cell-output .cell-output-stdout}\n```\nNo further processing: flag intersect is set!Function is already computed, no further computings here\n```\n:::\n\n::: {.cell-output .cell-output-stdout}\n```\nNULL\n```\n:::\n\n```{.r .cell-code}\nlfa_map_tile_locations(locations, lfa_ground_correction, ctg = NULL,check_flag = \"z_correction\")\n```\n\n::: {.cell-output .cell-output-stdout}\n```\nNo further processing: flag z_correction is set!Function is already computed, no further computings here\n```\n:::\n\n::: {.cell-output .cell-output-stdout}\n```\nNULL\n```\n:::\n\n```{.r .cell-code}\nlfa_map_tile_locations(locations, lfa_segmentation, ctg = NULL,check_flag = \"segmentation\")\n```\n\n::: {.cell-output .cell-output-stdout}\n```\nNo further processing: flag segmentation is set!Function is already computed, no further computings here\n```\n:::\n\n::: {.cell-output .cell-output-stdout}\n```\nNULL\n```\n:::\n\n```{.r .cell-code}\nlfa_map_tile_locations(locations, lfa_detection, catalog = NULL, write_to_file = TRUE,check_flag = \"detection\")\n```\n\n::: {.cell-output .cell-output-stdout}\n```\nNo further processing: flag detection is set!Function is already computed, no further computings here\n```\n:::\n\n::: {.cell-output .cell-output-stdout}\n```\nNULL\n```\n:::\n:::\n\n\n\n## Documentation\n### `lfa_capitalize_first_char`\n\nCapitalize First Character of a String\n\n\n#### Arguments\n\nArgument |Description\n------------- 
|----------------\n`input_string` | A single character string (a character vector of length one) to be processed.\n\n\n#### Concept\n\nString Manipulation\n\n\n#### Description\n\nThis function takes a string as input and returns the same string with the\n first character capitalized. If the first character is already capitalized,\n the function does nothing. If the first character is not from the alphabet,\n an error is thrown.\n\n\n#### Details\n\nThis function performs the following steps:\n \n\n* Checks if the input is a single character string. \n\n* Verifies if the first character is from the alphabet (A-Z or a-z). \n\n* If the first character is not already capitalized, it capitalizes it. \n\n* Returns the modified string.\n\n\n#### Keyword\n\nalphabet\n\n\n#### Note\n\nThis function is case-sensitive and assumes ASCII characters.\n\n\n#### References\n\nNone\n\n\n#### Seealso\n\nThis function is related to the basic string manipulation functions in base R.\n\n\n#### Value\n\nA modified string with the first character capitalized if it is\n not already. If the first character is already capitalized, the original\n string is returned.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Capitalize the first character of a string\nlfa_capitalize_first_char(\"hello\") # Returns \"Hello\"\nlfa_capitalize_first_char(\"World\") # Returns \"World\"\n\n# Error example (non-alphabetic first character)\nlfa_capitalize_first_char(\"123abc\") # Throws an error\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_capitalize_first_char(input_string)\n```\n:::\n\n\n\n### `lfa_check_flag`\n\nCheck if a flag is set, indicating the completion of a specific process.\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`flag_name` | A character string specifying the name of the flag file. 
It should be a descriptive and unique identifier for the process being checked.\n\n\n#### Description\n\nThis function checks for the existence of a hidden flag file at a specified location within the working directory. If the flag file is found, a message is printed, and the function returns `TRUE` to indicate that the associated processing step has already been completed. If the flag file is not found, the function returns `FALSE`, indicating that further processing can proceed.\n\n\n#### Value\n\nA logical value indicating whether the flag is set (`TRUE`) or not (`FALSE`).\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Check if the flag for a process named \"data_processing\" is set\nlfa_check_flag(\"data_processing\")\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_check_flag(flag_name)\n```\n:::\n\n\n\n### `lfa_create_tile_location_objects`\n\nCreate tile location objects\n\n\n#### Author\n\nJakob Danel\n\n\n#### Description\n\nThis function traverses a directory structure to find LAZ files and creates\n tile location objects for each file. The function looks into the `data` \n directory of the repository/working directory. It then creates `tile_location` \n objects based on the folder structure. 
The folder structure should not be\n touched by hand, but created by `lfa_init_data_structure()` which builds the\n structure based on a shape file.\n\n\n#### Seealso\n\n[`tile_location`](#tilelocation)\n\n\n#### Value\n\nA vector containing tile location objects.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_create_tile_location_objects()\n\nlfa_create_tile_location_objects()\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_create_tile_location_objects()\n```\n:::\n\n\n\n### `lfa_detection`\n\nPerform tree detection on a lidar catalog and optionally save the results to a file.\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`catalog` | A lidar catalog containing point cloud data. If set to NULL, the function attempts to read the catalog from the specified tile location.\n`tile_location` | An object specifying the location of the lidar tile. If catalog is NULL, the function attempts to read the catalog from this tile location.\n`write_to_file` | A logical value indicating whether to save the detected tree information to a file. Default is TRUE.\n\n\n#### Description\n\nThis function utilizes lidar data to detect trees within a specified catalog. The detected tree information can be optionally saved to a file in the GeoPackage format. 
The function uses parallel processing to enhance efficiency.\n\n\n#### Value\n\nA sf style data frame containing information about the detected trees.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Perform tree detection on a catalog and save the results to a file\nlfa_detection(catalog = my_catalog, tile_location = my_tile_location, write_to_file = TRUE)\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_detection(catalog, tile_location, write_to_file = TRUE)\n```\n:::\n\n\n\n### `lfa_download_areas`\n\nDownload areas based on spatial features\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`sf_areas` | Spatial features representing areas to be downloaded. It must include columns like \"species\" \"name\" See details for more information.\n\n\n#### Author\n\nJakob Danel\n\n\n#### Description\n\nThis function initiates the data structure and downloads areas based on spatial features.\n\n\n#### Details\n\nThe input data frame, `sf_areas` , must have the following columns:\n \n\n* \"species\": The species associated with the area. \n\n* \"name\": The name of the area. 
\n \n The function uses the `lfa_init_data_structure` function to set up the data structure\n and then iterates through the rows of `sf_areas` to download each specified area.\n\n\n#### Value\n\nNone\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Example spatial features data frame\nsf_areas <- data.frame(\nspecies = c(\"SpeciesA\", \"SpeciesB\"),\nname = c(\"Area1\", \"Area2\")\n# Must also include the other attributes of sf objects,\n# such as geometry, for processing of the download\n)\n\nlfa_download_areas(sf_areas)\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_download_areas(sf_areas)\n```\n:::\n\n\n\n### `lfa_download`\n\nDownload a LAS file provided by the state of NRW for a specific location\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`species` | The species of the tree which is observed at this location\n`name` | The name of the area that is observed\n`location` | An sf object, which holds the location information for the area where the tile should be downloaded from.\n\n\n#### Description\n\nIt will download the file and save it to `data/<species>/<name>/` with the name of the tile\n\n\n#### Value\n\nThe LASCatalog object of the downloaded file\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_download(species, name, location)\n```\n:::\n\n\n\n### `lfa_get_detection_area`\n\nGet Detection for an area\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`species` | A character string specifying the target species.\n`name` | A character string specifying the name of the tile.\n\n\n#### Description\n\nRetrieves the tree detection information for a specified species and tile.\n\n\n#### Details\n\nThis function reads tree detection data from geopackage files within the specified tile location for a given species. 
It then combines the data into a single SF data frame and returns it. The function assumes that the tree detection files follow a naming convention with the pattern \"_detection.gpkg\".\n\n\n#### Keyword\n\nspatial\n\n\n#### References\n\nThis function is part of the LiDAR Forest Analysis (LFA) package.\n\n\n#### Seealso\n\n[`get_tile_dir`](#gettiledir)\n\n\n#### Value\n\nA Simple Features (SF) data frame containing tree detection information for the specified species and tile.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Retrieve tree detection data for species \"example_species\" in tile \"example_tile\"\ntrees_data <- lfa_get_detection_area(\"example_species\", \"example_tile\")\n\n# No trees found scenario:\nempty_data <- lfa_get_detection_area(\"nonexistent_species\", \"nonexistent_tile\")\n# The result will be an empty data frame if no trees are found for the specified species and tile.\n\n# Error handling:\n# In case of invalid inputs, the function may throw errors. 
Ensure correct species and tile names are provided.\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_get_detection_area(species, name)\n```\n:::\n\n\n\n### `lfa_get_detections_species`\n\nRetrieve detections for a specific species.\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`species` | A character string specifying the target species.\n\n\n#### Description\n\nThis function retrieves detection data for a given species from multiple areas.\n\n\n#### Details\n\nThe function looks for detection data in the \"data\" directory for the specified species.\n It then iterates through each subdirectory (representing different areas) and consolidates the\n detection data into a single data frame.\n\n\n#### Value\n\nA data frame containing detection information for the specified species in different areas.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Example usage:\ndetections_data <- lfa_get_detections_species(\"example_species\")\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_get_detections_species(species)\n```\n:::\n\n\n\n### `lfa_get_detections`\n\nRetrieve aggregated detection data for multiple species.\n\n\n#### Concept\n\ndata retrieval functions\n\n\n#### Description\n\nThis function obtains aggregated detection data for multiple species by iterating\n through the list of species obtained from [`lfa_get_species`](#lfagetspecies) . 
For each\n species, it calls [`lfa_get_detections_species`](#lfagetdetectionsspecies) to retrieve the\n corresponding detection data and aggregates the results into a single data frame.\n The resulting data frame includes columns for the species, tree detection data,\n and the area in which the detections occurred.\n\n\n#### Keyword\n\naggregation\n\n\n#### Seealso\n\n[`lfa_get_species`](#lfagetspecies) , [`lfa_get_detections_species`](#lfagetdetectionsspecies) \n \n Other data retrieval functions:\n [`lfa_get_species`](#lfagetspecies)\n\n\n#### Value\n\nA data frame containing aggregated detection data for multiple species.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_get_detections()\n\n# Retrieve aggregated detection data for multiple species\ndetections_data <- lfa_get_detections()\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_get_detections()\n```\n:::\n\n\n\n### `lfa_get_flag_path`\n\nGet the path to a flag file indicating the completion of a specific process.\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`flag_name` | A character string specifying the name of the flag file. It should be a descriptive and unique identifier for the process being flagged.\n\n\n#### Description\n\nThis function constructs and returns the path to a hidden flag file, which serves as an indicator that a particular processing step has been completed. 
The flag file is created in a designated location within the working directory.\n\n\n#### Value\n\nA character string representing the absolute path to the hidden flag file.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Get the flag path for a process named \"data_processing\"\nlfa_get_flag_path(\"data_processing\")\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_get_flag_path(flag_name)\n```\n:::\n\n\n\n### `lfa_get_species`\n\nGet a list of species from the data directory.\n\n\n#### Concept\n\ndata retrieval functions\n\n\n#### Description\n\nThis function retrieves a list of species by scanning the \"data\" directory\n located in the current working directory.\n\n\n#### Keyword\n\ndata\n\n\n#### References\n\nThis function relies on the [`list.dirs`](#list.dirs) function for directory listing.\n\n\n#### Seealso\n\n[`list.dirs`](#list.dirs) \n \n Other data retrieval functions:\n [`lfa_get_detections`](#lfagetdetections)\n\n\n#### Value\n\nA character vector containing the names of species found in the \"data\" directory.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Retrieve the list of species\nspecies_list <- lfa_get_species()\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_get_species()\n```\n:::\n\n\n\n### `lfa_ground_correction`\n\nCorrect the point clouds for correct ground imagery\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`ctg` | An LASCatalog object. If not null, it will perform the actions on this object, if NULL inferring the catalog from the tile_location\n`tile_location` | A tile_location type object holding the information about the location of the cataog. This is used to save the catalog after processing too.\n\n\n#### Author\n\nJakob Danel\n\n\n#### Description\n\nThis function is needed to correct the Z value of the point cloud, relative to the real\n ground height. 
After applying this function to your catalog, the Z values can be seen as the\n real elevation above the ground. At the moment the function uses the `tin()` function from\n the `lidR` package. NOTE: The operation is in place and cannot be reverted; the old values\n of the point cloud will be deleted!\n\n\n#### Value\n\nA catalog with the corrected z values. The catalog is always stored at tile_location and\n holds only the transformed values.\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_ground_correction(ctg, tile_location)\n```\n:::\n\n\n\n### `lfa_init_data_structure`\n\nInitialize data structure for species and areas\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`sf_species` | A data frame with information about species and associated areas.\n\n\n#### Description\n\nThis function initializes the data structure for storing species and associated areas.\n\n\n#### Details\n\nThe input data frame, `sf_species`, should have at least the following columns:\n \n\n* \"species\": The names of the species for which the data structure needs to be initialized. \n\n* \"name\": The names of the associated areas. \n \n The function creates directories based on the species and area information provided in\n the `sf_species` data frame. 
It checks whether the directories already exist and creates\n them if they don't.\n\n\n#### Value\n\nNone\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Example species data frame\nsf_species <- data.frame(\nspecies = c(\"SpeciesA\", \"SpeciesB\"),\nname = c(\"Area1\", \"Area2\")\n# Other necessary columns\n)\n\nlfa_init_data_structure(sf_species)\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_init_data_structure(sf_species)\n```\n:::\n\n\n\n### `lfa_init`\n\nInitialize LFA (LiDAR forest analysis) data processing\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`sf_file` | A character string specifying the path to the shapefile containing spatial features of research areas.\n\n\n#### Description\n\nThis function initializes the LFA data processing by reading a shapefile containing\n spatial features of research areas, downloading the specified areas, and creating\n tile location objects for each area.\n\n\n#### Details\n\nThis function reads a shapefile ( `sf_file` ) using the `sf` package, which should\n contain information about research areas. It then calls the `lfa_download_areas` \n function to download the specified areas and `lfa_create_tile_location_objects` \n to create tile location objects based on Lidar data files in those areas. The\n shapefile MUST meet the following requirements:\n \n\n* Each geometry must be a single object of type polygon \n\n* Each entry must have the following attributes: \n\n* species: A string describing the tree species of the area. 
\n\n* name: A string describing the location of the area.\n\n\n#### Value\n\nA vector containing tile location objects.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Initialize LFA processing with the default shapefile\nlfa_init()\n\n# Initialize LFA processing with a custom shapefile\nlfa_init(\"custom_areas.shp\")\n\n# Example usage with the default shapefile\nlfa_init()\n\n# Example usage with a custom shapefile\nlfa_init(\"custom_areas.shp\")\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_init(sf_file = \"research_areas.shp\")\n```\n:::\n\n\n\n### `lfa_intersect_areas`\n\nIntersect Lidar Catalog with Spatial Features\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`ctg` | A LAScatalog object representing the Lidar data to be processed.\n`tile_location` | A tile location object representing the specific area of interest.\n`areas_sf` | Spatial features defining areas.\n\n\n#### Description\n\nThis function intersects a Lidar catalog with a specific area defined by spatial features.\n\n\n#### Details\n\nThe function intersects the Lidar catalog specified by `ctg` with a specific area defined by\n the `tile_location` object and `areas_sf` . It removes points outside the specified area and\n returns a modified LAScatalog object.\n \n The specified area is identified based on the `species` and `name` attributes in the\n `tile_location` object. If a matching area is not found in `areas_sf` , the function\n stops with an error.\n \n The function then transforms the spatial reference of the identified area to match that of\n the Lidar catalog using `sf::st_transform` .\n \n The processing is applied to each chunk in the catalog using the `identify_area` function,\n which merges spatial information and filters out points that are not classified as inside\n the identified area. 
After processing, the function writes the modified LAS files back to\n the original file locations, removing points outside the specified area.\n \n If an error occurs during the processing of a chunk, a warning is issued, and the function\n continues processing the next chunks. If no points are found after filtering, a warning is\n issued, and NULL is returned.\n\n\n#### Seealso\n\nOther functions in the Lidar forest analysis (LFA) package.\n\n\n#### Value\n\nA modified LAScatalog object with points outside the specified area removed.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Example usage\nlfa_intersect_areas(ctg, tile_location, areas_sf)\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_intersect_areas(ctg, tile_location, areas_sf)\n```\n:::\n\n\n\n### `lfa_load_ctg_if_not_present`\n\nLoading the catalog if it is not present\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`ctg` | Catalog object. Can be NULL\n`tile_location` | The location to look for the catalog tiles, if they are not present\n\n\n#### Description\n\nThis function checks if the catalog is `NULL`. 
If it is it will load the\n catalog from the `tile_location`\n\n\n#### Value\n\nThe provided ctg object if not null, else the catalog for the tiles\n of the tile_location.\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_load_ctg_if_not_present(ctg, tile_location)\n```\n:::\n\n\n\n### `lfa_map_tile_locations`\n\nMap Function Over Tile Locations\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`tile_locations` | A list of tile location objects.\n`map_function` | The mapping function to be applied to each tile location.\n`...` | Additional arguments to be passed to the mapping function.\n\n\n#### Description\n\nThis function applies a specified mapping function to each tile location in a list.\n\n\n#### Details\n\nThis function iterates over each tile location in the provided list ( `tile_locations` )\n and applies the specified mapping function ( `map_function` ) to each tile location.\n The mapping function should accept a tile location object as its first argument, and\n additional arguments can be passed using the ellipsis ( `...` ) syntax.\n \n This function is useful for performing operations on multiple tile locations concurrently,\n such as loading Lidar data, processing areas, or other tasks that involve tile locations.\n\n\n#### Seealso\n\nThe mapping function provided should be compatible with the structure and requirements\n of the tile locations and the specific task being performed.\n\n\n#### Value\n\nNone\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Example usage\nlfa_map_tile_locations(tile_locations, my_mapping_function, param1 = \"value\")\n\n# Example usage\nlfa_map_tile_locations(tile_locations, my_mapping_function, param1 = \"value\")\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_map_tile_locations(tile_locations, map_function, check_flag = NULL, ...)\n```\n:::\n\n\n\n### `lfa_merge_and_save`\n\nMerge and Save Text Files in a Directory\n\n\n#### Arguments\n\nArgument 
|Description\n------------- |----------------\n`input_directory` | The path to the input directory containing text files.\n`output_name` | The name for the output file where the merged content will be saved.\n\n\n#### Description\n\nThis function takes an input directory and an output name as arguments.\n It merges the textual content of all files in the specified directory into\n a single string, with each file's content separated by a newline character.\n The merged content is then saved into a file named after the output name\n in the same directory. After the merging is complete, all input files are\n deleted.\n\n\n#### Details\n\nThis function reads the content of each text file in the specified input directory\n and concatenates them into a single string. Each file's content is separated by a newline\n character. The merged content is then saved into a file named after the output name\n in the same directory. Finally, all input files are deleted from the directory.\n\n\n#### Seealso\n\n[`readLines`](#readlines) , [`writeLines`](#writelines) , [`file.remove`](#file.remove)\n\n\n#### Value\n\nThis function does not explicitly return any value. 
It prints a message\n indicating the successful completion of the merging and saving process.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Merge text files in the \"data_files\" directory and save the result in \"merged_output\"\nlfa_merge_and_save(\"data_files\", \"merged_output\")\n\n# Merge text files in the \"data_files\" directory and save the result in \"merged_output\"\nlfa_merge_and_save(\"data_files\", \"merged_output\")\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_merge_and_save(input_directory, output_name)\n```\n:::\n\n\n\n### `lfa_rd_to_qmd`\n\nConvert Rd File to Markdown\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`rdfile` | The path to the Rd file or a parsed Rd object.\n`outfile` | The path to the output Markdown file (including the file extension).\n`append` | Logical, indicating whether to append to an existing file (default is FALSE).\n\n\n#### Description\n\nIMPORTANT NOTE: \n This function is nearly identical to the `Rd2md::Rd2markdown` function from the `Rd2md` \n package. We needed to implement our own version of it because of various reasons:\n \n\n* The algorithm uses hardcoded header sizes (h1 and h2 in original) which is not feasible for our use-case of the markdown. \n\n* We needed to add some Quarto Markdown specifics, e.g. to make sure that the examples will not be runned. \n\n* We want to exclude certain tags from our implementation.\n\n\n#### Details\n\nFor that reason we copied the method and made changes as needed and also added this custom documentation.\n \n This function converts an Rd (R documentation) file to Markdown format (.md) and\n saves the converted file at the specified location. The function allows appending\n to an existing file or creating a new one. 
The resulting Markdown file includes\n sections for the function's name, title, and additional content such as examples,\n usage, arguments, and other sections present in the Rd file.\n \n The function performs the following steps:\n \n\n* Parses the Rd file using the Rd2md package. \n\n* Creates a Markdown file with sections for the function's name, title, and additional content. \n\n* Appends the content to an existing file if `append` is set to TRUE. \n\n* Saves the resulting Markdown file at the specified location.\n\n\n#### Seealso\n\n[`Rd2md::parseRd`](#rd2md::parserd)\n\n\n#### Value\n\nThis function does not explicitly return any value. It saves the converted Markdown file\n at the specified location as described in the details section.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Convert Rd file to Markdown and save it\nlfa_rd_to_md(\"path/to/your/file.Rd\", \"path/to/your/output/file.md\")\n\n# Convert Rd file to Markdown and append to an existing file\nlfa_rd_to_md(\"path/to/your/file.Rd\", \"path/to/existing/output/file.md\", append = TRUE)\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_rd_to_qmd(rdfile, outfile, append = FALSE)\n```\n:::\n\n\n\n### `lfa_rd_to_results`\n\nConvert Rd Files to Markdown and Merge Results\n\n\n#### Description\n\nThis function converts all Rd (R documentation) files in the \"man\" directory\n to Markdown format (.qmd) and saves the converted files in the \"results/appendix/package-docs\" directory.\n It then merges the converted Markdown files into a single string and saves\n the merged content into a file named \"docs.qmd\" in the \"results/appendix/package-docs\" directory.\n\n\n#### Details\n\nThe function performs the following steps:\n \n\n* Removes any existing \"docs.qmd\" file in the \"results/appendix/package-docs\" directory. \n\n* Finds all Rd files in the \"man\" directory. \n\n* Converts each Rd file to Markdown format (.qmd) using the `lfa_rd_to_qmd` function. 
\n\n* Saves the converted Markdown files in the \"results/appendix/package-docs\" directory. \n\n* Merges the content of all converted Markdown files into a single string. \n\n* Saves the merged content into a file named \"docs.qmd\" in the \"results/appendix/package-docs\" directory.\n\n\n#### Seealso\n\n[`lfa_rd_to_qmd`](#lfardtoqmd) , [`lfa_merge_and_save`](#lfamergeandsave)\n\n\n#### Value\n\nThis function does not explicitly return any value. It performs the conversion,\n merging, and saving operations as described in the details section.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Convert Rd files to Markdown and merge the results\nlfa_rd_to_results()\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_rd_to_results()\n```\n:::\n\n\n\n### `lfa_segmentation`\n\nSegment the elements of an point cloud by trees\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`ctg` | An LASCatalog object. If not null, it will perform the actions on this object, if NULL inferring the catalog from the tile_location\n`tile_location` | A tile_location type object holding the information about the location of the catalog. This is used to save the catalog after processing too.\n\n\n#### Author\n\nJakob Danel\n\n\n#### Description\n\nThis function will try to to divide the hole point cloud into unique trees.\n Therefore it is assigning for each chunk of the catalog a `treeID` for each\n point. 
Therefore the algorithm uses the `li2012` implementation with the\n following parameters: `li2012(dt1 = 2, dt2 = 3, R = 2, Zu = 10, hmin = 5, speed_up = 12)` \n NOTE : The operation is in place and can not be reverted, the old values\n of the point cloud will be deleted!\n\n\n#### Value\n\nA catalog where each chunk has additional `treeID` values indicating the belonging tree.\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_segmentation(ctg, tile_location)\n```\n:::\n\n\n\n### `lfa_set_flag`\n\nSet a flag to indicate the completion of a specific process.\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`flag_name` | A character string specifying the name of the flag file. It should be a descriptive and unique identifier for the process being flagged.\n\n\n#### Description\n\nThis function creates a hidden flag file at a specified location within the working directory to indicate that a particular processing step has been completed. If the flag file already exists, a warning is issued.\n\n\n#### Value\n\nThis function does not have a formal return value.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Set the flag for a process named \"data_processing\"\nlfa_set_flag(\"data_processing\")\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_set_flag(flag_name)\n```\n:::\n\n\n\n", + "markdown": "---\ntitle: \"Forest Data Analysis Report\"\noutput:\n pdf_document:\n latex_engine: xelatex\ntoc: true\ntoc-depth: 2\ntoc-title: Contents\nnumber-sections: true\nnumber-depth: 3\ndate: today\nauthor:\n - name: Jakob Danel\n email: jakob.danel@uni-muenster.de\n url: https://github.com/jakobdanel\n affiliations:\n - name: Universität Münster\n city: Münster\n country: Germany\n - name: Federick Bruch\n email: f_bruc03@uni-muenster.de\n url: https://www.uni-muenster.de/Geoinformatics/institute/staff/index.php/351/Frederick_Bruch\n affiliations:\n - name: Universität Münster\n city: Münster\n country: 
Germany\nbibliography: references.bib\nexecute-dir: .. \nprefer-html: true\n---\n\n\n# Introduction\n\nThis report documents the analysis of forest data for different tree species.\n\n# Methods\n\n## Data acquisition\n\nOur primary objective is to identify patches where one tree species exhibits a high level of dominance, striving to capture monocultural stands within the diverse forests of North Rhine-Westphalia (NRW). Recognizing the practical challenges of finding true monocultures, we aim to identify patches where one species is highly dominant, enabling meaningful comparisons across different species.\n\nThe study is framed within the NRW region due to the availability of an easily accessible dataset. Our focus includes four prominent tree species in NRW: oak, beech, spruce, and pine, representing the most prevalent species in the region. To ensure the validity of our findings, we derive three patches for each species, thereby confirming that observed variables are characteristic of a particular species rather than a specific patch. Each patch is carefully selected to encompass an area of approximately 50-100 hectares and contain between 5,000 and 10,000 trees. Striking a balance between relevance and manageability, these patches avoid excessive size to limit the likelihood of capturing mixed-species stands and to ensure compatibility with local hardware.\n\nSpecific Goals:\n\n1. Retrieve patches with highly dominant tree species.\n2. Minimize or eliminate the presence of human-made structures within the selected patches.\n\nTo achieve our goals, we utilized the waldmonitor dataset [@welle2014] and the map provided by [@Blickensdoerfer2022], both indicating dominant tree species in NRW. We identified patches of feasible size where both sources predicted the presence of a specific species. 
Further validation involved examining sentinel images of these forest regions to assess the evenness of structures, leaf color distribution, and the absence of significant human-made structures such as roads or buildings. The subsequent preprocessing steps, detailed in the following subsection, involved refining our selected patches and deriving relevant variables, such as tree distribution and density, to ensure that the chosen areas align with the desired research domains.\n\n## Preprocessing\n::: {.cell}\n\n:::\n\n\nIn this research study, the management and processing of a large dataset are crucial considerations. The dataset's substantial size necessitates careful maintenance to ensure efficient handling. Furthermore, the data should be easily processable and editable to facilitate necessary corrections and precalculations within the context of our research objectives. To achieve our goals, we have implemented a framework that automatically derives data based on a shapefile, delineating areas of interest. The processed data and results of precalculations are stored in a straightforward manner to enhance accessibility. Additionally, we have designed functions that establish a user-friendly interface, enabling the execution of algorithms on subsets of the data, such as distinct species. These interfaces are not only directly callable by users but can also be integrated into other functions to automate processes. The overarching aim is to streamline the entire preprocessing workflow using a single script, leveraging only the shapefile as a basis. This subsection details the accomplishments of our R-package in realizing these goals, outlining the preprocessing steps undertaken and justifying their necessity in the context of our research.\n\nThe data are stored in a data subdirectory of the root directory in the format `species/location-name/tile-name`. 
To automate the matching of areas of interest with the catalog from the Land NRW[^1], we utilize the intersecting tool developed by Heisig[^2]. This tool allows for the automatic retrieval and placement of data downloaded from the Land NRW catalog. To enhance data accessibility, we have devised an object that incorporates species, location name, and tile name (the NRW internal identifier) for each area. This object facilitates the specification of the area to be processed. Additionally, we have defined an initialization function that downloads all tiles, returning a list of tile location objects for subsequent processing. A pivotal component of the package's preprocessing functionality is the map function, which iterates over a list of tile locations (effectively the entire dataset) and accepts a processing function as an argument. The following paragraphs outline the specific preprocessing steps employed, all of which are applied through the mapping function.\n\nTo keep memory usage manageable, each of the tiles, where one area can span multiple tiles, has been split into smaller chunks. We employed a 50x50m size for each chunk, resulting in the division of original 1km x 1km files into 400 chunks. These chunks are stored in our directory structure, with each chunk housed in a directory named after its tile name and assigned an id as the filename. Implementation-wise, the `lidR::catalog_retile` function was instrumental in achieving this segmentation. The resulting smaller chunks allow for efficient iteration during subsequent preprocessing steps.\n\nThe next phase involves reducing our data to the actual size by intersecting the tiles with the defined area of interest. Using the `lidR::merge_spatial` function, we intersect the area derived from the shapefile, removing all point cloud items outside this region. 
Due to our tile-wise approach, empty tiles may arise, and in such cases, those tiles are simply deleted.\n\nFollowing the size reduction of our dataset, the next step involves correcting the `z` values. The `z` values in the data are originally relative to the ellipsoid used for referencing, but we require them to be relative to the ground. To achieve this, we utilize the `lidR::tin` function, which interpolates a triangulated irregular network (TIN) between all ground points (classified by the data provider) and calculates the z value relative to this surface.\n\nSubsequently, we aim to perform segmentation for each distinct tree, marking each item of the point cloud with a tree ID. We employ the algorithm described by @li2012, using parameters `li2012(dt1 = 2, dt2 = 3, R = 2, Zu = 10, hmin = 5, speed_up = 12)`. The meanings of these parameters are explained in Li et al.'s work [@li2012].\n\nFinally, the last preprocessing step involves individual tree detection, seeking a single `POINT` object for each tree. The `lidR::lmf` function, an implementation of tree detection using a local maximum filter, is utilized for this purpose [@popescu2004]. The results are stored in GeoPackage files within our data structure.\n\nSee @sec-appendix-preprocessing for the implementation of the preprocessing.\n\n[^1]: https://www.opengeodata.nrw.de/produkte/geobasis/hm/3dm_l_las/3dm_l_las/, last visited 7th Dec 2023\n[^2]: https://github.com/joheisig/GEDIcalibratoR, last visited 7th Dec 2023\n\n## Analysis of different distributions\n\nAnalysis of data distributions is a critical aspect of our research, with a focus on comparing two or more distributions. Our objective extends beyond evaluating the disparities between species; we also aim to assess differences within a species. 
To gain a comprehensive understanding of the data, we employ various visualization techniques, including histograms, QQ-Plots (Quantile-Quantile Plots), density functions, and box plots.\n\nIn tandem with visualizations, descriptive statistics, such as means, standard errors, and quantiles, are leveraged to provide key insights into the central tendency and variability of the data.\n\nFor a more quantitative analysis of distribution dissimilarity, statistical measures and tests are employed. The Kullback-Leibler (KL) divergence serves as a measure to compare the similarity of a set of distributions. This involves converting distributions into their density functions, with the standard error serving as the bandwidth. The KL divergence is calculated for each ordered pair of distributions, as it is asymmetric. For two distributions $P$ and $Q$, the KL divergence is defined as follows [@kullback1951kullback]:\n\n$$\nD_{KL}(P \\, \\| \\, Q) = \\sum_i P(i) \\log\\left(\\frac{P(i)}{Q(i)}\\right)\n$$\n\nTo obtain a symmetric score, the Jensen-Shannon Divergence (JSD) is utilized [@grosse2002analysis], expressed by the formula:\n\n$$\nD_{JS}(P \\, \\| \\, Q) = \\frac{1}{2} D_{KL}(P \\, \\| \\, M) + \\frac{1}{2} D_{KL}(Q \\, \\| \\, M)\n$$\nHere, $M = \\frac{1}{2}(P + Q)$. The JSD provides a balanced measure of dissimilarity between distributions [@Brownlee2019Calculate]. To compare the different scores with each other, we will use averages.\n\nAdditionally, the Kolmogorov-Smirnov Test is implemented to assess whether two distributions significantly differ from each other. 
This statistical test offers a formal evaluation of the dissimilarity between empirical distribution functions.\n\n\n# Results\n::: {.cell}\n\n:::\n\n## Researched areas\n\n::: {.cell}\n\n```{.r .cell-code code-fold=\"true\"}\nlibrary(ggplot2)\nsf::sf_use_s2(FALSE)\npatches <- sf::read_sf(\"research_areas.shp\") |> sf::st_centroid()\n\nde <- sf::read_sf(\"results/results/states_de/Bundesländer_2017_mit_Einwohnerzahl.shp\") # Source: https://hub.arcgis.com/datasets/esri-de-content::bundesl%C3%A4nder-2017-mit-einwohnerzahl/explore?location=51.099647%2C10.454033%2C7.43\nnrw <- de[5,] |> sf::st_geometry()\n\n\nggplot() + geom_sf(data = nrw) + \n geom_sf(data = patches, mapping = aes(col = species))\n```\n\n::: {.cell-output-display}\n![Locations of the different patches with the dominant species for that patch. The patch centroids are displayed on a basemap showing the borders of NRW.](report_files/figure-pdf/fig-patches-nrw-1.pdf){#fig-patches-nrw fig-pos='H'}\n:::\n:::\nWe drew three patches for each species from different regions (see @tbl-summary-researched-areas). We downloaded the LiDAR data for those patches and ran all preprocessing steps as described. We then checked with certain derived parameters (e.g. tree heights, tree distributions or tree density) that all patches contain valid forest data. In that step we discovered that some forest clearance had taken place in one patch in the recent past. This patch was removed from the dataset and was replaced with a new one. \n\nIn our research, drawing patches evenly distributed across North Rhine-Westphalia is inherently constrained by natural factors. Consequently, the patches for oak and pine predominantly originate from the Münsterland region, as illustrated in [@fig-patches-nrw]. For spruce, the patches were derived from Sauerland, reflecting the prevalence of spruce forests in this specific region within NRW, as corroborated by Welle et al. [@welle2014] and Blickensdörfer et al. [@Blickensdoerfer2022]. 
Beech patches, on the other hand, were generated from diverse locations within NRW. Across all patches, no human-made objects were identified, with the exception of small paths for pedestrians and forestry vehicles.\n\nThe distribution of area and detections is notable for each of the four species. Beech covers approximately 69.79 hectares with a total of 5,954 detections, oak spans 63.23 hectares with 5,354 detections, pine extends across 72.86 hectares with 8,912 detections, and spruce encompasses 57.94 hectares with 8,619 detections. Both the number of detections and the corresponding area exhibit a relatively uniform distribution across the diverse patches, as summarized in @tbl-summary-researched-areas. \n\nWith the selected dataset described, we intentionally chose three patches for each of the four species that exhibit a practical and usable size for our research objectives. These carefully chosen patches align with the conditions essential for our study, providing comprehensive and representative data for in-depth analysis and meaningful insights into the characteristics of each tree species within the specified areas.\n\n\n::: {#tbl-summary-researched-areas .cell tbl-cap='Summary of researched patches grouped by species, with their location, area and the amount of detected trees.'}\n\n```{.r .cell-code code-fold=\"true\"}\nshp <- sf::read_sf(\"research_areas.shp\")\ntable <- lfa::lfa_get_all_areas()\n\nsf::sf_use_s2(FALSE)\nfor (row in 1:nrow(table)) {\n area <-\n dplyr::filter(shp, shp$species == table[row, \"specie\"] &\n shp$name == table[row, \"area\"])\n area_size <- area |> sf::st_area()\n point <- area |> sf::st_centroid() |> sf::st_coordinates()\n table[row,\"point\"] <- paste0(\"(\",round(point[1], digits = 4),\", \",round(point[2],digits = 4),\")\")\n \n table[row, \"area_size\"] = round(area_size,digits = 2) #paste0(round(area_size,digits = 2), \" m²\")\n \n amount_det <- nrow(lfa::lfa_get_detection_area(table[row, \"specie\"], table[row, \"area\"]))\n 
if(is.null(amount_det)){\n cat(nrow(lfa::lfa_get_detection_area(table[row, \"specie\"], table[row, \"area\"])),table[row, \"specie\"],table[row, \"area\"])\n }\n table[row, \"amount_detections\"] = amount_det\n \n # table[row, \"specie\"] <- lfa::lfa_capitalize_first_char(table[row,\"specie\"])\n table[row, \"area\"] <- lfa::lfa_capitalize_first_char(table[row,\"area\"])\n }\ntable$area <- gsub(\"_\", \" \", table$area)\ntable$area <- gsub(\"ue\", \"ü\", table$area)\ntable = table[,!names(table) %in% c(\"specie\")]\n\nknitr::kable(table, \"html\", col.names = c(\"Patch Name\",\"Location\",\"Area size (m²)\",\"Amount tree detections\" ), caption = NULL, digits = 2, escape = TRUE) |>\n kableExtra::kable_styling(\n bootstrap_options = c(\"striped\", \"hold_position\", \"bordered\",\"responsive\"),\n stripe_index = c(1:3,7:9),\n full_width = FALSE\n ) |>\n kableExtra::pack_rows(\"Beech\", 1, 3) |>\n kableExtra::pack_rows(\"Oak\", 4, 6) |>\n kableExtra::pack_rows(\"Pine\", 7, 9) |>\n kableExtra::pack_rows(\"Spruce\", 10, 12) |>\n kableExtra::column_spec(1, bold = TRUE)\n```\n\n::: {.cell-output-display}\n`````{=html}\n\n \n \n \n \n \n \n \n \n\n \n\n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n\n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n\n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n\n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n\n
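<table>
<thead><tr><th>Patch Name</th><th>Location</th><th>Area size (m²)</th><th>Amount tree detections</th></tr></thead>
<tbody>
<tr><td colspan=4><strong>Beech</strong></td></tr>
<tr><td>Bielefeld brackwede</td><td>(8.5244, 51.9902)</td><td>161410.57</td><td>1443</td></tr>
<tr><td>Billerbeck</td><td>(7.3273, 51.9987)</td><td>185887.25</td><td>1732</td></tr>
<tr><td>Wülfenrath</td><td>(7.0769, 51.2917)</td><td>350621.21</td><td>2779</td></tr>
<tr><td colspan=4><strong>Oak</strong></td></tr>
<tr><td>Hamm</td><td>(7.8618, 51.6639)</td><td>269397.22</td><td>2441</td></tr>
<tr><td>Münster</td><td>(7.6187, 51.9174)</td><td>164116.61</td><td>1270</td></tr>
<tr><td>Rinkerode</td><td>(7.6744, 51.8598)</td><td>198811.09</td><td>1643</td></tr>
<tr><td colspan=4><strong>Pine</strong></td></tr>
<tr><td>Greffen</td><td>(8.1697, 51.9913)</td><td>49418.81</td><td>513</td></tr>
<tr><td>Mesum</td><td>(7.5403, 52.2573)</td><td>405072.85</td><td>5031</td></tr>
<tr><td>Telgte</td><td>(7.7816, 52.0024)</td><td>274132.34</td><td>3368</td></tr>
<tr><td colspan=4><strong>Spruce</strong></td></tr>
<tr><td>Brilon</td><td>(8.5352, 51.4084)</td><td>211478.20</td><td>3342</td></tr>
<tr><td>Oberhundem</td><td>(8.1861, 51.0909)</td><td>151895.53</td><td>2471</td></tr>
<tr><td>Osterwald</td><td>(8.3721, 51.2151)</td><td>216026.43</td><td>2806</td></tr>
</tbody>
</table>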
\n\n`````\n:::\n:::\n\n\n\n\n\n\n\n\n|specie |area | density (1/m²)|\n|:------|:-------------------|---------:|\n|beech |bielefeld_brackwede | 0.0089399|\n|beech |billerbeck | 0.0093175|\n|beech |wuelfenrath | 0.0079259|\n|oak |hamm | 0.0090610|\n|oak |muenster | 0.0077384|\n|oak |rinkerode | 0.0082641|\n|pine |greffen | 0.0103807|\n|pine |mesum | 0.0124200|\n|pine |telgte | 0.0122860|\n|spruce |brilon | 0.0158030|\n|spruce |oberhundem | 0.0162678|\n|spruce |osterwald | 0.0129892|\n\n\n\n# References\n\n::: {#refs}\n:::\n\n# Appendix\n## Script which can be used to do all preprocessing {#sec-appendix-preprocessing}\n\n::: {.cell}\n\n:::\n\n\nLoad the file with the research areas\n::: {.cell}\n\n```{.r .cell-code}\nsf <- sf::read_sf(here::here(\"research_areas.shp\"))\nprint(sf)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\nSimple feature collection with 12 features and 3 fields\nGeometry type: POLYGON\nDimension: XY\nBounding box: xmin: 7.071625 ymin: 51.0895 xmax: 8.539877 ymax: 52.25983\nGeodetic CRS: WGS 84\n# A tibble: 12 x 4\n id species name geometry\n \n 1 1 oak rinkerode ((7.678922 51.85789, 7.675446 51.85752, 7.~\n 2 2 oak hamm ((7.858955 51.66699, 7.866444 51.66462, 7.~\n 3 3 oak muenster ((7.618908 51.9154, 7.617384 51.9172, 7.61~\n 4 4 pine greffen ((8.168691 51.98965, 8.167178 51.99075, 8.~\n 5 5 pine telgte ((7.779728 52.00662, 7.781616 52.00662, 7.~\n 6 6 pine mesum ((7.534424 52.25499, 7.53378 52.25983, 7.5~\n 7 7 beech bielefeld_brackwede ((8.524749 51.9921, 8.528418 51.99079, 8.5~\n 8 8 beech wuelfenrath ((7.071625 51.29256, 7.072311 51.29334, 7.~\n 9 9 beech billerbeck ((7.324729 51.99783, 7.323548 51.99923, 7.~\n10 11 spruce brilon ((8.532195 51.41029, 8.535027 51.41064, 8.~\n11 12 spruce osterwald ((8.369328 51.21693, 8.371238 51.21718, 8.~\n12 10 spruce oberhundem ((8.18082 51.08999, 8.180868 51.09143, 8.1~\n```\n:::\n:::\n\n\nInit the project\n::: {.cell}\n\n```{.r .cell-code}\nlibrary(lfa)\nsf::sf_use_s2(FALSE)\nlocations <- 
lfa_init(\"research_areas.shp\")\n```\n:::\n\nDo all of the prprocessing steps\n::: {.cell}\n\n```{.r .cell-code}\nlfa_map_tile_locations(locations,retile,check_flag = \"retile\")\n```\n\n::: {.cell-output .cell-output-stdout}\n```\nNo further processing: flag retile is set!Function is already computed, no further computings here\n```\n:::\n\n::: {.cell-output .cell-output-stdout}\n```\nNULL\n```\n:::\n\n```{.r .cell-code}\nlfa_map_tile_locations(locations, lfa_intersect_areas, ctg = NULL, areas_sf = sf,check_flag = \"intersect\")\n```\n\n::: {.cell-output .cell-output-stdout}\n```\nNo further processing: flag intersect is set!Function is already computed, no further computings here\n```\n:::\n\n::: {.cell-output .cell-output-stdout}\n```\nNULL\n```\n:::\n\n```{.r .cell-code}\nlfa_map_tile_locations(locations, lfa_ground_correction, ctg = NULL,check_flag = \"z_correction\")\n```\n\n::: {.cell-output .cell-output-stdout}\n```\nNo further processing: flag z_correction is set!Function is already computed, no further computings here\n```\n:::\n\n::: {.cell-output .cell-output-stdout}\n```\nNULL\n```\n:::\n\n```{.r .cell-code}\nlfa_map_tile_locations(locations, lfa_segmentation, ctg = NULL,check_flag = \"segmentation\")\n```\n\n::: {.cell-output .cell-output-stdout}\n```\nNo further processing: flag segmentation is set!Function is already computed, no further computings here\n```\n:::\n\n::: {.cell-output .cell-output-stdout}\n```\nNULL\n```\n:::\n\n```{.r .cell-code}\nlfa_map_tile_locations(locations, lfa_detection, catalog = NULL, write_to_file = TRUE,check_flag = \"detection\")\n```\n\n::: {.cell-output .cell-output-stdout}\n```\nNo further processing: flag detection is set!Function is already computed, no further computings here\n```\n:::\n\n::: {.cell-output .cell-output-stdout}\n```\nNULL\n```\n:::\n:::\n\n\n\n## Documentation\n### `lfa_capitalize_first_char`\n\nCapitalize First Character of a String\n\n\n#### Arguments\n\nArgument |Description\n------------- 
|----------------\n`input_string` | A single-character string to be processed.\n\n\n#### Concept\n\nString Manipulation\n\n\n#### Description\n\nThis function takes a string as input and returns the same string with the\n first character capitalized. If the first character is already capitalized,\n the function does nothing. If the first character is not from the alphabet,\n an error is thrown.\n\n\n#### Details\n\nThis function performs the following steps:\n \n\n* Checks if the input is a single-character string. \n\n* Verifies if the first character is from the alphabet (A-Z or a-z). \n\n* If the first character is not already capitalized, it capitalizes it. \n\n* Returns the modified string.\n\n\n#### Keyword\n\nalphabet\n\n\n#### Note\n\nThis function is case-sensitive and assumes ASCII characters.\n\n\n#### References\n\nNone\n\n\n#### Seealso\n\nThis function is related to the basic string manipulation functions in base R.\n\n\n#### Value\n\nA modified string with the first character capitalized if it is\n not already. If the first character is already capitalized, the original\n string is returned.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Capitalize the first character of a string\ncapitalize_first_char(\"hello\") # Returns \"Hello\"\ncapitalize_first_char(\"World\") # Returns \"World\"\n\n# Error example (non-alphabetic first character)\ncapitalize_first_char(\"123abc\") # Throws an error\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_capitalize_first_char(input_string)\n```\n:::\n\n\n\n### `lfa_check_flag`\n\nCheck if a flag is set, indicating the completion of a specific process.\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`flag_name` | A character string specifying the name of the flag file. 
It should be a descriptive and unique identifier for the process being checked.\n\n\n#### Description\n\nThis function checks for the existence of a hidden flag file at a specified location within the working directory. If the flag file is found, a message is printed, and the function returns `TRUE` to indicate that the associated processing step has already been completed. If the flag file is not found, the function returns `FALSE` , indicating that further processing can proceed.\n\n\n#### Value\n\nA logical value indicating whether the flag is set ( `TRUE` ) or not ( `FALSE` ).\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Check if the flag for a process named \"data_processing\" is set\nlfa_check_flag(\"data_processing\")\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_check_flag(flag_name)\n```\n:::\n\n\n\n### `lfa_create_stacked_distributions_plot`\n\nCreate a stacked distribution plot for tree detections, visualizing the distribution\n of a specified variable on the x-axis, differentiated by another variable.\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`trees` | A data frame containing tree detection data.\n`x_value` | A character string specifying the column name used for finding the values on the x-axis of the histogram.\n`fill_value` | A character string specifying the column name by which the data are differentiated in the plot.\n`bin` | An integer specifying the number of bins for the histogram. Default is 100.\n`ylab` | A character string specifying the y-axis label. Default is \"Amount trees.\"\n`xlim` | A numeric vector of length 2 specifying the x-axis limits. Default is c(0, 100).\n`ylim` | A numeric vector of length 2 specifying the y-axis limits. 
Default is c(0, 1000).\n`title` | The title of the plot.\n\n\n#### Description\n\nThis function generates a stacked distribution plot using the ggplot2 package,\n providing a visual representation of the distribution of a specified variable\n ( `x_value` ) on the x-axis, with differentiation based on another variable\n ( `fill_value` ). The data for the plot are derived from the provided `trees` \n data frame.\n\n\n#### Keyword\n\ndata\n\n\n#### Seealso\n\n[`ggplot2::geom_histogram`](#ggplot2::geomhistogram) , [`ggplot2::facet_wrap`](#ggplot2::facetwrap) ,\n [`ggplot2::ylab`](#ggplot2::ylab) , [`ggplot2::scale_fill_brewer`](#ggplot2::scalefillbrewer) ,\n [`ggplot2::coord_cartesian`](#ggplot2::coordcartesian)\n\n\n#### Value\n\nA ggplot object representing the stacked distribution plot.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Create a stacked distribution plot for variable \"Z,\" differentiated by \"area\"\ntrees <- lfa_get_detections()\nlfa_create_stacked_distributions_plot(trees, \"Z\", \"area\")\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_create_stacked_distributions_plot(\n trees,\n x_value,\n fill_value,\n bin = 100,\n ylab = \"Amount trees\",\n xlim = c(0, 100),\n ylim = c(0, 1000),\n title =\n \"Histograms of height distributions between species 'beech', 'oak', 'pine' and 'spruce' divided by the different areas of Interest\"\n)\n```\n:::\n\n\n\n### `lfa_create_tile_location_objects`\n\nCreate tile location objects\n\n\n#### Author\n\nJakob Danel\n\n\n#### Description\n\nThis function traverses a directory structure to find LAZ files and creates\n tile location objects for each file. The function looks into the the `data` \n directory of the repository/working directory. It then creates `tile_location` \n objects based on the folder structure. 
The folder structure should not be\n touched by hand, but created by `lfa_init_data_structure()` which builds the\n structure based on a shape file.\n\n\n#### Seealso\n\n[`tile_location`](#tilelocation)\n\n\n#### Value\n\nA vector containing tile location objects.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_create_tile_location_objects()\n\nlfa_create_tile_location_objects()\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_create_tile_location_objects()\n```\n:::\n\n\n\n### `lfa_detection`\n\nPerform tree detection on a lidar catalog and optionally save the results to a file.\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`catalog` | A lidar catalog containing point cloud data. If set to NULL, the function attempts to read the catalog from the specified tile location.\n`tile_location` | An object specifying the location of the lidar tile. If catalog is NULL, the function attempts to read the catalog from this tile location.\n`write_to_file` | A logical value indicating whether to save the detected tree information to a file. Default is TRUE.\n\n\n#### Description\n\nThis function utilizes lidar data to detect trees within a specified catalog. The detected tree information can be optionally saved to a file in the GeoPackage format. 
The function uses parallel processing to enhance efficiency.\n\n\n#### Value\n\nA sf style data frame containing information about the detected trees.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Perform tree detection on a catalog and save the results to a file\nlfa_detection(catalog = my_catalog, tile_location = my_tile_location, write_to_file = TRUE)\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_detection(catalog, tile_location, write_to_file = TRUE)\n```\n:::\n\n\n\n### `lfa_download_areas`\n\nDownload areas based on spatial features\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`sf_areas` | Spatial features representing areas to be downloaded. It must include columns like \"species\" \"name\" See details for more information.\n\n\n#### Author\n\nJakob Danel\n\n\n#### Description\n\nThis function initiates the data structure and downloads areas based on spatial features.\n\n\n#### Details\n\nThe input data frame, `sf_areas` , must have the following columns:\n \n\n* \"species\": The species associated with the area. \n\n* \"name\": The name of the area. 
\n \n The function uses the `lfa_init_data_structure` function to set up the data structure\n and then iterates through the rows of `sf_areas` to download each specified area.\n\n\n#### Value\n\nNone\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_download_areas(sf_areas)\n\n\n# Example spatial features data frame\nsf_areas <- data.frame(\nspecies = c(\"SpeciesA\", \"SpeciesB\"),\nname = c(\"Area1\", \"Area2\"),\n# Must include also other attributes specialized to sf objects\n# such as geometry, for processing of the download\n)\n\nlfa_download_areas(sf_areas)\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_download_areas(sf_areas)\n```\n:::\n\n\n\n### `lfa_download`\n\nDownload an las file from the state NRW from a specific location\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`species` | The species of the tree which is observed at this location\n`name` | The name of the area that is observed\n`location` | An sf object, which holds the location information for the area where the tile should be downloaded from.\n\n\n#### Description\n\nIt will download the file and save it to data/ list(list(\"html\"), list(list(\"\"))) / list(list(\"html\"), list(list(\"\"))) with the name of the tile\n\n\n#### Value\n\nThe LASCatalog object of the downloaded file\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_download(species, name, location)\n```\n:::\n\n\n\n### `lfa_get_all_areas`\n\nRetrieve a data frame containing all species and corresponding areas.\n\n\n#### Description\n\nThis function scans the \"data\" directory within the current working directory to\n obtain a list of species. It then iterates through each species to retrieve the list\n of areas associated with that species. 
The resulting data frame contains two columns:\n \"specie\" representing the species and \"area\" representing the corresponding area.\n\n\n#### Keyword\n\ndata\n\n\n#### Seealso\n\n[`list.dirs`](#list.dirs)\n\n\n#### Value\n\nA data frame with columns \"specie\" and \"area\" containing information about\n all species and their associated areas.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Retrieve a data frame with information about all species and areas\nall_areas_df <- lfa_get_all_areas()\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_get_all_areas()\n```\n:::\n\n\n\n### `lfa_get_detection_area`\n\nGet Detection for an area\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`species` | A character string specifying the target species.\n`name` | A character string specifying the name of the tile.\n\n\n#### Description\n\nRetrieves the tree detection information for a specified species and tile.\n\n\n#### Details\n\nThis function reads tree detection data from geopackage files within the specified tile location for a given species. It then combines the data into a single SF data frame and returns it. 
The function assumes that the tree detection files follow a naming convention with the pattern \"_detection.gpkg\".\n\n\n#### Keyword\n\nspatial\n\n\n#### References\n\nThis function is part of the LiDAR Forest Analysis (LFA) package.\n\n\n#### Seealso\n\n[`get_tile_dir`](#gettiledir)\n\n\n#### Value\n\nA Simple Features (SF) data frame containing tree detection information for the specified species and tile.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Retrieve tree detection data for species \"example_species\" in tile \"example_tile\"\ntrees_data <- lfa_get_detection_tile_location(\"example_species\", \"example_tile\")\n\n# Example usage:\ntrees_data <- lfa_get_detection_tile_location(\"example_species\", \"example_tile\")\n\n# No trees found scenario:\nempty_data <- lfa_get_detection_tile_location(\"nonexistent_species\", \"nonexistent_tile\")\n# The result will be an empty data frame if no trees are found for the specified species and tile.\n\n# Error handling:\n# In case of invalid inputs, the function may throw errors. 
Ensure correct species and tile names are provided.\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_get_detection_area(species, name)\n```\n:::\n\n\n\n### `lfa_get_detections_species`\n\nRetrieve detections for a specific species.\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`species` | A character string specifying the target species.\n\n\n#### Description\n\nThis function retrieves detection data for a given species from multiple areas.\n\n\n#### Details\n\nThe function looks for detection data in the \"data\" directory for the specified species.\n It then iterates through each subdirectory (representing different areas) and consolidates the\n detection data into a single data frame.\n\n\n#### Value\n\nA data frame containing detection information for the specified species in different areas.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Example usage:\ndetections_data <- lfa_get_detections_species(\"example_species\")\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_get_detections_species(species)\n```\n:::\n\n\n\n### `lfa_get_flag_path`\n\nGet the path to a flag file indicating the completion of a specific process.\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`flag_name` | A character string specifying the name of the flag file. It should be a descriptive and unique identifier for the process being flagged.\n\n\n#### Description\n\nThis function constructs and returns the path to a hidden flag file, which serves as an indicator that a particular processing step has been completed. 
The flag file is created in a designated location within the working directory.\n\n\n#### Value\n\nA character string representing the absolute path to the hidden flag file.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Get the flag path for a process named \"data_processing\"\nlfa_get_flag_path(\"data_processing\")\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_get_flag_path(flag_name)\n```\n:::\n\n\n\n### `lfa_ground_correction`\n\nCorrect the point clouds for correct ground imagery\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`ctg` | An LASCatalog object. If not NULL, the actions are performed on this object; if NULL, the catalog is inferred from the tile_location\n`tile_location` | A tile_location type object holding the information about the location of the catalog. This is used to save the catalog after processing too.\n\n\n#### Author\n\nJakob Danel\n\n\n#### Description\n\nThis function is needed to correct the Z value of the point cloud, relative to the real\n ground height. After applying this function to your catalog, the Z values can be seen as the\n real elevation above the ground. At the moment the function uses the `tin()` function from\n the `lidR` package. NOTE : The operation is in place and cannot be reverted, the old values\n of the point cloud will be deleted!\n\n\n#### Value\n\nA catalog with the corrected z values. 
The catalog is always stored at tile_location and\n holding only the transformed values.\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_ground_correction(ctg, tile_location)\n```\n:::\n\n\n\n### `lfa_init_data_structure`\n\nInitialize data structure for species and areas\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`sf_species` | A data frame with information about species and associated areas.\n\n\n#### Description\n\nThis function initializes the data structure for storing species and associated areas.\n\n\n#### Details\n\nThe input data frame, `sf_species` , should have at least the following columns:\n \n\n* \"species\": The names of the species for which the data structure needs to be initialized. \n\n* \"name\": The names of the associated areas. \n \n The function creates directories based on the species and area information provided in\n the `sf_species` data frame. It checks whether the directories already exist and creates\n them if they don't.\n\n\n#### Value\n\nNone\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Example species data frame\nsf_species <- data.frame(\nspecies = c(\"SpeciesA\", \"SpeciesB\"),\nname = c(\"Area1\", \"Area2\"),\n# Other necessary columns\n)\n\nlfa_init_data_structure(sf_species)\n\n# Example species data frame\nsf_species <- data.frame(\nspecies = c(\"SpeciesA\", \"SpeciesB\"),\nname = c(\"Area1\", \"Area2\"),\n# Other necessary columns\n)\n\nlfa_init_data_structure(sf_species)\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_init_data_structure(sf_species)\n```\n:::\n\n\n\n### `lfa_init`\n\nInitialize LFA (LiDAR forest analysis) data processing\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`sf_file` | A character string specifying the path to the shapefile containing spatial features of research areas.\n\n\n#### Description\n\nThis function initializes the LFA data processing by reading a shapefile containing\n 
spatial features of research areas, downloading the specified areas, and creating\n tile location objects for each area.\n\n\n#### Details\n\nThis function reads a shapefile ( `sf_file` ) using the `sf` package, which should\n contain information about research areas. It then calls the `lfa_download_areas` \n function to download the specified areas and `lfa_create_tile_location_objects` \n to create tile location objects based on Lidar data files in those areas. The\n shapefile MUST follow the following requirements:\n \n\n* Each geometry must be a single object of type polygon \n\n* Each entry must have the following attributes: \n\n* species: A string describing the tree species of the area. \n\n* name: A string describing the location of the area.\n\n\n#### Value\n\nA vector containing tile location objects.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Initialize LFA processing with the default shapefile\nlfa_init()\n\n# Initialize LFA processing with a custom shapefile\nlfa_init(\"custom_areas.shp\")\n\n# Example usage with the default shapefile\nlfa_init()\n\n# Example usage with a custom shapefile\nlfa_init(\"custom_areas.shp\")\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_init(sf_file = \"research_areas.shp\")\n```\n:::\n\n\n\n### `lfa_intersect_areas`\n\nIntersect Lidar Catalog with Spatial Features\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`ctg` | A LAScatalog object representing the Lidar data to be processed.\n`tile_location` | A tile location object representing the specific area of interest.\n`areas_sf` | Spatial features defining areas.\n\n\n#### Description\n\nThis function intersects a Lidar catalog with a specific area defined by spatial features.\n\n\n#### Details\n\nThe function intersects the Lidar catalog specified by `ctg` with a specific area defined by\n the `tile_location` object and `areas_sf` . 
It removes points outside the specified area and\n returns a modified LAScatalog object.\n \n The specified area is identified based on the `species` and `name` attributes in the\n `tile_location` object. If a matching area is not found in `areas_sf` , the function\n stops with an error.\n \n The function then transforms the spatial reference of the identified area to match that of\n the Lidar catalog using `sf::st_transform` .\n \n The processing is applied to each chunk in the catalog using the `identify_area` function,\n which merges spatial information and filters out points that are not classified as inside\n the identified area. After processing, the function writes the modified LAS files back to\n the original file locations, removing points outside the specified area.\n \n If an error occurs during the processing of a chunk, a warning is issued, and the function\n continues processing the next chunks. If no points are found after filtering, a warning is\n issued, and NULL is returned.\n\n\n#### Seealso\n\nOther functions in the Lidar forest analysis (LFA) package.\n\n\n#### Value\n\nA modified LAScatalog object with points outside the specified area removed.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Example usage\nlfa_intersect_areas(ctg, tile_location, areas_sf)\n\n# Example usage\nlfa_intersect_areas(ctg, tile_location, areas_sf)\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_intersect_areas(ctg, tile_location, areas_sf)\n```\n:::\n\n\n\n### `lfa_load_ctg_if_not_present`\n\nLoading the catalog if it is not present\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`ctg` | Catalog object. Can be NULL\n`tile_location` | The location to look for the catalog tiles, if they are not present\n\n\n#### Description\n\nThis function checks if the catalog is `NULL` . 
If it is it will load the\n catalog from the `tile_location`\n\n\n#### Value\n\nThe provided ctg object if not null, else the catalog for the tiles\n of the tile_location.\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_load_ctg_if_not_present(ctg, tile_location)\n```\n:::\n\n\n\n### `lfa_map_tile_locations`\n\nMap Function Over Tile Locations\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`tile_locations` | A list of tile location objects.\n`map_function` | The mapping function to be applied to each tile location.\n`...` | Additional arguments to be passed to the mapping function.\n\n\n#### Description\n\nThis function applies a specified mapping function to each tile location in a list.\n\n\n#### Details\n\nThis function iterates over each tile location in the provided list ( `tile_locations` )\n and applies the specified mapping function ( `map_function` ) to each tile location.\n The mapping function should accept a tile location object as its first argument, and\n additional arguments can be passed using the ellipsis ( `...` ) syntax.\n \n This function is useful for performing operations on multiple tile locations concurrently,\n such as loading Lidar data, processing areas, or other tasks that involve tile locations.\n\n\n#### Seealso\n\nThe mapping function provided should be compatible with the structure and requirements\n of the tile locations and the specific task being performed.\n\n\n#### Value\n\nNone\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Example usage\nlfa_map_tile_locations(tile_locations, my_mapping_function, param1 = \"value\")\n\n# Example usage\nlfa_map_tile_locations(tile_locations, my_mapping_function, param1 = \"value\")\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_map_tile_locations(tile_locations, map_function, check_flag = NULL, ...)\n```\n:::\n\n\n\n### `lfa_merge_and_save`\n\nMerge and Save Text Files in a Directory\n\n\n#### Arguments\n\nArgument 
|Description\n------------- |----------------\n`input_directory` | The path to the input directory containing text files.\n`output_name` | The name for the output file where the merged content will be saved.\n\n\n#### Description\n\nThis function takes an input directory and an output name as arguments.\n It merges the textual content of all files in the specified directory into\n a single string, with each file's content separated by a newline character.\n The merged content is then saved into a file named after the output name\n in the same directory. After the merging is complete, all input files are\n deleted.\n\n\n#### Details\n\nThis function reads the content of each text file in the specified input directory\n and concatenates them into a single string. Each file's content is separated by a newline\n character. The merged content is then saved into a file named after the output name\n in the same directory. Finally, all input files are deleted from the directory.\n\n\n#### Seealso\n\n[`readLines`](#readlines) , [`writeLines`](#writelines) , [`file.remove`](#file.remove)\n\n\n#### Value\n\nThis function does not explicitly return any value. 
It prints a message\n indicating the successful completion of the merging and saving process.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Merge text files in the \"data_files\" directory and save the result in \"merged_output\"\nlfa_merge_and_save(\"data_files\", \"merged_output\")\n\n# Merge text files in the \"data_files\" directory and save the result in \"merged_output\"\nlfa_merge_and_save(\"data_files\", \"merged_output\")\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_merge_and_save(input_directory, output_name)\n```\n:::\n\n\n\n### `lfa_rd_to_qmd`\n\nConvert Rd File to Markdown\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`rdfile` | The path to the Rd file or a parsed Rd object.\n`outfile` | The path to the output Markdown file (including the file extension).\n`append` | Logical, indicating whether to append to an existing file (default is FALSE).\n\n\n#### Description\n\nIMPORTANT NOTE: \n This function is nearly identical to the `Rd2md::Rd2markdown` function from the `Rd2md` \n package. We needed to implement our own version of it for several reasons:\n \n\n* The algorithm uses hardcoded header sizes (h1 and h2 in original), which does not suit our use of the Markdown output. \n\n* We needed to add some Quarto Markdown specifics, e.g. to make sure that the examples are not run. \n\n* We want to exclude certain tags from our implementation.\n\n\n#### Details\n\nFor that reason we copied the method and made changes as needed and also added this custom documentation.\n \n This function converts an Rd (R documentation) file to Markdown format (.md) and\n saves the converted file at the specified location. The function allows appending\n to an existing file or creating a new one. 
The resulting Markdown file includes\n sections for the function's name, title, and additional content such as examples,\n usage, arguments, and other sections present in the Rd file.\n \n The function performs the following steps:\n \n\n* Parses the Rd file using the Rd2md package. \n\n* Creates a Markdown file with sections for the function's name, title, and additional content. \n\n* Appends the content to an existing file if `append` is set to TRUE. \n\n* Saves the resulting Markdown file at the specified location.\n\n\n#### Seealso\n\n[`Rd2md::parseRd`](#rd2md::parserd)\n\n\n#### Value\n\nThis function does not explicitly return any value. It saves the converted Markdown file\n at the specified location as described in the details section.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Convert Rd file to Markdown and save it\nlfa_rd_to_md(\"path/to/your/file.Rd\", \"path/to/your/output/file.md\")\n\n# Convert Rd file to Markdown and append to an existing file\nlfa_rd_to_md(\"path/to/your/file.Rd\", \"path/to/existing/output/file.md\", append = TRUE)\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_rd_to_qmd(rdfile, outfile, append = FALSE)\n```\n:::\n\n\n\n### `lfa_rd_to_results`\n\nConvert Rd Files to Markdown and Merge Results\n\n\n#### Description\n\nThis function converts all Rd (R documentation) files in the \"man\" directory\n to Markdown format (.qmd) and saves the converted files in the \"results/appendix/package-docs\" directory.\n It then merges the converted Markdown files into a single string and saves\n the merged content into a file named \"docs.qmd\" in the \"results/appendix/package-docs\" directory.\n\n\n#### Details\n\nThe function performs the following steps:\n \n\n* Removes any existing \"docs.qmd\" file in the \"results/appendix/package-docs\" directory. \n\n* Finds all Rd files in the \"man\" directory. \n\n* Converts each Rd file to Markdown format (.qmd) using the `lfa_rd_to_qmd` function. 
\n\n* Saves the converted Markdown files in the \"results/appendix/package-docs\" directory. \n\n* Merges the content of all converted Markdown files into a single string. \n\n* Saves the merged content into a file named \"docs.qmd\" in the \"results/appendix/package-docs\" directory.\n\n\n#### Seealso\n\n[`lfa_rd_to_qmd`](#lfardtoqmd) , [`lfa_merge_and_save`](#lfamergeandsave)\n\n\n#### Value\n\nThis function does not explicitly return any value. It performs the conversion,\n merging, and saving operations as described in the details section.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Convert Rd files to Markdown and merge the results\nlfa_rd_to_results()\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_rd_to_results()\n```\n:::\n\n\n\n### `lfa_read_area_as_catalog`\n\nRead LiDAR data from a specified species and location as a catalog.\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`specie` | A character string specifying the species of interest.\n`location_name` | A character string specifying the name of the location.\n\n\n#### Description\n\nThis function constructs the file path based on the specified `specie` and `location_name` ,\n lists the directories at that path, and reads the LiDAR data into a `lidR::LAScatalog` .\n\n\n#### Value\n\nA `lidR::LAScatalog` object containing the LiDAR data from the specified location and species.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_read_area_as_catalog(\"beech\", \"location1\")\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_read_area_as_catalog(specie, location_name)\n```\n:::\n\n\n\n### `lfa_segmentation`\n\nSegment the elements of an point cloud by trees\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`ctg` | An LASCatalog object. 
If not NULL, the actions are performed on this object; if NULL, the catalog is inferred from the tile_location\n`tile_location` | A tile_location type object holding the information about the location of the catalog. This is used to save the catalog after processing too.\n\n\n#### Author\n\nJakob Danel\n\n\n#### Description\n\nThis function will try to divide the whole point cloud into unique trees.\n To do so, it assigns a `treeID` to each point in each chunk of the\n catalog. The algorithm uses the `li2012` implementation with the\n following parameters: `li2012(dt1 = 2, dt2 = 3, R = 2, Zu = 10, hmin = 5, speed_up = 12)` \n NOTE : The operation is in place and cannot be reverted, the old values\n of the point cloud will be deleted!\n\n\n#### Value\n\nA catalog where each chunk has additional `treeID` values indicating the tree each point belongs to.\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_segmentation(ctg, tile_location)\n```\n:::\n\n\n\n### `lfa_set_flag`\n\nSet a flag to indicate the completion of a specific process.\n\n\n#### Arguments\n\nArgument |Description\n------------- |----------------\n`flag_name` | A character string specifying the name of the flag file. It should be a descriptive and unique identifier for the process being flagged.\n\n\n#### Description\n\nThis function creates a hidden flag file at a specified location within the working directory to indicate that a particular processing step has been completed. 
If the flag file already exists, a warning is issued.\n\n\n#### Value\n\nThis function does not have a formal return value.\n\n\n#### Examples\n\n::: {.cell}\n\n```{.r .cell-code}\n# Set the flag for a process named \"data_processing\"\nlfa_set_flag(\"data_processing\")\n```\n:::\n\n\n#### Usage\n\n::: {.cell}\n\n```{.r .cell-code}\nlfa_set_flag(flag_name)\n```\n:::\n\n\n\n", "supporting": [ "report_files" ], diff --git a/results/_freeze/report/figure-pdf/fig-patches-nrw-1.pdf b/results/_freeze/report/figure-pdf/fig-patches-nrw-1.pdf index 903da7d0f8a2a8f83f1f41885c82ac6171fed1f9..ffa415168d9132ce7930db3ea7b3b347f682e529 100644 GIT binary patch delta 183 zcmbQRf_cIU<_!-ItD6`Y8k!lJ85o%uYjWxP=BKzMmZU0ZxL6qhg%EO^IgWJlaw7>$ zo^w$GnZJ4e#ZQd!uI4T#rfx1qu5QL|PDbX&W{#E?t_Ciq2F3=iZsvx@b_zBGmBcF8 d*>M$@Bo>ua6s4wdnHpP|8gi+sy863u0RThbFopmC delta 183 zcmbQRf_cIU<_!-Is~Z~{8Cn{en;ILKYI5oO=BKzMmZU0ZxL6q(7$M1R<~Y*H%Z(5) zn>^>D1Tuf~{)?X&2r7wH eu(RVTE=epZsVGWK<1#h2Fg4^-Rdw}u;{pJZelfHF diff --git a/results/appendix/package-docs/docs.qmd b/results/appendix/package-docs/docs.qmd index 2145462..3793bf7 100644 --- a/results/appendix/package-docs/docs.qmd +++ b/results/appendix/package-docs/docs.qmd @@ -126,6 +126,81 @@ lfa_check_flag(flag_name) +### `lfa_create_stacked_distributions_plot` + +Create a stacked distribution plot for tree detections, visualizing the distribution + of a specified variable on the x-axis, differentiated by another variable. + + +#### Arguments + +Argument |Description +------------- |---------------- +`trees` | A data frame containing tree detection data. +`x_value` | A character string specifying the column name used for finding the values on the x-axis of the histogram. +`fill_value` | A character string specifying the column name by which the data are differentiated in the plot. +`bin` | An integer specifying the number of bins for the histogram. Default is 100. +`ylab` | A character string specifying the y-axis label. Default is "Amount trees." 
+`xlim` | A numeric vector of length 2 specifying the x-axis limits. Default is c(0, 100). +`ylim` | A numeric vector of length 2 specifying the y-axis limits. Default is c(0, 1000). +`title` | The title of the plot. + + +#### Description + +This function generates a stacked distribution plot using the ggplot2 package, + providing a visual representation of the distribution of a specified variable + ( `x_value` ) on the x-axis, with differentiation based on another variable + ( `fill_value` ). The data for the plot are derived from the provided `trees` + data frame. + + +#### Keyword + +data + + +#### Seealso + +[`ggplot2::geom_histogram`](#ggplot2::geomhistogram) , [`ggplot2::facet_wrap`](#ggplot2::facetwrap) , + [`ggplot2::ylab`](#ggplot2::ylab) , [`ggplot2::scale_fill_brewer`](#ggplot2::scalefillbrewer) , + [`ggplot2::coord_cartesian`](#ggplot2::coordcartesian) + + +#### Value + +A ggplot object representing the stacked distribution plot. + + +#### Examples + +```{r} +#| eval: false +# Create a stacked distribution plot for variable "Z," differentiated by "area" +trees <- lfa_get_detections() +lfa_create_stacked_distributions_plot(trees, "Z", "area") +``` + + +#### Usage + +```{r} +#| eval: false +lfa_create_stacked_distributions_plot( + trees, + x_value, + fill_value, + bin = 100, + ylab = "Amount trees", + xlim = c(0, 100), + ylim = c(0, 1000), + title = + "Histograms of height distributions between species 'beech', 'oak', 'pine' and 'spruce' divided by the different areas of Interest" +) +``` + + + ### `lfa_create_tile_location_objects` Create tile location objects @@ -318,6 +393,53 @@ lfa_download(species, name, location) +### `lfa_get_all_areas` + +Retrieve a data frame containing all species and corresponding areas. + + +#### Description + +This function scans the "data" directory within the current working directory to + obtain a list of species. It then iterates through each species to retrieve the list + of areas associated with that species. 
The resulting data frame contains two columns: + "specie" representing the species and "area" representing the corresponding area. + + +#### Keyword + +data + + +#### Seealso + +[`list.dirs`](#list.dirs) + + +#### Value + +A data frame with columns "specie" and "area" containing information about + all species and their associated areas. + + +#### Examples + +```{r} +#| eval: false +# Retrieve a data frame with information about all species and areas +all_areas_df <- lfa_get_all_areas() +``` + + +#### Usage + +```{r} +#| eval: false +lfa_get_all_areas() +``` + + + ### `lfa_get_detection_area` Get Detection for an area @@ -436,64 +558,6 @@ lfa_get_detections_species(species) -### `lfa_get_detections` - -Retrieve aggregated detection data for multiple species. - - -#### Concept - -data retrieval functions - - -#### Description - -This function obtains aggregated detection data for multiple species by iterating - through the list of species obtained from [`lfa_get_species`](#lfagetspecies) . For each - species, it calls [`lfa_get_detections_species`](#lfagetdetectionsspecies) to retrieve the - corresponding detection data and aggregates the results into a single data frame. - The resulting data frame includes columns for the species, tree detection data, - and the area in which the detections occurred. - - -#### Keyword - -aggregation - - -#### Seealso - -[`lfa_get_species`](#lfagetspecies) , [`lfa_get_detections_species`](#lfagetdetectionsspecies) - - Other data retrieval functions: - [`lfa_get_species`](#lfagetspecies) - - -#### Value - -A data frame containing aggregated detection data for multiple species. - - -#### Examples - -```{r} -#| eval: false -lfa_get_detections() - -# Retrieve aggregated detection data for multiple species -detections_data <- lfa_get_detections() -``` - - -#### Usage - -```{r} -#| eval: false -lfa_get_detections() -``` - - - ### `lfa_get_flag_path` Get the path to a flag file indicating the completion of a specific process. 
@@ -534,63 +598,6 @@ lfa_get_flag_path(flag_name) -### `lfa_get_species` - -Get a list of species from the data directory. - - -#### Concept - -data retrieval functions - - -#### Description - -This function retrieves a list of species by scanning the "data" directory - located in the current working directory. - - -#### Keyword - -data - - -#### References - -This function relies on the [`list.dirs`](#list.dirs) function for directory listing. - - -#### Seealso - -[`list.dirs`](#list.dirs) - - Other data retrieval functions: - [`lfa_get_detections`](#lfagetdetections) - - -#### Value - -A character vector containing the names of species found in the "data" directory. - - -#### Examples - -```{r} -#| eval: false -# Retrieve the list of species -species_list <- lfa_get_species() -``` - - -#### Usage - -```{r} -#| eval: false -lfa_get_species() -``` - - - ### `lfa_ground_correction` Correct the point clouds for correct ground imagery @@ -1145,6 +1152,47 @@ lfa_rd_to_results() +### `lfa_read_area_as_catalog` + +Read LiDAR data from a specified species and location as a catalog. + + +#### Arguments + +Argument |Description +------------- |---------------- +`specie` | A character string specifying the species of interest. +`location_name` | A character string specifying the name of the location. + + +#### Description + +This function constructs the file path based on the specified `specie` and `location_name` , + lists the directories at that path, and reads the LiDAR data into a `lidR::LAScatalog` . + + +#### Value + +A `lidR::LAScatalog` object containing the LiDAR data from the specified location and species. 
+ + +#### Examples + +```{r} +#| eval: false +lfa_read_area_as_catalog("beech", "location1") +``` + + +#### Usage + +```{r} +#| eval: false +lfa_read_area_as_catalog(specie, location_name) +``` + + + ### `lfa_segmentation` Segment the elements of an point cloud by trees diff --git a/results/methods/distribution-analysis.qmd b/results/methods/distribution-analysis.qmd new file mode 100644 index 0000000..8db6bf6 --- /dev/null +++ b/results/methods/distribution-analysis.qmd @@ -0,0 +1,20 @@ +## Analysis of different distributions + +Analysis of data distributions is a critical aspect of our research, with a focus on comparing two or more distributions. Our objective extends beyond evaluating the disparities between species; we also aim to assess differences within a species. To gain a comprehensive understanding of the data, we employ various visualization techniques, including histograms, density functions, and box plots. + +Alongside the visualizations, descriptive statistics such as means, standard errors, and quantiles provide key insights into the central tendency and variability of the data. + +For a more quantitative analysis of distribution dissimilarity, statistical measures and tests are employed. The Kullback-Leibler (KL) divergence measures how much one distribution differs from another. This involves converting distributions into their density functions, with the standard error serving as the bandwidth. Because the KL divergence is asymmetric, it is calculated for each ordered pair of distributions. For two distributions $P$ and $Q$, the KL divergence is defined as follows [@kullback1951kullback]: + +$$ +D_{KL}(P \, \| \, Q) = \sum_i P(i) \log\left(\frac{P(i)}{Q(i)}\right) +$$ + +To obtain a symmetric score, the Jensen-Shannon Divergence (JSD) is utilized [@grosse2002analysis], expressed by the formula: + +$$ +D_{JS}(P \, \| \, Q) = \frac{1}{2} D_{KL}(P \, \| \, M) + \frac{1}{2} D_{KL}(Q \, \| \, M) +$$ +Here, $M = \frac{1}{2} (P + Q)$. 
The JSD provides a balanced measure of dissimilarity between distributions [@Brownlee2019Calculate]. To compare the resulting scores with each other, we use their averages. + +Additionally, the Kolmogorov-Smirnov test is applied to assess whether two distributions differ significantly from each other. This statistical test offers a formal evaluation of the dissimilarity between empirical distribution functions. diff --git a/results/references.bib b/results/references.bib index 00c74ca..af03977 100644 --- a/results/references.bib +++ b/results/references.bib @@ -34,4 +34,19 @@ @article{popescu2004 doi = {10.14358/PERS.70.5.589} } @misc{Blickensdoerfer2022, title={Dominant tree species for Germany (2017/2018)}, url={https://atlas.thuenen.de/layers/geonode:Dominant_Species_Class}, journal={Waldatlas- Wald und Waldnutzung}, publisher={Thünen Atlas}, author={Blickensdoerfer, Lukas}, year={2022}, month={Dec}} - +@misc{kullback1951kullback, + title={Kullback-Leibler divergence}, + author={Kullback, Solomon}, + year={1951} +} +@article{grosse2002analysis, + title={Analysis of symbolic sequences using the Jensen-Shannon divergence}, + author={Grosse, Ivo and Bernaola-Galv{\'a}n, Pedro and Carpena, Pedro and Rom{\'a}n-Rold{\'a}n, Ram{\'o}n and Oliver, Jose and Stanley, H Eugene}, + journal={Physical Review E}, + volume={65}, + number={4}, + pages={041905}, + year={2002}, + publisher={APS} +} +@misc{Brownlee2019Calculate, title={How to calculate the KL divergence for Machine Learning}, url={https://machinelearningmastery.com/divergence-between-probability-distributions/}, journal={MachineLearningMastery.com}, author={Brownlee, Jason}, year={2019}, month={Oct}} diff --git a/results/report.qmd b/results/report.qmd index a9cef1d..d10046a 100644 --- a/results/report.qmd +++ b/results/report.qmd @@ -9,7 +9,21 @@ toc-title: Contents number-sections: true number-depth: 3 date: today -author: Jakob Danel and Frederick Bruch +author: + - name: Jakob Danel + email: 
jakob.danel@uni-muenster.de + url: https://github.com/jakobdanel + affiliations: + - name: Universität Münster + city: Münster + country: Germany + - name: Frederick Bruch + email: f_bruc03@uni-muenster.de + url: https://www.uni-muenster.de/Geoinformatics/institute/staff/index.php/351/Frederick_Bruch + affiliations: + - name: Universität Münster + city: Münster + country: Germany bibliography: references.bib execute-dir: .. prefer-html: true @@ -23,7 +37,7 @@ This report documents the analysis of forest data for different tree species. {{< include methods/data-aquisition.qmd >}} {{< include methods/preprocessing.qmd >}} - +{{< include methods/distribution-analysis.qmd >}} # Results {{< include results/researched-areas.qmd >}} @@ -43,6 +57,8 @@ This report documents the analysis of forest data for different tree species. |spruce |oberhundem | 0.0162678| |spruce |osterwald | 0.0129892| + + # References ::: {#refs}