Skip to content
This repository has been archived by the owner on Apr 11, 2022. It is now read-only.

Substitute functions from raster package for terra #4

Open
Jo-Schie opened this issue Oct 18, 2021 · 10 comments
Open

Substitute functions from raster package for terra #4

Jo-Schie opened this issue Oct 18, 2021 · 10 comments
Assignees
Labels
enhancement New feature or request

Comments

@Jo-Schie
Copy link
Member

Jo-Schie commented Oct 18, 2021

Terra package is the successor of raster and is supposedly better to handle large spatial datasets. I suggest that we substitute all raster functions with terra functions. This might enable us to also process large data without being reliant on GRASS and the rather complicated installation of all the dependencies. I created a small test where I benchmark the performances. You can see that terraseems to even outperform GRASSwhich is quite surprising to me, however I did just compare them for simple zonal statistics.

library(raster)
library(terra)
library(rgrass7)

set.seed(10)

## COMPARE RASTER TO RASTER ZONAL
# zonal statistics using the raster package
r <- raster::raster(ncols=4000, nrows=4000)
raster::values(r) <- runif(ncell(r)) * 1:ncell(r)
z <- r
raster::values(z) <- rep(1:5, each=raster::ncell(z)/5)
# for large files, use a character value rather than a function
system.time({ raster::zonal(r, z, 'sum') })

# zonal statistics using the terra package
x <- terra::rast(nrows=4000, ncols=4000)
terra::values(x) <- runif(ncell(x)) * 1:ncell(x)
zz <- x
terra::values(zz) <- rep(1:5, each=raster::ncell(z)/5)
# calculate zonal
system.time({ terra::zonal(x, zz, 'sum') })

terra::writeRaster(x,"test_raster.tif",overwrite=T)
terra::writeRaster(zz,"test_zonal.tif",overwrite=T,datatype="INT4U")

# zonal statistics using GRASS
initGRASS(gisBase = "/usr/lib/grass78/",
          home = "./",
          addon_base = "./data-raw/addons", 
          override = T)

# Import raster to GRASS
execGRASS(
  "r.in.gdal",
  flags = c("overwrite", "o"),
  parameters = list(input = "test_raster.tif",
                    output = "myraster")
)
# set grass region from raster
execGRASS("g.region",
          parameters = list(raster = "myraster"))


execGRASS(
  "r.in.gdal",
  flags = c("overwrite", "o"),
  parameters = list(input = "test_zonal.tif",
                    output = "myzones")
)

system.time({
execGRASS(
  "r.stats.zonal",
  flags = c("overwrite"),
  parameters = list(base = "myzones",
                    cover = "myraster",
                    method = "sum", output = "mystats")
)
})

@goergen95 : I will try to do a first draft converting some of the scripts when I find a suitable time-spot in the upcoming days. You might want to confirm, nevertheless, the benchmarks and then eventually help and check if I get stuck.

@Ohm-Np : This is just for your information. You can use the package as is in the meantime.

@Jo-Schie Jo-Schie added the enhancement New feature or request label Oct 18, 2021
@Jo-Schie Jo-Schie self-assigned this Oct 18, 2021
@Jo-Schie
Copy link
Member Author

Just for your information @goergen95 . @Ohm-Np and me are currently working on this. Hope to get it ready soon to test it on big data. @Ohm-Np . Please let us know progress or eventual issues in this threat here...

@goergen95
Copy link
Member

Please consider #6 and #7 when reworking the routine for the CO2 layer calculation.

@Jo-Schie
Copy link
Member Author

Jo-Schie commented Dec 3, 2021

@Ohm-Np and @goergen95 : Hi Om, you do not need to consider #6 and #7 for reworking the Terra routines.

@Ohm-Np
Copy link

Ohm-Np commented Dec 3, 2021

Hi @Jo-Schie,

AreaCalc is working fine now.

But I got some error with LossCalc while running this script:

# calculation of forest loss area
loss_area = LossCalc(inputForestMap = treeCover_yearly,
                     inputLossMap = lossYear,
                     studysite = result_latlon,
                     latlon = TRUE,
                     polyName = "WDPAID",
                     ncores = 4,
                     years = 2000:2020)
# the error
Error in names(x) <- value : 
  'names' attribute [21] must be the same length as the vector [1]

# error in this part of the script LossCalc.R
colnames(LossStats) = paste0("loss_",unis+2000)

What could be the case?
If you want to test, you can log in to R with my account.

@Jo-Schie
Copy link
Member Author

Jo-Schie commented Dec 5, 2021

Hi @Ohm-Np . I was able to identify the error and fix it. LossCalc should work now. I already pushed the few changes from to the branch. CO2Calc currently does not work but that will require anyways changes from @goergen95 side on the download functions AFAIK.

Could you do two things for me

  • test it on windows (script see below)
  • check for validity of the results (maybe use qgis or something like that to very quickly check whether areaCalc and lossCalc produce correct results.
library(sf)
library(terra)
remotes::install_github("mapme-initiative/mapme.forest",ref = "terra")
library(mapme.forest)


# ---- test download function and prepare data -----
# read in polygons of interest
aoi = st_read(system.file("extdata", "aoi_polys.gpkg", package = "mapme.forest"))

# download GFW data for the area of interest
# raster_files = downloadfGFW(shape = aoi,
#                             basename = "pkgTest",
#                             dataset = "GFC-2018-v1.6",
#                             outdir = "../../datalake/tempdir/pkg_test",
#                             keepTmpFiles = T)
raster_files<-rast(paste0("../../datalake/tempdir/pkg_test/",list.files("../../datalake/tempdir/pkg_test/")))

# we need to reproject the aoi object
aoi = st_transform(aoi, crs = st_crs(raster_files))

# crop the rasters to the extent of the aoi
rasters=terra::crop(raster_files, aoi)

# assigning proper names to the layers and simply plot the results
names(rasters) = c("co2_emission_FAKE", "lossyear", "treecover")
plot(rasters)


#----- test prepTC -----
prep_treeCover = prepTC(inputForestMap = rasters$treecover,
                        thresholdCover = 75,
                        thresholdClump = 20)

# ----- test Yearly forest mask -----
yearlyForestMask = getTM(inputForestMap = prep_treeCover,
                         inputLossMap = rasters$lossyear,
                         years = 2001:2018)

# ---- test Area Calc -----
result_latlon = AreaCalc(yearlyForestMask,
                         studysite = aoi,
                         latlon=TRUE,
                         polyName = "id",
                         ncores = 1,
                         years = 2001:2018)

# ----- test LossCalc -----
loss_area = LossCalc(inputForestMap = yearlyForestMask,
                     inputLossMap = rasters$lossyear,
                     studysite = result_latlon,
                     latlon = TRUE,
                     polyName = "id",
                     ncores = 1,
                     years = 2001:2018)

# ----- test CO2Calc ----
# currently not working
# co2_emission = CO2Calc(inputForestMap = yearlyForestMask,
#                        inputLossMap = rasters$lossyear,
#                        inputCO2Map = rasters$co2_emission_FAKE,
#                        studysite = result_latlon,
#                        polyName = "id",
#                        ncores = 1,
#                        years = 2001:2018)

@Jo-Schie
Copy link
Member Author

Jo-Schie commented Dec 5, 2021

i just realized on my private pc that the download function did not work properly there. So if you cannot download and open the data you might want to put this on a hold for a while and tell us here. Maybe we need to merge this branch first to main to get the changes that darius made to the download functions...

@goergen95
Copy link
Member

Hi, you still have errrors with the download function on your PC? I think there is an is sie relativ to Windows so please let me know if it works fo you?

@Jo-Schie
Copy link
Member Author

Hi Darius. no... I fixed it (see my pushes to this branch from Yesterday). However I am not able to fix an issue in the PrepTC function. It is somehow something with the reclassification of the raster in the very beginning of the script. It seems that the terra function does not produce the same output as the raster functions. You may wanna have a look at it. Otherwise all other functions seem to work. If the PrepTC issue can be fixed then I guess we can merge and substitute raster for terra and then do some benchmarks.

@Jo-Schie
Copy link
Member Author

The problem seems to be in the preprocess script in line 64

 # create classification matrix to apply thresholdCover value
    f = function(x) ifelse(x>=thresholdCover, 1, 0)
    inputForestMap = app(inputForestMap, f, cores=ncores)

@Jo-Schie
Copy link
Member Author

I already thought of using maybe the classify function from terra instead but I was not a 100% sure what this part of the script actually does so I didn't try it out yet.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants