Omniscape crashing with AssertionError: norm(G * v .- curr) / norm(curr) < 1.0e-6 #127
Comments
@mir123 This is due to Circuitscape having some solution errors over the acceptable tolerance. Is this the full stacktrace? Omniscape should print the row and column at which the failure occurred when an error is thrown. Something about the source and resistance within a specific moving window solve is giving Circuitscape a hard time. cc @ranjanan |
I did get the following errors before:
Plotting the points shows that they all end up in a line (here with the resistance surface as background; the yellow circle is the radius) |
I am having a similar issue (I think) -- a similar error message ( nested task error: AssertionError: norm(matrix * lhs[:, i] .- rhs[:, i]) / norm(rhs[:, i]) < 1.0e-6) Julia version: Version 1.6.1
Unlike mir123, when I look at the location of the error-causing cells there doesn't appear to be any clear spatial pattern other than perhaps some moving windows containing a lot of no-data cells? |
@ptfreeman-csp unfortunately this is indeed a problem with Circuitscape. From what I'm seeing, it does seem to happen more when people are using large moving windows. You might consider cross posting this to the Circuitscape repo as an issue over there? |
@vlandau - ah well shoot. Thanks for taking the time to reply! I will look into cross posting this to Circuitscape and/or trying to reduce my moving window size to see if that ameliorates the issue. |
I'll add, this also isn't necessarily an issue with Circuitscape, but rather the linear solvers it's using. It's strange that you're getting these errors when using cholmod because that's supposed to be a direct brute-force solver. |
Cross-posting back here that I submitted an issue to Circuitscape here. I will say that I updated to the latest versions of Omniscape and Circuitscape and did get a slightly different error message but I don't have the wherewithal to understand it. Reposting here in case there's anything obvious I'm missing:
All inputs are here. I can give you access as needed. |
Another update on this, in the hope that it helps someone identify the source of the problem. I examined the values in my source and resistance layers at the row/column combinations that are throwing the "moving window failed at" messages, and the vast majority (but not all) occur where source values are NaN -- could this be a potential problem? I thought that these no-data/NaN values were set to zero? And for what it's worth, almost all of these "failure points" occur in areas of relatively high resistance in our resistance surfaces (which in this case almost exclusively correspond to roads)
|
Good find @ptfreeman-csp. Thanks for taking a deeper look. I'll try to take a look this weekend. You may be onto something here. |
Hey @vlandau & @ptfreeman-csp - Funny seeing you both chatting about this recently. I am re-doing some big runs in Omniscape and getting similar errors. I am going to keep digging but figure I'll post the trace here in case it's helpful
|
@gagecarto yup -- that is basically the same exact error I'm getting (I think) and it's an issue with the solver employed by Circuitscape but I don't know how to "hack it" to have a higher error tolerance. I am experimenting with shrinking my moving window sizes to see if/where I manage to get a successful run but it's been a bit of a shot in the dark. If @ranjanan and @vlandau have any suggestions on how to overcome the error in the solver component I would be thrilled. @gagecarto do your source inputs have a lot of missing/masked areas (e.g. over things like urban areas) that also have corresponding relatively high resistance values? Just curious if that's a common factor here... In general I also only get the crash right before the Omniscape run is done 😢 |
@ptfreeman-csp - Yep, our highly resistant areas are not null, but have been given high values
Now that I am further reviewing our input data, the ocean should be null, not high. We do want to leave our lakes and rivers as high but removing all those unnecessary ocean pixels will be my next step before a rerun |
@gagecarto We seem to have generally the same idea -- although I will say that I have plenty of areas that are "no-data" - we're working in the desert in southern California but our buffer area does extend out to no-data ocean and I've still been running into issues. Have you plotted the particular cells that Omniscape is failing at? Out of curiosity are the 'no-data' areas in your raster labeled as true NAs or are they NaN? |
@ptfreeman-csp - I've not plotted the problem cells yet. My no data values are -3.4028234663852886e+38
|
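The no-data value quoted above, -3.4028234663852886e+38, is the most negative finite 32-bit float, which is a common raster no-data sentinel. Below is a minimal sketch for counting how many cells carry that sentinel and replacing them with NaN for inspection. It assumes the raster has already been read into a nested list; `mask_nodata` is a hypothetical helper, not part of Omniscape or Circuitscape, and nothing in this thread confirms that the choice of sentinel is related to the error.

```python
import math
import struct

# Most negative finite float32, recovered by unpacking its bit pattern.
FLOAT32_LOWEST = struct.unpack("<f", struct.pack("<I", 0xFF7FFFFF))[0]

def mask_nodata(grid, nodata=FLOAT32_LOWEST):
    """Return (masked copy of grid, number of cells replaced with NaN)."""
    masked = []
    replaced = 0
    for row in grid:
        new_row = []
        for value in row:
            if value == nodata:
                # Sentinel cell: mark it as NaN and count it.
                new_row.append(math.nan)
                replaced += 1
            else:
                new_row.append(value)
        masked.append(new_row)
    return masked, replaced
```

The count gives a quick sense of how much of a layer is no-data before comparing it against the cells where moving windows fail.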
@gagecarto I have no idea if this is a potential issue or not but I am wondering if there's something funky going on with the no-data values. Can I ask how you generated your source and resistance layers (i.e. in R, GEE, something else?)? |
@ptfreeman-csp - Rasters were generated and exported using terra in R. I've now had some successful runs with smaller subsets of the data. I am slowly increasing the extent and adjusting parameters, hoping to find out what's crashing the bigger runs. |
I generated my layers using the gdal raster calculator. |
@ptfreeman-csp @mir123 @vlandau - I struggled with these errors for many days. I then remembered I had a saved EC2 image from previous Omniscape runs in 2021. I fired up this image, which was running Julia 1.6.2 with Omniscape and the other dependencies as installed at that time. Everything is now working as it should. I pulled all the package folders etc. if anyone wants a copy. |
I was able to re-create the environment/project from @gagecarto using his project.toml and manifest.toml files and I also got everything to run properly. |
Hi @gagecarto I am having a similar issue and would love the files to recreate the older environment if possible! |
Hi @coport - below are the contents of my Manifest.toml and Project.toml
PROJECT.TOML
MANIFEST.TOML
|
Hi, sorry, I'm not the most proficient with programming. How does one use these files to install the correct versions in Julia? |
@coport I followed the section of this tutorial to recreate the environment that @gagecarto was working with. |
Confirmed that using @gagecarto's environment with Julia 1.6.2 the job runs successfully. As posted on the issue in Circuitscape, the latest version of Circuitscape does not fix the issue. |
Have there been any developments on this issue? I'm running my job on a university supercomputer, and therefore changing the environment to match @gagecarto 's may be quite convoluted. Is there any way to access old versions without manually doing so? |
Hello |
We have had some luck getting rid of this error by removing "islands" of permeable pixels (>=1) that were surrounded by a "moat" of impervious pixels (NoData/source strength 0). In some cases the radius of the moving window seemed to get stuck in these islands and was causing this error. We changed our rasters to avoid creating islands and haven't had the error since. Sorry I don't have a good reproducible test case handy |
That certainly could square with the issues we were having given that we
had many permeable pixel islands surrounded by moats of impervious pixels!
|
Does anyone have advice on locating and mitigating these permeable pixel islands? With quite large rasters, I can't comb through values efficiently or effectively. |
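One possible way to locate such islands is connected-component labelling: group valid cells that touch each other and flag the groups that are small. The sketch below is an illustration only, not the method anyone in this thread used. It assumes the raster is already loaded as a nested list with NaN marking impervious/no-data cells; `find_islands`, `max_size`, and `eight_connected` are hypothetical names, and the connectivity rule should match the one used in the actual run.

```python
import math
from collections import deque

def find_islands(grid, max_size, eight_connected=True):
    """Return connected groups of valid cells that contain at most max_size cells.

    A cell counts as valid (permeable) when it is not NaN; that rule is an
    assumption for this sketch and should be adapted to the actual rasters.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    if eight_connected:
        steps += [(-1, -1), (-1, 1), (1, -1), (1, 1)]
    seen = [[False] * n_cols for _ in range(n_rows)]
    islands = []
    for r in range(n_rows):
        for c in range(n_cols):
            if seen[r][c] or math.isnan(grid[r][c]):
                continue
            # Breadth-first flood fill collects one connected component.
            component = []
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                cr, cc = queue.popleft()
                component.append((cr, cc))
                for dr, dc in steps:
                    nr, nc = cr + dr, cc + dc
                    if (0 <= nr < n_rows and 0 <= nc < n_cols
                            and not seen[nr][nc]
                            and not math.isnan(grid[nr][nc])):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            # Keep only small components: these are the candidate islands.
            if len(component) <= max_size:
                islands.append(component)
    return islands
```

Each returned island is a list of `(row, col)` cells that can be plotted or filled in. For very large rasters a pure-Python loop will be slow; an array-based labelling routine such as `scipy.ndimage.label` performs the same grouping far faster.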
Ultimately this issue lies with Circuitscape, so I'm going to close this given that this issue has been discussed over there. If someone can create a minimum working example of the "pixel island" issue in a single Advanced mode Circuitscape run, that may confirm whether this is the root cause, and will likely help speed up any potential resolution to the issue. The reason that this wasn't showing up in older versions of Omniscape is that those were using older versions of Circuitscape, which didn't include this assert check (which is basically used to confirm that the solution arrived at by the linear solver is close to correct). Once Omniscape's dep on Circuitscape was bumped up to the version that included those checks, this issue came to the surface. |
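To make the assert concrete: the expression in the error message, `norm(G * v .- curr) / norm(curr) < 1.0e-6`, is a relative-residual test on the solved linear system. The following is a minimal sketch of that test on a toy 2-node system. The matrix, vectors, and the `relative_residual` helper are made up for illustration and are not Circuitscape's code.

```python
import math

def relative_residual(G, v, curr):
    """Compute norm(G*v - curr) / norm(curr) for a dense matrix G (list of rows)."""
    Gv = [sum(g_ij * v_j for g_ij, v_j in zip(row, v)) for row in G]
    residual = math.sqrt(sum((a - b) ** 2 for a, b in zip(Gv, curr)))
    return residual / math.sqrt(sum(b ** 2 for b in curr))

# Toy system: G plays the role of a conductance matrix, curr the injected currents.
G = [[2.0, -1.0], [-1.0, 2.0]]
curr = [1.0, 0.0]
good_v = [2.0 / 3.0, 1.0 / 3.0]  # exact solution of G*v = curr
bad_v = [0.6, 0.3]               # deliberately inaccurate solution

print(relative_residual(G, good_v, curr) < 1.0e-6)
print(relative_residual(G, bad_v, curr) < 1.0e-6)
```

The exact solution passes the tolerance while the inaccurate one does not, which is the situation the assertion reports when a moving-window solve fails.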
Have just run into this same issue (julia/1.9.4) for runs with 25 or 50 mile windows using resistance surfaces that are fine for smaller windows. Tinkered with small changes in block size, hoping it might nudge the windows past some problematic pixel arrangement, but to no effect. Am not relishing the idea of having to mess around with a painfully arrived at series of window/block cell sizes. Hopefully I can identify the cells and try @slamanders method of fixing them individually... |
@BortEdwards There has been an attempted fix in Circuitscape.jl, just released today. You'll see above your comment that one of the recent patches to Circuitscape mentioned this issue. Try updating Circuitscape to v5.13.3. |
This seems similar but not quite like #108 and #100, also see Circuitscape/Circuitscape.jl#305
Omniscape crashed at 91% of a run with a TaskFailedException. This is a fairly large job, with a 224M cells raster (resolution 30m) and a 2200 cell radius (block size 221). Memory usage with 32 cores topped at around 360 GB.
I had a successful run with the same resistance and source surfaces using a 200 cell radius (block size 21). Reducing resolution of surfaces to 90m with a radius of 777 also runs successfully.
Julia: 1.8.3
Omniscape: 0.5.8
config:
Keeping the default cg+amg solver.
Resistance file
Source file
Error message: