AHM Analysis Performance #318

Open
lars-petter-hauge opened this issue Apr 22, 2021 · 6 comments

Comments

@lars-petter-hauge
Contributor

The work in #231 introduces a workflow with some possible performance caveats:

  1. The workflow runs the update step n*2 times, where n is the number of observations. This will take a significant amount of time for cases with many observations.

  2. We load the grid into a pandas dataframe and apply functions to it. For large grid files this could cause memory issues.

  3. Field parameters are not available through the current ERT API, so the job has to export them to disk and load them back.

@lars-petter-hauge
Contributor Author

To address the number of observations, we could let the user specify a list of observations they would like to include in the analysis (the remaining observations could then all be included or excluded, based on preference).
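A minimal sketch of one reading of this idea: each selected observation is analysed on its own, and the remaining ones are either lumped into a single group or dropped. The helper name `plan_observation_groups`, the `keep_remaining` flag, and the observation keys are hypothetical and not part of any existing ERT API.

```python
from typing import List, Optional, Sequence


def plan_observation_groups(
    all_obs: Sequence[str],
    selected: Optional[Sequence[str]] = None,
    keep_remaining: bool = False,
) -> List[List[str]]:
    """Return the observation groups to analyse, one inner list per group."""
    if selected is None:
        # No selection given: analyse every observation on its own.
        return [[key] for key in all_obs]

    unknown = sorted(set(selected) - set(all_obs))
    if unknown:
        raise ValueError(f"Unknown observations: {unknown}")

    chosen = set(selected)
    # Each selected observation is analysed individually, in original order.
    groups = [[key] for key in all_obs if key in chosen]

    remaining = [key for key in all_obs if key not in chosen]
    if keep_remaining and remaining:
        # Lump everything else into a single combined group.
        groups.append(remaining)
    return groups


# Hypothetical observation keys, for illustration only.
groups = plan_observation_groups(["WOPR", "WGOR", "RFT"], ["RFT"], keep_remaining=True)
```

With a selection of k observations, the number of groups drops from n to k (or k + 1 when the remainder is kept), which would reduce the number of update steps accordingly.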

@mareco701
Contributor

We could also let the user choose whether to include Field parameters in the evaluation. That way the results for the other parameters would still be available if a Field parameter makes the script fail.
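A rough sketch of what such a switch could look like, assuming each parameter is described by a (name, kind) pair and that an `evaluate` callable does the actual work. The function name, the `include_fields` flag, and the parameter names are illustrative assumptions, not existing code in the job.

```python
from typing import Callable, Dict, Iterable, Tuple


def evaluate_parameters(
    parameters: Iterable[Tuple[str, str]],
    evaluate: Callable[[str], float],
    include_fields: bool = True,
) -> Dict[str, float]:
    """Evaluate each (name, kind) parameter, optionally skipping FIELD ones."""
    results: Dict[str, float] = {}
    for name, kind in parameters:
        if kind == "FIELD" and not include_fields:
            # Skip the memory-heavy field parameters when the user opts out.
            continue
        results[name] = evaluate(name)
    return results


# Hypothetical parameter list; `len` stands in for a real evaluation function.
params = [("MULTFLT", "GEN_KW"), ("PORO", "FIELD")]
scalar_only = evaluate_parameters(params, len, include_fields=False)
```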

@oyvindeide
Contributor

Now that this has been tested for a while, are there any performance observations we should address, @mareco701?

@mareco701
Contributor

Hi, yes, the memory issue is quite a problem and makes the script fail in quite a few cases with Field parameters (for instance the Drogon synthetic case, where several field parameters are used, and also when a large grid file is used). As the script is written today, the field parameter grids for all the n*2+1 observations are stored in one dataframe.
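One possible way around this, sketched under the assumption that the analysis only needs a scalar summary per run rather than the full grids: load one field at a time, reduce it, and discard it before loading the next. The names `summarise_fields`, `load_field`, and `summarise` are hypothetical placeholders, and the mean is used only as a stand-in metric.

```python
from typing import Callable, Dict, Iterable, List


def summarise_fields(
    runs: Iterable[str],
    load_field: Callable[[str], List[float]],
    summarise: Callable[[List[float]], float],
) -> Dict[str, float]:
    """Load one field at a time and keep only a scalar summary per run."""
    summary: Dict[str, float] = {}
    for run in runs:
        # Only this run's grid values are held in memory at this point.
        values = load_field(run)
        summary[run] = summarise(values)
        # Drop the reference so the grid can be freed before the next load.
        del values
    return summary


# Fake in-memory "disk" standing in for exported field files.
fake_fields = {"run_0": [1.0, 2.0, 3.0], "run_1": [4.0, 6.0]}
result = summarise_fields(
    fake_fields, fake_fields.__getitem__, lambda v: sum(v) / len(v)
)
```

Peak memory then scales with a single grid instead of all n*2+1 of them, at the cost of not being able to revisit the raw values afterwards.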

@berland
Collaborator

berland commented Dec 13, 2023

@dafeda, do you have any input on whether there is a reason to keep this issue open?

@dafeda
Contributor

dafeda commented Dec 13, 2023

Performance of the AHM analysis was recently discussed, so this might still be relevant. Any thoughts, @oyvindeide?
