Problem in HPC+warning: chol(): given matrix is not symmetric #371

Closed
AbdollahiAz opened this issue Jan 21, 2024 · 5 comments

Comments

@AbdollahiAz

Hello,

I use the shapr package in my project and run it on an HPC cluster. With n_batches = 10, 100 GB of memory, and 1 node, I get this warning:
warning: chol(): given matrix is not symmetric

With n_batches = 10, 200 GB of memory, and 10 nodes, the following error occurred:

Error in unserialize(node$con) : 
  MultisessionFuture (future_lapply-2) failed to receive message results from cluster RichSOCKnode #2 (PID 61198 on localhost 'localhost'). The reason reported was 'error reading from connection'. Post-mortem diagnostic: No process exists with this PID, i.e. the localhost worker is no longer alive. Detected a non-exportable reference ('externalptr') in one of the globals ('future.call.arguments' of class 'DotDotDotList') used in the future expression. The total size of the 8 globals exported is 364.10 KiB. The three largest globals are 'future.call.arguments' (98.45 KiB of class 'list'), '...future.FUN' (87.62 KiB of class 'function') and 'compute_preds' (62.97 KiB of class 'function')
Calls: explain ... resolved -> resolved.ClusterFuture -> receiveMessageFromWorker
Execution halted

I have tried running my code both with and without the following lines, but the problem is not solved:

library(future)
future::plan(multisession, workers = 1)

Could you please let me know what is wrong with my code?

Thanks

@martinju
Member

Hard to say. If you set future::plan(sequential), parallelization is disabled, so at least the "Error in unserialize(node$con)" should not occur. The chol() warning suggests that some of your columns are (close to) linearly dependent on other columns. You can check this by testing whether some eigenvalues of the covariance matrix of your training data are negative or close to zero.
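
The eigenvalue check suggested in this comment can be sketched in base R as follows. This is only an illustration under assumptions: check_collinearity, its tol threshold, and the toy data are hypothetical and not part of shapr, and the check is applied to the sample covariance matrix of numeric features only.

```r
# Hypothetical helper (not part of shapr): inspect the eigenvalues of the
# sample covariance matrix of the numeric training features.
# The relative tolerance `tol` is an arbitrary illustrative choice.
check_collinearity <- function(x_train, tol = 1e-8) {
  x <- as.matrix(x_train)
  stopifnot(is.numeric(x))
  # The covariance matrix is symmetric, so request only real eigenvalues.
  ev <- eigen(cov(x), symmetric = TRUE, only.values = TRUE)$values
  list(
    eigenvalues = ev,
    min_eigenvalue = min(ev),
    # Flag the matrix when the smallest eigenvalue is negative or tiny
    # relative to the largest one.
    near_singular = min(ev) < tol * max(ev)
  )
}

# Toy data: x3 is an exact linear combination of x1 and x2.
set.seed(1)
x1 <- rnorm(100)
x2 <- rnorm(100)
toy <- data.frame(x1 = x1, x2 = x2, x3 = x1 + 2 * x2)

res <- check_collinearity(toy)
print(res$near_singular)
```

If near_singular is TRUE for the real training data, dropping or combining the dependent columns before calling explain() may be worth trying. Whether that removes the chol() warning in a given setup is not something this sketch can confirm.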

@AbdollahiAz
Author

Hi Martin,

1- I removed the "future" library and tried to run the code on the HPC cluster with 12 cores and 200 GB of memory, but it does not produce any output. Do you suggest using an HPC node with more than 200 GB of memory?
I think n_combinations has the larger effect on my run. I set it to 10000, following your suggestion.
2- Furthermore, I tried to solve the problem by following issue #226. Please see my comments in #226.
3- I also have another problem: my data contain no NA values, but I see the following note when running the code. Would it be possible to send you my data privately?

Note: Feature classes extracted from the model contains NA.
Assuming feature classes from the data are correct.

Thanks in advance

@martinju
Member

200 GB of RAM is more than enough. There is probably some other issue with your data. You can send them to me at [email protected], and I might be able to take a look at it some time next week. No promises, though.

@AbdollahiAz
Author

Dear @martinju,

I hope you are doing well.
Please check your inbox. I have sent you an email with the dataset attached.

Kind regards,
Az

@martinju
Member

I can't reproduce this with the latest version of shapr from GitHub. Everything works as intended. I have sent the results and script by personal email.
The note is nothing to worry about; it is not an error.
