[Bug]: #721

Open
awb99 opened this issue Nov 26, 2024 · 8 comments
Labels
bug Something isn't working triage

Comments


awb99 commented Nov 26, 2024

What version of Datahike are you using?

0.6.1594

What version of Java are you using?

openjdk 19.0.2 2023-01-17

What operating system are you using?

GUIX

What database EDN configuration are you using?

{:store {:backend :file
         :path    "data/datahike-db"}
 :keep-history?      false
 :schema-flexibility :write}

Describe the bug

This is what I did:

  1. Backed up the Datahike db datoms to CBOR, and the konserve db to EDN.
  2. Updated dependencies:
    io.replikativ/konserve {:mvn/version "0.7.311"} ==> io.replikativ/konserve {:mvn/version "0.7.319"}
    io.replikativ/datahike {:mvn/version "0.6.1542"} ==> io.replikativ/datahike {:mvn/version "0.6.1594"}
  3. Started the import of the CBOR datom dump, and then got this exception:

I guess this exception is good in a way, as it means that Datahike is much faster, and so I must be hitting some kind
of filesystem limit. I will investigate how to raise that limit.

2024-11-26T20:25:20.039Z nuc12 ERROR [datahike.writer:129] - Writer thread shutting down because of commit error. #error {
 :cause "data/datahike-db/16db52a2-633d-58bf-a451-acb5cca5d960.ksv.new: Too many open files"
 :via
 [{:type clojure.lang.ExceptionInfo
   :message "data/datahike-db/16db52a2-633d-58bf-a451-acb5cca5d960.ksv.new: Too many open files"
   :data {}
   :at [superv.async$throw_if_exception_ invokeStatic "async.cljc" 93]}
  {:type clojure.lang.ExceptionInfo
   :message "data/datahike-db/16db52a2-633d-58bf-a451-acb5cca5d960.ksv.new: Too many open files"
   :data {}
   :at [superv.async$throw_if_exception_ invokeStatic "async.cljc" 93]}
  {:type clojure.lang.ExceptionInfo
   :message "data/datahike-db/16db52a2-633d-58bf-a451-acb5cca5d960.ksv.new: Too many open files"
   :data {}
   :at [superv.async$throw_if_exception_ invokeStatic "async.cljc" 93]}
  {:type clojure.lang.ExceptionInfo
   :message "data/datahike-db/16db52a2-633d-58bf-a451-acb5cca5d960.ksv.new: Too many open files"
   :data {}
   :at [superv.async$throw_if_exception_ invokeStatic "async.cljc" 93]}
  {:type clojure.lang.ExceptionInfo
   :message "data/datahike-db/16db52a2-633d-58bf-a451-acb5cca5d960.ksv.new: Too many open files"
   :data {}
   :at [superv.async$throw_if_exception_ invokeStatic "async.cljc" 93]}
  {:type java.nio.file.FileSystemException
   :message "data/datahike-db/16db52a2-633d-58bf-a451-acb5cca5d960.ksv.new: Too many open files"
   :at [sun.nio.fs.UnixException translateToIOException "UnixException.java" 100]}]
 :trace
 [[sun.nio.fs.UnixException translateToIOException "UnixException.java" 100]
  [sun.nio.fs.UnixException rethrowAsIOException "UnixException.java" 106]
  [sun.nio.fs.UnixException rethrowAsIOException "UnixException.java" 111]
  [sun.nio.fs.UnixFileSystemProvider newAsynchronousFileChannel "UnixFileSystemProvider.java" 200]
  [java.nio.channels.AsynchronousFileChannel open "AsynchronousFileChannel.java" 259]
  [java.nio.channels.AsynchronousFileChannel open "AsynchronousFileChannel.java" 323]
  [konserve.filestore.BackingFilestore _create_blob "filestore.clj" 103]
  [konserve.impl.defaults$update_blob$fn__33460$state_machine__23898__auto____33503$fn__33506 invoke "defaults.cljc" 89]
  [konserve.impl.defaults$update_blob$fn__33460$state_machine__23898__auto____33503 invoke "defaults.cljc" 58]
  [clojure.core.async.impl.ioc_macros$run_state_machine invokeStatic "ioc_macros.clj" 972]
  [clojure.core.async.impl.ioc_macros$run_state_machine invoke "ioc_macros.clj" 971]
  [clojure.core.async.impl.ioc_macros$run_state_machine_wrapped invokeStatic "ioc_macros.clj" 976]
  [clojure.core.async.impl.ioc_macros$run_state_machine_wrapped invoke "ioc_macros.clj" 974]
  [konserve.impl.defaults$update_blob$fn__33460 invoke "defaults.cljc" 58]
  [clojure.lang.AFn run "AFn.java" 22]
  [java.util.concurrent.ThreadPoolExecutor runWorker "ThreadPoolExecutor.java" 1144]
  [java.util.concurrent.ThreadPoolExecutor$Worker run "ThreadPoolExecutor.java" 642]
  [clojure.core.async.impl.concurrent$counted_thread_factory$reify__18508$fn__18509 invoke "concurrent.clj" 29]
  [clojure.lang.AFn run "AFn.java" 22]
  [java.lang.Thread run "Thread.java" 1589]]}
2024-11-26T20:25:20.039Z nuc12 ERROR [modular.system:22] - Exception running run-fn:  #error {
 :cause "data/datahike-db/16db52a2-633d-58bf-a451-acb5cca5d960.ksv.new: Too many open files"
 :via
 [{:type clojure.lang.ExceptionInfo
   :message "clojure.lang.ExceptionInfo: data/datahike-db/16db52a2-633d-58bf-a451-acb5cca5d960.ksv.new: Too many open files {}"
   :data {}
   :at [superv.async$throw_if_exception_ invokeStatic "async.cljc" 93]}
  {:type java.util.concurrent.ExecutionException
   :message "clojure.lang.ExceptionInfo: data/datahike-db/16db52a2-633d-58bf-a451-acb5cca5d960.ksv.new: Too many open files {}"
   :at [java.util.concurrent.CompletableFuture reportGet "CompletableFuture.java" 396]}
  {:type clojure.lang.ExceptionInfo
   :message "data/datahike-db/16db52a2-633d-58bf-a451-acb5cca5d960.ksv.new: Too many open files"
   :data {}
   :at [superv.async$throw_if_exception_ invokeStatic "async.cljc" 93]}
  {:type clojure.lang.ExceptionInfo
   :message "data/datahike-db/16db52a2-633d-58bf-a451-acb5cca5d960.ksv.new: Too many open files"
   :data {}
   :at [superv.async$throw_if_exception_ invokeStatic "async.cljc" 93]}
  {:type clojure.lang.ExceptionInfo
   :message "data/datahike-db/16db52a2-633d-58bf-a451-acb5cca5d960.ksv.new: Too many open files"
   :data {}
   :at [superv.async$throw_if_exception_ invokeStatic "async.cljc" 93]}
  {:type clojure.lang.ExceptionInfo
   :message "data/datahike-db/16db52a2-633d-58bf-a451-acb5cca5d960.ksv.new: Too many open files"
   :data {}
   :at [superv.async$throw_if_exception_ invokeStatic "async.cljc" 93]}
  {:type clojure.lang.ExceptionInfo
   :message "data/datahike-db/16db52a2-633d-58bf-a451-acb5cca5d960.ksv.new: Too many open files"
   :data {}
   :at [superv.async$throw_if_exception_ invokeStatic "async.cljc" 93]}
  {:type java.nio.file.FileSystemException
   :message "data/datahike-db/16db52a2-633d-58bf-a451-acb5cca5d960.ksv.new: Too many open files"
   :at [sun.nio.fs.UnixException translateToIOException "UnixException.java" 100]}]
 :trace
 [[sun.nio.fs.UnixException translateToIOException "UnixException.java" 100]
  [sun.nio.fs.UnixException rethrowAsIOException "UnixException.java" 106]
  [sun.nio.fs.UnixException rethrowAsIOException "UnixException.java" 111]
  [sun.nio.fs.UnixFileSystemProvider newAsynchronousFileChannel "UnixFileSystemProvider.java" 200]
  [java.nio.channels.AsynchronousFileChannel open "AsynchronousFileChannel.java" 259]
  [java.nio.channels.AsynchronousFileChannel open "AsynchronousFileChannel.java" 323]
  [konserve.filestore.BackingFilestore _create_blob "filestore.clj" 103]
  [konserve.impl.defaults$update_blob$fn__33460$state_machine__23898__auto____33503$fn__33506 invoke "defaults.cljc" 89]
  [konserve.impl.defaults$update_blob$fn__33460$state_machine__23898__auto____33503 invoke "defaults.cljc" 58]
  [clojure.core.async.impl.ioc_macros$run_state_machine invokeStatic "ioc_macros.clj" 972]
  [clojure.core.async.impl.ioc_macros$run_state_machine invoke "ioc_macros.clj" 971]
  [clojure.core.async.impl.ioc_macros$run_state_machine_wrapped invokeStatic "ioc_macros.clj" 976]
  [clojure.core.async.impl.ioc_macros$run_state_machine_wrapped invoke "ioc_macros.clj" 974]
  [konserve.impl.defaults$update_blob$fn__33460 invoke "defaults.cljc" 58]
  [clojure.lang.AFn run "AFn.java" 22]
  [java.util.concurrent.ThreadPoolExecutor runWorker "ThreadPoolExecutor.java" 1144]
  [java.util.concurrent.ThreadPoolExecutor$Worker run "ThreadPoolExecutor.java" 642]
  [clojure.core.async.impl.concurrent$counted_thread_factory$reify__18508$fn__18509 invoke "concurrent.clj" 29]
  [clojure.lang.AFn run "AFn.java" 22]
  [java.lang.Thread run "Thread.java" 1589]]}

What is the expected behaviour?

No crash.

How can the behaviour be reproduced?

I can share a private repo with the dataset that produces the error.

@awb99 awb99 added bug Something isn't working triage labels Nov 26, 2024

awb99 commented Nov 26, 2024

ulimit -n
returns 1024.


awb99 commented Nov 26, 2024

Testing whether ulimit -n 2048 fixes it.

ulimit -Hn
returns 4096.

Running with ulimit -n 4096 does not fix the issue.
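For anyone hitting the same wall, it can help to confirm whether the importing JVM is really exhausting descriptors rather than the limit simply being too low; this is a generic diagnostic sketch (the pgrep pattern "datahike" is illustrative, and lsof availability is assumed):

```shell
# Current soft and hard per-process file-descriptor limits
ulimit -n
ulimit -Hn

# Count files held open by a running JVM, if one is found
pid=$(pgrep -f datahike | head -n1)
if [ -n "$pid" ]; then
  lsof -p "$pid" | wc -l
fi
```

If the count climbs steadily toward the soft limit during the import, raising ulimit only delays the crash; the import itself has to bound how many files each commit touches.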


awb99 commented Nov 26, 2024

I am now at the hard limit of open files for my user, which I cannot easily change.
I am pretty sure this is an issue on Datahike's side.


whilo commented Nov 27, 2024

Hey @awb99. How big is your database? Probably the transact calls in migrate should be chunked.


awb99 commented Nov 27, 2024

The eavt dump is 12.8 MB. The import worked fine on the old Datahike version.
I will refactor to import in chunks. How many datoms do you think is a reasonable
chunk size? 1000? 10000? 100000? Thanks!


whilo commented Nov 28, 2024

The branching factor is 512, which should yield approximately one file per index, so with history 6 files, without history 3. (6/512)*100000 = 1171.875, so a single 100000-datom transaction would be too much for a ulimit of 1024. I think 10k should be safe and fast enough.
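The estimate above can be checked directly; the 512 branching factor and the 6-indexes-with-history figure are taken from this comment, and `files-touched` is just an illustrative helper, not a Datahike function:

```clojure
;; Rough files touched per transact call:
;; (index count * datoms) / branching factor
(defn files-touched [indexes datoms]
  (/ (* indexes datoms) 512.0))

(files-touched 6 100000) ;; 1171.875, over a soft ulimit of 1024
(files-touched 6 10000)  ;; 117.1875, well under it
```

This is why 100k-datom transactions crash under the default 1024-descriptor limit while 10k batches stay comfortably inside it.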


awb99 commented Dec 2, 2024

I got it to work now! Batching solved it! Thanks @whilo

BTW: Datahike's migrate prints that it does batches, but in reality it does not.

When importing from CBOR, is it still necessary to set :max-eid and :max-tx?
I did some logging, and it seems that the tx reports I get back update these automatically.
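A minimal sketch of the kind of batching that solved it, assuming the datoms have already been deserialized from the CBOR dump; the `batches` helper is pure Clojure, while the commented-out call shows where datahike.api/transact would consume each chunk (conn and datoms stand in for the application's own values):

```clojure
(defn batches
  "Splits an import into fixed-size chunks so that each transact
  call only touches a bounded number of index files on disk."
  [batch-size datoms]
  (partition-all batch-size datoms))

;; During the import, each chunk becomes one transaction:
;; (doseq [b (batches 10000 datoms)]
;;   (datahike.api/transact conn {:tx-data (vec b)}))
```

With the 10k default suggested above, a 100k-datom dump becomes ten transactions instead of one, keeping the per-commit file count far below the descriptor limit.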


whilo commented Dec 2, 2024

Great! I am not sure about :max-eid and :max-tx. @yflim do you remember why you put it there? In case you have a modified import function, feel free to open a PR; I think we can stick to 10k batching as a default for now.
