I have a library that formats scientific data into a JSON schema called the Allotrope Standard Model (ASM).
The validation schemas are fairly large and complicated compared to other schemas I've seen in discussion boards, and are very modular, meaning there are a lot of references. In allotropy we store the ASM schemas directly, and remove all remote references, replacing them with local references under $defs.
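For readers unfamiliar with this kind of bundling, the following is a minimal sketch of replacing remote references with local ones under `$defs`. It is not allotropy's actual implementation: the function names `localize_refs` and `bundle`, the generated `remote_N` keys, and the example URI are all hypothetical. It also assumes each remote `$ref` is a whole-document URI with no fragment, and that the remote schemas contain no `#/...` references of their own.

```python
from typing import Any


def localize_refs(node: Any, names: dict[str, str]) -> Any:
    """Return a copy of `node` with known remote $ref URIs rewritten to #/$defs/<name>."""
    if isinstance(node, dict):
        out = {}
        for key, value in node.items():
            if key == "$ref" and isinstance(value, str) and value in names:
                # Point at the local copy instead of the remote document.
                out[key] = f"#/$defs/{names[value]}"
            else:
                out[key] = localize_refs(value, names)
        return out
    if isinstance(node, list):
        return [localize_refs(item, names) for item in node]
    return node


def bundle(root: dict, remotes: dict[str, dict]) -> dict:
    """Embed every schema in `remotes` (URI -> schema) under the root's $defs."""
    # Hypothetical naming scheme: remote_0, remote_1, ... in insertion order.
    names = {uri: f"remote_{i}" for i, uri in enumerate(remotes)}
    bundled = localize_refs(root, names)
    defs = dict(bundled.get("$defs", {}))
    for uri, schema in remotes.items():
        defs[names[uri]] = localize_refs(schema, names)
    bundled["$defs"] = defs
    return bundled


root = {
    "type": "object",
    "properties": {"unit": {"$ref": "http://example.com/unit.json"}},
}
remotes = {"http://example.com/unit.json": {"type": "string"}}
bundled = bundle(root, remotes)
```

After bundling, the validator never needs to fetch a remote document, so any remaining slowdown would come from how local references are resolved.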
We are finding that validating against the schemas using jsonschema version 4.18.0 takes ~20x longer than 4.17.0.
Hey there, I'm happy to have a look at this at some point, but is there a reason you're benchmarking against such an old version? Lots has changed since 4.18, so it'd be good if you shared numbers which were on 4.23.
Sorry, I didn't mention that I tested on every version between 4.18 and 4.23 to see if any had better performance. None of the versions past 4.18 improve the performance noticeably.
We have also experienced a similar performance issue in one of our tools after switching from RefResolver to this library. This is the commit in our library: PolusAI/sophios#287
Hello!

I have a library that formats scientific data into a JSON schema called the Allotrope Standard Model (ASM).

The validation schemas are fairly large and complicated compared to other schemas I've seen in discussion boards, and are very modular, meaning there are a lot of references. In allotropy we store the ASM schemas directly, and remove all remote references, replacing them with local references under `$defs`.

We are finding that validating against the schemas using `jsonschema` version `4.18.0` takes ~20x longer than `4.17.0`.

As a concrete example:

Validating this data: https://raw.githubusercontent.com/Benchling-Open-Source/allotropy/refs/heads/main/tests/parsers/moldev_softmax_pro/testdata/MD_SMP_luminescence_endpoint_example08.json

Against this schema: https://github.com/Benchling-Open-Source/allotropy/blob/main/src/allotropy/allotrope/schemas/adm/plate-reader/REC/2024/06/plate-reader.schema.json

takes `~3.5s` on `4.17.0` and `~55s` on `4.18.0`.

This translates to a runtime for all 26 tests in `tests/parsers/moldev_softmax_pro` of `~30s` in `4.17.0` to `~6m` in `4.18.0`.
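A comparison like the one above could be reproduced with a small timing helper, run once in an environment with each `jsonschema` version installed. This is only a sketch: `time_validation` is a hypothetical helper, not part of `jsonschema` or allotropy, and it takes the validation callable as a parameter so the snippet itself depends only on the standard library.

```python
import time
from typing import Any, Callable


def time_validation(
    validate: Callable[[Any, Any], None],
    instance: Any,
    schema: Any,
    repeats: int = 3,
) -> float:
    """Return the best wall-clock time in seconds over `repeats` validation calls."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        validate(instance, schema)
        # Keep the fastest run to reduce noise from other processes.
        best = min(best, time.perf_counter() - start)
    return best
```

With `jsonschema` installed, passing `jsonschema.validate` as `validate`, together with the data and schema loaded via `json.load`, should give a number comparable to the timings reported here.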