[pull] master from CogStack:master #93

Open
wants to merge 1,112 commits into
base: master

Conversation

pull[bot]

@pull pull bot commented Apr 9, 2021

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.1)

Can you help keep this open source service alive? 💖 Please sponsor : )

@pull pull bot added ⤵️ pull merge-conflict Resolve conflicts manually labels Apr 9, 2021
antsh3k and others added 27 commits February 1, 2023 16:36
snomed refset processing update
Bumps [django](https://github.com/django/django) from 3.2.16 to 3.2.17.
- [Release notes](https://github.com/django/django/releases)
- [Commits](django/django@3.2.16...3.2.17)

---
updated-dependencies:
- dependency-name: django
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
…ango-3.2.17

Bump django from 3.2.16 to 3.2.17 in /webapp/webapp
* CU-33g0f3w Pin aiohttp dependency version for docs

* CU-33g0f3w Pin aiohttp dependency version for docs (#303)

* CU-33g0f3w Pin aiohttp dependency version for docs in setup.py

* Read the docs build failures (#304)

* CU-33g0f3w Pin aiohttp dependency version for docs

* CU-33g0f3w Pin aiohttp dependency version for docs in setup.py

* CU-33g0f3w Pin blis dependency version for docs in setup.py
* CU-8677aud63 add options for loading meta models and addl NERs
* CU-8677aud63 reduce memory usage during test
…s into categories (#301)

* CU-862j5by9q Add metadata to regression suite, loaded from model card if/when specified. A model can be specified upon creation to get the model card from.

* CU-862j5by9q Remove f-string from string with no placeholders

* CU-862j5by9q Make regression case hashable

* CU-862j5by9q Add category separation to regression test suite along with automated tests and test example

* CU-862j5by9q Add missing docstrings to category separation

* CU-862j5by9q Add saving to category separator and a convenience method for separation based on regression test YAML file and categories YAML file

* CU-862j5by9q Add missing docstrings to new methods

* CU-862j5by9q Fix typo in class name

* CU-862j5by9q Fix saving issue for separation results

* CU-862j5by9q Add runnable category separator

* CU-862j5by9q Separate some file location constants in separation tests

* CU-862j5by9q Add test for separation that checks that no information gets lost (in the specific situation)

* CU-862j5by9q Add an anything-goes category description

* CU-862j5by9q Fix anything-goes option

* CU-862j5by9q Add tests for anything-goes category description

* CU-862j5by9q Add possibility of using an overflow category when separating regression suite

* CU-862j5by9q Add use of the overflow category to the runnable

* CU-862j5by9q Fix linting and typing issues

* CU-862j5by9q Add test for each individual separated suite

* CU-862j5by9q Fix minor abstract class issues

* CU-862j5by9q Rename categoryseparation module as category_separation

* CU-862j5by9q Add docstrings to category_separator
Bumps [django](https://github.com/django/django) from 3.2.17 to 3.2.18.
- [Release notes](https://github.com/django/django/releases)
- [Commits](django/django@3.2.17...3.2.18)

---
updated-dependencies:
- dependency-name: django
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
…ango-3.2.18

Bump django from 3.2.17 to 3.2.18 in /webapp/webapp
Add confusion matrix to meta model evaluation
* CU-862j7b9jc Add abstract base class to regression converting strategy where necessary

* CU-862j7b9jc Bump mypy to version 1.0.0
* CU-862j7b9jc Fix issue with duplicate imports

* CU-862j7b9jc Fix issue with no whitespace after keyword (E275)

* CU-862j7b9jc Remove unnecessary brackets from if statement
Make transformer_ner continue processing other entities after the first non-matching
* Expose example model card version in metadata test

* Add version detection along with tests

* Move to a more comprehensive version string parser (regex)

* Add more comprehensive versioning tests

* Move MedCAT unzip to a separate method

* Separate getting semantic version from string

* Add new CDB with version information and use that with versioning tests

* Add methods to get version info from CDB dump and model pack zip/folder

* Exposing CDB file name and adding custom dev patch version support

* Fix config.linking.filters.cuis - from empty dict to empty set

* Add logging to versioning

* Fix f-strings instead of (intended) r-strings

* Add creating model pack archive to versioning CDB fix

* Fix logger initialising

* Making versioning a runnable module that allows fixing the config

* Add docstrings to CLI methods

* CU-8677ge6j8 Make explicit check regards to empty dict when fixing config

* CU-8677ge6j8 Add tests regarding versioning changes

* CU-8677ge6j8 Add missing return type hint

* CU-8677ge6j8 Simplify action handling for CLI input

* CU-8677ge6j8 Simplifying archive making method
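The regex-based version parsing described in the commits above (including support for a custom dev patch version) might look roughly like the following. This is a minimal sketch only: the pattern, the `parse_semantic_version` name, and the `.devN` suffix format are illustrative assumptions, not the code that was merged.

```python
import re
from typing import Tuple

# Hypothetical pattern: MAJOR.MINOR.PATCH with an optional ".devN" suffix.
_VERSION_RE = re.compile(r"(\d+)\.(\d+)\.(\d+)(?:\.dev\d+)?")


def parse_semantic_version(text: str) -> Tuple[int, int, int]:
    """Return (major, minor, patch) for the first version found in text."""
    match = _VERSION_RE.search(text)
    if match is None:
        raise ValueError(f"No semantic version found in {text!r}")
    # Only the three numeric groups are captured; the dev suffix is ignored.
    major, minor, patch = (int(part) for part in match.groups())
    return major, minor, patch
```

Searching rather than matching the whole string lets the same helper read a version out of a longer line, such as one taken from a model card.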
* NO-TICKET pin down transformers for the de-id model
* Added function to remove CUI from cdb

* Unit test for remove_cui
* Added function to remove CUI from cdb

---------

Co-authored-by: antsh3k <[email protected]>
Bumps [django](https://github.com/django/django) from 3.2.18 to 3.2.19.
- [Commits](django/django@3.2.18...3.2.19)

---
updated-dependencies:
- dependency-name: django
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
…ango-3.2.19

Bump django from 3.2.18 to 3.2.19 in /webapp/webapp
CU-862jr8wkk Pin pydantic dependency to avoid conflicts with v2.0 (#318)
* CU-863gntc58 Add parent to child relationship getter to UMLS preprocessing

* CU-863gntc58 Only use ISA relationships

* Make sure parents do not have themselves as children

* CU-863gntc58 Only keep preferred names

* CU-863gntc58 Fix typing issues

* CU-863gntc58 Fix child-parent relationships being saved instea

* Better system for avoiding parent-child being the same
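The parent-to-child extraction described in the commits above (ISA relationships only, and no concept listed as its own child) can be sketched as follows. The function name, the `(child, relation, parent)` tuple layout, and the lowercase `"isa"` label are assumptions made for illustration; the real UMLS preprocessing code may be organised differently.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def build_parent_to_children(
    relations: Iterable[Tuple[str, str, str]],
) -> Dict[str, Set[str]]:
    """Map each parent CUI to its child CUIs.

    Each relation is a hypothetical (child_cui, relation_type, parent_cui) tuple.
    """
    parent_to_children: Dict[str, Set[str]] = defaultdict(set)
    for child, relation_type, parent in relations:
        if relation_type != "isa":
            continue  # only ISA relationships are kept
        if child == parent:
            continue  # a parent must not be its own child
        parent_to_children[parent].add(child)
    return dict(parent_to_children)
```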
* Issue-325 Add check for old/new spacy; fix code for nested entities

* Issue-325 Fix a typing issue

* Issue-325 Improve nested entity extraction in _doc_to_out; add type hint for individual entities

* Issue-325 Remove unnecessary whitespace

* Issue-325 Move spacy version detection from cat to utils.helpers
* CU-2wgnqg5 Add javadoc to a method

* CU-2wgnqg5 Fix issues with typing

* CU-2wgnqg5 Add (potential) progress bar to regression testing

* CU-2wgnqg5 Add runnable regression checker with command line arguments

* CU-2wgnqg5 Add better help message for a CLI argument

* CU-2wgnqg5 Fix import to use proper namespace

* CU-2wgnqg5 Add parent-child functionality for filters

* CU-2wgnqg5 Add cui and children option to the config example

* Revert "CU-2wgnqg5 Fix import to use proper namespace"

This reverts commit 882be44.

* CU-2wgnqg5 Add default / empty children to translation layer

* CU-2wgnqg5 Remove use of deprecated warning method

* CU-2wgnqg5 Add new default test case that checks for 'heart rate' and its children 4 deep

* CU-2wgnqg5 Remove unnecessary TODO comment

* CU-2wgnqg5 Add possibility of using result reporting for regression checks

* CU-2wgnqg5 Fix issue with delegations not shown for reports

* CU-2wgnqg5 Add possibility of using reports for CLI regression testing

* CU-2wgnqg5 Fix minor typing issues

* CU-2wgnqg5 Fix typo in default regression config

* CU-2wgnqg5 Make sure imports work both when running directly as well as when using as part of the project

* CU-2wgnqg5 Add a new test case with the ANY strategy

* CU-2wgnqg5 Fixing imports so that absolute imports are used

* CU-2wgnqg5 Add new package to setup.py

* CU-2wgnqg5 Fix typing issues

* CU-2wgnqg5 Fix report output formatting

* CU-2vzhd93 Remove logging tutorials (move to MedCATtutorials)

* CU-2wgnqg5 Move to a simpler filter design

* CU-2wgnqg5 Add (optional) per-phrase results to results/reporting

* CU-2wgnqg5 Add per-phrase information toggle to CLI

* CU-2wgnqg5 Fix method signature changes between inherited classes

* CU-2q50k3c: add contact email address.

* added latest release news / accepted paper

* Update README.md

* CU-2zj4czk Move to a class based linking filter approach

* CU-2zj4czk Move to identifier based linking filter access

* CU-2zj4czk Use MCT filters when training supervised

* New UMLS Full Model

* CU-2zj4czk Make sure excluded CUIs are always specified (even if by an empty set)

* CU-2zj4czk Add possibility of creating a copy of linking filters

* CU-2zj4czk Use copies of linking.filters in train_supervised and _print_stats

* CU-2zj4czk Add linking.filters merging functionality

* CU-2zj4czk Add parameter to retain MCT filters within train_supervised

* CU-2zj4czk Rename filters variable within print_stats method for better consistency and readability

* CU-2zj4czk Consolidate some duplicate code between train_supervised and _print_stats

* CU-2zj4czk Fix multi-project detection

* CU-2zj4czk Fix linking filter merging

* CU-2zj4czk Add tests for retaining filters from MCT along with a test-trainer export

* CU-2zj4czk Remove debug print outputs from some tests

* CU-2wgnqg5 Separate some of the regression code into different modules

* Add URL of paper for Dutch model (#275)

* CU-2wgnqg5 Add serialisation code along with tests

* CU-2wgnqg5 Fix regression checker and case serialisation and add tests

* CU-2wgnqg5 Add conversion code from MCT export to regression YAML along with tests

* CU-2wgnqg5 Fix minor import and typing issues

* CU-2wgnqg5 Add runnable to convert from MedCATtrainer to regression YAML

* CU-2wgnqg5 Add for number of cases read from MCT export

* CU-2wgnqg5 Add context selectors for conversion from MCT

* CU-2wgnqg5 Add use of context selector to converter

* CU-2wgnqg5 Add use of context selector to runnable

* CU-2wgnqg5 Fix issue with typing

* CU-2wgnqg5 Add regression case based progress bar in case the total of sub-cases is unknown

* CU-2wgnqg5 Make sure (and test) that only 1 replacement '%s' is in each phrase for regression tests

* CU-2wgnqg5 Add test cases for '%' replacement in context and some minor optimisation

* CU-2wgnqg5 Add option to not show empty cases in report

* CU-2wgnqg5 Fix verbose output mode/logging

* CU-2wgnqg5 Fix name clashes in test cases

* CU-2wgnqg5 Make conversion filter for both CUI and NAME

* CU-2wgnqg5 Use different approach for generating targets for regression cases

* CU-2wgnqg5 Add warning when no parent-child information is present (but continue to run)

* Fix issue with typing

* Add TODO comment regarding more comprehensive reporting

* Fix whitespace issue

* CU-2wgnqg5 Translation layer now able to confirm if a set of CUIs has a parent or child of a specified one

* CU-2wgnqg5 Add reasons for failure of a regression case

* CU-2wgnqg5 Make hiding failures a possibility from the CLI

* CU-2wgnqg5 Use better report output for failures with summary

* CU-2wgnqg5 Fix typing issues

* CU-2wgnqg5 Add description to failed cases where applicable

* CU-2wgnqg5 Fix successes not being reported on

* CU-2wgnqg5 Rename some fail reasons for better readability

* CU-2wgnqg5 Add test cases for specified CUI and name if/when none are found from the CDB

* CU-2wgnqg5 Add extra information (names) in case of failure because name not in CDB

* CU-2wgnqg5 Make converter consolidate different test cases with identical filters (CUI and name) into one with multiple phrases

* CU-2wgnqg5 Remove use of TargetInfo and using a tuple instead

* CU-2wgnqg5 Fix remnant targetinfo

* CU-2wgnqg5 Fix remnant targetinfo stuff

* CU-2wgnqg5 Fix remnant targetinfo in docstrings

* CU-2wgnqg5 Fix missing argument in docstrings

* CU-2wgnqg5 Allow only reports in regression checker

* CU-2wgnqg5 Add medcat.utils.regression level parent logger

* CU-2wgnqg5 Use medcat.utils.regression parent logger for verbose output in regression checker

* CU-2wgnqg5 Move from logger.warn to logger.warning

* CU-2wgnqg5 Fix issue with wrong targets being generated

* CU-2wgnqg5 Fix checking tests

* CU-2wgnqg5 Add dunder init to test (utils) packages to make the tests within discoverable

* CU-2wgnqg5 Fix serialisation tests (add missing argument)

* CU-2wgnqg5 Fix regression results tests (change method owner)

* CU-2wgnqg5 Fix regression results tests (make names ordered)

* CU-2wgnqg5 Remove unnecessary print output in test

* CU-2wgnqg5 Update conversion code to not use target info

* CU-2wgnqg5 Attempt to fix automated build on github actions (pin sklearn version)

* CU-2wgnqg5 Move from sklearn to scikit-learn dependency

* CU-2wgnqg5 Separate some code in converting, add docs

* CU-2wgnqg5 Make yaml dumping safe for yaml representation of regression checker

* CU-2wgnqg5 Add initial editing code with some simple tests

* CU-2wgnqg5 Add possibility for combinations to ignore identicals

* CU-2wgnqg5 Add docs to the editing/combining methods

* CU-2wgnqg5 Add runnable python file for combining different regression YAMLs

* CU-2wgnqg5 Minor codebase improvements

* CU-2wgnqg5 Make FailReasons serializable

* CU-2wgnqg5 Add json output to regression checking

* Make stats reporting not have np.nan values on empty train count  (#277)

* CU-327vb66 make stats reporting not have np.nan values on empty train count
* CU-327vb66 start using scikit-learn instead of deprecated sklearn

* Bump django from 3.2.15 to 3.2.16 in /webapp/webapp

Bumps [django](https://github.com/django/django) from 3.2.15 to 3.2.16.
- [Release notes](https://github.com/django/django/releases)
- [Commits](django/django@3.2.15...3.2.16)

---
updated-dependencies:
- dependency-name: django
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>

* Update ReadMe.md to show Licence change

Updated News Section

* CU-2wgnqg5 Add docstring to fail descriptor getter method

* CU-2wgnqg5 Removed handled TODO

* CU-33g09h4 Make strides towards PEP 257. Make all docstrings use triple double quotes; remove preceding whitespace from docstrings; remove raw-string docstrings where applicable; remove empty docstrings

* CU-2zj4czk Add documentation regarding config.linking.filters

* CU-2zj4czk Add test for leakage of extra_cui_filters

* CU-33g09h4 Remove leftover whitespace from start of docstring

* include joblib dep

* CU-2zj4czk Add parameter to retain extra_cui_filters (instead of MCT filters). Make sure tests pass.

* CU-33g09h4 Some docstring unification for config(s)

* CU-33g09h4 Some docstring unification for pipe, meta_cat and vocab

* CU-33g09h4 Some docstring unification for cdb

* CU-33g09h4 Some docstring unification for cdb maker

* CU-33g09h4 Some docstring unification for cdb and maker (Return: to Returns:)

* CU-33g09h4 Some docstring unification for cat

* CU-33g09h4 Fix typo in docstring

* CU-33g09h4 Some docstring unification for utils

* CU-33g09h4 Some docstring unification for tokenizers

* CU-33g09h4 Some docstring unification for preprocessors

* CU-33g09h4 Some docstring unification for NER parts

* CU-33g09h4 Some docstring unification for NEO parts

* CU-33g09h4 Some docstring unification for linking parts

* CU-33g09h4 Some docstring unification for cogstack connection part

* CU-33g09h4 Remove some leftover backticks from docstring types

* CU-33g09h4 Remove some leftover 'Return:' -> 'Returns:' changes

* CU-33g09h4 Fix typo in a return type name

* CU-384mewq match post release branches in the production workflow (#283)

* CU-346mpxm Add new JSON based (faster) serialization for CDB along with tests

* CU-346mpxm Add new package to setup.py; add logger and docstrings to serializer; remove dead code and comments

* CU-346mpxm Remove leftover code; Fix type safety regarding optional json path

* CU-346mpxm Add logging on writing to serializer

* CU-346mpxm Add logging on reading to serializer

* CU-346mpxm Make deserializing consistent with previous CDB deserialising

* CU-346mpxm Add JSON serialisation to CDB

* CU-346mpxm Remove issue with circular imports

* CU-346mpxm Make sure json files end with .json

* CU-346mpxm Add json type format to modelpack creation

* CU-346mpxm Add tests for json format modelpack creation

* CU-346mpxm Add logging output to model pack creation and loading

* CU-346mpxm Add model pack converter / runnable

* Update README.md

* CU-862hyd5wx Unify rosalind/vocab downloading in tests, identify and fail meaningfully in case of 503

* CU-862hyd5wx Remove unused imports in tests due to last commit

* CU-862hyd5wx Add possibility of generating and using a simple vocab when Rosalind is down

* CU-862hyd5wx Fix small typo in tests

* Loosen dependency restrictions (#289)

Signed-off-by: zethson <[email protected]>

Signed-off-by: zethson <[email protected]>

* bug found in snomed2OPCS func

* markdown improvements

* Mapping icd10 and opcs complete

* get all children func added

* pep8 fixes

* Update README.md

* Add confusion matrix to meta model evaluation

* CU-862j0jcdu / CU-862j0jd2n Cdb json (#295)

* CU-862j0jcdu Rename format parameter in model creation to specify it only applies to the CDB

* CU-862j0jd2n Add addl_info to be JSON serialised when required

* CU-862j0jd2n Add addl_info to docstring of CDB serializer

* CU-38g55wn / CU-39cmv82 Support for python3.11 (and 3.10) (#285)

* CU-38g55wn Move dependencies to (hopefully) support python 3.11 on Ubuntu

* CU-38g55wn Attempt to fix dependencies for github dependency (gensim)

* CU-38g55wn Attempt to fix dependencies for github dependency (gensim) x2

* CU-38g55wn Attempt to fix dependencies for github dependency (gensim) x3

* CU-38g55wn Attempt to fix dependencies for github dependency (gensim) x4

* CU-38g55wn Attempt to fix dependencies for github dependency (gensim) x5 - fix missing comma

* CU-38g55wn Remove erroneous package from setup.py

* CU-38g55wn Bump spacy version so as to (hopefully) fix pydantic issues

* CU-38g55wn Bump spacy en_core_web_md version so as to (hopefully) fix requirements issues

* CU-38g55wn Fix test typo that was fixed on newer en_core_web_md

* CU-38g55wn Fix small issue in NER test

* CU-38g55wn Fix small issue with NER test (int conversion)

* CU-38g55wn Mark some places as ignore where newer mypy complains

* CU-38g55wn Bump mypy dev requirement version

* CU-38g55wn Add python 3.11 and 3.10 to workflow

* CU-38g55wn Trying to install gensim over https rather than ssh

* CU-38g55wn Make python versions strings in GH workflow so 3.10 doesn't get 'rounded' to 3.1 when read

* CU-38g55wn Remove python 3.7 from workflow since it's not compatible with required versions of numpy and scipy

* CU-38g55wn Universally fixing NER test regarding the 'movar~viruse' -> 'movar~virus' thing

* CU-38g55wn Bump gensim version to 4.3.0 - the first to support 3.11

* CU-862hyd5wx Unify rosalind/vocab downloading in tests, identify and fail meaningfully in case of 503

* CU-862hyd5wx Remove unused imports in tests due to last commit

* CU-862hyd5wx Add possibility of generating and using a simple vocab when Rosalind is down

* CU-862hyd5wx Remove python 3.7 and add 3.10/3.11 to classifiers

* CU-862hyd5wx Reorder python versions in GitHub workflow

* CU-862hyd5wx Attempt to fix GHA by importing unittest.mock explicitly
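The workflow change in this group that turns the Python versions into strings guards against a number-parsing pitfall: an unquoted 3.10 is read as a floating-point number, and its trailing zero is lost. The snippet below reproduces the effect with plain Python floats as a stand-in for the workflow file; the variable names are illustrative.

```python
# Versions written as bare numbers: the literal 3.10 is the float 3.1.
versions_as_numbers = [3.8, 3.9, 3.10, 3.11]

# Versions written as quoted strings keep every character.
versions_as_strings = ["3.8", "3.9", "3.10", "3.11"]

# Turning the numbers back into labels shows the lost trailing zero.
labels_from_numbers = [str(version) for version in versions_as_numbers]
print(labels_from_numbers)
print(versions_as_strings)
```

With the numeric form, the third entry comes back as "3.1", so a build matrix would ask for Python 3.1 instead of 3.10.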

* CU-39cmvru Faster hashing (#286)

* CU-39cmvru Add marking of CDB dirty if/when concepts change. Avoid calculating its hash separately if it hasn't been dirtied. Add tests to verify behaviour.

* CU-39cmvru Add possibility to force recalculation of hash for CDB (including when getting hash for CAT)

* CU-39cmvru Add possibility to force recalculation of hash for CDB through modelcat creation (new parameter, propagating through _versioning)

* CU-39cmvru Remove previous hash from influencing hashing of CDB to produce consistent hash on every recalculation. Add tests to make sure that is the case on the CDB level as well as the CAT/modelpack level.

* CU-39cmvru Add logging around the (re)calculation of the CDB hash

* CU-39cmvru Fix typo in log message

* CU-39cmvru Add test to make sure the CDB hash is saved to disk and loaded from disk

* CU-39cmvru Add possibility to calculate hash upon saving of CDB if/when the hash is unknown (i.e when saving outside a model pack)

* CU-39cmvru Add CDB dirty flag to all other methods that modify the CDB
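The faster-hashing commits above describe a dirty-flag pattern: mutations mark the CDB dirty, the hash is recomputed only when the CDB is dirty or recalculation is forced, and the previous hash does not feed into the new one. A minimal sketch of that pattern follows. The `HashedStore` class, its attribute names, and the use of SHA-256 are assumptions for illustration and do not mirror the actual CDB implementation.

```python
import hashlib
from typing import Dict, Optional


class HashedStore:
    """Toy stand-in for a concept database with a cached content hash."""

    def __init__(self) -> None:
        self._concepts: Dict[str, str] = {}
        self._hash: Optional[str] = None
        self.is_dirty = True  # nothing has been hashed yet

    def add_concept(self, cui: str, name: str) -> None:
        self._concepts[cui] = name
        self.is_dirty = True  # every mutation invalidates the cached hash

    def get_hash(self, force_recalc: bool = False) -> str:
        if self._hash is not None and not self.is_dirty and not force_recalc:
            return self._hash  # cheap path: reuse the cached value
        hasher = hashlib.sha256()
        # Hash only the content (never the old hash) in a stable order,
        # so recalculating without changes always gives the same result.
        for cui in sorted(self._concepts):
            hasher.update(f"{cui}\t{self._concepts[cui]}\n".encode("utf-8"))
        self._hash = hasher.hexdigest()
        self.is_dirty = False
        return self._hash
```

Keeping the old hash out of the hashed content is what makes a forced recalculation return the same value when nothing has changed.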

* Change confusion matrix to DF and add labels

* Fix model config

* CU-86777ey74 No elastic dependency (#298)

* Removed elastic dependency

* CU-86777ey74 Remove module that depends on elastic (cogstack/cogstack_conn)

* CU-86777ey74 Remove medcat.cogstack package from setup.py packages

* Docstring updated to google-style docstring

* CU-2e77a2k Remove unused utility modules

* CU-2e77a2k Remove deprecated utils

* Bump django from 3.2.16 to 3.2.17 in /webapp/webapp

Bumps [django](https://github.com/django/django) from 3.2.16 to 3.2.17.
- [Release notes](https://github.com/django/django/releases)
- [Commits](django/django@3.2.16...3.2.17)

---
updated-dependencies:
- dependency-name: django
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>

* CU-33g0f3w Read the docs build failures (#306)

* CU-33g0f3w Pin aiohttp dependency version for docs

* CU-33g0f3w Pin aiohttp dependency version for docs (#303)

* CU-33g0f3w Pin aiohttp dependency version for docs in setup.py

* Read the docs build failures (#304)

* CU-33g0f3w Pin aiohttp dependency version for docs

* CU-33g0f3w Pin aiohttp dependency version for docs in setup.py

* CU-33g0f3w Pin blis dependency version for docs in setup.py

* Add options for loading meta models and additional NERs (#300)

* CU-8677aud63 add options for loading meta models and addl NERs
* CU-8677aud63 reduce memory usage during test

* Style fix

* NO-TICKET reduce the false positives on pushing to test pypi (#307)

* CU-862j5by9q Regression touchup - metadata and ability to split suites into categories (#301)

* CU-862j5by9q Add metadata to regression suite, loaded from model card if/when specified. A model can be specified upon creation to get the model card from.

* CU-862j5by9q Remove f-string from string with no placeholders

* CU-862j5by9q Make regression case hashable

* CU-862j5by9q Add category separation to regression test suite along with automated tests and test example

* CU-862j5by9q Add missing docstrings to category separation

* CU-862j5by9q Add saving to category separator and a convenience method for separation based on regression test YAML file and categories YAML file

* CU-862j5by9q Add missing docstrings to new methods

* CU-862j5by9q Fix typo in class name

* CU-862j5by9q Fix saving issue for separation results

* CU-862j5by9q Add runnable category separator

* CU-862j5by9q Separate some file location constants in separation tests

* CU-862j5by9q Add test for separation that checks that no information gets lost (in the specific situation)

* CU-862j5by9q Add an anything-goes category description

* CU-862j5by9q Fix anything-goes option

* CU-862j5by9q Add tests for anything-goes category description

* CU-862j5by9q Add possibility of using an overflow category when separating regression suite

* CU-862j5by9q Add use of the overflow category to the runnable

* CU-862j5by9q Fix linting and typing issues

* CU-862j5by9q Add test for each individual separated suite

* CU-862j5by9q Fix minor abstract class issues

* CU-862j5by9q Rename categoryseparation module as category_separation

* CU-862j5by9q Add docstrings to category_separator

* CU-8677craqe make transformer_ner continue processing other entities after the first non-matching

* Bump django from 3.2.17 to 3.2.18 in /webapp/webapp

Bumps [django](https://github.com/django/django) from 3.2.17 to 3.2.18.
- [Release notes](https://github.com/django/django/releases)
- [Commits](django/django@3.2.17...3.2.18)

---
updated-dependencies:
- dependency-name: django
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>

* CU-862j7b9jc Mypy full release - 1.0.0 (#308)

* CU-862j7b9jc Add abstract base class to regression converting strategy where necessary

* CU-862j7b9jc Bump mypy to version 1.0.0

* CU-862j7b9jc Mypy abc hotfix (#311)

* CU-862j7b9jc Fix issue with duplicate imports

* CU-862j7b9jc Fix issue with no whitespace after keyword (E275)

* CU-862j7b9jc Remove unnecessary brackets from if statement

* CU-8677ge6j8 Version identification and updating (#313)

* Expose example model card version in metadata test

* Add version detection along with tests

* Move to a more comprehensive version string parser (regex)

* Add more comprehensive versioning tests

* Move MedCAT unzip to a separate method

* Separate getting semantic version from string

* Add new CDB with version information and use that with versioning tests

* Add methods to get version info from CDB dump and model pack zip/folder

* Exposing CDB file name and adding custom dev patch version support

* Fix config.linking.filters.cuis - from empty dict to empty set

* Add logging to versioning

* Fix f-strings instead of (intended) r-strings

* Add creating model pack archive to versioning CDB fix

* Fix logger initialising

* Making versioning a runnable module that allows fixing the config

* Add docstrings to CLI methods

* CU-8677ge6j8 Make explicit check regards to empty dict when fixing config

* CU-8677ge6j8 Add tests regarding versioning changes

* CU-8677ge6j8 Add missing return type hint

* CU-8677ge6j8 Simplify action handling for CLI input

* CU-8677ge6j8 Simplifying archive making method

* Pin down transformers for the de-identification model (#314)

* NO-TICKET pin down transformers for the de-id model

* Added function to remove CUI from cdb (#316)

* Added function to remove CUI from cdb

* Unit test for remove_cui

* CU-862jjprjw Fix github actions failures (#317)

* Added function to remove CUI from cdb

---------

Co-authored-by: antsh3k <[email protected]>

* CU-862jr8wkk Pin pydantic dependency to avoid conflicts with v2.0 (#318)

* Bump django from 3.2.18 to 3.2.19 in /webapp/webapp

Bumps [django](https://github.com/django/django) from 3.2.18 to 3.2.19.
- [Commits](django/django@3.2.18...3.2.19)

---
updated-dependencies:
- dependency-name: django
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>

* CU-863gntc58 Umlspt2ch (#322)

* CU-863gntc58 Add parent to child relationship getter to UMLS preprocessing

* CU-863gntc58 Only use ISA relationships

* Make sure parents do not have themselves as children

* CU-863gntc58 Only keep preferred names

* CU-863gntc58 Fix typing issues

* CU-863gntc58 Fix child-parent relationships being saved instea

* Better system for avoiding parent-child being the same

* CU-86783u6d9 Add wrapper to simplify De-ID model usage

* CU-86783u6d9 Add wrapper to simplify De-ID model usage

* CU-86783u6d9 Fix typo (nod vs not)

* CU-86783u6d9 Fix typo in docstring

* CU-86783u6d9 Change loading method name to match CAT

* CU-86783u6d9 Separate NER model from DeID model

* Better separation of NER models from DeID models

* CU-86783u6d9 Move deid method from helpers module to deid model and deprecated the use of the wrappers in the helpers module

* Fix imports in deid model

* Fix deid training method return value

* CU-86783u6d9 Fix dunder call defaults for redaction

* CU-86783u6d9 Add a few simple tests for the DeID model

* CU-86783u6d9 Add redaction test for the DeID model

* CU-86783u6d9 Add remove sensitive data

* CU-86783u6d9 Fix deid model validation

* CU-86783u6d9 Add ChatGPT generated DeId train data

* CU-86783u6d9 Add Warning regarding deid training data

* CU-86783u6d9 Fix model issue with multiple NER models

* CU-86783u6d9 Fix merge conflict in docstring

* CU-86783u6d9 Try and fix keyword argument duplication

* CU-86783u6d9 Ignore mypy where needed

* CU-86783u6d9 Fix issue with NER model being returned when loading a DeID model

* CU-86783u6d9 Remove unused import

* CU-86783u6d9 Update training data with some more examples

* CU-86783u6d9 Add type hints and doc string to deid method

* CU-86783u6d9 Add comment regarding deid_text method being outside the model class

* CU-86783u6d9 Add missing return type

* CU-86783u6d9 Expose get_entities in NER model

* CU-86783u6d9 Expose dunder call in NER model

* CU-86783u6d9 Remove dunder call in override in deid model

* CU-86783u6d9 Fix deid model tests

* CU-86783u6d9 Fix a few typos in docstrings

* CU-86783u6d9 Fix a method name in docstrings

---------

Signed-off-by: dependabot[bot] <[email protected]>
Signed-off-by: zethson <[email protected]>
Co-authored-by: tomolopolis <[email protected]>
Co-authored-by: Zeljko <[email protected]>
Co-authored-by: Sander Tan <[email protected]>
Co-authored-by: Xi Bai <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Anthony Shek <[email protected]>
Co-authored-by: Lukas Heumos <[email protected]>
Co-authored-by: antsh3k <[email protected]>
Co-authored-by: James Brandreth <[email protected]>
Co-authored-by: Xi Bai <[email protected]>
adam-sutton-1992 and others added 30 commits August 27, 2024 16:14
* CU-86956du3q: Move to placeholder-based replacement

* CU-86956du3q: Update regression tests to a more reasonable state.

Make sure to compare the correct annotation, not just hoping for any CUI annotated to match the one we are looking for.
Output the specifics of the type of match that was found:
 - Identical
 - Bigger / smaller span
 - Random overlap
 - Parents / grandparents, or children
Add strictness options to summary (success / failure).
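The span-related match types listed above (identical, bigger or smaller span, other overlap) can be illustrated with a small classifier. This is a simplified sketch under stated assumptions: the `SpanMatch` enum and `classify_span` function are hypothetical names, spans are treated as half-open `(start, end)` character offsets, and the parent/child checks, which need concept hierarchy data, are left out.

```python
from enum import Enum, auto
from typing import Tuple

Span = Tuple[int, int]  # half-open (start, end) character offsets


class SpanMatch(Enum):
    IDENTICAL = auto()
    BIGGER_SPAN = auto()
    SMALLER_SPAN = auto()
    PARTIAL_OVERLAP = auto()
    NO_OVERLAP = auto()


def classify_span(expected: Span, found: Span) -> SpanMatch:
    """Describe how a found annotation span relates to the expected one."""
    exp_start, exp_end = expected
    start, end = found
    if (start, end) == (exp_start, exp_end):
        return SpanMatch.IDENTICAL
    if end <= exp_start or start >= exp_end:
        return SpanMatch.NO_OVERLAP
    if start <= exp_start and end >= exp_end:
        return SpanMatch.BIGGER_SPAN  # found span fully covers the expected one
    if start >= exp_start and end <= exp_end:
        return SpanMatch.SMALLER_SPAN  # found span lies inside the expected one
    return SpanMatch.PARTIAL_OVERLAP
```

A strictness setting could then decide which of these outcomes count as a success in the summary.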

* CU-86956du3q: Further fixes for regression checking:

Remove 'Failure reason' and 'Failure descriptor' - now using Finding instead.
Remove simplified success/failure metrics wherever relevant.
Fix tests that relied on old logic and fix test-time replacement/cui location.

* CU-86956du3q: Add documentation for new classes and methods

* CU-86956du3q: Rename enum constant (SPAN_OVERLAP -> PARTIAL_OVERLAP)

* CU-86956du3q: Add matching for partially overlapping children

* CU-86956du3q: Add tests for partially overlapping children

* CU-86956du3q: Update regression checking to generate multiple sub-cases for multiple placeholders

* CU-86956du3q: Update some tests for new format

* CU-86956du3q: Remove old / unused / irrelevant tests and test-code

* CU-86956du3q: Some renaming (filter -> placeholders)

* CU-86956du3q: Add some additional fail safes for option set

* CU-86956du3q: Fix option set for only 1 placeholder

* CU-86956du3q: Fix targeting

* CU-86956du3q: Add tests for targeting

* CU-86956du3q: Remove MCT export conversion (at least for now)

* CU-86956du3q: Remove MCT export conversion tests (at least for now)

* CU-86956du3q: Remove suite editing (at least for now)

* CU-86956du3q: Remove category separation (at least for now)

* CU-86956du3q: Remove unused regression utils (at least for now)

* CU-86956du3q: Remove serialisation tests (at least for now)

* CU-86956du3q: Improve quality of default regression test set

* CU-86956du3q: Improve exceptions in targeting

* CU-86956du3q: Fix docstring issue regarding exceptions

* CU-86956du3q: Update test with correct exceptions

* CU-86956du3q: Add utils for partial substitutions and corresponding tests

* CU-86956du3q: Allow multiple of the same placeholder in a phrase.

And more specifically, treat each one as their own sub-case

* CU-86956du3q: Add relevant tests for multi-placeholder checking

* CU-86956du3q: Allow changing of multiple pre-processing placeholders

* CU-86956du3q: Fix 1-placeholder sub-case yielding

* CU-86956du3q: Remove debug output

* CU-86956du3q: Replace separator (~) with whitespace when checking

* CU-86956du3q: Add utility method to limit string length for output

* CU-86956du3q: Improve string length limiting method

* CU-86956du3q: Add a few tests for string length limiting method
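
A minimal sketch of what a length-limiting helper for report output might look like. The name `limit_str_len`, its parameters, and the choice to keep the start and end of the phrase are assumptions for illustration, not the actual utility added in these commits.

```python
def limit_str_len(text: str, max_len: int = 40,
                  keep_front: int = 20, keep_rear: int = 10) -> str:
    """Shorten `text` for display if it is longer than `max_len`.

    Hypothetical helper: keeps the first `keep_front` and the last
    `keep_rear` characters and notes how many characters were dropped.
    """
    if keep_front < 1 or keep_rear < 1:
        raise ValueError("keep_front and keep_rear must be at least 1")
    if keep_front + keep_rear > max_len:
        raise ValueError("kept parts must fit within max_len")
    if len(text) <= max_len:
        # Short enough already: return unchanged
        return text
    dropped = len(text) - keep_front - keep_rear
    return f"{text[:keep_front]} [{dropped} chars] {text[-keep_rear:]}"
```

Phrases at or below the maximum length pass through unchanged, so only long phrases are abbreviated in the output.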

* CU-86956du3q: Add an ANYTHING strictness (mostly for example disabling)

* CU-86956du3q: Add storage of examples (of a certain strictness) as well as relevant output

* CU-86956du3q: Fix typo (missing ending bracket) in report output

* CU-86956du3q: Fix examples header appearing for every example

* CU-86956du3q: Print the same phrase fewer times for examples

* CU-86956du3q: Update fake CDB with (default) config

* CU-86956du3q: Add finding to examples and output

* CU-86956du3q: Add config to another fake CDB during test time

* CU-86956du3q: Allow strictness to propagate to parts when looking at examples

* CU-86956du3q: Add placeholder to examples output

* CU-86956du3q: Refactor report output generation slightly

* CU-86956du3q: Show all non-identical examples

* CU-86956du3q: Update example checking with strictness requirement (instead of simple boolean)

* CU-86956du3q: Simplify targeting somewhat (remove unnecessary method)

* CU-86956du3q: Allow changing of output phrase max length

* CU-86956du3q: Fix doc string for changed method

* CU-86956du3q: Small whitespace fix

* CU-86956du3q: Fix total-included checking iteration

* CU-86956du3q: Add strictness and max phrase length to CLI

* CU-86956du3q: Add example strictness to CLI

* CU-86956du3q: Fix default value for strictness in CLI

* CU-86956du3q: Update to use number of sub-cases for tqdm/progress bar

* CU-86956du3q: Remove option to set the total for progress bar (the automated one works fine now)

* CU-86956du3q: Simplify the progress bar by combining all cases

* CU-86956du3q: Split subcase iteration

* CU-86956du3q: Rename regression checker to regression suite

* CU-86956du3q: Streamline typing and the like by using intermediate data classes

* CU-86956du3q: Remove redundant method

* CU-86956du3q: Remove redundant method and accompanying test

* CU-86956du3q: Remove redundant class

* CU-86956du3q: Add another intermediate data class

* CU-86956du3q: Remove completed TODO notes and redundant method

* CU-86956du3q: Add documentation to new methods and classes. Simplify example keeping.

* CU-86956du3q: Small update for how default test suite is handled for CLI

* CU-86956du3q: Small update to report output format

* CU-86956du3q: Add easier to read exception when unable to load a placeholder

* CU-86956du3q: Update percentages output to show fewer decimal places

* CU-86956du3q: Use preferred name for run-to-run consistency

* CU-86956du3q: Update test time fake CDBs

* CU-86956du3q: Update default regression tests with new extensive (yet simple) test case

* CU-86956du3q: Add initial README for regression stuff

* CU-86956du3q: Add option for failing when another concept was found.

Added other incorrect cui that was found (if applicable).
Fixed issue with finding grandparents.

* CU-86956du3q: Add tests for parent and grandparent finding; fix tests for new changes (with optionally found alternative CUI)

* CU-86956du3q: Add preferred name to wrong CUI found

* CU-86956du3q: Fix tests for new form of determine cui description; add test for exact span grandchild

* CU-86956du3q: Fix determining partial matches for grandchildren and beyond

* CU-86956du3q: Add test for partial matches of grandchildren

* Fixing bug for metacat

Fix issues with compute_class_weights JSON serialization and enforce fc2 usage when fc3 is enabled

* Resolved an issue where compute_class_weights returns a NumPy array, causing an error when saving the configuration as JSON (since JSON does not support NumPy arrays). The fix ensures compatibility by converting the NumPy array to a JSON-serializable format.
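
A minimal sketch of the conversion described above. To keep it runnable without NumPy, it uses the standard library's `array.array` as a stand-in for the NumPy array: both are rejected by `json.dumps` and both expose `tolist()`. The helper name `to_jsonable` and the config key are assumptions for illustration.

```python
import json
from array import array


def to_jsonable(weights):
    """Return a plain list if `weights` exposes tolist(), else return it unchanged."""
    return weights.tolist() if hasattr(weights, "tolist") else weights


# Stand-in for the NumPy array of class weights (assumption: values are made up)
class_weights = array("d", [0.5, 1.5, 3.0])

# Converting before building the config keeps the config JSON-serialisable
config = {"class_weights": to_jsonable(class_weights)}
serialised = json.dumps(config)
```

Passing the unconverted array to `json.dumps` raises a `TypeError`, which is the failure mode the commit describes.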

* Added a safeguard in the model_architecture_config for meta_cat_config. The current architecture assumes fc3 is only used when fc2 is enabled. If fc2 is set to False and fc3 is True, the model would fail due to a mismatch in hidden layer sizes. The fix automatically enables fc2 if fc3 is set to True, preventing potential errors.
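
The safeguard can be sketched roughly as follows. `ArchitectureConfig` and `enforce_fc_consistency` are hypothetical names standing in for the relevant part of the MetaCAT config; only the `fc2`/`fc3` flags come from the commit message.

```python
from dataclasses import dataclass


@dataclass
class ArchitectureConfig:
    """Hypothetical stand-in for the model architecture section of the config."""
    fc2: bool = True
    fc3: bool = False


def enforce_fc_consistency(config: ArchitectureConfig) -> ArchitectureConfig:
    """Enable fc2 whenever fc3 is requested, since fc3 is assumed to follow fc2."""
    if config.fc3 and not config.fc2:
        config.fc2 = True
    return config
```

A config with both layers disabled is left untouched; only the inconsistent combination (fc3 without fc2) is corrected.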

* CU-86956duhb: Add method to backport a model pack from 1.12 to previous version (#465)

* CU-86956duhb: Add method to backport a model pack from 1.12 to previous version

* CU-86956duhb: Fix some doc string issues

* CU-86956duhb: Add deprecation decorator to old config-fix

* CU-86956duhb: Mark backporting method as deprecated and to be removed in 1.14

* CU-8694cd9t2: Allow merging config into model pack config before init (#462)

* CU-8694cd9t2: Allow merging config into model pack config before init

* CU-8694fwyje: Update all configs with pre-load parts documented (#473)

* CU-86956du3q: Add converter from MCT export

* CU-86956du3q: Add documentation to MCT export converter

* CU-86956du3q: Add option to create a regression suite from an MCT export

* CU-86956du3q: Add option to create a regression suite from an MCT export to CLI

* CU-86956du3q: Add a small note for converter placeholder

* CU-86956du3q: Add tests for MedCATtrainer export converter

* CU-86956du3q: Add tests for regression suite generation based on MCT export

* CU-86956du3q: Simplify regression case creation tests somewhat

* CU-86956du3q: Add option to create a regression suite YAML from MCT export

* CU-86956du3q: Add option to stop at MCT export conversion

* CU-86956du3q: Make use of only-prefnames option

* CU-86956du3q: Fix loading of only-prefnames option from yaml

* CU-86956du3q: Add comment for only using preferred names to the default regression suite yaml

* CU-86956du3q: Fix tests broken due to pref-name only change

* CU-86956du3q: Add utility method to set runtime doc strings for enum constants

* CU-86956du3q: Add tests for runtime doc string addition

* CU-86956du3q: Add more tests for runtime doc string addition (to make sure it fails without the change)

* CU-86956du3q: Make Finding enum have runtime doc strings
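
As a rough illustration of runtime doc strings on enum constants, the sketch below attaches a separate `__doc__` to each member after the class is created. The helper name `add_doc_strings`, the member names other than `PARTIAL_OVERLAP`, and the descriptions are assumptions, not the actual implementation.

```python
from enum import Enum, auto
from typing import Dict, Type


class Finding(Enum):
    """Cut-down, hypothetical version of a finding enum."""
    IDENTICAL = auto()
    PARTIAL_OVERLAP = auto()
    FAIL = auto()


def add_doc_strings(enum_cls: Type[Enum], docs: Dict[str, str]) -> None:
    """Attach a per-member doc string at runtime, keyed by member name."""
    for name, doc in docs.items():
        # Sets an instance attribute on the member, shadowing the class doc
        enum_cls[name].__doc__ = doc


add_doc_strings(Finding, {
    "IDENTICAL": "The expected concept was found with the exact span.",
    "PARTIAL_OVERLAP": "The expected concept was found with an overlapping span.",
    "FAIL": "The expected concept was not found.",
})
```

Without the call, every member would report the class-level doc string, which is why a description listing needs the per-member text.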

* CU-86956du3q: Add CLI option to show the various descriptions of the finding types (--only-describe)

* CU-86956du3q: Update dict and json methods for some results for JSON serialisation

* CU-86956du3q: Add a few json serialisation tests

* CU-86956du3q: Add json serialisation example strictness to CLI

* CU-86956du3q: Add a few more json serialisation tests

* CU-86956du3q: Add usage of regression suite name from the name of the file being read

* CU-86956du3q: Fix tests by adding the regression suite name where applicable

* CU-86956du3q: Avoid examples in ResultDescriptor

* CU-86956du3q: Make sure strictness propagates across all parts of a multi-result descriptor

* CU-86956du3q: Update tests: Use correct reporting for generating fake reports

* CU-86956du3q: Fix small test issue

* CU-86956du3q: Update tests for manual success/fail for results

* CU-86956du3q: Separate calculation section of report finding

* CU-86956du3q: Add a few more tests for report/results

* CU-86956du3q: Add option to force a non-0 exit status upon any regression test failure

* CU-86956du3q: Add files for regression model creation and checking

* CU-86956du3q: Add new part to main workflow to create and regression check a simple model pack

* CU-86956du3q: Update a mistyped comment

* CU-86956du3q: Make regression run at STRICTEST strictness at GHA workflow time

* CU-86956du3q: Fix strictness matrix for anything-typed strictness

* CU-86956du3q: Add strictness matrix information to --describe-only

* CU-86956du3q: Add python version to created model pack for test time

* CU-86956du3q: Use the python version of created model pack during test time to avoid conflicts with other python versions running in parallel

* CU-86956du3q: [TEMP] Remove tests from main workflow (for faster iteration) and add args to output upon regression checking

* Revert "CU-86956du3q: [TEMP] Remove tests from main workflow (for faster iteration) and add args to output upon regression checking"

This reverts commit 4bf3089.

* CU-86956du3q: Make full model path the last line of the output upon creating model for regression

* CU-86956du3q: Move regression workflow logic to a separate bash script

* CU-86956du3q: Update comments in regression bash script

* CU-8694pz44d: Fix model cleanup during regression

* CU-86956du3q: Fix typos in utils

* CU-86956du3q: Fix a bunch of various typos in doc strings and comments

---------

Co-authored-by: shubham-s-agarwal <[email protected]>
* CU-8695j1be2: Remove deprecated method on CDB

* CU-8695j1be2: Remove unused import due to removal of deprecated method
* Pushing bug fix for metacat

2-phase learning for MetaCAT utilises data_undersampled. Fixed a bug in the eval function, which was incorrectly using the data_undersampled instead of the full_data

* Pushing change for lazy logging

* Pushing update for lazy logging

* Pushing lint fix
* CU-8695uhe5n: Update docs dependency pins

* CU-8695uhe5n: Fix typo in fsspec version pin
* CU-8695pvhfe: Rename a test class

* CU-8695pvhfe: Add tests for multiprocessing usage monitoring

* CU-8695pvhfe: Fix usage monitor for multiprocessing.

When using CAT.multiprocessing_batch_char_size (CAT._multiprocessing_batch and CAT._mp_cons internally), flush the usage monitor at the end of the multiprocessing method.
When using CAT.get_entities_multi_texts or CAT.multiprocessing_batch_docs_size (uses the former internally), add logging of usage to output

* CU-8695pvhfe: Fix remaining issues with usage monitor for multiprocessing.

Avoid checking length of (potentially) non-existent strings. Avoid early iteration of generator.
* CU-8695knfbg: Decouple the edit finder methods from the spell checker

* CU-8695knfbg: Add methods for random edit picking and variant estimation to utils; Plus a few tests

* CU-8695knfbg: Add edit distance option and use to CLI

* CU-8695knfbg: Allow retaining order of elements in generator when getting edits for run-to-run consistency

* CU-8695knfbg: Add safeguard for name order to be consistent across runs

* CU-8695knfbg: Sort names when getting from CDB to avoid run to run variance

* CU-8695knfbg: Move edit finding methods back to BasicSpellChecker class, but make the 1-distance method a class method

* CU-8695knfbg: Move validation earlier in edit finder

* CU-8695knfbg: Simplify edit finder somewhat
* CU-869574kvp: Add pattern based release version identifying for Snomed preprocessing

* CU-869574kvp: Add tests for pattern-based snomed release identification

* CU-869574kvp: Update Snomed preprocessing:

Separate extensions into an Enum.
Do the release/paths check at init to allow for early failures in case of issues

* CU-869574kvp: Simplify mappings somewhat.

Move common avoids to a common location.
Fix UK Drug relationship name

* CU-869574kvp: Simplify mappings somewhat more.

Remove some clutter by separating common prefixes for release types and file names.

* CU-869574kvp: Simplify mappings somewhat more, again.

Remove some clutter by separating common suffixes for release types.

* CU-869574kvp: Update preprocessing.

New abstraction: use supported extensions, which describe their file formats, along with bundles, which give some further insight and control.

* CU-869574kvp: Fix data class init

* CU-869574kvp: Fix issue with file paths

* CU-869574kvp: Fix a UK Clinical description file path

* CU-869574kvp: Add (optional) 2nd part of folder name to extension.

For AU models, the folder name seems to be 'SnomedCT_Release_AU1000036_20240630T120000Z', so the 1st part is just 'Release' and the 2nd part is indicative of AU.
Add usage of this where relevant.
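
To illustrate the two-part folder name, here is a minimal sketch of pattern-based parsing. The regular expression, `ReleaseInfo`, and `parse_release_folder` are assumptions made for this example; only the AU folder name comes from the commit message.

```python
import re
from typing import NamedTuple, Optional

# Assumed layout: SnomedCT_<first part>[_<second part>]_<YYYYMMDD>T<HHMMSS>Z
FOLDER_PATTERN = re.compile(
    r"^SnomedCT_([A-Za-z0-9]+)(?:_([A-Za-z0-9]+))?_(\d{8})T\d{6}Z$"
)


class ReleaseInfo(NamedTuple):
    first_part: str
    second_part: Optional[str]  # None when the folder name has only one part
    release_date: str


def parse_release_folder(folder_name: str) -> ReleaseInfo:
    """Split a release folder name into its name parts and release date."""
    match = FOLDER_PATTERN.match(folder_name)
    if match is None:
        raise ValueError(f"Unrecognised release folder name: {folder_name!r}")
    return ReleaseInfo(*match.groups())


info = parse_release_folder("SnomedCT_Release_AU1000036_20240630T120000Z")
```

For the AU folder the first part is just `Release`, so the second part (`AU1000036`) is what identifies the extension.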

* CU-869574kvp: Fix preprocessing tests.

Add patch for files/folders where applicable.
Change the paths of attributes where applicable.
* CU-8695ucw9b: Fix older DeID models due to changes in transformers.

Since transformers 4.42.0, the tokenizer is expected to have the 'split_special_tokens' attribute. But the version we've saved does not. So when it's loaded, this causes an exception to be raised (which is currently caught and logged by medcat).
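
One way such a fix could look is to add the missing attribute after loading. This is a sketch only: the helper name is hypothetical, a `SimpleNamespace` stands in for the loaded tokenizer, and `False` as the fallback value is an assumption.

```python
from types import SimpleNamespace


def ensure_split_special_tokens(tokenizer, default: bool = False):
    """Add 'split_special_tokens' if an older saved tokenizer lacks it."""
    if not hasattr(tokenizer, "split_special_tokens"):
        # Assumed fallback value; existing values are never overwritten
        tokenizer.split_special_tokens = default
    return tokenizer


# Stand-in for a tokenizer object loaded from an older model pack
old_tokenizer = SimpleNamespace(name="legacy-tokenizer")
ensure_split_special_tokens(old_tokenizer)
```

Tokenizers that already carry the attribute are left as they are, so newer model packs are unaffected.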

* CU-8695ucw9b: Add functionality for transformers NER to spectacularly fail upon consistent consecutive exceptions.

The idea is that this way, if something in the underlying models is consistently failing, the exception is raised rather than simply logged

* CU-8695ucw9b: Add tests for exception raising after a pre-defined number of failed document processes

* CU-8695ucw9b: Change conditions for raising exception on consecutive failure.

Now only raise the exception if the consecutive failure is identical (or similar). We determine that from the type and string-representation of the exception being raised.
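
The condition described above can be sketched as a small tracker that compares each failure with the previous one by exception type and string representation. The class names and the default threshold are assumptions for illustration, not the names used in the code base.

```python
from typing import Optional, Tuple


class ConsecutiveFailureError(Exception):
    """Hypothetical custom exception for repeated identical failures."""


class FailureTracker:
    def __init__(self, max_consecutive: int = 3) -> None:
        self.max_consecutive = max_consecutive
        self._last_key: Optional[Tuple[type, str]] = None
        self._count = 0

    def record_success(self) -> None:
        """A successfully processed document resets the streak."""
        self._last_key = None
        self._count = 0

    def record_failure(self, exc: Exception) -> None:
        """Count the failure; raise once the same failure repeats too often."""
        key = (type(exc), str(exc))
        if key == self._last_key:
            self._count += 1
        else:
            # A different failure starts a new streak
            self._last_key, self._count = key, 1
        if self._count >= self.max_consecutive:
            raise ConsecutiveFailureError(
                f"{self._count} identical consecutive failures: {exc!r}"
            ) from exc
```

Isolated or varying failures would still only be logged by the caller; the tracker escalates only when the same problem keeps recurring.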

* CU-8695ucw9b: Small additional cleanup on successful TNER processing

* CU-8695ucw9b: Use custom exception when failing due to consecutive exceptions

* CU-8695ucw9b: Remove try-except when processing transformers NER to force immediate raising of exception
* MetaCAT fixes and upgrades

Pushing for 3 updates:
1) Removed the check and update for labels with zero data, as this was causing issues during evaluation
2) Resolved an issue where the confusion matrix couldn't be calculated when testing on a single class with an F1 score of 1, as it expected the original number of training classes (3)
3) Updated the attention mask creation to dynamically use the actual pad_idx value instead of assuming it to be 0
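
The third update can be illustrated with a small sketch that builds the mask from the configured padding index rather than a hard-coded 0. It uses plain lists instead of tensors so that it runs without PyTorch; the function name and the example token IDs are assumptions.

```python
from typing import List


def make_attention_mask(batch: List[List[int]], pad_idx: int) -> List[List[int]]:
    """Return 1 for real tokens and 0 for padding, based on `pad_idx`."""
    return [[0 if token == pad_idx else 1 for token in row] for row in batch]


# Here token ID 0 is a real token and 3 is the padding index
batch = [[5, 0, 7, 3, 3], [0, 9, 3, 3, 3]]
mask = make_attention_mask(batch, pad_idx=3)
```

With a mask that assumed padding to be 0, the real tokens with ID 0 would have been masked out and the actual padding attended to.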

* Pushing type fix

* Pushing for type fix

* Fixing type issues

* Pushing change

* Pushing update w/o try except block

For the issue where the confusion matrix couldn't be calculated when testing on a single class with an F1 score of 1, as it expected the original number of training classes (3), pushing an optimized version w/o the try except block
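
One way to keep the matrix shape stable when only a single class appears in the test data is to build it over the full, fixed label set. The sketch below is a plain-Python illustration of that idea and is not necessarily how the fix was implemented; the function and the label values are assumptions.

```python
from typing import List, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     labels: Sequence[int]) -> List[List[int]]:
    """Rows are true labels, columns are predicted labels, in `labels` order."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for true_label, predicted_label in zip(y_true, y_pred):
        matrix[index[true_label]][index[predicted_label]] += 1
    return matrix


# Only class 1 appears, and every prediction is correct (F1 of 1)
y_true = [1, 1, 1]
y_pred = [1, 1, 1]
matrix = confusion_matrix(y_true, y_pred, labels=[0, 1, 2])
```

Because the label set is passed explicitly, the result is still 3x3 even though two of the three training classes never occur.
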
…497)

* CU-869671bn4: Update requirements (GHA should fail due to mypy)

* CU-869671bn4: Update mypy dev requirement to be less than 1.12
* CU-86967nnra: Remove python 3.8 from GHA

* CU-86967nnra: Remove python 3.8 from classifiers

* CU-86967nnra: Add python version requirements to setup.py (allowing from 3.9 to 3.11)

* CU-86967nnra: Remove upper bound from python requirements.

The upper bound could be lifted as soon as `spacy` releases a compatible version, and that _shouldn't_ require any changes from our side. It isn't (currently) possible to install on higher Python versions anyway, since no `spacy` is available for those versions
* CU-86964zm4d: Use ignore tag correctly to ignore certain parts of UK release

* CU-86964zm4d: Use OPCS4 later refset ID by default (and switch to older if needed)

* CU-86964zm4d: Fix OPCS4 refset ID tests.

Fix the default value being tested for (i.e. the one shown in case of an international release).
Add a test for old UK extension.

* CU-86964zm4d: Add note regarding OPCS refset ID relevance only for UK extensions.

* CU-86964zm4d: Fix checking of extension outside loops.

I.e. determine whether a UK release/bundle is used for OPCS4/ICD10 mappings splitting.
Always return separate refsets for ICD10 and OSC internally, even if the latter is None.
* CU-8695hghww: Add bash script to run backwards compatibility

* CU-8695hghww: Rename backwards compatibility running bash script

* CU-8695hghww: Add new step to workflow to run model backwards compatibility

* CU-8695hghww: Fix model compatibility regression suite path

* CU-8695hghww: Simplify creation and removal of fake model folder
…ecated (#500)

* CU-8696m1mch: Remove versioning utility since all its parts were deprecated

* CU-8696m1mch: Remove tests for versioning utility

* CU-8696m1mch: Remove unused test-specific binary (CDB)
* CU-8696nbm03: Remove use of unigram table

* CU-8696nbm03: Fix usage of new unigram table alternative

* CU-8696nbm03: Remove unigram table from loaded vocabs

* CU-8696nbm03: Add tests for unigram table usage/negative sampling frequency

* CU-8696nbm03: Add small comment to tests

* CU-8696nbm03: Calculate frequencies upon load if not present

* CU-8696nbm03: Update comment regarding probability calculations

* CU-8696nbm03: Remove commented test case

* CU-8696n7w95: Fix docstring issue

* CU-8696nbm03: Fix serialisation tests

* CU-8696nbm03: Add python 3.9-friendly method for getting the total of a counter
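
For context, `Counter.total()` was only added in Python 3.10, so a 3.9-compatible equivalent sums the values directly. The helper name and the example counts below are assumptions for illustration.

```python
from collections import Counter


def counter_total(counter: Counter) -> int:
    """Sum of all counts; equivalent to Counter.total() on Python 3.10+."""
    return sum(counter.values())


word_counts = Counter({"heart": 3, "failure": 2, "acute": 1})
total = counter_total(word_counts)
```
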
* CU-8695d4www: Bump pydantic requirement to 2.6+

* CU-8695d4www: Update methods to use pydantic2 based ones

* CU-8695d4www: Update methods to use pydantic2 based ones [part 2]

* CU-8695d4www: Use identifier based config when setting last train date on meta cat and tner

* CU-8695d4www: Use pydantic2-based model validation

* CU-8695d4www: Add workarounds for pydantic1 methods

* CU-8695d4www: Add missing utils module for pydantic1 methods

* Revert "CU-8695d4www: Bump pydantic requirement to 2.6+"

This reverts commit b0b3d43.

* CU-8695d4www: [TEMP] Add type-ignores to pydantic2-based methods for GHA workflow

* CU-8695d4www: Make pydantic2-requires getattribute wrapper only apply when appropriate

* CU-8695d4www: Fix missing model dump getter abstraction

* CU-8695d4www: Fix missing model dump getter abstraction (in CAT)

* CU-8695d4www: Update tests for pydantic 1 and 2 support

* Revert "CU-8695d4www: [TEMP] Add type-ignores to pydantic2-based methods for GHA workflow"

This reverts commit b86135a.

* Reapply "CU-8695d4www: Bump pydantic requirement to 2.6+"

This reverts commit 080ae71.

* CU-8695d4www: Allow both pydantic 1 and 2

* CU-8695d4www: Deprecated pydantic utils for removal in 1.15

* CU-8695d4www: Allow usage of specified deprecated method(s) during tests

* CU-8695d4www: Allow usage of pydantic 1-2 workaround methods during tests

* CU-8695d4www: Add documentation for argument allowing usage during tests in deprecation method

* CU-8695d4www: Fix allowing deprecation during test time

* CU-8695d4www: Fix model dump getting in regression checker

* Revert "CU-8695d4www: Fix allowing deprecation during test time"

This reverts commit fadc7d1.

* Revert "CU-8695d4www: Add documentation for argument allowing usage during tests in deprecation method"

This reverts commit 927f807.

* Revert "CU-8695d4www: Allow usage of pydantic 1-2 workaround methods during tests"

This reverts commit 825628e.

* Revert "CU-8695d4www: Allow usage of specified deprecated method(s) during tests"

This reverts commit a89e680.

* Revert "CU-8695d4www: Deprecated pydantic utils for removal in 1.15"

This reverts commit 0ee1a8a.

* CU-8695d4www: Add comment regarding pydantic backwards compatibility where applicable

* CU-8695d4www: Add pydantic 1 check to GHA workflow

* CU-8695d4www: Fix usage of pydantic-1 based dict method in regression results

* CU-8695d4www: Fix usage of pydantic-1 based dict method in regression tests

* CU-8695d4www: New workflow step to install and run mypy on pydantic 1

* CU-8695d4www: Add type ignore comments to pydantic2 versions in versioning utils for typing during GHA workflow

* CU-8695d4www: Update pydantic requirement to 2.0+ only

* CU-8695d4www: Update to pydantic 2 ONLY

* CU-869671bn4: Update mypy dev requirement to be less than 1.12

* CU-869671bn4: Fix model fields in config

* CU-869671bn4: Fix stats helper method - use correct type adapter

* CU-869671bn4: Fix some model type issues

* CU-869671bn4: Line up with previous model dump methods

* CU-869671bn4: Fix overwriting model dump methods

* CU-869671bn4: Remove pydantic1 workflow step
…ting folds (#508)

* CU-8696v2j42: Add test to make sure per-cui counts are kept when creating folds

* CU-8696v2j42: Fix per annotation fold creation
* CU-8693bc9kc: Add python 3.12 support

* CU-8693bc9kc: Amend dependencies so as to be compatible with python 3.12

* Bump default spacy model version (to 3.8)

* CU-8693bc9kc: Fix some typing issues due to numpy2

* CU-8693bc9kc: Fix some typing issues due to numpy2 (try 2)

* CU-8693bc9kc: Change spacy models to 3.7.2

* CU-8693bc9kc: Pin numpy to v1

* CU-8693bc9kc: Fix numpy requirement comment

* CU-8693bc9kc: Fix usage of old/deprecated assert methods in tests

* CU-8693bc9kc: Update some requirement comments
* CU-8697c86rf: Update docs build requirements

* CU-8697c86rf: Fix docs build requirements (hopefully)

* CU-8697c86rf: Fix docs build requirements (hopefully) x2