CHANGELOG

v0.10.0 (2023-09-13)

Chore

  • chore: updating semantic release config (b30ea37)

Feature

  • feat: remove various transformer warnings and fix training documentation (#25)

Co-authored-by: Curtis Ruck <ruckc@DESKTOP-ME5SH6R> (6f2e1d1)

Fix

  • fix: default sampling to False to avoid changing existing behavior (46260d6)

  • fix: linting (b331dfa)

v0.9.0 (2023-06-08)

Chore

  • chore: adding torch 1.13 to dev deps to help CI run tests (8abff7b)

  • chore: updating docs to discourage multiple sentences per string (e77c5a2)

  • chore: adding an integration test for unk chars strings (1c21345)

Feature

  • feat: removing logger config for release (7947f8b)

Unknown

  • Add ability to set which gpu to use (#23)

  • updated handling of 'use_gpu' option to allow specifying gpu index to use

  • added some error handling with logging messages to the FrameSemanticTransformer constructor

  • fixed the last update so that the format string now is only set for local logger and not for the root logger (c62aa07)
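The `use_gpu` handling described above (accepting either a boolean or a GPU index) can be sketched as below. `resolve_device` is a hypothetical helper for illustration, not the library's actual implementation:

```python
def resolve_device(use_gpu):
    """Map a use_gpu flag to a device string.

    use_gpu may be False (CPU), True (default GPU), or an int GPU index,
    mirroring the behavior described for the 'use_gpu' option in #23.
    Illustrative sketch only -- not the library's real code.
    """
    # Check the booleans first: True/False are also instances of int,
    # so the isinstance(int) branch must come last.
    if use_gpu is False:
        return "cpu"
    if use_gpu is True:
        return "cuda"
    if isinstance(use_gpu, int):
        return f"cuda:{use_gpu}"
    raise ValueError(f"Invalid use_gpu value: {use_gpu!r}")
```

The returned string would then be passed to whatever device-placement call the model uses.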

v0.8.2 (2023-04-15)

Chore

  • chore: updating bibtex (0e4af39)

  • chore: pin poetry v1.4.0 so furo will install in CI (24dc8c7)

  • chore: pin poetry v1.4.0 so furo will install in CI (f8ca051)

  • chore: add CITATION.cff (223cfce)

  • chore: adding citation bibtex info to README (b5435fa)

Fix

  • fix: Align trigger-marked sentence to original sentence (#19)

  • trigger identification task alignment fix

  • black formatting

  • updated test cases


Co-authored-by: Jacob Striebel <[email protected]> (6683a22)

v0.8.1 (2023-03-15)

Fix

  • fix: auto-download omw-1.4 for inference (343906c)

v0.8.0 (2023-03-15)

Feature

  • feat: new models trained on Framenet exemplars (#18)

  • include exemplars in framenet training

  • skipping invalid trigger exemplars

  • skip exemplars by default during training

  • fixing tests

  • improving data augmentations

  • ensure wordnet download for inference

  • updating snapshots

  • adding more info when augmentations fail validation

  • adding more augmentations from nlpaug

  • fixing linting

  • fixing keyboard augmentation

  • more checks on keyboard augmentation

  • tweaking augmentations

  • fixing tests

  • adding safety check to uppercase augmentation

  • lower augmentation rate

  • adding more augmentations

  • tweaking augs

  • removing debugging output

  • reduce augmentation

  • tweaking augmentation probs

  • tweaking augmentation probs

  • fixing type import

  • adding option to delete non-optimal models as training progresses

  • tweaking augmentations

  • updating models

  • updating README with new model stats (3f937fb)

v0.7.0 (2023-03-09)

Chore

  • chore: serialize eval logs before writing json (40983ff)

  • chore: fixing missing links for readthedocs (fbca04e)

  • chore: explicitly installing furo in readthedocs (a581825)

  • chore: try using python 3.9 for readthedocs build (55c712f)

  • chore: manually install torch in readthedocs (1ad58a9)

  • chore: setting up docs page with sphinx and readthedocs (#17)

  • setting up docs page with sphinx and readthedocs

  • ignore docs for flake8 linting (8d3c5fd)

  • chore: create outputs dir when logging if it doesn't exist (7629560)

  • chore: adding option to log eval failures (f68fb61)

Feature

  • feat: Propbank (#16)

  • setting up propbank loaders

  • skip light verbs for now

  • trying a different approach to avoid light verbs

  • fixing typo

  • just ignore non-existent frames

  • use similar lu norm to framenet

  • adding option to resume from checkpoint

  • switching to propbank 3.4 instead of 3.1

  • fixing propbank nltk paths

  • removing debugging prints

  • fixing test

  • adding optional LR decay (4c53887)

Unknown

  • add readthedocs link to readme (bbeeed8)

v0.6.2 (2023-03-01)

Performance

  • perf: minor performance improvements for arg extraction and frame detection (#15)

  • save best model chkpt based on val loss

  • Adding loader setup to training

  • add more optional config for loggers/callbacks during training

  • adding explicit logging for test/train/val loss to end of epochs

  • revert to default PL logging behavior if no loggers are provided

  • adding helpers for model evaluations

  • try to standardize arg extraction output

  • standardize punct in args extraction

  • use fast tokenizer for sentence cleanup

  • switch to just using tokenizer cleanup for speed

  • run clean_up_tokenization just once before arg extraction, not for each arg

  • fixing val_metrics err (7e03969)

v0.6.1 (2023-02-27)

Chore

  • chore: remove poetry.lock from demo docker build (2f8ffb9)

Fix

  • fix: fixing errors when no frames are found (#14) (ef2424c)

v0.6.0 (2023-02-23)

Feature

  • feat: adding support for running inference on multiple sentences in batches (#11) (e6423e5)
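Batched inference like this typically amounts to splitting the input list into fixed-size chunks and running the model once per chunk. A minimal, hypothetical sketch of the chunking step (not the library's code):

```python
def chunk(items, batch_size):
    """Yield successive batch_size-sized slices of items."""
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]
```

Each yielded slice can then be tokenized and passed through the model as one batch, amortizing per-call overhead across sentences.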

v0.5.0 (2023-02-22)

Chore

  • chore: fixing README badge after shields.io breaking change (bd90fef)

Feature

  • feat: Multilingual training refactor (#10)

  • WIP refactoring to make it easier to train on different framenets

  • Making evaluate runnable directly to evaluate pretrained models

  • tweaking tests

  • refactoring training / eval scripts

  • add validation that loaders match model

  • updating README

  • cleaning up typing

  • use 3.8 for CI

  • updating semantic release (7bf7ae5)

v0.4.1 (2022-05-25)

Fix

  • fix: updating README stats (76e4e75)

v0.4.0 (2022-05-24)

Feature

  • feat: Frame classification hints (#3)

  • adding in lexical unit data for smarter frame classification

  • adding in stemming for lu handling

  • allow skipping validation in initial epochs for faster training

  • use self.current_epoch instead of batch_idx

  • using bigrams to reduce the number of frame suggestions

  • refactoring bigrams stuff and adding more tests

  • fixing bug with trigger bigrams

  • updating README

  • updating model revision (201ed51)

Unknown

  • fixing typo in demo server (a15ef6d)

  • improving demo UI (#4)

  • improving demo UI

  • adding secret 'model' param to client (8bf3275)

  • UI improvements for demo (69e85af)

v0.3.3 (2022-05-22)

Fix

  • fix: make trimmed batch contiguous (#2) (21aee70)

v0.3.2 (2022-05-22)

Fix

  • fix: add torch.no_grad() to batch trimming step (8b8a401)

v0.3.1 (2022-05-22)

Fix

  • fix: adding LICENSE into pypi description (c6e0a42)

  • fix: adding README into pypi description (1b99551)

v0.3.0 (2022-05-22)

Chore

  • chore: adding badges to README (ac00793)

Feature

  • feat: adding a helper to trim unnecessary padding chars for faster training / generation (#1) (58e58a8)
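Trimming unnecessary padding means truncating every sequence in a batch to the longest non-padding length, so the model never processes positions that are padding for all rows. A rough sketch of the idea on plain token-id lists (assuming pad id 0; illustrative only, not the library's helper):

```python
def trimmed_length(seq, pad_token_id=0):
    """Length of seq with trailing pad tokens removed."""
    n = len(seq)
    while n > 0 and seq[n - 1] == pad_token_id:
        n -= 1
    return n

def trim_batch(batch, pad_token_id=0):
    """Truncate every sequence to the longest non-padding length in the batch."""
    max_len = max(trimmed_length(seq, pad_token_id) for seq in batch)
    return [seq[:max_len] for seq in batch]
```

Shorter sequences keep their padding up to the new batch maximum, so the batch stays rectangular while dropping columns that were padding everywhere.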

v0.2.1 (2022-05-22)

Fix

  • fix: reverting to older lock file for mypy (08f0c63)

  • fix: relaxing transformers version req (57464a1)

v0.2.0 (2022-05-21)

Fix

  • fix: pinning old version of semantic-release plugin (85f3a62)

  • fix: adding fetch-depth 0 for release (6dab4e6)

  • fix: autopublish to pypi (3591c27)

Unknown

  • restrict model revisions (9a36fc4)

  • adding explanation about lightning_logs dir (7898bba)

  • updating README and improving train script (94d7fac)

  • Create LICENSE (f73fc0e)

  • augment training samples dynamically during training (3b40f07)

  • adding tests for chain_augmentations (57f24dc)

  • adding write permissions to publish job (d4b3e22)

  • try checkout v2 (7883323)

  • try adding token to checkout action (dce1651)

  • adding link to demo in README (f0b632b)

  • fix node version in gh action (7db37cb)

  • add an action to publish the website (182504f)

  • augment data for train but not val or test (660919e)

  • adding data augmentation (13de3f4)

  • adding small size model and lazy-loading for the nltk + models (bda0ca8)

  • adding a demo client using create-react-app (2a673d1)

  • try restricting batch size to 2 to avoid excessive memory use (4f2e58c)

  • try reducing to 1 thread to save memory (cfc4d21)

  • adding cors support to flask (4d463cb)

  • increase gunicorn timeout (acd42a1)

  • try adding poetry.lock to speed up docker build (86c9bcb)

  • bump to trigger cloud run build (c56e5b4)

  • bump to trigger cloud run build (a6d3094)

  • remove poetry.lock from docker build (805adf4)

  • adding a dockerized flask server for demo purposes (8f192ac)

  • fixing typo (b7b9e34)

  • adding a base FrameSemanticTransformer class to make it easy to parse sentences into frames (c0b78cf)

  • refactoring TaskSample classes into Tasks and TaskSamples (667f85e)

  • fixing tests (eab2f96)

  • more efficient loading of frame elements from framenet (4655768)

  • add a check for invalid output in eval (0e9eb88)

  • add a check for invalid output in eval (81a829a)

  • eval arg id similar to how sesame does it (73b2db9)

  • try adding in all possible frame elements into task intro for argument extraction (94e3c89)

  • updating frame id samples to be closer to how sesame does it (39376e5)

  • fixing evaluate function to work with batched predictions (7ca93d1)

  • force transformers v4.18.0 to keep mypy happy (ac2c6a6)

  • using multiple predictions when evaluating frame id task (d28961c)

  • fixing typo (197b93f)

  • trying to add eval into training (d22080b)

  • limiting task rebalancing ratio (451eb6c)

  • adding in task mix balancing (ecde0b2)

  • moving T5Tokenizer.cleanup into standardize_punct method (12ccf38)

  • Trying out built-in clean up tokenization method (5f4a723)

  • allow tweaking predict params in eval (a549041)

  • tweaking trigger processing to hopefully be more amenable to how the tokenizer works (2103f4d)

  • more readable eval print (726126d)

  • adding option to print eval failures (d23bf6f)

  • adding a punct standardization step (d08877e)

  • fixing linting (9250636)

  • tweaking frame id eval to match sesame logic (162f50f)

  • removing sample from dataloader, as it appears to break things (0e1dee0)

  • fixing trigger samples and adding tests (6f19673)

  • adding logging statements inside training function (a28a706)

  • refactoring based on simple-t5 (d89e5db)

  • fixing evaluate typing (3cd3c33)

  • fixing future annotations (16a227c)

  • fixing bug in py 3.7 (ea953fc)

  • refactoring and adding a target id task (37302bb)

  • adding total to tqdm iteration (058a12c)

  • fixing device issues (49a01f3)

  • fixing typing (6012b09)

  • more efficient eval processing (4bbee5c)

  • add tqdm for eval progress (69d98ac)

  • tweaking evaluate (0f3db83)

  • adding evaluate / predict helpers (be08ec1)

  • adding fulltext filenames from sesame for eval (081533b)

  • removing validation loop end as well (1c5dfb2)

  • removing return from training loop end (ace9621)

  • adding in closure... (4b8e184)

  • updating optimizer_step (ea79146)

  • fixing typo (7aa0a49)

  • moving dataset generation out of the tuner (ac9f90f)

  • adding future annotations stuff (611a1a5)

  • adding future annotations stuff (b6c8a53)

  • setting up a model for training (abfc3d8)

  • skipping confusing frames for now (1888b8c)

  • adding helper for parsing examples from docs (8856b2e)

  • fixing mypy (14e4832)

  • fixing black formatting (a24193a)

  • initial commit (4df6628)