[TMVA] Introduce changes in RBatchGenerator presented at CHEP2024
This commit builds on changes from

* #16341
* #15692

Authorship is preserved in the sources. Full list of commit messages follows:

[tmva] Fix chunking problem where values are missed

[tmva] Add drop_remainder argument

[tmva] NumPy and PyTorch generators may return batches of different row size

[tmva] Let tensorflow generator be a generator

[tmva] Remove a declared variable that is not used

[tmva] Remove superfluous shuffle

[tmva] Fix generator dimensions

[tmva] Add option to iterate with next(), change "CreateTFDatasets" to "CreateTFGenerators"

[tmva] Pythonize string concatenation

[tmva] Remove commented out lines

[tmva] Save remaining data, if there are less rows from chunk than in batch size

[tmva] Remove redundant TensorFlow import

[tmva] Change RBatchgenerator_NumPy.py tutorial to expose loop with next()

[tmva] Added test file

[tmva] Return target and weight data in columns from _batchgenerator.py

[tmva] Add tests for RBatchGenerator

[tmva] Add option to add multiple files as arguments for vertical concatenation

[tmva] Accept RDataFrame only when loaded from a ROOT file

[tmva] Broaden RDataFrame parameter to RNode

[tmva] Add files support to rdataframe version

[tmva] Add test for multiple files

[tmva] Add templates variables to dictionary

[tmva] Change tutorials and add comments

[tmva] Add support for multiple target columns

[tmva] Pass dataframe to RBatchChunkLoader as reference

[tmva] Add test for multiple target columns in RBatchGenerator

[tmva] Fix the referencing of RDataFrame for each class

[tmva] To be redesigned

[df] Add ChangeEntryRange

[df] Add ChangeBeginAndEndEntries

[df] Rename variables of ChangeBeginAndEndEntries function

[df] Add test for ChangeBeginAndEndEntries

[tmva] Create chunking for df with filters

[tmva] Add number of training and validation batches in RBatchGenerator

[tmva] Fix RBatchGenerator tutorials

[tmva] Add first changes of test to RBatchGenerator

[tmva] Implement rbatchgen tests, fix bug

[tmva] Return only train generator if validation split is 0

[tmva] Fix batch gen Tensorflow tutorial

[tmva] Fix test14 in batchgen

[tmva] Minor typo fix

[tmva] Remove pure stupidness in RBatchGenerator

[tmva] Remove unnecessary vector in RBatchGenerator

[tmva] Remove superfluous function overload in RBatchGenerator

[tmva] Remove unused class variables

[tmva] RBatchGenerator fix std::variant and place while do instead of lambda recursion

[tmva] Check if dataframe's data source is of TTree origin

in the middle of creation

[tmva] Add restore chunk loader to 0 after activation

[tmva] Remove mutexes from Validation batch loading

[tmva] Add shuffling and drop_remainder

[tmva] Add support for filtered dataframe in RBatchGenerator

[tmva] RBatchGenerator with filters last batches

[tmva] Fix adding too many batches

[tmva] Using numpy asarray as a bridge between RVec and torch.tensor() and padding vector columns with .resize() and std::copy

[tmva] change RBatchGenerator tensorflow output to "Any" type

[tmva] RBatchGenerator with padding RVec AssignVector() in RChunkLoader.hxx

[tmva] adding LastBatches handler for RBatchGenerator

[tmva] numpybridge_with_stdfill_stdcopy_rchunkloader first commit

[tmva] Apply formatting changes to sources

[tmva] Refactor tensor filling in RChunkLoader

[tmva] Always return a tuple of training and (optionally) validation generators

[df] Fix signature of ChangeEmptyEntryRange

The entry number is of type `Long64_t`, so use the same type for the input parameters.

Co-authored-by: Kristupas Pranckietis <[email protected]>
Co-authored-by: Nopphakorn Subsa-Ard <[email protected]>
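
Several of the commit messages above deal with batch boundaries: carrying leftover rows from one chunk into the next, the `drop_remainder` argument, and iterating with `next()`. The following is a minimal pure-Python sketch of those semantics, not the RBatchGenerator implementation. The function name `iter_batches` and its parameters are hypothetical and chosen only for illustration.

```python
from typing import Iterable, Iterator, List, Sequence


def iter_batches(
    chunks: Iterable[Sequence[int]],
    batch_size: int,
    drop_remainder: bool = False,
) -> Iterator[List[int]]:
    """Yield fixed-size batches from a stream of chunks (hypothetical helper).

    Rows that do not fill a whole batch at the end of a chunk are kept and
    prepended to the next chunk, so no values are missed at chunk boundaries.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")

    leftover: List[int] = []
    for chunk in chunks:
        rows = leftover + list(chunk)
        # Number of rows that fit into complete batches.
        full = len(rows) - len(rows) % batch_size
        for start in range(0, full, batch_size):
            yield rows[start:start + batch_size]
        # Save the remaining rows for the next chunk.
        leftover = rows[full:]

    # The final, smaller batch is emitted only if drop_remainder is False.
    if leftover and not drop_remainder:
        yield leftover


# Example data: three chunks of unequal size, batch size 4.
chunks = [[0, 1, 2, 3, 4], [5, 6, 7], [8, 9]]
all_batches = list(iter_batches(chunks, batch_size=4))
full_batches = list(iter_batches(chunks, batch_size=4, drop_remainder=True))

# A generator can also be advanced manually with next().
gen = iter_batches(chunks, batch_size=4)
first_batch = next(gen)
```

With `drop_remainder=False` the last batch may have fewer rows than `batch_size`, which is the situation the "batches of different row size" message describes for the NumPy and PyTorch generators.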
3 people committed Nov 11, 2024
1 parent b2da371 commit f7342e4
Showing 19 changed files with 2,333 additions and 966 deletions.

Large diffs are not rendered by default.

7 changes: 7 additions & 0 deletions bindings/pyroot/pythonizations/test/CMakeLists.txt
@@ -192,3 +192,10 @@ ROOT_ADD_PYUNITTEST(pyroot_tcomplex tcomplex_operators.py)

# Tests with memory usage
ROOT_ADD_PYUNITTEST(pyroot_memory memory.py)

# rbatchgenerator tests
# TODO: We currently do not support TensorFlow for Python >= 3.12 (see requirements.txt)
# Update here once that is fixed.
if (NOT MSVC AND Python3_VERSION VERSION_LESS 3.12)
  ROOT_ADD_PYUNITTEST(batchgen rbatchgenerator_completeness.py PYTHON_DEPS numpy tensorflow torch)
endif()
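
The CMake guard above registers the test only when the compiler is not MSVC and the Python version is below 3.12. As a rough illustration, the same condition can be written on the Python side, for example to decide whether to skip TensorFlow-dependent tests at run time. The helper name `tensorflow_tests_enabled` and its parameters are hypothetical and are not part of ROOT or of this commit.

```python
import sys
from typing import Sequence


def tensorflow_tests_enabled(version_info: Sequence[int], is_msvc: bool) -> bool:
    """Mirror the CMake condition: NOT MSVC AND Python version < 3.12.

    `is_msvc` is passed in explicitly because the compiler used to build
    ROOT is a build-time property, not something Python can infer reliably.
    """
    major_minor = tuple(version_info[:2])
    return (not is_msvc) and major_minor < (3, 12)


# Evaluate the condition for the running interpreter, assuming a non-MSVC build.
enabled_here = tensorflow_tests_enabled(sys.version_info, is_msvc=False)
```

A test module could combine such a check with `unittest.skipUnless` so that the skip reason appears in the test report instead of the test being silently absent.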