Added all optional arguments and extraction of loss function values #8
I have updated the train_topic_model function to accept any argument that MALLET accepts, passed as kwargs. The only difference is that '-' in an argument name is replaced with '_', e.g. num-iterations --> num_iterations. Numeric values can be passed either as numbers or as strings.
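For illustration, the name mapping described above could look roughly like the following. This is a minimal sketch and not the code in this PR: kwargs_to_mallet_flags is a hypothetical helper name, and it assumes every option is passed to mallet as a `--name value` pair.

```python
def kwargs_to_mallet_flags(**kwargs):
    """Turn Python keyword arguments into a flat list of command-line flags.

    Hypothetical helper for illustration only. Assumes each option is
    written as '--option-name value' on the command line.
    """
    flags = []
    for name, value in kwargs.items():
        # num_iterations -> --num-iterations
        flags.append("--" + name.replace("_", "-"))
        # Accept numbers or strings; the command line only takes strings.
        flags.append(str(value))
    return flags


flags = kwargs_to_mallet_flags(num_iterations=500, optimize_interval="10")
print(flags)
```

Because every value goes through `str()`, `num_iterations=500` and `num_iterations="500"` produce the same flags.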
It is backwards compatible in that all of the mandatory arguments are kept. The only thing removed is the default value of --optimize-interval 10 inside the function. It now uses MALLET's default value of 0, and the previous behaviour can be restored by passing optimize_interval=10 as an argument. This lets users specify hyperparameter values themselves when optimization is not wanted (for example by setting alpha=0.05).
Returning the loss function values gathered during training is now supported as well. If logperplexity=True, the loss values are scraped from the output and returned in a list (they are still printed as usual). This option defaults to False.
The subprocess module is used to get the loss values. I saw #2 and had the same issue on Mac, but managed to resolve it (output is now printed without delay). I have not tested it on Windows, though I think it should work. If it does not, one option would be to use os by default and switch to subprocess only when logperplexity=True.
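As a rough sketch of the scraping approach (not the exact code in this PR): the output is read line by line as it arrives, echoed immediately, and any loss value is collected. The function names are hypothetical, and the pattern assumes the training progress lines contain `LL/token: <number>`; adjust it if the output looks different.

```python
import re
import subprocess

# Assumption: progress lines look like "<10> LL/token: -9.12345".
LL_PATTERN = re.compile(r"LL/token:\s*(-?\d+(?:\.\d+)?)")


def scrape_loss_values(lines):
    """Echo each line as it arrives and collect any loss values found."""
    values = []
    for line in lines:
        print(line, end="", flush=True)  # still printed as usual, no delay
        match = LL_PATTERN.search(line)
        if match:
            values.append(float(match.group(1)))
    return values


def run_and_scrape(command):
    """Run a command (list of strings) and return the scraped loss values."""
    process = subprocess.Popen(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merged in case progress goes to stderr
        text=True,
        bufsize=1,  # line-buffered so lines are handled as they are written
    )
    with process.stdout:
        values = scrape_loss_values(process.stdout)
    process.wait()
    return values
```

Keeping the parsing in `scrape_loss_values` separate from the process handling means it can be checked against a few saved lines of output without running a training job.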