Add `allenai/OLMoE-1B-7B-0924`
This is a new MoE model which I'd like to use with TL. Notes:

- `transformers` hasn't released a version with OLMoE support yet. We can update `pyproject.toml` to point to it instead of GitHub once it's released. Will leave this as a draft until then.
- `router_aux_loss_coef` / `router_z_loss_coef`: I don't plan on training OLMoE in TL, so there's no need for these coefficients.
- `norm_topk_prob` defaults to `False` in `transformers` and I don't plan to use it (see the sketch after this list for a quick way to inspect these fields).
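As a quick way to double-check these fields, the released config can be inspected directly. This is just a sketch and not part of the PR; it assumes a `transformers` install that already includes OLMoE support (e.g. from the GitHub main branch, per the first note):

```python
# Sketch only: read the MoE-related fields off the Hugging Face config.
# The field names come from the notes above; getattr guards against any
# that this checkpoint's config does not define.
from transformers import AutoConfig

cfg = AutoConfig.from_pretrained("allenai/OLMoE-1B-7B-0924")
for field in ("router_aux_loss_coef", "router_z_loss_coef", "norm_topk_prob"):
    print(field, "=", getattr(cfg, field, "<not set>"))
```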
Commenting out `add_bos_token=True`
This is a temporary fix. When running with neither location commented out:
Commenting out the location mentioned in the stack trace (`HookedTransformer.py:146`):
I'd appreciate advice on what's going wrong here. I'm a bit confused because I didn't change anything related to bos tokens (and e.g. the call to `AutoTokenizer.from_pretrained` in `HookedTransformer` always specifies `add_bos_token=True` but never `bos_token`).
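For context, a minimal sketch of the kind of call involved, with the model name filled in for illustration. This is not the exact TransformerLens code, and per the report above the `add_bos_token=True` call may be exactly the one that fails:

```python
# Rough sketch, not the exact HookedTransformer code: the tokenizer is loaded
# with add_bos_token=True, but bos_token itself is never set explicitly.
from transformers import AutoTokenizer

# Check what BOS token the checkpoint defines by default; if it is None,
# that could be relevant to the error when add_bos_token=True is passed.
default_tok = AutoTokenizer.from_pretrained("allenai/OLMoE-1B-7B-0924")
print(default_tok.bos_token, default_tok.bos_token_id)

# Roughly the call in question (may reproduce the failure described above).
tokenizer = AutoTokenizer.from_pretrained(
    "allenai/OLMoE-1B-7B-0924",
    add_bos_token=True,
)
```

If the default `bos_token` prints as `None`, that might be why passing `add_bos_token=True` trips the tokenizer for this checkpoint even though nothing bos-related changed on the TL side; this is only a guess, though.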
Type of change
Checklist: