Custom loss function #1539

Open
mariiapronesti01 opened this issue Jan 14, 2025 · 4 comments

Comments

@mariiapronesti01

Hi! I am trying to fine-tune Llama3.1 3B and would like to use a custom loss function.
I read in a past issue that I have to remove the causal LM head and replace it with my own. Since I am not an expert, could I ask for more details?

Thanks a lot and congrats for the great library!

@danielhanchen
Contributor

It's best to ask on Discord for this :) Generally it depends on why you're trying to make a custom loss - is it for classification? If yes, then we're actively working on making AutoModelForSequenceClassification work - hopefully by next week

@mariiapronesti01
Author

I'll join the channel then!
No, it's not classification. I am working with graphs, and I would like to use a custom metric that measures the distance between two graphs as the loss function.

@Gladiator07

> It's best to ask on Discord for this :) Generally it depends on why you're trying to make a custom loss - is it for classification? If yes, then we're actively working on making AutoModelForSequenceClassification work - hopefully by next week

Awesome, much awaited! Will there be an announcement post? This will be a really nice addition!

Also, will it be possible to init the model with num_labels, as in Hugging Face's implementation, and then subclass Trainer and override the compute_loss function, so that the same architecture can be used in a reward-model setting (Bradley-Terry loss), etc.?

@danielhanchen
Contributor

Yes, the goal is to make the experience as seamless as normal Hugging Face, so whatever can be done in HF should work fine :)
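
For reference, the Bradley-Terry loss mentioned above can be sketched without any framework. Under the Bradley-Terry model, the probability that the chosen response beats the rejected one is sigmoid(r_chosen - r_rejected), so the per-pair loss is -log(sigmoid(r_chosen - r_rejected)). The function names below (`bradley_terry_loss`, `batch_loss`) are hypothetical and only illustrate the formula on plain floats; this is not Unsloth's or Hugging Face's implementation.

```python
import math


def bradley_terry_loss(chosen_reward: float, rejected_reward: float) -> float:
    """Negative log-likelihood that the chosen response is preferred.

    P(chosen > rejected) = sigmoid(r_c - r_r), so the loss is
    -log(sigmoid(r_c - r_r)) = softplus(r_r - r_c).
    """
    margin = chosen_reward - rejected_reward
    # Numerically stable softplus(-margin): exp() only ever sees a
    # non-positive argument, so it cannot overflow for large |margin|.
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


def batch_loss(chosen, rejected):
    """Mean pairwise loss over equal-length lists of scalar rewards."""
    losses = [bradley_terry_loss(c, r) for c, r in zip(chosen, rejected)]
    return sum(losses) / len(losses)
```

In a `Trainer` subclass, an overridden `compute_loss` would presumably apply the same formula to the scalar outputs of a model initialised with `num_labels=1`, for example via `torch.nn.functional.logsigmoid` on the difference between the chosen and rejected scores. How the chosen and rejected examples are batched together is an assumption left to the data collator.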
