TypeError: PreTrainedTokenizerFast._batch_encode_plus() got an unexpected keyword argument 'tokenize_newline_separately' #791

Open
DigitalPathology opened this issue Oct 12, 2024 · 0 comments

DigitalPathology commented Oct 12, 2024

I would like to perform an object detection task with a VQA model through the autotrain API. I followed this guide and prepared metadata.json accordingly. It has three columns: "file_name", "question", and "multiple_choice_answer".
Sample record from the dataset:

{"file_name": "1.mrxs__12214_50922_512_512.png", "question": "This image is from 3DHistech Scanner. Where is the mitosis location(four properties of the bounding box: top left x coordinate, top left y coordinate, width, height) in this image?", "multiple_choice_answer": [[181, 199, 43, 42]]}
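A record in this format can be sanity-checked before training. The following is a minimal sketch, not part of autotrain: `check_record` and `REQUIRED_KEYS` are hypothetical names, the file is assumed to hold one JSON object per line, and the 512x512 image size is an assumption read off the sample file name.

```python
import json

# Column names taken from the sample record above.
REQUIRED_KEYS = ("file_name", "question", "multiple_choice_answer")


def check_record(line, image_size=(512, 512)):
    """Return a list of problems found in one metadata line (empty if none).

    image_size is an assumption based on the "_512_512" suffix in the
    sample file name; adjust it for other tile sizes.
    """
    try:
        record = json.loads(line)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc}"]
    if not isinstance(record, dict):
        return ["record is not a JSON object"]

    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in record]

    boxes = record.get("multiple_choice_answer", [])
    if not isinstance(boxes, list):
        problems.append("multiple_choice_answer is not a list")
        return problems

    width, height = image_size
    for box in boxes:
        # Each box is expected as [top-left x, top-left y, width, height].
        if not isinstance(box, list) or len(box) != 4:
            problems.append(f"box does not have four values: {box}")
            continue
        x, y, w, h = box
        if x < 0 or y < 0 or w <= 0 or h <= 0 or x + w > width or y + h > height:
            problems.append(f"box out of bounds: {box}")
    return problems


sample = (
    '{"file_name": "1.mrxs__12214_50922_512_512.png", '
    '"question": "Where is the mitosis location in this image?", '
    '"multiple_choice_answer": [[181, 199, 43, 42]]}'
)
print(check_record(sample))
```

Running the function over every line of the metadata file and printing only the non-empty results would rule out malformed records as a cause before looking at the training step.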

I tried both the google/paligemma-3b-ft-coco35l-448 and google/paligemma-3b-mix-448 models for this purpose. I start the process with this command: autotrain --config config.yml

It loads the dataset properly. Everything seems fine until training starts:
[Screenshot 1]

Here is the error:

[Screenshot 2]
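An "unexpected keyword argument" TypeError like the one in the title usually means the caller passes a keyword that the installed version of the callee does not accept, which often points to a version mismatch between packages. The sketch below is a generic diagnostic under that assumption, not a confirmed diagnosis of this issue: `accepts_kwarg`, `installed_version`, `old_style`, and `new_style` are hypothetical names, and the package list (including the PyPI name `autotrain-advanced`) is an assumption about the environment.

```python
import inspect
from importlib import metadata


def accepts_kwarg(func, name):
    """Return True if `func` declares `name` or a catch-all **kwargs.

    Note: a function with **kwargs may still forward the keyword to an
    inner call that rejects it, so True is not a guarantee.
    """
    params = inspect.signature(func).parameters
    if name in params:
        return True
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Toy stand-ins for an older and a newer signature (illustrative only).
def old_style(text, tokenize_newline_separately=True):
    return text


def new_style(text, padding=False):
    return text


print(accepts_kwarg(old_style, "tokenize_newline_separately"))
print(accepts_kwarg(new_style, "tokenize_newline_separately"))

# Versions worth including in a bug report (package names are assumptions).
for package in ("transformers", "tokenizers", "autotrain-advanced"):
    print(package, installed_version(package))
```

Posting the printed versions together with the full traceback as text, rather than as a screenshot, makes it easier for maintainers to reproduce the failure.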
