TypeError: PreTrainedTokenizerFast._batch_encode_plus() got an unexpected keyword argument 'tokenize_newline_separately' #791

DigitalPathology · 2024-10-12T11:22:28Z

I would like to perform an object detection task with a VQA model using the autotrain API. I followed this guide and prepared the metadata.json accordingly. It has three columns: "file_name", "question", and "multiple_choice_answer".
A sample record from the dataset:

{"file_name": "1.mrxs__12214_50922_512_512.png", "question": "This image is from 3DHistech Scanner. Where is the mitosis location(four properties of the bounding box: top left x coordinate, top left y coordinate, width, height) in this image?", "multiple_choice_answer": [[181, 199, 43, 42]]}

I tried both the google/paligemma-3b-ft-coco35l-448 and google/paligemma-3b-mix-448 models for this purpose. I start the process with this command: autotrain --config config.yml

The dataset loads properly, and everything seems fine until training starts:

Here is the error:
