Feat. Postprocessing control - custom page separator, postprocess function etc #40
base: main
Changes:
- Feature added: specific pages can now be processed with the Python SDK using the "select_pages" param. Incorporates getomni-ai#23, getomni-ai#24 for the Python SDK.
- Workflow for the above feature: if select_pages is specified, create a new temporary PDF in the tempdir, follow the rest of the process as usual, and finally map the page number in the formatted markdown to the actual page number instead of the index.
- Raise a warning when both select_pages and maintain are used.
- Required adaptations and updates in messages, exceptions, types, processor, utils, etc.
Fixes/improvements:
- Memory-efficient PDF-to-image conversion, utilizing the paths-only option to directly get sorted image paths from the pdf2image API.
Misc:
- Bump the version tag.
- Documentation updated.
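The index-to-page mapping described in the workflow above could look roughly like the following. This is a minimal sketch, not the PR's actual code: the helper names `normalize_select_pages` and `map_index_to_page` are hypothetical, and it assumes page numbers are 1-based and that the temporary PDF keeps the selected pages in ascending order.

```python
from typing import Iterable, List, Union


def normalize_select_pages(select_pages: Union[int, Iterable[int]]) -> List[int]:
    """Return a sorted, de-duplicated list of 1-based page numbers (hypothetical helper)."""
    pages = [select_pages] if isinstance(select_pages, int) else list(select_pages)
    if any(p < 1 for p in pages):
        raise ValueError("page numbers are assumed to be 1-based (>= 1)")
    return sorted(set(pages))


def map_index_to_page(index: int, selected: List[int]) -> int:
    """Map a 0-based page index in the temporary PDF back to the original page number."""
    return selected[index]


# The temporary PDF would contain pages 2, 5, 7 at indices 0, 1, 2.
selected = normalize_select_pages([7, 2, 5])
actual_pages = [map_index_to_page(i, selected) for i in range(len(selected))]
print(actual_pages)
```

Sorting up front keeps the mapping a plain list lookup, which is why the sketch normalizes the input before anything else.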
- Added post_process_function param to override/skip Zerox's default format_markdown post-processing on the model's text output.
- Removed output_dir param and added output_file_path, which is more flexible for arbitrary file extensions.
- Added page_separator param (used when writing the consolidated output to the output_file_path).
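To illustrate the override/skip behaviour of post_process_function, here is a small sketch. It is an assumption-laden illustration rather than the SDK's implementation: `default_format` is a hypothetical stand-in for the default post-processing, `apply_post_process` is a hypothetical helper, and treating `None` as "skip" is assumed.

```python
from typing import Callable, Optional


def default_format(text: str) -> str:
    """Hypothetical stand-in for the default post-processing step."""
    return text.strip()


def apply_post_process(
    raw: str,
    post_process_function: Optional[Callable[[str], str]] = default_format,
) -> str:
    """Apply the given post-processor; None is assumed to mean 'skip post-processing'."""
    if post_process_function is None:
        return raw
    return post_process_function(raw)


print(repr(apply_post_process("  # Title  ")))             # default post-processing
print(repr(apply_post_process("  # Title  ", None)))       # skipped, raw model output
print(repr(apply_post_process("  # Title  ", str.upper)))  # user-supplied override
```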
I'll take a look and test this one as well. I like the |
@tylermaran, do you want to provide an option to pass a string with a fixed placeholder like {page_no} (if the placeholder is not found, we don't populate anything) and fill it with the page number while writing the output markdown file (if the user has chosen to write output)?
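The placeholder idea in this comment could be sketched as follows. The names `render_separator` and `join_pages` are hypothetical, and emitting the separator after each page's content is an assumption. `str.replace` is used instead of `str.format` so that a separator without the placeholder, or with unrelated braces, passes through unchanged.

```python
from typing import List, Tuple

PLACEHOLDER = "{page_no}"


def render_separator(page_separator: str, page_no: int) -> str:
    """Fill in the page number if the placeholder is present; otherwise return as-is."""
    return page_separator.replace(PLACEHOLDER, str(page_no))


def join_pages(pages: List[Tuple[int, str]], page_separator: str) -> str:
    """Concatenate page contents, emitting the rendered separator after each page (assumed placement)."""
    return "".join(
        content + render_separator(page_separator, page_no)
        for page_no, content in pages
    )


pages = [(1, "First page"), (2, "Second page")]
print(join_pages(pages, "\n<!-- page {page_no} -->\n"))
print(join_pages(pages, "\n---\n"))  # no placeholder, separator written unchanged
```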
Hey @pradhyumna85, thinking through this a bit more. Right now we return an array of objects (including page number, content, etc.). So for our day-to-day use I was doing:
    const result = await zerox(...);
    const aggregatedText = result.pages.map((el) => el.content).join('\n\n');
But if we're writing to the output directory, it makes sense to have a configurable page delimiter built in. Although something like |
@tylermaran, added the functionality for you to have a look at.
@tylermaran, could you review this PR for merging? Also, should I incorporate the new system prompt from #48?
…tic json with various models to validate vision capability instead of actual test: https://github.com/BerriAI/litellm/blob/fb523b79e9fdd7ce2d3a33f6c57a3679c7249e35/litellm/utils.py#L4974
To accommodate and resolve #37
Changes
Note: This PR adds changes on top of PR #39. If this PR is merged, it will include the changes from PR #39, so the previous PR will not need to be merged separately.
Edit:
Fixes #42