Allow to auto-magically extract nutrition facts from a photo #4896

Open
Tracked by #511
rugk opened this issue Dec 12, 2023 · 13 comments

Comments

@rugk

rugk commented Dec 12, 2023

Problem

I'm always frustrated when I have to enter the nutrition facts manually. 😉

Proposed solution

Especially given the recent developments of machine learning models getting more powerful, I guess an automatic OCR/ML recognition would be possible?
Couldn't you train an ML model based on your OpenFoodFacts data?

Additional context

Of course it has to be corrected manually, but it would already be a good starter.

Mockups

N/A
(likely similar to the other OCR features, e.g. the one for ingredients)

@teolemon
Member

@raphael0202

@teolemon
Member

@rugk we plan to automatically extract nutrition facts next year using Robotoff (our machine learning system)

@raphael0202

@rugk Yes it's a project we would like to do in 2024! Is it something you would be interested in contributing to?

@rugk
Author

rugk commented Dec 18, 2023

Likely not technically, but testing for sure.

@teolemon changed the title from "OCR for nutrition facts" to "Allow to auto-magically extract nutrition facts from a photo" on Jan 1, 2024
@teolemon
Member

teolemon commented Jan 1, 2024

@teolemon
Member

@raphael0202 is actively working on this. More updates soon.

@raphael0202

raphael0202 commented Oct 29, 2024

The nutrient extraction model has been deployed and integrated into Robotoff. For every new image, we run the model on it and generate a prediction. An insight is generated if at least one of the extracted nutrient values is not present in the product's current nutrients.
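
The rule for when an insight is generated can be sketched in a few lines of Python. This is an illustration only, not Robotoff's actual code: the function name and the flat dictionary shapes (keys such as `proteins_100g`) are assumptions made for the example.

```python
def should_generate_insight(extracted: dict, current: dict) -> bool:
    """Return True if at least one extracted nutrient is absent from the product.

    `extracted` and `current` are hypothetical dicts keyed by nutrient name.
    """
    return any(name not in current for name in extracted)


# A product with no nutrition values still yields an insight.
print(should_generate_insight({"proteins_100g": 11}, {}))
# Every extracted nutrient already exists on the product: no insight.
print(should_generate_insight({"proteins_100g": 11}, {"proteins_100g": 10}))
```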

To get the insights:

GET https://robotoff.openfoodfacts.org/api/v1/insights?insight_types=nutrient_extraction&barcode={BARCODE}

Nutrient values are in insight.data. It contains:

  • entities: a subset of the extracted entities at different processing steps (raw, aggregated, postprocessed). Only postprocessed is present here; it is mostly useful as debug information, so this field can be ignored.
  • nutrients: a dictionary mapping nutrient name to a dict containing:
    • value: the value to add, without the unit
    • unit: the unit (can be g, mg, µg or null). If it's null, we couldn't extract it from the image (it's missing, the model was wrong, or the OCR result was not good enough). In that case I think we can safely use the "default" unit, which depends on the nutrient (as is done in Product Opener).
    • score: the entity score. Maybe not really relevant here, as this score is not calibrated (most values are > 0.98).
    • char_start, char_end: start and end character offsets in the original text
    • start, end: start and end word offsets in the original text
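
A minimal Python sketch of how a client might build the request URL and apply the "default unit" fallback described above. The `DEFAULT_UNITS` table, the `"g"` fallback, the nutrient keys and the sample payload are assumptions for illustration; the real defaults live in Product Opener, and the envelope around `insight.data` is not described in this thread.

```python
from urllib.parse import urlencode

ROBOTOFF_INSIGHTS_URL = "https://robotoff.openfoodfacts.org/api/v1/insights"

# Hypothetical per-nutrient defaults, for illustration only.
DEFAULT_UNITS = {"sodium_100g": "g", "vitamin-c_100g": "mg"}


def insights_url(barcode: str) -> str:
    """Build the insights URL; urlencode joins parameters with '&'."""
    query = urlencode({"insight_types": "nutrient_extraction", "barcode": barcode})
    return f"{ROBOTOFF_INSIGHTS_URL}?{query}"


def resolve_units(nutrients: dict, default_units: dict, fallback: str = "g") -> dict:
    """Map each nutrient name to (value, unit), filling in null units."""
    resolved = {}
    for name, entry in nutrients.items():
        # A null unit means it could not be extracted from the image.
        unit = entry.get("unit") or default_units.get(name, fallback)
        resolved[name] = (entry["value"], unit)
    return resolved


# Assumed shape of insight.data["nutrients"], based on the field list above.
sample_nutrients = {
    "proteins_100g": {"value": "11", "unit": "g", "score": 0.99},
    "vitamin-c_100g": {"value": "12", "unit": None, "score": 0.98},
}
print(insights_url("123"))
print(resolve_units(sample_nutrients, DEFAULT_UNITS))
```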

@teolemon
Member

cc @monsieurtanuki @g123k

  • We would go for an "Extract Nutrition Facts" button, very much like ingredients, only enabled if we have a precomputed nutrition insight already available (otherwise greyed out). That's a way to keep the user in charge, and avoid potential backlash on "unwanted/unhelpful help" from Robotoff.

  • The model is not language-dependent, so we don't need to care about selecting the image for a specific language.

  • Additional values provided by the model would be orange, unchanged values (also provided by the model) would stay normal.

  • In the future, we might nudge the user for a new photo if we deem it too old, but the result won't be instant (photo, background task, inference, reloading). We might use animations (on the button and/or the fields) to hint that new values have arrived. The button matters because the updated fields might not all be visible above the fold.
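
The highlighting rule from the bullets above (values added by the model in orange, unchanged values left as-is) could be computed along these lines. The function name, the category labels and the third `"changed"` category are assumptions; that bullet only describes the first two cases.

```python
def classify_fields(predicted: dict, current: dict) -> dict:
    """Label each predicted nutrient so the UI can decide how to render it."""
    labels = {}
    for name, value in predicted.items():
        if name not in current:
            labels[name] = "added"      # would be shown in orange
        elif current[name] == value:
            labels[name] = "unchanged"  # would stay normal
        else:
            labels[name] = "changed"    # assumption: not covered by the bullet
    return labels


labels = classify_fields(
    {"proteins_100g": 11, "fat_100g": 3, "salt_100g": 1},
    {"proteins_100g": 10, "fat_100g": 3},
)
print(labels)
```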

@monsieurtanuki
Contributor

@raphael0202

> For every new image

No history, then.

> we run the model on it and generate a prediction

How fast? like, 10 seconds?

> An insight is generated if in the extracted nutrient values, at least one value is not present in the current nutrients.

Of course that'll make more sense for new products.
What if the robotoff value is different from the current value: do you still send it?

@raphael0202

> No history, then.

I'm going to process the full image backlog in the coming weeks.

> How fast? like, 10 seconds?

Yes, about 10s (we're still running on CPU)

> Of course that'll make more sense for new products.

Just to be clear, if the product has no nutrition values, we still generate an insight of course.

> What if the robotoff value is different from the current value: do you still send it?

Yes, we currently only generate an insight if the model predicted nutrition values that are not present in the original product. Note that the model can extract both _serving and _100g values.

We discussed a bit the integration the other day with @teolemon, and we came to the conclusion that when clicking on this "extract" button, we could overwrite the product nutrient values that conflict (=because they are already present) with the model prediction.

Also, we consider images from newest to oldest, which means that the image we extract nutrition values from is not necessarily the selected image: a more recent image may be used instead.
The image used is indicated in the insight.source_image field returned by the route.

@raphael0202

Here are some mockups done by @teolemon to illustrate the behaviour we discussed :)

Behaviour if no insight is available (greyed button):

Image

Behaviour if an insight is available:

Image

Behaviour once the user clicked on the button:
Image

The idea would be to perform the GET https://robotoff.openfoodfacts.org/api/v1/insights?insight_types=nutrient_extraction?barcode={BARCODE} either before (on product scan?) or after the user goes on the nutrition page.

Before is maybe not the best idea, as the model could run in the meantime. If it's after, we should probably add a loader to the extract button to show we're performing the request.

The idea of greying out the button is to avoid disappointing the user when the model failed to extract anything.
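
The three button behaviours discussed above (loader while the request is in flight, greyed out when no insight exists, enabled otherwise) can be summarised as a small state function. This is a sketch only; `ButtonState` and `extract_button_state` are hypothetical names, not part of the app or of off-dart.

```python
from enum import Enum
from typing import Optional


class ButtonState(Enum):
    LOADING = "loading"    # request to Robotoff still in flight
    DISABLED = "disabled"  # no insight available: greyed-out button
    ENABLED = "enabled"    # an insight is available


def extract_button_state(request_done: bool, insights: Optional[list]) -> ButtonState:
    """Decide how to render the "Extract Nutrition Facts" button."""
    if not request_done:
        return ButtonState.LOADING
    if not insights:
        return ButtonState.DISABLED
    return ButtonState.ENABLED


print(extract_button_state(False, None))
print(extract_button_state(True, []))
print(extract_button_state(True, [{"type": "nutrient_extraction"}]))
```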

@monsieurtanuki
Contributor

> I'm going to process the full image backlog in the coming weeks.

Cool!

> Yes, about 10s (we're still running on CPU)

OK

> Just to be clear, if the product has no nutrition values, we still generate an insight of course.

OK

> Yes, we currently only generate an insight if the model predicted nutrition values that are not present in the original product. Note that the model can extract both _serving and _100g values.
> We discussed a bit the integration the other day with @teolemon, and we came to the conclusion that when clicking on this "extract" button, we could overwrite the product nutrient values that conflict (=because they are already present) with the model prediction.

A bit confusing. Not clear what you do if in off the product has 10g of proteins and robotoff guesses it's 11g: the value is present but is different.

Not focused on the UI/UX for the moment.
I'm rather focused on the off-dart aspect, and I have trouble testing the feature with GETs: any hint?

@raphael0202

> A bit confusing. Not clear what you do if in off the product has 10g of proteins and robotoff guesses it's 11g: the value is present but is different.

Here we would overwrite the value of the nutrient.
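
As a sketch of that overwrite behaviour (assuming flat dicts keyed by nutrient name; `apply_prediction` is a hypothetical helper, not an existing off-dart or Robotoff function):

```python
def apply_prediction(current: dict, predicted: dict) -> dict:
    """Return the product nutrients after the user clicks "extract".

    Predicted values win on conflict; nutrients the model did not
    predict are left untouched.
    """
    merged = dict(current)    # copy so the input is not mutated
    merged.update(predicted)  # overwrite conflicts, add missing values
    return merged


# The product has 10 g of proteins and the model predicts 11 g.
merged = apply_prediction(
    {"proteins_100g": 10, "fat_100g": 3},
    {"proteins_100g": 11, "salt_100g": 1},
)
print(merged)
```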

> I'm rather focused on the off-dart aspect, and I have trouble testing the feature with GETs: any hint?

A typo slipped into the URL (? instead of &) ;)

Wrong: https://robotoff.openfoodfacts.org/api/v1/insights?insight_types=nutrient_extraction?barcode={BARCODE}

Correct: https://robotoff.openfoodfacts.org/api/v1/insights?insight_types=nutrient_extraction&barcode={BARCODE}
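
To see why the first URL fails, it helps to parse both query strings with Python's standard library. The barcode `123` is a placeholder; how the server reacts to the malformed value is not shown here.

```python
from urllib.parse import parse_qs, urlsplit

BASE = "https://robotoff.openfoodfacts.org/api/v1/insights"


def query_params(url: str) -> dict:
    """Parse the query string of a URL into a dict of value lists."""
    return parse_qs(urlsplit(url).query)


# With a second '?', the barcode is swallowed into the insight_types value.
broken = query_params(BASE + "?insight_types=nutrient_extraction?barcode=123")
# With '&', both parameters are parsed separately.
fixed = query_params(BASE + "?insight_types=nutrient_extraction&barcode=123")

print(broken)  # {'insight_types': ['nutrient_extraction?barcode=123']}
print(fixed)   # {'insight_types': ['nutrient_extraction'], 'barcode': ['123']}
```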


4 participants