-
-
Notifications
You must be signed in to change notification settings - Fork 387
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
taxonomy: boolean added for european countries + ecobalyse id is adapted for organic ingredients #10929
base: main
Are you sure you want to change the base?
Conversation
…d to get ecobalyse ids
… the country is part of Europe (For Ecobalyse)
…ean countries, the ecobalyse id is adapted for organic ingredients
taxonomies/countries.txt
Outdated
@@ -3332,6 +3332,7 @@ country_code_3:en: AUT | |||
language_codes:en: de | |||
osm_relation:en: 16239 | |||
wikidata:en: Q40 | |||
ecobalyse_is_part_of_eu: 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for boolean, by convention in taxonomies we use properties like "vegan:en: yes"
can you replace "ecobalyse_is_part_of_eu: 1" by "is_part_of_eu:en: yes" ?
Unless Ecobalyse has specific countries it considers to be part of EU, let's make the property name is_part_of_eu:en:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright i'll change this !
Indeed, Ecobalyse considers specific countries to be part of EU sometimes in order to get realistic LCA (as an example the UK is not anymore in the EU but for transport impact it's considered so) so i would recommand to keep 'ecobalyse_is_part_of_eu'.
lib/ProductOpener/Ingredients.pm
Outdated
@@ -3322,6 +3339,86 @@ sub get_missing_ciqual_codes ($ingredients_ref) { | |||
return @ingredients_without_ciqual_codes; | |||
} | |||
|
|||
=head2 get_missing_ecobalyse_ids ($ingredients_ref) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@very-smartin can you update your branch to merge the current main branch in it, so that we can see only the differences with main?
e.g. here is a test PR where I took the ecobalyse branch, merged the main branch (and resolved the conflict). It's much easier to see the new changes only: https://github.com/openfoodfacts/openfoodfacts-server/pull/10930/files
…ean countries, marked as 'yes'
/update_tests_results |
/lint |
if (defined $ecobalyse_code) { | ||
$ingredient_ref->{ecobalyse_code} = $ecobalyse_code; | ||
# If the ingredient is organic, the 'ecobalyse_labels_en_organic:en' id is used | ||
if ($ingredient_ref->{labels} =~ /\ben:organic\b/) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This approach won't work if we have an organic ingredient for which we have a general ecobalyse match, but no organic ecobalyse match.
e.g. if we have "organic pear", in the ingredients, we don't have it in Ecobalyse:
ecobalyse:en: pear-eu
ecobalyse_origins_en_france:en: pear-fr
but we need to fallback on pear-eu.
So I suggest a different approach:
Based on the labels, origins etc. of the product, construct a list of prioritized ecobalyse suffixes.
e.g. if the product is "organic pear from france", we might want to get:
- ecobalyse_origins_france_label_organic
- ecobalyse_origins_european-union_label_organic
- ecobalyse_label_organic
- ecobalyse_origins_france
- ecobalyse_origins_european-uinion
- ecobalyse
(or maybe a different order, depending on what we consider more important: the origin or the label)
Then we check if we have those as ecobalyse properties for "en:pear", and we stop at the first one we find.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point indeed, i'll change my code consequently
@stephanegigandet Note : for now i've only added the code in case of organic ingredients, the code for the origin will be done later |
lib/ProductOpener/Ingredients.pm
Outdated
if ($ingredient_ref->{labels} =~ /\ben:organic\b/) { | ||
|
||
# Retrieve the correct ecobalyse code | ||
my $ecobalyse_code = get_inherited_property("ingredients", $ingredient_ref->{id}, "ecobalyse_labels_en_organic:en"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's easier to build a list of the possible suffixes, because even if we just want to support organic, we need to check those:
- ecobalyse_labels_en_organic:en
- ecobalyse_proxy_labels_en_organic:en
- ecobalyse_en_organic:en
- ecobalyse_proxy_en_organic:en
Here your current code checks 1. and 4.
You could add more if clauses to cover 2 and 3, but it's going to become difficult to read with a lot of duplicate code.
Instead just do something like this (pseudo code):
my @suffixes = ();
if (label organic) : push @suffixes, "_labels_organic"
push @suffixes, ''; # no suffix when we don't have anything
then a foreach loop on suffixes:
foreach my $suffix (@suffixes) {
foreach my $proxy ("", "_proxy") {
my $ecobalyse_code = get_inherited_property("ingredients", $ingredients_ref->{id}, "ecobalyse" . $proxy . $suffix)
if defined $ecobalyse_code {
..
last;
}
˚
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed it's a much more better solution, i hadn't seen it like this : new pr incoming !
/lint |
Quality Gate passedIssues Measures |
What
In order to get more Ecobalyse ids more precisely, a boolean for european countries has been added in the taxonomy.
Furthermore, the ecobalyse id en:ecobalyse_label_organic is retrieved for organic ingredients.