Duplicate proofs & prices sent to Open Prices #5689
Comments
@raphodn I'll have a look at it. |
You're right about the duplicate prices: openfoodfacts/open-prices#422. We need to decide whether the server silently ignores them or returns an error. If the latter, will it be easy to remove them from the "Pending contributions"? |
@raphodn I guess a duplicate price would rather trigger a warning: you don't add the duplicate price, you return a warning as a String result, and then it's up to me to decide what to do with it in Smoothie. |
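To make the "warning instead of error" idea concrete, here is a minimal sketch of a server-side check that skips a duplicate price and returns a warning string. It is only an illustration: `PriceKey`, `PriceStore`, and the choice of fields that define a duplicate are assumptions, not the Open Prices implementation.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class PriceKey:
    # Hypothetical definition of "same price": these fields are an assumption.
    product_code: str
    location_osm_id: int
    date: str
    price: float
    owner: str


@dataclass
class PriceStore:
    # In-memory stand-in for the prices table.
    seen: set = field(default_factory=set)

    def add(self, key: PriceKey) -> dict:
        """Store the price once; on a duplicate, store nothing and return a warning."""
        if key in self.seen:
            return {"created": False, "warning": "duplicate price ignored"}
        self.seen.add(key)
        return {"created": True, "warning": None}
```

With a response shaped like this, the client can still treat the call as finished and decide for itself whether to show the warning.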
@raphodn I wasn't able to reproduce the current issue systematically, though I have to admit that sometimes I don't understand why the prices weren't added immediately. What can be done:
|
I managed to reproduce the issue while adding several prices to an existing proof (that had no prices). See this price for instance: https://prices.openfoodfacts.org/products/4974507095172 |
Even though it's somehow a bug in the mobile app, it creates a data quality problem. Shouldn't that be handled by a general data cleaning process, maybe within robotoff? |
@raphael0202 That's interesting, because it looks like a bug on the server side:
|
Fixing the "delete a proof with prices" issue ASAP. But it's not related to the problem at hand ^^ |
I also created an issue to detect/ignore duplicate proofs, but it'll be a bit more complex to implement (we'll need to calculate and store the image hash) : openfoodfacts/open-prices#514 |
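As a rough sketch of the hash-based detection mentioned above: compute a digest of the uploaded image bytes and look it up before storing a new proof. The names `image_hash` and `ProofRegistry` are hypothetical, and SHA-256 is an assumed choice. An exact digest only catches byte-identical files; a re-encoded or resized copy of the same photo would need a perceptual hash instead.

```python
import hashlib


def image_hash(data: bytes) -> str:
    """Return the hex SHA-256 digest of the raw image bytes."""
    return hashlib.sha256(data).hexdigest()


class ProofRegistry:
    """In-memory stand-in for a proofs table with a stored image hash."""

    def __init__(self) -> None:
        self._proof_id_by_hash: dict[str, int] = {}

    def register(self, proof_id: int, data: bytes) -> int:
        """Return the id of the proof that owns this image.

        If the same bytes were already registered, the existing id is
        returned and the new proof id is not stored.
        """
        digest = image_hash(data)
        return self._proof_id_by_hash.setdefault(digest, proof_id)
```

A caller can compare the returned id with the one it passed in to know whether the upload was a duplicate.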
@raphodn Any idea what it's about? |
It's strange: I tried to add prices with unusual product_codes (with spaces, with padded zeros, with an X at the end) and they all work. And this is how 4974507095172 looks in the prod db
|
@raphodn As an old-timer that still works with RDBMS, let me ask you this: is your "create single price" operation atomic? |
@raphodn Looks like you send some |
The product doesn't exist in OFF : https://fr.openfoodfacts.org/produit/4974507095172
When adding a price, we check if the product_code is associated with an existing Product (in the Open Prices Product table / as we sync daily with the OFF products)
What's missing here is an actual POST body that reproduces the "product_id can only have letters or digits" error. |
Agreed, but I expected an empty array instead of null for
Btw I expected it to be an
Actually, you may have to check in Beauty, Pet Food and Products too. |
Where do you see that? In both the message above and the API response, it displays an empty list.
Yes, the product_id should be the Product DB internal ID. It's not set by the user anywhere. So I still don't see where the issue comes from. ^^
we already do :) and store the flavor in the |
Another one that needs to get cleaned up ... https://prices.openfoodfacts.org/users/professordoc |
@raphodn That's my interpretation: Anyway, in order to avoid those kinds of misunderstandings, should I switch all fields to nullable? |
@raphodn Not sure if you meant "labels_tags cannot be null with the current implementation", therefore here's an example: {
"id": 7095,
"product_id": 3482592,
"location_id": 3129,
"proof_id": 3317,
"product": {
"id": 3482592,
"code": "4068706054488",
"source": "off",
"product_name": "Katzenzungen",
"image_url": null,
"product_quantity": 0,
"product_quantity_unit": null,
"categories_tags": [
"en:snacks",
"en:sweet-snacks",
"en:biscuits-and-cakes",
"en:biscuits-and-crackers",
"en:biscuits",
"en:cat-tongue"
],
"brands": "Choceur, Stollwerck",
"brands_tags": [
"choceur",
"stollwerck"
],
"labels_tags": null,
"nutriscore_grade": "unknown",
"ecoscore_grade": "c",
"nova_group": null,
"unique_scans_n": 0,
"price_count": 1,
"location_count": 0,
"user_count": 0,
"proof_count": 0,
"created": "2024-09-24T17:14:12.605450Z",
"updated": "2024-09-24T17:14:12.697090Z"
},
"location": {
"id": 3129,
"type": "OSM",
"osm_id": 4966187139,
"osm_type": "NODE",
"osm_name": "Carrefour Market",
"osm_display_name": "Carrefour Market, 37, Rue de Lyon, Quartier des Quinze-Vingts, Paris 12e Arrondissement, Paris, Île-de-France, France métropolitaine, 75012, France",
"osm_tag_key": null,
"osm_tag_value": null,
"osm_address_postcode": "75012",
"osm_address_city": "Paris",
"osm_address_country": "France",
"osm_address_country_code": null,
"osm_lat": 48.8484997,
"osm_lon": 2.3709505,
"website_url": null,
"price_count": 29,
"user_count": 3,
"product_count": 4,
"proof_count": 17,
"created": "2024-02-14T14:46:34.432318Z",
"updated": "2024-08-28T00:53:19.638393Z"
},
"proof": {
"id": 3317,
"location_id": null,
"file_path": "0004/Gti5y7BA52.jpg",
"mimetype": "image/jpeg",
"type": "PRICE_TAG",
"image_thumb_path": "0004/Gti5y7BA52.400.jpg",
"location_osm_id": null,
"location_osm_type": null,
"date": "2024-01-02",
"currency": "USD",
"price_count": 0,
"owner": "openfoodfacts-dart",
"created": "2024-10-11T14:01:49.534898Z",
"updated": "2024-10-11T14:01:49.736850Z"
},
"product_code": "4068706054488",
"product_name": null,
"category_tag": null,
"labels_tags": null,
"origins_tags": null,
"price": 3.99,
"price_is_discounted": false,
"price_without_discount": null,
"price_per": null,
"currency": "USD",
"location_osm_id": 4966187139,
"location_osm_type": "NODE",
"date": "2024-01-02",
"owner": "openfoodfacts-dart",
"created": "2024-10-11T14:01:50.029748Z",
"updated": "2024-10-11T14:01:50.037938Z"
} |
@tradmangh Well, the guy obviously loves Katzenzungen: who are we to judge? 😉 |
I see... OFF doesn't return that field for this product_code 😅 : https://world.openfoodfacts.org/api/v2/product/4068706054488
so indeed the default value isn't managed well during the sync with OFF 😬 |
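One way to handle the missing field during the sync is to normalize absent or null list fields to an empty list before saving. The sketch below is illustrative only: `normalize_product` and `LIST_FIELDS` are hypothetical names, and the three fields listed are simply the ones discussed in this thread.

```python
# List-valued fields that should never be stored as null (assumed set).
LIST_FIELDS = ("categories_tags", "brands_tags", "labels_tags")


def normalize_product(off_product: dict) -> dict:
    """Return a copy where missing or null list fields become empty lists."""
    product = dict(off_product)
    for name in LIST_FIELDS:
        if product.get(name) is None:
            product[name] = []
    return product
```

Applied to a product for which OFF returns no `labels_tags`, the stored value becomes `[]` rather than `null`, which matches what the client expects.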
I'll run some scripts tonight or this weekend to clean up these (and other) duplicates |
Of course you know that as long as professordoc uses the app, Katzenzungen will be added something like every 5 seconds.
|
"Il faut imaginer Sisyphe heureux" |
How is this linked to the current issue ? |
I explained that quickly above, let's focus on that again. You manage to add a price, and return the price. Cool.
|
@raphodn I confirm that with a |
Oh wow 👀 that's kinda extreme. I did a quick check: 132791 products are affected by labels_tags=None, 126213 by categories_tags=None, and 59504 by brands_tags=None. I did a quick SQL update to fix them all in prod. |
Great! Obviously we need to be more resilient in off-dart. NOT NULLABLE
I guess that for those
Nullable fields
|
I've just created #5693 because the "not uploaded immediately" part has nothing to do with the duplicates. |
ok everything should be fixed :)
the product/user/location stat counts will be updated tonight (every start of week) |
@raphodn Splendid! Still needed: what to do with future duplicates, right? |
Indeed ! but it's more a feature than a bug, so I will probably prioritize other stuff before. The related issues are: |
@raphodn So, what now? Should we close the current issue? Is there anything you expect to be coded or fixed in Smoothie or in off-dart? |
Well, I'd expect any re-user of the API (web, mobile, third-party) not to retry calling the API if the server responds with a success status (20X). In our case, that's a POST on /proofs & /prices that returns a 201. Retrying to create the price when the server has responded with a 201, confirming that the price has indeed been created, just because the response body is not 100% on par with the API specs: that's something to be fixed. The API will evolve, or an edge case will happen, and the problem of the mobile app sending dozens of identical prices will resurface 🤷 |
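The rule described here can be sketched on the client side as follows: the status code alone decides whether the upload is done, and a body that fails to parse never triggers a retry. This is a minimal illustration in Python, not the Smoothie or off-dart code; `handle_response` and its return shape are assumptions.

```python
import json


def handle_response(status_code: int, body: str) -> dict:
    """Decide whether an upload is finished based on the HTTP status only."""
    if 200 <= status_code < 300:
        # The server confirmed the creation: never retry, even if the
        # body cannot be parsed or does not match the expected schema.
        try:
            parsed = json.loads(body)
        except ValueError:
            parsed = None
        return {"uploaded": True, "price": parsed}
    # Not a success status: the task stays pending (retry policy is out of scope).
    return {"uploaded": False, "price": None}
```

With this shape, a 201 with an unexpected body removes the task from the pending list instead of re-sending the same price.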
@raphodn Sounds fair. Then of course it means that the status code is always reliable, as it would be from now on the only thing that matters. |
@raphodn I may limit my changes to That's the only case where I pretend to care about the returned JSON when it's not really the case, and the impact on the server is dramatic (prices constantly added) if the JSON seems corrupted (according to my standards). Other not relevant cases:
You know that we cannot be as responsive with an app as with a web app, given that the users upgrade their apps when they want. |
Mainly fixed within Prices; minor related fix in openfoodfacts/openfoodfacts-dart#987. |
What
I regularly see dozens of duplicate prices being uploaded to Open Prices.
When looking at the source, it indicates "Smoothie - Open Food Facts".
Some users are reporting that the prices are not uploaded immediately, but stay dormant in the "Pending contributions", and are then lost or uploaded multiple times.
cc @raphael0202 @tradmangh who've faced this issue
Screenshot/Mockup/Before-After