After I coded up an initial implementation that aggregates schema.org Recipe info from multiple entities in HTML page metadata (see #1381), a number of tests began failing.
This issue tracks a double-check of what the expected test values should really be, with corrections where necessary, for the following affected scraper fields:
Argiro: this is a case where data is provided both as ld+json (JSON-LD, i.e. JSON Linked Data) and as HTML microdata (itemprop and itemtype HTML attributes). The data contained in each varies slightly; for the core recipe fields it appears mostly consistent. The linked data (JSON format) appears more complete.
Ethan Chlebowski: on this website, the recipe for a meal is sometimes presented as multiple nested recipes -- and indeed in the Huevos Rancheros test case, component recipes are provided for the salsa and pinto beans respectively. The multiple schema.org entities that we find on the page, therefore, are one entity for each of those recipes. This is tricky: we return a single recipe per URL at the moment, so I don't know what we can do about this -- unless we can extract them by referencing each one individually within the webpage using their URI anchor fragments?
Good Food Discoveries: has multiple entities; the first one contains the bulk of the information about the recipe, the second one contains mostly/only review data.
Womens Weekly: for the recipes on this site, both ld+json and microdata (itemprop, ...) are again provided. However, the only case where they overlap seems to be image URLs, and the results are fairly similar (the resizing/scaling parameters in the URLs differ).
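To make the cases above easier to reason about, here is a minimal sketch (not the implementation from #1381) of collecting every ld+json Recipe entity from a page and merging them so that the first entity to supply a field wins. It uses only the Python standard library and ignores microdata entirely. The names `JsonLdCollector`, `recipe_entities`, `aggregate` and `select_by_fragment` are hypothetical, and `select_by_fragment` assumes that each entity carries an `@id` ending in a URI fragment, which may not hold on the sites listed here.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collects the parsed contents of <script type="application/ld+json"> blocks."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.documents = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.documents.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # skip malformed blocks rather than failing the whole page


def recipe_entities(html):
    """Return every JSON-LD entity typed as Recipe, in document order."""
    collector = JsonLdCollector()
    collector.feed(html)
    collector.close()
    entities = []
    for doc in collector.documents:
        # A block may hold one entity, a list of entities, or an @graph wrapper.
        if isinstance(doc, dict) and "@graph" in doc:
            candidates = doc["@graph"]
        elif isinstance(doc, list):
            candidates = doc
        else:
            candidates = [doc]
        for item in candidates:
            if not isinstance(item, dict):
                continue
            types = item.get("@type", [])
            if isinstance(types, str):
                types = [types]
            if "Recipe" in types:
                entities.append(item)
    return entities


def aggregate(entities):
    """Merge entities field by field; the first non-empty value for a key wins."""
    merged = {}
    for entity in entities:
        for key, value in entity.items():
            if key not in merged and value not in (None, "", [], {}):
                merged[key] = value
    return merged


def select_by_fragment(entities, fragment):
    """Assumption: pick one entity by the fragment at the end of its @id, if any."""
    for entity in entities:
        if str(entity.get("@id", "")).endswith("#" + fragment):
            return entity
    return None
```

First-wins merging fits the Good Food Discoveries pattern, where a second entity only adds review data. It would be wrong for Ethan Chlebowski, where each entity is a distinct recipe; there, something like `select_by_fragment` (or simply returning the first entity) seems closer to what is wanted.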