Scraper request: bergamot.app #986
Comments
Hi @josefhelie - thanks for the question / feature request. In theory, yes this is possible - the webpage is public and represents a recipe. However, there are some potentially important items of information absent on the page: in particular, its origin (from another website? self-authored?) and the instructions. Do you know whether those details can be included when sharing a recipe like this from the app? It's difficult to develop and test without a few complete samples. |
I'm sorry, I shared a recipe that doesn't include all the requested fields. Here is a better example: https://dashboard.bergamot.app/shared/mIB4jYQtZU1A97 |
Yep, that initially looks good to me @josefhelie - it's difficult to say for certain without coding it up, but it seems to have most/all of the information we'd need. Thanks! |
Thanks a lot @jayaddison :) |
May I ask whether there is any update on this request, @jayaddison? |
Hi @josefhelie - apologies for my delayed reply. No further updates on this at the moment I'm afraid. Do you have any interest in learning some Python coding? |
@jayaddison I took a look, looks like it is fairly easy to call the API endpoint, which can be derived from the URL of the recipe. I'm not sure how the library normally supports the case of recipes being loaded via an API call after the original page load - I can see a few examples ( |
Thanks @mlduff!
About the handling of APIs: yep, well discovered - we do have a few scrapers that retrieve data using APIs at the moment. A potential design/architecture problem with that is that it (currently) tightly-couples the scraper to an HTTP client - namely Meanwhile we have a A long explanation, but the short answer is: yep, please go ahead, but be aware that this would currently only be supported in the v14 / mainline branch. |
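As a rough illustration of the idea discussed above (deriving an API endpoint from the recipe URL), here is a minimal sketch. It only shows how the share identifier could be pulled out of a `/shared/<id>` URL like the ones posted in this thread. The real Bergamot API endpoint is not stated here, so `API_TEMPLATE` and both function names are hypothetical placeholders, not the library's or Bergamot's actual API.

```python
from urllib.parse import urlparse

# ASSUMPTION: placeholder endpoint pattern. The real API URL is not given
# in this thread and would need to be confirmed before use.
API_TEMPLATE = "https://api.example.invalid/recipes/shared/{share_id}"


def share_id_from_url(url: str) -> str:
    """Return the trailing identifier from a /shared/<id> recipe URL."""
    parts = [segment for segment in urlparse(url).path.split("/") if segment]
    if len(parts) != 2 or parts[0] != "shared":
        raise ValueError(f"not a shared-recipe URL: {url}")
    return parts[1]


def api_url_for(url: str) -> str:
    """Build the (hypothetical) API URL for a shared-recipe page URL."""
    return API_TEMPLATE.format(share_id=share_id_from_url(url))
```

Keeping the URL parsing separate from any HTTP call would also make it easier to test without network access, which may help with the client-coupling concern mentioned above.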
@mlduff also a design / implementation question for your consideration: those recipes sometimes contain a link to the original source of the recipe. Should we return that as the canonical URL for recipes when possible? |
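One possible way to express that preference, as a sketch only: return the original source link when the recipe data has one, and otherwise fall back to the shared Bergamot URL. The `sourceUrl` key and the function name are assumptions made for illustration; the actual field name in Bergamot's data is not given in this thread.

```python
from typing import Any, Mapping


def pick_canonical_url(payload: Mapping[str, Any], shared_url: str) -> str:
    """Prefer the original source link; otherwise use the shared page URL."""
    # ASSUMPTION: "sourceUrl" is a hypothetical key name for the source link.
    source = payload.get("sourceUrl")
    if isinstance(source, str) and source.startswith(("http://", "https://")):
        return source
    return shared_url
```

Falling back to the shared URL keeps the result usable for self-authored recipes that have no external source.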
@jayaddison is your preference for me to develop this in the v15 branch? If I implement it in v14 (which seems easier), will it then need rewriting at some point? And will the other scrapers, like the examples I found, also need similar rewriting? |
Good point, will try to do that. |
I'd recommend implementing it for |
@jayaddison I noticed that the tests for the two scrapers I mentioned above are located under the legacy section - do I add my tests under there as well? |
@josefhelie are you able to provide a couple more recipe URLs please so I can test? |
@mlduff yep, that's the correct place for those; thanks for checking 👍 You should be able to configure the |
@josefhelie have you found any pages shared on Bergamot where the original author is credited? I've seen a few pages that show the domain name of the source URL. I'm wondering whether there are any that list names/usernames. |
@jayaddison I'm not sure I have. Would it help if you gave me a recipe that I could import into Bergamot, so that I could then send you the link to the imported recipe? |
@josefhelie Here is one that has an author https://www.bestrecipes.com.au/recipes/peanut-butter-cookies-recipe/fowk6kuy |
I imported it in my Bergamot, here it is: https://dashboard.bergamot.app/shared/REbGkQaNoVJ5kM |
Thanks @josefhelie - so roughly speaking, it seems like some source recipes may include author info, and the Bergamot page includes a link back to the original, but our scraper can't directly retrieve the author details at the moment (they're not in the Bergamot page, so it seems like we'd have to ask Bergamot to add those, or retrieve them ourselves from the original URL). I'm not completely sure what to do here; I personally place quite a lot of importance on retaining the author name/info (even though it's challenging sometimes) because my assumption is that a lot of recipe authors themselves would want that to be included when people view their recipes. I haven't contacted Bergamot to ask whether they'd consider attempting to include that info themselves, so that's one option I'm considering. Is there a support/feedback option in the app itself? |
I'm currently using the free app Bergamot (which is closed source) to store my recipes, but I'd like to move to Mealie. I've encountered an error message that says, 'recipe_scrapers was unable to scrape this URL.' Is it possible to get a scraper, please? 😇
Thanks for your help.
A link to a shared recipe: https://dashboard.bergamot.app/shared/T8IJLjbtHdh2pj