Option "do not scrape duplicates" #48

maksbdev · 2024-04-24T15:26:46Z

When scraping several similar queries (or one query in nearby locations), the same organization often appears in more than one set of search results. This produces duplicate entries in the output file and adds to the time needed to complete the task. Would it be possible to add an option that skips an organization if it has already been scraped?

gosom · 2024-04-25T04:53:48Z

Hi @maksbdev,

This sounds like a good idea.

I will try to include it in the next release.

Thanks for your feedback.

ruanbsroche · 2024-05-14T13:15:13Z

I have written a Python script to do this.
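
The script itself is not attached to this thread. As a rough illustration of the post-processing approach, a minimal Python sketch might look like the following. It assumes the scraper's output is a CSV file, and the column names `title` and `address` used as the duplicate key are hypothetical, not taken from the project.

```python
import csv
import io


def dedupe_rows(rows, key_fields):
    """Yield only the first row seen for each key.

    The key is built from the values of key_fields, trimmed and
    lower-cased so that trivial formatting differences do not
    produce separate entries.
    """
    seen = set()
    for row in rows:
        key = tuple((row.get(f) or "").strip().lower() for f in key_fields)
        if key in seen:
            continue  # this organization was already written out
        seen.add(key)
        yield row


def dedupe_csv(src, dst, key_fields):
    """Copy CSV data from src to dst, dropping duplicate rows.

    src and dst are text file objects. Returns the number of rows kept.
    """
    reader = csv.DictReader(src)
    missing = [f for f in key_fields if f not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"key columns not found in input: {missing}")

    writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
    writer.writeheader()
    kept = 0
    for row in dedupe_rows(reader, key_fields):
        writer.writerow(row)
        kept += 1
    return kept
```

To run it on real files, open both with `newline=""` (as the `csv` module documentation recommends) and pass them to `dedupe_csv`. Note that this only removes duplicate rows after scraping has finished, so it cleans up the output file but does not save the scraping time mentioned in the original request.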

gosom self-assigned this Apr 25, 2024

gosom added the "enhancement" (New feature or request) and "good first issue" (Good for newcomers) labels Apr 25, 2024

gosom removed their assignment Apr 25, 2024
