Commit

fix: ban more AI bots in robots.txt (#10726)
Co-authored-by: Stéphane Gigandet <[email protected]>
raphael0202 and stephanegigandet authored Aug 27, 2024
1 parent 61f3c30 commit 4ce657e
Showing 10 changed files with 83 additions and 5 deletions.
4 changes: 2 additions & 2 deletions lib/ProductOpener/Display.pm
Original file line number Diff line number Diff line change
@@ -1007,12 +1007,12 @@ sub set_user_agent_request_ref_attributes ($request_ref) {
 my $is_crawl_bot = 0;
 my $is_denied_crawl_bot = 0;
 if ($user_agent_str
-=~ /\b(Googlebot|Googlebot-Image|Google-InspectionTool|bingbot|Applebot|Yandex|DuckDuck|DotBot|Seekport|Ahrefs|DataForSeo|Seznam|ZoomBot|Mojeek|QRbot|Qwant|facebookexternalhit|Bytespider|GPTBot|ClaudeBot|SEOkicks|Searchmetrics|MJ12|SurveyBot|SEOdiver|wotbox|Cliqz|Paracrawl|Scrapy|VelenPublicWebCrawler|Semrush|MegaIndex\.ru|Amazon|aiohttp|python-request)/i
+=~ /\b(Googlebot|Googlebot-Image|Google-InspectionTool|bingbot|Applebot|Yandex|DuckDuck|DotBot|Seekport|Ahrefs|DataForSeo|Seznam|ZoomBot|Mojeek|QRbot|Qwant|facebookexternalhit|Bytespider|GPTBot|cohere-ai|anthropic-ai|PerplexityBot|ClaudeBot|Claude-Web|SEOkicks|Searchmetrics|MJ12|SurveyBot|SEOdiver|wotbox|Cliqz|Paracrawl|Scrapy|VelenPublicWebCrawler|Semrush|MegaIndex\.ru|Amazon|aiohttp|python-request)/i
 )
 {
 $is_crawl_bot = 1;
 if ($user_agent_str
-=~ /\b(bingbot|Seekport|Ahrefs|DataForSeo|Seznam|ZoomBot|Mojeek|QRbot|Bytespider|SEOkicks|Searchmetrics|MJ12|SurveyBot|SEOdiver|wotbox|Cliqz|Paracrawl|Scrapy|VelenPublicWebCrawler|Semrush|MegaIndex\.ru|YandexMarket|Amazon|ClaudeBot)/
+=~ /\b(bingbot|Seekport|Ahrefs|DataForSeo|Seznam|ZoomBot|Mojeek|QRbot|Bytespider|SEOkicks|Searchmetrics|MJ12|SurveyBot|SEOdiver|wotbox|Cliqz|Paracrawl|Scrapy|VelenPublicWebCrawler|Semrush|MegaIndex\.ru|YandexMarket|Amazon|GPTBot|PerplexityBot|ClaudeBot|Claude-Web|cohere-ai|anthropic-ai)/i
 )
 {
 $is_denied_crawl_bot = 1;
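
The two-step check in the Display.pm hunk can be tried in isolation. The following is a minimal sketch, not the project's code: `classify_user_agent` is a hypothetical helper name, the two patterns are copied from the new side of the hunk, and the User-Agent strings are made up for illustration. It also shows the effect of the `/i` flag that this commit adds to the second pattern: a lowercase `gptbot` is now flagged as denied.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Patterns copied from the new side of the Display.pm hunk.
my $crawl_bot_re
	= qr/\b(Googlebot|Googlebot-Image|Google-InspectionTool|bingbot|Applebot|Yandex|DuckDuck|DotBot|Seekport|Ahrefs|DataForSeo|Seznam|ZoomBot|Mojeek|QRbot|Qwant|facebookexternalhit|Bytespider|GPTBot|cohere-ai|anthropic-ai|PerplexityBot|ClaudeBot|Claude-Web|SEOkicks|Searchmetrics|MJ12|SurveyBot|SEOdiver|wotbox|Cliqz|Paracrawl|Scrapy|VelenPublicWebCrawler|Semrush|MegaIndex\.ru|Amazon|aiohttp|python-request)/i;
my $denied_crawl_bot_re
	= qr/\b(bingbot|Seekport|Ahrefs|DataForSeo|Seznam|ZoomBot|Mojeek|QRbot|Bytespider|SEOkicks|Searchmetrics|MJ12|SurveyBot|SEOdiver|wotbox|Cliqz|Paracrawl|Scrapy|VelenPublicWebCrawler|Semrush|MegaIndex\.ru|YandexMarket|Amazon|GPTBot|PerplexityBot|ClaudeBot|Claude-Web|cohere-ai|anthropic-ai)/i;

# Hypothetical helper (assumption, not part of ProductOpener):
# returns the pair ($is_crawl_bot, $is_denied_crawl_bot) for a User-Agent string.
sub classify_user_agent {
	my ($user_agent_str) = @_;
	my $is_crawl_bot = 0;
	my $is_denied_crawl_bot = 0;
	if ($user_agent_str =~ $crawl_bot_re) {
		$is_crawl_bot = 1;
		# The denied list is only consulted for agents already recognised as crawlers.
		if ($user_agent_str =~ $denied_crawl_bot_re) {
			$is_denied_crawl_bot = 1;
		}
	}
	return ($is_crawl_bot, $is_denied_crawl_bot);
}

# Illustrative User-Agent strings (made up for this sketch).
my @samples = (
	"ExampleBrowser/1.0 (compatible; GPTBot/1.0)",
	"gptbot/1.0",
	"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
	"Mozilla/5.0 (X11; Linux x86_64) Firefox/128.0",
);
for my $ua (@samples) {
	my ($crawl, $denied) = classify_user_agent($ua);
	print "crawl=$crawl denied=$denied  $ua\n";
}
```

With these patterns, the two GPTBot strings are classified as denied crawlers, the Googlebot string as an allowed crawler, and the Firefox string as neither.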
13 changes: 13 additions & 0 deletions templates/web/pages/robots/robots.tt.txt
@@ -90,4 +90,17 @@ Disallow: /
 
 User-agent: AhrefsBot
 Disallow: /
+
+User-agent: GPTBot
+Disallow: /
+User-agent: cohere-ai
+Disallow: /
+User-agent: anthropic-ai
+Disallow: /
+User-agent: ClaudeBot
+Disallow: /
+User-agent: Claude-Web
+Disallow: /
+User-agent: PerplexityBot
+Disallow: /
 [% END %]
@@ -111,7 +111,7 @@
 "origins_of_ingredients" : {
 "aggregated_origins" : [
 {
-"epi_score" : 0,
+"epi_score" : "0",
 "origin" : "en:unknown",
 "percent" : 100,
 "transportation_score" : null
@@ -106,7 +106,7 @@
 "origins_of_ingredients" : {
 "aggregated_origins" : [
 {
-"epi_score" : "0",
+"epi_score" : 0,
 "origin" : "en:unknown",
 "percent" : 100,
 "transportation_score" : null
@@ -106,7 +106,7 @@
 "origins_of_ingredients" : {
 "aggregated_origins" : [
 {
-"epi_score" : "0",
+"epi_score" : 0,
 "origin" : "en:unknown",
 "percent" : 100,
 "transportation_score" : null
@@ -208,3 +208,16 @@ Disallow: /
 
 User-agent: AhrefsBot
 Disallow: /
+
+User-agent: GPTBot
+Disallow: /
+User-agent: cohere-ai
+Disallow: /
+User-agent: anthropic-ai
+Disallow: /
+User-agent: ClaudeBot
+Disallow: /
+User-agent: Claude-Web
+Disallow: /
+User-agent: PerplexityBot
+Disallow: /
@@ -297,3 +297,16 @@ Disallow: /
 
 User-agent: AhrefsBot
 Disallow: /
+
+User-agent: GPTBot
+Disallow: /
+User-agent: cohere-ai
+Disallow: /
+User-agent: anthropic-ai
+Disallow: /
+User-agent: ClaudeBot
+Disallow: /
+User-agent: Claude-Web
+Disallow: /
+User-agent: PerplexityBot
+Disallow: /
@@ -297,3 +297,16 @@ Disallow: /
 
 User-agent: AhrefsBot
 Disallow: /
+
+User-agent: GPTBot
+Disallow: /
+User-agent: cohere-ai
+Disallow: /
+User-agent: anthropic-ai
+Disallow: /
+User-agent: ClaudeBot
+Disallow: /
+User-agent: Claude-Web
+Disallow: /
+User-agent: PerplexityBot
+Disallow: /
@@ -208,3 +208,16 @@ Disallow: /
 
 User-agent: AhrefsBot
 Disallow: /
+
+User-agent: GPTBot
+Disallow: /
+User-agent: cohere-ai
+Disallow: /
+User-agent: anthropic-ai
+Disallow: /
+User-agent: ClaudeBot
+Disallow: /
+User-agent: Claude-Web
+Disallow: /
+User-agent: PerplexityBot
+Disallow: /
@@ -208,3 +208,16 @@ Disallow: /
 
 User-agent: AhrefsBot
 Disallow: /
+
+User-agent: GPTBot
+Disallow: /
+User-agent: cohere-ai
+Disallow: /
+User-agent: anthropic-ai
+Disallow: /
+User-agent: ClaudeBot
+Disallow: /
+User-agent: Claude-Web
+Disallow: /
+User-agent: PerplexityBot
+Disallow: /
