The Python implementation of Bottica.
>>> from bottica import Bottica
>>> btca = Bottica()
Bottica provides the methods to verify an IP according to one of the
verifiers provided in Bottica Core. You can verify a bot:
- 🏷 By name (e.g. Googlebot)
- 📰 By User-Agent (e.g. Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html))
You can also add your own verifiers for bots that aren't (yet) listed in Bottica Core, and perform the verification methods directly on IPs. Bottica supports both IPv4 and IPv6 everywhere.
When you instantiate Bottica(), it loads the bot list from Bottica Core by
default. The bots are available in the verifiers dictionary, which maps each
bot name to one or more verifiers.
>>> btca.verifiers["Googlebot"]
{'fcrdns_hosts': ['google.com', 'googlebot.com']}
If multiple verifiers are present, the IP must pass all the verification checks to be considered verified.
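The "must pass all checks" rule can be sketched in a few lines. This is a minimal illustration, not Bottica's own code: check_ip_list, check_cidr_list, CHECKS and passes_all are hypothetical names, and the cidr_list semantics are assumed from the method name.

```python
import ipaddress

# Hypothetical stand-ins for two verification methods. These are NOT
# Bottica's own functions; they only illustrate the combination rule.
def check_ip_list(ip, allowed_ips):
    """Return True if ip equals one of the allowed addresses."""
    addr = ipaddress.ip_address(ip)
    return any(addr == ipaddress.ip_address(a) for a in allowed_ips)


def check_cidr_list(ip, allowed_cidrs):
    """Return True if ip falls inside one of the allowed CIDR blocks."""
    addr = ipaddress.ip_address(ip)
    return any(addr in ipaddress.ip_network(c, strict=False) for c in allowed_cidrs)


# Assumed mapping from verifier key to check function.
CHECKS = {"ip_list": check_ip_list, "cidr_list": check_cidr_list}


def passes_all(ip, verifier):
    """An IP counts as verified only if every listed check accepts it."""
    if not verifier:
        return False  # no checks configured: treat as unverified
    return all(CHECKS[method](ip, arg) for method, arg in verifier.items())


verifier = {"ip_list": ["1.2.3.4", "2.3.4.5"], "cidr_list": ["1.2.3.0/24"]}
print(passes_all("1.2.3.4", verifier))  # accepted by both checks
print(passes_all("2.3.4.5", verifier))  # in ip_list but outside the CIDR block
```

Because the checks are combined with all(), adding a second verifier to a bot can only make verification stricter, never looser.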
In Bottica, bots are identified by name. The names are the name property of
each entry in the Bottica Core list. To verify that traffic from a particular
IP belongs to a particular bot, use the verify_bot() method:
>>> btca.verify_bot(ip="1.2.3.4", botname="Googlebot")
False
verify_bot() relies on the bot name being present in the verifiers map, so
it will only work for bots from Bottica Core, or any additional bots that you
have added yourself.
Usually you suspect traffic to be coming from a particular bot because of its
User-Agent. Bottica tries to align with the ua-parser project such that
User-Agent families (user_agent.family) parsed by ua-parser will correspond to
bot names in Bottica. If that is the case, you can use the verify_ua() method
to verify an IP against a particular User-Agent directly.
>>> ua = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
>>> btca.parse_ua(ua)
'Googlebot'
>>> btca.verify_ua(ip="1.2.3.4", user_agent=ua)
False
If you have a User-Agent that ua-parser can't parse by default, you can add your own parsers to ua-parser:
>>> from ua_parser import user_agent_parser
>>> my_parser = user_agent_parser.UserAgentParser(pattern="MyBotRegex", family_replacement="MyBotName")
>>> user_agent_parser.USER_AGENT_PARSERS.insert(0, my_parser)
You should set the family_replacement for your parser to be the same
as the bot's name in Bottica. If you're adding a new bot that isn't in Bottica
Core, you should also add your own verifier for it.
By default, Bottica Core supports the biggest bots that provide verification (except Facebook). However, the list can never be exhaustive. If you want to add custom bots to verify, you can do so in two ways:
- Add a verifier directly to Bottica.verifiers
- Load a custom bottica.yaml
Do you have a verifiable bot popping up in your logs that isn't included in Bottica Core? Submit a PR!
To add a verifier directly, insert it in the verifiers dictionary
using your bot's name as the key. The value should be a dict specifying one or
more of the verification methods from Bottica Core.
>>> btca.verifiers["MyBotName"] = {"ip_list": ["1.2.3.4", "2.3.4.5"]}
If you want to maintain your custom bots in a more structured way, you can
also load your own bottica.yaml file, as long as it follows the
specification from Bottica Core.
# my_bottica.yaml
bots:
  - name: MyBotName
    ip_list:
      - 1.2.3.4
      - 2.3.4.5
>>> btca.load("path/to/my_bottica.yaml")
>>> btca.verifiers["MyBotName"]
{'ip_list': ['1.2.3.4', '2.3.4.5']}
Either way (modifying verifiers directly or loading a custom bottica.yaml),
the end result is the same.
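To see why the two routes are equivalent, here is a rough sketch of how a parsed bottica.yaml could be turned into verifiers entries. It assumes that every key of a bot entry other than name is a verification method; verifiers_from_config is a hypothetical helper, not part of Bottica, and the YAML is written out as an already-parsed dict so the example needs no YAML library.

```python
def verifiers_from_config(config):
    """Map each bot entry's name to its remaining keys (assumed to be verifiers)."""
    verifiers = {}
    for bot in config.get("bots", []):
        methods = {key: value for key, value in bot.items() if key != "name"}
        verifiers[bot["name"]] = methods
    return verifiers


# Parsed form of the my_bottica.yaml example, written out as a dict.
parsed = {"bots": [{"name": "MyBotName", "ip_list": ["1.2.3.4", "2.3.4.5"]}]}

loaded = verifiers_from_config(parsed)
direct = {"MyBotName": {"ip_list": ["1.2.3.4", "2.3.4.5"]}}
print(loaded == direct)  # compares the file-based result with the direct assignment
```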
If you don't need the Bottica Core list of verifiers and just want to
do checks on IPs, you can use the bottica verification functions
directly.
>>> from bottica import verification
The verification module offers verification functions that match the names
and behavior of the corresponding verification methods from Bottica Core:
verification.fcrdns_hosts(ip, allowed_hosts, max_tries)
verification.ip_list(ip, allowed_ips)
verification.ip_ranges(ip, allowed_ranges)
verification.cidr_list(ip, allowed_cidrs)
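For readers unfamiliar with the technique, the following sketch shows what a forward-confirmed reverse DNS (FCrDNS) check generally involves, assuming that is what fcrdns_hosts implements. It is not Bottica's code: fcrdns_check, its reverse and forward parameters, and the example hostnames are illustrative assumptions, and the max_tries parameter is left out because its semantics are not described here.

```python
import ipaddress
import socket


def _reverse_lookup(ip):
    """PTR lookup: IP address to hostname."""
    return socket.gethostbyaddr(ip)[0]


def _forward_lookup(hostname):
    """A/AAAA lookup: hostname to the set of addresses it resolves to."""
    return {info[4][0] for info in socket.getaddrinfo(hostname, None)}


def fcrdns_check(ip, allowed_hosts, reverse=_reverse_lookup, forward=_forward_lookup):
    """Illustrative FCrDNS check; resolvers are injectable for offline testing."""
    target = ipaddress.ip_address(ip)
    # Step 1: reverse-resolve the IP to a hostname.
    try:
        hostname = reverse(ip).rstrip(".").lower()
    except OSError:
        return False
    # Step 2: the hostname must be an allowed host or a subdomain of one.
    if not any(hostname == h or hostname.endswith("." + h) for h in allowed_hosts):
        return False
    # Step 3: forward-resolve the hostname; it must map back to the same IP.
    try:
        resolved = forward(hostname)
    except OSError:
        return False
    return any(ipaddress.ip_address(a) == target for a in resolved)


# Offline demonstration with fake resolvers and documentation-only addresses.
fake_reverse = lambda ip: "crawler-1.bot.example"
fake_forward = lambda host: {"192.0.2.10"}
print(fcrdns_check("192.0.2.10", ["bot.example"], fake_reverse, fake_forward))
```

The suffix match uses a leading dot so that a hostname such as bot.example.evil.test is not mistaken for a subdomain of bot.example, and the forward lookup in step 3 is what stops an attacker who controls only the PTR record for their own IP.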