This is a simple, dependendency-free Python script for downloading every XKCD comic via the author's API.
To download every comic to xkcd/
:
mkdir -p xkcd
curl https://raw.githubusercontent.com/whitfieldsdad/xkcd-downloader/main/xkcd.py -s | python3 - -o xkcd/
- List information about comics in JSONL format
- Download the latest comic or all comics to a directory of your choice
- Automatically cache comic metadata to avoid looking up the same information over and over again
- Automatically inject ISO-8601 dates into comic metadata to make it easier to sort and filter comics by date
The following options are available:
python3 xkcd.py --help
usage: xkcd.py [-h] [-v] [-o OUTPUT_DIR] [-c CACHE_DIR] [-f] [-n NUM] [--filename-policy {safe_title,filename,alt,date}] [--latest] [--download] [--sparse-output] [--limit LIMIT]
Dependency-less XKCD client
optional arguments:
-h, --help show this help message and exit
-v, --verbose Enable debug logging
-o OUTPUT_DIR, --output-dir OUTPUT_DIR
If provided, download comics to this directory
-c CACHE_DIR, --cache-dir CACHE_DIR
If provided, cache comic metadata in this directory
-f, --force Force re-download
-n NUM, --num NUM Comic # (e.g. 1234)
--filename-policy {safe_title,filename,alt,date}, -p {safe_title,filename,alt,date}
Filename policy
--latest Get latest comic
--download Download the comic
--sparse-output Drop empty JSON fields when listing comic metadata
--limit LIMIT Limit number of comics to download
The following commands are equivalent:
python3 xkcd.py
curl https://raw.githubusercontent.com/whitfieldsdad/xkcd-downloader/main/xkcd.py -s | python3 -
To download all comics to the current directory:
curl https://raw.githubusercontent.com/whitfieldsdad/xkcd-downloader/main/xkcd.py -s | python3 - --download
To download all comics to a specific directory:
curl https://raw.githubusercontent.com/whitfieldsdad/xkcd-downloader/main/xkcd.py -s | python3 - -o xkcd/
curl https://raw.githubusercontent.com/whitfieldsdad/xkcd-downloader/main/xkcd.py -s | python3 - --latest --download
To download the latest comic to a specific directory:
curl https://raw.githubusercontent.com/whitfieldsdad/xkcd-downloader/main/xkcd.py -s | python3 - --latest -o xkcd/
curl https://raw.githubusercontent.com/whitfieldsdad/xkcd-downloader/main/xkcd.py -s | python3 - --latest | jq
{
"month": 1,
"num": 2875,
"link": "",
"year": 2024,
"news": "",
"safe_title": "2024",
"transcript": "",
"alt": "It wasn't originally constitutionally required, but presidents who served two terms have traditionally followed George Washington's example and gotten false teeth.",
"img": "https://imgs.xkcd.com/comics/2024.png",
"title": "2024",
"day": 1,
"date": "2024-01-01"
}
To download all comics to the current directory:
curl https://raw.githubusercontent.com/whitfieldsdad/xkcd-downloader/main/xkcd.py -s | python3 - --download
To download all comics to a specific directory:
curl https://raw.githubusercontent.com/whitfieldsdad/xkcd-downloader/main/xkcd.py -s | python3 - -o xkcd/
To download a specific comic (e.g. #1234):
curl https://raw.githubusercontent.com/whitfieldsdad/xkcd-downloader/main/xkcd.py -s | python3 - -n 1234 -o .
To list information about every comic:
curl https://raw.githubusercontent.com/whitfieldsdad/xkcd-downloader/main/xkcd.py -s | python3 -
To list all titles:
curl https://raw.githubusercontent.com/whitfieldsdad/xkcd-downloader/main/xkcd.py -s | python3 - | jq -r '.title' | sort | uniq
To list all alt texts:
curl https://raw.githubusercontent.com/whitfieldsdad/xkcd-downloader/main/xkcd.py -s | python3 - | jq -r '.alt' | sort | uniq
To list all download URLs:
curl https://raw.githubusercontent.com/whitfieldsdad/xkcd-downloader/main/xkcd.py -s | python3 - | jq -r '.img' | sort | uniq
To lookup information about the latest comic:
from xkcd import Client
client = Client()
print(client.latest())
{'month': 1, 'num': 2879, 'link': '', 'year': 2024, 'news': '', 'safe_title': 'Like This One', 'transcript': '', 'alt': "A lot of sentences undergo startling shifts in mood if you add 'like this one' to the end, but high on the list is 'I'm a neurologist studying dreams.'", 'img': 'https://imgs.xkcd.com/comics/like_this_one.png', 'title': 'Like This One', 'day': 10, 'date': datetime.date(2024, 1, 10)}
To download the latest comic to the current directory:
from xkcd import Client
import os
client = Client()
client.download_latest(output_dir=os.getcwd())
To download the latest comic to a specific directory:
from xkcd import Client
client = Client()
client.download_latest(output_dir='xkcd/')
To list information about historical comic:
from xkcd import Client
client = Client()
comics = client.iter_comics()
for comic in comics:
print(comic)
To download all comics to the current directory:
from xkcd import Client
client = Client()
client.download_comics()
To download all comics to a specific directory:
from xkcd import Client
client = Client()
client.download_comics(output_dir='xkcd/')
To download a specific comic (e.g. #1234):
from xkcd import Client
client = Client()
client.download_comic(output_dir='xkcd', num=1234)
- There is no comic #404.
- XKCD has a great API. You should use it instead of scraping the website.
- The following information is available for each comic via the API:
curl https://xkcd.com/614/info.0.json --silent | jq
{
"month": "7",
"num": 614,
"link": "",
"year": "2009",
"news": "",
"safe_title": "Woodpecker",
"transcript": "[[A man with a beret and a woman are standing on a boardwalk, leaning on a handrail.]]\nMan: A woodpecker!\n<<Pop pop pop>>\nWoman: Yup.\n\n[[The woodpecker is banging its head against a tree.]]\nWoman: He hatched about this time last year.\n<<Pop pop pop pop>>\n\n[[The woman walks away. The man is still standing at the handrail.]]\n\nMan: ... woodpecker?\nMan: It's your birthday!\n\nMan: Did you know?\n\nMan: Did... did nobody tell you?\n\n[[The man stands, looking.]]\n\n[[The man walks away.]]\n\n[[There is a tree.]]\n\n[[The man approaches the tree with a present in a box, tied up with ribbon.]]\n\n[[The man sets the present down at the base of the tree and looks up.]]\n\n[[The man walks away.]]\n\n[[The present is sitting at the bottom of the tree.]]\n\n[[The woodpecker looks down at the present.]]\n\n[[The woodpecker sits on the present.]]\n\n[[The woodpecker pulls on the ribbon tying the present closed.]]\n\n((full width panel))\n[[The woodpecker is flying, with an electric drill dangling from its feet, held by the cord.]]\n\n{{Title text: If you don't have an extension cord I can get that too. Because we're friends! Right?}}",
"alt": "If you don't have an extension cord I can get that too. Because we're friends! Right?",
"img": "https://imgs.xkcd.com/comics/woodpecker.png",
"title": "Woodpecker",
"day": "24"
}
- The above information is lightly modified to make numbers numbers and to include a date in ISO-8601 format:
{
"month": 7,
"num": 614,
"link": "",
"year": 2009,
"news": "",
"safe_title": "Woodpecker",
"transcript": "[[A man with a beret and a woman are standing on a boardwalk, leaning on a handrail.]]\nMan: A woodpecker!\n<<Pop pop pop>>\nWoman: Yup.\n\n[[The woodpecker is banging its head against a tree.]]\nWoman: He hatched about this time last year.\n<<Pop pop pop pop>>\n\n[[The woman walks away. The man is still standing at the handrail.]]\n\nMan: ... woodpecker?\nMan: It's your birthday!\n\nMan: Did you know?\n\nMan: Did... did nobody tell you?\n\n[[The man stands, looking.]]\n\n[[The man walks away.]]\n\n[[There is a tree.]]\n\n[[The man approaches the tree with a present in a box, tied up with ribbon.]]\n\n[[The man sets the present down at the base of the tree and looks up.]]\n\n[[The man walks away.]]\n\n[[The present is sitting at the bottom of the tree.]]\n\n[[The woodpecker looks down at the present.]]\n\n[[The woodpecker sits on the present.]]\n\n[[The woodpecker pulls on the ribbon tying the present closed.]]\n\n((full width panel))\n[[The woodpecker is flying, with an electric drill dangling from its feet, held by the cord.]]\n\n{{Title text: If you don't have an extension cord I can get that too. Because we're friends! Right?}}",
"alt": "If you don't have an extension cord I can get that too. Because we're friends! Right?",
"img": "https://imgs.xkcd.com/comics/woodpecker.png",
"title": "Woodpecker",
"day": 24,
"date": "2009-07-24"
}