Commit

Make types compatible with older Python versions
Wesley van Lee committed Oct 7, 2024
1 parent 4d18b14 commit b1fcbdc
Showing 3 changed files with 5 additions and 5 deletions.
3 changes: 3 additions & 0 deletions docs/settings.md
@@ -21,6 +21,9 @@ Supported variables: `year`, `month`, `day` and `timestamp`.

```python
WACZ_SOURCE_URL = "s3://scrapy-webarchive/archive.wacz"

# Allows multiple sources, comma separated.
WACZ_SOURCE_URL = "s3://scrapy-webarchive/archive.wacz,/path/to/archive.wacz"
```

This setting defines the location of the WACZ file that should be used as a source for the crawl job.
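
As an illustration of the comma-separated form, here is a minimal sketch of how such a value could be split into individual sources. The helper name `split_wacz_sources` is hypothetical and not part of scrapy-webarchive; the library's actual parsing may differ, for example in how it treats whitespace or empty entries.

```python
from typing import List


def split_wacz_sources(value: str) -> List[str]:
    """Split a comma-separated setting value into individual source locations.

    Hypothetical helper for illustration only. Surrounding whitespace is
    stripped and empty entries are dropped, which is an assumption about
    the desired behaviour rather than documented library behaviour.
    """
    return [part.strip() for part in value.split(",") if part.strip()]


sources = split_wacz_sources(
    "s3://scrapy-webarchive/archive.wacz,/path/to/archive.wacz"
)
print(sources)
```

A single source passes through the same function unchanged, as a one-element list.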
4 changes: 0 additions & 4 deletions scrapy_webarchive/wacz.py
@@ -60,8 +60,6 @@ def get_wacz_fname(self) -> str:

class MultiWaczFile:
"""
Multiple WACZ files
The MultiWACZ file format is not yet finalized, hence instead of pointing to a
MultiWACZ file, this just works with the multiple WACZ files.
@@ -96,8 +94,6 @@ def iter_warc(self):

class WaczFile:
"""
WACZ file.
Handles looking up pages in the index, and iterating over all pages in the index.
Can also iterate over all entries in each WARC embedded in the archive.
"""
3 changes: 2 additions & 1 deletion scrapy_webarchive/warc.py
@@ -1,4 +1,5 @@
import socket
+from typing import List, Tuple
import uuid
from datetime import datetime, timezone
from io import BytesIO
@@ -42,7 +43,7 @@ def write_record(
self,
url: str,
record_type: str,
-headers: list[tuple[str, str]],
+headers: List[Tuple[str, str]],
warc_headers: StatusAndHeaders,
content_type: str,
content: str,
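
For context on the annotation change: subscripting the built-in `list` and `tuple` types is only supported from Python 3.9 onward (PEP 585). On older interpreters an annotation such as `list[tuple[str, str]]` is evaluated when the function is defined and raises `TypeError`, while the `typing.List` and `typing.Tuple` aliases work on both. The sketch below uses a hypothetical `format_headers` function, not code from this repository, to show the compatible spelling.

```python
from typing import List, Tuple


def format_headers(headers: List[Tuple[str, str]]) -> str:
    """Join (name, value) pairs into HTTP-style header lines.

    Hypothetical function for illustration; only the annotation style
    mirrors the change in this commit.
    """
    return "\r\n".join(f"{name}: {value}" for name, value in headers)


print(format_headers([("Content-Type", "text/html"), ("Content-Length", "42")]))
```

An alternative that keeps the lowercase spelling is `from __future__ import annotations`, which defers evaluation of annotations; that only helps as long as nothing inspects the annotations at runtime on the older interpreter.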
