mallam-scrape

Website scraping tool for mallam-ai

Pre-requisites

go run ./cmd/mallam-scrape "https://www.marxists.org/archive/marx/"

This will scrape all URLs and save the pages to the out/www.marxists.org/../.. directory.

go run ./cmd/mallam-extract-text-marx

This will read all HTML files in out/www.marxists.org/archive/marx/works and save their plain text to out/text-marx.txt.

Internal Logic

MALLAM Developers, MIT License