https://www.thaqalayn-api.net/
A REST API that returns hadiths from thaqalayn.net in JSON format. To create it, I first built a web scraper in Python to collect all the hadiths on thaqalayn.net. I then stored the data in an online database and built the API with Node.js and Express. I also created a simple React front-end to showcase one of the endpoints (api/random). The front-end can be reached at https://thaqalayn-api.web.app/
Here is a simple example of how to fetch one of the endpoints using axios. Change url to whatever endpoint you'd like.
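const axios = require("axios"); // or: import axios from "axios"

const url = "https://www.thaqalayn-api.net/api/random";
const request = axios.get(url)
  .then(res => {
    console.log(res.data); // parsed JSON body of the response
    //...
  })
  .catch(err => console.error(err)); // network or HTTP error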
- Retrieve all the available books, with minimum and maximum IDs:
https://www.thaqalayn-api.net/api/allbooks
- Retrieve a random hadith from any book:
https://www.thaqalayn-api.net/api/random
- Retrieve a random hadith from a given book:
https://www.thaqalayn-api.net/api/[bookId]/random
- Make a query throughout the entire database. This is a very simple, case-insensitive search that accepts both English and Arabic and returns any hadith containing an exact match of the query. Pass the search term in the query parameter q:
https://www.thaqalayn-api.net/api/query?q=[query]
- Make a query for a specific book. Same rules as above apply here:
https://www.thaqalayn-api.net/api/query/[bookId]?q=[query]
- Get all the hadiths for a particular book:
https://www.thaqalayn-api.net/api/[bookId]
- Return a specific hadith based on id:
https://www.thaqalayn-api.net/api/[bookId]/[id]
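The endpoint patterns listed above can be assembled with a few small helpers. This is a minimal sketch, not code from this repository: the function names and the placeholder book id `exampleBookId` are hypothetical, and real book ids should be taken from the allbooks endpoint. The search term is passed through `encodeURIComponent` so that spaces and Arabic text are sent safely in the URL.

```javascript
// Base URL taken from the endpoint list in this README.
const BASE_URL = "https://www.thaqalayn-api.net/api";

// Hypothetical helper: random hadith from any book, or from one book if bookId is given.
function randomUrl(bookId) {
  return bookId === undefined
    ? `${BASE_URL}/random`
    : `${BASE_URL}/${encodeURIComponent(bookId)}/random`;
}

// Hypothetical helper: a specific hadith, addressed by book id and hadith id.
function hadithUrl(bookId, id) {
  return `${BASE_URL}/${encodeURIComponent(bookId)}/${encodeURIComponent(id)}`;
}

// Hypothetical helper: search all books, or a single book if bookId is given.
function queryUrl(query, bookId) {
  const path =
    bookId === undefined ? "query" : `query/${encodeURIComponent(bookId)}`;
  // encodeURIComponent percent-encodes spaces and non-ASCII (e.g. Arabic) text.
  return `${BASE_URL}/${path}?q=${encodeURIComponent(query)}`;
}

// "exampleBookId" is a placeholder, not a real book id.
console.log(randomUrl());
console.log(hadithUrl("exampleBookId", 1));
console.log(queryUrl("two words", "exampleBookId"));
```

Any of the resulting URLs can be passed to `axios.get` in the same way as the api/random example.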
In this GitHub repository you'll also find 4 Python files, 2 of them web scrapers:
- WebScraper/WebScraperComplete.py -> This scrapes the entire thaqalayn.net website and creates a JSON for every book. It also creates a JSON containing all the books.
- WebScraper/WebScraperPerBook.py -> This scrapes only a single book, given the URL of the book on thaqalayn.net. The code is mostly copied from WebScraperComplete.py. If you want to use it, you will need to comment out line 39 and use line 38 with whatever URL you want.
- WebScraper/ChangeJSON.py -> If you're unhappy with the JSONs you got from the previous web scrapers, you can use this to modify them as you like.
- WebScraper/CreateBookNamesJSON -> This Python file uses the API and creates a JSON of all the book names with their min-max IDs. This JSON is then used to create the /allBooks endpoint.
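To illustrate the idea behind the book-names JSON, here is a rough sketch of how a list of books with min-max IDs could be derived from hadith records. It is written in JavaScript for consistency with the axios example and is not a port of the Python file. The record shape (`book`, `id`) and the output field names (`minId`, `maxId`) are assumptions made for illustration; the actual JSONs may use different keys.

```javascript
// Assumed (hypothetical) record shape: { book: "Book A", id: 3 }.
// Returns one summary entry per book, in order of first appearance.
function summarizeBooks(hadiths) {
  const byBook = new Map();
  for (const { book, id } of hadiths) {
    const entry = byBook.get(book);
    if (entry === undefined) {
      // First hadith seen for this book: it is both the min and the max so far.
      byBook.set(book, { book, minId: id, maxId: id });
    } else {
      entry.minId = Math.min(entry.minId, id);
      entry.maxId = Math.max(entry.maxId, id);
    }
  }
  return [...byBook.values()];
}

// Dummy records, for illustration only.
const sample = [
  { book: "Book A", id: 3 },
  { book: "Book B", id: 1 },
  { book: "Book A", id: 1 },
  { book: "Book A", id: 7 },
];
console.log(summarizeBooks(sample));
```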
I also included all the scraped JSONs, in case anyone would like to use them directly. This data includes each book separately, all books combined, and a list of all books present with the maximum query id.
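If you work with the scraped JSONs directly, a search similar in spirit to the query endpoint can be sketched as below. This is an approximation, not the API's actual implementation: it assumes "exact match" means the query appears as a contiguous substring, and the field names `englishText` and `arabicText` are hypothetical, so adjust them to the keys in the JSON files.

```javascript
// Case-insensitive substring search over an array of hadith records.
// Field names (englishText, arabicText) are assumptions for illustration.
function searchHadiths(hadiths, query) {
  const needle = query.toLowerCase();
  return hadiths.filter((hadith) =>
    [hadith.englishText, hadith.arabicText].some(
      // Skip missing or non-string fields instead of throwing.
      (text) => typeof text === "string" && text.toLowerCase().includes(needle)
    )
  );
}

// Dummy records with placeholder text, not real data.
const records = [
  { id: 1, englishText: "Sample text ALPHA", arabicText: "نص تجريبي" },
  { id: 2, englishText: "Sample text beta" },
];
console.log(searchHadiths(records, "alpha").map((h) => h.id));
console.log(searchHadiths(records, "تجريبي").map((h) => h.id));
```

Lowercasing both sides keeps the English search case-insensitive; Arabic text has no letter case, so it is matched as written.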
Feel free to use any part of this project and modify as you'd like.
- Made a web scraper "./WebScraper/Scraping Data.ipynb" that uses Selenium to scrape the latest website (8/4/2024)
- Put all the CSV data in "./csvData/"
- Made a notebook "./total_hadith_count.ipynb" that counts the total number of hadiths collected by the scraper
- Total count is 26,975!