This tutorial explains why scraping is important (as well as the pitfalls and limitations) and why Lookyloo is an important tool when investigating a complex website.
The target audience is relatively technical, but the tutorial should be simple enough to follow for anyone willing to understand how websites work.
We assume you have the following environment at your disposal:
- Ubuntu 20.04 or 20.10. It can be another similar general-purpose operating system (Debian 10, Fedora), but specialized distros such as Kali Linux are strongly discouraged and won't be supported if you run into issues.
  NOTE: It is assumed that you're not running as root, but that the account you're using has administrator rights (tl;dr: `sudo` works).
- Python 3.8 or 3.9
  NOTE: Check it by running `python -V` in a terminal (see the sanity-check snippet after this list).
- Basic command line tools: `curl`, `wget`, `grep`, `git`
- Poetry 1.1.0 (or more recent), preferably installed this way:

  ```bash
  curl -sSL https://raw.githubusercontent.com/python-poetry/poetry/master/get-poetry.py > get-poetry.py
  python3 ./get-poetry.py
  ```

  Make sure Poetry is working by running `poetry self -V` in a terminal.
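If you want to double-check the prerequisites above before going further, a quick sanity check along these lines should do. This is a minimal sketch: on Ubuntu the interpreter may only be available as `python3`, and the path `$HOME/.poetry/env` is an assumption based on the default behavior of the `get-poetry.py` installer, so adjust it if your setup differs.

```bash
# Check the Python version (3.8 or 3.9 expected); fall back to python3 if
# `python` is not on the PATH.
python -V || python3 -V

# Make sure the basic command line tools are available.
for tool in curl wget grep git; do
    command -v "$tool" || echo "missing: $tool"
done

# If `poetry` is not found after running get-poetry.py, loading the
# environment file created by the installer usually fixes the PATH
# (assumption: default install location).
command -v poetry || source "$HOME/.poetry/env"
poetry self -V
```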
- Clone the repository (requires `git`):

  ```bash
  git clone https://github.com/Lookyloo/scraping-tutorial.git
  cd scraping-tutorial
  ```
- Install the dependencies:

  ```bash
  poetry install
  ```
- Run the lab (if the command isn't found, see the note after this list):

  ```bash
  jupyter-lab
  ```
- Move to your browser. Running `jupyter-lab` should have opened a tab in your favorite browser. If it didn't, look in the terminal for hints.
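If `jupyter-lab` is not found on your PATH, it is most likely because Poetry installed the dependencies into its own virtual environment. In that case, running it through Poetry should work (a sketch assuming the default Poetry setup):

```bash
# Run JupyterLab inside the virtual environment managed by Poetry.
poetry run jupyter-lab
```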