One-day workshop on understanding Docker, web scraping, regular expressions, PostgreSQL, and Git.
Use Ubuntu 20.04 LTS with the following packages installed:
- Python 3.9 or above
- docker
- docker-compose
- pip3
- git (any recent version)
- Create an account on GitHub (only if you do not already have one).
- Fork the DataEngineering-Workshop1 repository. Refer to this guide to understand how to fork a repository.
- Clone the forked repo to your machine using an SSH key.
- Make sure you have set up an SSH key. If you don't have one, follow the documentation to create a new SSH key.
- Open your forked repo link in your browser.
- Click on Code (Green color button).
- Select SSH option and copy the link.
- Clone the repo (replace YOUR-GIT-ID with your GitHub ID):
git clone git@github.com:<YOUR-GIT-ID>/DataEngineering-Workshop1.git
- To install Docker, go to your cloned repository and run the following command:
sudo prerequisites/install_docker.sh
- Check that Git, Docker, and Docker Compose are installed on the system.
- Open a terminal and run the following commands to check the version of each prerequisite:
- Check Git version
git --version
- Check Docker version
docker --version
- Check Docker Compose version
docker-compose --version
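The version checks above can also be run in one pass. The following is a minimal sketch, not part of the workshop repository: the function name `check_tool` is hypothetical, and it assumes every listed tool accepts a `--version` flag, as the commands above do.

```shell
#!/bin/sh
# Hypothetical helper (not from the workshop repo): report whether a tool
# is on PATH and, if so, print the first line of its version output.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "OK: $("$1" --version 2>&1 | head -n 1)"
    else
        echo "MISSING: $1"
        return 1
    fi
}

# Check every prerequisite and count how many are absent.
missing=0
for tool in git docker docker-compose python3 pip3; do
    check_tool "$tool" || missing=$((missing + 1))
done
echo "$missing prerequisite(s) missing"
```

If any tool is reported as missing, install it before the workshop starts; for Docker, the install script mentioned earlier in the setup steps is the intended route.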
- By the end of this workshop you will know how to build a Docker image and how to use it.
- You will learn how to scrape a website using urllib/requests and BeautifulSoup.
- You will learn regular expressions and how to work with them.
- You will learn key features of PostgreSQL.
- You will learn how to dockerize your project.
Time | Topics |
---|---|
09:00 - 11:00 | Introduction to Docker |
11:00 - 01:00 | Introduction to Web Scraping |
01:00 - 02:00 | Break |
02:00 - 03:00 | Dockerizing a project |
03:00 - 04:00 | Introduction to PostgreSQL |
04:00 - 04:30 | Introduction to GitHub |
04:30 - 04:45 | Q & A |
04:45 - 05:00 | Wrapping Up |