BerserkerPriceTracker

Price tracker for the worldwide phenomenon Berserk manga series (Scrapy scraper and Django app/REST API included)

  • PriceTrack: the Django project

  • PriceTrackerSpider: the Scrapy project

Getting Started

These instructions will get you started with setting up both projects on your local machine.

Prerequisites

Installing the project requirements in a virtualenv rather than your global environment is highly recommended and best practice.

Get one up and running, then install the requirements using:

pip install -r requirements.txt
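A minimal way to create and activate that virtualenv (a sketch assuming `python3` with the standard `venv` module; the environment name `venv` is arbitrary):

```shell
# Create a virtual environment in the project root and activate it;
# subsequent pip installs now target this environment, not the system.
python3 -m venv venv
source venv/bin/activate

# Sanity check: inside a venv, sys.prefix differs from the base prefix.
python -c "import sys; print(sys.prefix != sys.base_prefix)"  # → True
```

With the environment active, the `pip install -r requirements.txt` command above installs everything into it.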

Installation

First, let's link the Django models to the Scrapy project.

The goal here is to get the three Scrapy spiders to successfully save the scraped data to the Django database. In order to use Django models inside the Scrapy project, change the following path in the Scrapy project's settings.py to the path of your local [PriceTrack] Django project:

import sys

# Setting up the Django project's full path.
sys.path.insert(0, '/home/madgusto/PycharmProjects/BerserkerPriceTracker/PriceTrack')

For more info on how this works, I'd recommend checking the setup section in scrapy-djangoitem's documentation (the README file).
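The `sys.path.insert` trick works because Python resolves imports against `sys.path` in order, so prepending the Django project root makes its packages importable from inside Scrapy. A self-contained illustration of just that mechanism (using a throwaway stub module rather than the real PriceTrack project):

```python
import importlib.util
import os
import sys
import tempfile

# Stand-in for the Django project directory (in the real setup this
# would be the PriceTrack project root, as in Scrapy's settings.py).
project_dir = tempfile.mkdtemp()
with open(os.path.join(project_dir, "pricetrack_stub.py"), "w") as f:
    f.write("GREETING = 'hello from PriceTrack'\n")

# Not importable yet: the directory is not on sys.path.
assert importlib.util.find_spec("pricetrack_stub") is None

# Prepending the directory, exactly as the Scrapy settings snippet does,
# makes everything under it importable.
sys.path.insert(0, project_dir)
import pricetrack_stub

print(pricetrack_stub.GREETING)  # → hello from PriceTrack
```

In the actual projects, scrapy-djangoitem additionally needs `DJANGO_SETTINGS_MODULE` set and `django.setup()` called before the models are used; see its README for the exact steps.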

Now let's take the time to set up and configure a PostgreSQL database for Django.

  • Create a database and a database user

Then update the corresponding fields (NAME, USER and PASSWORD) in settings.py:

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'NAME': 'berserkdb',
        'USER': 'madgusto',
        'PASSWORD': '*****',
        'HOST': 'localhost',
        'PORT': '',
    }
}

PostgreSQL installation - highly recommended if you're new to PostgreSQL; you'll learn about the CREATE DATABASE and CREATE USER commands.
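For reference, the database and user from the settings example above would be created with SQL along these lines (run inside psql as a superuser; `'*****'` is the placeholder from the settings snippet and should be replaced with your real password):

```sql
-- Create the application user and its database.
CREATE USER madgusto WITH PASSWORD '*****';
CREATE DATABASE berserkdb OWNER madgusto;
GRANT ALL PRIVILEGES ON DATABASE berserkdb TO madgusto;
```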

Running the spiders

Available spiders as of 12/02/2017. Note: 'update' here can also mean 'add' when a spider is run for the first time.

  • datacrawler: crawls Amazon and updates common 'static' entries like the name, the image, etc.
  • amazon: crawls Amazon and updates all of Amazon's price and availability entries
  • bookdepo: crawls Book Depository and updates all of Book Depository's price and availability entries

Example

scrapy crawl datacrawler
