BerserkerPriceTracker

Price tracker for the worldwide phenomenon Berserk manga series (Scrapy scraper and Django app/REST API included)

  • PriceTrack: the Django project

  • PriceTrackerSpider: the Scrapy project

Getting Started

These instructions will get you started with setting up both projects on your local machine.

Prerequisites

Installing the project requirements in a virtualenv rather than your global environment is highly recommended and best practice.

Get one up and running, then install the requirements using:

pip install -r requirements.txt
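A minimal way to create and activate that virtualenv (a sketch assuming `python3` with the standard `venv` module; the environment name `venv` is arbitrary):

```shell
# Create a virtual environment in the project root and activate it;
# subsequent pip installs now target this environment, not the system.
python3 -m venv venv
source venv/bin/activate

# Sanity check: inside a venv, sys.prefix differs from the base prefix.
python -c "import sys; print(sys.prefix != sys.base_prefix)"  # → True
```

With the environment active, the `pip install -r requirements.txt` command above installs everything into it.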

Installation

First, let's link the Django models to the Scrapy project.

The goal here is to get the three Scrapy spiders to successfully save the scraped data to the Django database. In order to use Django models inside the Scrapy project, change the following path in the Scrapy project's settings.py to the path of your local [PriceTrack] Django project:

import sys

# Setting up the Django project's full path.
sys.path.insert(0, '/home/madgusto/PycharmProjects/BerserkerPriceTracker/PriceTrack')

For more info on how this works, I'd recommend checking the setup section in scrapy-djangoitem's documentation (the README file).
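The `sys.path.insert` trick works because Python resolves imports against `sys.path` in order, so prepending the Django project root makes its packages importable from inside Scrapy. A self-contained illustration of just that mechanism (using a throwaway stub module rather than the real PriceTrack project):

```python
import importlib.util
import os
import sys
import tempfile

# Stand-in for the Django project directory (in the real setup this
# would be the PriceTrack project root, as in Scrapy's settings.py).
project_dir = tempfile.mkdtemp()
with open(os.path.join(project_dir, "pricetrack_stub.py"), "w") as f:
    f.write("GREETING = 'hello from PriceTrack'\n")

# Not importable yet: the directory is not on sys.path.
assert importlib.util.find_spec("pricetrack_stub") is None

# Prepending the directory, exactly as the Scrapy settings snippet does,
# makes everything under it importable.
sys.path.insert(0, project_dir)
import pricetrack_stub

print(pricetrack_stub.GREETING)  # → hello from PriceTrack
```

In the actual projects, scrapy-djangoitem additionally needs `DJANGO_SETTINGS_MODULE` set and `django.setup()` called before the models are used; see its README for the exact steps.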

Now let's take the time to set up and configure a PostgreSQL database for Django.

  • Create a database and a database user

Then update the corresponding fields (NAME, USER and PASSWORD) in settings.py:

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'NAME': 'berserkdb',
        'USER': 'madgusto',
        'PASSWORD': '*****',
        'HOST': 'localhost',
        'PORT': '',
    }
}

PostgreSQL installation - highly recommended if you're new to PostgreSQL; you'll learn about the CREATE DATABASE and CREATE USER commands.
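For reference, the database and user from the settings example above would be created with SQL along these lines (run inside psql as a superuser; `'*****'` is the placeholder from the settings snippet and should be replaced with your real password):

```sql
-- Create the application user and its database.
CREATE USER madgusto WITH PASSWORD '*****';
CREATE DATABASE berserkdb OWNER madgusto;
GRANT ALL PRIVILEGES ON DATABASE berserkdb TO madgusto;
```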

Running the spiders

Available spiders as of 12/02/2017. Note: 'update' here can also mean 'add' when a spider is run for the first time.

  • datacrawler: crawls Amazon and updates common 'static' entries like the name, the image, etc.
  • amazon: crawls Amazon and updates all of Amazon's price and availability entries
  • bookdepo: crawls Book Depository and updates all of Book Depository's price and availability entries

Example

scrapy crawl datacrawler
