Issuu: from reader iframe to PDF

Objective

The Issuu platform hosts many documents, but some of them are not downloadable. I like to keep track of what I read and attach notes to it, so this package has a single purpose: download some or all pages of a document from the site and merge them into one PDF.

Basic usage

Run python3 ./main.py;
Enter the document URL;
Enter the number of pages you want.
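
The two prompts above can be sanity-checked before any download starts. The sketch below is only an illustration: `is_issuu_url`, `parse_page_count` and `ask_user` are hypothetical helper names, and it is an assumption that main.py validates its input in any similar way.

```python
from urllib.parse import urlparse


def is_issuu_url(url: str) -> bool:
    """Return True if the URL uses http(s) and its host is issuu.com or a subdomain."""
    parsed = urlparse(url.strip())
    host = (parsed.hostname or "").lower()
    return parsed.scheme in ("http", "https") and (
        host == "issuu.com" or host.endswith(".issuu.com")
    )


def parse_page_count(text: str) -> int:
    """Convert the typed page count to a positive integer, or raise ValueError."""
    count = int(text.strip())
    if count < 1:
        raise ValueError("page count must be at least 1")
    return count


def ask_user() -> tuple:
    """Prompt for the URL and the page count (hypothetical prompt wording)."""
    url = input("Input URL: ")
    if not is_issuu_url(url):
        raise ValueError(f"not an Issuu URL: {url!r}")
    pages = parse_page_count(input("Input the page count you want: "))
    return url.strip(), pages
```

Calling `ask_user()` from a terminal returns a `(url, pages)` pair, or raises `ValueError` when the input is clearly wrong.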

Features

Pages are downloaded as JPG files into the ./temp folder;
Metadata is saved in JSON format in the ./out folder;
Parsed URLs are saved in TXT format in the ./out folder;
Some logging for debugging;
Progress bar.
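
As a rough illustration of the metadata, URL list and logging features listed above, here is a minimal sketch. The function name `save_outputs`, the file names `metadata.json` and `urls.txt`, and the logger name are assumptions made for this example and may not match what the script actually writes.

```python
import json
import logging
from pathlib import Path

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s")
log = logging.getLogger("issuu_sketch")  # hypothetical logger name


def save_outputs(metadata: dict, page_urls: list, out_dir: str = "./out") -> tuple:
    """Write metadata as JSON and the page URLs as one-per-line text."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)  # create the output folder if needed

    meta_path = out / "metadata.json"  # hypothetical file name
    urls_path = out / "urls.txt"       # hypothetical file name

    meta_path.write_text(
        json.dumps(metadata, indent=2, ensure_ascii=False), encoding="utf-8"
    )
    urls_path.write_text("\n".join(page_urls) + "\n", encoding="utf-8")

    log.info("Saved metadata to %s and %d URLs to %s", meta_path, len(page_urls), urls_path)
    return meta_path, urls_path
```

Keeping the metadata and the URL list as separate plain files makes it easy to inspect a failed run without opening the PDF.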

Requirements

Python

This script requires Python 3 and BeautifulSoup. To install the required packages:

Conda users (for the exact configuration I use)

conda env create -f ENV.yml
conda activate scrape_issuu

Pip users

pip3 install bs4

ImageMagick

This package also requires the convert command from ImageMagick.
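
To show how the convert command can merge page images into a PDF, here is a minimal sketch. `build_convert_command` and `merge_to_pdf` are hypothetical names, and it is an assumption that the page files have zero-padded names (for example page_001.jpg) so that alphabetical order matches page order; the script itself may call ImageMagick differently.

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(image_dir: str, pdf_path: str) -> list:
    """Build the argument list: convert <page1.jpg> <page2.jpg> ... <out.pdf>."""
    pages = sorted(Path(image_dir).glob("*.jpg"))  # alphabetical order
    if not pages:
        raise FileNotFoundError(f"no .jpg pages found in {image_dir}")
    return ["convert", *[str(p) for p in pages], pdf_path]


def merge_to_pdf(image_dir: str, pdf_path: str) -> None:
    """Run ImageMagick's convert; raises if the tool is missing or exits non-zero."""
    if shutil.which("convert") is None:
        raise RuntimeError("ImageMagick 'convert' was not found on PATH")
    subprocess.run(build_convert_command(image_dir, pdf_path), check=True)
```

For example, `merge_to_pdf("./temp", "./out/document.pdf")` would merge every JPG in ./temp into a single PDF (the output file name here is only an example).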

Known issues

Memory issues: see this GitHub thread;
Authorization issue: see this Ask Ubuntu thread.

Credits

This package is mainly a refactoring of https://github.com/dkl3/py-issuu-scrape. Thanks dude :).

dkl3 was inspired by the Ruby script from pietrop: https://github.com/pietrop/issuu.com-downloader as well as dkl3's original Python script: https://github.com/dkl3/py-issuu-scrape
