Add PDF manual scraping and frontend output (EmulationStation variants, ES-DE) (#60)

Changes by maintainer to initial PR:
- refactored duplicated code segments
- removed manualFormat
- removed surplus methods
- updated documentation
- videoData and manualData as bytearray

Co-authored-by: Giorgio Ceolin <[email protected]>
Gemba and Giorgio Ceolin authored May 7, 2024
1 parent 4e3b9a3 commit 2a3ead6
Showing 39 changed files with 416 additions and 155 deletions.
5 changes: 5 additions & 0 deletions cache/priorities.xml.example
@@ -106,4 +106,9 @@
<source>esgamelist</source>
<source>screenscraper</source>
</order>
<order type="manual">
<source>import</source>
<source>esgamelist</source>
<source>screenscraper</source>
</order>
</priorities>
2 changes: 1 addition & 1 deletion docs/CACHE.md
@@ -10,7 +10,7 @@ The default base folder for all of Skyscrapers' locally cached data is in the `/

**Resource and scraping module priorities**

There is ONE file that you can and should edit inside each of the `/home/<USER>/.skyscraper/cache/<PLATFORM>` folders. That file is called `priorities.xml` and decides the scraper priority of resources for each resource type. For instance, if you know that `thegamesdb` always provides the best `descriptions` for games, you'd add an `<order type="description">` node with a `<source>thegamesdb</source>` subnode. You can have multiple `<source>` nodes, Skyscraper will then prefer the topmost source when generating a game list. If the topmost isn't found it'll prioritize the next one and so forth. Any source that isn't listed with an `<order>` node will be prioritized using timestamps for when each resource was added to the cache. So you don't _have_ to add all of them.
There is ONE file that you can and should edit inside each of the `/home/<USER>/.skyscraper/cache/<PLATFORM>` folders. That file is called `priorities.xml` and decides the scraper priority of resources for each resource type. For instance, if you know that `thegamesdb` always provides the best `descriptions` for games, you'd add an `<order type="description">` node with a `<source>thegamesdb</source>` subnode. You can have multiple `<source>` nodes, Skyscraper will then prefer the topmost source when generating a game list. If the topmost isn't found it'll prioritize the next one and so forth. Any source that isn't listed with an `<order>` node will be prioritized using timestamps (newest wins) for when each resource was added to the cache. So you don't _have_ to add all of them.

Skyscraper provides the example file `/home/<USER>/.skyscraper/cache/priorities.xml.example`. Please don't edit this file manually, as it will be overwritten when you update Skyscraper. When a platform is scraped for the first time, it will automatically copy the example file to `/home/<USER>/.skyscraper/cache/<PLATFORM>/priorities.xml` unless it already exists. You can of course also copy the file yourself before scraping a platform. If you do so, be sure to remove the `.example` part of the filename so it's just called `priorities.xml`.
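
The lookup rule described above (topmost listed source first, otherwise the newest cached resource) can be sketched as follows. This is a minimal illustration in plain C++17, not Skyscraper's actual implementation: `CachedResource` and `pickResource` are hypothetical names, and the timestamp is assumed to be a simple integer where a larger value means newer.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for one cached resource of a given type.
struct CachedResource {
    std::string source;     // scraping module that delivered it, e.g. "screenscraper"
    std::int64_t timestamp; // when it was added to the cache (larger = newer)
};

// Returns the preferred resource, or nullptr if there are no candidates.
// 'order' mirrors the <source> nodes of one <order type="..."> node, topmost first.
const CachedResource *pickResource(const std::vector<CachedResource> &candidates,
                                   const std::vector<std::string> &order) {
    // 1. Walk the configured sources from top to bottom; the first hit wins.
    for (const auto &source : order) {
        for (const auto &candidate : candidates) {
            if (candidate.source == source) {
                return &candidate;
            }
        }
    }
    // 2. No listed source matched: fall back to the newest resource.
    const CachedResource *newest = nullptr;
    for (const auto &candidate : candidates) {
        if (newest == nullptr || candidate.timestamp > newest->timestamp) {
            newest = &candidate;
        }
    }
    return newest;
}
```

With the `manual` order from `priorities.xml.example` (`import`, `esgamelist`, `screenscraper`), a manual from `import` would be preferred over a newer one from `screenscraper` in this sketch.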

17 changes: 12 additions & 5 deletions docs/CHANGELOG.md
@@ -1,13 +1,20 @@
## Changes

### Version 3.12.0 (TBA)

- Added: Support for scraping of PDF manuals (Modules screenscraper, import and
esgamelist) and gamelist output with these manuals for frontends (ES-DE
Frontend, some EmulationStation variants). See configuration options
[`manuals=true`](CONFIGINI.md#manuals) and
[`gameListVariants=enable-manuals`](CONFIGINI.md#gamelistvariants). Thanks for
the initial PR, @pandino

### Version 3.11.0 (2024-04-15)

- Added: Support for EmulationStation Desktop Edition (ES-DE Frontend). Use
[`frontend=esde`](http://localhost:8000/skyscraper/CONFIGINI/#frontend) in
`config.ini` and see
[documentation](http://localhost:8000/skyscraper/FRONTENDS/#emulationstation-desktop-edition-es-de)
on the default settings. Thanks for the hints and for testing, @maxexcloo,
@Nargash
[`frontend=esde`](CONFIGINI.md#frontend) in `config.ini` and see
[documentation](FRONTENDS.md#emulationstation-desktop-edition-es-de) on the
default settings. Thanks for the hints and for testing, @maxexcloo, @Nargash
- Added: Entries in
[`aliasMap.csv`](https://github.com/Gemba/skyscraper/blob/master/aliasMap.csv)
are now also applicable for Screenscraper. Thanks, @retrobit.
8 changes: 8 additions & 0 deletions docs/CLIHELP.md
@@ -442,6 +442,10 @@ This flag forces Skyscraper to use the filename (excluding extension) instead of

When gathering data from any of the scraping modules many potential entries will be returned. Normally Skyscraper chooses the best entry for you. But should you wish to choose the best entry yourself, you can enable this flag. Skyscraper will then list the returned entries and let you choose which one is the best one.

#### manuals

By default, Skyscraper doesn't scrape or cache game manuals, because not all scraping sites provide them and only some frontends can display PDF manuals. Use this flag to enable it. Consider setting this in [`config.ini`](CONFIGINI.md#manuals) instead.

#### nobrackets

Use this flag to disable any bracket notes when generating the game list. It will disable notes such as `(Europe)` and `[AGA]` completely. This flag is only relevant when generating the game list. It makes no difference when gathering data into the resource cache. Consider setting this in [`config.ini`](CONFIGINI.md#brackets) instead.
@@ -518,6 +522,10 @@ Only relevant when generating an EmulationStation, a Retrobat or a Pegasus game

When generating gamelists, skip processing covers that already exist in the media output folder.

#### skipexistingmanuals

When generating gamelists, skip copying manuals that already exist in the media output folder.

#### skipexistingmarquees

When generating gamelists, skip processing marquees that already exist in the media output folder.
27 changes: 27 additions & 0 deletions docs/CONFIGINI.md
@@ -83,6 +83,7 @@ This is an alphabetical index of all configuration options including the section
| [frontend](CONFIGINI.md#frontend) | Y | | | |
| [gameListBackup](CONFIGINI.md#gamelistbackup) | Y | | Y | |
| [gameListFolder](CONFIGINI.md#gamelistfolder) | Y | Y | Y | |
| [gameListVariants](CONFIGINI.md#gamelistvariants) | | | Y | |
| [hints](CONFIGINI.md#hints) | Y | | | |
| [importFolder](CONFIGINI.md#importfolder) | Y | Y | | |
| [includeFrom](CONFIGINI.md#includefrom) | Y | Y | | |
@@ -93,6 +94,7 @@ This is an alphabetical index of all configuration options including the section
| [lang](CONFIGINI.md#lang) | Y | Y | | |
| [langPrios](CONFIGINI.md#langprios) | Y | Y | | |
| [launch](CONFIGINI.md#launch) | Y | Y | Y | |
| [manuals](CONFIGINI.md#manuals) | Y | Y | | |
| [maxFails](CONFIGINI.md#maxfails) | Y | | | |
| [maxLength](CONFIGINI.md#maxlength) | Y | Y | Y | Y |
| [mediaFolder](CONFIGINI.md#mediafolder) | Y | Y | Y | |
@@ -959,3 +961,28 @@ However, folder data is not cached by Skyscraper, which means if you delete your

Default value: false
Allowed in sections: Only for frontends `[emulationstation]`, `[esde]` or `[retrobat]`

---

#### manuals

By default, Skyscraper doesn't scrape or cache game manuals, because not all scraping sites provide them and only some frontends can display PDF manuals. If enabled, Skyscraper collects game manuals from the scraping modules that provide them. For the ES-DE frontend, no further option is needed to output the PDF manuals to the appropriate folder. For other EmulationStation forks, see also the option [gameListVariants](CONFIGINI.md#gamelistvariants).

Default value: false
Allowed in sections: `[main]`, `[<PLATFORM>]`

---

#### gameListVariants

This is a comma-separated list of options for the different gamelist variants used by the various EmulationStation forks. Currently only `enable-manuals` is evaluated as a variant: it generates `<manual/>` entries in the gamelist for the game manuals scraped or found in the cache, provided the `manuals` configuration option is also enabled. This option is not needed for the ES-DE frontend to output game manuals.

**Example(s)**

```ini
[emulationstation]
gameListVariants="enable-manuals"
```

Default value: unset
Allowed in sections: Only for frontend `[emulationstation]`
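
As described above, `<manual/>` entries for the `[emulationstation]` frontend require both `manuals` and the `enable-manuals` variant. A minimal sketch of that check in plain C++17 follows. The helper names `hasVariant` and `emitManualEntries` are hypothetical, and trimming whitespace around list items is an assumption of this sketch, not documented behavior.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: true if the comma-separated list contains 'wanted'.
// Surrounding spaces and tabs of each item are ignored (an assumption).
bool hasVariant(const std::string &variants, const std::string &wanted) {
    std::istringstream stream(variants);
    std::string item;
    while (std::getline(stream, item, ',')) {
        const auto first = item.find_first_not_of(" \t");
        if (first == std::string::npos) {
            continue; // empty or whitespace-only item
        }
        const auto last = item.find_last_not_of(" \t");
        if (item.substr(first, last - first + 1) == wanted) {
            return true;
        }
    }
    return false;
}

// Hypothetical helper: both the manuals option and the variant must be set.
bool emitManualEntries(bool manuals, const std::string &gameListVariants) {
    return manuals && hasVariant(gameListVariants, "enable-manuals");
}
```

In this sketch, setting only one of the two options yields no `<manual/>` entries, which matches the "if also ... enabled" wording of the option description.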
5 changes: 3 additions & 2 deletions docs/IMPORT.md
@@ -10,12 +10,13 @@ The following describes how to import your own custom textual, artwork and / or

Be sure to also check the `--cache edit` option [here](CLIHELP.md#--cache-editnewtype).

### Images and Videos
### Images, Videos and Game Manuals

To import videos or images into the resource cache, use the following procedure:

- Name your image or video file with the _exact_ base name of the rom you wish to connect it to. Example: `Bubble Bobble.nes` will import images with a filename of `Bubble Bobble.jpg` or `Bubble Bobble.png` or other well-known image formats. As long as the base name is an _exact_ match. Same goes for video files. I recommend only making use of well-known video formats since Skyscraper imports them directly without conversion (unless you convert them as described [here](CONFIGINI.md#videoconvertcommand)), so they need to be supported directly by the frontend you plan to use.
- Place all of your images or videos in the `/home/<USER>/.skyscraper/import/screenshots`, `covers`, `wheels`, `marquees` or `videos` folders.
- Game manuals are expected to use PDF format and have the extension `.pdf`. The base name must match the ROM file, thus the game manual of the example is `Bubble Bobble.pdf`.
- Place all of your images, videos or game manuals in the `/home/<USER>/.skyscraper/import/<PLATFORM>/screenshots`, `covers`, `wheels`, `marquees`, `videos` or `manuals` folders.
- Now run Skyscraper with `Skyscraper -p <PLATFORM> -s import`. If you named your files correctly, they will now be imported. Look for the green 'YES' in the output at the rom(s) you've placed files for. This will tell you if it succeeded or not.
- The data is now imported into the resource cache. To make use of it, read [here](#how-to-actually-use-the-data).
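
The naming rule for manuals in the list above (same base name as the ROM, extension `.pdf`) can be illustrated with a small sketch in plain C++17. The helper names `baseName` and `isManualForRom` are hypothetical, and the case-sensitive comparison is an assumption of this sketch. It is not how Skyscraper's import module is implemented.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: file name without its last extension,
// e.g. "Bubble Bobble.nes" -> "Bubble Bobble".
std::string baseName(const std::string &fileName) {
    const auto dot = fileName.rfind('.');
    return dot == std::string::npos ? fileName : fileName.substr(0, dot);
}

// Hypothetical helper: true if 'candidate' is a ".pdf" file whose base name
// is an exact, case-sensitive match for the ROM's base name.
bool isManualForRom(const std::string &rom, const std::string &candidate) {
    const std::string ext = ".pdf";
    if (candidate.size() <= ext.size() ||
        candidate.compare(candidate.size() - ext.size(), ext.size(), ext) != 0) {
        return false;
    }
    return baseName(candidate) == baseName(rom);
}
```

For the example ROM `Bubble Bobble.nes`, only `Bubble Bobble.pdf` matches here. A file such as `Bubble Bobble (USA).pdf` does not.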

4 changes: 2 additions & 2 deletions docs/SCRAPINGMODULES.md
@@ -18,14 +18,14 @@ Below follows a description of all scraping modules.
- API request limit: _20k per day for registered users_
- Thread limit: _1 or more depending on user credentials_
- Platform support: _[Check list under "Systémes"](https://www.screenscraper.fr)_ or see `screenscraper_platforms.json` sibling to your `config.ini`
- Media support: _`cover`, `screenshot`, `wheel`, `marquee`, `video`_
- Media support: _`cover`, `screenshot`, `wheel`, `manual`, `marquee`, `video`_
- Example use: `Skyscraper -p snes -s screenscraper`

ScreenScraper is probably the most versatile and complete retro gaming database out there. It searches for games using either the checksums of the files or by comparing the _exact_ file name to entries in their database.

It can be used for gathering data for pretty much all platforms, but it does have issues with platforms that are ISO based. Still, even for those platforms, it does locate some games.

It has the best support for the `wheel` and `marquee` artwork types of any of the databases, and also contains videos for a lot of the games.
It has the best support for the `wheel` and `marquee` artwork types of any of the databases, and also contains videos and manuals for a lot of the games.

I strongly recommend supporting them by contributing data to the database, or by supporting them with a bit of money. This can also give you more threads to scrape with.

1 change: 1 addition & 0 deletions src/abstractfrontend.h
@@ -59,6 +59,7 @@ class AbstractFrontend : public QObject {
virtual QString getMarqueesFolder() { return QString(); };
virtual QString getTexturesFolder() { return QString(); };
virtual QString getVideosFolder() { return QString(); };
virtual QString getManualsFolder() { return QString(); };
virtual void sortEntries(QList<GameEntry> &gameEntries);

protected:
19 changes: 19 additions & 0 deletions src/abstractscraper.cpp
@@ -149,11 +149,17 @@ void AbstractScraper::populateGameEntry(GameEntry &game) {
getVideo(game);
}
break;
case MANUAL:
if (config->manuals) {
getManual(game);
}
break;
default:;
}
}
}

// TODO: openretro and worldofspectrum
void AbstractScraper::getDescription(GameEntry &game) {
if (descriptionPre.isEmpty()) {
return;
@@ -176,6 +182,7 @@ void AbstractScraper::getDescription(GameEntry &game) {
game.description = StrTools::stripHtmlTags(game.description);
}

// TODO: openretro and worldofspectrum
void AbstractScraper::getDeveloper(GameEntry &game) {
for (const auto &nom : developerPre) {
if (!checkNom(nom)) {
@@ -188,6 +195,7 @@ void AbstractScraper::getDeveloper(GameEntry &game) {
game.developer = data.left(data.indexOf(developerPost.toUtf8()));
}

// TODO: openretro and worldofspectrum
void AbstractScraper::getPublisher(GameEntry &game) {
if (publisherPre.isEmpty()) {
return;
@@ -203,6 +211,7 @@ void AbstractScraper::getPublisher(GameEntry &game) {
game.publisher = data.left(data.indexOf(publisherPost.toUtf8()));
}

// TODO: openretro and worldofspectrum
void AbstractScraper::getPlayers(GameEntry &game) {
if (playersPre.isEmpty()) {
return;
@@ -218,6 +227,7 @@ void AbstractScraper::getPlayers(GameEntry &game) {
game.players = data.left(data.indexOf(playersPost.toUtf8()));
}

// TODO: only for html scrape modules (currently none)
void AbstractScraper::getAges(GameEntry &game) {
if (agesPre.isEmpty()) {
return;
@@ -233,6 +243,7 @@ void AbstractScraper::getAges(GameEntry &game) {
game.ages = data.left(data.indexOf(agesPost.toUtf8()));
}

// TODO: openretro and worldofspectrum
void AbstractScraper::getTags(GameEntry &game) {
if (tagsPre.isEmpty()) {
return;
@@ -248,6 +259,7 @@ void AbstractScraper::getTags(GameEntry &game) {
game.tags = data.left(data.indexOf(tagsPost.toUtf8()));
}

// TODO: openretro and worldofspectrum
void AbstractScraper::getRating(GameEntry &game) {
if (ratingPre.isEmpty()) {
return;
Expand All @@ -270,6 +282,7 @@ void AbstractScraper::getRating(GameEntry &game) {
}
}

// TODO: openretro and worldofspectrum
void AbstractScraper::getReleaseDate(GameEntry &game) {
if (releaseDatePre.isEmpty()) {
return;
@@ -286,6 +299,7 @@ void AbstractScraper::getReleaseDate(GameEntry &game) {
data.left(data.indexOf(releaseDatePost.toUtf8())).simplified();
}

// TODO: openretro and worldofspectrum
void AbstractScraper::getCover(GameEntry &game) {
if (coverPre.isEmpty()) {
return;
@@ -312,6 +326,7 @@ void AbstractScraper::getCover(GameEntry &game) {
}
}

// TODO: openretro only
void AbstractScraper::getScreenshot(GameEntry &game) {
if (screenshotPre.isEmpty()) {
return;
@@ -340,6 +355,7 @@ void AbstractScraper::getScreenshot(GameEntry &game) {
}
}

// TODO: only for html scrape modules (currently none)
void AbstractScraper::getWheel(GameEntry &game) {
if (wheelPre.isEmpty()) {
return;
@@ -366,6 +382,7 @@ void AbstractScraper::getWheel(GameEntry &game) {
}
}

// TODO: openretro only
void AbstractScraper::getMarquee(GameEntry &game) {
if (marqueePre.isEmpty()) {
return;
@@ -392,6 +409,7 @@ void AbstractScraper::getMarquee(GameEntry &game) {
}
}

// TODO: only for html scrape modules (currently none)
void AbstractScraper::getTexture(GameEntry &game) {
if (texturePre.isEmpty()) {
return;
@@ -421,6 +439,7 @@ void AbstractScraper::getTexture(GameEntry &game) {
}
}

// TODO: only for html scrape modules (currently none)
void AbstractScraper::getVideo(GameEntry &game) {
if (videoPre.isEmpty()) {
return;
1 change: 1 addition & 0 deletions src/abstractscraper.h
@@ -82,6 +82,7 @@ class AbstractScraper : public QObject {
virtual void getTexture(GameEntry &game);
virtual void getTitle(GameEntry &);
virtual void getVideo(GameEntry &game);
virtual void getManual(GameEntry &game) { (void)game; };

virtual void nomNom(const QString nom, bool including = true);
bool checkNom(const QString nom);
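
The pattern in these two hunks (a no-op default `getManual()` in the base class, called from `populateGameEntry()` only when `config->manuals` is set) can be sketched without Qt as follows. All type names here (`SketchSettings`, `SketchGameEntry`, `SketchScraper`, `SketchPdfScraper`) are hypothetical, simplified stand-ins and not the project's classes. `std::string` replaces the byte array used for `manualData`.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for the settings and game entry types.
struct SketchSettings {
    bool manuals = false; // mirrors the 'manuals' option, default false
};

struct SketchGameEntry {
    std::string manualData; // raw PDF bytes in the real code; a string here
};

class SketchScraper {
public:
    explicit SketchScraper(SketchSettings settings) : settings(settings) {}
    virtual ~SketchScraper() = default;

    // Only asks for a manual when the option is enabled.
    void populateGameEntry(SketchGameEntry &game) {
        if (settings.manuals) {
            getManual(game);
        }
    }

protected:
    // Default: modules without manual support do nothing.
    virtual void getManual(SketchGameEntry &game) { (void)game; }

private:
    SketchSettings settings;
};

// A module that does provide manuals overrides the hook.
class SketchPdfScraper : public SketchScraper {
public:
    using SketchScraper::SketchScraper;

protected:
    void getManual(SketchGameEntry &game) override {
        game.manualData = "%PDF-"; // placeholder for downloaded PDF data
    }
};
```

The no-op default means existing scraping modules need no change. Only modules that can deliver manuals have to override the hook, and even those are skipped while the option is off.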
2 changes: 1 addition & 1 deletion src/arcadedb.cpp
@@ -173,7 +173,7 @@ void ArcadeDB::getVideo(GameEntry &game) {
game.videoData.length() > 4096) {
game.videoFormat = "mp4";
} else {
game.videoData = "";
game.videoData = QByteArray();
}
}
