7shi/gemini-paper-summarizer

Gemini Paper Summarizer

Overview

A command-line tool that uses Gemini API to generate summaries of academic papers.

Note: Summaries can be generated in Chinese, English, French, German, Japanese, Korean, or Spanish.

Examples

See the examples directory for sample outputs, including summaries of:

Other Summaries in Japanese

DeepSeek

Prerequisites

Note: If you find it difficult to set up the environment locally, please refer to the following Google Colab Notebook by @shoei05 (explained in Japanese):

Installation

  1. Clone the repository
  2. Install dependencies:
    uv sync
    
  3. Create a .env file in the project directory with your Gemini API key:
    GEMINI_API_KEY=your_api_key_here
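
To check that the key in `.env` is readable before running a long job, a small standalone script along these lines may help. It is only a sketch using the Python standard library; the helper names `load_env_file` and `get_api_key` are hypothetical and are not part of this tool, which may load the file differently.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Read KEY=VALUE lines from a .env file into os.environ.

    Values already present in the environment are left untouched.
    """
    env_path = Path(path)
    if not env_path.is_file():
        return
    for raw in env_path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        os.environ.setdefault(key.strip(), value.strip().strip("\"'"))


def get_api_key(path=".env"):
    """Return GEMINI_API_KEY, or raise a readable error if it is missing."""
    load_env_file(path)
    key = os.environ.get("GEMINI_API_KEY")
    if not key:
        raise RuntimeError("GEMINI_API_KEY is not set; add it to .env")
    return key
```

Calling `get_api_key()` from the project directory either returns the key or fails immediately with a clear message.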

Usage

uv run gp-summarize path/to/paper.pdf

The tool will generate several markdown files with different types of summaries:

  1. A translation of the abstract
  2. A summary of the entire paper
  3. A JSON structure of the paper's chapters and sections
  4. Individual summaries (not translations) for each main section
  5. Items 1-4 above combined into a single file

Output files are named after the input PDF. Files 1-4 above are saved in a directory named after the PDF and numbered sequentially (e.g., paper/001.md, paper/002.md, etc.). The combined file is named paper.md.
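
The naming scheme can be sketched as follows. This is an illustration rather than the tool's own code: the function `output_paths` is hypothetical, and it assumes the output lands next to the input PDF when no output option is given.

```python
from pathlib import Path


def output_paths(pdf_path, count):
    """Return (numbered intermediate files, combined file) for one PDF.

    Assumption: outputs are placed beside the input PDF.
    """
    pdf = Path(pdf_path)
    out_dir = pdf.with_suffix("")       # paper.pdf -> paper/
    numbered = [out_dir / f"{i:03d}.md" for i in range(1, count + 1)]
    combined = pdf.with_suffix(".md")   # paper.pdf -> paper.md
    return numbered, combined


numbered, combined = output_paths("paper.pdf", 3)
print([p.as_posix() for p in numbered])  # ['paper/001.md', 'paper/002.md', 'paper/003.md']
print(combined.as_posix())               # paper.md
```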

Note: If the process is interrupted (e.g., by Ctrl+C or a 429 rate limit error), simply run the same command again. Existing output files are skipped, so the run resumes where it stopped.
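
The skip-if-exists behavior described in the note can be sketched as below. This is a minimal illustration, not the tool's implementation: `run_steps` and the `(filename, producer)` step format are hypothetical, and writing through a temporary file is an added precaution that the original may not use.

```python
from pathlib import Path


def run_steps(steps, out_dir):
    """Run each (filename, producer) step unless its output already exists.

    Returns the list of filenames that were actually produced in this run.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    executed = []
    for name, produce in steps:
        target = out_dir / name
        if target.exists():
            continue  # finished in an earlier run; do not call the API again
        text = produce()
        # Write to a temporary name first so that an interruption never
        # leaves a half-written file that a later run would treat as done.
        tmp = target.with_name(target.name + ".tmp")
        tmp.write_text(text, encoding="utf-8")
        tmp.replace(target)
        executed.append(name)
    return executed
```

Because a step is skipped only when its final file exists, a run that fails partway (for example on a rate limit error raised inside `produce`) repeats only the unfinished steps the next time.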

Output Format

For each prompt, the tool generates a markdown file containing:

  • Title (prompt number)
  • Token usage statistics
  • Prompt
  • AI-generated response

The section structure will be displayed in both JSON format and as a hierarchical list.
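
As an illustration of rendering a JSON section structure as a hierarchical list, consider the sketch below. The schema (`title` and `children` keys) and the function `render_sections` are assumptions made for this example; the actual JSON produced by the tool may use different field names.

```python
import json

# Hypothetical section structure; the real schema may differ.
SECTIONS_JSON = """
[
  {"title": "1 Introduction", "children": []},
  {"title": "2 Method", "children": [
    {"title": "2.1 Model", "children": []},
    {"title": "2.2 Training", "children": []}
  ]}
]
"""


def render_sections(sections, depth=0):
    """Turn nested section dicts into indented markdown list lines."""
    lines = []
    for section in sections:
        lines.append("  " * depth + "- " + section["title"])
        lines.extend(render_sections(section.get("children", []), depth + 1))
    return lines


print("\n".join(render_sections(json.loads(SECTIONS_JSON))))
# - 1 Introduction
# - 2 Method
#   - 2.1 Model
#   - 2.2 Training
```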

Command-Line Options

python -m gp_summarize [-h] [-d OUTPUT_DIR] [-o OUTPUT] [-l {de,en,es,fr,ja,ko,zh}] [--version] pdf_paths [pdf_paths ...]
  • pdf_paths: Required. Path(s) to one or more PDF files to summarize
    • Multiple PDF files can be specified
    • Wildcards (*) are supported on Windows
  • -d, --output-dir: Optional. Specify the output directory for intermediate files
    • Recommended when processing multiple PDF files
  • -o, --output: Optional. Specify the output file for the summary
    • Can only be used with a single PDF file
  • -l, --language: Optional. Specify the output language
    • Supported languages: de (German), en (English), es (Spanish), fr (French), ja (Japanese), ko (Korean), zh (Chinese)
    • Default: Based on system language settings
  • --version: Display version information
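
For readers who want to see how such an interface fits together, here is a sketch of an `argparse` parser that mirrors the usage line and the rules above. It is not the project's source: the function names, the placeholder version string, and the way wildcards are expanded with `glob` are all assumptions made for illustration.

```python
import argparse
import glob

LANGUAGES = ["de", "en", "es", "fr", "ja", "ko", "zh"]


def build_parser():
    parser = argparse.ArgumentParser(prog="gp_summarize")
    parser.add_argument("-d", "--output-dir",
                        help="output directory for intermediate files")
    parser.add_argument("-o", "--output",
                        help="output file for the summary (single PDF only)")
    # Default None stands in for "based on system language settings".
    parser.add_argument("-l", "--language", choices=LANGUAGES, default=None,
                        help="output language")
    # The version string is a placeholder, not the tool's real version.
    parser.add_argument("--version", action="version", version="%(prog)s 0.0.0")
    parser.add_argument("pdf_paths", nargs="+",
                        help="path(s) to one or more PDF files")
    return parser


def parse_args(argv=None):
    parser = build_parser()
    args = parser.parse_args(argv)
    # Expand wildcards here, since the Windows command prompt passes
    # patterns such as *.pdf through unexpanded.
    expanded = []
    for pattern in args.pdf_paths:
        matches = sorted(glob.glob(pattern))
        expanded.extend(matches or [pattern])  # keep unmatched input as-is
    args.pdf_paths = expanded
    if args.output and len(args.pdf_paths) > 1:
        parser.error("-o/--output can only be used with a single PDF file")
    return args
```

Checking the `-o` restriction after wildcard expansion means a pattern that matches several files is rejected in the same way as several explicit paths.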

Examples

Summarize a single PDF:

python -m gp_summarize paper.pdf

Summarize multiple PDFs:

python -m gp_summarize paper1.pdf paper2.pdf

Specify an output directory:

python -m gp_summarize -d ./outputs paper.pdf

Summarize in a specific language:

python -m gp_summarize -l ja paper.pdf

License

CC0 1.0 Universal
