    Repositories list

    • uwazi

      Public
      Uwazi is a web-based, open-source solution for building and sharing document collections
      TypeScript
      MIT License
      Updated Nov 5, 2024
    • pdf_information_extraction
      Python
      Updated Nov 3, 2024
    • Trainable Entity Extractor
      Python
      Apache License 2.0
      Updated Nov 1, 2024
    • A Docker-powered service for PDF document layout analysis. It segments and classifies the parts of PDF pages, identifying elements such as text, titles, pictures, and tables.
      Python
      Apache License 2.0
      Updated Nov 1, 2024
    • queue-processor
      Python
      Updated Nov 1, 2024
    • Text selection handling and highlighting
      TypeScript
      Apache License 2.0
      Updated Oct 31, 2024
    • Python
      Updated Oct 22, 2024
    • pdf-document-layout-analysis-async
      Python
      Updated Oct 22, 2024
    • TypeScript
      Apache License 2.0
      Updated Oct 17, 2024
    • HTML
      MIT License
      Updated Oct 16, 2024
    • docker-translation-service
      Python
      Apache License 2.0
      Updated Oct 9, 2024
    • ml-cloud-connector
      Python
      Updated Oct 3, 2024
    • An HTTP service to OCR PDFs based on a Redis queue.
      Python
      MIT License
      Updated Sep 23, 2024
    • An HTTP service to convert documents to PDF based on a Redis queue.
      Python
      MIT License
      Updated Sep 19, 2024
    • Python
      Updated Jul 4, 2024
    • Python
      MIT License
      Updated Jul 4, 2024
    • This project aims to extract Table of Contents (TOC) information from PDF files using the outputs generated by the pdf-document-layout-analysis service. By leveraging the segmentation and classification capabilities of the underlying analysis tool, this project automates the process of identifying and structuring the document's TOC.
      Python
      Apache License 2.0
      Updated Jun 10, 2024
    • This project aims to extract text from PDF files using the outputs generated by the pdf-document-layout-analysis service. By leveraging the segmentation and classification capabilities of the underlying analysis tool, this project automates the process of text extraction from PDF files.
      Python
      Apache License 2.0
      Updated Jun 4, 2024
    • Python
      Updated Apr 26, 2024
    • preserve

      Public
      Preserve is a tool for capturing and saving online digital content. Integrated with Uwazi, Preserve captures content from websites, social media and communication platforms, and archives them with accompanying key metadata to ensure evidentiary value by establishing and demonstrating authenticity and chain of custody.
      TypeScript
      MIT License
      Updated Feb 23, 2024
    • Updated Jul 3, 2023
    • Python
      MIT License
      Updated May 25, 2023
    • Twitter crawler
      Python
      Updated Apr 3, 2023
    • Python
      Updated Dec 27, 2022
    • Mock server that simulates the ML server that processes documents for semantic search
      JavaScript
      Updated Dec 10, 2022
    • Python
      Updated Nov 21, 2022
    • uwazi-fixtures

      Public archive
      Shell
      Updated Jul 1, 2022
    • Python API to interact with Uwazi
      Python
      Updated Nov 5, 2021
    • casebox

      Public archive
      Casebox: Secure all your information and team communication in one place
      JavaScript
      Other
      Updated Oct 22, 2020
    • OpenEvSys

      Public archive
      OpenEvSys is free open source software designed for use by organisations who need a software tool to manage information on human rights violations
      PHP
      GNU Affero General Public License v3.0
      Updated Oct 22, 2020