idanless/sanitize-pdf-and-office

CDR (Cyber Data Removal) System

The CDR (Cyber Data Removal) system is a comprehensive solution designed to enhance the security of Office documents and PDF files. It offers a wide range of features to safeguard your files from potential security risks.

Features of CDR System:

  • Link Removal: The CDR system detects and removes embedded links from Office documents and PDF files, preventing users from accessing potentially harmful websites or downloading malicious content.
  • Binary File Conversion: CDR converts binary files within Office documents to their corresponding values, ensuring that any hidden or malicious code is eliminated.
  • PDF Attachment Removal: The system scans PDF files for attached files or documents and removes them to eliminate any potential security threats.
  • Full CDR Conversion: CDR converts PDF files to bitmap images, ensuring that the content is displayed as an image rather than text. This prevents any potential risks associated with malicious PDF content.

CDR System for Office Documents:

The CDR system offers advanced protection for Office documents, focusing on link removal, binary file conversion, comprehensive security checks, and macro removal. Here's how it works:

  • Link Removal: The system scans Office documents for embedded links and removes them, preventing users from accidentally clicking on malicious URLs.
  • Binary File Conversion: CDR converts binary files within Office documents to their corresponding values, removing any hidden or potentially malicious code.
  • Enhanced Security Checks: The system performs extensive security checks on Office documents, ensuring that they are free from malware, malicious macros, and other security threats.
  • Macro Removal: The system also offers the option to remove macros from Office documents, mitigating the risks associated with potential macro-based vulnerabilities.
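
As a rough illustration of the link-removal step for modern Office formats (.docx, .xlsx, .pptx), the sketch below uses only the Python standard library. It relies on the fact that these files are ZIP archives in which external hyperlinks are stored as relationships inside `.rels` parts. This is not necessarily how the project itself does it: the function name `neutralize_office_links` and the `about:blank` placeholder are assumptions made for this example. It rewrites the link target instead of deleting the relationship, so references from the document body stay valid.

```python
import xml.etree.ElementTree as ET
import zipfile

RELS_NS = "http://schemas.openxmlformats.org/package/2006/relationships"
HYPERLINK_TYPE = (
    "http://schemas.openxmlformats.org/officeDocument/2006/relationships/hyperlink"
)
PLACEHOLDER = "about:blank"  # assumption: a harmless target chosen for this sketch


def neutralize_office_links(src_path, dst_path):
    """Copy an OOXML file, pointing every external hyperlink at PLACEHOLDER.

    Returns the number of hyperlink relationships that were rewritten.
    """
    # Keep the relationships namespace as the default namespace on output.
    ET.register_namespace("", RELS_NS)
    changed = 0
    with zipfile.ZipFile(src_path) as src, \
            zipfile.ZipFile(dst_path, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            data = src.read(item.filename)
            if item.filename.endswith(".rels"):
                root = ET.fromstring(data)
                hits = 0
                for rel in root.findall(f"{{{RELS_NS}}}Relationship"):
                    if (rel.get("Type") == HYPERLINK_TYPE
                            and rel.get("TargetMode") == "External"):
                        rel.set("Target", PLACEHOLDER)
                        hits += 1
                if hits:
                    # Re-serialize only the parts that actually changed.
                    data = ET.tostring(root, encoding="UTF-8", xml_declaration=True)
                    changed += hits
            dst.writestr(item, data)
    return changed
```

This sketch only covers hyperlinks stored as relationships. Links embedded as field codes, and the legacy binary formats (.doc, .xls, .ppt), would need separate handling.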

CDR System for PDF Files:

When it comes to PDF files, the CDR system offers comprehensive protection through link removal, attachment removal, and full CDR conversion. Here are the key features:

  • Link Removal: The system scans PDF files for embedded links and removes them, ensuring users do not access potentially harmful websites.
  • Attachment Removal: CDR identifies and removes any attached files or documents within PDF files to eliminate potential security risks.
  • Full CDR Conversion: PDF files are converted to bitmap images, preventing any potential risks associated with text-based PDF content.

Pip Requirements:

To run the CDR system for Office documents and PDF files, install the following Python packages:

psutil
pywin32
PyPDF2
pdf2image
xlutils
xlrd

Microsoft Office (2009 and above) must also be installed on the machine; it is not a pip package.

You can install the Python packages using pip:

pip install psutil pywin32 PyPDF2 pdf2image xlutils xlrd

Install these dependencies before running the CDR system.
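
A quick way to confirm that the packages are importable before launching the tool is sketched below. The helper `missing_modules` is hypothetical and not part of the project. The module names are the import names of the packages listed above; `win32com` is used here as a stand-in for pywin32, which is an assumption about which of its modules the project imports.

```python
import importlib.util

# Import names for the listed packages (pywin32 is checked via win32com).
REQUIRED_MODULES = ["psutil", "win32com", "PyPDF2", "pdf2image", "xlutils", "xlrd"]


def missing_modules(names=REQUIRED_MODULES):
    """Return the subset of top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    print("Missing:", ", ".join(missing) if missing else "none")
```

This only checks that the modules can be located. It does not verify versions, and it cannot detect whether Microsoft Office itself is installed.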

PoC GUI for testing:

  • PDF Minimal: link removal and attachment removal.
  • PDF Full: PDF files are converted to bitmap images, preventing any potential risks associated with text-based PDF content.
  • Office: see "CDR System for Office Documents" above.

GUI POC

Example 1

Download Here:

Click here to download

Examples:

Example 1: Link Removal in Office Document

Example 1

Example 2: Binary File Conversion in Office Document

Example 2

Example 3: PDF Attachment Removal

Example 3

Example 4: PDF Link Removal

Example 4

Example 5: Excel Formula to Text

Example 5
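
For .xlsx workbooks, one possible way to turn formulas into plain values is sketched below with the standard library only. It is an assumption made for illustration, not the project's confirmed method (the requirements suggest the project works with xlrd and xlutils, which target the older .xls format). In an .xlsx sheet, a formula cell stores the formula in an `<f>` element next to its last calculated result in `<v>`, so removing `<f>` leaves only the cached value. The function name `flatten_xlsx_formulas` is hypothetical.

```python
import re
import zipfile

# Self-closing <f .../> is tried first, then <f ...>...</f>.
FORMULA_RE = re.compile(rb"<f\b[^>]*/>|<f\b[^>]*>.*?</f>", re.DOTALL)


def flatten_xlsx_formulas(src_path, dst_path):
    """Copy an .xlsx file with formula elements stripped from every worksheet.

    Returns the number of formula elements that were removed.
    """
    removed = 0
    with zipfile.ZipFile(src_path) as src, \
            zipfile.ZipFile(dst_path, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            data = src.read(item.filename)
            is_sheet = (item.filename.startswith("xl/worksheets/")
                        and item.filename.endswith(".xml"))
            if is_sheet:
                # Drop the formula but keep the cached <v> value beside it.
                data, count = FORMULA_RE.subn(b"", data)
                removed += count
            dst.writestr(item, data)
    return removed
```

The sketch leaves `xl/calcChain.xml` untouched, so Excel may offer to repair the result on opening. A fuller version would also remove that part together with its relationship and content-type entries.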
