This Python script automates the extraction of user identifiers (userCode and userId) from log files across multiple directories. It processes each log file, extracts unique identifiers, and saves them to a CSV file for easy analysis. This tool is useful for efficiently identifying and consolidating user information from large sets of log data.

PKHarsimran/LogUserExtractor

🚀 Log User Extractor

A Python script that processes log files, extracts unique user identifiers (userCode and userId), and saves them to a CSV file. It is intended for environments where large numbers of log files are spread across multiple directories.

✨ Features

  • 📂 Directory Scanning: Scans specified directories for log files.
  • 🔍 Identifier Extraction: Extracts user identifiers (userCode and userId) from log files.
  • 💾 CSV Output: Saves unique user identifiers to a CSV file.
  • ⚡ Parallel Processing: Handles multiple log files simultaneously, significantly reducing processing time.
  • 🔧 Robust Logging: Enhanced error handling and logging for improved monitoring and debugging.
  • ⚙️ Configurable Parameters: Easily specify file patterns, output file names, and log levels via a configuration file.

🛠 Requirements

  • 🐍 Python 3.x
  • 🐼 pandas

📥 Installation

  1. Clone the repository:

    git clone https://github.com/PKHarsimran/LogUserExtractor.git
  2. Navigate to the project directory:

    cd LogUserExtractor
  3. Install the required Python packages:

    pip install pandas

🚀 Usage

  1. Configure the script:

    Edit the config.ini file to specify your directories, file pattern, output file name, and log level:

    [Paths]
    log_directories = test
    output_csv = extracted_user_codes.csv
    
    [Settings]
    file_pattern = .*\.log$
    
    [Logging]
    log_filename = log_user_extractor.log
    log_level = INFO
  2. Run the script:

    python log_user_extractor.py
  3. Check the output:

    The script writes the unique user identifiers to the CSV file named by output_csv (extracted_user_codes.csv in the example configuration).
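
The config.ini keys from step 1 could be read with Python's standard configparser module. The sketch below is illustrative rather than the script's actual loader: the function name load_settings, the returned dictionary, and the assumption that several directories are given as a comma-separated list are all hypothetical.

```python
import configparser
import logging
import re

# Same keys as the example config.ini, inlined so the sketch runs on its own.
SAMPLE_CONFIG = r"""
[Paths]
log_directories = test
output_csv = extracted_user_codes.csv

[Settings]
file_pattern = .*\.log$

[Logging]
log_filename = log_user_extractor.log
log_level = INFO
"""


def load_settings(config_text):
    """Parse config text into a dict of ready-to-use settings (hypothetical helper)."""
    # interpolation=None stops '%' in a regex from being read as an interpolation marker.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)

    # Assumption: multiple directories are separated by commas.
    raw_dirs = parser["Paths"]["log_directories"]
    directories = [d.strip() for d in raw_dirs.split(",") if d.strip()]

    # Unknown level names fall back to INFO instead of raising.
    level_name = parser["Logging"]["log_level"].upper()
    log_level = getattr(logging, level_name, logging.INFO)

    return {
        "directories": directories,
        "output_csv": parser["Paths"]["output_csv"],
        "file_pattern": re.compile(parser["Settings"]["file_pattern"]),
        "log_filename": parser["Logging"]["log_filename"],
        "log_level": log_level,
    }


settings = load_settings(SAMPLE_CONFIG)
```

With the example values, the pattern .*\.log$ accepts a name such as app.log and rejects app.log.gz, because the name must end in .log.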

📊 Flowchart

[Image: flowchart of the script's workflow]

📊 Workflow

  1. Start
  2. Load Configuration from config.ini
  3. Initialize LogProcessor with directories and file pattern
    • Input: List of directories containing log files and the file pattern to match.
  4. Process log files
    • For each directory:
      • List files in the directory.
      • For each file:
        • Check if the file matches the pattern.
        • If true, process the file.
  5. Process the file
    • Read the file line by line.
    • For each line:
      • Extract userCode.
      • Extract userId.
      • Add identifiers to a set to ensure uniqueness.
  6. Save identifiers to CSV
    • Convert the set of identifiers to a DataFrame.
    • Save the DataFrame to a CSV file.
  7. End
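
The workflow steps above can be sketched in a few functions. This is a minimal illustration under stated assumptions, not the repository's LogProcessor: the log line format (key=value or JSON-style "key": "value" pairs), the function names, and the CSV column names are hypothetical. To stay dependency-free, the sketch writes the CSV with the standard csv module instead of a pandas DataFrame; with pandas, the save step would be roughly pd.DataFrame(rows, columns=[...]).to_csv(path, index=False).

```python
import csv
import os
import re
import tempfile


def _key_pattern(key):
    # Assumed format: key=value, key: value, or "key": "value".
    return re.compile(key + r"""["']?\s*[:=]\s*["']?([\w-]+)""")


PATTERNS = {"userCode": _key_pattern("userCode"), "userId": _key_pattern("userId")}


def find_log_files(directories, file_pattern):
    """Yield paths of files whose names match file_pattern (non-recursive)."""
    for directory in directories:
        for name in sorted(os.listdir(directory)):
            path = os.path.join(directory, name)
            if os.path.isfile(path) and file_pattern.match(name):
                yield path


def extract_identifiers(path):
    """Read one file line by line and return a set of (type, identifier) pairs."""
    found = set()
    with open(path, encoding="utf-8", errors="replace") as handle:
        for line in handle:
            for kind, pattern in PATTERNS.items():
                for value in pattern.findall(line):
                    found.add((kind, value))  # the set keeps identifiers unique
    return found


def save_identifiers(identifiers, output_csv):
    """Write the identifiers to a CSV file, sorted for stable output."""
    with open(output_csv, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["type", "identifier"])  # hypothetical column names
        writer.writerows(sorted(identifiers))


# Small demonstration on throwaway files.
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "app.log"), "w", encoding="utf-8") as handle:
        handle.write("login userCode=ABC123 userId=42\n")
        handle.write('{"userCode": "ABC123", "userId": "77"}\n')
    with open(os.path.join(tmp, "notes.txt"), "w", encoding="utf-8") as handle:
        handle.write("userId=ignored\n")  # skipped: name does not match the pattern

    identifiers = set()
    for log_path in find_log_files([tmp], re.compile(r".*\.log$")):
        identifiers |= extract_identifiers(log_path)

    output_csv = os.path.join(tmp, "extracted_user_codes.csv")
    save_identifiers(identifiers, output_csv)
    with open(output_csv, encoding="utf-8", newline="") as handle:
        rows = list(csv.reader(handle))
```

Collecting into a set before writing is what removes the duplicate userCode that appears on both demo lines.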

🚀 Recent and Planned Improvements

We're excited to share the latest updates and upcoming enhancements for the Log User Extractor script. These changes are designed to make the script smarter, faster, and more user-friendly!

🎉 Recently Implemented

⚡ Parallel Processing

  • Status: Implemented
  • Details: We've introduced parallel processing to handle multiple log files simultaneously. This enhancement significantly reduces the time required to process large datasets, making the script more efficient and scalable.
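
One way to process several log files at once is a worker pool from the standard concurrent.futures module. The README does not say which mechanism the script uses, so the thread pool, the function names, and the userId=value log format below are assumptions made for illustration.

```python
import re
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

USER_ID_RE = re.compile(r"userId=(\w+)")  # assumed log format


def extract_from_file(path):
    """Return the set of user IDs found in a single file."""
    text = Path(path).read_text(encoding="utf-8", errors="replace")
    return set(USER_ID_RE.findall(text))


def extract_parallel(paths, max_workers=4):
    """Run extract_from_file on each path in a thread pool and merge the results."""
    merged = set()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in input order; each result is one file's set.
        for result in pool.map(extract_from_file, paths):
            merged |= result
    return merged


# Small demonstration on throwaway files.
with tempfile.TemporaryDirectory() as tmp:
    paths = []
    bodies = ["userId=alice\nuserId=bob\n", "userId=bob\nuserId=carol\n"]
    for index, body in enumerate(bodies):
        path = Path(tmp) / f"app{index}.log"
        path.write_text(body, encoding="utf-8")
        paths.append(path)
    user_ids = extract_parallel(paths)
```

Threads mainly help when reading files is the bottleneck. If regex matching dominates, ProcessPoolExecutor offers the same map interface and can use multiple CPU cores.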

🔧 Enhanced Error Handling and Logging

  • Status: Implemented
  • Details: We've added robust error handling and logging mechanisms to track processing status and any issues that arise. This improvement enhances monitoring, debugging, and the overall reliability of the script.
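
As a rough sketch of what file-level error handling and logging could look like, the example below uses the standard logging module. The logger name, the message format, and both helper functions are hypothetical; only the log_filename and log_level settings come from the configuration file.

```python
import logging


def configure_logging(log_filename, log_level="INFO"):
    """Create a file-backed logger (hypothetical helper)."""
    logger = logging.getLogger("log_user_extractor")
    # Unknown level names fall back to INFO instead of raising.
    logger.setLevel(getattr(logging, log_level.upper(), logging.INFO))
    if not logger.handlers:  # avoid adding duplicate handlers on repeated calls
        handler = logging.FileHandler(log_filename, encoding="utf-8")
        handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
        logger.addHandler(handler)
    return logger


def read_lines_safely(path, logger):
    """Return the lines of a file, or an empty list if it cannot be read."""
    try:
        with open(path, encoding="utf-8", errors="replace") as handle:
            lines = handle.readlines()
    except OSError as exc:
        # Log the failure and continue so one bad file does not stop the run.
        logger.error("Could not read %s: %s", path, exc)
        return []
    logger.info("Read %d lines from %s", len(lines), path)
    return lines
```

A caller would create the logger once, for example configure_logging("log_user_extractor.log", "INFO"), and pass it to each file-processing call.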

⚙️ Configurable Parameters

  • Status: Implemented
  • Details: Users can now specify options such as file patterns, output file names, and log levels through a configuration file. This provides greater flexibility and customization.

📈 Progress Tracking

We believe in transparency and continuous improvement. Here's a snapshot of our progress:

  • Parallel Processing: ✅ Completed
  • Enhanced Error Handling and Logging: ✅ Completed
  • Configurable Parameters: ✅ Completed

Stay tuned for more updates as we continue to enhance the Log User Extractor. Your feedback and contributions are always welcome!

🀝 Contributing

We welcome contributions to enhance Log User Extractor. To contribute:

  1. 🍴 Fork the repository.
  2. 🌿 Create a new branch.
  3. πŸ’Ύ Make your changes and commit them.
  4. πŸš€ Push to the branch.
  5. πŸ”„ Create a new Pull Request.

We appreciate your help in making this project better for everyone!

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

📝 Acknowledgments

Special thanks to all the contributors who have helped in improving this project. Your efforts are highly valued!
