Quantium

Guide

Chronos is a high-performance tool for processing large-scale datasets. Whether you are working with raw log files, credential dumps, or other data-intensive inputs, its modular architecture and asynchronous, multi-threaded processing are designed to handle them efficiently. This guide gives step-by-step instructions for the main use cases: cleaning, filtering, merging, splitting, extracting, and sorting data.

Architecture and Core Features

Chronos is engineered around a robust, modular architecture that enables:

  • Asynchronous Processing: Non-blocking operations allow for continuous data flow.
  • Multi-threading: Utilizes multiple CPU cores to accelerate data processing.
  • Modular Design: Each module is dedicated to a specific task—cleaning, filtering, sorting, etc.—so you can tailor Chronos to your needs.
  • Flexible Format Support: Handles multiple credential formats (e.g., url:log:pass, log:pass).
  • Advanced Deduplication: Offers both fast, memory-intensive deduplication and a low-memory alternative for constrained environments.
  • Real-Time Monitoring: Interactive UI and detailed logs let you monitor progress and performance metrics.

Detailed Use Cases

1. Bulk Data Cleaning and Deduplication

Scenario: You have a massive dataset containing duplicate and malformed entries that need to be cleaned before further processing.

Modules Involved:

  • AntiPublic: Checks for duplicate entries across your dataset.
  • LpCleaner / UlpCleaner: Cleans up entries by normalizing formats, removing unwanted characters, and ensuring consistency.

Step-by-Step Process:

  1. Input Preparation: Ensure your dataset is formatted as url:log:pass or log:pass. Back up your raw data.
  2. Configure AntiPublic:
    • Source: Select the file or directory containing your raw data.
    • Mode: Choose Fast for quick processing or LowMemory if system resources are limited.
  3. Run LpCleaner/UlpCleaner:
    • Options: Set filters like minimum and maximum login/password lengths.
    • Deduplication: Enable deduplication to automatically remove duplicates.
  4. Review Output: Verify that the duplicates have been removed and the data is normalized.

Pro Tip

Always retain a backup of your original data. This ensures you can revert changes if needed.


2. Advanced Data Filtering

Scenario: You need to extract only the relevant entries from a mixed dataset using complex filtering criteria.

Module Involved:

  • Filter

Step-by-Step Process:

  1. Set Up the Filter Module:
    • Source: Specify the file or folder containing your data.
    • Filter Pattern: Define a regular expression that matches the data you need.
  2. Invert Option:
    • Use Case: Enable the invert setting to keep only the entries that do not match the pattern, which is useful for excluding unwanted data.
  3. Run the Filter: Chronos processes the data and outputs only the entries that match (or don’t match, if inverted) the regex.

Example Regex Patterns:

  • Email Extraction: /[a-z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{2,}/
  • URL Extraction: /https?:\/\/[^\s]+/
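
The matching and inverting behavior described above can be illustrated with a short Python sketch. This is not Chronos code: the function name filter_lines and its parameters are hypothetical, and case-insensitive matching is an assumption made here because the email pattern above lists only lowercase letters.

```python
import re
from typing import Iterable, Iterator


def filter_lines(lines: Iterable[str], pattern: str, invert: bool = False) -> Iterator[str]:
    """Yield lines that match `pattern`, or the non-matching ones when `invert` is True."""
    # Assumption: matching is case-insensitive; Chronos may behave differently.
    regex = re.compile(pattern, re.IGNORECASE)
    for line in lines:
        matched = regex.search(line) is not None
        if matched != invert:
            yield line


sample = [
    "visit https://example.com/login today",
    "no link on this line",
    "contact: Admin@Example.org",
]
url_pattern = r"https?://[^\s]+"
email_pattern = r"[a-z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{2,}"

with_urls = list(filter_lines(sample, url_pattern))
without_urls = list(filter_lines(sample, url_pattern, invert=True))
with_emails = list(filter_lines(sample, email_pattern))
```

Here with_urls keeps only the first line, while without_urls keeps the other two. Without the case-insensitive flag, the email pattern would not match the mixed-case address in the third line.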

3. Merging Multiple Data Sources

Scenario: You have data spread across multiple files or directories and need to consolidate it into one unified dataset.

Module Involved:

  • Joiner

Step-by-Step Process:

  1. Configure Joiner:
    • Sources: Specify multiple files or directories that contain the data.
    • Deduplication: Enable deduplication to avoid redundant entries.
  2. Merge Process: Chronos concatenates and organizes the data, ensuring consistent formatting across merged sources.
  3. Output Verification: Review the merged file to confirm that all data sources have been accurately combined.

Integration Tip

Ensure all data sources use the same formatting standards to prevent errors during the merge process.
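
As a rough illustration of merging with deduplication, the sketch below concatenates several line sources and drops repeated lines while keeping first-seen order. It is a generic example, not the Joiner implementation: merge_sources is a hypothetical name, and stripping line endings and skipping empty lines are assumptions.

```python
from typing import Iterable, Iterator


def merge_sources(sources: Iterable[Iterable[str]], deduplicate: bool = True) -> Iterator[str]:
    """Concatenate line sources in order, optionally dropping repeated lines."""
    seen = set()
    for source in sources:
        for raw in source:
            # Assumption: line endings are normalized so "b\n" and "b\r\n" compare equal.
            line = raw.rstrip("\r\n")
            if not line:
                continue  # skip empty lines
            if deduplicate:
                if line in seen:
                    continue
                seen.add(line)
            yield line


first = ["a\n", "b\n"]
second = ["b\r\n", "c", "\n"]

merged = list(merge_sources([first, second]))
merged_with_duplicates = list(merge_sources([first, second], deduplicate=False))
```

Open text files are iterables of lines, so file objects can be passed in place of the lists used here.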


4. Splitting Large Files

Scenario: A single file is too large to process efficiently, so you need to split it into smaller, manageable chunks.

Module Involved:

  • Splitter

Step-by-Step Process:

  1. Define Split Parameters:
    • Chunk Size: Decide the size of each split (number of lines or bytes).
    • Prefix: Specify an output prefix (e.g., Chunk, which yields files such as Chunk1 and Chunk2).
  2. Execute Splitter: Chronos divides the file into multiple smaller files automatically.
  3. Post-Split Review: Check each chunk to ensure data integrity before proceeding with further processing.
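
Splitting by line count can be sketched as follows. This is only an illustration of the idea, not how Splitter works internally: the function split_by_lines, the .txt extension, and the numbering scheme (prefix followed by 1, 2, 3, ...) are assumptions.

```python
from pathlib import Path
from typing import List


def split_by_lines(source: Path, out_dir: Path, prefix: str, lines_per_chunk: int) -> List[Path]:
    """Write `source` into numbered chunk files holding at most `lines_per_chunk` lines each."""
    if lines_per_chunk < 1:
        raise ValueError("lines_per_chunk must be at least 1")
    out_dir.mkdir(parents=True, exist_ok=True)
    chunks: List[Path] = []
    out = None
    # newline="" keeps the original line endings untouched on read and write.
    with source.open("r", encoding="utf-8", errors="replace", newline="") as src:
        for index, line in enumerate(src):
            if index % lines_per_chunk == 0:
                # Start a new chunk file every `lines_per_chunk` lines.
                if out is not None:
                    out.close()
                path = out_dir / f"{prefix}{len(chunks) + 1}.txt"
                chunks.append(path)
                out = path.open("w", encoding="utf-8", newline="")
            out.write(line)
    if out is not None:
        out.close()
    return chunks
```

For example, split_by_lines(Path("input.txt"), Path("out"), "Chunk", 1000000) would write out/Chunk1.txt, out/Chunk2.txt, and so on. Concatenating the chunks in order reproduces the input, which gives a simple integrity check for the post-split review.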

5. Extracting Data from Complex Archives

Scenario: You need to extract credential data from compressed archives, including nested archives or those protected by passwords.

Module Involved:

  • UlpExtractor

Step-by-Step Process:

  1. Archive Preparation: Gather all the archives (zip, rar, etc.) that contain your log files.
  2. Configure UlpExtractor:
    • Output Format: Choose from formats such as UrlLogPass, LogPassUrl, or LogPass.
    • Separator: Define the separator to be used (default is :).
    • Password Handling: Pre-configure any archive passwords in the settings.
  3. Extraction Process: Chronos extracts data even from nested archives.
  4. Validation: Review the extracted files and run deduplication (using UlpSorter, if needed) to ensure clean data.

6. Sorting Data Based on Custom Criteria

Scenario: You want to organize your credential entries based on specific patterns such as domain names, numerical sequences, or other custom filters.

Module Involved:

  • UlpSorter

Step-by-Step Process:

  1. Define Sorting Criteria:
    • Regex Patterns: Write regular expressions that capture the patterns you need (e.g., domain names like gmail.com or numeric patterns like /\d{4}/).
  2. Configure UlpSorter:
    • Source: Select the file with the credentials.
    • Sort and Filter: Set up the criteria and enable deduplication to ensure each category is unique.
  3. Execute Sorting: Chronos sorts the data into separate files based on the defined criteria.
  4. Review Sorted Output: Verify that entries have been accurately grouped according to your specifications.
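
The grouping step can be pictured as routing each line into one bucket per matching pattern. The sketch below is a generic illustration, not UlpSorter itself: sort_into_buckets, the rule names, and the choice to let one line land in several buckets are all assumptions.

```python
import re
from typing import Dict, Iterable, List


def sort_into_buckets(
    lines: Iterable[str], rules: Dict[str, str], deduplicate: bool = True
) -> Dict[str, List[str]]:
    """Group lines by every named regex they match; a line may appear in several buckets."""
    compiled = {name: re.compile(pattern, re.IGNORECASE) for name, pattern in rules.items()}
    buckets: Dict[str, List[str]] = {name: [] for name in rules}
    seen = {name: set() for name in rules}
    for line in lines:
        for name, regex in compiled.items():
            if not regex.search(line):
                continue
            if deduplicate and line in seen[name]:
                continue  # keep each bucket free of repeated lines
            seen[name].add(line)
            buckets[name].append(line)
    return buckets


# Hypothetical rules: the dot in the domain is escaped so it matches literally.
rules = {"gmail": r"@gmail\.com\b", "four_digits": r"\d{4}"}
lines = ["alice@gmail.com", "bob1234@example.org", "alice@gmail.com", "carol2024@gmail.com"]
buckets = sort_into_buckets(lines, rules)
```

Each bucket could then be written to its own output file, mirroring the separate files described in step 3.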

Configuration Options

Chronos provides extensive configuration options to tailor performance and output to your environment:

Threads

  • Purpose: Controls the number of parallel processing threads.
  • Recommendation: Set to match the number of available CPU cores (e.g., 32 on a 32-core machine).
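
To find a starting value for this setting, you can query the core count of the machine. The Python sketch below shows one way to do that and to run work over a fixed number of threads. It says nothing about how Chronos schedules its own work, and the helper names are hypothetical.

```python
import os
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, Optional


def default_thread_count() -> int:
    """Return the number of logical CPU cores, falling back to 1 if it is unknown."""
    return os.cpu_count() or 1


def process_chunks(chunks: Iterable, worker: Callable, threads: Optional[int] = None) -> List:
    """Apply `worker` to each chunk on a thread pool; results keep the input order."""
    with ThreadPoolExecutor(max_workers=threads or default_thread_count()) as pool:
        return list(pool.map(worker, chunks))


# Example: count the items in each chunk using two threads.
counts = process_chunks([["a", "b"], ["c"], []], len, threads=2)
```

Note that os.cpu_count() reports logical cores, so the number may be higher than the physical core count on machines with simultaneous multithreading.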

Timeout

  • Purpose: Defines the maximum wait time (in seconds) for network responses.
  • Recommendation: Typically set between 10 and 30 seconds.

Deduplication Strategy

  • Options:
    • Fast: High memory usage for rapid deduplication.
    • LowMemory: Lower resource consumption at the cost of speed.
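
The trade-off between the two strategies can be illustrated with a generic sketch. It does not describe how Chronos implements either mode. Here the fast variant remembers every full line, while the low-memory variant remembers only a short digest per line, which is one common way to cut memory use; both function names and the 8-byte digest size are assumptions.

```python
import hashlib
from typing import Iterable, Iterator


def dedupe_fast(lines: Iterable[str]) -> Iterator[str]:
    """Keep every full line in memory: simple and quick, but memory grows with the data."""
    seen = set()
    for line in lines:
        if line not in seen:
            seen.add(line)
            yield line


def dedupe_low_memory(lines: Iterable[str], digest_size: int = 8) -> Iterator[str]:
    """Keep only a fixed-size digest per line: less memory for long lines, extra hashing work."""
    seen = set()
    for line in lines:
        digest = hashlib.blake2b(line.encode("utf-8"), digest_size=digest_size).digest()
        # Caveat: two different lines could share a digest, so a rare line may be dropped.
        if digest not in seen:
            seen.add(digest)
            yield line


data = ["a", "b", "a", "c", "b"]
fast_result = list(dedupe_fast(data))
low_memory_result = list(dedupe_low_memory(data))
```

The memory saving of the digest variant grows with average line length. For datasets that do not fit in memory at all, sorting the data on disk and dropping adjacent duplicates is another option.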

Archive Passwords

  • Purpose: Pre-set passwords for extracting protected archives.

Advanced Usage and Optimization

Performance Tuning

  • Memory Management: Monitor system memory usage during heavy operations. Switch to the low-memory deduplication mode if necessary.
  • Thread Optimization: Experiment with different thread counts to balance speed with resource consumption.
  • I/O Optimization: Ensure your storage system supports high-speed read/write operations to handle large files efficiently.

Troubleshooting Common Issues

High Memory Usage

  • Solution: Switch to a low-memory deduplication strategy and/or lower the thread count.

Data Format Errors

  • Solution: Pre-validate input files using the Filter module; ensure adherence to url:log:pass or log:pass formats.

Slow Processing

  • Solution: Increase the thread count if your hardware permits; ensure that your disk I/O is not a bottleneck.

Archive Extraction Failures

  • Solution: Verify the integrity of archives and pre-configure necessary passwords.

Module Reference

Each Chronos module has its own dedicated guide with detailed options, configuration examples, and troubleshooting tips to help you get the best results in your specific workflow.


Conclusion

Chronos is more than a data parser: it is a high-performance platform for large-scale data management. Its modular architecture, flexible configuration options, and integration with the Quantium ecosystem let you clean, filter, merge, split, extract, and sort complex datasets.

By following this guide, you will have the knowledge and best practices needed to process data efficiently and maintain data integrity.

Happy processing and may your data drive success!