Guide
Chronos is a high-performance tool built to streamline the processing of large-scale datasets. Whether you’re dealing with raw log files, credential dumps, or other data-intensive tasks, Chronos’ modular architecture and asynchronous, multi-threaded processing are designed to work through the full dataset efficiently. This guide provides step-by-step instructions for the main use cases, enabling you to clean, filter, merge, split, extract, and sort your data with precision.

Architecture and Core Features
Chronos is engineered around a robust, modular architecture that enables:
- Asynchronous Processing: Non-blocking operations allow for continuous data flow.
- Multi-threading: Utilizes multiple CPU cores to accelerate data processing.
- Modular Design: Each module is dedicated to a specific task—cleaning, filtering, sorting, etc.—so you can tailor Chronos to your needs.
- Flexible Format Support: Handles various credential formats (e.g., url:log:pass, log:pass) without hassle.
- Advanced Deduplication: Offers both fast, memory-intensive deduplication and a low-memory alternative for constrained environments.
- Real-Time Monitoring: Interactive UI and detailed logs let you monitor progress and performance metrics.
Detailed Use Cases
1. Bulk Data Cleaning and Deduplication
Scenario: You have a massive dataset containing duplicate and malformed entries that need to be cleaned before further processing.
Modules Involved:
- AntiPublic: Checks for duplicate entries across your dataset.
- LpCleaner / UlpCleaner: Cleans up entries by normalizing formats, removing unwanted characters, and ensuring consistency.
Step-by-Step Process:
- Input Preparation: Ensure your dataset is formatted as url:log:pass or log:pass. Back up your raw data.
- Configure AntiPublic:
- Source: Select the file or directory containing your raw data.
- Mode: Choose Fast for quick processing or LowMemory if system resources are limited.
- Run LpCleaner/UlpCleaner:
- Options: Set filters like minimum and maximum login/password lengths.
- Deduplication: Enable deduplication to automatically remove duplicates.
- Review Output: Verify that the duplicates have been removed and the data is normalized.
Pro Tip
Always retain a backup of your original data. This ensures you can revert changes if needed.
2. Advanced Data Filtering
Scenario: You need to extract only the relevant entries from a mixed dataset using complex filtering criteria.
Module Involved:
- Filter
Step-by-Step Process:
- Set Up the Filter Module:
- Source: Specify the file or folder containing your data.
- Filter Pattern: Define a regular expression that matches the data you need.
- Invert Option:
- Use Case: Toggle the invert setting to exclude unwanted patterns.
- Run the Filter: Chronos processes the data and outputs only the entries that match (or don’t match, if inverted) the regex.
Example Regex Patterns:
- Email Extraction:
/[a-z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{2,}/
- URL Extraction:
/https?:\/\/[^\s]+/
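The match-or-invert behaviour described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Chronos' own implementation: the function name filter_lines, its parameters, and the sample lines are assumptions made for this example, and case-insensitive matching is assumed because the patterns above list only lowercase letters.

```python
import re
from typing import Iterable, Iterator


def filter_lines(lines: Iterable[str], pattern: str, invert: bool = False) -> Iterator[str]:
    """Yield lines that match `pattern`, or lines that do not match when `invert` is True."""
    regex = re.compile(pattern, re.IGNORECASE)  # assumption: matching ignores case
    for line in lines:
        matched = regex.search(line) is not None
        if matched != invert:
            yield line


# Hypothetical sample data, for illustration only.
sample = [
    "contact: alice@example.com",
    "see https://example.org/docs",
    "no match here",
]
EMAIL = r"[a-z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{2,}"
URL = r"https?://[^\s]+"

emails = list(filter_lines(sample, EMAIL))
non_urls = list(filter_lines(sample, URL, invert=True))
```

With invert left off, only matching lines are kept; with invert on, the same pattern acts as an exclusion rule.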
3. Merging Multiple Data Sources
Scenario: You have data spread across multiple files or directories and need to consolidate it into one unified dataset.
Module Involved:
- Joiner
Step-by-Step Process:
- Configure Joiner:
- Sources: Specify multiple files or directories that contain the data.
- Deduplication: Enable deduplication to avoid redundant entries.
- Merge Process: Chronos concatenates and organizes the data, ensuring consistent formatting across merged sources.
- Output Verification: Review the merged file to confirm that all data sources have been accurately combined.
Integration Tip
Ensure all data sources use the same formatting standards to prevent errors during the merge process.
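As a rough sketch of what a merge with deduplication involves, the Python below concatenates several text files, drops empty lines, and keeps the first occurrence of each line. The function merge_files and the temporary file names are hypothetical and chosen for this example; Joiner's actual behaviour and options may differ.

```python
import tempfile
from pathlib import Path
from typing import Iterable, Iterator


def merge_files(paths: Iterable[Path], dedupe: bool = True) -> Iterator[str]:
    """Yield lines from all files in order, optionally skipping repeated lines."""
    seen = set()
    for path in paths:
        with path.open("r", encoding="utf-8", errors="replace") as handle:
            for raw in handle:
                line = raw.rstrip("\r\n")  # normalize line endings across sources
                if not line:
                    continue  # skip empty lines
                if dedupe:
                    if line in seen:
                        continue
                    seen.add(line)
                yield line


# Demonstration with two small temporary files (hypothetical content).
with tempfile.TemporaryDirectory() as tmp:
    a = Path(tmp) / "a.txt"
    b = Path(tmp) / "b.txt"
    a.write_text("alpha\nbeta\n\n", encoding="utf-8")
    b.write_text("beta\ngamma\n", encoding="utf-8")
    merged = list(merge_files([a, b]))
```

Because the set of seen lines is held in memory, this sketch corresponds to a fast, memory-intensive strategy rather than a low-memory one.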
4. Splitting Large Files
Scenario: A single file is too large to process efficiently, so you need to split it into smaller, manageable chunks.
Module Involved:
- Splitter
Step-by-Step Process:
- Define Split Parameters:
- Chunk Size: Decide the size of each split (number of lines or bytes).
- Prefix: Specify an output prefix (e.g., Chunk1, Chunk2).
- Execute Splitter: Chronos divides the file into multiple smaller files automatically.
- Post-Split Review: Check each chunk to ensure data integrity before proceeding with further processing.
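A minimal sketch of line-based splitting, including a simple integrity check for the post-split review, might look like the following. The function split_by_lines, the .txt extension, and the numbering scheme are assumptions for illustration and are not taken from the Splitter module.

```python
import tempfile
from pathlib import Path
from typing import List


def split_by_lines(source: Path, out_dir: Path, lines_per_chunk: int, prefix: str = "Chunk") -> List[Path]:
    """Write `source` into numbered chunk files holding at most `lines_per_chunk` lines each."""
    if lines_per_chunk < 1:
        raise ValueError("lines_per_chunk must be at least 1")
    out_dir.mkdir(parents=True, exist_ok=True)
    chunks: List[Path] = []
    out = None
    try:
        with source.open("r", encoding="utf-8", errors="replace") as handle:
            for index, line in enumerate(handle):
                if index % lines_per_chunk == 0:
                    if out is not None:
                        out.close()
                    chunk_path = out_dir / f"{prefix}{len(chunks) + 1}.txt"
                    chunks.append(chunk_path)
                    out = chunk_path.open("w", encoding="utf-8")
                out.write(line)
    finally:
        if out is not None:
            out.close()
    return chunks


# Demonstration: split a five-line file into chunks of two lines.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "big.txt"
    src.write_text("".join(f"line {i}\n" for i in range(5)), encoding="utf-8")
    parts = split_by_lines(src, Path(tmp) / "out", lines_per_chunk=2)
    chunk_names = [p.name for p in parts]
    # Integrity check: the chunks together hold as many lines as the source.
    total_lines = sum(len(p.read_text(encoding="utf-8").splitlines()) for p in parts)
```

Comparing the summed line count of the chunks against the source is one inexpensive way to confirm that nothing was lost during the split.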
5. Extracting Data from Complex Archives
Scenario: You need to extract credential data from compressed archives, including nested archives or those protected by passwords.
Module Involved:
- UlpExtractor
Step-by-Step Process:
- Archive Preparation: Gather all the archives (zip, rar, etc.) that contain your log files.
- Configure UlpExtractor:
- Output Format: Choose from formats such as UrlLogPass, LogPassUrl, or LogPass.
- Separator: Define the separator to be used (default is :).
- Password Handling: Pre-configure any archive passwords in the settings.
- Extraction Process: Chronos extracts data even from nested archives.
- Validation: Review the extracted files and run deduplication (using UlpSorter, if needed) to ensure clean data.
6. Sorting Data Based on Custom Criteria
Scenario: You want to organize your credential entries based on specific patterns such as domain names, numerical sequences, or other custom filters.
Module Involved:
- UlpSorter
Step-by-Step Process:
- Define Sorting Criteria:
- Regex Patterns: Create regular expressions to capture patterns (e.g., domain names like gmail.com or numeric patterns like /\d{4}/).
- Configure UlpSorter:
- Source: Select the file with the credentials.
- Sort and Filter: Set up the criteria and enable deduplication to ensure each category is unique.
- Execute Sorting: Chronos sorts the data into separate files based on the defined criteria.
- Review Sorted Output: Verify that entries have been accurately grouped according to your specifications.
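The grouping step can be pictured as routing each line to the first category whose pattern matches it. The sketch below shows that idea in Python on neutral sample text. The function bucket_lines, the rule names, and the first-match-wins policy are assumptions made for this example and do not describe UlpSorter's internals.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List


def bucket_lines(lines: Iterable[str], rules: Dict[str, str], dedupe: bool = True) -> Dict[str, List[str]]:
    """Group lines under the first rule whose regex matches; unmatched lines are dropped."""
    compiled = {name: re.compile(pattern, re.IGNORECASE) for name, pattern in rules.items()}
    buckets: Dict[str, List[str]] = defaultdict(list)
    seen = defaultdict(set)
    for line in lines:
        for name, regex in compiled.items():
            if regex.search(line):
                if not (dedupe and line in seen[name]):
                    seen[name].add(line)
                    buckets[name].append(line)
                break  # assumption: the first matching rule wins
    return dict(buckets)


# Hypothetical rules and sample lines, for illustration only.
rules = {"gmail": r"@gmail\.com\b", "four_digits": r"\d{4}"}
lines = [
    "alice@gmail.com",
    "order 2024 shipped",
    "alice@gmail.com",
    "bob@gmail.com 1234",
    "nothing",
]
grouped = bucket_lines(lines, rules)
```

Each key of the result would correspond to one output file in a real run; rule order matters here because a line is assigned to only one category.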
Configuration Options
Chronos provides extensive configuration options to tailor performance and output to your environment:
Threads
- Purpose: Controls the number of parallel processing threads.
- Recommendation: Set to match the number of CPU cores; e.g., 32 for a high-performance machine.
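To illustrate the recommendation, the following Python sketch derives a default thread count from the number of CPU cores and hands chunks of work to a thread pool. The names default_thread_count and process_chunks are hypothetical, and the snippet is not a description of how Chronos schedules its own workers.

```python
import os
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, Optional


def default_thread_count() -> int:
    """Return the number of CPU cores reported by the OS, falling back to 1."""
    return os.cpu_count() or 1


def process_chunks(chunks: Iterable, worker: Callable, threads: Optional[int] = None) -> List:
    """Apply `worker` to every chunk using a pool of `threads` threads, keeping input order."""
    threads = threads or default_thread_count()
    with ThreadPoolExecutor(max_workers=threads) as pool:
        return list(pool.map(worker, chunks))


# Demonstration: count the items in each chunk with two worker threads.
results = process_chunks([["a", "b"], ["c"]], len, threads=2)
```

Starting from the core count and then adjusting up or down while watching memory and disk usage is usually a reasonable way to find a good value.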
Timeout
- Purpose: Defines the maximum wait time (in seconds) for network responses.
- Recommendation: Typically set between 10 and 30 seconds.
Deduplication Strategy
- Options:
- Fast: High memory usage for rapid deduplication.
- LowMemory: Lower resource consumption at the cost of speed.
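The trade-off between the two strategies can be illustrated with a small Python sketch. The fast variant keeps every unique line in an in-memory set. The low-memory variant shown here first partitions lines into temporary bucket files by checksum and then deduplicates one bucket at a time, so only a fraction of the data is in memory at once. Both functions are assumptions for illustration: how Chronos actually implements Fast and LowMemory is not documented here, and the bucket approach is just one common technique. The sketch also assumes lines contain no embedded newlines.

```python
import tempfile
import zlib
from pathlib import Path
from typing import Iterable, Iterator


def dedupe_fast(lines: Iterable[str]) -> Iterator[str]:
    """Keep the first occurrence of each line; all unique lines are held in memory."""
    seen = set()
    for line in lines:
        if line not in seen:
            seen.add(line)
            yield line


def dedupe_low_memory(lines: Iterable[str], buckets: int = 16) -> Iterator[str]:
    """Partition lines into bucket files on disk, then dedupe one bucket at a time.

    Identical lines always land in the same bucket, so per-bucket deduplication
    is sufficient. The original order of the input is not preserved.
    """
    with tempfile.TemporaryDirectory() as tmp:
        paths = [Path(tmp) / f"bucket_{i}.txt" for i in range(buckets)]
        handles = [p.open("w", encoding="utf-8") for p in paths]
        try:
            for line in lines:
                index = zlib.crc32(line.encode("utf-8")) % buckets
                handles[index].write(line + "\n")
        finally:
            for handle in handles:
                handle.close()
        for path in paths:
            seen = set()  # holds only one bucket's unique lines at a time
            with path.open("r", encoding="utf-8") as handle:
                for raw in handle:
                    line = raw.rstrip("\n")
                    if line not in seen:
                        seen.add(line)
                        yield line


sample = ["a", "b", "a", "c", "b"]
fast_result = list(dedupe_fast(sample))
low_memory_result = sorted(dedupe_low_memory(sample, buckets=4))
```

The low-memory variant pays for its smaller footprint with an extra pass over the data on disk, which matches the speed cost noted above.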
Archive Passwords
- Purpose: Pre-set passwords for extracting protected archives.
Advanced Usage and Optimization
Performance Tuning
- Memory Management: Monitor system memory usage during heavy operations. Switch to the low-memory deduplication mode if necessary.
- Thread Optimization: Experiment with different thread counts to balance speed with resource consumption.
- I/O Optimization: Ensure your storage system supports high-speed read/write operations to handle large files efficiently.
Troubleshooting Common Issues
High Memory Usage
- Solution: Switch to a low-memory deduplication strategy and/or lower the thread count.
Data Format Errors
- Solution: Pre-validate input files using the Filter module; ensure adherence to url:log:pass or log:pass formats.
Slow Processing
- Solution: Increase the thread count if your hardware permits; ensure that your disk I/O is not a bottleneck.
Archive Extraction Failures
- Solution: Verify the integrity of archives and pre-configure necessary passwords.
Module Reference
For more detailed information, each Chronos module comes with its own dedicated guide:
- AntiPublic Module
- Filter Module
- Joiner Module
- LpCleaner Module
- Splitter Module
- UlpCleaner Module
- UlpExtractor Module
- UlpSorter Module
Each module guide offers detailed options, configuration examples, and troubleshooting tips to help you achieve optimal results in your specific workflow.
Conclusion
Chronos is more than just a data parser—it’s a comprehensive, high-performance platform designed to handle every aspect of large-scale data management. Its modular architecture, flexible configuration options, and seamless integration with the Quantium ecosystem empower you to clean, filter, merge, split, extract, and sort even the most complex datasets with confidence.
By following this extended guide, you will be equipped with the knowledge and best practices needed to maximize efficiency, maintain data integrity, and unlock new opportunities for analysis and growth.
Happy processing and may your data drive success!