Command Line Interface

TACO includes a powerful command-line interface (CLI) that provides convenient access to common trajectory operations without requiring any programming. The CLI is available both as a standalone Rust binary and as a Python command when the package is installed via pip or maturin.

taco convert <input> <output> [options]

Convert a TACO file to use different compression settings while preserving all trajectory data.

Arguments:

  • input - Input TACO file to convert

  • output - Output TACO file with new compression settings

Options:

  • -c, --compression <level> - Zstandard compression level 1-22 (default: 3, higher = better compression)

  • --full-precision - Use 32-bit float precision (lossless for coordinates)

  • --lossless - Alias for --full-precision

  • --full-frame-interval <N> - Store complete frame every N frames for random access (default: 10)

Examples:

# Maximum compression for archival storage
taco convert original.taco archive.taco --compression 22

# Lossless conversion for high-precision analysis
taco convert half_precision.taco full_precision.taco --lossless

# Balanced compression with frequent checkpoints
taco convert input.taco output.taco --compression 10 --full-frame-interval 5

Higher compression levels reduce file size but increase CPU time. Full precision mode eliminates quantization errors but increases file size.

taco extract <input> <output> [options]

Extract a subset of frames from a trajectory to create a new, smaller TACO file.

Arguments:

  • input - Input TACO file (source trajectory)

  • output - Output TACO file (will be created)

Options:

  • --start <N> - Starting frame index (0-based, default: 0)

  • --end <N> - Ending frame index (exclusive, default: total frames)

  • --step <N> - Frame step size for subsampling (default: 1)

  • --frames <list> - Comma-separated list of specific frame indices (alternative to start/end/step)

Examples:

# Extract equilibration period (frames 0-999; --end is exclusive)
taco extract production.taco equilibration.taco --start 0 --end 1000

# Subsample every 5th frame
taco extract full_traj.taco subsampled.taco --step 5

# Extract specific time points
taco extract dynamics.taco snapshots.taco --frames 0,100,500,1000

Frame indices are 0-based. The output file preserves all metadata from the input file.
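
The way --start, --end, and --step combine can be pictured with Python's built-in range. The sketch below is an illustration only: it assumes the selection behaves like range(start, end, step), which agrees with the documented 0-based start and exclusive end but has not been checked against the implementation. The helper name selected_frames is hypothetical.

```python
def selected_frames(total_frames, start=0, end=None, step=1):
    """Return the frame indices a start/end/step selection would cover.

    Assumption: frames are selected like range(start, end, step),
    with a 0-based start and an exclusive end.
    """
    if end is None:
        end = total_frames  # documented default: total frames
    return list(range(start, end, step))


# --start 0 --end 1000 on a 5000-frame file
window = selected_frames(5000, start=0, end=1000)
print(window[0], window[-1], len(window))

# --step 5 on a 20-frame file keeps every 5th frame
print(selected_frames(20, step=5))
```

Under this assumption, --end 1000 stops at frame 999, so the output holds exactly 1000 frames.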

Installation

The TACO CLI is automatically available after installation:

Via pip/maturin (Python users):

pip install taco-format
# or
maturin develop --release

Via cargo (Rust users):

cargo install taco_format

Usage

The CLI provides several subcommands for working with TACO files:

taco --help

Basic Commands

File Information

Get detailed information about a TACO file including metadata, compression settings, and file statistics:

taco info trajectory.taco

This command provides a comprehensive overview of your trajectory file, displaying:

  • File format version and basic properties - TACO version and compatibility information

  • Trajectory dimensions - Number of atoms, frames, and time step

  • Simulation metadata - Temperature, pressure, ensemble type, and software used

  • Compression settings - Compression level and precision mode (half/full precision)

  • Atom metadata - Element information when available

  • File efficiency - File size, compression ratio, and storage per frame

This is typically the first command you’ll want to run when working with a new TACO file to understand its contents and properties.

Example output:

=== TACO File Information ===
Format version: 0.1.0
Number of atoms: 1000
Number of frames: 5000
Time step: 0.001 ps

=== Compression Settings ===
Compression level: 3
Precision mode: Half
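
When scripting around taco info, it can help to turn the text report into a dictionary. The sketch below parses the example output shown above. It assumes the real output keeps the same "=== Section ===" and "key: value" layout, which may differ between versions; parse_info_report is a hypothetical helper, not part of the taco_format package.

```python
def parse_info_report(text):
    """Group 'key: value' lines under their '=== Section ===' headers."""
    report = {}
    section = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("===") and line.endswith("==="):
            section = line.strip("= ")  # drop the surrounding '=' markers
            report[section] = {}
        elif ":" in line and section is not None:
            key, value = line.split(":", 1)
            report[section][key.strip()] = value.strip()
    return report


example = """\
=== TACO File Information ===
Format version: 0.1.0
Number of atoms: 1000
Number of frames: 5000
Time step: 0.001 ps

=== Compression Settings ===
Compression level: 3
Precision mode: Half
"""

info = parse_info_report(example)
print(info["TACO File Information"]["Number of frames"])
print(info["Compression Settings"]["Precision mode"])
```

All values stay strings; convert them with int() or float() as needed.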

Frame Extraction

Extract specific frames or frame ranges from a trajectory to create focused datasets for analysis:

# Extract frames 100-199 for detailed analysis (--end is exclusive)
taco extract input.taco output.taco --start 100 --end 200

# Extract every 10th frame to reduce data size
taco extract input.taco output.taco --step 10

# Extract specific frames of interest
taco extract input.taco output.taco --frames 0,50,100,150

Frame extraction is useful for:

  • Creating analysis windows - Focus on specific time periods of interest

  • Reducing data size - Extract every Nth frame for coarse-grained analysis

  • Isolating events - Extract frames around specific molecular events

  • Creating training sets - Extract representative frames for machine learning

The extracted file maintains all the original metadata and can be used with any TACO-compatible tool.

File Conversion

Convert TACO files between different compression settings to optimize for storage, precision, or compatibility:

# Convert to lossless compression for maximum accuracy
taco convert input.taco output.taco --lossless

# Use high compression for long-term storage (levels go up to 22)
taco convert input.taco output.taco --compression 19

# Use full precision (32-bit floats) for high-accuracy calculations
taco convert input.taco output.taco --full-precision

File conversion allows you to:

  • Balance accuracy vs. storage - Choose between lossy and lossless compression

  • Optimize for analysis - Convert to full precision for sensitive calculations

  • Prepare for archival - Use maximum compression for long-term storage

  • Ensure compatibility - Convert between different precision modes

The conversion process preserves all trajectory data and metadata while applying new compression settings.
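
For scripted conversions, the options can be assembled and range-checked before the command runs. The following is a minimal sketch that uses only the flags documented on this page; build_convert_args is a hypothetical helper, and the 1-22 bound comes from the documented compression range.

```python
def build_convert_args(input_path, output_path, compression=None,
                       full_precision=False, full_frame_interval=None):
    """Build the argument list for `taco convert` (without the program name)."""
    args = ["convert", input_path, output_path]
    if compression is not None:
        if not 1 <= compression <= 22:
            raise ValueError("compression level must be between 1 and 22")
        args += ["--compression", str(compression)]
    if full_precision:
        args.append("--full-precision")
    if full_frame_interval is not None:
        if full_frame_interval < 1:
            raise ValueError("full-frame interval must be a positive integer")
        args += ["--full-frame-interval", str(full_frame_interval)]
    return args


print(build_convert_args("input.taco", "archive.taco", compression=22))
print(build_convert_args("input.taco", "output.taco", full_precision=True))
```

The resulting list can be appended to ["taco"] for subprocess.run, or passed to taco_format.run_cli when the Python package is installed.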

File Validation

Verify the integrity and consistency of TACO files to ensure data quality:

# Quick validation check
taco check trajectory.taco

# Detailed validation with progress reporting
taco check trajectory.taco --verbose

File validation performs comprehensive checks including:

  • Format integrity - Ensures the file structure is valid and readable

  • Data consistency - Verifies that position, velocity, and force data are reasonable

  • Frame completeness - Checks that all frames can be read successfully

  • Numerical validity - Detects NaN or infinite values that indicate corruption

This is essential after file transfers, format conversions, or when working with files from unknown sources. Use verbose mode for large files to monitor progress.
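
The numerical validity check can be illustrated in plain Python. This is a sketch of the kind of test described above, not the code taco check runs; it assumes frames are already loaded as lists of (x, y, z) tuples, and find_invalid_frames is a hypothetical name.

```python
import math


def find_invalid_frames(frames):
    """Return indices of frames that contain NaN or infinite coordinates.

    `frames` is a list of frames; each frame is a list of (x, y, z) tuples.
    """
    bad = []
    for index, frame in enumerate(frames):
        if any(not math.isfinite(value) for atom in frame for value in atom):
            bad.append(index)
    return bad


frames = [
    [(0.0, 1.0, 2.0), (3.0, 4.0, 5.0)],
    [(0.1, float("nan"), 2.1), (3.1, 4.1, 5.1)],   # contains NaN
    [(0.2, 1.2, float("inf")), (3.2, 4.2, 5.2)],   # contains infinity
]
print(find_invalid_frames(frames))  # [1, 2]
```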

Trajectory Statistics

Analyze trajectory properties and data distribution to understand system behavior:

# Overview of trajectory characteristics
taco stats trajectory.taco

# Detailed per-frame analysis for first 20 frames
taco stats trajectory.taco --detailed

Statistical analysis includes:

  • Spatial distribution - Position ranges and system dimensions

  • Data completeness - Percentage of frames with velocities, forces, energies

  • Trajectory overview - Sampling of frames to characterize the entire trajectory

  • Quality assessment - Identification of potential issues or anomalies

This command is valuable for:

  • Trajectory validation - Ensuring reasonable coordinate ranges and data completeness

  • Analysis planning - Understanding data availability for different properties

  • System characterization - Getting an overview of molecular system size and dynamics

  • Quality control - Detecting outliers or problematic frames

Example output:

=== Trajectory Statistics ===
Position ranges:
  X: -15.234 to 15.456 Å (span: 30.690 Å)
  Y: -14.987 to 15.123 Å (span: 30.110 Å)
  Z: -15.001 to 14.999 Å (span: 30.000 Å)
Data availability:
  Frames with velocities: 5000/5000 (100.0%)
  Frames with forces: 5000/5000 (100.0%)
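
The position ranges in the example output are per-axis minima and maxima, with the span being their difference. The sketch below reproduces that arithmetic on a tiny hand-made data set. It assumes coordinates are available as (x, y, z) tuples and is not taken from the TACO source; position_ranges is a hypothetical helper.

```python
def position_ranges(frames):
    """Return {axis: (min, max, span)} over all atoms in all frames."""
    ranges = {}
    for axis_index, axis in enumerate("XYZ"):
        values = [atom[axis_index] for frame in frames for atom in frame]
        low, high = min(values), max(values)
        ranges[axis] = (low, high, high - low)
    return ranges


# Two frames of two atoms each, chosen to match the extremes in the example output
frames = [
    [(-15.234, -14.987, -15.001), (1.0, 2.0, 3.0)],
    [(15.456, 15.123, 14.999), (0.5, -0.5, 0.0)],
]
for axis, (low, high, span) in position_ranges(frames).items():
    print(f"{axis}: {low:.3f} to {high:.3f} (span: {span:.3f})")
```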

Command Reference

taco info <file>

Display comprehensive information about a TACO file including format version, trajectory dimensions, compression settings, and file statistics.

Arguments:

  • file - Path to the TACO file to analyze

Example:

taco info simulation.taco

This command is read-only and safe to run on any TACO file. It provides essential information for understanding file contents before processing.

taco extract <input> <output> [options]

Extract a subset of frames to a new file.

Arguments:

  • input - Input TACO file

  • output - Output TACO file

Options:

  • --start <N> - Starting frame index (0-based, default: 0)

  • --end <N> - Ending frame index (exclusive, default: total frames)

  • --step <N> - Frame step size (default: 1)

  • --frames <list> - Comma-separated list of specific frame indices

taco convert <input> <output> [options]

Convert a TACO file with different compression settings.

Arguments:

  • input - Input TACO file

  • output - Output TACO file

Options:

  • -c, --compression <level> - Compression level 1-22 (default: 3)

  • --full-precision - Use 32-bit float precision

  • --lossless - Use lossless compression (alias for --full-precision)

  • --full-frame-interval <N> - Store complete frame every N frames (default: 10)
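
To see why the full-frame interval matters for random access, consider how many frames must be read to reach a given one. The sketch below rests on an assumption this page does not state: that complete frames sit at indices 0, N, 2N, ... and every other frame is stored relative to the frame before it. The function frames_to_decode is hypothetical.

```python
def frames_to_decode(target, full_frame_interval=10):
    """Return the frame indices read to reconstruct frame `target`.

    Assumption: full frames are stored at multiples of the interval and
    each remaining frame depends on its predecessor.
    """
    nearest_full = (target // full_frame_interval) * full_frame_interval
    return list(range(nearest_full, target + 1))


print(frames_to_decode(37))      # default interval of 10
print(frames_to_decode(37, 5))   # smaller interval, shorter chain
```

Under that assumption, a smaller interval shortens the chain of frames to read, at the cost of storing more complete frames.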

taco check <file> [options]

Validate the integrity and consistency of a TACO file to ensure data quality and detect corruption.

Arguments:

  • file - Path to the TACO file to validate

Options:

  • -v, --verbose - Show detailed validation progress and per-frame information

Examples:

# Quick integrity check
taco check downloaded_file.taco

# Detailed validation with progress reporting
taco check large_trajectory.taco --verbose

The validation process checks for file format correctness, readable frames, and reasonable coordinate values. Use verbose mode for large files to monitor progress.

taco stats <file> [options]

Analyze trajectory statistics including spatial distributions, data completeness, and system properties.

Arguments:

  • file - Path to the TACO file to analyze

Options:

  • -d, --detailed - Show detailed frame-by-frame statistics for the first 20 frames

Examples:

# Overview of trajectory characteristics
taco stats simulation.taco

# Detailed analysis with per-frame breakdown
taco stats simulation.taco --detailed

Rather than reading every frame, the analysis samples frames throughout the trajectory to characterize spatial distributions and data availability, which keeps it fast on large files.
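
One simple way to sample frames across a trajectory is to pick evenly spaced indices. The sketch below shows that idea only: the strategy taco stats actually uses is not documented here, and both sample_indices and its max_samples parameter are hypothetical.

```python
def sample_indices(total_frames, max_samples=100):
    """Pick up to `max_samples` evenly spaced frame indices, starting at frame 0."""
    if total_frames <= max_samples:
        return list(range(total_frames))  # small trajectory: read everything
    step = total_frames / max_samples
    return [int(i * step) for i in range(max_samples)]


print(sample_indices(5000, max_samples=5))
print(sample_indices(3, max_samples=5))
```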

Integration with Python

The CLI functionality is also accessible from Python code:

import taco_format

# Call CLI commands from Python
taco_format.run_cli(["info", "trajectory.taco"])
taco_format.run_cli(["extract", "input.taco", "output.taco", "--start", "100", "--end", "200"])

This allows for easy integration of TACO CLI operations into Python workflows and scripts.
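
In a longer script, it can be convenient to build the argument list from keyword arguments rather than writing it out by hand. The sketch below does this for the extract subcommand. build_extract_args is a hypothetical helper, and treating --frames as mutually exclusive with --start/--end/--step is an assumption based on the option being described as an alternative. The call to run_cli is guarded so the example still runs when the package or the input file is missing.

```python
import os

try:
    import taco_format  # only available when the package is installed
except ImportError:
    taco_format = None


def build_extract_args(input_path, output_path, start=None, end=None,
                       step=None, frames=None):
    """Build the argument list for the extract subcommand."""
    if frames is not None and any(v is not None for v in (start, end, step)):
        raise ValueError("use either frames or start/end/step, not both")
    args = ["extract", input_path, output_path]
    if frames is not None:
        args += ["--frames", ",".join(str(i) for i in frames)]
    for flag, value in (("--start", start), ("--end", end), ("--step", step)):
        if value is not None:
            args += [flag, str(value)]
    return args


args = build_extract_args("input.taco", "output.taco", start=100, end=200)
print(args)

# Run the command only when the package and the input file are both present.
if taco_format is not None and os.path.exists("input.taco"):
    taco_format.run_cli(args)
```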

Common Workflows

Quick file inspection:

taco info trajectory.taco
taco stats trajectory.taco

Extracting a time range:

# Extract frames 1000-1999 (with a 1 fs timestep this covers 1000-2000 fs; --end is exclusive)
taco extract simulation.taco analysis_window.taco --start 1000 --end 2000
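
Converting a time window into frame indices is a single division by the timestep. The sketch below assumes frame i is recorded at time i × timestep, so frame 0 is at t = 0; check this against your simulation's output settings. time_window_to_frames is a hypothetical helper.

```python
def time_window_to_frames(t_start, t_end, timestep):
    """Convert a time window to (start, end) frame indices.

    Assumption: frame i corresponds to time i * timestep.
    All three values must use the same time unit.
    """
    start = round(t_start / timestep)
    end = round(t_end / timestep)
    return start, end


# 1000-2000 fs with a 1 fs timestep
start, end = time_window_to_frames(1000.0, 2000.0, 1.0)
print(f"taco extract simulation.taco analysis_window.taco --start {start} --end {end}")

# The same window expressed in picoseconds with a 0.001 ps timestep
print(time_window_to_frames(1.0, 2.0, 0.001))
```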

Creating a compressed backup:

# Create a highly compressed version for long-term storage
taco convert large_trajectory.taco compressed_backup.taco --compression 19

Quality control:

# Validate file integrity after transfer
taco check downloaded_trajectory.taco --verbose