Revolutionizing I/O for the Convergence of HPC, Big Data, and AI
LABIOS is an NSF-funded (Award #2331480) and patented (US Patent 11,630,834 B2) distributed I/O system that introduces a revolutionary label-based paradigm for data management. Think of it as "shipping labels for data": just as a shipping label carries all the information needed to deliver a package, a LABIOS label carries everything needed to process its data intelligently across modern computing systems.
- Universal Data Representation: Convert any I/O request into intelligent, self-describing labels
- Operation Embedding: Labels carry both data and operations, enabling computational storage
- Metadata Rich: Complete context for intelligent routing and processing
- 3x GPU memory reduction for AI workloads (MegaMmap)
- 10x lower p99 latency with priority scheduling
- 805x improved bottleneck detection coverage (WisIO)
- 40% performance boost for HPC applications (VPIC)
- Fully Decoupled: Components can scale independently
- Storage Agnostic: Works with POSIX, HDF5, S3, and more
- AI/ML Optimized: Native support for model checkpointing and KV caching
- Production Validated: Deployed at DOE National Laboratories
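To make the label idea concrete, here is a minimal conceptual sketch in Python — our illustration, not the actual LABIOS API — showing how a self-describing label might bundle an operation, a payload, and routing metadata into one unit that any worker can act on independently:

```python
from dataclasses import dataclass, field

@dataclass
class Label:
    """Conceptual sketch of a self-describing I/O label (illustrative only)."""
    operation: str    # e.g. "WRITE", "READ", or an embedded compute operation
    data: bytes       # payload to move or transform
    destination: str  # where the data should end up
    metadata: dict = field(default_factory=dict)  # context for routing/scheduling

# A label carries everything needed to process it, with no shared state:
checkpoint = Label(
    operation="WRITE",
    data=b"simulation state ...",
    destination="/checkpoints/step_100",
    metadata={"priority": "high", "compress": "lz4"},
)
```

Because each label is self-contained, labels can be queued, prioritized, and executed by any worker in the pool — the property the decoupled architecture below relies on.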
```bash
# Add the LABIOS Spack repository
git clone https://github.com/grc-iit/labios-spack
spack repo add labios-spack

# Install LABIOS
spack install labios

# Load the LABIOS environment
spack load labios
```
```bash
# Clone the repository
git clone https://github.com/grc-iit/labios
cd labios

# Build with CMake
mkdir build && cd build
cmake .. -DCMAKE_INSTALL_PREFIX=/path/to/install
make -j8
make install
```
```bash
# Pull the LABIOS container
docker pull grciit/labios:latest

# Run the LABIOS server
docker run -d -p 4000:4000 --name labios-server grciit/labios:latest

# Run a client application
docker run --link labios-server:labios grciit/labios:latest labios-client
```
```c
#include <labios/labios.h>

// Create a label for data processing
Label *label = labios_create_label(
    LABIOS_WRITE,           // Operation
    data_buffer,            // Data pointer
    buffer_size,            // Size
    "/path/to/destination"  // Destination
);

// Submit the label for asynchronous processing
labios_submit(label);
```
```bash
# Scaffold a LABIOS configuration
jarvis labios scaffold hpc_deployment

# Initialize LABIOS services
jarvis labios init

# Start LABIOS (automatically scales based on workload)
jarvis labios start

# Monitor performance
jarvis labios status --metrics

# Stop services
jarvis labios stop
```
```c
// High-priority checkpoint operation
Label *checkpoint = labios_create_label_with_priority(
    LABIOS_WRITE,
    checkpoint_data,
    checkpoint_size,
    "/checkpoints/iteration_1000",
    LABIOS_PRIORITY_HIGH
);

// Low-priority logging
Label *log = labios_create_label_with_priority(
    LABIOS_WRITE,
    log_data,
    log_size,
    "/logs/debug.log",
    LABIOS_PRIORITY_LOW
);
```
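The dispatcher's priority scheduling can be pictured as a priority queue: high-priority labels are dispatched ahead of earlier low-priority ones. The following Python sketch (our illustration, not LABIOS code) shows the effect:

```python
import heapq
import itertools

HIGH, LOW = 0, 1             # lower number = dispatched first
counter = itertools.count()  # tie-breaker preserves FIFO order within a priority

queue = []

def submit(priority, name):
    """Enqueue a label; the heap orders by (priority, arrival)."""
    heapq.heappush(queue, (priority, next(counter), name))

submit(LOW, "/logs/debug.log")
submit(HIGH, "/checkpoints/iteration_1000")
submit(LOW, "/logs/trace.log")

dispatched = [heapq.heappop(queue)[2] for _ in range(len(queue))]
# The checkpoint jumps ahead of both log writes:
# ['/checkpoints/iteration_1000', '/logs/debug.log', '/logs/trace.log']
```

This is how a latency-critical checkpoint can bypass queued background writes, which is the mechanism behind the 10x p99 latency improvement claimed above.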
Transparently extend memory capacity across the storage hierarchy:
```bash
# Enable MegaMmap for out-of-core computation
export LABIOS_MEGAMMAP_ENABLED=1
export LABIOS_MEGAMMAP_TIERS="DRAM:32GB,NVMe:256GB,SSD:1TB"

# Run the memory-intensive application
./my_ai_training_app --model-size 100GB
```
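The core idea — an address space larger than DRAM, transparently backed by slower tiers — can be illustrated with Python's standard `mmap` module. This is a toy sketch of the mechanism, not MegaMmap itself:

```python
import mmap
import os
import tempfile

# Back a "memory" region with a file, standing in for an NVMe/SSD tier.
path = os.path.join(tempfile.mkdtemp(), "tier.bin")
size = 1 << 20  # 1 MiB here; MegaMmap manages much larger spaces across real tiers

with open(path, "wb") as f:
    f.truncate(size)

with open(path, "r+b") as f:
    with mmap.mmap(f.fileno(), size) as mem:
        mem[0:5] = b"hello"          # looks like an ordinary memory write...
        assert mem[0:5] == b"hello"  # ...but the bytes live on storage

# The data persists in the backing file after the mapping is gone:
with open(path, "rb") as f:
    assert f.read(5) == b"hello"
```

MegaMmap adds what this sketch lacks: automatic placement and migration of pages across the configured DRAM/NVMe/SSD tiers, including GPU memory.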
Automated bottleneck detection across your I/O stack:
```bash
# Enable WisIO profiling
wisio start --app ./my_application

# Generate a performance report
wisio report --format html --output performance_report.html
```
```python
import labios
import torch

# Configure LABIOS for PyTorch checkpointing
labios.configure_ml_checkpointing(
    framework="pytorch",
    compression="lz4",
    async_mode=True,
)

# Transparent model checkpointing
model = MyLargeModel()
labios.enable_smart_checkpointing(model, interval=1000)
```
```
┌──────────────────────────────────────────────────────────┐
│                    User Applications                     │
│    (HPC Simulations, AI Training, Big Data Analytics)    │
└────────────────────────────┬─────────────────────────────┘
                             │
┌────────────────────────────┴─────────────────────────────┐
│                  LABIOS Client Library                   │
│ • Label Creation  • Metadata Enrichment  • Async Submit  │
└────────────────────────────┬─────────────────────────────┘
                             │ Labels
┌────────────────────────────┴─────────────────────────────┐
│                     Label Dispatcher                     │
│ • Priority Scheduling  • Load Balancing  • QoS Policies  │
└────────────────────────────┬─────────────────────────────┘
                             │
┌────────────────────────────┴─────────────────────────────┐
│                        Worker Pool                       │
│ • Elastic Scaling  • GPU Support  • Operation Execution  │
└────────────────────────────┬─────────────────────────────┘
                             │
┌────────────────────────────┴─────────────────────────────┐
│                     Storage Backends                     │
│        POSIX | HDF5 | S3 | Lustre | DAOS | Custom        │
└──────────────────────────────────────────────────────────┘
```
- LABIOS Core: Main label-based I/O system
- MegaMmap: Tiered memory extension (3x GPU memory savings)
- WisIO: Multi-perspective I/O analysis (805x coverage)
- HStream: Hierarchical streaming (2x throughput)
- Viper: DNN model transfer (9x faster)
- IOWarp Runtime: Unified I/O interception
- Jarvis: Automated deployment and scaling
- ChronoLog: Distributed log ordering
| Workload Type | Baseline | With LABIOS | Improvement |
|---|---|---|---|
| VPIC Checkpoint | 100s | 60s | 40% faster |
| AI Model Loading | 45s | 5s | 9x faster |
| Streaming Analytics | 1GB/s | 2GB/s | 2x throughput |
| GPU Memory Usage | 24GB | 8GB | 3x reduction |
| p99 Latency | 100ms | 10ms | 10x lower |
We welcome contributions! Please see our Contributing Guide for details.
```bash
# Fork and clone
git clone https://github.com/YOUR_USERNAME/labios
cd labios

# Create a feature branch
git checkout -b feature/amazing-feature

# Make changes and test
make test

# Submit a pull request
git push origin feature/amazing-feature
```
- Full Documentation: Comprehensive guides and API reference
- Research Papers: Technical details and evaluations
- Tutorial Videos: Getting started guides
- Community Chat: Join discussions on Zulip
- [IPDPS'25] J. Ye et al., "Characterizing KV Caching on Transformer Inferences"
- [ICS'25] I. Yildirim et al., "WisIO: Multi-Perspective I/O Analysis"
- [SC'24] L. Logan et al., "MegaMmap: Extending Memory Boundaries for GPUs"
- [ICPP'24] J. Cernuda et al., "HStream: Hierarchical Streaming for HPC"
- [US Patent] A. Kougkas et al., "Label-Based Data Representation I/O Process and System"
LABIOS is available for commercial licensing through Illinois Tech's technology transfer office.
Ideal for:
- Cloud service providers seeking better I/O performance
- HPC centers managing diverse workloads
- AI companies optimizing model training infrastructure
- Storage vendors building next-generation systems
Principal Investigator: Dr. Xian-He Sun
Co-PI: Dr. Anthony Kougkas
Graduate Students: Luke Logan, Jaime Cernuda, Jie Ye, Izzet Yildirim, Rajni Pawar
This material is based upon work supported by the National Science Foundation under Grant No. 2331480. Special thanks to our partners at DOE National Laboratories (Argonne, LLNL, Sandia) for their collaboration and support.
LABIOS is released under the BSD 3-Clause License. Commercial use requires additional licensing.
Ready to revolutionize your I/O?
Star us on GitHub •
Visit Project Page •
Contact Team