- Overview
- System Architecture
- Component Design
- Workflow
- Data Model
- Infrastructure
- Testing Framework
- Publishing System
- Security Considerations
- Extensibility
- Future Enhancements
BTRFSQA is an automated continuous integration and quality assurance system for the BTRFS filesystem. It provides automated testing of the latest BTRFS kernel development code against comprehensive test suites and publishes results to a public dashboard.
- Automation: Fully automated end-to-end testing pipeline with zero manual intervention
- Coverage: Comprehensive testing across kernel, userspace tools, and filesystem operations
- Transparency: Public visibility of test results through GitHub Pages
- Cost-Effectiveness: Efficient use of cloud resources with automatic cleanup
- Reproducibility: Consistent test environment using fresh infrastructure per run
- Automated AWS EC2 infrastructure provisioning and teardown
- Latest BTRFS development kernel compilation and installation
- Multi-stage test suite execution (btrfsprogs, xfstests, RAID5 scrub tests)
- Terminal recording via Asciinema for visual debugging
- Automated results publishing to GitHub Pages
- Historical test results preservation
┌──────────────────────────────────────────────────────────────┐
│ BTRFSQA System │
│ │
│ ┌────────────────────────────────────────────────────────┐ │
│ │ Control Plane (Local Machine) │ │
│ │ │ │
│ │ ┌──────────────────────────────────────────────────┐ │ │
│ │ │ btrfsqa.py Orchestrator │ │ │
│ │ │ ┌────────────────┬─────────────────────────┐ │ │ │
│ │ │ │ AWS Manager │ Remote Executor │ │ │ │
│ │ │ │ (Boto) │ (Fabric) │ │ │ │
│ │ │ ├────────────────┼─────────────────────────┤ │ │ │
│ │ │ │ Config Manager │ Results Publisher │ │ │ │
│ │ │ │ │ (Git) │ │ │ │
│ │ │ └────────────────┴─────────────────────────┘ │ │ │
│ │ └──────────────────────────────────────────────────┘ │ │
│ │ │ │
│ │ Configuration Sources: │ │
│ │ • aws_auth.json • ec2.json • github.json │ │
│ │ • timeout.json • kernel.config • local.config │ │
│ └────────────────────────────────────────────────────────┘ │
│ │ │
│ │ Provision & Execute │
│ ▼ │
│ ┌────────────────────────────────────────────────────────┐ │
│ │ Test Execution Environment (AWS EC2) │ │
│ │ │ │
│ │ Instance: r4.large (Fedora 26) │ │
│ │ ┌──────────────────────────────────────────────────┐ │ │
│ │ │ Layer 1: BTRFS Development Kernel │ │ │
│ │ │ • Source: btrfs-devel/misc-next │ │ │
│ │ │ • Build: Custom kernel.config │ │ │
│ │ │ • Install: Automatic bootloader update │ │ │
│ │ └──────────────────────────────────────────────────┘ │ │
│ │ ┌──────────────────────────────────────────────────┐ │ │
│ │ │ Layer 2: Test Execution Framework │ │ │
│ │ │ ┌────────────┬──────────────┬────────────────┐ │ │ │
│ │ │ │ Script 001 │ Script 002 │ Script 003 │ │ │ │
│ │ │ │ Kernel │ btrfsprogs │ xfstests │ │ │ │
│ │ │ │ Build │ Test Suite │ Test Suite │ │ │ │
│ │ │ ├────────────┼──────────────┼────────────────┤ │ │ │
│ │ │ │ Script 004 │ Asciinema │ Results │ │ │ │
│ │ │ │ RAID5 │ Recorder │ Collector │ │ │ │
│ │ │ │ Scrub │ │ │ │ │ │
│ │ │ └────────────┴──────────────┴────────────────┘ │ │ │
│ │ └──────────────────────────────────────────────────┘ │ │
│ │ ┌──────────────────────────────────────────────────┐ │ │
│ │ │ Layer 3: Storage Infrastructure │ │ │
│ │ │ 6x EBS Volumes (20GB each) │ │ │
│ │ │ /dev/xvdb, /dev/xvdc, /dev/xvdd, │ │ │
│ │ │ /dev/xvde, /dev/xvdf, /dev/xvdg │ │ │
│ │ └──────────────────────────────────────────────────┘ │ │
│ └────────────────────────────────────────────────────────┘ │
│ │ │
│ │ Results Collection │
│ ▼ │
│ ┌────────────────────────────────────────────────────────┐ │
│ │ Publishing Layer │ │
│ │ │ │
│ │ ┌──────────────────┐ ┌──────────────────────────┐ │ │
│ │ │ Asciinema.org │ │ GitHub Repository │ │ │
│ │ │ Terminal │ │ • results/ │ │ │
│ │ │ Recordings │ │ • _layouts/default.html │ │ │
│ │ └──────────────────┘ └──────────────────────────┘ │ │
│ │ │ │ │
│ │ │ Triggers │ │
│ │ ▼ │ │
│ │ ┌────────────────────────┐ │ │
│ │ │ GitHub Pages │ │ │
│ │ │ Public Dashboard │ │ │
│ │ │ (Jekyll + Cayman) │ │ │
│ │ └────────────────────────┘ │ │
│ └────────────────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────────────────┘
Control Plane:
- Location: Local machine
- Language: Python 2.7
- Dependencies: Boto (AWS SDK), Fabric (SSH automation)
- Responsibilities: Orchestration, provisioning, configuration, results publishing
Test Execution Environment:
- Location: AWS EC2 (ephemeral)
- OS: Fedora 26
- Instance Type: r4.large (spot instance)
- Responsibilities: Kernel compilation, test execution, results generation
Publishing Layer:
- GitHub Repository: Version control and static hosting
- GitHub Pages: Jekyll-based public dashboard
- Asciinema.org: Terminal recording hosting
Purpose: Main control script that coordinates all system operations
Key Functions:
main() # Entry point, orchestrates entire workflow
req_instance_and_tag() # Provisions EC2 spot instance with tags
set_bdm() # Configures block device mapping (6x EBS volumes)
install_sw() # Uploads configs, installs dependencies, runs tests
update_htmltable() # Generates results HTML and publishes to GitHub
del_sys() # Cleanup: terminates instance and deletes volumes
Design Patterns:
- Configuration-driven: All settings externalized to JSON files
- Sequential execution: Scripts run in order with timeout controls
- Error handling: Graceful degradation with cleanup on failure
- Idempotency: Safe to re-run, cleans up previous resources
File Structure:
setup/config/
├── aws_auth.json # AWS access key, secret key, region
├── ec2.json # AMI ID, instance type, security group
├── github.json # Repository URL, credentials
├── timeout.json # Per-script timeout limits (minutes)
├── kernel.config # Linux kernel build configuration
├── local.config # xfstests environment variables
├── bashrc # Shell environment customization
└── netrc # Git credentials for automation
Design Principles:
- Separation of concerns: Credentials separate from code
- Version control: Non-sensitive configs tracked in Git
- Flexibility: Easy modification without code changes
- Security: Sensitive files are excluded from version control via .gitignore
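The configuration-driven approach can be illustrated with a small loader. This is a minimal sketch, not the code in btrfsqa.py: the helper name `load_configs` and the fail-early behaviour are assumptions, and only the file names come from the layout above.

```python
import json
from pathlib import Path

# File names come from the setup/config/ layout; the loader itself is hypothetical.
JSON_CONFIGS = ("aws_auth.json", "ec2.json", "github.json", "timeout.json")


def load_configs(config_dir):
    """Read each JSON config into a dict keyed by file stem, failing early on gaps."""
    config_dir = Path(config_dir)
    missing = [name for name in JSON_CONFIGS if not (config_dir / name).is_file()]
    if missing:
        raise FileNotFoundError("missing config files: " + ", ".join(missing))
    return {
        Path(name).stem: json.loads((config_dir / name).read_text())
        for name in JSON_CONFIGS
    }
```

Checking for every file before parsing any of them means a missing credential file is reported before cloud resources are requested.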
Execution Model: Sequential execution with completion signaling
Script Architecture:
001_btrfsdevel (Timeout: 120 min)
├── Clone btrfs-devel kernel source
├── Copy kernel.config
├── Compile kernel (make -j4)
├── Install kernel modules
├── Update bootloader
├── Reboot instance
└── Signal: touch /tmp/001_btrfsdevel.completed
002_btrfsprogs (Timeout: 120 min)
├── Clone btrfsprogs repository
├── Build from source (autogen, configure, make)
├── Run test suites: fsck, cli, misc, fuzz
└── Signal: touch /tmp/002_btrfsprogs.completed
003_xfstests (Timeout: 120 min)
├── Clone xfstests repository
├── Setup test environment (local.config)
├── Create test filesystems on /dev/xvdb-xvdg
├── Execute: make && make install && ./check -g auto
└── Signal: touch /tmp/003_xfstests.completed
004_raid5_scrub (Timeout: 30 min)
├── Apply RAID5-specific patches
├── Run scrub tests
└── Signal: touch /tmp/004_raid5_scrub.completed
Completion Protocol:
- Each script creates /tmp/SCRIPTNAME.completed on success
- Orchestrator polls for completion files with timeout
- Missing completion file = test failure
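The polling side of this protocol might look like the sketch below. The function name, the 30-second poll interval, and the injectable `clock` and `sleep` parameters are assumptions for illustration. The remote check for the marker file (over SSH in the real system) is represented by a plain callable.

```python
import time


def wait_for_completion(marker_exists, timeout_minutes, poll_seconds=30,
                        clock=time.monotonic, sleep=time.sleep):
    """Return True once marker_exists() is truthy, False if the timeout elapses.

    marker_exists is any zero-argument callable; in BTRFSQA it would wrap a
    remote test for /tmp/SCRIPTNAME.completed (that remote call is not shown).
    """
    deadline = clock() + timeout_minutes * 60
    while True:
        if marker_exists():
            return True
        if clock() >= deadline:
            return False  # missing completion file = test failure
        sleep(poll_seconds)
```

A monotonic clock is used so that wall-clock adjustments on the control machine cannot shorten or extend the timeout.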
Technology: Asciinema (terminal recording)
Workflow:
1. Start recording: asciinema rec -c "bash SCRIPT" OUTPUT.json
2. Execute test script within recording
3. Upload to asciinema.org: asciinema upload OUTPUT.json
4. Parse upload URL from response
5. Embed in HTML table with thumbnail
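Steps 1 and 4 can be sketched as two small helpers. The helper names are hypothetical, and the assumption that the upload output contains a link of the form https://asciinema.org/a/ID is inferred from the embed URLs used elsewhere in this document, not from a documented response format.

```python
import re
import shlex

# Recording links as they appear in the embed snippets: https://asciinema.org/a/ID
_URL_RE = re.compile(r"https://asciinema\.org/a/([A-Za-z0-9]+)")


def record_command(script, output):
    """Build the `asciinema rec` invocation from step 1 as an argument list."""
    return ["asciinema", "rec", "-c", "bash " + shlex.quote(script), output]


def parse_upload_url(upload_output):
    """Return (url, recording_id) found in the upload output, or None."""
    match = _URL_RE.search(upload_output)
    if match is None:
        return None
    return match.group(0), match.group(1)
```

Returning None instead of raising lets the publisher still emit a table row, without a screencast, when an upload fails.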
Benefits:
- Visual debugging of test failures
- Exact reproduction of terminal session
- Lightweight (text-based format)
- Publicly shareable links
Start
│
├─► [1] Load Configuration Files
│ ├── aws_auth.json
│ ├── ec2.json
│ ├── github.json
│ └── timeout.json
│
├─► [2] AWS Infrastructure Provisioning
│ ├── Connect to AWS (Boto)
│ ├── Request spot instance (r4.large, Fedora 26)
│ ├── Configure 6x EBS volumes (20GB each)
│ ├── Tag resources (Name: btrfsqa-DATE)
│ ├── Wait for instance state: running
│ └── Get public IP address
│
├─► [3] Remote Environment Setup
│ ├── SSH connect (Fabric, wait for availability)
│ ├── Upload configuration files:
│ │ ├── kernel.config → /tmp/
│ │ ├── local.config → /tmp/
│ │ ├── bashrc → /tmp/
│ │ └── netrc → /root/.netrc
│ ├── Upload test scripts (001-004)
│ └── Install base dependencies:
│ ├── git
│ ├── python3
│ ├── asciinema
│ └── screen
│
├─► [4] Sequential Test Execution
│ │
│ ├─► Script 001: BTRFS Kernel Build (120 min timeout)
│ │ ├── Record: asciinema rec -c "bash 001_btrfsdevel"
│ │ ├── Download kernel source (misc-next branch)
│ │ ├── Configure with kernel.config
│ │ ├── Compile: make -j4
│ │ ├── Install: make modules_install && make install
│ │ ├── Update grub bootloader
│ │ ├── Reboot instance
│ │ ├── Wait for SSH reconnection
│ │ ├── Verify new kernel: uname -r
│ │ ├── Create completion marker
│ │ └── Upload recording to asciinema.org
│ │
│ ├─► Script 002: btrfsprogs Tests (120 min timeout)
│ │ ├── Record execution
│ │ ├── Clone btrfsprogs repository
│ │ ├── Build: ./autogen.sh && ./configure && make
│ │ ├── Run test suites:
│ │ │ ├── make test-fsck
│ │ │ ├── make test-cli
│ │ │ ├── make test-misc
│ │ │ └── make test-fuzz
│ │ ├── Create completion marker
│ │ └── Upload recording
│ │
│ ├─► Script 003: xfstests (120 min timeout)
│ │ ├── Record execution
│ │ ├── Clone xfstests repository
│ │ ├── Install dependencies
│ │ ├── Build: make && make install
│ │ ├── Setup test devices:
│ │ │ ├── TEST_DEV=/dev/xvdb
│ │ │ ├── TEST_DIR=/mnt/test
│ │ │ ├── SCRATCH_DEV_POOL=/dev/xvdc-xvdg
│ │ │ └── SCRATCH_MNT=/mnt/scratch
│ │ ├── Execute: ./check -g auto (tests in the "auto" group)
│ │ ├── Create completion marker
│ │ └── Upload recording
│ │
│ └─► Script 004: RAID5 Scrub (30 min timeout)
│ ├── Record execution
│ ├── Apply specific patches
│ ├── Run RAID5 scrub tests
│ ├── Create completion marker
│ └── Upload recording
│
├─► [5] Results Collection
│ ├── Download asciinema upload URLs
│ ├── Download test logs from /tmp/
│ ├── Determine pass/fail status:
│ │ ├── Pass: *.completed file exists
│ │ └── Fail: timeout or missing completion
│ └── Collect metadata (timestamps, script names)
│
├─► [6] Results Publishing
│ ├── Clone GitHub repository (local temp directory)
│ ├── Create results directory: results/results_YYYY-MM-DD_HH:MM/
│ ├── Copy test logs to results directory
│ ├── Generate HTML table with:
│ │ ├── Script name
│ │ ├── Status badge (pass/fail)
│ │ ├── Asciinema embed with thumbnail
│ │ └── Log file download links
│ ├── Update _layouts/default.html
│ ├── Git commit with timestamp message
│ ├── Git push to origin/master
│ └── GitHub Pages auto-rebuilds site
│
├─► [7] Infrastructure Cleanup
│ ├── Wait 2 minutes (allow final syncs)
│ ├── Terminate EC2 instance
│ ├── Delete unattached EBS volumes
│ └── Log cleanup completion
│
End
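The key property of this workflow is that step [7] runs even when an earlier step fails, so no instance or volume is left behind. A minimal sketch of that control flow, assuming each stage is wrapped in a zero-argument callable (the function name and the step tuples are hypothetical):

```python
def run_pipeline(steps, cleanup):
    """Run (name, callable) steps in order; always run cleanup, even on failure.

    Returns the names of the steps that completed before any failure.
    """
    completed = []
    try:
        for name, step in steps:
            step()
            completed.append(name)
    except Exception as exc:
        # Report and fall through to cleanup instead of leaking cloud resources.
        print("pipeline stopped after %s: %s" % (completed, exc))
    finally:
        cleanup()
    return completed
```

In this sketch a failure stops the remaining steps. Publishing partial results before cleanup would be a variation, not something the source describes.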
results/
├── results_2024-11-15_10:30/
│ ├── btrfsprogs_001/
│ │ ├── test.log
│ │ ├── fsck-tests.log
│ │ └── cli-tests.log
│ ├── xfstests_001/
│ │ ├── results.log
│ │ ├── failed.log
│ │ └── check.log
│ ├── logs/
│ │ ├── 001_btrfsdevel.log
│ │ ├── 002_btrfsprogs.log
│ │ ├── 003_xfstests.log
│ │ └── 004_raid5_scrub.log
│ └── screencasts/
│ ├── 001.json (asciinema recording)
│ └── 001.url (uploaded URL)
└── results_2024-11-16_09:45/
└── ... (next test run)
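The directory naming in this tree can be derived from the run's start time. The following is a sketch with hypothetical helper names. Only the `results_YYYY-MM-DD_HH:MM` pattern and the `logs/` subdirectory are taken from the layout above.

```python
from datetime import datetime
from pathlib import PurePosixPath


def results_dir(started_at, root="results"):
    """Return the per-run directory, named results_YYYY-MM-DD_HH:MM."""
    return PurePosixPath(root) / started_at.strftime("results_%Y-%m-%d_%H:%M")


def log_path(started_at, script_name):
    """Return the path of one script's log inside the run directory."""
    return results_dir(started_at) / "logs" / (script_name + ".log")


run = datetime(2024, 11, 15, 10, 30)
print(results_dir(run))               # results/results_2024-11-15_10:30
print(log_path(run, "003_xfstests"))  # results/results_2024-11-15_10:30/logs/003_xfstests.log
```

Note that the colon in the directory name is valid on Linux and in Git, but such a path cannot be checked out on Windows filesystems.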
<tr>
<td>Script Name</td>
<td>
<span class="status-badge pass|fail">PASS|FAIL</span>
</td>
<td>
<script src="https://asciinema.org/a/ID.js" data-theme="monokai"></script>
</td>
<td>
<a href="results/PATH/logs/SCRIPT.log">View Log</a>
</td>
</tr>
EC2 Instance Specifications:
- Instance Type: r4.large (15.25 GB RAM, 2 vCPUs)
- Purchasing: Spot instance (cost optimization)
- AMI: Fedora 26 (ami-id from ec2.json)
- Region: Configurable (us-east-1 default)
- Security Group: SSH (port 22) enabled
- Key Pair: btrfsqa-keypair
Storage Configuration:
- Root Volume: Default AMI root (typically 8-10 GB)
- Data Volumes: 6x EBS volumes (20 GB GP2 each)
  - /dev/xvdb: Primary test device
  - /dev/xvdc-xvdg: Scratch devices for multi-disk tests
- Lifecycle: Deleted on termination
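The six data volumes could be described with a generated block device mapping. This sketch assumes the dictionary shape used in the `bdm.append(...)` example under Extensibility. Whether `set_bdm()` uses exactly these keys is an assumption, and the function name is hypothetical.

```python
import string


def build_block_device_mappings(count=6, size_gb=20, volume_type="gp2"):
    """Return one mapping per data volume, named /dev/xvdb, /dev/xvdc, ..."""
    # Start at 'b' so that device naming matches the list above.
    letters = string.ascii_lowercase[1:1 + count]
    return [
        {
            "DeviceName": "/dev/xvd" + letter,
            "Ebs": {
                "VolumeSize": size_gb,          # GB per volume
                "VolumeType": volume_type,
                "DeleteOnTermination": True,    # lifecycle: deleted on termination
            },
        }
        for letter in letters
    ]
```

Generating the list from a count keeps the device names and the xfstests scratch pool in step when volumes are added.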
Resource Tagging:
{
"Name": "btrfsqa-2024-11-18",
"Project": "btrfsqa",
"ManagedBy": "automation"
}
Internet
│
├─► Local Machine (Control Plane)
│ ├── Outbound: AWS API (HTTPS)
│ └── Outbound: SSH to EC2
│
└─► AWS Region (us-east-1)
│
├─► EC2 Instance (Public Subnet)
│ ├── Public IP: Dynamic (assigned at launch)
│ ├── Inbound: SSH (port 22) from anywhere
│ └── Outbound: Internet access (Git, package repos)
│
├─► GitHub.com
│ ├── Git clone (btrfs-devel, btrfsprogs, xfstests)
│ └── Git push (results publishing)
│
└─► Asciinema.org
└── Recording upload (HTTP POST)
BTRFSQA Testing Pyramid
┌───────────────────────┐
│ Integration Tests │ Script 004: RAID5 Scrub
│ (Specific Scenarios)│ • Targeted regression tests
└───────────────────────┘ • Known bug validation
│
┌────────┴─────────┐
│ Functional Tests │ Script 003: xfstests
│ (Filesystem Ops) │ • 400+ test cases
└────────────────────┘ • POSIX compliance
│ • Stress testing
┌────────┴─────────┐ • Data integrity
│ Unit Tests │ Script 002: btrfsprogs
│ (Userspace Tools)│ • Tool-specific tests
└────────────────────┘ • CLI validation
│ • Format verification
┌────────┴─────────┐
│ Kernel Build │ Script 001: Kernel
│ (Base Layer) │ • Compilation check
└────────────────────┘ • Module loading
Objective: Validate latest BTRFS kernel code compiles and boots
Steps:
- Clone btrfs-devel repository (misc-next branch)
- Copy custom kernel configuration
- Compile kernel (make -j4)
- Install kernel and modules
- Update bootloader configuration
- Reboot instance
- Verify new kernel loaded
Success Criteria:
- Compilation completes without errors
- Kernel boots successfully
- BTRFS module loads
- /tmp/001_btrfsdevel.completed created
Objective: Validate userspace tools functionality
Test Categories:
- fsck-tests: Filesystem check and repair
- cli-tests: Command-line interface
- misc-tests: Miscellaneous utilities
- fuzz-tests: Malformed input handling
Success Criteria:
- All test categories pass
- No crashes or hangs
- /tmp/002_btrfsprogs.completed created
Objective: Comprehensive filesystem testing
Test Coverage:
- File operations (create, read, write, delete)
- Directory operations
- Extended attributes
- ACLs and permissions
- Quotas
- Snapshots and clones
- Compression
- Checksumming
- RAID configurations
- Error injection
- Recovery scenarios
Configuration:
export TEST_DEV=/dev/xvdb
export TEST_DIR=/mnt/test
export SCRATCH_DEV_POOL="/dev/xvdc /dev/xvdd /dev/xvde /dev/xvdf /dev/xvdg"
export SCRATCH_MNT=/mnt/scratch
export FSTYP=btrfs
Success Criteria:
- Test suite completes
- No kernel panics
- Acceptable pass rate
- /tmp/003_xfstests.completed created
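The xfstests configuration shown above follows directly from the device list: the first volume becomes TEST_DEV and the rest form the scratch pool. A sketch of rendering it, assuming a hypothetical helper (the source ships local.config as a static file):

```python
def render_local_config(devices, test_dir="/mnt/test", scratch_mnt="/mnt/scratch"):
    """Use the first device as TEST_DEV and the remaining ones as the scratch pool."""
    if len(devices) < 2:
        raise ValueError("need one test device and at least one scratch device")
    test_dev, scratch = devices[0], devices[1:]
    lines = [
        "export TEST_DEV=" + test_dev,
        "export TEST_DIR=" + test_dir,
        'export SCRATCH_DEV_POOL="' + " ".join(scratch) + '"',
        "export SCRATCH_MNT=" + scratch_mnt,
        "export FSTYP=btrfs",
    ]
    return "\n".join(lines) + "\n"
```

Calling it with `["/dev/xvd" + c for c in "bcdefg"]` reproduces the five export lines listed in the configuration.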
Objective: Validate specific RAID5 functionality
Focus Areas:
- RAID5/6 rebuild
- Scrub operation
- Data recovery
- Parity verification
Success Criteria:
- Specific test cases pass
- No data corruption
- /tmp/004_raid5_scrub.completed created
Technology Stack:
- Framework: Jekyll (static site generator)
- Theme: Cayman (one of the themes supported by GitHub Pages)
- Hosting: GitHub Pages (automatic deployment)
File Structure:
btrfsqa/
├── _config.yml # Jekyll configuration
│ ├── theme: jekyll-theme-cayman
│ └── title: BTRFSQA Dashboard
│
├── _layouts/
│ └── default.html # Main page template
│ ├── Header: Project info
│ ├── Table: Test results (dynamically updated)
│ └── Footer: Known issues
│
├── results/ # Test execution results
│ └── (timestamped directories)
│
└── index.md # Landing page content
Update Mechanism:
def update_htmltable():
    """Publish the latest results to the dashboard.

    1. Clone repository to temp directory
    2. Read _layouts/default.html
    3. Generate new table rows for latest results
    4. Insert rows into HTML template
    5. Commit changes: "Update results - YYYY-MM-DD HH:MM"
    6. Push to origin/master
    7. GitHub Pages rebuilds (automatic, ~1 minute)
    """
Table Columns:
- Script Name: Test identifier (e.g., "001_btrfsdevel")
- Status: Visual badge (green PASS, red FAIL)
- Screencast: Embedded Asciinema player with thumbnail
- Logs: Download links for detailed output
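One row with these four columns could be generated as in the sketch below. The function name and argument names are hypothetical. The markup mirrors the row and embed snippets in this document, and values are HTML-escaped since script names and paths end up in a public page.

```python
from html import escape


def render_row(script_name, passed, recording_id, log_href):
    """Return one <tr> with script name, status badge, screencast, and log link."""
    status = "pass" if passed else "fail"
    return (
        "<tr>\n"
        "  <td>{name}</td>\n"
        '  <td><span class="status-badge {status}">{label}</span></td>\n'
        '  <td><script id="asciicast-{rec}" src="https://asciinema.org/a/{rec}.js"'
        ' async data-theme="monokai"></script></td>\n'
        '  <td><a href="{href}">View Log</a></td>\n'
        "</tr>"
    ).format(
        name=escape(script_name),
        status=status,
        label=status.upper(),
        rec=escape(recording_id),
        href=escape(log_href),
    )
```

For example, `render_row("001_btrfsdevel", True, "RECORDING_ID", "results/PATH/logs/001_btrfsdevel.log")` yields a row with a green PASS badge class and the matching embed script.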
Asciinema Integration:
<script
id="asciicast-RECORDING_ID"
src="https://asciinema.org/a/RECORDING_ID.js"
async
data-theme="monokai"
data-size="small"
data-cols="120"
></script>
Benefits:
- No server infrastructure required
- Automatic HTTPS
- CDN distribution
- Version controlled history
- Zero operational cost
Sensitive Files (not in version control):
setup/config/
├── aws_auth.json # AWS access keys
├── github.json # GitHub credentials
└── netrc # Git authentication
Git Configuration:
# .gitignore entries
setup/config/aws_auth.json
setup/config/github.json
setup/config/netrc
*.pem
*.key
AWS Permissions Required:
- ec2:RunInstances (spot instance creation)
- ec2:TerminateInstances
- ec2:CreateTags
- ec2:DescribeInstances
- ec2:DescribeVolumes
- ec2:DeleteVolume
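These actions map onto a single IAM policy statement. The sketch below builds such a policy document in the standard IAM JSON layout. The helper name is hypothetical, and the wildcard resource is a simplifying assumption that a real deployment would likely narrow.

```python
import json

# Actions as listed above.
REQUIRED_ACTIONS = [
    "ec2:RunInstances",
    "ec2:TerminateInstances",
    "ec2:CreateTags",
    "ec2:DescribeInstances",
    "ec2:DescribeVolumes",
    "ec2:DeleteVolume",
]


def build_policy(actions=REQUIRED_ACTIONS):
    """Return an IAM policy document (as a dict) allowing the given actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": list(actions), "Resource": "*"}
        ],
    }


policy_json = json.dumps(build_policy(), indent=2)
```

If the spot request is made through the RequestSpotInstances API instead of RunInstances, the `ec2:RequestSpotInstances` action would also need to be allowed.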
GitHub Permissions:
- Repository write access (for results publishing)
- Pages deployment (automatic with write access)
EC2 Security Group:
- Inbound: SSH (port 22) from 0.0.0.0/0
- Outbound: All traffic allowed
Recommendations:
- Restrict SSH to known IP ranges
- Use IAM roles instead of access keys
- Enable CloudTrail for audit logging
- Use GitHub deploy keys scoped to a single repository (read-only where only cloning is needed)
Process:
- Create new script file: setup/scripts/00X_testname
- Make executable: chmod +x 00X_testname
- Follow completion protocol by ending the script with: touch /tmp/00X_testname.completed
- Add timeout to setup/config/timeout.json: { "00X_testname": 60 }
- Script will be automatically discovered and executed
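Automatic discovery could work by listing the scripts directory and pairing each script with its timeout. This is a sketch under assumptions: the `NNN_name` pattern is inferred from the existing script names, and the 60-minute fallback for scripts missing from timeout.json is hypothetical.

```python
import re
from pathlib import Path

# Three digits, an underscore, then a name: 001_btrfsdevel, 00X-style additions, ...
_SCRIPT_RE = re.compile(r"^\d{3}_\w+$")


def discover_scripts(scripts_dir, timeouts, default_timeout=60):
    """Return (script_name, timeout_minutes) pairs sorted into execution order."""
    names = sorted(
        p.name for p in Path(scripts_dir).iterdir()
        if p.is_file() and _SCRIPT_RE.match(p.name)
    )
    return [(name, timeouts.get(name, default_timeout)) for name in names]
```

Sorting by name is what makes the numeric prefix determine execution order, so a new script only needs the next free number.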
Script Template:
#!/bin/bash
set -e # Exit on error
# Test logic here
echo "Running custom test..."
# Signal completion
touch /tmp/00X_testname.completed
Common Modifications:
- Instance Type (ec2.json), for more CPU/RAM:
  { "InstanceType": "r5.xlarge" }
- Storage (btrfsqa.py:set_bdm()), to add more volumes:
  bdm.append({ 'DeviceName': '/dev/xvdh', 'Ebs': {'VolumeSize': 50} })
- Timeout Adjustments (timeout.json), for example raising xfstests to 4 hours:
  { "003_xfstests": 240 }
Future Extensibility:
- Notification plugins: Email, Slack, PagerDuty alerts
- Storage backends: S3, NFS for results
- Test schedulers: Cron integration, webhook triggers
- Results analyzers: Automated failure classification
- Comparison tools: Regression detection across runs
- Error Handling:
  - Retry logic for transient failures
  - Partial result preservation on timeout
  - Email notifications on test failures
- Performance:
  - Parallel test execution (where safe)
  - Incremental kernel builds
  - Result compression
- Reporting:
  - Test duration tracking
  - Pass/fail rate graphs
  - Historical trend analysis
- Multi-Kernel Testing:
  - Test multiple kernel versions per run
  - Comparison matrix
  - Regression bisection
- Custom Test Configurations:
  - Parameterized xfstests runs
  - Mount option variations
  - Feature flag combinations
- Integration:
  - GitHub webhook triggers
  - PR comment integration
  - Slack notifications
- Distributed Testing:
  - Multi-region execution
  - Parallel instance testing
  - Load balancing
- Advanced Analytics:
  - ML-based failure prediction
  - Automatic bug categorization
  - Performance regression detection
- Community Features:
  - Public API for results
  - Custom test submission
  - Comparison with community runs
BTRFSQA provides a robust, automated testing infrastructure for BTRFS development. Its design emphasizes automation, transparency, and cost-effectiveness while maintaining extensibility for future enhancements. The system successfully bridges kernel development with public quality assurance, enabling the BTRFS community to track stability and progress over time.