Dogfooding

Using Tuulbelt tools to validate Tuulbelt tools.

What is Dogfooding?

Dogfooding means using our own tools to validate our tools. For CLI Progress Reporting, we use Test Flakiness Detector to ensure all 111 tests are deterministic and reliable.

Why It Matters

Flaky tests undermine confidence. If tests pass sometimes and fail sometimes, you can't trust:

Whether code changes introduced bugs
Whether the tool works reliably in production
Whether failures indicate real issues

Dogfooding proves our tools work in real scenarios and catches integration issues early.

Validation Process

Test File

cli-progress-reporting/test/flakiness-detection.test.ts

Configuration

Runs: 10 iterations of the complete test suite
Test Command: npm test
Detection Threshold: Any test with 0% < failure rate < 100%
Runtime: ~20 minutes for 10 runs

What It Checks

Consistency - Do all runs produce the same result?
Determinism - Are there any probabilistic failures?
Isolation - Do tests interfere with each other?

Running Flakiness Detection

bash

cd cli-progress-reporting
npx tsx test/flakiness-detection.test.ts

Expected Output

Flakiness detection complete!

Results:
────────────────────────────────────────────────────────────
Total runs:        10
Passed runs:       10
Failed runs:       0
Success rate:      100.0%
Execution time:    ~1200s
Flaky tests found: 0
────────────────────────────────────────────────────────────

Perfect! All tests passed consistently across all runs.
No flaky tests detected - tests are deterministic and reliable.

Dogfooding Success: Test Flakiness Detector validated CLI Progress Reporting

What We Validate

111 tests across 34 suites:

Unit tests (35 tests) - Core functionality
CLI integration tests (28 tests) - Command-line interface
Filesystem tests (21 tests) - Edge cases and error handling

Each test must:

Pass or fail consistently across all 10 runs
Not exhibit probabilistic behavior
Clean up resources properly
Not interfere with other tests

Results

Last Validation: 2025-12-23

Total runs: 2 (development validation)
Pass rate: 100%
Flaky tests: 0
Confidence: Production-ready

Full 10-run validation recommended in CI (takes ~20 minutes).

Design Patterns That Prevent Flakiness

1. Deterministic Test IDs

typescript

// Good: Deterministic ID generation
const id = `test-${Date.now()}-${counter++}`;

// Bad: Random IDs
const id = `test-${Math.random()}`;  // Non-deterministic!

2. Proper Cleanup

typescript

test('my test', () => {
  const id = `test-${Date.now()}`;
  init(100, 'Test', { id });

  // Do test...

  // Clean up
  clear({ id });
});

3. Unique Test Resources

typescript

// Good: Unique filename per test
const counterFile = join(tmpDir, `counter-${Date.now()}-${testId}.txt`);

// Bad: Shared filename
const counterFile = join(tmpDir, 'counter.txt');  // Collision!

4. No Probabilistic Logic

typescript

// Good: Deterministic counter pattern
const counter = parseInt(readFileSync(counterFile, 'utf-8'));
const shouldPass = counter % 2 === 0;

// Bad: Random behavior
const shouldPass = Math.random() < 0.5;  // Non-deterministic!

Continuous Validation

When to Run Flakiness Detection:

Before every release (10+ runs)
After adding new tests (2-5 runs for quick check)
After modifying test infrastructure (10+ runs)
When CI shows intermittent failures (10+ runs)

Development Validation:

bash

# Quick validation (2 runs, ~4 minutes)
cd cli-progress-reporting
# Edit test/flakiness-detection.test.ts: set TEST_RUNS = 2
npx tsx test/flakiness-detection.test.ts

Integration with CI:

yaml

# .github/workflows/flakiness-check.yml
name: Flakiness Check
on:
  pull_request:
    paths:
      - 'test/**'
jobs:
  detect-flakiness:
    runs-on: ubuntu-latest
    timeout-minutes: 30
    steps:
      - uses: actions/checkout@v3
      - run: npx tsx test/flakiness-detection.test.ts  # Uses 10 runs

Future Dogfooding Opportunities

As more Tuulbelt tools are built, we'll use them to validate each other:

Cross-Platform Path Handling → Validate paths in Test Flakiness Detector and CLI Progress
File-Based Semaphore → Coordinate concurrent test execution
Output Diffing Utility → Compare test output across runs

This creates a network of validated, production-ready tools.

Conclusion

Dogfooding proves that:

Test Flakiness Detector works - Successfully ran 111 tests 10+ times
CLI Progress tests are reliable - 100% consistent results
Tuulbelt tools integrate well - Clean APIs, no surprises
Real-world usage validated - Not just toy examples

This establishes dogfooding as a standard practice for all future Tuulbelt tools.

Learn more: Test Flakiness Detector Documentation

Dogfooding ​

What is Dogfooding? ​

Why It Matters ​

Validation Process ​

Test File ​

Configuration ​

What It Checks ​

Running Flakiness Detection ​

Expected Output ​

What We Validate ​

Results ​

Design Patterns That Prevent Flakiness ​

1. Deterministic Test IDs ​

2. Proper Cleanup ​

3. Unique Test Resources ​

4. No Probabilistic Logic ​

Continuous Validation ​

Future Dogfooding Opportunities ​

Conclusion ​

Dogfooding

What is Dogfooding?

Why It Matters

Validation Process

Test File

Configuration

What It Checks

Running Flakiness Detection

Expected Output

What We Validate

Results

Design Patterns That Prevent Flakiness

1. Deterministic Test IDs

2. Proper Cleanup

3. Unique Test Resources

4. No Probabilistic Logic

Continuous Validation

Future Dogfooding Opportunities

Conclusion