# Audit Process Runbook

## Overview
This runbook defines the end-to-end process for conducting independent strategy audits. The audit is a mandatory gate before any deployment.
## Audit Principles
- Independence: Auditor must not have developed the strategy
- Thoroughness: Every test must be run, no shortcuts
- Skepticism: Assume the strategy is flawed until proven otherwise
- Documentation: Every finding must be recorded
- Objectivity: Verdict based on evidence, not opinion
## Audit Workflow

```text
┌─────────────────────────────────────────────────────────────────────┐
│                           AUDIT WORKFLOW                            │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│  ┌──────────┐     ┌──────────┐     ┌──────────┐     ┌──────────┐    │
│  │ RECEIVE  │────▶│ VALIDATE │────▶│ EXECUTE  │────▶│ VERDICT  │    │
│  │ PACKAGE  │     │ PACKAGE  │     │  TESTS   │     │ & REPORT │    │
│  └──────────┘     └──────────┘     └──────────┘     └──────────┘    │
│       │                │                │                │          │
│       ▼                ▼                ▼                ▼          │
│  - Code           - Complete?      - Anti-bias      - PASS          │
│  - Docs           - Runnable?      - Stress         - PASS+FIX      │
│  - Results        - Documented?    - Worst-week     - FAIL          │
│                                    - Innovation                     │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘
```
## Phase 1: Receive Audit Package

### Required Artifacts

```yaml
audit_package:
  strategy:
    code: path/to/strategy.pine   # or .py
    version: X.Y.Z
    checksum: sha256:xxx
  documentation:
    hypothesis: path/to/hypothesis.md
    validation_checklist: path/to/checklist.md
  backtest_results:
    report: path/to/backtest_report.md
    trade_log: path/to/trades.csv
    equity_curve: path/to/equity.csv
  data:
    dataset: path/to/data.parquet
    date_range: YYYY-MM-DD to YYYY-MM-DD
    catalog_entry: data/catalog.yaml#entry_name
  requestor:
    name: Developer Name
    date: YYYY-MM-DD
    priority: normal | urgent
```
### Package Validation Checklist
- All required files present
- Checksums match
- Code compiles without errors
- Documentation is complete
- Backtest results are reproducible
- Data is accessible
If incomplete: Return to submitter with specific missing items.
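The checksum item on the checklist can be automated rather than eyeballed. A minimal sketch, assuming the manifest's `sha256:<hex>` format from the package spec above (the `verify_checksum` helper name is hypothetical):

```python
import hashlib
from pathlib import Path

def verify_checksum(path: str, expected: str) -> bool:
    """Check a package file against its manifest entry.

    `expected` uses the manifest's "sha256:<hex>" form; any other
    algorithm prefix is rejected rather than silently accepted.
    """
    algo, _, digest = expected.partition(":")
    if algo != "sha256":
        raise ValueError(f"unsupported checksum algorithm: {algo}")
    actual = hashlib.sha256(Path(path).read_bytes()).hexdigest()
    return actual == digest
```

A mismatch here is grounds for immediate return to the submitter, since it means the audited code is not the code that produced the submitted results.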
## Phase 2: Execute Audit Tests

### Step 2.1: Anti-Bias Tests

```bash
# Run the anti-bias test suite
cd /path/to/rqf-ml
pytest tests/test_bias.py -v --strategy=path/to/strategy

# Expected output:
# test_lookahead ... PASSED/FAILED
# test_leakage ... PASSED/FAILED
# test_repaint ... PASSED/FAILED
# test_time_alignment ... PASSED/FAILED
```
Manual verification for Pine Script:
```bash
# Check for common issues
grep -n "high\[0\]\|low\[0\]" strategy.pine            # Same-bar high/low
grep -n "request.security.*lookahead_on" strategy.pine # Lookahead enabled
grep -n "barstate.isrealtime" strategy.pine            # Realtime-only logic
```
### Step 2.2: Independent Backtest

Run the backtest independently to verify the developer's reported results:
```python
# Run with the same parameters as the developer
from src.evaluation.backtester import Backtester

bt = Backtester(config={
    'start_date': 'YYYY-MM-DD',
    'end_date': 'YYYY-MM-DD',
    'initial_capital': X,
    'commission': X,
    'slippage': X,
})
results = bt.run(strategy, data)

# Compare to the developer's results
assert abs(results.sharpe - developer_sharpe) < 0.1, "Sharpe mismatch"
assert abs(results.max_dd - developer_max_dd) < 0.02, "DD mismatch"
```
### Step 2.3: Stress Tests

```python
# Run the standard stress scenarios
stress_scenarios = [
    'covid_crash',
    'flash_crash',
    'brexit',
    '2x_slippage',
    '2x_spread',
    '50%_fill_rate',
    '10_bar_delay',
]

for scenario in stress_scenarios:
    result = bt.run_stress_test(strategy, scenario)
    log_result(scenario, result)
```
### Step 2.4: Worst-Week Analysis

```python
import pandas as pd

# Identify the worst week
daily_returns = results.returns.resample('D').sum()
weekly_returns = daily_returns.resample('W').sum()
worst_week = weekly_returns.idxmin()  # label marks the end of the week

# Analyze that week
week_start = worst_week - pd.Timedelta(days=7)
worst_week_data = data.loc[week_start:worst_week]
worst_week_trades = trades[trades['date'].between(week_start, worst_week)]

# Document findings
print(f"Worst week ending: {worst_week.date()}")
print(f"Return: {weekly_returns[worst_week]:.2%}")
print(f"Trades: {len(worst_week_trades)}")
print(f"Win rate: {worst_week_trades['profitable'].mean():.2%}")
```
### Step 2.5: Sensitivity Analysis

```python
# Parameter sensitivity sweep: perturb each parameter by +/-10% and +/-20%
base_params = strategy.get_params()
for param_name, base_value in base_params.items():
    for multiplier in [0.8, 0.9, 1.1, 1.2]:
        test_params = base_params.copy()
        test_params[param_name] = base_value * multiplier
        result = bt.run(strategy.with_params(test_params), data)
        log_sensitivity(param_name, multiplier, result.sharpe)

# Check for cliff edges:
# flag any parameter where a 10% change causes a >30% Sharpe drop
```
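The cliff-edge check can be made explicit instead of a manual scan of the log. A sketch, assuming the sweep results were collected into a dict keyed by `(param_name, multiplier)` (the `find_cliff_edges` name and the dict shape are illustrative, not part of the audit tooling):

```python
def find_cliff_edges(sensitivity, base_sharpe, drop_threshold=0.30):
    """Flag parameters where a small perturbation collapses the Sharpe.

    `sensitivity` maps (param_name, multiplier) -> Sharpe from the sweep;
    any +/-10% tweak whose Sharpe falls more than `drop_threshold` below
    the base run is reported as a cliff edge.
    """
    cliffs = []
    for (param, mult), sharpe in sensitivity.items():
        if mult in (0.9, 1.1) and base_sharpe > 0:
            drop = (base_sharpe - sharpe) / base_sharpe
            if drop > drop_threshold:
                cliffs.append((param, mult, round(drop, 3)))
    return cliffs
```

A non-empty result is a strong overfitting signal: a robust edge should degrade gracefully, not fall off a cliff, under 10% parameter perturbation.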
### Step 2.6: Innovation & Prior Art

1. Search for similar strategies:
   - Academic papers (Google Scholar, SSRN)
   - Trading forums (Elite Trader, ForexFactory)
   - Open source (GitHub, TradingView public)
2. Document findings:
   - Similar approaches found
   - Key differences from this strategy
   - Risk of edge decay due to crowding
3. Identify improvements:
   - What could make this better?
   - What research is worth pursuing?
## Phase 3: Generate Verdict

### Decision Tree

```text
All anti-bias tests pass?
├── No  → FAIL
└── Yes → Continue

OOS degradation < 30%?
├── No  → FAIL
└── Yes → Continue

Walk-forward consistency ≥ 60%?
├── No  → FAIL
└── Yes → Continue

Worst week survives prop firm limits?
├── No  → FAIL (or PASS-WITH-FIXES if mitigatable)
└── Yes → Continue

Any critical issues?
├── Yes → FAIL
└── No  → Continue

Any major issues?
├── Yes → PASS-WITH-FIXES
└── No  → PASS
```
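The tree above can be encoded directly, which keeps verdicts mechanical and reproducible rather than judgment calls. A sketch, where the keys of `checks` are illustrative names for results the audit steps already produce:

```python
def audit_verdict(checks: dict) -> str:
    """Walk the decision tree in order; the first failing gate wins."""
    if not checks["bias_tests_pass"]:
        return "FAIL"
    if checks["oos_degradation"] >= 0.30:
        return "FAIL"
    if checks["walk_forward_consistency"] < 0.60:
        return "FAIL"
    if not checks["worst_week_survives"]:
        # Mitigatable worst-week breaches downgrade to conditional approval
        return "PASS-WITH-FIXES" if checks["worst_week_mitigatable"] else "FAIL"
    if checks["critical_issues"]:
        return "FAIL"
    if checks["major_issues"]:
        return "PASS-WITH-FIXES"
    return "PASS"
```

Ordering matters: anti-bias failures short-circuit everything else, so a biased strategy can never reach a conditional pass.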
### Verdict Definitions
| Verdict | Meaning | Required Actions |
|---|---|---|
| PASS | Approved for deployment | Proceed to paper trading |
| PASS-WITH-FIXES | Conditionally approved | Complete fix list, re-verify, then deploy |
| FAIL | Not approved | Return to development, address findings |
## Phase 4: Generate Report

### Report Generation Steps

1. Copy `docs/audits/audit_template.md` to `docs/audits/{strategy}_audit.md`
2. Fill in all sections
3. Attach evidence:
   - Anti-bias test output
   - Independent backtest results
   - Stress test results
   - Worst-week analysis
4. Document all findings
5. Generate fix list (if applicable)
6. Record verdict and justification
7. Sign and date
### Report Delivery

```yaml
delivery:
  report_path: docs/audits/{strategy}_audit.md
  notification:
    to: [strategy_author, founder]
    subject: "Audit Complete: {strategy} - {verdict}"
  next_steps:
    PASS: "Proceed to paper trading setup"
    PASS-WITH-FIXES: "Address fixes, request re-verification"
    FAIL: "Review findings, revise strategy"
```
## Audit SLAs
| Priority | Initial Response | Complete Audit |
|---|---|---|
| Normal | 24 hours | 72 hours |
| Urgent | 4 hours | 24 hours |
Urgent criteria:

- Live issue requiring a fix
- Time-sensitive opportunity
- Founder-approved escalation
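The SLA table translates directly into deadline arithmetic, which is useful for tracking open audits. A sketch (the `SLA_HOURS` mapping mirrors the table above; the `sla_deadlines` helper name is illustrative):

```python
from datetime import datetime, timedelta

# Hours mirror the SLA table: priority -> {milestone: hours}
SLA_HOURS = {
    "normal": {"initial_response": 24, "complete_audit": 72},
    "urgent": {"initial_response": 4, "complete_audit": 24},
}

def sla_deadlines(received_at: datetime, priority: str) -> dict:
    """Compute response and completion deadlines for a received package."""
    sla = SLA_HOURS[priority]
    return {milestone: received_at + timedelta(hours=hours)
            for milestone, hours in sla.items()}
```

Note that per the blockage process below, the clock restarts when an incomplete package is returned and resubmitted, so `received_at` should be the latest submission time.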
## Re-Audit Process

### When Re-Audit Required
- After PASS-WITH-FIXES fixes completed
- After FAIL issues addressed
- After major strategy modification
- After 90 days (periodic re-validation)
### Re-Audit Scope
| Trigger | Scope |
|---|---|
| Fixes completed | Verify fixes + affected tests only |
| Major modification | Full audit |
| Periodic | Abbreviated (key metrics + drift check) |
## Audit Records

### Record Keeping

All audits are logged in `docs/audits/audit_log.md`:
| Audit ID | Strategy | Version | Date | Verdict | Auditor |
|----------|----------|---------|------|---------|---------|
| AUD-2024-01-15-001 | ICT QM | 1.0.0 | 2024-01-15 | PASS | Auditor |
| AUD-2024-01-20-002 | Session BO | 0.9.0 | 2024-01-20 | FAIL | Auditor |
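New log rows need sequential identifiers. A sketch that infers the next `AUD-YYYY-MM-DD-NNN` ID from the existing log column, numbering sequentially within each day (the format is taken from the table above; `next_audit_id` is a hypothetical helper):

```python
import re

def next_audit_id(log_ids: list[str], audit_date: str) -> str:
    """Return the next AUD-YYYY-MM-DD-NNN identifier for `audit_date`.

    `log_ids` is the Audit ID column of the log; the sequence number
    restarts at 001 on a date with no prior audits.
    """
    pattern = re.compile(rf"AUD-{re.escape(audit_date)}-(\d{{3}})$")
    seqs = [int(m.group(1)) for i in log_ids if (m := pattern.match(i))]
    return f"AUD-{audit_date}-{max(seqs, default=0) + 1:03d}"
```

Keeping IDs derived from the log itself avoids collisions when multiple audits complete on the same day.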
### Retention
- Audit reports: Permanent
- Evidence files: 2 years minimum
- Superseded audits: Archived, not deleted
## Escalation

### Audit Disagreement

If the strategy author disputes the findings:
- Auditor provides detailed evidence
- Author provides counter-evidence
- Founder reviews and decides
- Decision is final and documented
### Audit Blockage
If audit cannot be completed:
- Document specific blocker
- Return package to submitter
- Submitter resolves and resubmits
- Clock restarts on SLA