# Audit Process Runbook

## Overview
This runbook defines the end-to-end process for conducting independent strategy audits. The audit is a mandatory gate before any deployment.
## Audit Principles
- Independence: Auditor must not have developed the strategy
- Thoroughness: Every test must be run, no shortcuts
- Skepticism: Assume the strategy is flawed until proven otherwise
- Documentation: Every finding must be recorded
- Objectivity: Verdict based on evidence, not opinion
## Audit Workflow

```text
┌─────────────────────────────────────────────────────────────────────┐
│                           AUDIT WORKFLOW                            │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│  ┌──────────┐     ┌──────────┐     ┌──────────┐     ┌──────────┐    │
│  │ RECEIVE  │────▶│ VALIDATE │────▶│ EXECUTE  │────▶│ VERDICT  │    │
│  │ PACKAGE  │     │ PACKAGE  │     │  TESTS   │     │ & REPORT │    │
│  └──────────┘     └──────────┘     └──────────┘     └──────────┘    │
│       │                │                │                │          │
│       ▼                ▼                ▼                ▼          │
│  - Code           - Complete?      - Anti-bias      - PASS          │
│  - Docs           - Runnable?      - Stress         - PASS+FIX      │
│  - Results        - Documented?    - Worst-week     - FAIL          │
│                                    - Innovation                     │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘
```
## Phase 1: Receive Audit Package

### Required Artifacts

```yaml
audit_package:
  strategy:
    code: path/to/strategy.pine   # or .py
    version: X.Y.Z
    checksum: sha256:xxx
  documentation:
    hypothesis: path/to/hypothesis.md
    validation_checklist: path/to/checklist.md
  backtest_results:
    report: path/to/backtest_report.md
    trade_log: path/to/trades.csv
    equity_curve: path/to/equity.csv
  data:
    dataset: path/to/data.parquet
    date_range: YYYY-MM-DD to YYYY-MM-DD
    catalog_entry: data/catalog.yaml#entry_name
  requestor:
    name: Developer Name
    date: YYYY-MM-DD
    priority: normal | urgent
```
### Package Validation Checklist
- All required files present
- Checksums match
- Code compiles without errors
- Documentation is complete
- Backtest results are reproducible
- Data is accessible
If incomplete: Return to submitter with specific missing items.
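The checksum item on the checklist can be automated rather than eyeballed. A minimal sketch, assuming the manifest's `sha256:<hex>` format from the package spec above (the `verify_checksum` helper name is hypothetical):

```python
import hashlib
from pathlib import Path

def verify_checksum(path: str, expected: str) -> bool:
    """Check a package file against its manifest entry.

    `expected` uses the manifest's "sha256:<hex>" form; any other
    algorithm prefix is rejected rather than silently accepted.
    """
    algo, _, digest = expected.partition(":")
    if algo != "sha256":
        raise ValueError(f"unsupported checksum algorithm: {algo}")
    actual = hashlib.sha256(Path(path).read_bytes()).hexdigest()
    return actual == digest
```

A mismatch here is grounds for immediate return to the submitter, since it means the audited code is not the code that produced the submitted results.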
## Phase 2: Execute Audit Tests

### Step 2.1: Anti-Bias Tests

```bash
# Run the anti-bias test suite
cd /path/to/rqf-ml
pytest tests/test_bias.py -v --strategy=path/to/strategy

# Expected output:
# test_lookahead ... PASSED/FAILED
# test_leakage ... PASSED/FAILED
# test_repaint ... PASSED/FAILED
# test_time_alignment ... PASSED/FAILED
```
Manual verification for Pine Script:
```bash
# Check for common issues
grep -n "high\[0\]\|low\[0\]" strategy.pine            # Same-bar high/low
grep -n "request.security.*lookahead_on" strategy.pine # Lookahead enabled
grep -n "barstate.isrealtime" strategy.pine            # Realtime-only logic
```
### Step 2.2: Independent Backtest

Run the backtest independently to verify the developer's reported results:
```python
# Run with the same parameters as the developer
from src.evaluation.backtester import Backtester

bt = Backtester(config={
    'start_date': 'YYYY-MM-DD',
    'end_date': 'YYYY-MM-DD',
    'initial_capital': X,
    'commission': X,
    'slippage': X,
})
results = bt.run(strategy, data)

# Compare to the developer's results
assert abs(results.sharpe - developer_sharpe) < 0.1, "Sharpe mismatch"
assert abs(results.max_dd - developer_max_dd) < 0.02, "DD mismatch"
```
### Step 2.3: Stress Tests

```python
# Run the standard stress scenarios
stress_scenarios = [
    'covid_crash',
    'flash_crash',
    'brexit',
    '2x_slippage',
    '2x_spread',
    '50%_fill_rate',
    '10_bar_delay',
]

for scenario in stress_scenarios:
    result = bt.run_stress_test(strategy, scenario)
    log_result(scenario, result)
```
### Step 2.4: Worst-Week Analysis

```python
import pandas as pd

# Identify the worst week
daily_returns = results.returns.resample('D').sum()
weekly_returns = daily_returns.resample('W').sum()
worst_week = weekly_returns.idxmin()  # label marks the end of the week

# Analyze that week
week_start = worst_week - pd.Timedelta(days=7)
worst_week_data = data.loc[week_start:worst_week]
worst_week_trades = trades[trades['date'].between(week_start, worst_week)]

# Document findings
print(f"Worst week ending: {worst_week.date()}")
print(f"Return: {weekly_returns[worst_week]:.2%}")
print(f"Trades: {len(worst_week_trades)}")
print(f"Win rate: {worst_week_trades['profitable'].mean():.2%}")
```
### Step 2.5: Sensitivity Analysis

```python
# Parameter sensitivity sweep: perturb each parameter by +/-10% and +/-20%
base_params = strategy.get_params()
for param_name, base_value in base_params.items():
    for multiplier in [0.8, 0.9, 1.1, 1.2]:
        test_params = base_params.copy()
        test_params[param_name] = base_value * multiplier
        result = bt.run(strategy.with_params(test_params), data)
        log_sensitivity(param_name, multiplier, result.sharpe)

# Check for cliff edges:
# flag any parameter where a 10% change causes a >30% Sharpe drop
```
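The cliff-edge check can be made explicit instead of a manual scan of the log. A sketch, assuming the sweep results were collected into a dict keyed by `(param_name, multiplier)` (the `find_cliff_edges` name and the dict shape are illustrative, not part of the audit tooling):

```python
def find_cliff_edges(sensitivity, base_sharpe, drop_threshold=0.30):
    """Flag parameters where a small perturbation collapses the Sharpe.

    `sensitivity` maps (param_name, multiplier) -> Sharpe from the sweep;
    any +/-10% tweak whose Sharpe falls more than `drop_threshold` below
    the base run is reported as a cliff edge.
    """
    cliffs = []
    for (param, mult), sharpe in sensitivity.items():
        if mult in (0.9, 1.1) and base_sharpe > 0:
            drop = (base_sharpe - sharpe) / base_sharpe
            if drop > drop_threshold:
                cliffs.append((param, mult, round(drop, 3)))
    return cliffs
```

A non-empty result is a strong overfitting signal: a robust edge should degrade gracefully, not fall off a cliff, under 10% parameter perturbation.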
### Step 2.6: Innovation & Prior Art

1. Search for similar strategies:
   - Academic papers (Google Scholar, SSRN)
   - Trading forums (Elite Trader, ForexFactory)
   - Open source (GitHub, TradingView public)
2. Document findings:
   - Similar approaches found
   - Key differences from this strategy
   - Risk of edge decay due to crowding
3. Identify improvements:
   - What could make this better?
   - What research is worth pursuing?
## Phase 3: Generate Verdict

### Decision Tree

```text
All anti-bias tests pass?
├── No  → FAIL
└── Yes → Continue

OOS degradation < 30%?
├── No  → FAIL
└── Yes → Continue

Walk-forward consistency ≥ 60%?
├── No  → FAIL
└── Yes → Continue

Worst week survives prop firm limits?
├── No  → FAIL (or PASS-WITH-FIXES if mitigatable)
└── Yes → Continue

Any critical issues?
├── Yes → FAIL
└── No  → Continue

Any major issues?
├── Yes → PASS-WITH-FIXES
└── No  → PASS
```
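The tree above can be encoded directly, which keeps verdicts mechanical and reproducible rather than judgment calls. A sketch, where the keys of `checks` are illustrative names for results the audit steps already produce:

```python
def audit_verdict(checks: dict) -> str:
    """Walk the decision tree in order; the first failing gate wins."""
    if not checks["bias_tests_pass"]:
        return "FAIL"
    if checks["oos_degradation"] >= 0.30:
        return "FAIL"
    if checks["walk_forward_consistency"] < 0.60:
        return "FAIL"
    if not checks["worst_week_survives"]:
        # Mitigatable worst-week breaches downgrade to conditional approval
        return "PASS-WITH-FIXES" if checks["worst_week_mitigatable"] else "FAIL"
    if checks["critical_issues"]:
        return "FAIL"
    if checks["major_issues"]:
        return "PASS-WITH-FIXES"
    return "PASS"
```

Ordering matters: anti-bias failures short-circuit everything else, so a biased strategy can never reach a conditional pass.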
### Verdict Definitions
| Verdict | Meaning | Required Actions |
|---|---|---|
| PASS | Approved for deployment | Proceed to paper trading |
| PASS-WITH-FIXES | Conditionally approved | Complete fix list, re-verify, then deploy |
| FAIL | Not approved | Return to development, address findings |
## Phase 4: Generate Report

### Report Generation Steps

1. Copy `docs/audits/audit_template.md` to `docs/audits/{strategy}_audit.md`
2. Fill in all sections
3. Attach evidence:
   - Anti-bias test output
   - Independent backtest results
   - Stress test results
   - Worst-week analysis
4. Document all findings
5. Generate fix list (if applicable)
6. Record verdict and justification
7. Sign and date
### Report Delivery

```yaml
delivery:
  report_path: docs/audits/{strategy}_audit.md
  notification:
    to: [strategy_author, founder]
    subject: "Audit Complete: {strategy} - {verdict}"
  next_steps:
    PASS: "Proceed to paper trading setup"
    PASS-WITH-FIXES: "Address fixes, request re-verification"
    FAIL: "Review findings, revise strategy"
```
## Audit SLAs
| Priority | Initial Response | Complete Audit |
|---|---|---|
| Normal | 24 hours | 72 hours |
| Urgent | 4 hours | 24 hours |
Urgent criteria:

- Live issue requiring a fix
- Time-sensitive opportunity
- Founder-approved escalation
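The SLA table translates directly into deadline arithmetic, which is useful for tracking open audits. A sketch (the `SLA_HOURS` mapping mirrors the table above; the `sla_deadlines` helper name is illustrative):

```python
from datetime import datetime, timedelta

# Hours mirror the SLA table: priority -> {milestone: hours}
SLA_HOURS = {
    "normal": {"initial_response": 24, "complete_audit": 72},
    "urgent": {"initial_response": 4, "complete_audit": 24},
}

def sla_deadlines(received_at: datetime, priority: str) -> dict:
    """Compute response and completion deadlines for a received package."""
    sla = SLA_HOURS[priority]
    return {milestone: received_at + timedelta(hours=hours)
            for milestone, hours in sla.items()}
```

Note that per the blockage process below, the clock restarts when an incomplete package is returned and resubmitted, so `received_at` should be the latest submission time.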
## Re-Audit Process

### When Re-Audit Required
- After PASS-WITH-FIXES fixes completed
- After FAIL issues addressed
- After major strategy modification
- After 90 days (periodic re-validation)
### Re-Audit Scope
| Trigger | Scope |
|---|---|
| Fixes completed | Verify fixes + affected tests only |
| Major modification | Full audit |
| Periodic | Abbreviated (key metrics + drift check) |
## Audit Records

### Record Keeping

All audits are logged in `docs/audits/audit_log.md`:
| Audit ID | Strategy | Version | Date | Verdict | Auditor |
|----------|----------|---------|------|---------|---------|
| AUD-2024-01-15-001 | ICT QM | 1.0.0 | 2024-01-15 | PASS | Auditor |
| AUD-2024-01-20-002 | Session BO | 0.9.0 | 2024-01-20 | FAIL | Auditor |
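New log rows need sequential identifiers. A sketch that infers the next `AUD-YYYY-MM-DD-NNN` ID from the existing log column, numbering sequentially within each day (the format is taken from the table above; `next_audit_id` is a hypothetical helper):

```python
import re

def next_audit_id(log_ids: list[str], audit_date: str) -> str:
    """Return the next AUD-YYYY-MM-DD-NNN identifier for `audit_date`.

    `log_ids` is the Audit ID column of the log; the sequence number
    restarts at 001 on a date with no prior audits.
    """
    pattern = re.compile(rf"AUD-{re.escape(audit_date)}-(\d{{3}})$")
    seqs = [int(m.group(1)) for i in log_ids if (m := pattern.match(i))]
    return f"AUD-{audit_date}-{max(seqs, default=0) + 1:03d}"
```

Keeping IDs derived from the log itself avoids collisions when multiple audits complete on the same day.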
### Retention
- Audit reports: Permanent
- Evidence files: 2 years minimum
- Superseded audits: Archived, not deleted
## Escalation

### Audit Disagreement

If the strategy author disputes the findings:
- Auditor provides detailed evidence
- Author provides counter-evidence
- Founder reviews and decides
- Decision is final and documented
### Audit Blockage
If audit cannot be completed:
- Document specific blocker
- Return package to submitter
- Submitter resolves and resubmits
- Clock restarts on SLA