Validation
We validate with robustness and surrogate tests suited to heterogeneous records: PADM2M is long, CALS10k is short, and GGF100k differs from both. A classical train/holdout split is not appropriate for records this dissimilar, so we run five checks instead:
1. Internal robustness (within-record)
   - Leave-one-signal-out: spikes that persist are robust.
   - Downsample/coarsen cadence: spikes that survive scale changes are credible.
   - Window sensitivity: vary ±25–50% and track timing drift bands.
2. Surrogate/negative controls
   - IAAFT surrogates preserve spectrum and distribution; spike rates and magnitudes should drop toward baseline.
   - Block-shuffle preserves local structure while breaking long-range alignment.
   - Full time-scramble as a strong negative control.
3. Age-model uncertainty
   - Monte Carlo “wiggle” timestamps and recompute ΔZ; report spike retention rate and timing uncertainty.
   - Coarse bin alignment (e.g., 500–1000 yr) should preserve broad spike clusters.
4. Cross-product consistency (no fitting)
   - Minimal harmonization (coarse cadence + shared band-pass) across overlapping spans; report descriptive coincidence only.
5. Event overlays (descriptive)
   - Overlay independent bands strictly for visualization; compute coincidence vs. block-shift baselines for empirical p-values.
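The surrogate controls in item 2 can be sketched as follows. This is a minimal illustration, not the contents of surrogates.py: the function names, the fixed iteration count `n_iter`, and the `block_len` parameter are assumptions made for the example. The IAAFT loop alternates between imposing the original amplitude spectrum and restoring the original value distribution; because it ends on the distribution step, the values match exactly and the spectrum only approximately.

```python
import numpy as np

def iaaft_surrogate(x, n_iter=100, rng=None):
    """Return one IAAFT surrogate of a 1-D, evenly sampled series.

    Hypothetical helper: uses a fixed number of iterations (n_iter)
    rather than a convergence test.
    """
    rng = np.random.default_rng(rng)
    x = np.asarray(x, dtype=float)
    target_amp = np.abs(np.fft.rfft(x))  # amplitude spectrum to preserve
    sorted_x = np.sort(x)                # value distribution to preserve
    s = rng.permutation(x)               # start from a full time-scramble
    for _ in range(n_iter):
        # Step 1: keep the current phases, impose the target amplitudes.
        phases = np.angle(np.fft.rfft(s))
        s = np.fft.irfft(target_amp * np.exp(1j * phases), n=len(x))
        # Step 2: rank-remap so the surrogate has exactly the original values.
        ranks = np.argsort(np.argsort(s))
        s = sorted_x[ranks]
    return s

def block_shuffle(x, block_len, rng=None):
    """Permute contiguous blocks of length block_len (last block may be shorter).

    Structure inside each block is kept; alignment between blocks is broken.
    """
    rng = np.random.default_rng(rng)
    x = np.asarray(x)
    n_blocks = int(np.ceil(len(x) / block_len))
    blocks = [x[i * block_len:(i + 1) * block_len] for i in range(n_blocks)]
    order = rng.permutation(n_blocks)
    return np.concatenate([blocks[i] for i in order])
```

Each surrogate would then be passed through the same ΔZ pipeline as the real record, so that spike rates and magnitudes are compared like for like. The full time-scramble control corresponds to `rng.permutation(x)` on its own.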
Acceptance criteria (alpha)
- Robustness: ≥60% of prominent PADM2M spikes persist under leave-one-out and ±33% window variation, with timing drift ≤10% of the window length.
- Surrogates: spike rate falls by at least 40% and top-quantile magnitude by at least 30% relative to the real record.
- Age-model: ≥50% spike retention; median timing shift smaller than one bin width.
- Cross-product: observed overlaps exceed 95% of block-shift baselines.
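As a rough illustration, the criteria above could be evaluated from a dictionary of summary metrics. Every key name below is hypothetical and does not reflect the actual schema of padm2m_validation.json. The sketch also assumes that the surrogate thresholds mean a drop of at least 40% and 30%, expressed as fractional changes relative to the real record.

```python
import numpy as np

def check_acceptance(m):
    """Evaluate the alpha acceptance criteria on a dict of summary metrics.

    All keys are hypothetical names chosen for this sketch.
    Fractions are in [0, 1]; changes are signed (negative = drop).
    """
    baselines = np.asarray(m["baseline_overlaps"], dtype=float)
    # Share of block-shift baselines that the observed overlap exceeds.
    frac_exceeded = float(np.mean(m["observed_overlap"] > baselines))
    return {
        # >=60% persistence and <=10% timing drift relative to the window
        "robustness": bool(m["persist_frac"] >= 0.60
                           and m["timing_drift_frac"] <= 0.10),
        # Assumed reading: rate drops by >=40%, top-quantile magnitude by >=30%
        "surrogates": bool(m["surrogate_rate_change"] <= -0.40
                           and m["surrogate_q_change"] <= -0.30),
        # >=50% retention and median shift below one bin width
        "age_model": bool(m["retention_frac"] >= 0.50
                          and m["median_shift_yr"] < m["bin_width_yr"]),
        # Observed overlap exceeds at least 95% of the baselines
        "cross_product": frac_exceeded >= 0.95,
    }
```

Returning one boolean per criterion, rather than a single pass/fail flag, keeps it visible which check failed when a record does not meet the alpha thresholds.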
Artifacts
Code (validation utilities)
- run_padm2m_validation.py
- surrogates.py
- robustness.py
- age_wiggle.py
- overlay_stats.py
Plan
Data JSON
- assets/data/validation/padm2m_validation.json
Figures
- /figures/validation/padm2m_window_sensitivity.png
- /figures/validation/padm2m_surrogate_q95.png
- /figures/validation/padm2m_downsample_overlay.png
PADM2M snapshots
JSON summary: padm2m_validation.json.
Safe summary
Given heterogeneous datasets, we use robustness and surrogate tests instead of classical holdouts. ΔZ spikes persist under signal dropouts and parameter variation and are rarer in surrogates that preserve autocorrelation. Coincidence with independent event bands exceeds baselines from block-shift tests. These results support ΔZ as a regime-change detector; they do not establish mechanism or prediction.