Perpetual Futures and Market Quality · AFA 2026
Cryptocurrency Microstructure · Kaiko Tick Data · 13TB · 2017–2023 · Accepted at AFA 2026
How do perpetual futures affect spot market quality? Tick-level analysis across Binance, Huobi, and OKEx, covering 5,213 trading days and 13 pairs. Causal identification via Huobi's 2021 contract termination.
Key finding: Perpetual futures listing reduces quoted spreads by 44.5% and increases depth by 216.8%. Huobi termination DiD confirms: removing perps widens spot spreads (t = 2.38, p = 0.017).
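A minimal sketch of the two-by-two difference-in-differences logic behind the Huobi termination test. It assumes hypothetical mean quoted spreads (in bps) for treated pairs (spot pairs that lost their perpetual contract) and control pairs. The numbers and the function name are illustrative; this is not the study's data or its regression specification.

```python
from statistics import mean

def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Difference-in-differences of group means: treated change minus control change."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change

# Hypothetical daily quoted spreads in basis points (not data from the study).
treated_pre = [4.0, 4.2, 3.8]
treated_post = [5.0, 5.4, 5.2]
control_pre = [3.0, 3.1, 2.9]
control_post = [3.2, 3.3, 3.1]

effect_bps = did_estimate(treated_pre, treated_post, control_pre, control_post)
print(f"DiD estimate: {effect_bps:.2f} bps")
```

A positive estimate means spreads widened more for the treated pairs than for the controls over the same window.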
Tiny Trades, Big Questions: Fractional Shares
DTAQ Millisecond Data · CRSP · 13B Trades · 276 Trading Days · 2021–2022
Independent replication of the latency-based fractional share classifier of Bartlett, McCrary & O'Hara (JFE 2024). Identifies Robinhood and DriveWealth trades in the consolidated tape using SIP–participant timestamp gaps.
Key finding: 176.2M fractional trades identified (90.5% of the BMO count). Top-30 stock ranking matches exactly. Concentration distribution identical within 0.5pp. U-shaped price quintile pattern confirmed.
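A rough sketch of the timestamp-gap idea only: compute the SIP-minus-participant latency per trade and flag trades that fall inside a latency window. The field names (`sip_ns`, `part_ns`) and the window bounds are placeholders chosen for illustration; the published classifier's actual thresholds and filters are not reproduced here.

```python
def latency_ms(sip_ts_ns, participant_ts_ns):
    """Gap between SIP and participant timestamps, in milliseconds."""
    return (sip_ts_ns - participant_ts_ns) / 1e6

def flag_by_latency(trades, low_ms, high_ms):
    """Return trades whose SIP-minus-participant gap lies in [low_ms, high_ms]."""
    return [
        t for t in trades
        if low_ms <= latency_ms(t["sip_ns"], t["part_ns"]) <= high_ms
    ]

# Hypothetical trades with nanosecond timestamps (not DTAQ records).
trades = [
    {"id": "A", "part_ns": 1_000_000_000, "sip_ns": 1_005_000_000},  # 5 ms gap
    {"id": "B", "part_ns": 2_000_000_000, "sip_ns": 2_150_000_000},  # 150 ms gap
    {"id": "C", "part_ns": 3_000_000_000, "sip_ns": 3_900_000_000},  # 900 ms gap
]

# Placeholder window; the real bounds would come from the paper.
flagged = flag_by_latency(trades, low_ms=100.0, high_ms=500.0)
print([t["id"] for t in flagged])
```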
Prediction Market Microstructure
Polymarket · Hyperliquid DEX · 2024 US Presidential Election
Microstructure of Polymarket's CLOB-based prediction market. Order flow, liquidity provision, and price formation during the 2024 election.
Key finding: Spreads tighten as certainty increases. Liquidity concentrates at 50/50 and drains near 0 or 1.
Cross-Market Predictability
Crypto–Equity Linkages · Kaiko + TAQ + WRDS · 2017–2023
Can crypto signals predict equity returns? BTC realized volatility from 2,024 days of tick data (mean RV: 73.3%). Dynamic correlations and spillover effects.
Interim: BTC RV shows regime-dependent correlation with VIX. Crypto order flow leads equity moves by 15–30 min during stress.
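A minimal sketch of a daily realized volatility calculation, assuming the standard estimator (square root of the sum of squared intraday log returns) and a 365-day annualization because crypto trades every day. The sampling interval and prices are hypothetical; the project's exact estimator is not specified above.

```python
import math

def realized_vol(prices, periods_per_year=365):
    """Annualized realized volatility from one day's intraday prices.

    Daily realized variance is the sum of squared log returns; the
    365-day annualization is an assumption, not taken from the study.
    """
    log_returns = [math.log(b / a) for a, b in zip(prices, prices[1:])]
    daily_variance = sum(r * r for r in log_returns)
    return math.sqrt(daily_variance * periods_per_year)

# Hypothetical intraday BTC prices sampled at a fixed interval.
prices = [30000.0, 30150.0, 29900.0, 30050.0, 30200.0]
rv = realized_vol(prices)
print(f"Annualized RV: {rv:.1%}")
```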
LLM-Powered Earnings Call Stock Picker
Gemini LLM · WRDS Transcripts (84K) · CRSP · 2021–2022
Using Gemini to analyze 84,121 earnings call transcripts, generating Buy/Sell/Hold with confidence scores. Backtesting against CRSP daily returns.
Interim: High-confidence Buy signals average a +3.11% 20-day return. 438 signals from 30 of 50 dates.
CSI 1000 Index Option Volatility Selling
CFFEX · AKShare · 68K Options · 1,022 Contracts · 2022–2026 · 91 Pages
Sell ATM straddles when IV exceeds 80th percentile on China's CSI 1000 index options. Gamma-controlled (per-trade ≤ ¥2M, portfolio ≤ ¥200M), delta-hedged with index futures. Complete daily Greek exposures and trade logs.
Key finding: +71.9% total return (16.5% ann.), Sharpe 0.94, -12.3% max DD, 63% win rate. 305 trades over 3.5 years. Variance risk premium robust in China's small-cap options market.
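The entry rule (sell when IV exceeds its 80th percentile) can be sketched as a percentile-rank check against a trailing window. The window length, the rank definition (share of observations strictly below the current value), and the IV values are assumptions for illustration; the strategy's gamma limits and delta hedging are not modeled here.

```python
def iv_percentile_rank(history, current):
    """Fraction of historical IV observations strictly below the current IV."""
    return sum(1 for iv in history if iv < current) / len(history)

def should_sell_straddle(history, current, threshold=0.80):
    """Entry rule: sell only when current IV ranks above the threshold percentile."""
    return iv_percentile_rank(history, current) > threshold

# Hypothetical trailing window of ATM implied volatilities (decimals).
iv_history = [0.18, 0.20, 0.22, 0.19, 0.25, 0.21, 0.23, 0.24, 0.26, 0.20]
print(should_sell_straddle(iv_history, 0.27))  # IV above every value in the window
print(should_sell_straddle(iv_history, 0.21))  # IV in the lower half of the window
```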
Systematic Volatility Selling with Delta Hedging
OptionMetrics · WRDS · 4.6M Options · 6,043 Securities · 2016–2021 · 51 Pages
Sell ATM puts when IV exceeds 80th percentile, delta-hedge with index futures. Per-trade delta ≤ $2M, portfolio delta ≤ $200M. Complete trade logs, daily positions, and Greek exposures.
Key finding: Sharpe 1.10, 73% win rate, near-zero drawdown. 1,400 trades across 70 months. Conservative sizing sacrifices absolute return for consistency.
Snowball Option (Autocallable) Pricing
Monte Carlo Simulation · 100K Paths · 4 Volatility Regimes · Barrier Options
Monte Carlo pricing of snowball autocallable structured products across volatility regimes. Decomposing knock-out, knock-in, and tail loss probabilities to reveal the hidden risk of high-coupon structured products.
Key finding: Fair value swings from +4.4% (σ=20%) to −9.1% (σ=70%, BTC-like). KO probability is stable (72–78%), but KI probability surges from 11% to 74%. The 20% coupon grossly undercompensates for crypto-level tail risk.
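A stylized sketch of the Monte Carlo decomposition, assuming geometric Brownian motion, daily knock-in monitoring, and a knock-out check every 21 trading days. All contract terms (barriers at 103% and 80%, one-year tenor, 3% rate) and the small path count are illustrative assumptions, so the output will not match the figures above.

```python
import math
import random

def price_snowball(sigma, n_paths=1000, seed=7, r=0.03, coupon=0.20,
                   ko_barrier=1.03, ki_barrier=0.80, years=1.0,
                   steps_per_year=252, ko_every=21):
    """Monte Carlo value of a stylized snowball note per unit of principal.

    Returns (value_minus_par, ko_probability, ki_probability).
    KI probability counts any path that touched the knock-in barrier
    before the note terminated, including paths that later knocked out.
    """
    rng = random.Random(seed)
    dt = 1.0 / steps_per_year
    drift = (r - 0.5 * sigma ** 2) * dt
    vol = sigma * math.sqrt(dt)
    n_steps = int(years * steps_per_year)
    total_pv, ko_count, ki_count = 0.0, 0, 0
    for _ in range(n_paths):
        level = 1.0              # price relative to the initial fixing
        knocked_in = False
        payoff_pv = None
        for step in range(1, n_steps + 1):
            level *= math.exp(drift + vol * rng.gauss(0.0, 1.0))
            if level <= ki_barrier:
                knocked_in = True            # daily knock-in monitoring
            if step % ko_every == 0 and level >= ko_barrier:
                t = step * dt                # periodic knock-out observation
                payoff_pv = (1.0 + coupon * t) * math.exp(-r * t)
                ko_count += 1
                break
        if payoff_pv is None:
            # No knock-out: full coupon if never knocked in, else bear the loss.
            payoff = min(level, 1.0) if knocked_in else 1.0 + coupon * years
            payoff_pv = payoff * math.exp(-r * years)
        if knocked_in:
            ki_count += 1
        total_pv += payoff_pv
    return total_pv / n_paths - 1.0, ko_count / n_paths, ki_count / n_paths

for sigma in (0.20, 0.70):
    value, p_ko, p_ki = price_snowball(sigma)
    print(f"sigma={sigma:.0%}: value={value:+.3f}, KO={p_ko:.2f}, KI={p_ki:.2f}")
```

Raising `sigma` leaves the knock-out frequency in a similar range while making knock-in far more likely, which is the mechanism behind the drop in fair value.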
Variance Risk Premium and Option Strategy Returns
OptionMetrics · WRDS · 2.1M VRP Observations · 4,937 Securities · 2019–2020
Quantifying the variance risk premium (IV − RV) and backtesting systematic option-selling strategies (covered calls, cash-secured puts) through the COVID-19 volatility shock.
Key finding: VRP averages 27.9% (IV/RV = 1.58×, positive 72.3% of the time). Covered calls earn 22.5% annualized (Sharpe 0.66, 76% win rate). OTM put skew averages 11.8%.
Slippage-at-Risk (SaR): Multi-Exchange Liquidity Risk
Kaiko Order Book Snapshots · 5 Exchanges · 8 Time Periods · 2,600+ Token-Period Combinations · 2021–2026
Extension of the SaR framework (arXiv:2603.09164) to multi-exchange, multi-period analysis. Quantifying exchange-level liquidity risk through order book slippage, concentration adjustments, and stress amplification across bull, bear, and crisis regimes.
Key finding: Exchange matters more than market regime: Binance SaR = 37–66 bps vs OKX = 240–430 bps (6–8× worse). FTX's insurance fund need peaked during the bull market ($26K BTC), an early warning ahead of its collapse. Stress amplification: Binance 1.5×, FTX 2–3× (self-destructive). Concentration adds 35–90% to SaR.
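The basic building block, order book slippage, can be sketched by walking the ask side of a snapshot for a given market buy and comparing the fill VWAP with the mid price. The book and the function name are hypothetical, and the SaR aggregation itself (tail quantiles across snapshots, concentration adjustments, stress amplification) is not reproduced here.

```python
def slippage_bps(asks, mid_price, order_size):
    """Slippage of a market buy versus mid, in basis points.

    asks: list of (price, quantity) sorted from best to worst price.
    Raises ValueError if the visible book cannot absorb the order.
    """
    remaining = order_size
    cost = 0.0
    for price, quantity in asks:
        fill = min(remaining, quantity)
        cost += fill * price
        remaining -= fill
        if remaining <= 0:
            break
    if remaining > 0:
        raise ValueError("order size exceeds visible depth")
    vwap = cost / order_size
    return (vwap / mid_price - 1.0) * 1e4

# Hypothetical ask side of an order book snapshot.
asks = [(100.0, 1.0), (101.0, 2.0), (103.0, 5.0)]
print(round(slippage_bps(asks, mid_price=99.5, order_size=3.0), 2))
```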
Perpetual Futures Funding Rate Predictability
Hyperliquid API · OKX API · 24,350 Hourly BTC Records · 50 Coins · 2023–2026
Cross-exchange funding rate dynamics and predictability analysis. Hourly funding rate data from Hyperliquid (richest dataset: BTC 24K records spanning 3 years) and OKX. DEX vs CEX funding rate comparison, cross-coin correlation structure, and short-term predictability signals.
Key finding: Funding rates show significant mean-reversion at 8–24h horizon. DEX rates more volatile but faster to reprice. Cross-coin FR correlation spikes during liquidation cascades.
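One simple way to quantify mean reversion is an AR(1) fit: regress each hourly funding rate on the previous one and convert the slope into a half-life. This is a sketch under that assumption, using a synthetic decaying series; the study's actual predictability tests are not specified above.

```python
import math

def ar1_coefficient(series):
    """OLS slope of x[t] on x[t-1], with an intercept."""
    x = series[:-1]
    y = series[1:]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var = sum((a - mean_x) ** 2 for a in x)
    return cov / var

def half_life(phi):
    """Periods for a deviation to decay by half, valid for 0 < phi < 1."""
    return -math.log(2.0) / math.log(phi)

# Synthetic hourly funding rates decaying toward zero (not exchange data).
rates = [0.0010 * 0.9 ** t for t in range(48)]
phi = ar1_coefficient(rates)
print(f"phi={phi:.3f}, half-life={half_life(phi):.1f} hours")
```

A slope below 1 indicates reversion toward the mean; the closer it is to 1, the longer the half-life.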
BTC Microstructure: Order Book Dynamics
Kaiko Level 2 Data · 27M Raw Records · 839 Snapshots · Binance · 2023
Tick-level order book analysis of BTC-USDT on Binance. Order flow imbalance, bid-ask spread dynamics, depth profiles, and intraday patterns from high-frequency limit order book data.
Key finding: Order book imbalance predicts short-term returns (t = −104.25). Depth recovery after large trades takes 2–5 minutes. Spread widens systematically during Asian session close.
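A minimal sketch of one common order book imbalance measure: bid volume minus ask volume over their sum, taken across the top levels of the book. The number of levels and the sample book are assumptions; the exact definition used in the analysis is not stated above.

```python
def order_book_imbalance(bids, asks, levels=5):
    """(bid volume - ask volume) / (bid volume + ask volume) over the top levels.

    bids and asks are lists of (price, quantity), best price first.
    The result lies in [-1, 1]; positive values mean more resting bid volume.
    """
    bid_volume = sum(quantity for _, quantity in bids[:levels])
    ask_volume = sum(quantity for _, quantity in asks[:levels])
    return (bid_volume - ask_volume) / (bid_volume + ask_volume)

# Hypothetical BTC-USDT snapshot.
bids = [(99.9, 3.0), (99.8, 1.0)]
asks = [(100.1, 1.0), (100.2, 1.0)]
obi = order_book_imbalance(bids, asks)
print(f"Imbalance: {obi:+.3f}")
```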
Polymarket Signal Analysis
Dune Analytics · Polymarket CLOB · Multi-Market · March 2026
Systematic signal extraction from Polymarket prediction markets. Combining order flow, price momentum, and Reverse FLB (Favorite-Longshot Bias) signals for multi-market analysis.
Key finding: Reverse FLB consistently detected — favorites systematically overpriced. All signals point SHORT YES (buy NO) across analyzed markets. Consistent with prediction market inefficiency literature.
Post-Earnings Announcement Drift (PEAD)
IBES Consensus · CRSP Daily Returns · 27,318 Events · 2015–2024
Classic PEAD analysis: sorting earnings events by standardized surprise (actual vs. consensus EPS) and tracking cumulative abnormal returns over 60 trading days.
Key finding: Long-short Q5−Q1 CAR[0,60] = +8.84%. Q5 (beat): +3.76%, Q1 (miss): −5.08%. Monotonic drift across quintiles. Anomaly persists in modern markets.
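The long-short figure is the difference of the two extreme quintile means (3.76% − (−5.08%) = 8.84%). A minimal sketch of that sort, assuming each event is a (surprise, CAR) pair; the events below are made up for illustration and the abnormal-return model is not shown.

```python
from statistics import mean

def quintile_spread(events):
    """Mean CAR of the top surprise quintile minus the bottom quintile.

    events: list of (surprise, car) pairs, one per earnings announcement.
    """
    if len(events) < 5:
        raise ValueError("need at least five events to form quintiles")
    ranked = sorted(events, key=lambda e: e[0])
    size = len(ranked) // 5
    q1 = ranked[:size]       # most negative surprises (misses)
    q5 = ranked[-size:]      # most positive surprises (beats)
    return mean(car for _, car in q5) - mean(car for _, car in q1)

# Hypothetical (standardized surprise, 60-day CAR) pairs.
events = [
    (-2.0, -0.06), (-1.5, -0.04), (-0.5, -0.01), (-0.2, 0.00), (0.0, 0.005),
    (0.1, 0.01), (0.3, 0.015), (0.8, 0.02), (1.2, 0.03), (2.0, 0.05),
]
spread = quintile_spread(events)
print(f"Q5 - Q1 CAR: {spread:+.2%}")
```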
News Sentiment and Stock Returns
Commercial news-sentiment dataset · CRSP · 2M News Events · 2023
Linking commercial news-sentiment composite scores to CRSP daily returns for 39,118 firm-day observations. Examining same-day and next-day predictive power.
Key finding: Same-day Q5−Q1 spread = +1.24%. Next-day L/S = +0.15%. Negative news shows reversal: −0.38% same-day → +0.17% next-day (overreaction).
🤖 Automated Research Engine
WRDS Auto-Research Engine: 50 Systematic Studies
CRSP · Compustat · IBES · OptionMetrics · TRACE · DealScan · MSRB · Kaiko · 272 WRDS Libraries · 100 Figures · 51 Reports
Fully autonomous research engine that wrote, executed, and analyzed 50 systematic studies across WRDS databases. Each study includes a Python script, generated figures, and a comprehensive report. Topics span market microstructure, factor models, options, corporate bonds, ETFs, insider trading, earnings quality, and crypto-equity linkages.
Highlights:
• R01–R05: Market overview, Fama-French factors, fundamentals, merged CRSP-Compustat, analyst forecasts
• R08: Options market — Protective Put Sharpe 0.573 (best), Short Straddle −0.064 (worst)
• R15/R19: Crypto order book depth, Kaiko trade analysis
• R26: Cross-asset — BTC-SPX correlation mean −0.04, negative 55% of time
• R39: Crypto-equity linkage — BTC −5.5% vs market −2.5% during stress (not a hedge)
• R41: Fama-MacBeth regressions — full 2-pass factor pricing
• R50: Data quality audit across all databases
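The two-pass procedure named in R41 can be sketched for a single factor: first estimate each asset's beta from a time-series regression, then regress the cross-section of returns on those betas period by period and average the slopes. This is a simplified illustration with synthetic returns, without standard errors or multiple factors; it is not the engine's script.

```python
from statistics import mean

def ols_slope_intercept(x, y):
    """Simple OLS of y on x with an intercept; returns (slope, intercept)."""
    mx, my = mean(x), mean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return slope, my - slope * mx

def fama_macbeth(asset_returns, factor):
    """Two-pass estimate of a single factor's risk premium.

    asset_returns: dict of asset name -> list of per-period excess returns.
    factor: list of per-period factor returns, same length.
    Returns (betas, average cross-sectional slope).
    """
    # Pass 1: time-series regression per asset gives its factor beta.
    betas = {name: ols_slope_intercept(factor, rets)[0]
             for name, rets in asset_returns.items()}
    names = list(asset_returns)
    beta_vector = [betas[n] for n in names]
    # Pass 2: each period, regress the cross-section of returns on the betas.
    lambdas = []
    for t in range(len(factor)):
        cross_section = [asset_returns[n][t] for n in names]
        lam, _ = ols_slope_intercept(beta_vector, cross_section)
        lambdas.append(lam)
    return betas, mean(lambdas)

# Synthetic example: each asset's return is exactly beta times the factor.
factor = [0.01, -0.02, 0.03, 0.00, 0.02]
asset_returns = {
    "A": [0.5 * f for f in factor],
    "B": [1.0 * f for f in factor],
    "C": [1.5 * f for f in factor],
}
betas, premium = fama_macbeth(asset_returns, factor)
print(betas, premium)
```

With noiseless returns the first pass recovers the true betas and the estimated premium equals the factor's sample mean, which makes the sketch easy to sanity-check.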
Type 2 Diabetes: A Systems Biology Approach
STRING · ChEMBL · GEO · GROMACS · Cornell BioHPC · 660 cores
Protein interaction networks (13.7M), gene expression meta-analysis (256 islet samples), drug-target landscapes (3,552 compounds × 7 targets), and 100 ns insulin molecular dynamics.
Key finding: Disease module Z = 22.28 (52.6× over random). IL6 top hub (degree=299). 47 multi-target compounds; CHEMBL509032 dual potent (INSR 20nM + GCK 50nM).
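A module Z-score of this kind is typically computed by comparing an observed network statistic with its distribution over random gene sets of the same size. The sketch below assumes the statistic is the number of interactions inside the module and uses a toy 20-node network; the project's actual statistic and null model are not specified above.

```python
import random
from statistics import mean, pstdev

def edges_within(edges, nodes):
    """Count edges whose two endpoints both belong to the node set."""
    members = set(nodes)
    return sum(1 for a, b in edges if a in members and b in members)

def module_z_score(edges, module, all_nodes, n_random=1000, seed=0):
    """Z-score of within-module edge count versus random same-size node sets."""
    rng = random.Random(seed)
    observed = edges_within(edges, module)
    null = [edges_within(edges, rng.sample(all_nodes, len(module)))
            for _ in range(n_random)]
    return (observed - mean(null)) / pstdev(null)

# Toy network: a 20-node ring plus extra links that make nodes 0-3 a clique.
nodes = list(range(20))
edges = [(i, (i + 1) % 20) for i in nodes] + [(0, 2), (0, 3), (1, 3)]
z = module_z_score(edges, module=[0, 1, 2, 3], all_nodes=nodes)
print(f"Module Z-score: {z:.2f}")
```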