📊 Finance

Perpetual Futures and Market Quality · AFA 2026

Cryptocurrency Microstructure · Kaiko Tick Data · 13TB · 2017–2023 · Accepted at AFA 2026

How do perpetual futures affect spot market quality? Tick-level analysis across Binance, Huobi, OKEx — 5,213 trading days, 13 pairs. Causal identification via Huobi's 2021 contract termination.

Key finding: Perpetual futures listing reduces quoted spreads by 44.5% and increases depth by 216.8%. Huobi termination DiD confirms: removing perps widens spot spreads (t = 2.38, p = 0.017).
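
The difference-in-differences logic behind the Huobi test can be sketched as a 2x2 comparison of group means. This is a minimal illustration, not the paper's specification: the function name and the spread values are hypothetical, and the reported t-statistic would come from a regression with standard errors rather than from raw means.

```python
from statistics import mean

def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """2x2 difference-in-differences on group means."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change

# Hypothetical quoted spreads in basis points (illustrative numbers only).
huobi_pre = [4.0, 4.2, 3.8]
huobi_post = [5.1, 5.3, 4.9]
other_pre = [3.9, 4.1, 4.0]
other_post = [4.1, 4.3, 4.2]

effect = did_estimate(huobi_pre, huobi_post, other_pre, other_post)
```

A positive `effect` means spreads on the treated exchange widened by more than on the control exchanges over the same window.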

Tiny Trades, Big Questions: Fractional Shares

DTAQ Millisecond Data · CRSP · 13B Trades · 276 Trading Days · 2021–2022

Independent replication of the latency-based fractional share classifier of Bartlett, McCrary & O'Hara (JFE 2024). Identifies Robinhood and DriveWealth trades on the consolidated tape using SIP–participant timestamp gaps.

Key finding: 176.2M fractional trades identified (90.5% of BMO). Top 30 stock ranking exact match. Concentration distribution identical within 0.5pp. U-shaped price quintile pattern confirmed.

Prediction Market Microstructure

Polymarket · Hyperliquid DEX · 2024 US Presidential Election

Microstructure of Polymarket's CLOB-based prediction market. Order flow, liquidity provision, and price formation during the 2024 election.

Key finding: Spreads tighten as certainty increases. Liquidity concentrates at 50/50 and drains near 0 or 1.

Cross-Market Predictability

Crypto–Equity Linkages · Kaiko + TAQ + WRDS · 2017–2023

Can crypto signals predict equity returns? BTC realized volatility from 2,024 days of tick data (mean RV: 73.3%). Dynamic correlations and spillover effects.

Interim: BTC RV shows regime-dependent correlation with VIX. Crypto order flow leads equity moves by 15–30 min during stress.
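
One common way to turn tick or intraday prices into an annualized realized volatility is to sum squared log returns per day and scale by the number of trading days. The sketch below assumes a 365-day year (crypto trades every day) and already-sampled price paths; both the sampling scheme and the function names are assumptions, not the project's actual pipeline.

```python
import math

def realized_variance(prices):
    """Sum of squared log returns along one day's price path."""
    return sum(math.log(b / a) ** 2 for a, b in zip(prices, prices[1:]))

def annualized_rv(daily_price_paths, days_per_year=365):
    """Annualized realized volatility from a list of daily price paths."""
    daily_rv = [realized_variance(p) for p in daily_price_paths]
    return math.sqrt(days_per_year * sum(daily_rv) / len(daily_rv))

# Hypothetical one-day path with three observations.
rv = annualized_rv([[100.0, 101.0, 100.0]])
```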

LLM-Powered Earnings Call Stock Picker

Gemini LLM · WRDS Transcripts (84K) · CRSP · 2021–2022

Using Gemini to analyze 84,121 earnings call transcripts, generating Buy/Sell/Hold with confidence scores. Backtesting against CRSP daily returns.

Interim: High-confidence Buy signals: +3.11% avg 20-day return. 438 signals from 30 of 50 dates.

CSI 1000 Index Option Volatility Selling

CFFEX · AKShare · 68K Options · 1,022 Contracts · 2022–2026 · 91 Pages

Sell ATM straddles when IV exceeds 80th percentile on China's CSI 1000 index options. Gamma-controlled (per-trade ≤ ¥2M, portfolio ≤ ¥200M), delta-hedged with index futures. Complete daily Greek exposures and trade logs.

Key finding: +71.9% total return (16.5% ann.), Sharpe 0.94, -12.3% max DD, 63% win rate. 305 trades over 3.5 years. Variance risk premium robust in China's small-cap options market.
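
The entry rule (sell when IV exceeds its 80th percentile) can be sketched as a percentile-rank check against a trailing IV history. The lookback window, the rank definition, and the function names here are assumptions for illustration; the report's exact construction may differ.

```python
def iv_percentile_rank(history, current):
    """Fraction of historical IV observations at or below the current value."""
    return sum(1 for x in history if x <= current) / len(history)

def should_sell_straddle(iv_history, iv_today, threshold=0.80):
    """True when today's IV ranks above the threshold within the history."""
    return iv_percentile_rank(iv_history, iv_today) > threshold

# Hypothetical trailing ATM implied volatilities.
iv_history = [0.18, 0.20, 0.22, 0.25, 0.19, 0.21, 0.24, 0.23, 0.26, 0.30]
signal_high = should_sell_straddle(iv_history, 0.28)
signal_mid = should_sell_straddle(iv_history, 0.22)
```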

Systematic Volatility Selling with Delta Hedging

OptionMetrics · WRDS · 4.6M Options · 6,043 Securities · 2016–2021 · 51 Pages

Sell ATM puts when IV exceeds 80th percentile, delta-hedge with index futures. Per-trade delta ≤ $2M, portfolio delta ≤ $200M. Complete trade logs, daily positions, and Greek exposures.

Key finding: Sharpe 1.10, 73% win rate, near-zero drawdown. 1,400 trades across 70 months. Conservative sizing sacrifices absolute return for consistency.

Snowball Option (Autocallable) Pricing

Monte Carlo Simulation · 100K Paths · 4 Volatility Regimes · Barrier Options

Monte Carlo pricing of snowball autocallable structured products across volatility regimes. Decomposing knock-out, knock-in, and tail loss probabilities to reveal the hidden risk of high-coupon structured products.

Key finding: Fair value swings from +4.4% (σ=20%) to −9.1% (σ=70%, BTC-like). KO probability is stable (72–78%), but KI probability surges from 11% to 74%. The 20% coupon grossly undercompensates for crypto-level tail risk.
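
A simplified version of the simulation can be sketched as follows, assuming geometric Brownian motion, monthly knock-out observation, and daily knock-in observation. All contract terms below (barriers, rate, tenor, payoff convention) are hypothetical placeholders, so the output will not reproduce the figures above; the default path count is kept small so the pure-Python loop runs quickly.

```python
import math
import random

def snowball_outcomes(sigma, n_paths=2000, r=0.03, coupon=0.20,
                      ko=1.03, ki=0.75, months=12, days_per_month=21, seed=0):
    """Simulate a simplified snowball note on a unit notional.

    Returns (mean discounted P&L, share of paths that knocked out,
    share of paths that knocked in and never knocked out).
    """
    rng = random.Random(seed)
    dt = 1.0 / 252
    n_days = months * days_per_month
    maturity = n_days * dt
    drift = (r - 0.5 * sigma ** 2) * dt
    vol = sigma * math.sqrt(dt)
    total_pnl, n_ko, n_ki = 0.0, 0, 0
    for _ in range(n_paths):
        s, knocked_in, knocked_out = 1.0, False, False
        for day in range(1, n_days + 1):
            s *= math.exp(drift + vol * rng.gauss(0.0, 1.0))
            if s < ki:
                knocked_in = True  # daily knock-in observation
            if day % days_per_month == 0 and s >= ko:
                t = day * dt  # monthly knock-out: pay accrued coupon
                total_pnl += coupon * t * math.exp(-r * t)
                knocked_out = True
                break
        if knocked_out:
            n_ko += 1
        elif knocked_in:
            n_ki += 1  # investor bears the downside at maturity
            total_pnl += min(s - 1.0, 0.0) * math.exp(-r * maturity)
        else:
            total_pnl += coupon * maturity * math.exp(-r * maturity)
    return total_pnl / n_paths, n_ko / n_paths, n_ki / n_paths

low_vol = snowball_outcomes(0.20)
high_vol = snowball_outcomes(0.70)
```

Comparing `low_vol` and `high_vol` shows the qualitative pattern: the value falls and the knock-in share rises as volatility increases.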

Variance Risk Premium and Option Strategy Returns

OptionMetrics · WRDS · 2.1M VRP Observations · 4,937 Securities · 2019–2020

Quantifying the variance risk premium (IV − RV) and backtesting systematic option-selling strategies (covered calls, cash-secured puts) through the COVID-19 volatility shock.

Key finding: VRP averages 27.9% (IV/RV = 1.58×, positive 72.3% of the time). Covered calls earn 22.5% annualized (Sharpe 0.66, 76% win rate). OTM put skew averages 11.8%.
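
The three summary statistics quoted above can be computed from paired IV and RV observations as in the sketch below. Whether the IV/RV ratio is a ratio of means (as assumed here) or a mean of ratios is not stated in the source, and the sample values are hypothetical.

```python
from statistics import mean

def vrp_summary(iv, rv):
    """Summarize the variance risk premium as IV minus RV per observation."""
    spreads = [i - r for i, r in zip(iv, rv)]
    return {
        "mean_vrp": mean(spreads),
        "iv_rv_ratio": mean(iv) / mean(rv),  # assumption: ratio of means
        "share_positive": sum(s > 0 for s in spreads) / len(spreads),
    }

# Hypothetical annualized implied and realized volatilities.
summary = vrp_summary([0.30, 0.40, 0.25, 0.50], [0.20, 0.45, 0.15, 0.30])
```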

Slippage-at-Risk (SaR): Multi-Exchange Liquidity Risk

Kaiko Order Book Snapshots · 5 Exchanges · 8 Time Periods · 2,600+ Token-Period Combinations · 2021–2026

Extension of the SaR framework (arXiv:2603.09164) to multi-exchange, multi-period analysis. Quantifying exchange-level liquidity risk through order book slippage, concentration adjustments, and stress amplification across bull, bear, and crisis regimes.

Key finding: Exchange matters more than market regime: Binance SaR = 37–66 bps vs OKX = 240–430 bps (6–8× worse). FTX's insurance fund requirement peaked during the bull market ($26K BTC), an early warning ahead of its collapse. Stress amplification: Binance 1.5×, FTX 2–3× (self-destructive). Concentration adds +35–90% to SaR.
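
The basic input to any slippage measure is the cost of walking one side of an order book snapshot. The sketch below computes that per-snapshot slippage against the mid price; it does not implement the SaR framework itself (tail quantiles, concentration adjustments, stress amplification), and the book levels shown are hypothetical.

```python
def slippage_bps(levels, mid, quantity):
    """Walk one side of the book; return VWAP slippage versus mid in bps.

    levels: list of (price, size) ordered from best to worst price.
    """
    remaining, cost = quantity, 0.0
    for price, size in levels:
        fill = min(remaining, size)
        cost += fill * price
        remaining -= fill
        if remaining <= 0:
            break
    if remaining > 0:
        raise ValueError("order exceeds visible depth")
    vwap = cost / quantity
    return abs(vwap - mid) / mid * 1e4

# Hypothetical ask side for a market buy of 3 units.
asks = [(100.1, 1.0), (100.2, 2.0), (100.5, 5.0)]
buy_slippage = slippage_bps(asks, mid=100.0, quantity=3.0)
```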

Perpetual Futures Funding Rate Predictability

Hyperliquid API · OKX API · 24,350 Hourly BTC Records · 50 Coins · 2023–2026

Cross-exchange funding rate dynamics and predictability analysis. Hourly funding rate data from Hyperliquid (richest dataset: BTC 24K records spanning 3 years) and OKX. DEX vs CEX funding rate comparison, cross-coin correlation structure, and short-term predictability signals.

Key finding: Funding rates show significant mean-reversion at 8–24h horizon. DEX rates more volatile but faster to reprice. Cross-coin FR correlation spikes during liquidation cascades.
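
One simple way to quantify mean reversion is an AR(1) fit with an implied half-life. This is a sketch of that generic technique, not necessarily the test used in the study; the series below is a hypothetical, perfectly decaying example.

```python
import math
from statistics import mean

def ar1_half_life(series):
    """Fit x_t = c + phi * x_{t-1} by OLS; return (phi, half-life in periods)."""
    x, y = series[:-1], series[1:]
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    phi = sxy / sxx
    if not 0.0 < phi < 1.0:
        return phi, float("inf")  # no well-defined mean-reverting half-life
    return phi, -math.log(2.0) / math.log(phi)

# Hypothetical hourly funding rates (illustrative values only).
rates = [16e-5, 8e-5, 4e-5, 2e-5, 1e-5]
phi, half_life = ar1_half_life(rates)
```

With hourly data, a half-life in the 8 to 24 range would correspond to the horizon described above.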

BTC Microstructure: Order Book Dynamics

Kaiko Level 2 Data · 27M Raw Records · 839 Snapshots · Binance · 2023

Tick-level order book analysis of BTC-USDT on Binance. Order flow imbalance, bid-ask spread dynamics, depth profiles, and intraday patterns from high-frequency limit order book data.

Key finding: Order book imbalance predicts short-term returns (t = −104.25). Depth recovery after large trades takes 2–5 minutes. Spread widens systematically during Asian session close.
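
A standard definition of order book imbalance is the normalized difference between resting bid and ask volume over the top levels. The depth cutoff and the snapshot values below are assumptions; the study's exact imbalance measure is not specified in this summary.

```python
def order_book_imbalance(bids, asks, depth=5):
    """(bid volume - ask volume) / (bid volume + ask volume) over top levels."""
    bid_vol = sum(size for _, size in bids[:depth])
    ask_vol = sum(size for _, size in asks[:depth])
    return (bid_vol - ask_vol) / (bid_vol + ask_vol)

# Hypothetical BTC-USDT snapshot: (price, size) per level.
bids = [(29999.0, 2.0), (29998.5, 4.0)]
asks = [(30000.0, 1.0), (30000.5, 1.0)]
obi = order_book_imbalance(bids, asks)
```

The measure lies in [-1, 1]; positive values indicate more resting volume on the bid side.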

Polymarket Signal Analysis

Dune Analytics · Polymarket CLOB · Multi-Market · March 2026

Systematic signal extraction from Polymarket prediction markets. Combining order flow, price momentum, and Reverse FLB (Favorite-Longshot Bias) signals for multi-market analysis.

Key finding: Reverse FLB consistently detected — favorites systematically overpriced. All signals point SHORT YES (buy NO) across analyzed markets. Consistent with prediction market inefficiency literature.

Post-Earnings Announcement Drift (PEAD)

IBES Consensus · CRSP Daily Returns · 27,318 Events · 2015–2024

Classic PEAD analysis: sorting earnings events by standardized surprise (actual vs. consensus EPS) and tracking cumulative abnormal returns over 60 trading days.

Key finding: Long-short Q5−Q1 CAR[0,60] = +8.84%. Q5 (beat): +3.76%, Q1 (miss): −5.08%. Monotonic drift across quintiles. Anomaly persists in modern markets.
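
The quintile sort can be sketched as follows, taking precomputed (surprise, CAR) pairs as input. Computing the standardized surprise and the abnormal returns is left out, and the function name and sample events are hypothetical.

```python
from statistics import mean

def quintile_spread(events, n_bins=5):
    """events: list of (surprise, car) pairs, with at least n_bins entries.

    Returns (mean CAR per bin from lowest to highest surprise,
    top-bin mean minus bottom-bin mean).
    """
    ranked = sorted(events, key=lambda e: e[0])
    n = len(ranked)
    bins = [[] for _ in range(n_bins)]
    for rank, (_, car) in enumerate(ranked):
        bins[rank * n_bins // n].append(car)
    means = [mean(b) for b in bins]
    return means, means[-1] - means[0]

# Hypothetical events where CAR rises linearly with the surprise.
events = [(s, 0.01 * s) for s in range(-5, 5)]
bin_means, long_short = quintile_spread(events)
```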

News Sentiment and Stock Returns

Commercial news-sentiment dataset · CRSP · 2M News Events · 2023

Linking commercial news-sentiment composite scores to CRSP daily returns for 39,118 firm-day observations. Examining same-day and next-day predictive power.

Key finding: Same-day Q5−Q1 spread = +1.24%. Next-day L/S = +0.15%. Negative news shows reversal: −0.38% same-day → +0.17% next-day (overreaction).

🤖 Automated Research Engine

WRDS Auto-Research Engine: 50 Systematic Studies

CRSP · Compustat · IBES · OptionMetrics · TRACE · DealScan · MSRB · Kaiko · 272 WRDS Libraries · 100 Figures · 51 Reports

Fully autonomous research engine that wrote, executed, and analyzed 50 systematic studies across WRDS databases. Each study includes a Python script, generated figures, and a comprehensive report. Topics span market microstructure, factor models, options, corporate bonds, ETFs, insider trading, earnings quality, and crypto-equity linkages.

Highlights:
• R01–R05: Market overview, Fama-French factors, fundamentals, merged CRSP-Compustat, analyst forecasts
• R08: Options market — Protective Put Sharpe 0.573 (best), Short Straddle −0.064 (worst)
• R15/R19: Crypto order book depth, Kaiko trade analysis
• R26: Cross-asset — BTC-SPX correlation mean −0.04, negative 55% of time
• R39: Crypto-equity linkage — BTC −5.5% vs market −2.5% during stress (not a hedge)
• R41: Fama-MacBeth regressions — full 2-pass factor pricing
• R50: Data quality audit across all databases
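
For reference, the two-pass procedure named in R41 can be sketched for a single factor: time-series regressions estimate each asset's beta, then period-by-period cross-sectional regressions of returns on those betas yield a risk premium series that is averaged. This is a generic illustration with hypothetical inputs, not the engine's script; standard errors and multiple factors are omitted.

```python
from statistics import mean

def ols(x, y):
    """Univariate OLS; returns (intercept, slope)."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def fama_macbeth(returns, factor):
    """returns: dict asset -> list of period returns; factor: factor returns."""
    # Pass 1: time-series beta for each asset.
    betas = {a: ols(factor, r)[1] for a, r in returns.items()}
    assets = list(returns)
    x = [betas[a] for a in assets]
    # Pass 2: cross-sectional slope in each period, then average.
    lambdas = [ols(x, [returns[a][t] for a in assets])[1]
               for t in range(len(factor))]
    return betas, mean(lambdas)

# Hypothetical data: returns are exactly beta times the factor.
factor = [0.01, -0.02, 0.03, 0.02]
true_betas = {"A": 0.5, "B": 1.0, "C": 1.5}
returns = {a: [b * f for f in factor] for a, b in true_betas.items()}
betas, risk_premium = fama_macbeth(returns, factor)
```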

🧬 Biology

Type 2 Diabetes: A Systems Biology Approach

STRING · ChEMBL · GEO · GROMACS · Cornell BioHPC · 660 cores

Protein interaction networks (13.7M), gene expression meta-analysis (256 islet samples), drug-target landscapes (3,552 compounds × 7 targets), and 100 ns insulin molecular dynamics.

Key finding: Disease module Z = 22.28 (52.6× over random). IL6 top hub (degree = 299). 47 multi-target compounds; CHEMBL509032 dual potent (INSR 20 nM + GCK 50 nM).
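
A module Z-score of this kind is typically the observed statistic (for example, the size of the largest connected component among disease genes) standardized against values from random gene sets. The sketch below assumes that setup and uses the population standard deviation; the null values are hypothetical and the project's actual statistic may differ.

```python
from statistics import mean, pstdev

def module_z_score(observed, null_values):
    """Standardize an observed module statistic against a random null."""
    return (observed - mean(null_values)) / pstdev(null_values)

# Hypothetical largest-component sizes from random gene sets.
null_lcc_sizes = [2, 4, 4, 4, 5, 5, 7, 9]
z = module_z_score(10, null_lcc_sizes)
```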

Papers

Full PDFs for each project listed above.