Bringing modern AI to financial research

AI SuperInvestor is an AI research project investigating how machine learning and large language models can process public datasets, identify trends, generate research reports, and deepen our understanding of market dynamics.

Backed by leading investors

Research agenda

Three open questions drive the work

Markets are noisy, non-stationary, and described largely in natural language. Each property poses a distinct challenge for machine learning, and each is a research track of its own.

FORECASTING

Signal & Trend Detection

How well can supervised learning mine meaningful patterns from noisy public market data? We benchmark gradient boosting, LSTMs, and transformer architectures on trend-identification tasks across equities, rates, and macro series.

MARKET DYNAMICS

Modeling Non-Stationary Markets

Financial environments shift constantly. We study regime detection, concept drift, and adaptive modeling, measuring how quickly analytical models degrade and what it takes to keep them honest.
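One simple way to quantify degradation is to compare a model's recent forecast error with its error over an initial reference period. The sketch below is illustrative only and is not the project's code: the function name `drift_flags`, the window sizes, and the 1.5 threshold are all assumptions.

```python
from statistics import mean
from typing import List, Sequence


def drift_flags(
    errors: Sequence[float],
    ref_size: int,
    window: int,
    ratio: float = 1.5,  # hypothetical threshold, chosen for illustration
) -> List[bool]:
    """Flag windows whose mean absolute error exceeds ratio * baseline.

    The baseline is the mean absolute error of the first `ref_size` points.
    One flag is produced per trailing window that lies entirely after the
    reference period.
    """
    if ref_size <= 0 or window <= 0:
        raise ValueError("ref_size and window must be positive")
    baseline = mean(abs(e) for e in errors[:ref_size])
    if baseline == 0:
        raise ValueError("reference error is zero; ratio test is undefined")
    flags = []
    for end in range(ref_size + window, len(errors) + 1):
        recent = mean(abs(e) for e in errors[end - window:end])
        flags.append(recent > ratio * baseline)
    return flags


# Toy forecast errors: stable at first, then clearly worse.
errors = [0.1, -0.1, 0.1, -0.1, 0.1, 0.1, 0.5, 0.6]
flags = drift_flags(errors, ref_size=4, window=2)
```

A ratio test like this is deliberately crude. It says that error has grown, not why, so it is a trigger for investigation rather than a regime detector.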

LANGUAGE

Financial Language Understanding

Filings, transcripts, and news encode market context in text. We evaluate how large language models extract, summarize, and reason over financial documents, and where they hallucinate.

What we're building

Analytical tools for open financial research

In the spirit of open platforms like Qlib, FinGPT, and OpenBB, the project develops reusable software for studying markets, built around public data and reproducible methodology.

INFRASTRUCTURE

Data Pipeline

An ingestion and normalization layer for public datasets such as SEC filings, market prices, macro indicators, and news feeds, cleaned and aligned into research-ready form.

EVALUATION

Model Benchmark Suite

A reproducible harness for comparing ML architectures on financial analysis tasks, with walk-forward validation and honest out-of-sample reporting.
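As a minimal sketch of what walk-forward validation means here, the generator below yields chronological train/test index splits in which every test observation comes strictly after its training data. The name `walk_forward_splits` and its parameters are hypothetical and do not describe the suite's actual interface.

```python
from typing import Iterator, List, Tuple


def walk_forward_splits(
    n_samples: int,
    train_size: int,
    test_size: int,
    expanding: bool = False,
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_idx, test_idx) pairs in time order.

    With expanding=False the training window slides forward with a fixed
    length; with expanding=True it always starts at index 0 and grows.
    """
    if train_size <= 0 or test_size <= 0:
        raise ValueError("train_size and test_size must be positive")
    start = train_size  # index of the first test observation
    while start + test_size <= n_samples:
        train_start = 0 if expanding else start - train_size
        train_idx = list(range(train_start, start))
        test_idx = list(range(start, start + test_size))
        yield train_idx, test_idx
        start += test_size  # step forward by one test block


# Ten observations, four for training, two for each test block.
splits = list(walk_forward_splits(n_samples=10, train_size=4, test_size=2))
```

Because each split only ever trains on indices earlier than the ones it tests on, scores aggregated across splits are out-of-sample by construction, provided the features themselves contain no look-ahead.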

LLM

Report Generator

An LLM pipeline that converts quantitative findings into structured research reports, with every claim traceable back to the underlying public data.
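One way traceability could be enforced is to make every claim carry explicit references to the records it rests on, and to reject a report that contains a claim with none. The sketch below assumes that design. `SourceRef`, `Claim`, `untraceable`, and the example identifiers are hypothetical names, not the project's schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class SourceRef:
    """Pointer to one record in a public dataset (hypothetical schema)."""
    dataset: str    # e.g. a filings or price dataset name
    record_id: str  # identifier of the row or document within it


@dataclass
class Claim:
    """A single sentence of a report plus the records that support it."""
    text: str
    sources: List[SourceRef] = field(default_factory=list)


def untraceable(claims: List[Claim]) -> List[Claim]:
    """Return the claims that cite no source and should block publication."""
    return [c for c in claims if not c.sources]


# Placeholder content only; the identifiers below are made up.
report = [
    Claim("Example claim with support.",
          [SourceRef("example-filings", "doc-001")]),
    Claim("Example claim with no support."),
]
missing = untraceable(report)
```

A check like this only verifies that a citation exists. Whether the cited record actually supports the sentence still has to be confirmed separately.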

ANALYTICS

Market Dynamics Lab

Interactive analytics for studying how information propagates through markets: correlation structure, regime shifts, and cross-asset behavior over time.
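Studying correlation structure over time usually starts with a rolling correlation between two return series. The following is a dependency-free sketch under that assumption. The function names and the toy data are illustrative and are not taken from the lab itself.

```python
from math import sqrt
from typing import List, Optional, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> Optional[float]:
    """Pearson correlation, or None when either series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return None
    return cov / sqrt(vx * vy)


def rolling_correlation(
    x: Sequence[float], y: Sequence[float], window: int
) -> List[Optional[float]]:
    """Correlation over each trailing window of length `window`."""
    if len(x) != len(y):
        raise ValueError("series must have equal length")
    if window < 2:
        raise ValueError("window must be at least 2")
    return [
        pearson(x[i - window:i], y[i - window:i])
        for i in range(window, len(x) + 1)
    ]


# Toy series: perfectly co-moving at first, then the link weakens.
a = [1.0, 2.0, 3.0, 4.0, 5.0]
b = [2.0, 4.0, 6.0, 8.0, 7.0]
corr = rolling_correlation(a, b, window=3)
```

A sustained change in the rolling value is one rough indication of a shift in cross-asset behavior, although short windows make the estimate noisy.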

Methodology

From public data to research insight

Source & Prepare

Public datasets including filings, prices, macro indicators, and news are collected, cleaned, and normalized. Data quality work is treated as first-class research, not plumbing.

Model

Statistical methods, gradient boosting, deep learning, and LLMs are applied to extract structure: patterns, correlations, anomalies, and regime changes.

Synthesize

Quantitative findings are turned into structured, readable research reports by LLM pipelines, with full traceability from every claim to its source data.

Evaluate & Iterate

Outputs are benchmarked out-of-sample, stress-tested, and peer-reviewed. Negative results are documented, because knowing what doesn't work is half the research.

Principles & scope

How the research is conducted

Public data only

All analysis is conducted on publicly available datasets. No privileged information, no proprietary feeds requiring special access.

Research, not advice

Activities are limited to research, software development, and data analytics. Nothing produced is investment advice, and no assets are managed or traded.

Honest evaluation

Results are reported with out-of-sample testing and walk-forward validation, and negative results are published. Overfit findings help no one.

Human + machine

Models are tools for understanding, not oracles. Every automated finding passes through human review before it becomes a research claim.

Scope of activities. AI SuperInvestor is an AI research project whose activities are limited to research, software development, and data analytics. The project does not provide investment advice, manage assets, execute trades, or offer financial services of any kind. Nothing on this site should be construed as a recommendation to buy or sell any security.