Automated Market Volatility Trading System - Detecting and trading on political market signals in real-time
Volfefe Machine is an intelligent, event-driven trading system that monitors real-time political and economic content, analyzes market impact using ML-based sentiment analysis, and executes automated trading strategies based on detected volatility signals.
The name? A playful nod to Trump's infamous "covfefe" tweet + volatility (vol) = Volfefe ⚡
Political/Economic Event → Sentiment Analysis → Market Impact Assessment → Automated Trade
Starting with Truth Social posts (particularly Trump's tariff announcements), the system will expand to monitor news APIs, social media, and financial feeds to identify market-moving events before they cause significant price action.
┌───────────────────────────────────────────────────────────┐
│                 SOURCES (Modular Adapters)                │
│   • Truth Social (via Apify)  • NewsAPI  • RSS  • More    │
└───────────────────────────────────────────────────────────┘
                             ↓
┌───────────────────────────────────────────────────────────┐
│                     INGESTION PIPELINE                    │
│      Fetch → Normalize → Store (Postgres) → Broadcast     │
└───────────────────────────────────────────────────────────┘
                             ↓
┌───────────────────────────────────────────────────────────┐
│           MULTI-MODEL CLASSIFICATION (ML Analysis)        │
│   • 3 Sentiment Models: DistilBERT, Twitter-RoBERTa,      │
│     FinBERT (weighted consensus)                          │
│   • 1 NER Model: BERT-base-NER (entity extraction)        │
│   Output: Sentiment + Confidence + Entities (ORG/LOC/PER) │
└───────────────────────────────────────────────────────────┘
                             ↓
┌───────────────────────────────────────────────────────────┐
│            ASSET LINKING (Phase 2 - In Progress)          │
│     Match entities → Assets database → ContentTargets     │
└───────────────────────────────────────────────────────────┘
                             ↓
┌───────────────────────────────────────────────────────────┐
│           STRATEGY ENGINE (Rule-Based Decisions)          │
│      Sector Mapping → Company Selection → Trade Type      │
└───────────────────────────────────────────────────────────┘
                             ↓
┌───────────────────────────────────────────────────────────┐
│                   EXECUTION (Alpaca API)                  │
│      Paper Trading → Live Trading (Options, Stocks)       │
└───────────────────────────────────────────────────────────┘
- Elixir 1.15+ and Erlang/OTP 26+
- PostgreSQL 14+
- Node.js 18+ (for Phoenix LiveView assets)
# Clone the repository
git clone https://github.com/razrfly/volfefe.git
cd volfefe
# Install dependencies
mix deps.get
cd assets && npm install && cd ..
# Set up environment variables
cp .env.example .env
# Edit .env with your actual credentials (database password, API tokens, etc.)
# Set up database
mix ecto.setup
# (Optional) Install Python dependencies for ML scripts
python3 -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
pip install -r requirements.txt
# Start Phoenix server
mix phx.server
Visit localhost:4002 to see the live dashboard.
Once content is ingested (see Content Ingestion below), you can run multi-model classification:
# Classify first 10 unclassified items with all models (sentiment + NER)
mix classify.contents --limit 10 --multi-model
# Classify all unclassified content
mix classify.contents --all --multi-model
# Classify specific content IDs
mix classify.contents --ids 1,2,3 --multi-model
# Preview what would be classified (dry run)
mix classify.contents --limit 10 --dry-run
Output includes:
- Sentiment consensus from 3 models (positive/negative/neutral)
- Confidence scores and model agreement rates
- Extracted entities: Organizations (ORG), Locations (LOC), Persons (PER), Miscellaneous (MISC)
- Entity confidence scores and context
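The consensus sentiment, confidence, and agreement rate above come from combining the three model outputs. Below is a minimal sketch of one way such a weighted consensus could be computed; the weights and module name are illustrative assumptions, not the project's actual implementation:

```elixir
defmodule ConsensusExample do
  # Illustrative per-model weights; the real pipeline may weight models differently.
  @weights %{"distilbert" => 0.3, "twitter_roberta" => 0.3, "finbert" => 0.4}

  @doc "Combine per-model results into a consensus sentiment, confidence, and agreement rate."
  def consensus(model_results) do
    # Sum weighted confidence per sentiment label.
    totals =
      Enum.reduce(model_results, %{}, fn %{"model_id" => id, "sentiment" => s, "confidence" => c}, acc ->
        Map.update(acc, s, @weights[id] * c, &(&1 + @weights[id] * c))
      end)

    {sentiment, _score} = Enum.max_by(totals, fn {_label, score} -> score end)
    agreeing = Enum.count(model_results, &(&1["sentiment"] == sentiment))

    %{
      sentiment: sentiment,
      # Average confidence of the models that agree with the winning label.
      confidence:
        model_results
        |> Enum.filter(&(&1["sentiment"] == sentiment))
        |> Enum.map(& &1["confidence"])
        |> then(&(Enum.sum(&1) / max(length(&1), 1))),
      agreement_rate: agreeing / length(model_results)
    }
  end
end
```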
After classifying content, analyze market reactions by capturing price/volume snapshots around each Trump post.
First, fetch asset information and establish baseline statistics:
# Fetch starter universe of assets (SPY, QQQ, DIA, IWM, GLD, TLT)
mix fetch.assets --symbols SPY,QQQ,DIA,IWM,GLD,TLT
# Calculate 60-day baseline statistics (mean, std dev, percentiles)
# This fetches historical data and computes rolling returns for 1hr, 4hr, 24hr windows
mix calculate.baselines --all
Capture 4 time-windowed snapshots (before, 1hr, 4hr, 24hr after) for classified content:
# Single content item
mix snapshot.market --content-id 165
# Multiple specific items
mix snapshot.market --ids 165,166,167
# All content published on a date
mix snapshot.market --date 2025-10-28
# Date range
mix snapshot.market --date-range 2025-10-01 2025-10-31
# All classified content
mix snapshot.market --all
# Only content missing complete snapshots
mix snapshot.market --missing
# Preview without capturing (dry run)
mix snapshot.market --date 2025-10-28 --dry-run
Each snapshot captures:
- Open, High, Low, Close (OHLC) prices
- Volume and volume deviation from baseline
- Volume z-score (standard deviations from mean)
- Market state (pre-market, regular, after-hours, closed)
- Data validity flags
- Isolation score (contamination from nearby content)
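The volume z-score in each snapshot is the standard "deviations from the baseline mean" measure computed from the baseline statistics above. A minimal sketch under assumed field names (`mean_volume`/`stddev_volume` are illustrative, not necessarily the schema's exact columns):

```elixir
defmodule ZScoreExample do
  @doc "How many standard deviations the observed volume sits from its baseline mean."
  def volume_z_score(observed_volume, %{mean_volume: mean, stddev_volume: sd}) when sd > 0 do
    (observed_volume - mean) / sd
  end

  # No meaningful z-score when the baseline has zero variance.
  def volume_z_score(_observed_volume, _baseline), do: nil
end

# Example: 2.4M shares against a baseline mean of 1.5M with std dev 0.3M ≈ z-score of 3.0
ZScoreExample.volume_z_score(2_400_000, %{mean_volume: 1_500_000, stddev_volume: 300_000})
```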
Keep baseline statistics fresh as new market data accumulates:
# Update only stale baselines (older than 24 hours)
mix calculate.baselines --all --check-freshness
# Update specific assets
mix calculate.baselines --symbols SPY,QQQ
# Force recalculation (ignore freshness)
mix calculate.baselines --all --force
Monitor data quality and coverage:
# Check which content is missing snapshots
mix snapshot.market --missing --dry-run
# View baseline statistics
iex -S mix
> alias VolfefeMachine.{Repo, MarketData.BaselineStats}
> Repo.all(BaselineStats) |> Enum.map(& {&1.asset_id, &1.window_minutes, &1.mean_return})
Daily Update Workflow:
# 1. Update baselines for all assets (skips fresh ones)
mix calculate.baselines --all --check-freshness
# 2. Capture snapshots for newly classified content
mix snapshot.market --missing
# 3. Verify coverage
mix snapshot.market --missing --dry-run
Backfill Historical Data:
# Capture snapshots for all content in October 2025
mix snapshot.market --date-range 2025-10-01 2025-10-31
# Force recalculate all baselines with latest 60 days
mix calculate.baselines --all --force
Troubleshooting:
- "No data available": TwelveData may not have data for that timestamp (market closed, weekend, holiday)
- "Rate limit exceeded": Free tier allows 8 calls/minute, 800/day - add delays between operations
- Incomplete snapshots: Use
--missingflag to find and fill gaps - Stale baselines: Use
--check-freshnessto update only old statistics
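One simple way to stay under the 8-calls-per-minute limit when scripting your own batch operations is to space requests out client-side. This is a hedged sketch, not behavior built into the mix tasks; the module and function names are illustrative:

```elixir
defmodule ThrottleExample do
  # Space TwelveData calls ~8 seconds apart so a batch stays under the
  # free tier's 8 calls/minute. fetch_fun is any 1-arity function that
  # performs one API call for the given item.
  def throttled_map(items, fetch_fun, delay_ms \\ 8_000) do
    Enum.map(items, fn item ->
      result = fetch_fun.(item)
      Process.sleep(delay_ms)
      result
    end)
  end
end
```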
The project uses environment variables for sensitive configuration. Copy .env.example to .env and update with your credentials:
# PostgreSQL Database
PGHOST=localhost
PGDATABASE=volfefe_machine_dev
PGUSER=postgres
PGPASSWORD=your_postgres_password
# Apify API (for Truth Social scraping)
APIFY_USER_ID=your_user_id_here
APIFY_PERSONAL_API_TOKEN=your_api_token_here
Never commit your .env file to version control! It's already in .gitignore.
Status: ✅ Complete - Unified ingestion pipeline ready
Fetch and import content from Truth Social using a single command:
# Fetch 100 posts from a specific user
mix ingest.content --source truth_social --username realDonaldTrump --limit 100
# Include replies in results
mix ingest.content --source truth_social --username realDonaldTrump --limit 50 --include-replies
# Preview what would be fetched (dry run)
mix ingest.content --source truth_social --username realDonaldTrump --limit 10 --dry-run
Available Options:
- `--source, -s` - Content source (currently: `truth_social`)
- `--username, -u` - Username/profile to fetch (required)
- `--limit, -l` - Maximum posts to fetch (default: 100)
- `--include-replies` - Include replies in results (default: false)
- `--dry-run` - Preview configuration without fetching (default: false)
| Component | Purpose | Status |
|---|---|---|
| Database Schema | Assets, Contents, Classifications, ContentTargets | ✅ Complete |
| Multi-Model Classification | 3 sentiment models + weighted consensus | ✅ Complete |
| NER Entity Extraction | Extract organizations, locations, persons | ✅ Complete |
| Apify Integration | Fetch Truth Social posts via API | ✅ Complete |
| Ingestion Pipeline | Unified fetch + import workflow | ✅ Complete |
| Asset Linking | Match extracted entities to assets database | 📋 Phase 2 |
| Strategy Engine | Rule-based trade decision logic | 📋 Phase 3 |
| Trade Executor | Alpaca API integration | 📋 Phase 4 |
| Dashboard | Real-time monitoring UI | 📋 Future |
Legend: ✅ Complete | 🚧 In Progress | 📋 Planned
sources - External data sources (Truth Social, NewsAPI, etc.)
%Source{
name: "truth_social",
adapter: "TruthSocialAdapter",
base_url: "https://api.example.com",
last_fetched_at: ~U[2025-01-26 10:00:00Z]
}
contents - Normalized posts/articles
%Content{
source_id: uuid,
external_id: "12345",
author: "realDonaldTrump",
text: "Big tariffs on steel coming soon!",
url: "https://truthsocial.com/@realDonaldTrump/12345",
published_at: ~U[2025-01-26 09:45:00Z],
classified: false
}
classifications - ML analysis results with sentiment consensus
%Classification{
content_id: uuid,
sentiment: "negative",
confidence: 0.9556,
meta: %{
"agreement_rate" => 1.0,
"model_results" => [
%{"model_id" => "distilbert", "sentiment" => "negative", "confidence" => 0.9812},
%{"model_id" => "twitter_roberta", "sentiment" => "negative", "confidence" => 0.9654},
%{"model_id" => "finbert", "sentiment" => "negative", "confidence" => 0.9201}
],
"entities" => [
%{"text" => "Tesla", "type" => "ORG", "confidence" => 0.9531},
%{"text" => "United States", "type" => "LOC", "confidence" => 0.9912}
]
}
}
assets - Tradable securities (9,000+ loaded)
%Asset{
symbol: "TSLA",
name: "Tesla Inc",
exchange: "NASDAQ",
asset_class: "us_equity"
}
content_targets - Extracted entities linked to assets (Phase 2)
%ContentTarget{
content_id: uuid,
asset_id: uuid,
extraction_method: "ner_bert",
confidence: 0.9531,
context: "Tesla stock tumbled 12% today..."
}
- Framework: Phoenix 1.7 + LiveView
- Language: Elixir
- Database: PostgreSQL with Ecto
- Job Queue: Oban for background processing
- HTTP Client: HTTPoison for external APIs
- Python: Python 3.9+ with virtual environment
- ML Framework: Transformers (Hugging Face)
- Models:
- Sentiment: DistilBERT, Twitter-RoBERTa, FinBERT
- NER: BERT-base-NER (dslim/bert-base-NER)
- Elixir Integration: Python interop via `System.cmd/3` (see the interop sketch after this list)
- Data Source: Apify for Truth Social scraping
- Trading: Alpaca Markets API (future)
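The Elixir–Python bridge via `System.cmd/3` shells out to a script and reads its output (typically JSON) from stdout. A minimal sketch of that pattern; the script path, CLI flags, and module name are assumptions for illustration, not the project's exact files:

```elixir
defmodule PythonBridgeExample do
  # Hedged sketch: call a hypothetical classifier script and decode its JSON output.
  def classify(text) do
    case System.cmd("python3", ["priv/python/classify.py", "--text", text], stderr_to_stdout: true) do
      {output, 0} -> Jason.decode(output)
      {error, exit_code} -> {:error, {exit_code, error}}
    end
  end
end
```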
- Project setup and architecture
- Database schemas (contents, sources, classifications, assets, content_targets)
- Assets database loaded (9,000+ securities)
- Multi-model sentiment classification (DistilBERT, Twitter-RoBERTa, FinBERT)
- Weighted consensus algorithm
- NER entity extraction (BERT-base-NER)
- Classification mix task with batch processing
- Content ingestion - Unified mix task (Issue #46)
- Content backup/seeding system (Issue #45) - Next Step
- Entity → Asset matching logic (Issue #42)
- ContentTargets creation
- Fuzzy name matching
- Confidence scoring
- Manual validation tools
- Sector-to-ticker mapping
- Rule-based trade logic
- Backtesting framework
- Signal generation
- Alpaca API integration
- Paper trading
- Risk management
- Live trading (manual approval)
- NewsAPI adapter
- Reddit adapter
- RSS feeds
- Source weighting
See: Issue #43 (Phase 1 NER) | Issue #42 (Phase 2 Asset Linking)
The system includes tools for detecting potential insider trading on Polymarket prediction markets by analyzing blockchain trade data.
Reference Cases (Seed Data)            Blockchain (Subgraph)
├─ Event date                          ├─ ALL trades in date range
├─ Description                         ├─ Wallet addresses
└─ Context from news                   └─ Trade amounts & timing
         │                                       │
         └──────────────────┬────────────────────┘
                            ▼
                   Pattern Discovery
                   ├─ Scan by date range (not keyword)
                   ├─ Group trades by market
                   ├─ Score: volume, whales, timing
                   └─ Output candidate markets
                            ▼
                   Human Confirmation
                   ├─ Review candidate list
                   ├─ Confirm correct market
                   └─ Update reference case
                            ▼
                   Trade Ingestion
                   └─ Ingest trades for analysis
Scan blockchain trades in a date range to discover market activity patterns:
# Scan last 7 days, show top 10 markets
mix polymarket.ingest --subgraph --days 7 --scan
# Scan specific date range around a known event
mix polymarket.ingest --subgraph --from 2025-10-08 --to 2025-10-12 --scan --top 20
# Full ingestion (without --scan flag)
mix polymarket.ingest --subgraph --from 2025-10-01 --to 2025-10-15
Scan output includes:
- Total trades and volume in date range
- Markets grouped by trading activity
- Whale trade counts (>$1K positions)
- Unique wallet counts per market
- Trading period timestamps
Discover which Polymarket market corresponds to a reference case:
# Discover markets for a specific reference case
mix polymarket.discover --reference-case "Nobel Peace Prize 2025"
# Custom window: 10 days before event, 2 days after
mix polymarket.discover --reference-case "Nobel Peace Prize 2025" --window 10 --after 2
# Show top 20 candidates
mix polymarket.discover --reference-case "Nobel Peace Prize 2025" --top 20
# Discover for ALL reference cases missing condition_ids
mix polymarket.discover --all-references
Discovery output includes:
- Ranked candidate markets by score (0.0-1.0)
- Condition ID for each candidate
- Volume and pre-event volume percentage
- Whale count and unique wallet count
- Trading period around the event
- Suspicious wallets with timing and volume data (Phase 3)
After reviewing discover output, confirm which market is correct:
# Confirm market match for reference case
mix polymarket.confirm --reference-case "Nobel Peace Prize 2025" --condition 0x14a3dfeba8...
# With optional slug (auto-fetched if omitted)
mix polymarket.confirm --reference-case "Case Name" --condition 0xabc... --slug "market-slug"After confirming, ingest trades for analysis:
# Ingest trades for a specific reference case
mix polymarket.ingest --subgraph --reference-case "Nobel Peace Prize 2025"
# Ingest trades for all reference cases with condition_ids
mix polymarket.ingest --subgraph --reference-cases
# Ingest for specific condition
mix polymarket.ingest --subgraph --condition 0x14a3dfeba8... --from 2025-10-01
# 1. You have a reference case (from news: "Nobel Peace Prize 2025", event Oct 11)
# Check it exists in the database
mix polymarket.references
# 2. Discover which market matches this event
mix polymarket.discover --reference-case "Nobel Peace Prize 2025"
# Output shows candidate markets ranked by score:
# 1. "Will MarΓa Corina Machado win the Nobel Peace Prize?"
# Condition: 0x14a3... Score: 0.85 Volume: $45K (62% pre-event)
# 3. Review candidates and confirm the correct match
mix polymarket.confirm --reference-case "Nobel Peace Prize 2025" --condition 0x14a3dfeba8...
# 4. Promote discovered wallets to investigation candidates
mix polymarket.promote --reference-case "Nobel Peace Prize 2025"
# Output: Created 12 investigation candidate(s)
# Batch ID: refcase-nobel-peace-1234567890
# 5. Ingest trades for detailed analysis
mix polymarket.ingest --subgraph --reference-case "Nobel Peace Prize 2025"
# 6. Review and investigate candidates
mix polymarket.candidates --batch refcase-nobel-peace-1234567890
mix polymarket.investigate --id 1
Reference case discovery scores markets based on:
| Factor | Weight | Description |
|---|---|---|
| Whale Activity | 30% | Trades >$1K indicate informed trading |
| Pre-Event Volume | 30% | Volume concentration before event |
| Total Volume | 25% | Log-scaled trading volume |
| Wallet Diversity | 15% | Unique wallets (avoid wash trading) |
Individual wallets are scored for suspicious activity:
| Factor | Weight | Description |
|---|---|---|
| Volume | 25% | Log-scaled position size |
| Whale Trades | 25% | Number of trades >$1K |
| Pre-Event Concentration | 30% | % of volume placed before event |
| Timing Precision | 20% | Closer to event = more suspicious |
Timing Precision Breakdown:
- Within 24h of event: 20%
- Within 48h: 15%
- Within 72h: 10%
- Beyond 72h: 5%
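Putting the wallet-level factors together, a suspicion score reads as a weighted sum of the four normalized signals. The sketch below uses the weights and timing tiers from the tables above, but the normalization helpers (log scaling, whale-count cap) and field names are illustrative assumptions, not the project's exact formulas:

```elixir
defmodule SuspicionScoreExample do
  # Weights from the table above: volume 25%, whale trades 25%,
  # pre-event concentration 30%, timing precision 20%.
  def score(%{volume_usd: vol, whale_trades: whales, pre_event_pct: pre, hours_before_event: hrs}) do
    0.25 * log_scaled(vol) +
      0.25 * min(whales / 10, 1.0) +
      0.30 * pre +
      0.20 * timing_precision(hrs)
  end

  # Illustrative log-scaled normalization of position size into 0..1.
  defp log_scaled(volume) when volume > 0, do: min(:math.log10(volume) / 6, 1.0)
  defp log_scaled(_volume), do: 0.0

  # Timing tiers from the breakdown above, expressed as fractions of the 20% weight.
  defp timing_precision(hrs) when hrs <= 24, do: 1.0
  defp timing_precision(hrs) when hrs <= 48, do: 0.75
  defp timing_precision(hrs) when hrs <= 72, do: 0.5
  defp timing_precision(_hrs), do: 0.25
end
```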
Discovery results (wallets + candidate markets) are automatically saved to the reference case for later analysis.
After discovery, convert suspicious wallets into investigation candidates:
# Preview promotion (dry run)
mix polymarket.promote --reference-case "Nobel Peace Prize 2025" --dry-run
# Promote wallets (creates investigation candidates)
mix polymarket.promote --reference-case "Nobel Peace Prize 2025"
# Custom thresholds
mix polymarket.promote --reference-case "Case Name" --min-score 0.6 --limit 10
# Force priority level
mix polymarket.promote --reference-case "Case Name" --priority criticalPromotion workflow:
- Reads discovered wallets from reference case
- Filters by minimum suspicion score (default: 0.4)
- Creates InvestigationCandidate records with:
- Anomaly breakdown (volume, timing, whale trades)
- Matched patterns (reference case linkage)
- Priority based on suspicion score
- Assigns batch ID for tracking
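Conceptually, promotion is a filter-and-tag pass over the wallets saved on the reference case. A rough sketch under assumed field names (`discovered_wallets`, `suspicion_score`, and the priority cutoff are hypothetical; the real task also persists InvestigationCandidate records):

```elixir
defmodule PromoteExample do
  # Illustrative only: keep wallets at or above the minimum suspicion score
  # (default 0.4) and stamp them with a shared batch id for later tracking.
  def promote(reference_case, min_score \\ 0.4) do
    batch_id = "refcase-#{reference_case.slug}-#{System.system_time(:second)}"

    reference_case.discovered_wallets
    |> Enum.filter(&(&1.suspicion_score >= min_score))
    |> Enum.map(fn wallet ->
      %{
        wallet_address: wallet.address,
        suspicion_score: wallet.suspicion_score,
        batch_id: batch_id,
        # Hypothetical cutoff: higher scores get a higher priority.
        priority: if(wallet.suspicion_score >= 0.7, do: :high, else: :medium)
      }
    end)
  end
end
```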
After promotion:
# List candidates from this batch
mix polymarket.candidates --batch refcase-nobel-peace-1234567890
# Investigate specific candidate
mix polymarket.investigate --id 1
# Confirm as insider
mix polymarket.confirm --id 1 --notes "Pre-event timing matches"
| Source | Endpoint | Data Available |
|---|---|---|
| Polymarket Subgraph | `api.goldsky.com/.../orderbook-subgraph` | All trades since Nov 2022 |
| Polymarket API | `data-api.polymarket.com` | Recent trades (geo-blocked in US) |
| Gamma API | `gamma-api.polymarket.com` | Market metadata (geo-blocked in US) |
| CLOB API | `clob.polymarket.com` | Active market data (geo-blocked in US) |
Note: The subgraph bypasses geo-blocking and provides complete historical data. For CLOB and Gamma APIs, see VPN Setup below.
Polymarket geo-blocks US IP addresses for their CLOB and Gamma APIs (regulatory compliance). The application uses a Docker-based VPN proxy to route only Polymarket API calls through VPN, without affecting the rest of your network traffic.
- Docker Desktop installed and running
- ProtonVPN account with WireGuard access
- Get WireGuard credentials from ProtonVPN:
  - Go to Proton Account → VPN → WireGuard
  - Click "Generate Key" (or use existing)
  - Copy the `PrivateKey` value
- Add credentials to `.env`:

  # VPN Proxy for Polymarket API Access
  PROTONVPN_WIREGUARD_PRIVATE_KEY=your_wireguard_private_key_here
  VPN_PROXY_ENABLED=true
  VPN_PROXY_HOST=localhost
  VPN_PROXY_PORT=8888

- Start the VPN proxy container:

  docker compose -f docker-compose.vpn.yml up -d

- Verify connection:

  # Check container is running
  docker logs vpn-proxy | grep "Public IP"
  # Should show: Public IP address is X.X.X.X (Netherlands, ...)

  # Test API access through proxy
  curl -x http://localhost:8888 "https://gamma-api.polymarket.com/markets?limit=1"
When the VPN proxy is running and VPN_PROXY_ENABLED=true, the application automatically routes Polymarket API calls (CLOB, Gamma) through the VPN tunnel. The subgraph API does not require VPN.
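Routing only the geo-blocked endpoints comes down to conditionally attaching proxy options to the HTTP client when `VPN_PROXY_ENABLED` is set, leaving all other traffic untouched. A sketch of that pattern with HTTPoison; the wrapper module is illustrative and may differ from the app's real client:

```elixir
defmodule PolymarketHttpExample do
  # Attach the VPN proxy only when VPN_PROXY_ENABLED=true, so non-Polymarket
  # traffic (and the subgraph API) is unaffected.
  def get(url, headers \\ []) do
    HTTPoison.get(url, headers, proxy_opts())
  end

  defp proxy_opts do
    if System.get_env("VPN_PROXY_ENABLED") == "true" do
      host = System.get_env("VPN_PROXY_HOST", "localhost")
      port = System.get_env("VPN_PROXY_PORT", "8888")
      [proxy: "http://#{host}:#{port}"]
    else
      []
    end
  end
end
```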
# Start VPN proxy
docker compose -f docker-compose.vpn.yml up -d
# Run with VPN enabled
export VPN_PROXY_ENABLED=true
mix phx.server
# Or for one-off commands
export VPN_PROXY_ENABLED=true
mix polymarket.health
| Issue | Solution |
|---|---|
| Container won't start | Ensure Docker Desktop is running |
| "Connection refused" on port 8888 | Check docker logs vpn-proxy for errors |
| API still returning 403/blocked | Verify the VPN connected: `docker logs vpn-proxy \| grep "Public IP"` |
| Rate limiting from Polymarket | The VPN is working; reduce request frequency |
| Enrichment/metadata fetch failing | Ensure VPN_PROXY_ENABLED=true is set |
docker compose -f docker-compose.vpn.yml down
Note: When the VPN proxy is stopped or VPN_PROXY_ENABLED=false, CLOB and Gamma API calls will fail for US users. The subgraph-based trade ingestion will continue to work.
# Run all tests
mix test
# Run with coverage
mix test --cover
# Run specific test file
mix test test/volfefe/pipeline_test.exs
Input Text:
"Tesla stock tumbled 12% today as Elon Musk's controversial tweet sparked
concerns about the company's future. Analysts in the United States and
Europe are worried about automotive sector stability."
Multi-Model Classification Output:
%{
# Sentiment Consensus (3 models)
consensus: %{
sentiment: "negative",
confidence: 0.9556,
agreement_rate: 1.0
},
# Individual Model Results
model_results: [
%{model_id: "distilbert", sentiment: "negative", confidence: 0.9812},
%{model_id: "twitter_roberta", sentiment: "negative", confidence: 0.9654},
%{model_id: "finbert", sentiment: "negative", confidence: 0.9201}
],
# Extracted Entities (NER)
entities: [
%{text: "Tesla", type: "ORG", confidence: 0.9531,
context: "Tesla stock tumbled 12% today..."},
%{text: "Elon Musk", type: "PER", confidence: 0.9802,
context: "...12% today as Elon Musk's controversial..."},
%{text: "United States", type: "LOC", confidence: 0.9912,
context: "...Analysts in the United States and Europe..."},
%{text: "Europe", type: "LOC", confidence: 0.9845,
context: "...United States and Europe are worried..."}
],
# Entity Statistics
entity_stats: %{
total_entities: 4,
by_type: %{"ORG" => 1, "LOC" => 2, "PER" => 1, "MISC" => 0}
},
# Performance
total_latency_ms: 663,
successful_models: 4
}
Phase 2 Preview (not yet implemented):
- "Tesla" β Match to Asset{symbol: "TSLA", name: "Tesla Inc"}
- Create ContentTarget{content_id: X, asset_id: Y, confidence: 0.95}
- Fetch & Import -
mix ingest.content --source truth_social --username USER --limit 100 - Content Storage - Posts stored in PostgreSQL
contentstable - Multi-Model Classification - Run
mix classify.contents --all --multi-model- 3 sentiment models analyze text (DistilBERT, Twitter-RoBERTa, FinBERT)
- Weighted consensus calculates final sentiment + confidence
- NER model extracts entities (ORG, LOC, PER, MISC)
- Results Storage - Classifications saved to
classificationstable - Entity Analysis - Entities stored in classification metadata
- Scheduler (Oban) - Poll Truth Social every 60 seconds
- Adapter - Fetch and normalize new posts
- PubSub - Broadcast `{:new_content, content}` events (see the sketch after this list)
- Auto-Classification - Trigger multi-model analysis on new content
- Asset Linking (Phase 2) - Match entities to assets, create ContentTargets
- Strategy Engine (Phase 3) - Generate trade recommendations
- Executor (Phase 4) - Place orders via Alpaca API
- Dashboard - Real-time monitoring via LiveView
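The broadcast and auto-classification steps above would follow the standard Phoenix.PubSub pattern. A sketch of a listener that picks up `{:new_content, content}` events and hands them off for classification; the `"contents"` topic, `VolfefeMachine.PubSub` name, and `Classifier` module are assumptions about the planned implementation:

```elixir
defmodule AutoClassifyListenerExample do
  use GenServer

  def start_link(opts), do: GenServer.start_link(__MODULE__, opts, name: __MODULE__)

  @impl true
  def init(state) do
    # Subscribe to the topic the ingestion pipeline would broadcast on.
    Phoenix.PubSub.subscribe(VolfefeMachine.PubSub, "contents")
    {:ok, state}
  end

  @impl true
  def handle_info({:new_content, content}, state) do
    # Kick off multi-model classification asynchronously for each new post.
    Task.start(fn -> VolfefeMachine.Classifier.classify(content) end)
    {:noreply, state}
  end
end
```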
This is currently a private project, but contributions are welcome! Please:
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
This software is for educational and research purposes only.
- Automated trading carries significant financial risk
- Past performance does not guarantee future results
- This system is not financial advice
- Use at your own risk
- Always start with paper trading
- Understand all risks before deploying real capital
By using this software, you acknowledge that you are solely responsible for any trading decisions and outcomes.
MIT License - see LICENSE for details
- Issue #43: Phase 1 NER Entity Extraction
- Issue #42: Phase 2 Asset Linking
- Issue #45: Content Data Seeding
Built with ❤️ and Elixir