Optimize classification thresholds for improved model performance.
Most classifiers output probabilities that must be turned into decisions with a threshold. The default 0.5 threshold is often suboptimal for real objectives like F1, precision/recall, or business costs. This library finds optimal thresholds using efficient O(n log n) algorithms, and the resulting improvements are often largest on imbalanced datasets.
# Default 0.5 threshold
y_pred = (model.predict_proba(X)[:, 1] >= 0.5).astype(int)
# F1 Score: 0.654
# Optimized threshold
from optimal_cutoffs import optimize_thresholds
result = optimize_thresholds(y_true, y_scores, metric="f1")
y_pred = result.predict(y_scores_test)
# F1 Score: 0.891 (improvement depends on dataset characteristics)

Key considerations: The 0.5 threshold assumes equal costs and balanced classes. Many real-world problems have imbalanced data (e.g., fraud detection: 1% positive rate) and asymmetric costs (e.g., false negatives may be more costly than false positives). In such cases, optimal thresholds often differ significantly from 0.5.
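For instance, if a missed positive costs 10x a false alarm, the cost break-even point under calibrated probabilities sits far below 0.5. A quick illustrative sketch (the costs here are made up, not part of the library):

# Predict positive when the expected cost of a miss exceeds that of a false alarm:
# p * cost_fn > (1 - p) * cost_fp  =>  p > cost_fp / (cost_fp + cost_fn)
cost_fp, cost_fn = 1.0, 10.0   # made-up costs: a miss is 10x worse than a false alarm
print(f"Cost break-even threshold: {cost_fp / (cost_fp + cost_fn):.3f}")   # 0.091, far below 0.5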
pip install optimal-classification-cutoffs

Optional Performance Boost:
# For algorithmic speedups with optional Numba acceleration
pip install optimal-classification-cutoffs[performance]
# For Jupyter examples and visualizations
pip install optimal-classification-cutoffs[examples]

Python 3.14+ Support: The package works on all Python versions 3.12+, including cutting-edge Python 3.14. Numba acceleration is optional and will automatically fall back to pure Python when unavailable.
from optimal_cutoffs import optimize_thresholds
# Your existing model probabilities
y_scores = model.predict_proba(X_test)[:, 1]
# Find optimal threshold (exact solution, O(n log n))
result = optimize_thresholds(y_true, y_scores, metric="f1")
print(f"Optimal threshold: {result.threshold:.3f}") # May differ from 0.5
print(f"Expected F1: {result.scores[0]:.3f}")
# Make optimal predictions
y_pred = result.predict(y_scores_new)

import numpy as np
from optimal_cutoffs import optimize_thresholds
# Multiclass probabilities (n_samples, n_classes)
y_scores = model.predict_proba(X_test)
# Automatically detects multiclass, optimizes per-class thresholds
result = optimize_thresholds(y_true, y_scores, metric="f1")
print(f"Per-class thresholds: {result.thresholds}")
print(f"Task detected: {result.task.value}") # "multiclass"
print(f"Method used: {result.method}") # "coord_ascent"
# Predictions use optimal thresholds
y_pred = result.predict(y_scores_new)

from optimal_cutoffs import optimize_decisions
# Cost matrix: rows=true class, cols=predicted class
# False negatives cost 10x more than false positives
cost_matrix = [[0, 1], [10, 0]]
result = optimize_decisions(y_probs, cost_matrix)
y_pred = result.predict(y_probs_new)  # Bayes-optimal decisions

API Overview:
from optimal_cutoffs import optimize_thresholds, optimize_decisions
# For threshold-based optimization (F1, precision, recall, etc.)
result = optimize_thresholds(y_true, y_scores, metric="f1")
# For cost matrix optimization (no thresholds)
result = optimize_decisions(y_probs, cost_matrix)

from optimal_cutoffs import metrics, bayes, cv, algorithms
# Custom metrics
custom_f2 = lambda tp, tn, fp, fn: (5*tp) / (5*tp + 4*fn + fp)
metrics.register("f2", custom_f2)
# Cross-validation with threshold tuning
thresholds = cv.cross_validate(model, X, y, metric="f1")
# Advanced algorithms
result = algorithms.multiclass.coordinate_ascent(y_true, y_scores)

Everything is explainable. The library tells you what it detected and why:
result = optimize_thresholds(y_true, y_scores) # All defaults
print(f"Task: {result.task.value}") # "binary" (auto-detected)
print(f"Method: {result.method}") # "sort_scan" (O(n log n))
print(f"Notes: {result.notes}") # ["Detected binary task...", "Selected sort_scan for O(n log n) optimization..."]Most metrics (F1, precision, recall) are piecewise-constant in threshold τ. Sorting scores once enables exact optimization in O(n log n) time.
Under calibrated probabilities, the optimal binary threshold has a closed form:
τ* = cost_fp / (cost_fp + cost_fn)
It is independent of class priors and depends only on the cost ratio.
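A quick numerical sanity check of this formula using synthetic calibrated scores (an illustrative sketch, not library code; the uniform score distribution is an arbitrary choice):

import numpy as np

rng = np.random.default_rng(0)
cost_fp, cost_fn = 1.0, 10.0

p = rng.uniform(size=100_000)   # calibrated scores: p is the true P(y = 1)

# Expected cost of "predict positive iff p >= tau": flagged negatives cost cost_fp * (1 - p),
# missed positives cost cost_fn * p.
taus = np.linspace(0.01, 0.99, 99)
expected_cost = [np.where(p >= t, cost_fp * (1 - p), cost_fn * p).mean() for t in taus]

print(f"Empirical minimizer: {taus[np.argmin(expected_cost)]:.2f}")   # ~0.09
print(f"Closed form:         {cost_fp / (cost_fp + cost_fn):.3f}")    # 1/11 ≈ 0.091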
- One-vs-Rest: Independent per-class thresholds (macro averaging)
- Coordinate Ascent: Coupled thresholds for single-label consistency
- General Costs: Skip thresholds, apply the Bayes rule on probability vectors (sketched below)
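For the general-cost case, the Bayes rule simply picks, for each sample, the predicted class with the lowest expected cost under its probability vector. A minimal sketch of that rule (illustrative only; `bayes_decisions` and the example probabilities are not part of the library's API):

import numpy as np

def bayes_decisions(y_probs, cost_matrix):
    # cost_matrix[i][j] = cost of predicting class j when the true class is i.
    expected_cost = np.asarray(y_probs) @ np.asarray(cost_matrix)   # (n_samples, n_classes)
    return expected_cost.argmin(axis=1)                             # cheapest prediction per sample

# Binary example with false negatives 10x as costly as false positives:
probs = np.array([[0.95, 0.05], [0.60, 0.40]])     # columns: P(class 0), P(class 1)
print(bayes_decisions(probs, [[0, 1], [10, 0]]))   # [0 1] -- a 0.40 risk is already enough to flag class 1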
- O(n log n) exact optimization for piecewise metrics
- O(1) closed-form solutions for cost-sensitive objectives
- Optional Numba acceleration with automatic pure Python fallback
The O(n log n) sort-scan algorithm provides significant speedups compared to naive approaches (see the timing sketch after this list):
- 50-300× faster than grid search (101 points)
- 1000-8000× faster than exhaustive unique threshold evaluation
- Performance scales as O(n log n) vs O(n·k) for naive methods
- Exact solutions guaranteed for piecewise-constant metrics
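A rough way to check these figures on your own data (an illustrative benchmark sketch; the synthetic data, sample size, and timing code are arbitrary choices, and actual speedups depend on n and hardware):

import time
import numpy as np
from sklearn.metrics import f1_score
from optimal_cutoffs import optimize_thresholds

rng = np.random.default_rng(0)
n = 10_000
y_true = rng.binomial(1, 0.1, size=n)
y_scores = np.clip(rng.uniform(size=n) + 0.2 * y_true, 0, 1)   # weakly informative scores

# Naive baseline: evaluate F1 at every unique score (roughly O(n^2) work)
start = time.perf_counter()
best_naive = max(np.unique(y_scores), key=lambda t: f1_score(y_true, (y_scores >= t).astype(int)))
naive_time = time.perf_counter() - start

# Sort-scan optimization
start = time.perf_counter()
result = optimize_thresholds(y_true, y_scores, metric="f1")
scan_time = time.perf_counter() - start

print(f"naive: {naive_time:.2f}s  sort-scan: {scan_time:.4f}s  speedup: ~{naive_time / scan_time:.0f}x")
print(f"F1 at naive threshold:     {f1_score(y_true, (y_scores >= best_naive).astype(int)):.3f}")
print(f"F1 at sort-scan threshold: {f1_score(y_true, (y_scores >= result.threshold).astype(int)):.3f}")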
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score
from optimal_cutoffs import optimize_thresholds
# Realistic imbalanced dataset (like fraud detection)
X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, stratify=y, random_state=42)
# Train any classifier
model = RandomForestClassifier(random_state=42)
model.fit(X_train, y_train)
y_scores = model.predict_proba(X_test)[:, 1]
# ❌ Default threshold
y_pred_default = (y_scores >= 0.5).astype(int)
f1_default = f1_score(y_test, y_pred_default)
print(f"Default F1: {f1_default:.3f}") # ~0.65
# ✅ Optimal threshold
result = optimize_thresholds(y_test, y_scores, metric="f1")
y_pred_optimal = result.predict(y_scores)
f1_optimal = f1_score(y_test, y_pred_optimal)
print(f"Optimal F1: {f1_optimal:.3f}") # ~0.89
improvement = (f1_optimal - f1_default) / f1_default * 100
print(f"Improvement: {improvement:+.1f}%") # ~+40%Perfect for:
- Imbalanced classification (fraud, medical, spam)
- Cost-sensitive decisions (business impact)
- Performance-critical applications (exact solutions)
- Research requiring theoretical optimality
Not needed for:
- Perfectly balanced classes with symmetric costs
- Problems requiring probabilistic outputs
- Uncalibrated models (calibrate first; see the sketch below)
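Calibration can be done with scikit-learn before threshold tuning. A hedged sketch, reusing the `X_train`/`y_train`/`X_test`/`y_test` split from the complete example above (the isotonic method and `cv=5` are illustrative choices, not library requirements):

from sklearn.calibration import CalibratedClassifierCV
from sklearn.ensemble import RandomForestClassifier
from optimal_cutoffs import optimize_thresholds

# Calibrate probabilities first (cross-validated isotonic regression), then tune the threshold
calibrated = CalibratedClassifierCV(RandomForestClassifier(random_state=42), method="isotonic", cv=5)
calibrated.fit(X_train, y_train)

y_scores_cal = calibrated.predict_proba(X_test)[:, 1]
result = optimize_thresholds(y_test, y_scores_cal, metric="f1")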
from optimal_cutoffs import cv
# Cross-validation for threshold selection
scores = cv.cross_validate(
    model, X, y,
    metric="f1",
    cv=5,
    return_thresholds=True
)

from optimal_cutoffs import metrics
# Register custom Fβ score
def f_beta(tp, tn, fp, fn, beta=2.0):
    return (1 + beta**2) * tp / ((1 + beta**2) * tp + beta**2 * fn + fp)
metrics.register("f2", lambda tp, tn, fp, fn: f_beta(tp, tn, fp, fn, 2.0))
# Use like any built-in metric
result = optimize_thresholds(y_true, y_scores, metric="f2")

# Optimize different metrics
f1_result = optimize_thresholds(y_true, y_scores, metric="f1")
precision_result = optimize_thresholds(y_true, y_scores, metric="precision")
print(f"F1 optimal τ: {f1_result.threshold:.3f}")
print(f"Precision optimal τ: {precision_result.threshold:.3f}")- Lipton et al. (2014) Optimal Thresholding of Classifiers to Maximize F1
- Elkan (2001) The Foundations of Cost-Sensitive Learning
- Dinkelbach (1967) Nonlinear Fractional Programming
- Platt (1999) Probabilistic Outputs for Support Vector Machines