A Lightweight ALS Solver for Iterative GLS


When a GLS problem involves hundreds of equations, the $K × K$ covariance matrix becomes the computational bottleneck. A simple statistical remedy is to assume that most of the cross‑equation dependence can be captured by a handful of latent factors plus equation‑specific noise. This "low‑rank + diagonal" assumption slashes the number of unknowns from roughly $K^2$ to about $K × k$, where $k$ (the latent factor rank) is much smaller than $K$. The model alone, however, does not guarantee speed: we still have to fit the parameters.
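
For a sense of scale, with $K = 120$ equations and $k = 4$ factors an unrestricted covariance has $K(K+1)/2 = 7260$ free parameters, while the factor model needs only $Kk + K = 600$. A minimal NumPy sketch of the structure (illustrative only; these variable names are not part of the alsgls API):

import numpy as np

K, k = 120, 4
rng = np.random.default_rng(0)
F = rng.normal(size=(K, k))        # latent factor loadings, K x k
d = rng.uniform(0.5, 1.5, size=K)  # equation-specific noise variances
Sigma = F @ F.T + np.diag(d)       # "low-rank + diagonal" covariance

dense_params = K * (K + 1) // 2    # 7260 unknowns in an unrestricted Sigma
factor_params = K * k + K          # 600 unknowns under the factor model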

Installation

Install the library from PyPI:

pip install alsgls

For local development, clone the repo and use an editable install:

pip install -e .

Usage

from alsgls import ALSGLS, ALSGLSSystem, simulate_sur

Xs_tr, Y_tr, Xs_te, Y_te = simulate_sur(N_tr=240, N_te=120, K=60, p=3, k=4)

# Scikit-learn style estimator
est = ALSGLS(rank="auto", max_sweeps=12)
est.fit(Xs_tr, Y_tr)
test_score = est.score(Xs_te, Y_te)  # negative test NLL per observation

# Statsmodels-style system interface
system = {f"eq{j}": (Y_tr[:, j], Xs_tr[j]) for j in range(Y_tr.shape[1])}
sys_model = ALSGLSSystem(system, rank="auto")
sys_results = sys_model.fit()
params = sys_results.params_as_series()  # pandas optional

The benchmarks/compare_sur.py script contrasts ALS-GLS with the statsmodels and linearmodels implementations of seemingly unrelated regressions (SUR) on matched simulation grids, recording peak memory via Memray, Fil, or the POSIX RSS high-water mark.
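
Assuming the script needs no required arguments (check the file itself for options), a run from the repository root would look like:

python benchmarks/compare_sur.py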

Documentation and notebooks

Background material and reproducible experiments are available in the notebooks under als_sim/, such as als_sim/als_comparison.ipynb and als_sim/als_sur.ipynb.

Type-Safe ALS Solver

This package provides a modern, type-safe implementation of alternating least squares (ALS) for low-rank GLS problems. The Woodbury identity reduces the expensive K × K inverse to a tiny k × k system, and the β-update can be computed without explicitly forming dense matrices.
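
Concretely, for $\Sigma = FF^\top + D$ with diagonal $D$, Woodbury gives $\Sigma^{-1} = D^{-1} - D^{-1}F\,(I_k + F^\top D^{-1}F)^{-1}F^\top D^{-1}$, so $\Sigma^{-1}$ can be applied to residuals without ever materializing a K × K array. A self-contained NumPy sketch of the idea (not the package's internal code):

import numpy as np

def apply_sigma_inv(F, d, R):
    # Sigma^{-1} @ R for Sigma = F F^T + diag(d), via the Woodbury identity
    Dinv_R = R / d[:, None]                    # D^{-1} R
    Dinv_F = F / d[:, None]                    # D^{-1} F
    core = np.eye(F.shape[1]) + F.T @ Dinv_F   # tiny k x k system
    return Dinv_R - Dinv_F @ np.linalg.solve(core, F.T @ Dinv_R)

# Sanity check against the dense inverse at toy scale
rng = np.random.default_rng(0)
K, k = 60, 4
F = rng.normal(size=(K, k))
d = rng.uniform(0.5, 1.5, size=K)
R = rng.normal(size=(K, 5))
dense = np.linalg.solve(F @ F.T + np.diag(d), R)
assert np.allclose(apply_sigma_inv(F, d, R), dense)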

Key features in v1.0:

  • Full type safety with mypy compliance and comprehensive type hints
  • Numerically stable implementation using Cholesky factorization throughout
  • Clean API with single computational path and enhanced error messages
  • Memory efficient with O(Kk) storage, typically converging in 5–6 sweeps
  • Breaking changes for a cleaner, more maintainable codebase

Rule of thumb: if your GLS routine keeps looping between $\beta$ and a fresh $\hat{\Sigma}$, the ALS approach yields the same statistical fit with an order‑of‑magnitude smaller memory footprint and better numerical stability.
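
To make that loop concrete, here is a self-contained toy version of the β ↔ Σ̂ alternation for SUR. It is a sketch under simplifying assumptions, and toy_fgls_loop is a hypothetical helper, not part of the alsgls API: the Σ-step uses a one-shot eigendecomposition rather than the package's ALS factor updates, and the K × K precision is formed densely for readability, which is exactly the step alsgls replaces with Woodbury:

import numpy as np

def toy_fgls_loop(Xs, Y, k, sweeps=6, d_floor=1e-8):
    # Alternate a GLS beta-step with a "low-rank + diagonal" Sigma-step.
    N, K = Y.shape
    # OLS start, one equation at a time
    betas = [np.linalg.lstsq(X, Y[:, j], rcond=None)[0] for j, X in enumerate(Xs)]
    for _ in range(sweeps):
        # Sigma-step: fit F and d to the current residual covariance
        E = Y - np.column_stack([X @ b for X, b in zip(Xs, betas)])
        S = E.T @ E / N
        w, V = np.linalg.eigh(S)                          # eigenvalues ascending
        F = V[:, -k:] * np.sqrt(np.maximum(w[-k:], 0.0))  # top-k loadings
        d = np.maximum(np.diag(S) - (F ** 2).sum(axis=1), d_floor)
        W = np.linalg.inv(F @ F.T + np.diag(d))           # dense K x K: toy only
        # beta-step: solve the stacked SUR normal equations
        ps = [X.shape[1] for X in Xs]
        offs = np.concatenate(([0], np.cumsum(ps)))
        A = np.zeros((offs[-1], offs[-1]))
        rhs = np.zeros(offs[-1])
        for i in range(K):
            for j in range(K):
                A[offs[i]:offs[i+1], offs[j]:offs[j+1]] = W[i, j] * (Xs[i].T @ Xs[j])
            rhs[offs[i]:offs[i+1]] = Xs[i].T @ (Y @ W[:, i])
        coef = np.linalg.solve(A, rhs)
        betas = [coef[offs[i]:offs[i+1]] for i in range(K)]
    return betas, F, d

# e.g. with the simulated data from the Usage section:
# betas, F, d = toy_fgls_loop(Xs_tr, Y_tr, k=4)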

Beyond SUR: where the idea travels

Random‑effects models, feasible GLS with estimated heteroskedastic weights, optimal‑weight GMM, and spatial autoregressive GLS all iterate β ↔ Σ̂. Each can adopt the same ALS trick: treat the weight matrix as low‑rank + diagonal, invert only the k × k core, and avoid the dense K × K algebra. Memory savings in published examples range from 5× to 20×, depending on k.

A concrete case study: seemingly unrelated regressions

To demonstrate performance, we benchmark ALS against traditional methods with N = 300 observations, three regressors, rank‑3 factors, and K ranging from 50 to 120 equations. The largest array that traditional methods need is the dense Σ⁻¹ (K×K), whereas ALS's largest is the skinny factor matrix F (K×k).

| K   | β-RMSE (traditional) | β-RMSE (ALS) | Peak MB (traditional) | Peak MB (ALS) | Memory ratio |
|----:|---------------------:|-------------:|----------------------:|--------------:|-------------:|
| 50  | 0.021                | 0.021        | 0.020                 | 0.002         | 10×          |
| 80  | 0.020                | 0.020        | 0.051                 | 0.003         | 17×          |
| 120 | 0.020                | 0.020        | 0.115                 | 0.004         | 29×          |

The ALS implementation achieves the same statistical performance while using only a few megabytes of memory, providing substantial computational advantages for large systems.

Defaults, tuning knobs, and failure modes

  • Rank (k) – By default the high-level APIs pick min(8, ceil(K / 10)), a conservative fraction of the number of equations. Increase rank if the cross-equation correlation matrix is slow to decay; decrease it when the diagonal dominates.
  • ALS ridge terms (lam_F, lam_B) – Both default to 1e-3 for the latent-factor and regression updates; raise them slightly (e.g. 1e-2) if CG struggles to converge or the NLL trace plateaus early.
  • Noise floor (d_floor) – Keeps the diagonal component positive; the default 1e-8 protects against breakdowns when an equation is nearly deterministic. Increase it in highly ill-conditioned settings.
  • Stopping criteria – ALS stops when the relative drop in NLL per sweep is below 1e-6 (configurable via rel_tol) or after max_sweeps. Inspect info["nll_trace"] to diagnose stagnation.
  • Possible failures – Large condition numbers or nearly-collinear regressors can make the β-step CG solve slow; adjust cg_tol/cg_maxit, add stronger ridge, or re-scale predictors. If info["accept_t"] stays at zero and the NLL does not improve, the factor rank may be too large relative to the sample size. A sketch wiring these knobs together follows below.
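
Only rank and max_sweeps appear in the Usage example above; the commented-out keywords below are assumptions based on the knob names in this list, so check the signature of ALSGLS in your installed version:

from alsgls import ALSGLS, simulate_sur

Xs_tr, Y_tr, Xs_te, Y_te = simulate_sur(N_tr=240, N_te=120, K=60, p=3, k=4)

est = ALSGLS(
    rank=6,          # override the min(8, ceil(K / 10)) default
    max_sweeps=20,   # allow extra sweeps for slow-decaying correlation
    # lam_F=1e-2,    # assumed kwarg: stronger ridge on the factor update
    # lam_B=1e-2,    # assumed kwarg: stronger ridge on the regression update
    # d_floor=1e-6,  # assumed kwarg: higher noise floor for ill-conditioned data
    # rel_tol=1e-6,  # assumed kwarg: relative-NLL stopping tolerance
)
est.fit(Xs_tr, Y_tr)
print(est.score(Xs_te, Y_te))  # negative test NLL per observation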
