
Commit 5149dce (parent 85b0632)

Update notes

File tree

2 files changed (+6, -1 lines changed)


notes/main.typ

Lines changed: 6 additions & 1 deletion
@@ -104,7 +104,7 @@ Issue of #emph("Observational Studies") titled #link("https://en.wikipedia.org/
 
 == Fun example
 
-=== on overparameterized models
+=== On overparameterized models
 
 From a comment by Phil on Andrew Gelman's post #link("https://statmodeling.stat.columbia.edu/2025/11/14/how-is-it-that-this-problem-with-its-21-data-points-is-so-much-easier-to-handle-with-1-predictor-than-with-16-predictors/")[`Impossible statistical problems`], November 14, 2025.
 
@@ -120,6 +120,11 @@ With only 10 data points and 7 predictors, there is still some room for analysis
 
 Therefore, in scenarios with extremely small sample sizes, an excess of irrelevant predictors can contaminate the data, rather than enriching it, and render meaningful analysis impossible.
 
+But so far this is only an empirical observation; can we explain the phenomenon theoretically?
+
+#question("Theoretical explanation for overparameterized models with small sample size")[For sample size $n = 200$, outcome $Y in RR$, and predictors $X in RR^(p)$ with $p / n = c in (0, infinity)$, suppose $Y$ is independent of $X$. Under what conditions can a machine learning algorithm still appear to predict $Y$ well from $X$?]
+
+Well, this is just the global null hypothesis testing problem in a high-dimensional generalized linear model.
 
 = On the indistinguishability or identifiability of statistical models

static/notes/notes.pdf (3.87 KB): binary file not shown.
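The phenomenon the question in the diff above asks about can be seen in a quick simulation (a sketch of my own, not from the notes; the values of `n` and `c` are illustrative): even when $Y$ is independent of $X$, ordinary least squares with $p/n = c$ attains in-sample $R^2$ close to $c$ under the global null, so apparent in-sample fit alone tells us nothing in this regime.

```python
# Sketch: under the global null (Y independent of X) with p/n = c,
# OLS still "explains" roughly a fraction c of the in-sample variance.
import numpy as np

rng = np.random.default_rng(0)
n, c = 200, 0.5              # sample size and aspect ratio p/n (illustrative)
p = int(c * n)               # 100 predictors

X = rng.standard_normal((n, p))
y = rng.standard_normal(n)   # outcome drawn independently of X

# Least-squares fit and in-sample R^2
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
rss = np.sum((y - X @ beta) ** 2)
tss = np.sum((y - y.mean()) ** 2)
r2 = 1 - rss / tss
print(f"in-sample R^2 under the null: {r2:.2f}")  # close to c = 0.5
```

Out-of-sample evaluation on fresh draws of `X` and `y` would, of course, show no predictive ability, which is what makes the in-sample fit spurious.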
