feat(kde): kernel functions with statistical properties and LOESS support #360
Add kernel functions with statistical properties and LOESS support
This PR prepares the foundation for KDE and LOESS implementations.
Summary
Major enhancement to the kernel module adding:
- `consts.rs`
- `kernel.rs` with AMISE efficiency metrics, dual evaluation modes, and LOESS integration

Changes to `consts.rs`
- `∫ u² K(u) du` for 9 kernels
- `∫ K(u)² du` for 9 kernels
- `SQRT_PI` constant

Changes to `kernel.rs`

1. Statistical Correctness
- `evaluate()`: Normalized for KDE (integrates to 1)
- `evaluate_weight()`: Unnormalized for LOESS (local regression weights)

2. New Kernels & Renaming
- `Quartic` → `Bisquare` (it's a more standard name)
- `Cosine` kernel (high efficiency ≈ 0.9995)
- `Logistic` kernel (heavy-tailed unbounded)
- `Sigmoid` kernel (hyperbolic secant)
- `Sigmoid` formula: was `exp(πx)`, now correctly `exp(x)`

3. Enhanced API
- `KernelType` enum: Runtime kernel selection for LOESS and other applications
- `CustomKernel`: User-defined kernels with metadata
- `evaluate_batch()`, `compute_distance_weights()`
- `robust_reweights()`, `normalize_weights()`
- `recommended_for_kde()`, `recommended_for_loess()`, `most_efficient()`

4. Boundary Behavior Fix
- `|x| <= 1` to `|x| >= 1` for consistency
- `(-1, 1)` as mathematically correct
- `x = ±1`

5. Documentation & Testing
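As an illustration of the normalization and boundary rules described in sections 1 and 4, here is a minimal sketch using the tricube formula `(70/81)(1-|x|³)³`. The `Tricube` struct, the method signatures, and the `integrate_density` helper are assumptions made for this example and are not taken from the PR's actual code.

```rust
/// Hypothetical stand-in for a tricube kernel type; not the PR's actual struct.
struct Tricube;

impl Tricube {
    /// Unnormalized LOESS weight: (1 - |x|^3)^3 on the open interval (-1, 1),
    /// and 0 everywhere else, including exactly at x = ±1.
    fn evaluate_weight(&self, x: f64) -> f64 {
        let a = x.abs();
        if a >= 1.0 {
            return 0.0;
        }
        (1.0 - a.powi(3)).powi(3)
    }

    /// Normalized KDE density: the weight scaled by 70/81 so it integrates to 1.
    fn evaluate(&self, x: f64) -> f64 {
        (70.0 / 81.0) * self.evaluate_weight(x)
    }
}

/// Midpoint-rule approximation of the integral of `evaluate` over [-1, 1].
/// Assumed helper, used only to check the normalization numerically.
fn integrate_density(kernel: &Tricube, steps: usize) -> f64 {
    let h = 2.0 / steps as f64;
    (0..steps)
        .map(|i| kernel.evaluate(-1.0 + (i as f64 + 0.5) * h) * h)
        .sum()
}
```

A test along these lines would check that both methods return 0 at `x = ±1`, that `evaluate_weight(0.0)` is 1, and that the numerical integral of `evaluate` is close to 1.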
Breaking Changes
- `|x| = 1` now returns 0 (was non-zero)
- `(70/81)(1-|x|³)³` (was unnormalized)

Fixed
- `distribution` (removed unused import `crate::distribution::internal::testing_boiler`).
- `Geometric::inverse_cdf` platform-dependent behavior on Windows:

Problem
The `test_inverse_cdf` test was failing on Windows with a mismatch between `1` and `2` for `inverse_cdf(0.0)`.

Root Cause
Floating-point precision differences across platforms caused `inverse_cdf(0.0)` to compute inconsistent results when using the formula `ceil(log(1-p) / log(1-self.p))`.

Solution
Added an explicit implementation of `inverse_cdf` for the `Geometric` distribution that handles edge cases consistently:
- `min()` (1) when input probability `p <= 0.0`
- `1` when distribution parameter `self.p == 1.0`
- `max()` (`u64::MAX`) when input probability `p >= 1.0`

This ensures consistent behavior across all platforms (macOS, Linux, Windows).
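The edge-case handling listed above could be sketched as follows. This is a minimal illustration, not the PR's implementation: the `Geometric` struct layout, the `min`/`max` helpers, the order of the checks, and the final clamp to `min()` are all assumptions.

```rust
/// Hypothetical minimal Geometric distribution with success probability `p`.
struct Geometric {
    p: f64,
}

impl Geometric {
    /// Smallest value in the support (assumed to be 1, as stated above).
    fn min(&self) -> u64 {
        1
    }

    /// Largest representable value in the support.
    fn max(&self) -> u64 {
        u64::MAX
    }

    /// Inverse CDF with explicit edge cases, so the boundary inputs never
    /// reach the floating-point formula.
    fn inverse_cdf(&self, p: f64) -> u64 {
        if p <= 0.0 {
            return self.min();
        }
        if p >= 1.0 {
            return self.max();
        }
        if self.p == 1.0 {
            return 1;
        }
        // General case: ceil(log(1 - p) / log(1 - self.p)).
        let k = ((1.0 - p).ln() / (1.0 - self.p).ln()).ceil();
        // Assumed extra guard: never return less than min(), even if the
        // ratio rounds to zero for a very small p.
        (k as u64).max(self.min())
    }
}
```

With the boundary inputs handled before any logarithm is taken, `inverse_cdf(0.0)` no longer depends on how a given platform rounds `log(1-p)`.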