Skip to content

Conversation

@shashank524
Copy link
Collaborator

No description provided.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request adds a custom non-convex optimizer (PSGD - Perturbed Stochastic Gradient Descent) to the finmath library and enhances existing time series functions with NumPy/Pandas array support.

Key changes:

  • Implements PSGD optimizer with EMA smoothing, gradient clipping, and perturbation-based saddle point escape
  • Adds NumPy/Pandas array support for time series functions (SMA, RSI, EMA, rolling volatility) alongside existing list-based implementations
  • Provides comprehensive test coverage for both C++ and Python implementations of time series functions

Reviewed changes

Copilot reviewed 21 out of 24 changed files in this pull request and generated 26 comments.

Show a summary per file
File Description
src/cpp/Optimization/psgd.cpp Implements the PSGD algorithm with parameter validation, EMA gradient smoothing, and perturbation logic
src/cpp/Optimization/linalg_helpers.cpp Provides linear algebra utilities for PSGD including vector operations, norm calculations, and uniform ball sampling
include/finmath/Optimization/psgd.h Header defining the PSGD function interface with detailed documentation
include/finmath/Optimization/linalg_helpers.h Header for linear algebra helper functions
src/cpp/TimeSeries/simple_moving_average.cpp Adds NumPy array support and improves error handling with exceptions
src/cpp/TimeSeries/rsi.cpp Adds NumPy array support for RSI calculation with bug fixes in the NumPy version
src/cpp/TimeSeries/rolling_volatility.cpp Adds NumPy array support with enhanced input validation
src/cpp/TimeSeries/ema.cpp Adds NumPy array support for both window-based and smoothing-factor-based EMA
include/finmath/TimeSeries/*.h Updates headers to declare NumPy overloads for time series functions
src/python_bindings.cpp Binds PSGD optimizer and adds overloaded bindings for NumPy/Pandas time series functions
test/TimeSeries//Python/.py Comprehensive Python tests for time series functions with list, NumPy, and Pandas inputs
test/TimeSeries//C++/.cpp C++ unit tests for time series functions
test/Optimization/Python/test_psgd.py Python tests for PSGD optimizer including convergence and parameter validation tests
CMakeLists.txt Updates build configuration to include PSGD sources and reorganizes test structure
.gitignore Adds Python build artifacts to ignore list

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.


if (smoothing_factor <= 0 || smoothing_factor >= 1)
{
throw std::runtime_error("EMA smoothing factor must be between 0 and 1 (exclusive).");
Copy link

Copilot AI Dec 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The validation message states "EMA smoothing factor must be between 0 and 1" but the actual validation checks for <= 0 || >= 1, which means values of exactly 0 and 1 are rejected. The error message should be more precise: "EMA smoothing factor must be between 0 and 1 (exclusive)" to match the actual validation logic.

Copilot uses AI. Check for mistakes.
if (ell <= 0)
throw std::invalid_argument("Smoothness parameter ell must be positive.");
if (rho <= 0)
throw std::invalid_argument("Hessian Lipschitz parameter rho must be positive.");
Copy link

Copilot AI Dec 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error message validation for rho states "Hessian Lipschitz parameter rho must be positive" but the validation actually checks rho <= 0, which would also reject exactly 0. If 0 is a valid value (though mathematically unlikely for a Lipschitz parameter), the message should say ">= 0" or "non-negative". If 0 is not valid, the message is correct but the comment should clarify that strictly positive means > 0.

Suggested change
throw std::invalid_argument("Hessian Lipschitz parameter rho must be positive.");
throw std::invalid_argument("Hessian Lipschitz parameter rho must be strictly positive (> 0).");

Copilot uses AI. Check for mistakes.
Comment on lines +61 to +65
if (num_data < window_size)
{
// Return empty vector if not enough data for one window
// Alternatively, throw: throw std::runtime_error("Data size is smaller than the window size.");
return {};
Copy link

Copilot AI Dec 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inconsistent handling of insufficient data between list and NumPy versions. The list version (line 27) returns an empty vector when data.size() < window_size, while line 64 comments suggest throwing is an alternative. Both approaches should be consistently documented, and the rationale for the chosen behavior should be clear.

Copilot uses AI. Check for mistakes.
Comment on lines +39 to 53
# add_executable(black_scholes_test test/OptionPricing/black_scholes_test.cpp)
# target_link_libraries(black_scholes_test finmath_library)
#
# add_executable(binomial_option_pricing_test test/OptionPricing/binomial_option_pricing_test.cpp)
# target_link_libraries(binomial_option_pricing_test finmath_library)
#
# add_executable(compound_interest_test test/InterestAndAnnuities/compound_interest_test.cpp)
# target_link_libraries(compound_interest_test finmath_library)
#
# add_executable(rsi_test test/TimeSeries/rsi_test.cpp) # This was the problematic one
# target_link_libraries(rsi_test finmath_library)
#
# # Test runner (can be removed if using ctest directly)
# add_executable(run_all_tests test/test_runner.cpp)

Copy link

Copilot AI Dec 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] Large blocks of commented-out code (lines 39-52) should be removed rather than left in the codebase. If these test executables are temporarily disabled, a comment should explain why and when they'll be re-enabled. Otherwise, remove the commented code to improve maintainability.

Suggested change
# add_executable(black_scholes_test test/OptionPricing/black_scholes_test.cpp)
# target_link_libraries(black_scholes_test finmath_library)
#
# add_executable(binomial_option_pricing_test test/OptionPricing/binomial_option_pricing_test.cpp)
# target_link_libraries(binomial_option_pricing_test finmath_library)
#
# add_executable(compound_interest_test test/InterestAndAnnuities/compound_interest_test.cpp)
# target_link_libraries(compound_interest_test finmath_library)
#
# add_executable(rsi_test test/TimeSeries/rsi_test.cpp) # This was the problematic one
# target_link_libraries(rsi_test finmath_library)
#
# # Test runner (can be removed if using ctest directly)
# add_executable(run_all_tests test/test_runner.cpp)

Copilot uses AI. Check for mistakes.
Comment on lines +45 to +47
// Return empty vector if not enough data
// Could also throw: throw std::runtime_error("Insufficient data for the given window size.");
return {};
Copy link

Copilot AI Dec 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] The comment on lines 46-47 says "Could also throw" but the code returns an empty vector. Similar to the simple_moving_average implementation, this inconsistency should be resolved - either document why returning empty is preferred, or choose one approach consistently across all functions.

Suggested change
// Return empty vector if not enough data
// Could also throw: throw std::runtime_error("Insufficient data for the given window size.");
return {};
throw std::runtime_error("Insufficient data for the given window size.");

Copilot uses AI. Check for mistakes.
Comment on lines +124 to +128
// This check might be redundant now given the earlier checks, but keep for safety
if (log_returns.size() < window_size)
{
throw std::runtime_error("Window size is larger than the number of log returns.");
}
Copy link

Copilot AI Dec 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] The comment on line 124 "This check might be redundant now given the earlier checks, but keep for safety" indicates potential code duplication. If the check is truly redundant due to earlier validation (lines 100-103), it should be removed. If it serves a distinct purpose, the comment should explain what edge case it's protecting against.

Suggested change
// This check might be redundant now given the earlier checks, but keep for safety
if (log_returns.size() < window_size)
{
throw std::runtime_error("Window size is larger than the number of log returns.");
}

Copilot uses AI. Check for mistakes.
m.doc() = "Financial Math Library";
PYBIND11_MODULE(finmath, m)
{
m.doc() = "Financial Math Library";
Copy link

Copilot AI Dec 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] Inconsistent formatting: Lines 17-19 have different indentation (6 spaces) compared to the rest of the code which uses standard indentation. The opening brace and doc string should follow the same indentation pattern as the rest of the file.

Suggested change
m.doc() = "Financial Math Library";
m.doc() = "Financial Math Library";

Copilot uses AI. Check for mistakes.
Comment on lines +29 to +38
* @param delta Confidence parameter (probability of failure).
* @param batch_size Mini-batch size used by stochastic_grad (influences g_thresh).
* @param step_size_coeff Coefficient 'c' to calculate step size eta = c / ell.
* @param ema_beta Decay factor for the exponential moving average of the gradient (typically 0.8-0.95).
* @param max_iters Maximum number of iterations to perform.
* @param grad_clip_norm Maximum L2 norm for the raw stochastic gradient (g_max). Set <= 0 to disable.
* @param param_clip_norm Maximum L2 norm for the parameter vector x (x_max). Set <= 0 to disable.
* @return The final optimized parameter vector x.
*
* @throws std::invalid_argument if input parameters are inconsistent (e.g., ell <= 0, rho <= 0, eps <= 0, sigma < 0, 0 < delta < 1, batch_size <= 0, c <= 0, 0 <= ema_beta < 1).
Copy link

Copilot AI Dec 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The header documentation states that delta is the "Confidence parameter (probability of failure)" and mentions "0 < delta < 1" in the @throws section. However, these descriptions are contradictory. If delta represents a "probability of failure", it would typically be expected to be in (0,1). The description should be clarified - is delta meant to represent confidence level (1-probability_of_failure) or probability of failure?

Copilot uses AI. Check for mistakes.
Comment on lines +63 to +64
// Return empty vector if not enough data for one window
// Alternatively, throw: throw std::runtime_error("Data size is smaller than the window size.");
Copy link

Copilot AI Dec 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] The comment on lines 63-64 mentions "Alternatively, throw" but the code is already returning an empty vector. This comment should either be removed or clarified. If throwing an exception is a viable alternative, document why the current approach (returning empty vector) was chosen.

Suggested change
// Return empty vector if not enough data for one window
// Alternatively, throw: throw std::runtime_error("Data size is smaller than the window size.");
// Return empty vector if not enough data for one window.
// Chosen for consistency with the non-Numpy version, which also returns an empty vector in this case.

Copilot uses AI. Check for mistakes.
Comment on lines +48 to +49
// Option 2: Return empty vector
// return {};
Copy link

Copilot AI Dec 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] The comments on lines 45-49 discuss two options for handling insufficient data but don't explain the choice. Lines 46-47 state "Cannot compute volatility if not enough log returns for the window" followed by "Option 1: Throw error" (which is implemented) and a commented "Option 2: Return empty vector". The commented alternative should be removed for clarity.

Suggested change
// Option 2: Return empty vector
// return {};

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants