validation

optimizer.validation

Model selection and cross-validation for portfolio backtesting.

Includes Walk-Forward backtesting, Combinatorial Purged Cross-Validation (CPCV), and Multiple Randomized Cross-Validation.

CPCVConfig dataclass

Configuration for skfolio.model_selection.CombinatorialPurgedCV.

Generates a population of backtest paths from all combinatorial selections of test folds, with purging and embargoing to prevent information leakage.

Parameters

n_folds : int
    Number of non-overlapping temporal blocks.
n_test_folds : int
    Number of blocks assigned to the test set in each combination.
purged_size : int
    Number of observations excised on each side of the train-test boundary.
embargo_size : int
    Number of observations embargoed immediately following each test block to avoid autocorrelation contamination.
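As a rough illustration of how purged_size and embargo_size shrink the training set around a single test block, the sketch below uses plain Python only. The helper name purged_train_indices and its arguments are hypothetical and are not part of this module or of skfolio; the exact boundary handling inside CombinatorialPurgedCV may differ.

```python
def purged_train_indices(n_obs, test_start, test_stop, purged_size=0, embargo_size=0):
    """Return the training indices left around one test block.

    The test block covers the half-open range [test_start, test_stop).
    Hypothetical helper for illustration only.
    """
    # Purge: drop the last `purged_size` observations before the test block.
    left_stop = max(0, test_start - purged_size)
    # Purge plus embargo: drop `purged_size + embargo_size` observations after it.
    right_start = min(n_obs, test_stop + purged_size + embargo_size)
    return list(range(left_stop)) + list(range(right_start, n_obs))


train = purged_train_indices(
    n_obs=20, test_start=8, test_stop=12, purged_size=2, embargo_size=1
)
print(train)
```

Observations 6 and 7 are purged before the test block, and observations 12 to 14 are removed after it (two purged, one embargoed).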

for_statistical_testing() classmethod

High-path-count configuration for significance testing.

Uses C(12, 2) = 66 train/test splits with 10 training folds per split. These splits recombine into 66 × 2 / 12 = 11 backtest paths, providing high statistical power for backtest overfitting tests.
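The split and path counts follow from standard CPCV combinatorics: choosing n_test_folds out of n_folds gives C(n_folds, n_test_folds) splits, and each fold appears in the test set of a fraction n_test_folds / n_folds of them. A minimal sketch, where cpcv_counts is a hypothetical helper and not part of this module:

```python
from math import comb


def cpcv_counts(n_folds, n_test_folds):
    """Return (n_splits, n_test_paths) for a CPCV layout."""
    # Every way of picking the test folds is one train/test split.
    n_splits = comb(n_folds, n_test_folds)
    # Each fold is tested in n_splits * n_test_folds / n_folds splits,
    # which is the number of complete backtest paths (always an integer).
    n_test_paths = n_splits * n_test_folds // n_folds
    return n_splits, n_test_paths


print(cpcv_counts(12, 2))  # 12 folds, 2 test folds
print(cpcv_counts(10, 8))  # 10 folds, 8 test folds
```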

for_small_sample() classmethod

Fewer folds for shorter time series.

MultipleRandomizedCVConfig dataclass

Configuration for skfolio.model_selection.MultipleRandomizedCV.

Dual randomisation across temporal windows and asset subsets to test robustness of the strategy to both dimensions.

Parameters

walk_forward_config : WalkForwardConfig
    Inner walk-forward configuration for temporal splitting.
n_subsamples : int
    Number of random trials.
asset_subset_size : int
    Number of assets drawn per trial.
window_size : int or None
    Length of the random temporal window drawn per trial. None uses the full sample.
random_state : int or None
    Seed for reproducibility.
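The dual randomisation can be pictured as drawing, for each trial, a random asset subset and a random contiguous time window, and then running the inner walk-forward on that slice. The sketch below shows only the drawing step with the standard library. The function draw_trials, its n_obs and asset_names arguments, and the sampling scheme are assumptions for illustration; skfolio's own sampling is not reproduced here.

```python
import random


def draw_trials(n_obs, asset_names, n_subsamples, asset_subset_size,
                window_size=None, random_state=None):
    """Draw (assets, (start, stop)) pairs, one per random trial.

    Hypothetical helper: the window is the half-open range [start, stop).
    """
    rng = random.Random(random_state)  # seeded for reproducibility
    trials = []
    for _ in range(n_subsamples):
        # Randomise the asset dimension.
        assets = sorted(rng.sample(list(asset_names), asset_subset_size))
        # Randomise the temporal dimension, or keep the full sample.
        if window_size is None:
            start, stop = 0, n_obs
        else:
            start = rng.randint(0, n_obs - window_size)
            stop = start + window_size
        trials.append((assets, (start, stop)))
    return trials


for assets, window in draw_trials(
    n_obs=1000, asset_names=["A", "B", "C", "D", "E"],
    n_subsamples=3, asset_subset_size=2, window_size=250, random_state=42,
):
    print(assets, window)
```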

for_robustness_check(n_subsamples=20, asset_subset_size=10) classmethod

Standard robustness check with 20 trials.

WalkForwardConfig dataclass

Immutable configuration for skfolio.model_selection.WalkForward.

Walk-forward backtesting partitions time series into successive train/test windows that respect the causal arrow of time.

Parameters

test_size : int
    Number of observations in each test window.
train_size : int
    Number of observations in each training window. When expend_train is True, this is the initial training window size.
purged_size : int
    Number of observations purged between train and test windows to prevent look-ahead bias.
expend_train : bool
    When True, the training window expands as new data arrives (expanding window). When False, the training window rolls forward (rolling window).
reduce_test : bool
    When True, the last test window may be shorter than test_size to avoid discarding data.
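To make the interaction of these parameters concrete, here is a minimal pure-Python sketch of integer-indexed walk-forward splitting. The generator walk_forward_splits is a hypothetical illustration written for this page. It mirrors the parameter names above but is not skfolio's implementation, which also supports calendar-based frequencies.

```python
def walk_forward_splits(n_obs, test_size, train_size, purged_size=0,
                        expend_train=False, reduce_test=False):
    """Yield (train_range, test_range) pairs in chronological order."""
    test_start = train_size + purged_size
    while test_start < n_obs:
        test_end = test_start + test_size
        if test_end > n_obs:
            if not reduce_test:
                break  # discard the incomplete final window
            test_end = n_obs  # keep a shorter final test window
        # The purged gap sits between the end of training and the test start.
        train_end = test_start - purged_size
        # Expanding windows always start at 0; rolling windows keep a fixed length.
        train_start = 0 if expend_train else train_end - train_size
        yield range(train_start, train_end), range(test_start, test_end)
        test_start = test_end


for train, test in walk_forward_splits(
    n_obs=11, test_size=3, train_size=4, purged_size=1, reduce_test=True
):
    print(list(train), list(test))
```

Every training index precedes every test index, so no future observation leaks into the fit.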

for_monthly_rolling() classmethod

Monthly test windows with one-year rolling training.

for_quarterly_rolling() classmethod

Quarterly test windows with one-year rolling training.

for_quarterly_expanding() classmethod

Quarterly test windows with expanding training.

build_cpcv(config=None)

Build a skfolio CombinatorialPurgedCV cross-validator from config.

Parameters

config : CPCVConfig or None
    CPCV configuration. Defaults to CPCVConfig() (10 folds, 8 test folds).

Returns

CombinatorialPurgedCV
    A skfolio combinatorial purged cross-validator.

build_multiple_randomized_cv(config=None)

Build a skfolio MultipleRandomizedCV cross-validator from config.

Parameters

config : MultipleRandomizedCVConfig or None
    Multiple randomised CV configuration. Defaults to MultipleRandomizedCVConfig().

Returns

MultipleRandomizedCV
    A skfolio multi-randomised cross-validator.

build_walk_forward(config=None)

Build a skfolio WalkForward cross-validator from config.

Parameters

config : WalkForwardConfig or None
    Walk-forward configuration. Defaults to WalkForwardConfig() (quarterly rolling with one-year training window).

Returns

WalkForward
    A skfolio temporal cross-validator.

compute_optimal_folds(n_observations, target_train_size, target_n_test_paths, weight_train_size=1.0, weight_n_test_paths=1.0)

Compute optimal fold counts for CPCV.

Wraps skfolio.model_selection.optimal_folds_number.

Parameters

n_observations : int
    Total number of observations.
target_train_size : int
    Desired training window size.
target_n_test_paths : int
    Desired number of backtest paths.
weight_train_size : float
    Relative importance of matching train size.
weight_n_test_paths : float
    Relative importance of matching path count.

Returns

tuple[int, int]
    Optimal (n_folds, n_test_folds) parameters.
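One way to picture the trade-off is a brute-force search that scores each (n_folds, n_test_folds) pair by its weighted relative distance from the two targets. The sketch below is an assumption-laden illustration: choose_folds, its max_folds bound, the search ranges, and the cost function are hypothetical and are not claimed to match the wrapped skfolio function.

```python
from math import comb


def choose_folds(n_observations, target_train_size, target_n_test_paths,
                 weight_train_size=1.0, weight_n_test_paths=1.0, max_folds=25):
    """Return the (n_folds, n_test_folds) pair with the lowest cost.

    Hypothetical helper; `max_folds` bounds the search and is an assumption.
    """
    best_pair, best_cost = None, float("inf")
    for n_folds in range(3, max_folds + 1):
        for n_test_folds in range(2, n_folds):
            # Average training size when n_test_folds blocks are held out.
            train_size = n_observations * (n_folds - n_test_folds) / n_folds
            # Number of complete backtest paths for this layout.
            n_paths = comb(n_folds, n_test_folds) * n_test_folds // n_folds
            cost = (
                weight_train_size
                * abs(train_size - target_train_size) / target_train_size
                + weight_n_test_paths
                * abs(n_paths - target_n_test_paths) / target_n_test_paths
            )
            if cost < best_cost:
                best_pair, best_cost = (n_folds, n_test_folds), cost
    return best_pair


print(choose_folds(n_observations=1200, target_train_size=1000,
                   target_n_test_paths=11))
```

Raising one weight relative to the other pulls the result toward layouts that match that target more closely.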

run_cross_val(estimator, X, *, cv=None, y=None, params=None, n_jobs=None, portfolio_params=None)

Run cross-validated prediction with a temporal cross-validator.

Thin wrapper around skfolio.model_selection.cross_val_predict that enforces temporal splitting (no random shuffle).

Parameters

estimator : BaseEstimator
    An unfitted skfolio optimisation estimator or pipeline, ready to be fitted on each training window.
X : array-like
    Return matrix (observations x assets).
cv : temporal cross-validator or None
    Cross-validator. Defaults to WalkForward with quarterly test windows.
y : array-like or None
    Benchmark returns or factor returns (for models that require fit(X, y)).
params : dict or None
    Auxiliary metadata forwarded to nested estimators via sklearn metadata routing (e.g. {"implied_vol": implied_vol_df}). Requires sklearn.set_config(enable_metadata_routing=True) and the relevant set_fit_request calls on sub-estimators.
n_jobs : int or None
    Number of parallel jobs.
portfolio_params : dict or None
    Additional parameters forwarded to the portfolio constructor.

Returns

MultiPeriodPortfolio or Population
    Out-of-sample portfolio predictions. WalkForward returns a MultiPeriodPortfolio; CombinatorialPurgedCV and MultipleRandomizedCV return a Population.