pipeline¶

`optimizer.pipeline` ¶

End-to-end portfolio pipeline orchestration.

Composes pre-selection, optimisation, validation, scoring, hyperparameter tuning, and rebalancing into a single workflow.

`PortfolioResult` `dataclass` ¶

Container for the output of a full portfolio optimisation run.

Attributes¶

weights : pd.Series
    Final asset weights (ticker → weight).
portfolio : object
    Skfolio Portfolio from predict() on the full dataset. Exposes .sharpe_ratio, .sortino_ratio, .max_drawdown, .composition, etc.
backtest : object or None
    Out-of-sample MultiPeriodPortfolio (walk-forward) or Population (CPCV / MultipleRandomizedCV). None when backtesting was skipped.
pipeline : object
    The fitted sklearn Pipeline (pre-selection + optimiser). Can be reused for predict() on new data.
summary : dict[str, float]
    Key performance metrics extracted from the in-sample portfolio.
rebalance_needed : bool or None
    Whether the portfolio exceeds drift thresholds relative to previous_weights. None when no previous weights were provided.
turnover : float or None
    One-way turnover between previous_weights and the new weights. None when no previous weights were provided.
fx_decomposition : FxReturnDecomposition or None
    FX return decomposition when FxConfig.mode == DECOMPOSE.
currency : str or None
    Base currency used for FX conversion (e.g. "EUR").
net_returns : pd.Series or None
    Net backtest portfolio returns after transaction cost deduction. None when no backtest was run.
net_sharpe_ratio : float or None
    Annualized Sharpe ratio computed from net_returns. None when no backtest was run.
weight_history : pd.DataFrame or None
    Absolute portfolio weights at each walk-forward rebalancing date. Rows are rebalancing dates; columns are asset tickers. Compatible with compute_net_alpha(weights_history=...). None when no backtest was run.
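To make the net_sharpe_ratio attribute concrete, the sketch below annualizes a Sharpe ratio from a short series of per-period net returns. It is a dependency-free illustration, not the library's implementation: the function name `annualized_sharpe`, the 252-periods-per-year convention, and the use of the sample standard deviation are all assumptions.

```python
import math
import statistics


def annualized_sharpe(returns, periods_per_year=252, risk_free_rate=0.0):
    """Annualized Sharpe ratio from per-period returns (illustrative only).

    Assumes `returns` and `risk_free_rate` share the same period (e.g. daily).
    """
    excess = [r - risk_free_rate for r in returns]
    mean = statistics.fmean(excess)
    std = statistics.stdev(excess)  # sample standard deviation (ddof=1)
    if std == 0:
        return float("nan")  # undefined when returns never vary
    return mean / std * math.sqrt(periods_per_year)


# Hypothetical daily net returns after transaction costs.
net_returns = [0.004, -0.002, 0.003, 0.001, -0.001]
sharpe = annualized_sharpe(net_returns)
print(round(sharpe, 4))
```

With so few observations the figure is meaningless as a performance estimate; the example only shows the arithmetic.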

`build_portfolio_pipeline(optimizer, pre_selection_config=None, sector_mapping=None)` ¶

Compose a full sklearn Pipeline: pre-selection → optimiser.

The resulting pipeline is a single estimator for cross-validation and hyperparameter tuning. Pre-selection is performed within each CV fold, preventing data leakage.

Parameters¶

optimizer : BaseOptimization
    A skfolio optimiser (e.g. from build_mean_risk(), build_hrp(), etc.) used as the final pipeline estimator.
pre_selection_config : PreSelectionConfig or None
    Pre-selection configuration. None uses default settings.
sector_mapping : dict[str, str] or None
    Ticker → sector mapping for SectorImputer.

Returns¶

sklearn.pipeline.Pipeline
    An unfitted, ready-to-fit pipeline whose fit(X) cleans and filters returns and then optimises, and whose predict(X) produces a skfolio Portfolio.

Examples¶

    >>> from optimizer.optimization import MeanRiskConfig, build_mean_risk
    >>> from optimizer.pipeline import build_portfolio_pipeline
    >>> optimizer = build_mean_risk(MeanRiskConfig.for_max_sharpe())
    >>> pipeline = build_portfolio_pipeline(optimizer)
    >>> pipeline.fit(X)  # X = returns DataFrame
    >>> portfolio = pipeline.predict(X)
    >>> print(portfolio.sharpe_ratio)

`backtest(pipeline, X, *, cv_config=None, y=None, n_jobs=None)` ¶

Run a walk-forward backtest on a portfolio pipeline.

Parameters¶

pipeline : Pipeline
    An unfitted, ready-to-fit sklearn Pipeline (from build_portfolio_pipeline).
X : pd.DataFrame
    Return matrix (observations x assets).
cv_config : WalkForwardConfig or None
    Walk-forward configuration. Defaults to quarterly rolling.
y : pd.DataFrame or None
    Benchmark or factor returns for models that require fit(X, y).
n_jobs : int or None
    Number of parallel jobs.

Returns¶

MultiPeriodPortfolio or Population
    Out-of-sample portfolio predictions.
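As a rough picture of what a rolling walk-forward scheme does, the sketch below generates train/test index windows with plain Python. It does not use WalkForwardConfig or skfolio; the helper `walk_forward_splits` and the window sizes (252 observations for training, 63 for testing, roughly one year and one quarter of daily data) are assumptions chosen for illustration.

```python
def walk_forward_splits(n_obs, train_size, test_size):
    """Yield (train_range, test_range) index pairs for a rolling scheme.

    Each test window starts where its training window ends, so the model
    is always evaluated on observations it has not seen.
    """
    start = 0
    while start + train_size + test_size <= n_obs:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        start += test_size  # roll forward by one test window


# Hypothetical sizes: two years of daily data, one-year training window,
# one-quarter test window.
splits = list(walk_forward_splits(n_obs=504, train_size=252, test_size=63))
for train, test in splits:
    print(f"train [{train.start}, {train.stop})  test [{test.start}, {test.stop})")
```

Stitching the per-window test predictions together gives a single out-of-sample return path, which is the role the MultiPeriodPortfolio plays in the documented return value.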

`compute_net_backtest_returns(gross_returns, weight_changes, cost_bps=10.0)` ¶

Deduct proportional transaction costs from gross backtest returns.

For each date with weight changes, the one-way turnover (half the sum of absolute weight deltas, consistent with compute_turnover()) is multiplied by cost_bps / 10_000 and subtracted from the gross return at that date. A shift of weight w from one asset to another incurs a cost of w * cost_bps / 10_000, not 2w.

Parameters¶

gross_returns : pd.Series
    Gross portfolio returns indexed by date.
weight_changes : pd.DataFrame
    Weight change matrix (dates x assets). Only dates present in this DataFrame incur transaction costs.
cost_bps : float
    Transaction cost in basis points (default 10 bps).

Returns¶

pd.Series
    Net returns with costs deducted.
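The cost rule described for compute_net_backtest_returns can be checked by hand. The sketch below reimplements the arithmetic with plain dictionaries instead of pandas objects; the helper name `net_backtest_returns`, the sample dates and tickers, and the choice to skip weight-change dates that have no gross return are assumptions, not the library's behaviour.

```python
def net_backtest_returns(gross_returns, weight_changes, cost_bps=10.0):
    """Deduct proportional costs from gross returns (illustrative only).

    gross_returns:  dict mapping date -> gross portfolio return
    weight_changes: dict mapping date -> {asset: weight delta}
    """
    net = dict(gross_returns)
    for date, deltas in weight_changes.items():
        if date not in net:
            continue  # assumption: ignore dates without a gross return
        # One-way turnover: half the sum of absolute weight deltas.
        one_way_turnover = 0.5 * sum(abs(d) for d in deltas.values())
        net[date] -= one_way_turnover * cost_bps / 10_000
    return net


gross = {"2024-03-29": 0.0120, "2024-04-01": -0.0030, "2024-04-02": 0.0050}
# Shift 20% of the portfolio from AAA to BBB on one rebalancing date.
changes = {"2024-04-01": {"AAA": -0.20, "BBB": 0.20}}
net = net_backtest_returns(gross, changes, cost_bps=10.0)
print(net["2024-04-01"])
```

Moving a weight of 0.20 between two assets gives a one-way turnover of 0.20, so at 10 bps the cost is 0.20 * 10 / 10_000 = 0.0002 and the return on that date drops from -0.0030 to -0.0032. Dates without weight changes are untouched.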

`optimize(pipeline, X, *, y=None)` ¶

Fit the pipeline on the full dataset and return the final weights.

Parameters¶

pipeline : Pipeline
    An unfitted, ready-to-fit sklearn Pipeline.
X : pd.DataFrame
    Return matrix (observations x assets).
y : pd.DataFrame or None
    Benchmark or factor returns.

Returns¶

PortfolioResult
    Weights, in-sample portfolio, and fitted pipeline.

`run_full_pipeline(prices, optimizer, *, pre_selection_config=None, sector_mapping=None, cv_config=None, previous_weights=None, rebalancing_config=None, current_date=None, last_review_date=None, y_prices=None, risk_free_rate=0.0, delisting_returns=None, fx_config=None, currency_map=None, fx_rates=None, benchmark_currency=None, cost_bps=10.0, n_jobs=None)` ¶

End-to-end: prices → validated weights + backtest + rebalancing.

This is the single entry point for producing a portfolio from raw price data. It:

1. Converts prices to linear returns.
   1b. Applies delisting returns (survivorship-bias correction).
2. Builds the full pipeline (pre-selection + optimiser).
3. Backtests via walk-forward (if cv_config is provided).
4. Fits on full data to produce final weights.
5. Checks rebalancing thresholds (if previous_weights given).
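Steps 1 and 1b above can be sketched for a single ticker with plain Python lists. This is an illustration under stated assumptions rather than the library's code: the helper names `prices_to_linear_returns` and `apply_delisting_return` are hypothetical, missing prices are represented as None, and the -30% delisting return is made up.

```python
def prices_to_linear_returns(prices):
    """Linear returns r_t = p_t / p_{t-1} - 1; None where a price is missing."""
    returns = []
    for prev, curr in zip(prices, prices[1:]):
        if prev is None or curr is None:
            returns.append(None)
        else:
            returns.append(curr / prev - 1.0)
    return returns


def apply_delisting_return(returns, delisting_return):
    """Replace the last valid (non-None) return with the delisting return."""
    adjusted = list(returns)
    for i in range(len(adjusted) - 1, -1, -1):
        if adjusted[i] is not None:
            adjusted[i] = delisting_return
            break
    return adjusted


# A ticker that stops trading after its third price observation.
prices = [100.0, 102.0, 99.96, None, None]
returns = prices_to_linear_returns(prices)
adjusted = apply_delisting_return(returns, -0.30)
print(returns)
print(adjusted)
```

Without the replacement, the series would simply end on an ordinary -2% day and the loss borne by holders at delisting would never enter the backtest, which is the survivorship bias the correction targets.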

Parameters¶

prices : pd.DataFrame
    Price matrix (dates x tickers).
optimizer : BaseOptimization
    A skfolio optimiser instance (e.g. from build_mean_risk()).
pre_selection_config : PreSelectionConfig or None
    Pre-selection configuration.
sector_mapping : dict[str, str] or None
    Ticker → sector mapping for imputation.
cv_config : WalkForwardConfig or None
    Walk-forward backtest configuration. None skips backtesting.
previous_weights : ndarray or None
    Current portfolio weights for rebalancing analysis.
rebalancing_config : ThresholdRebalancingConfig or HybridRebalancingConfig or None
    Rebalancing configuration. Pass a ThresholdRebalancingConfig for pure drift-based rebalancing or a HybridRebalancingConfig for calendar-gated threshold rebalancing.
current_date : pd.Timestamp or None
    Evaluation date for hybrid rebalancing. Defaults to the last date in the return series when not provided.
last_review_date : pd.Timestamp or None
    Date of the last hybrid review. When None with a HybridRebalancingConfig, the calendar gate is treated as already elapsed (threshold alone decides).
y_prices : pd.DataFrame or None
    Benchmark or factor price series. Converted to returns alongside asset prices.
delisting_returns : dict[str, float] or None
    Mapping of ticker → terminal delisting return. When provided, each ticker's last valid return is replaced with this value after prices_to_returns() (survivorship-bias correction, issue #274). Tickers not present in the returns columns are silently ignored.
fx_config : FxConfig or None
    Multi-currency FX conversion configuration (issue #283). When provided with mode != NONE, prices are converted to the base currency before prices_to_returns(). None disables conversion (default, backward-compatible).
currency_map : dict[str, str] or None
    Ticker → ISO currency code mapping. Required when fx_config is provided.
fx_rates : pd.DataFrame or None
    Pre-loaded FX rate DataFrame (dates x currencies). Each column holds units-of-base per one unit-of-foreign. Required when fx_config is provided.
benchmark_currency : str | None
    ISO currency code for the benchmark in y_prices (issue #308). When provided and FX conversion is active, all columns of y_prices are treated as denominated in this currency and converted to fx_config.base_currency before returns are computed. None (default) preserves existing behaviour: the benchmark is converted only if its ticker already appears in currency_map.
cost_bps : float
    One-way transaction cost in basis points applied to each walk-forward rebalancing event. Subtracted from gross backtest returns to produce result.net_returns and result.net_sharpe_ratio. Default 10 bps.
n_jobs : int or None
    Number of parallel jobs for backtesting.

Returns¶

PortfolioResult
    Complete result with weights, portfolio metrics, optional backtest, net returns, and rebalancing signals.

Examples¶

    >>> from optimizer.optimization import MeanRiskConfig, build_mean_risk
    >>> from optimizer.validation import WalkForwardConfig
    >>> from optimizer.pipeline import run_full_pipeline
    >>>
    >>> optimizer = build_mean_risk(MeanRiskConfig.for_max_sharpe())
    >>> result = run_full_pipeline(
    ...     prices=price_df,
    ...     optimizer=optimizer,
    ...     cv_config=WalkForwardConfig.for_quarterly_rolling(),
    ... )
    >>> print(result.weights)
    >>> print(result.summary)
    >>> print(result.backtest.sharpe_ratio)  # out-of-sample
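The fx_rates convention (units of base currency per one unit of foreign currency) implies that a foreign price is multiplied by the rate to express it in the base currency. The sketch below shows that conversion with plain dictionaries and lists. It is not the library's conversion routine: the helper `convert_to_base`, the tickers, prices, and rates are all hypothetical, and the series are assumed to be aligned by date already.

```python
def convert_to_base(prices, currency_map, fx_rates, base_currency):
    """Convert per-ticker price lists to the base currency (illustrative).

    prices:       dict ticker -> list of local-currency prices
    currency_map: dict ticker -> ISO currency code
    fx_rates:     dict currency -> list of rates (base units per 1 foreign unit)
    """
    converted = {}
    for ticker, series in prices.items():
        currency = currency_map[ticker]
        if currency == base_currency:
            converted[ticker] = list(series)  # already in base currency
        else:
            rates = fx_rates[currency]
            converted[ticker] = [p * r for p, r in zip(series, rates)]
    return converted


prices = {"SAP": [100.0, 101.0], "AAPL": [200.0, 202.0]}
currency_map = {"SAP": "EUR", "AAPL": "USD"}
fx_rates = {"USD": [0.90, 0.92]}  # EUR per 1 USD on each date
converted = convert_to_base(prices, currency_map, fx_rates, base_currency="EUR")

local_return = prices["AAPL"][1] / prices["AAPL"][0] - 1.0
base_return = converted["AAPL"][1] / converted["AAPL"][0] - 1.0
print(round(local_return, 4), round(base_return, 4))
```

Because conversion happens before returns are computed, the base-currency return of the USD asset (about 3.24%) combines its 1% local return with the move in the exchange rate.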

`run_full_pipeline_with_selection(prices, optimizer, *, fundamentals=None, volume_history=None, financial_statements=None, analyst_data=None, insider_data=None, macro_data=None, regime_data=None, investability_config=None, factor_config=None, standardization_config=None, scoring_config=None, selection_config=None, regime_config=None, integration_config=None, sector_mapping=None, pre_selection_config=None, cv_config=None, previous_weights=None, rebalancing_config=None, current_date=None, last_review_date=None, y_prices=None, current_members=None, ic_history=None, risk_free_rate=0.0, delisting_returns=None, market_returns=None, fx_config=None, currency_map=None, fx_rates=None, benchmark_currency=None, cost_bps=10.0, n_jobs=None)` ¶

End-to-end: fundamentals + prices → stock selection → optimization.

Extends run_full_pipeline() with upstream stock pre-selection:

1. Screen universe for investability (if fundamentals provided).
2. Compute and standardize factor scores.
3. Apply macro regime tilts (if macro_data + regime_config).
4. Compute composite score and select stocks.
5. Run existing run_full_pipeline on selected tickers.

Parameters¶

prices : pd.DataFrame
    Price matrix (dates x tickers).
optimizer : BaseOptimization
    A skfolio optimiser instance.
fundamentals : pd.DataFrame or None
    Cross-sectional data indexed by ticker (market_cap, ratios). If None, skips screening and factor selection.
volume_history : pd.DataFrame or None
    Volume matrix (dates x tickers).
financial_statements : pd.DataFrame or None
    Statement-level data for screening.
analyst_data : pd.DataFrame or None
    Analyst recommendation data for factor construction.
insider_data : pd.DataFrame or None
    Insider transaction data for factor construction.
macro_data : pd.DataFrame or None
    Macro indicators for regime classification.
regime_data : pd.DataFrame or None
    Merged macro indicators (pmi, spread_2s10s, hy_oas, etc.) for composite regime classification. When provided and non-empty, takes precedence over macro_data for regime classification. Receives the same publication lag filtering.
investability_config : InvestabilityScreenConfig or None
    Universe screening configuration.
factor_config : FactorConstructionConfig or None
    Factor construction parameters.
standardization_config : StandardizationConfig or None
    Factor standardization parameters.
scoring_config : CompositeScoringConfig or None
    Composite scoring parameters.
selection_config : SelectionConfig or None
    Stock selection parameters.
regime_config : RegimeTiltConfig or None
    Regime tilt parameters.
integration_config : FactorIntegrationConfig or None
    Factor-to-optimization bridge parameters.
sector_mapping : dict[str, str] or None
    Ticker → sector mapping.
pre_selection_config : PreSelectionConfig or None
    Return-data pre-selection configuration.
cv_config : WalkForwardConfig or None
    Walk-forward backtest configuration.
previous_weights : ndarray or None
    Current portfolio weights for rebalancing.
rebalancing_config : ThresholdRebalancingConfig or None
    Rebalancing threshold configuration.
y_prices : pd.DataFrame or None
    Benchmark or factor price series.
current_members : pd.Index or None
    Currently selected tickers for hysteresis.
ic_history : pd.DataFrame or None
    IC history for IC-weighted scoring.
market_returns : pd.Series or None
    Pre-computed market return series for beta estimation. When provided, used as the benchmark instead of the equal-weight cross-sectional mean. Pass a currency-consistent broad index (e.g. SPY daily returns) when prices spans multiple currency zones.
benchmark_currency : str | None
    ISO currency code for the benchmark in y_prices. Forwarded verbatim to run_full_pipeline(); see that function's documentation for full semantics (issue #308).
n_jobs : int or None
    Number of parallel jobs.

Returns¶

PortfolioResult
    Complete result with weights, metrics, backtest, and rebalancing signals.
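The standardize, score, and select steps of this function can be pictured with a toy cross-section. The sketch below is a simplification under stated assumptions and does not reflect the configurable behaviour of the library: factors are z-scored with the population standard deviation, combined with equal weights, and the top N tickers are kept. Regime tilts, IC weighting, and hysteresis via current_members are left out, and the helper names, tickers, and factor values are hypothetical.

```python
import statistics


def zscores(values):
    """Cross-sectional z-scores for a dict ticker -> raw factor value."""
    mean = statistics.fmean(values.values())
    std = statistics.pstdev(values.values())
    return {k: (v - mean) / std if std > 0 else 0.0 for k, v in values.items()}


def select_top(factors, n_select):
    """Equal-weight composite of standardized factors, then keep the top N."""
    standardized = [zscores(f) for f in factors.values()]
    tickers = list(next(iter(factors.values())))
    composite = {t: statistics.fmean(z[t] for z in standardized) for t in tickers}
    ranked = sorted(composite, key=composite.get, reverse=True)
    return ranked[:n_select], composite


# Hypothetical raw factor values for four tickers.
factors = {
    "value": {"AAA": 0.08, "BBB": 0.02, "CCC": 0.05, "DDD": 0.05},
    "momentum": {"AAA": 0.10, "BBB": 0.30, "CCC": 0.20, "DDD": -0.10},
}
selected, composite = select_top(factors, n_select=2)
print(selected)
```

Standardizing first keeps a factor with a wide raw range (momentum here) from dominating the composite purely because of its units.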

`tune_and_optimize(pipeline, X, param_grid, *, tuning_config=None, y=None, risk_free_rate=0.0)` ¶

Tune hyperparameters via grid or randomized search, then optimise.

Parameters¶

pipeline : Pipeline
    An unfitted, ready-to-fit sklearn Pipeline.
X : pd.DataFrame
    Return matrix (observations x assets).
param_grid : dict
    Parameter grid for GridSearchCV or distributions for RandomizedSearchCV. Keys use sklearn double-underscore notation for nested parameters.
tuning_config : GridSearchConfig or RandomizedSearchConfig or None
    Search configuration. Defaults to quarterly walk-forward with Sharpe ratio scoring (grid search).
y : pd.DataFrame or None
    Benchmark or factor returns.
risk_free_rate : float
    Daily risk-free rate for consistent Sharpe scoring (issue #272). When non-zero and the scorer uses Sharpe ratio, the scorer config is updated to use this rate.

Returns¶

PortfolioResult
    Weights from the best estimator, with backtest from CV.
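For readers unfamiliar with the double-underscore notation, the sketch below shows how a grid keyed as `<step>__<parameter>` expands into the candidate settings a grid search would evaluate. It uses only the standard library rather than GridSearchCV itself, and the step names (`optimizer`, `pre_selection`) and parameter names (`l2_coef`, `threshold`) are hypothetical; the real names depend on how the pipeline was built.

```python
from itertools import product


def expand_grid(param_grid):
    """Expand a dict of lists into one dict per parameter combination."""
    keys = list(param_grid)
    value_lists = (param_grid[k] for k in keys)
    return [dict(zip(keys, combo)) for combo in product(*value_lists)]


# Hypothetical grid: "<pipeline step>__<parameter of that step>".
param_grid = {
    "optimizer__l2_coef": [0.0, 0.01, 0.1],
    "pre_selection__threshold": [0.90, 0.95],
}
candidates = expand_grid(param_grid)
print(len(candidates))

# The part before the first "__" names the step; the rest names the parameter.
step, _, parameter = "optimizer__l2_coef".partition("__")
print(step, parameter)
```

A grid search fits and scores every candidate, so the cost grows with the product of the list lengths (3 x 2 = 6 here); a randomized search samples a fixed number of candidates instead.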
