pipeline¶
optimizer.pipeline
¶
End-to-end portfolio pipeline orchestration.
Composes pre-selection, optimisation, validation, scoring, hyperparameter tuning, and rebalancing into a single workflow.
PortfolioResult
dataclass
¶
Container for the output of a full portfolio optimisation run.
Attributes¶
weights : pd.Series
Final asset weights (ticker → weight).
portfolio : object
skfolio Portfolio returned by predict() on the full dataset.
Exposes .sharpe_ratio, .sortino_ratio, .max_drawdown,
.composition, etc.
backtest : object or None
Out-of-sample MultiPeriodPortfolio (walk-forward) or
Population (CPCV / MultipleRandomizedCV). None when
backtesting was skipped.
pipeline : object
The fitted sklearn Pipeline (pre-selection + optimiser).
Can be reused for predict() on new data.
summary : dict[str, float]
Key performance metrics extracted from the in-sample portfolio.
rebalance_needed : bool or None
Whether the portfolio exceeds drift thresholds relative to
previous_weights. None when no previous weights were
provided.
turnover : float or None
One-way turnover between previous_weights and the new
weights. None when no previous weights were provided.
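As a rough illustration of the turnover and rebalance_needed fields, the sketch below computes one-way turnover as half the sum of absolute weight changes and flags a rebalance when any asset moves by more than an absolute threshold. Both the 0.5 convention and the 5% threshold are assumptions for illustration; the library's actual definitions come from its rebalancing configuration and may differ.

```python
def one_way_turnover(previous: dict[str, float], new: dict[str, float]) -> float:
    """Half the sum of absolute weight changes (assumed convention)."""
    tickers = set(previous) | set(new)
    return 0.5 * sum(abs(new.get(t, 0.0) - previous.get(t, 0.0)) for t in tickers)


def exceeds_drift(
    previous: dict[str, float], new: dict[str, float], threshold: float = 0.05
) -> bool:
    """True when any single asset's weight differs by more than `threshold`."""
    tickers = set(previous) | set(new)
    return any(
        abs(new.get(t, 0.0) - previous.get(t, 0.0)) > threshold for t in tickers
    )


# Hypothetical tickers and weights, for illustration only.
previous_weights = {"AAA": 0.50, "BBB": 0.30, "CCC": 0.20}
new_weights = {"AAA": 0.40, "BBB": 0.30, "CCC": 0.30}

turnover = one_way_turnover(previous_weights, new_weights)
rebalance_needed = exceeds_drift(previous_weights, new_weights, threshold=0.05)
print(turnover, rebalance_needed)
```

Tickers missing from one side are treated as having zero weight, so assets entering or leaving the portfolio count towards turnover.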
build_portfolio_pipeline(optimizer, pre_selection_config=None, sector_mapping=None)
¶
Compose a full sklearn Pipeline: pre-selection → optimiser.
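The resulting pipeline behaves as a single estimator, so it can be passed directly to cross-validation and hyperparameter tuning. Because pre-selection is refit within each CV fold, test-fold data never influences which assets are kept, which prevents data leakage.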
Parameters¶
optimizer : BaseOptimization
A skfolio optimiser (e.g. from build_mean_risk(),
build_hrp(), etc.) used as the final pipeline estimator.
pre_selection_config : PreSelectionConfig or None
Pre-selection configuration. None uses default settings.
sector_mapping : dict[str, str] or None
Ticker → sector mapping for SectorImputer.
Returns¶
sklearn.pipeline.Pipeline
An unfitted, ready-to-fit pipeline whose fit(X) cleans and
filters returns and then optimises, and whose predict(X)
produces a skfolio Portfolio.
Examples¶
    from optimizer.optimization import MeanRiskConfig, build_mean_risk
    from optimizer.pipeline import build_portfolio_pipeline

    optimizer = build_mean_risk(MeanRiskConfig.for_max_sharpe())
    pipeline = build_portfolio_pipeline(optimizer)
    pipeline.fit(X)  # X = returns DataFrame
    portfolio = pipeline.predict(X)
    print(portfolio.sharpe_ratio)
backtest(pipeline, X, *, cv_config=None, y=None, n_jobs=None)
¶
Run a walk-forward backtest on a portfolio pipeline.
Parameters¶
pipeline : Pipeline
An unfitted, ready-to-fit sklearn Pipeline (from build_portfolio_pipeline).
X : pd.DataFrame
Return matrix (observations x assets).
cv_config : WalkForwardConfig or None
Walk-forward configuration. Defaults to quarterly rolling.
y : pd.DataFrame or None
Benchmark or factor returns for models that require fit(X, y).
n_jobs : int or None
Number of parallel jobs.
Returns¶
MultiPeriodPortfolio or Population
Out-of-sample portfolio predictions.
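To make the walk-forward idea concrete, here is a minimal plain-Python sketch of rolling train/test index windows. The helper name walk_forward_splits and the window sizes are hypothetical; in the library the split sizes come from WalkForwardConfig, and its quarterly rolling default is not reproduced here.

```python
from typing import Iterator


def walk_forward_splits(
    n_obs: int, train_size: int, test_size: int
) -> Iterator[tuple[range, range]]:
    """Yield rolling (train, test) index ranges; test always follows train."""
    start = 0
    while start + train_size + test_size <= n_obs:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        start += test_size  # roll forward by one test window


splits = list(walk_forward_splits(n_obs=10, train_size=4, test_size=2))
for train, test in splits:
    print(list(train), "->", list(test))
```

Each test window starts immediately after its training window, so every out-of-sample prediction is made by a model that has only seen earlier observations.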
compute_net_backtest_returns(gross_returns, weight_changes, cost_bps=10.0)
¶
Deduct proportional transaction costs from gross backtest returns.
For each date with weight changes, the turnover (sum of absolute
weight deltas) is multiplied by cost_bps / 10_000 and subtracted
from the gross return at that date.
Parameters¶
gross_returns : pd.Series
Gross portfolio returns indexed by date.
weight_changes : pd.DataFrame
Weight change matrix (dates x assets). Only dates present in
this DataFrame incur transaction costs.
cost_bps : float
Transaction cost in basis points (default 10 bps).
Returns¶
pd.Series
Net returns with costs deducted.
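The cost deduction described above amounts to the following arithmetic, shown here with plain dicts standing in for the pandas Series and DataFrame. This is a sketch of the documented formula, not the library's implementation; the helper name net_returns and the sample data are made up for illustration.

```python
def net_returns(
    gross_returns: dict[str, float],
    weight_changes: dict[str, dict[str, float]],
    cost_bps: float = 10.0,
) -> dict[str, float]:
    """Subtract turnover * cost_bps / 10_000 on dates that have weight changes."""
    net = {}
    for date, gross in gross_returns.items():
        deltas = weight_changes.get(date)
        if deltas is None:
            net[date] = gross  # no rebalance on this date, so no cost
            continue
        turnover = sum(abs(d) for d in deltas.values())  # sum of |weight deltas|
        net[date] = gross - turnover * cost_bps / 10_000
    return net


# Hypothetical data: one rebalance on 2024-01-03 with turnover 0.20.
gross = {"2024-01-02": 0.010, "2024-01-03": -0.004, "2024-01-04": 0.002}
changes = {"2024-01-03": {"AAA": 0.10, "BBB": -0.10}}

net = net_returns(gross, changes, cost_bps=10.0)
print(net)
```

With a turnover of 0.20 and 10 bps of cost, the rebalance date loses 0.20 × 0.001 = 0.0002 of return; the other dates are unchanged.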
optimize(pipeline, X, *, y=None)
¶
Fit the pipeline on the full dataset and return the final weights.
Parameters¶
pipeline : Pipeline
An unfitted, ready-to-fit sklearn Pipeline.
X : pd.DataFrame
Return matrix (observations x assets).
y : pd.DataFrame or None
Benchmark or factor returns.
Returns¶
PortfolioResult
Weights, in-sample portfolio, and fitted pipeline.
run_full_pipeline(prices, optimizer, *, pre_selection_config=None, sector_mapping=None, cv_config=None, previous_weights=None, rebalancing_config=None, current_date=None, last_review_date=None, y_prices=None, n_jobs=None)
¶
End-to-end: prices → validated weights + backtest + rebalancing.
This is the single entry point for producing a portfolio from raw price data. It:
- Converts prices to linear returns.
- Builds the full pipeline (pre-selection + optimiser).
- Backtests via walk-forward (if cv_config is provided).
- Fits on full data to produce final weights.
- Checks rebalancing thresholds (if previous_weights is given).
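The first step, converting prices to linear returns, uses r_t = p_t / p_(t-1) - 1. A plain-Python sketch for a single price series follows; the library operates on a full price DataFrame, and how it treats missing values is not covered here. The helper name linear_returns is illustrative only.

```python
def linear_returns(prices: list[float]) -> list[float]:
    """Simple (arithmetic) returns; the first observation has no return."""
    return [curr / prev - 1.0 for prev, curr in zip(prices, prices[1:])]


returns = linear_returns([100.0, 110.0, 99.0])
print(returns)  # roughly +10% followed by -10%
```

The output has one fewer observation than the input, which is why the return matrix passed to the pipeline is one row shorter than the price matrix.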
Parameters¶
prices : pd.DataFrame
Price matrix (dates x tickers).
optimizer : BaseOptimization
A skfolio optimiser instance (e.g. from build_mean_risk()).
pre_selection_config : PreSelectionConfig or None
Pre-selection configuration.
sector_mapping : dict[str, str] or None
Ticker → sector mapping for imputation.
cv_config : WalkForwardConfig or None
Walk-forward backtest configuration. None skips
backtesting.
previous_weights : ndarray or None
Current portfolio weights for rebalancing analysis.
rebalancing_config : ThresholdRebalancingConfig or HybridRebalancingConfig or None
Rebalancing configuration. Pass a ThresholdRebalancingConfig
for pure drift-based rebalancing or a HybridRebalancingConfig
for calendar-gated threshold rebalancing.
current_date : pd.Timestamp or None
Evaluation date for hybrid rebalancing. Defaults to the last
date in the return series when not provided.
last_review_date : pd.Timestamp or None
Date of the last hybrid review. When None with a
HybridRebalancingConfig, the calendar gate is treated as
already elapsed (threshold alone decides).
y_prices : pd.DataFrame or None
Benchmark or factor price series. Converted to returns
alongside asset prices.
n_jobs : int or None
Number of parallel jobs for backtesting.
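The interaction between current_date, last_review_date, and a calendar-gated threshold rule can be sketched as follows. The parameter names review_interval_days and threshold, and the absolute-drift rule itself, are assumptions for illustration; the real configuration classes may define these differently.

```python
from datetime import date
from typing import Optional


def hybrid_should_rebalance(
    drift: dict[str, float],
    threshold: float,
    current_date: date,
    last_review_date: Optional[date],
    review_interval_days: int,
) -> bool:
    """Calendar-gated threshold check (illustrative, not the library's logic)."""
    # With no previous review, the calendar gate counts as already elapsed.
    if last_review_date is None:
        gate_open = True
    else:
        elapsed = (current_date - last_review_date).days
        gate_open = elapsed >= review_interval_days
    breach = any(abs(d) > threshold for d in drift.values())
    return gate_open and breach


# Hypothetical per-asset drift (current weight minus target weight).
drift = {"AAA": 0.07, "BBB": -0.02}
today = date(2024, 6, 28)

first_review = hybrid_should_rebalance(drift, 0.05, today, None, 90)
too_soon = hybrid_should_rebalance(drift, 0.05, today, date(2024, 6, 1), 90)
due = hybrid_should_rebalance(drift, 0.05, today, date(2024, 3, 1), 90)
print(first_review, too_soon, due)
```

A pure threshold rule corresponds to the case where the gate is always open, so only the drift breach matters.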
Returns¶
PortfolioResult
Complete result with weights, portfolio metrics, optional
backtest, and rebalancing signals.
Examples¶
    from optimizer.optimization import MeanRiskConfig, build_mean_risk
    from optimizer.validation import WalkForwardConfig
    from optimizer.pipeline import run_full_pipeline

    optimizer = build_mean_risk(MeanRiskConfig.for_max_sharpe())
    result = run_full_pipeline(
        prices=price_df,
        optimizer=optimizer,
        cv_config=WalkForwardConfig.for_quarterly_rolling(),
    )
    print(result.weights)
    print(result.summary)
    print(result.backtest.sharpe_ratio)  # out-of-sample
run_full_pipeline_with_selection(prices, optimizer, *, fundamentals=None, volume_history=None, financial_statements=None, analyst_data=None, insider_data=None, macro_data=None, investability_config=None, factor_config=None, standardization_config=None, scoring_config=None, selection_config=None, regime_config=None, integration_config=None, sector_mapping=None, pre_selection_config=None, cv_config=None, previous_weights=None, rebalancing_config=None, current_date=None, last_review_date=None, y_prices=None, current_members=None, ic_history=None, n_jobs=None)
¶
End-to-end: fundamentals + prices → stock selection → optimization.
Extends run_full_pipeline with upstream stock pre-selection:
- Screen universe for investability (if fundamentals is provided).
- Compute and standardize factor scores.
- Apply macro regime tilts (if both macro_data and regime_config are provided).
- Compute composite score and select stocks.
- Run existing run_full_pipeline on selected tickers.
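The scoring and selection steps above can be pictured with a small plain-Python sketch: each factor is z-scored across tickers, the z-scores are averaged with equal weights, and the top-ranked tickers are kept. Equal weighting, population z-scores, top-N selection, and all names and data here are assumptions for illustration; the library's behaviour is governed by StandardizationConfig, CompositeScoringConfig, and SelectionConfig and may differ substantially.

```python
from statistics import mean, pstdev


def zscores(values: dict[str, float]) -> dict[str, float]:
    """Cross-sectional z-scores; all zeros if the factor has no dispersion."""
    data = list(values.values())
    mu, sigma = mean(data), pstdev(data)
    if sigma == 0:
        return {t: 0.0 for t in values}
    return {t: (v - mu) / sigma for t, v in values.items()}


def composite_select(
    factors: dict[str, dict[str, float]], n_select: int
) -> list[str]:
    """Equal-weight average of z-scored factors, then keep the top n_select."""
    standardized = [zscores(scores) for scores in factors.values()]
    tickers = list(next(iter(factors.values())))
    composite = {t: mean([z[t] for z in standardized]) for t in tickers}
    return sorted(tickers, key=lambda t: composite[t], reverse=True)[:n_select]


# Hypothetical raw factor values per ticker.
factors = {
    "value": {"AAA": 1.0, "BBB": 2.0, "CCC": 3.0, "DDD": 4.0},
    "momentum": {"AAA": 3.0, "BBB": 1.0, "CCC": 4.0, "DDD": 2.0},
}

selected = composite_select(factors, n_select=2)
print(selected)
```

The selected tickers would then be the columns of the price matrix handed to run_full_pipeline.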
Parameters¶
prices : pd.DataFrame
Price matrix (dates x tickers).
optimizer : BaseOptimization
A skfolio optimiser instance.
fundamentals : pd.DataFrame or None
Cross-sectional data indexed by ticker (market_cap, ratios).
If None, skips screening and factor selection.
volume_history : pd.DataFrame or None
Volume matrix (dates x tickers).
financial_statements : pd.DataFrame or None
Statement-level data for screening.
analyst_data : pd.DataFrame or None
Analyst recommendation data for factor construction.
insider_data : pd.DataFrame or None
Insider transaction data for factor construction.
macro_data : pd.DataFrame or None
Macro indicators for regime classification.
investability_config : InvestabilityScreenConfig or None
Universe screening configuration.
factor_config : FactorConstructionConfig or None
Factor construction parameters.
standardization_config : StandardizationConfig or None
Factor standardization parameters.
scoring_config : CompositeScoringConfig or None
Composite scoring parameters.
selection_config : SelectionConfig or None
Stock selection parameters.
regime_config : RegimeTiltConfig or None
Regime tilt parameters.
integration_config : FactorIntegrationConfig or None
Factor-to-optimization bridge parameters.
sector_mapping : dict[str, str] or None
Ticker → sector mapping.
pre_selection_config : PreSelectionConfig or None
Return-data pre-selection configuration.
cv_config : WalkForwardConfig or None
Walk-forward backtest configuration.
previous_weights : ndarray or None
Current portfolio weights for rebalancing.
rebalancing_config : ThresholdRebalancingConfig or None
Rebalancing threshold configuration.
y_prices : pd.DataFrame or None
Benchmark or factor price series.
current_members : pd.Index or None
Currently selected tickers for hysteresis.
ic_history : pd.DataFrame or None
IC history for IC-weighted scoring.
n_jobs : int or None
Number of parallel jobs.
Returns¶
PortfolioResult
Complete result with weights, metrics, backtest, and
rebalancing signals.
tune_and_optimize(pipeline, X, param_grid, *, tuning_config=None, y=None)
¶
Tune hyperparameters via grid or randomized search, then optimise.
Parameters¶
pipeline : Pipeline
An unfitted, ready-to-fit sklearn Pipeline.
X : pd.DataFrame
Return matrix (observations x assets).
param_grid : dict
Parameter grid for GridSearchCV or distributions for
RandomizedSearchCV. Keys use sklearn double-underscore
notation for nested parameters.
tuning_config : GridSearchConfig or RandomizedSearchConfig or None
Search configuration. Defaults to quarterly walk-forward
with Sharpe ratio scoring (grid search).
y : pd.DataFrame or None
Benchmark or factor returns.
Returns¶
PortfolioResult
Weights from the best estimator, with backtest from CV.
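The double-underscore notation for param_grid keys addresses a parameter on a named pipeline step as step__parameter. The sketch below shows how such keys decompose and how a grid expands into candidates. The step name optimizer and both parameter names are placeholders, so it is worth checking the real names first (for example with pipeline.get_params().keys(), a standard sklearn estimator method) before writing a grid.

```python
from itertools import product

# Hypothetical grid: step and parameter names are placeholders.
param_grid = {
    "optimizer__l2_coef": [0.0, 0.01, 0.1],
    "optimizer__risk_aversion": [1.0, 2.0],
}


def split_key(key: str) -> tuple[str, str]:
    """Split 'step__param' into (step, param) at the first double underscore."""
    step, _, param = key.partition("__")
    return step, param


targets = [split_key(k) for k in param_grid]

# A grid search evaluates the Cartesian product of all value lists.
candidates = [
    dict(zip(param_grid, combo)) for combo in product(*param_grid.values())
]
print(targets)
print(len(candidates), "candidate settings")
```

A randomized search takes the same key format but samples from the supplied lists or distributions rather than enumerating every combination.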