
Moment Estimation

The optimizer.moments module provides expected return estimation, covariance estimation, Hidden Markov Model regime blending, Deep Markov Models, and multi-period log-normal scaling. Every component follows the library-wide pattern of frozen dataclass config + factory function, and all estimators conform to the skfolio BaseMu / BaseCovariance API so they compose directly inside sklearn pipelines.


Module Layout

| File | Contents |
| --- | --- |
| optimizer/moments/_config.py | MuEstimatorType, CovEstimatorType, ShrinkageMethod enums; MomentEstimationConfig frozen dataclass |
| optimizer/moments/_factory.py | build_mu_estimator(), build_cov_estimator(), build_prior() factories |
| optimizer/moments/_hmm.py | HMMConfig, HMMResult, fit_hmm(), select_hmm_n_states(), blend_moments_by_regime(), HMMBlendedMu, HMMBlendedCovariance |
| optimizer/moments/_dmm.py | DMMConfig, DMMResult, fit_dmm(), blend_moments_dmm() (optional; requires torch + pyro-ppl) |
| optimizer/moments/_scaling.py | apply_lognormal_correction(), scale_moments_to_horizon() |

Expected Return Estimators

The MuEstimatorType enum selects which skfolio BaseMu estimator build_mu_estimator() instantiates.

| Enum value | skfolio class | Key parameter(s) | Description |
| --- | --- | --- | --- |
| EMPIRICAL | EmpiricalMu | -- | Sample mean of historical returns |
| SHRUNK | ShrunkMu | shrinkage_method | Shrinkage toward a structured target (see table below) |
| EW | EWMu | ew_mu_alpha (default 0.2) | Exponentially weighted mean; higher alpha puts more weight on recent observations |
| EQUILIBRIUM | EquilibriumMu | risk_aversion (default 1.0) | Implied equilibrium returns from market-cap weights; the Black-Litterman starting point |
| HMM_BLENDED | HMMBlendedMu | hmm_config | Regime-probability-weighted blend of per-state means (see the HMM section) |

Shrinkage methods

When mu_estimator = MuEstimatorType.SHRUNK, the ShrinkageMethod enum controls which shrinkage flavour is used:

| Enum value | skfolio method | Reference |
| --- | --- | --- |
| JAMES_STEIN | ShrunkMuMethods.JAMES_STEIN | James & Stein (1961) |
| BAYES_STEIN | ShrunkMuMethods.BAYES_STEIN | Jorion (1986) |
| BODNAR_OKHRIN | ShrunkMuMethods.BODNAR_OKHRIN | Bodnar & Okhrin (2011) |

Covariance Estimators

The CovEstimatorType enum selects which skfolio BaseCovariance estimator build_cov_estimator() instantiates.

| Enum value | skfolio class | Key parameter(s) | Description |
| --- | --- | --- | --- |
| EMPIRICAL | EmpiricalCovariance | -- | Sample covariance matrix |
| LEDOIT_WOLF | LedoitWolf | -- | Analytical shrinkage (Ledoit & Wolf, 2004); optimal bias-variance trade-off without cross-validation |
| OAS | OAS | -- | Oracle Approximating Shrinkage (Chen et al., 2010); similar to Ledoit-Wolf but with a different analytical formula |
| SHRUNK | ShrunkCovariance | shrunk_cov_shrinkage (default 0.1) | Fixed shrinkage intensity toward a diagonal target |
| EW | EWCovariance | ew_cov_alpha (default 0.2) | Exponentially weighted covariance; recent observations receive higher weight |
| GERBER | GerberCovariance | gerber_threshold (default 0.5) | Gerber statistic-based covariance; only co-movements that exceed the threshold contribute |
| GRAPHICAL_LASSO_CV | GraphicalLassoCV | -- | Sparse inverse covariance via L1-penalised MLE with cross-validated penalty |
| DENOISE | DenoiseCovariance | inner: EmpiricalCovariance | Random matrix theory denoising; filters eigenvalues below the Marchenko-Pastur threshold |
| DETONE | DetoneCovariance | inner: EmpiricalCovariance | Market factor removal; strips the largest eigenvalue (market mode) from the covariance |
| IMPLIED | ImpliedCovariance | -- | Implied covariance from option-market data |
| HMM_BLENDED | HMMBlendedCovariance | hmm_config | Full law-of-total-variance blend of regime covariances (see the HMM section) |
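
As with the expected-return estimators, each enum value is resolved by the factory into a configured instance. A minimal sketch using only the documented config and factory API (per the table above, DENOISE wraps an EmpiricalCovariance inner estimator; X_returns is an assumed DataFrame of asset returns):

from optimizer.moments import CovEstimatorType, MomentEstimationConfig, build_cov_estimator

# Denoised covariance: random-matrix-theory filtering on top of the sample estimate
config = MomentEstimationConfig(cov_estimator=CovEstimatorType.DENOISE)
cov_est = build_cov_estimator(config)
cov_est.fit(X_returns)
print(cov_est.covariance_)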

MomentEstimationConfig

MomentEstimationConfig is a frozen dataclass that bundles all moment estimation parameters into a single serialisable object. Non-serialisable objects (estimator instances, numpy arrays) are never stored in the config; they are constructed by the factory functions.
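
Because the config is a plain frozen dataclass, it round-trips through standard tooling. One illustrative way to serialise it (dataclasses.asdict plus a string fallback for the enum fields; not a library helper):

import dataclasses
import json

from optimizer.moments import MomentEstimationConfig

config = MomentEstimationConfig.for_adaptive()
# Enum members are not natively JSON-serialisable; default=str stringifies them.
blob = json.dumps(dataclasses.asdict(config), default=str)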

Fields

@dataclass(frozen=True)
class MomentEstimationConfig:
    # Expected return estimator
    mu_estimator: MuEstimatorType = MuEstimatorType.EMPIRICAL
    shrinkage_method: ShrinkageMethod = ShrinkageMethod.JAMES_STEIN
    ew_mu_alpha: float = 0.2
    risk_aversion: float = 1.0

    # Covariance estimator
    cov_estimator: CovEstimatorType = CovEstimatorType.LEDOIT_WOLF
    ew_cov_alpha: float = 0.2
    shrunk_cov_shrinkage: float = 0.1
    gerber_threshold: float = 0.5

    # Prior assembly
    is_log_normal: bool = False
    investment_horizon: float | None = None

    # HMM blended estimators
    hmm_config: HMMConfig = field(default_factory=HMMConfig)

    # Factor model
    use_factor_model: bool = False
    residual_variance: bool = True

Presets

| Preset method | mu estimator | cov estimator | Use case |
| --- | --- | --- | --- |
| for_equilibrium_ledoitwolf() | EquilibriumMu | LedoitWolf | Black-Litterman-ready prior; equilibrium returns serve as the neutral starting point |
| for_shrunk_denoised() | ShrunkMu (James-Stein) | DenoiseCovariance | Conservative prior; shrinks expected returns and removes noise from the covariance spectrum |
| for_adaptive() | EWMu | EWCovariance | Responsive prior; exponentially weighted moments adapt quickly to regime changes |
| for_hmm_blended(n_states=2) | HMMBlendedMu | HMMBlendedCovariance | Regime-aware prior; probability-weighted blend of per-regime moments |

Usage

from optimizer.moments import (
    CovEstimatorType,
    MomentEstimationConfig,
    MuEstimatorType,
    ShrinkageMethod,
)

# Use a preset
config = MomentEstimationConfig.for_equilibrium_ledoitwolf()

# Or build from scratch
config = MomentEstimationConfig(
    mu_estimator=MuEstimatorType.SHRUNK,
    shrinkage_method=ShrinkageMethod.BAYES_STEIN,
    cov_estimator=CovEstimatorType.DENOISE,
)

Factory Functions

build_mu_estimator

Maps the mu_estimator field of a MomentEstimationConfig to a concrete skfolio BaseMu instance.

from optimizer.moments import MomentEstimationConfig, MuEstimatorType, build_mu_estimator

config = MomentEstimationConfig(mu_estimator=MuEstimatorType.EW, ew_mu_alpha=0.3)
mu_est = build_mu_estimator(config)
# Returns an EWMu(alpha=0.3) instance ready for .fit(X)

build_cov_estimator

Maps the cov_estimator field to a concrete skfolio BaseCovariance instance.

from optimizer.moments import CovEstimatorType, MomentEstimationConfig, build_cov_estimator

config = MomentEstimationConfig(cov_estimator=CovEstimatorType.GERBER, gerber_threshold=0.4)
cov_est = build_cov_estimator(config)
# Returns a GerberCovariance(threshold=0.4) instance

build_prior

Composes build_mu_estimator and build_cov_estimator into an EmpiricalPrior, and optionally wraps it in a FactorModel.

from optimizer.moments import (
    CovEstimatorType,
    MomentEstimationConfig,
    MuEstimatorType,
    build_prior,
)

# Default prior: EmpiricalMu + LedoitWolf
prior = build_prior()

# Prior from a preset
config = MomentEstimationConfig.for_shrunk_denoised()
prior = build_prior(config)

# Factor model prior
config = MomentEstimationConfig(
    mu_estimator=MuEstimatorType.EMPIRICAL,
    cov_estimator=CovEstimatorType.LEDOIT_WOLF,
    use_factor_model=True,
    residual_variance=True,
)
prior = build_prior(config)
# Returns a FactorModel wrapping EmpiricalPrior

When use_factor_model=True, the resulting FactorModel expects factor returns as the y argument during fit(X, y). The fitted prior attribute is return_distribution_ (not prior_model_), containing mu, covariance, returns, sample_weight, and cholesky.
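
A minimal sketch of that fit contract, assuming X_assets and X_factors are date-aligned return DataFrames (hypothetical names):

config = MomentEstimationConfig(use_factor_model=True)
prior = build_prior(config)
prior.fit(X_assets, y=X_factors)   # factor returns are passed as y

rd = prior.return_distribution_    # note: not prior_model_
print(rd.mu, rd.covariance)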


Hidden Markov Model Regime Blending

The HMM subsystem fits a Gaussian Hidden Markov Model to a panel of asset returns, extracts regime-conditional moments, and produces probability-weighted blended estimates suitable for portfolio optimization.

HMMConfig

@dataclass(frozen=True)
class HMMConfig:
    n_states: int = 2          # Number of latent regimes
    n_iter: int = 100          # Max Baum-Welch EM iterations
    tol: float = 1e-4          # Convergence tolerance on log-likelihood
    covariance_type: str = "full"  # "full", "diag", "tied", or "spherical"
    random_state: int | None = None

HMMResult

After fitting, fit_hmm() returns an HMMResult dataclass containing:

| Attribute | Shape | Description |
| --- | --- | --- |
| transition_matrix | (n_states, n_states) | Row-stochastic matrix \( A_{ij} = P(z_t = j \mid z_{t-1} = i) \) |
| regime_means | DataFrame (n_states, n_assets) | Per-regime expected return vectors \( \mu_s \) |
| regime_covariances | (n_states, n_assets, n_assets) | Per-regime covariance matrices \( \Sigma_s \) |
| filtered_probs | DataFrame (n_dates, n_states) | Forward-only causal probabilities \( \alpha_t(s) \) |
| smoothed_probs | DataFrame (n_dates, n_states) | Full-sequence posterior probabilities \( \gamma_t(s) \) |
| log_likelihood | float | Log-likelihood of the data under the fitted model |

Fitting an HMM

from optimizer.moments import HMMConfig, fit_hmm

config = HMMConfig(n_states=2, random_state=42)
result = fit_hmm(returns, config)

print(result.transition_matrix)
print(result.regime_means)
print(result.filtered_probs.tail())

The function drops NaN rows before fitting and raises DataError if fewer than n_states + 1 observations remain.

Filtered vs. Smoothed Probabilities

The HMM produces two sets of state probabilities, and the distinction between them is critical for correct usage:

Filtered probabilities \( \alpha_t(s) \propto P(r_t \mid z_t = s) \sum_{s'} A_{s',s} \cdot \alpha_{t-1}(s') \) are computed via the forward algorithm and are conditioned only on past and current observations \( r_{1:t} \). They are causal -- they do not use future data -- and are therefore the correct choice for online blending in backtests and live trading.

Smoothed probabilities \( \gamma_t(s) = P(z_t = s \mid r_{1:T}) \) are computed via the Baum-Welch forward-backward algorithm and are conditioned on the entire observation sequence. They provide the best point estimate of the regime at each time step but introduce look-ahead bias and must only be used for diagnostics, regime labelling, and parameter estimation -- never for causal blending in backtests.

| Property | Filtered | Smoothed |
| --- | --- | --- |
| Conditioning | \( r_{1:t} \) (past + present) | \( r_{1:T} \) (full sequence) |
| Causal | Yes | No |
| Look-ahead bias | None | Yes |
| Use for backtests | Yes | No |
| Use for diagnostics | Yes | Yes |
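
In code, the distinction maps onto two attributes of HMMResult (continuing from the fit_hmm example above):

# Causal regime belief for online blending and backtests (uses only r_{1:t})
p_causal = result.filtered_probs.iloc[-1]

# Full-sample posterior: diagnostics and regime labelling only (look-ahead bias)
regime_labels = result.smoothed_probs.idxmax(axis=1)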

Model Selection: AIC / BIC

select_hmm_n_states() evaluates multiple candidate state counts and returns the one that minimises the chosen information criterion.

Free parameters for a model with \( S \) states and \( d \) assets:

\[ k = S(S - 1) + S \cdot d + S \cdot \frac{d(d+1)}{2} \]

where the three terms count the transition matrix rows (each sums to 1, so \( S - 1 \) free per row), per-regime means, and per-regime full covariance (lower triangle).
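
For intuition, a quick helper that reproduces this count (illustrative only, not a library function):

def hmm_free_params(n_states: int, n_assets: int) -> int:
    # k = S(S-1) transition entries + S*d means + S*d(d+1)/2 covariance entries
    S, d = n_states, n_assets
    return S * (S - 1) + S * d + S * (d * (d + 1) // 2)

# 2 states, 10 assets: 2 + 20 + 110 = 132 free parameters
assert hmm_free_params(2, 10) == 132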

The criteria are:

\[ \text{AIC} = -2 \ln L + 2k \]
\[ \text{BIC} = -2 \ln L + k \ln T \]

from optimizer.moments import select_hmm_n_states

best_n = select_hmm_n_states(
    returns,
    candidate_n_states=(2, 3, 4),
    criterion="bic",
)
print(f"Optimal number of regimes: {best_n}")

Blending Moments by Regime

Simple blend: blend_moments_by_regime()

Computes a probability-weighted blend using the filtered probabilities at the final time step:

\[ \mu = \sum_s p_s \cdot \mu_s \]
\[ \Sigma = \sum_s p_s \cdot \Sigma_s \]

where \( p_s = \alpha_T(s) \) are the last-period filtered regime probabilities.

from optimizer.moments import HMMConfig, fit_hmm, blend_moments_by_regime

result = fit_hmm(returns, HMMConfig(n_states=2, random_state=42))
mu_blended, cov_blended = blend_moments_by_regime(result)

Gotcha: This function computes only the within-regime weighted covariance and omits the between-regime mean-dispersion term. The blended covariance will underestimate total uncertainty when regime means differ materially. For optimizer inputs, use HMMBlendedCovariance instead.

Full blend: HMMBlendedCovariance

The HMMBlendedCovariance class implements the full law of total variance:

\[ \Sigma = \sum_s p_s \left[ \Sigma_s + (\mu_s - \mu)(\mu_s - \mu)^\top \right] \]

The second term \( (\mu_s - \mu)(\mu_s - \mu)^\top \) captures the between-regime mean dispersion -- the additional uncertainty that arises because the true mean itself is uncertain across regimes. This term can be substantial when regime means differ (e.g., bull vs. bear markets) and omitting it leads to systematically underestimated risk.
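
The blend itself is a few lines of linear algebra. A self-contained sketch with toy inputs (the arrays are illustrative, not library output):

import numpy as np

p = np.array([0.7, 0.3])                                  # filtered probs at T
mus = [np.array([0.08, 0.05]), np.array([-0.10, 0.01])]  # per-regime means
covs = [0.02 * np.eye(2), 0.09 * np.eye(2)]               # per-regime covariances

mu = sum(p_s * m for p_s, m in zip(p, mus))               # blended mean
sigma = sum(
    p_s * (c + np.outer(m - mu, m - mu))                  # within + between terms
    for p_s, c, m in zip(p, covs, mus)
)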

skfolio-Compatible Estimator Classes

Both HMMBlendedMu and HMMBlendedCovariance conform to the skfolio BaseMu / BaseCovariance API, which means they expose the standard mu_ and covariance_ attributes after .fit(X) and can be plugged directly into EmpiricalPrior or any skfolio pipeline.

HMMBlendedMu

from optimizer.moments import HMMBlendedMu, HMMConfig

mu_est = HMMBlendedMu(hmm_config=HMMConfig(n_states=2, random_state=42))
mu_est.fit(X_returns)

print(mu_est.mu_)              # ndarray of shape (n_assets,)
print(mu_est.hmm_result_)      # Full HMMResult for inspection

After fitting, mu_ contains the probability-weighted blended expected return vector:

\[ \mu = \sum_s p(z_T = s \mid r_{1:T}) \cdot \mu_s \]

HMMBlendedCovariance

from optimizer.moments import HMMBlendedCovariance, HMMConfig

cov_est = HMMBlendedCovariance(
    hmm_config=HMMConfig(n_states=2, random_state=42),
    nearest=True,      # project to nearest PSD if needed
    higham=False,       # use eigenvalue clipping (not Higham)
)
cov_est.fit(X_returns)

print(cov_est.covariance_)     # ndarray of shape (n_assets, n_assets)
print(cov_est.hmm_result_)     # Full HMMResult for inspection

The nearest parameter controls whether the blended covariance is projected to the nearest positive semi-definite matrix (via eigenvalue clipping, or the Higham algorithm when higham=True). It is enabled by default because the law-of-total-variance blend is not guaranteed to be PSD in finite samples.
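
For reference, eigenvalue clipping amounts to flooring the spectrum at zero. A minimal sketch (not the library's implementation):

import numpy as np

def clip_to_psd(sigma: np.ndarray) -> np.ndarray:
    # Symmetric eigendecomposition, clip negative eigenvalues, reconstruct
    vals, vecs = np.linalg.eigh(sigma)
    return (vecs * np.clip(vals, 0.0, None)) @ vecs.T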

Using HMM Blending in the Prior

The recommended way to use HMM blending is through the MomentEstimationConfig preset, which wires everything together:

from optimizer.moments import MomentEstimationConfig, build_prior

config = MomentEstimationConfig.for_hmm_blended(n_states=2)
prior = build_prior(config)

# Use in a MeanRisk optimizer
from skfolio.optimization import MeanRisk
model = MeanRisk(prior_estimator=prior)
model.fit(X_returns)

Deep Markov Model (Optional)

The DMM module implements the architecture from Krishnan et al. (2016), "Structured Inference Networks for Nonlinear State Space Models," using Pyro's stochastic variational inference (SVI) with KL annealing.

Dependency note: The DMM requires torch and pyro-ppl, which are not declared in pyproject.toml, so the module is effectively optional. Install the dependencies with:

pip install torch pyro-ppl

If the dependencies are missing, the import of DMMConfig and fit_dmm in optimizer.moments is silently suppressed (via contextlib.suppress in __init__.py), so those names are simply absent from the package namespace.

Architecture

The generative model factorises as:

\[ p(x_{1:T}, z_{1:T}) = p(z_1) \prod_{t=2}^{T} p(z_t \mid z_{t-1}) \prod_{t=1}^{T} p(x_t \mid z_t) \]

The variational guide uses a backward-RNN inference network:

\[ q(z_{1:T} \mid x_{1:T}) = \prod_{t=1}^{T} q(z_t \mid z_{t-1}, h_t^{\text{rnn}}) \]

where \( h_t^{\text{rnn}} \) encodes the future context \( x_t, \ldots, x_T \) via a GRU running backward over the sequence.

| Component | Class | Role |
| --- | --- | --- |
| Emitter | Emitter | Maps \( z_t \to (\text{loc}, \text{scale}) \) of the emission distribution \( p(x_t \mid z_t) \) |
| Transition | GatedTransition | Gated residual MLP for \( p(z_t \mid z_{t-1}) \); the gate interpolates between a linear identity and a nonlinear proposal |
| Combiner | Combiner | Fuses \( z_{t-1} \) with the backward-RNN context for the variational posterior \( q(z_t \mid z_{t-1}, h_t) \) |
| Inference RNN | nn.GRU | Backward-running GRU encoding \( x_t, \ldots, x_T \) into context vectors |

DMMConfig

@dataclass(frozen=True)
class DMMConfig:
    z_dim: int = 16                    # Latent state dimension
    emission_dim: int = 64             # Emitter hidden layer size
    transition_dim: int = 64           # Transition hidden layer size
    rnn_dim: int = 128                 # GRU hidden state size
    num_epochs: int = 1000             # SVI training epochs
    learning_rate: float = 3e-4        # ClippedAdam learning rate
    annealing_epochs: int = 50         # KL annealing ramp length
    minimum_annealing_factor: float = 0.2  # Starting KL weight
    random_state: int | None = None

DMMResult

| Attribute | Shape | Description |
| --- | --- | --- |
| latent_means | DataFrame (T, z_dim) | Variational posterior means for each time step |
| latent_stds | DataFrame (T, z_dim) | Variational posterior standard deviations |
| elbo_history | list[float] | ELBO value per training epoch (for convergence monitoring) |
| model | DMM | Trained PyTorch module instance |
| tickers | list[str] | Asset names in training order |
| input_mean | ndarray (n_assets,) | Per-asset mean used for input standardisation |
| input_std | ndarray (n_assets,) | Per-asset std used for input standardisation |

Fitting and Blending

from optimizer.moments import DMMConfig, fit_dmm, blend_moments_dmm

config = DMMConfig(z_dim=16, num_epochs=500, random_state=42)
result = fit_dmm(returns, config)

# Check convergence
import matplotlib.pyplot as plt
plt.plot(result.elbo_history)
plt.xlabel("Epoch")
plt.ylabel("ELBO")
plt.title("DMM Training Convergence")
plt.show()

# Produce blended moments via Monte Carlo posterior-predictive sampling
mu, cov = blend_moments_dmm(result, n_mc_samples=500, seed=42)

blend_moments_dmm() works by:

  1. Sampling \( z_T \sim q(z_T \mid x_{1:T}) \) from the variational posterior at the last time step.
  2. Propagating through the transition: \( z_{T+1} \sim p(z_{T+1} \mid z_T) \).
  3. Emitting: \( x_{T+1} \sim p(x_{T+1} \mid z_{T+1}) \).
  4. Applying the law of total variance across Monte Carlo samples.
  5. Un-standardising back to the original return scale.

Critical limitation: The DMM produces diagonal covariance only. The law-of-total-variance computation yields \( \text{diag}(\mathbb{E}[\text{Var}[X \mid Z]] + \text{Var}[\mathbb{E}[X \mid Z]]) \), so all off-diagonal entries are zero. This makes the DMM unsuitable as a standalone covariance estimator for portfolio optimization -- it should be combined with a separate cross-sectional covariance estimate.
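
One way to follow that advice, continuing from the snippet above (cov is the diagonal matrix returned by blend_moments_dmm() and returns is the training DataFrame; the recombination is a sketch, not a library function):

import numpy as np

dmm_var = np.diag(cov)                                 # DMM per-asset variances
corr = np.corrcoef(returns.to_numpy(), rowvar=False)   # cross-sectional correlation
d = np.sqrt(dmm_var)
cov_combined = corr * np.outer(d, d)                   # Sigma = D^{1/2} R D^{1/2}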


Log-Normal Multi-Period Scaling

When working with multi-period investment horizons, daily log-return moments must be scaled to the target horizon. The _scaling module provides two methods: an exact log-normal formula and a linear (delta-method) approximation.

The Scaling Problem

If daily log-returns \( r_t \sim N(\mu, \Sigma) \) are i.i.d., then the cumulative simple return over \( T \) days is:

\[ R_T = \exp\left(\sum_{t=1}^{T} r_t\right) - 1 \]

Because the sum of normals is normal, \( \sum r_t \sim N(\mu T, \Sigma T) \), but \( R_T \) is not normal -- it is log-normally distributed. The scaling functions convert log-return parameters to simple-return space.

Expected Return (Both Methods)

Jensen's inequality correction gives the expected simple return:

\[ \mathbb{E}[R_T^i] = \exp\!\left(\mu_i T + \frac{1}{2} \sigma_i^2 T\right) - 1 \]

where \( \sigma_i^2 = \Sigma_{ii} \) is the daily variance of asset \( i \). This formula is identical for both the exact and linear methods.

Covariance: Exact Method

The exact log-normal covariance is:

\[ \text{Cov}[R_T^i, R_T^j] = \exp\!\left((\mu_i + \mu_j) T + \frac{1}{2}(\sigma_i^2 + \sigma_j^2) T\right) \cdot \left(\exp(\sigma_{ij} \cdot T) - 1\right) \]

where \( \sigma_{ij} = \Sigma_{ij} \) is the daily covariance between assets \( i \) and \( j \).

Covariance: Linear Method

The delta-method (first-order Taylor) approximation:

\[ \Sigma_T \approx \Sigma \cdot T \]

This is accurate for short horizons and small variances but increasingly biased as \( T \) grows or volatility increases.
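
A quick single-asset check of that bias, setting \( i = j \) in the exact formula (the numbers are illustrative):

import numpy as np

mu, var = 0.0004, 0.0004   # daily log-return mean and variance
for T in (21, 63, 252):
    exact = np.exp(2 * mu * T + var * T) * (np.exp(var * T) - 1)
    linear = var * T
    print(T, round(exact / linear, 3))   # ratio drifts above 1 as T grows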

Function Signatures

apply_lognormal_correction

def apply_lognormal_correction(
    mu: pd.Series,        # Daily log-return expected values
    cov: pd.DataFrame,    # Daily log-return covariance matrix
    horizon: int,         # Trading days (21=monthly, 63=quarterly, 252=annual)
    method: str = "exact" # "exact" or "linear"
) -> tuple[pd.Series, pd.DataFrame]:
    ...

scale_moments_to_horizon

A higher-level wrapper that validates inputs (square covariance, aligned indices, non-negative diagonal) before delegating to apply_lognormal_correction.

def scale_moments_to_horizon(
    mu: pd.Series,
    cov: pd.DataFrame,
    daily_horizon: int,
    method: str = "exact"
) -> tuple[pd.Series, pd.DataFrame]:
    ...

Usage

import pandas as pd
from optimizer.moments import apply_lognormal_correction, scale_moments_to_horizon

# Daily log-return moments
mu_daily = pd.Series({"AAPL": 0.0005, "MSFT": 0.0004, "GOOG": 0.0003})
cov_daily = pd.DataFrame(
    [[0.0004, 0.0001, 0.0001],
     [0.0001, 0.0003, 0.0001],
     [0.0001, 0.0001, 0.0005]],
    index=mu_daily.index,
    columns=mu_daily.index,
)

# Scale to quarterly horizon (63 trading days), exact method
mu_q, cov_q = apply_lognormal_correction(mu_daily, cov_daily, horizon=63, method="exact")

# Scale to annual horizon (252 trading days), linear approximation
mu_a, cov_a = scale_moments_to_horizon(mu_daily, cov_daily, daily_horizon=252, method="linear")

Important: Inputs must be log-return parameters (mean and covariance of log-returns). The outputs are in simple-return space (\( \mathbb{E}[R_T] \) and \( \text{Cov}[R_T] \)). Feeding simple-return moments into these functions will produce incorrect results.


Common Gotchas

1. blend_moments_by_regime() vs. HMMBlendedCovariance

blend_moments_by_regime() computes only the within-regime weighted covariance:

\[ \Sigma_{\text{simple}} = \sum_s p_s \cdot \Sigma_s \]

HMMBlendedCovariance adds the between-regime mean-dispersion term:

\[ \Sigma_{\text{full}} = \sum_s p_s \left[ \Sigma_s + (\mu_s - \mu)(\mu_s - \mu)^\top \right] \]

The difference \( \Sigma_{\text{full}} - \Sigma_{\text{simple}} = \sum_s p_s (\mu_s - \mu)(\mu_s - \mu)^\top \) is a positive semi-definite matrix. When regime means differ materially (e.g., 10% annualised spread between bull and bear), this term can be a significant fraction of total variance.

Rule: For optimizer inputs, always use HMMBlendedCovariance. Reserve blend_moments_by_regime() for quick diagnostics or situations where you explicitly want to ignore between-regime uncertainty.

2. Filtered vs. Smoothed Probabilities for Backtests

The HMMBlendedMu and HMMBlendedCovariance classes use filtered (forward-only) probabilities from the last time step. This is the correct causal choice for backtesting. If you manually call blend_moments_by_regime(), it also uses the filtered probabilities from result.filtered_probs.iloc[-1].

Never use result.smoothed_probs for weight computation in a backtest -- it conditions on the entire sequence and introduces look-ahead bias.

3. Log-Return vs. Simple-Return Inputs for Scaling

The apply_lognormal_correction and scale_moments_to_horizon functions expect log-return (continuously compounded) parameters as input. The output is in simple-return space. If you accidentally pass simple-return moments as input, the resulting expected returns and covariances will be biased upward.
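
If you start from simple returns, convert first. A minimal sketch, assuming simple_returns is a DataFrame of arithmetic returns:

import numpy as np

log_returns = np.log1p(simple_returns)       # r_log = ln(1 + r_simple)
mu_log, cov_log = log_returns.mean(), log_returns.cov()
mu_h, cov_h = apply_lognormal_correction(mu_log, cov_log, horizon=63)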

4. DMM Produces Diagonal Covariance

The Deep Markov Model's blend_moments_dmm() returns a diagonal covariance matrix. All off-diagonal covariances are zero. This means the DMM cannot capture cross-asset dependencies and should not be used as a standalone covariance estimator. Consider combining the DMM's variance estimates with a separate cross-sectional covariance model.

5. The Fitted Prior Attribute

After fitting a prior with build_prior(), the estimated distribution is stored in the return_distribution_ attribute (not prior_model_). It contains mu, covariance, returns, sample_weight, and cholesky.

prior = build_prior(config)
prior.fit(X_returns)
print(prior.return_distribution_.mu)
print(prior.return_distribution_.covariance)

6. Factor Model Views

When the prior is wrapped in a FactorModel (via use_factor_model=True), any downstream views (e.g., Black-Litterman) must reference factor names (e.g., MTUM, QUAL), not asset names.


Complete Example

import pandas as pd
from optimizer.moments import (
    MomentEstimationConfig,
    MuEstimatorType,
    CovEstimatorType,
    ShrinkageMethod,
    HMMConfig,
    build_prior,
    build_mu_estimator,
    build_cov_estimator,
    fit_hmm,
    select_hmm_n_states,
    apply_lognormal_correction,
)

# --- 1. Basic prior construction ---
config = MomentEstimationConfig.for_shrunk_denoised()
prior = build_prior(config)
prior.fit(X_returns)
print("Expected returns:", prior.return_distribution_.mu)

# --- 2. HMM regime analysis ---
hmm_cfg = HMMConfig(n_states=2, random_state=42)
result = fit_hmm(returns, hmm_cfg)

# Transition probabilities
print("Transition matrix:\n", result.transition_matrix)

# Current regime belief (causal)
print("Filtered probs (last):", result.filtered_probs.iloc[-1].to_dict())

# --- 3. Model selection ---
best_n = select_hmm_n_states(returns, candidate_n_states=(2, 3, 4), criterion="bic")
print(f"BIC-optimal states: {best_n}")

# --- 4. HMM-blended prior in optimizer ---
config_hmm = MomentEstimationConfig.for_hmm_blended(n_states=best_n)
prior_hmm = build_prior(config_hmm)

from skfolio.optimization import MeanRisk
model = MeanRisk(prior_estimator=prior_hmm)
model.fit(X_returns)

# --- 5. Multi-period scaling ---
mu_daily = pd.Series({"AAPL": 0.0005, "MSFT": 0.0004})
cov_daily = pd.DataFrame(
    [[0.0004, 0.0001], [0.0001, 0.0003]],
    index=mu_daily.index,
    columns=mu_daily.index,
)
mu_annual, cov_annual = apply_lognormal_correction(
    mu_daily, cov_daily, horizon=252, method="exact"
)
print("Annual expected return:", mu_annual)
print("Annual covariance:\n", cov_annual)