Feature Engineering
Feature Engineering Module
This module provides classes for time series feature engineering and transformation, including autoregressive features, differencing, seasonality, and trend decomposition.
- class predspot.feature_engineering.AR(lags, tfreq, debug=False)[source]
Bases: TimeSeriesFeatures
Autoregressive features implementation.
- apply_ts_decomposition(ts)[source]
Apply autoregressive transformation (identity).
- Parameters:
ts (pandas.Series) – Input time series
- Returns:
Original time series
- Return type:
pandas.Series
- property label
Feature label for autoregressive features
- Type:
str
- set_transform_request(*, stseries: bool | None | str = '$UNCHANGED$') AR
Request metadata passed to the transform method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to transform if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to transform.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
Note
This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.
- Parameters:
stseries (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for the stseries parameter in transform.
- Returns:
self – The updated object.
- Return type:
object
- class predspot.feature_engineering.Diff(lags, tfreq, debug=False)[source]
Bases: TimeSeriesFeatures
Difference features implementation.
- apply_ts_decomposition(ts)[source]
Apply difference transformation.
- Parameters:
ts (pandas.Series) – Input time series
- Returns:
Differenced time series
- Return type:
pandas.Series
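As a rough illustration of a difference transformation on a pandas.Series, the sketch below uses pandas' own diff method. The first-order difference and the dropping of the leading NaN are assumptions; the source does not state which order or NaN handling Diff uses.

```python
import pandas as pd

# Small daily series, invented for the example.
ts = pd.Series(
    [3.0, 5.0, 4.0, 8.0],
    index=pd.date_range("2024-01-01", periods=4, freq="D"),
)

# First-order difference: each value minus the previous one.
# The first element has no predecessor, so it becomes NaN and is dropped.
diffed = ts.diff().dropna()
```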
- property label
Feature label for difference features
- Type:
str
- set_transform_request(*, stseries: bool | None | str = '$UNCHANGED$') Diff
Request metadata passed to the transform method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to transform if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to transform.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
Note
This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.
- Parameters:
stseries (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for the stseries parameter in transform.
- Returns:
self – The updated object.
- Return type:
object
- class predspot.feature_engineering.FeatureScaling(estimator, debug=False)[source]
Bases: TransformerMixin, BaseEstimator
Feature scaling transformer.
- Parameters:
estimator – Scikit-learn compatible scaling estimator
debug (bool, optional) – Enable debug printing. Defaults to False
- set_transform_request(*, x: bool | None | str = '$UNCHANGED$') FeatureScaling
Request metadata passed to the transform method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to transform if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to transform.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
Note
This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.
- Parameters:
x (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for the x parameter in transform.
- Returns:
self – The updated object.
- Return type:
object
- transform(x)[source]
Transform features using the scaling estimator.
- Parameters:
x (pandas.DataFrame) – Input features
- Returns:
Scaled features
- Return type:
pandas.DataFrame
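The idea of wrapping a scikit-learn scaler so that a pandas.DataFrame goes in and a pandas.DataFrame comes out can be sketched as below. The helper scale_frame and the column names are hypothetical, and fitting inside the helper is an assumption; the source does not say when FeatureScaling fits its estimator.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler


def scale_frame(scaler, x):
    """Hypothetical helper: scale a DataFrame and keep its index and columns."""
    scaled = scaler.fit_transform(x)  # assumption: fit and transform in one step
    return pd.DataFrame(scaled, index=x.index, columns=x.columns)


# Invented feature table with two lag columns.
features = pd.DataFrame(
    {"lag_1": [1.0, 2.0, 3.0], "lag_2": [10.0, 20.0, 30.0]},
    index=["a", "b", "c"],
)
scaled = scale_frame(StandardScaler(), features)
```

Rebuilding the DataFrame explicitly keeps the row labels aligned with the input even when the scaler returns a plain NumPy array.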
- class predspot.feature_engineering.Seasonality(lags, tfreq, debug=False)[source]
Bases: TimeSeriesFeatures
Seasonal decomposition features implementation.
- apply_ts_decomposition(ts)[source]
Extract seasonal component from time series.
- Parameters:
ts (pandas.Series) – Input time series
- Returns:
Seasonal component
- Return type:
pandas.Series
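To make "seasonal component" concrete, here is a simplified classical additive decomposition written with pandas and NumPy only. It is a generic sketch, not predspot's implementation, which may rely on a different method or library; the function name seasonal_component and the period argument are assumptions, and the centred moving average used here is only exact for odd periods.

```python
import numpy as np
import pandas as pd


def seasonal_component(ts, period):
    """Hypothetical helper: additive seasonal component of a series."""
    # Estimate the trend with a centred moving average over one full period.
    trend = ts.rolling(window=period, center=True).mean()
    detrended = ts - trend
    # Average the detrended values at each position within the period.
    position = np.arange(len(ts)) % period
    seasonal_means = detrended.groupby(position).mean()
    # Centre the seasonal pattern so that it sums to zero over a period.
    seasonal_means = seasonal_means - seasonal_means.mean()
    return pd.Series(seasonal_means.loc[position].to_numpy(), index=ts.index)


# Invented daily series: weekly pattern plus a linear trend.
pattern = np.array([0.0, 1.0, 2.0, 3.0, 2.0, 1.0, 0.0])
values = np.tile(pattern, 4) + 0.5 * np.arange(28)
ts = pd.Series(values, index=pd.date_range("2024-01-01", periods=28, freq="D"))
seasonal = seasonal_component(ts, period=7)
```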
- property label
Feature label for seasonal features
- Type:
str
- set_transform_request(*, stseries: bool | None | str = '$UNCHANGED$') Seasonality
Request metadata passed to the transform method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to transform if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to transform.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
Note
This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.
- Parameters:
stseries (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for the stseries parameter in transform.
- Returns:
self – The updated object.
- Return type:
object
- class predspot.feature_engineering.TimeSeriesFeatures(lags, tfreq, debug=False)[source]
Bases: BaseEstimator, TransformerMixin
Base class for time series feature engineering.
- Parameters:
lags (int) – Number of time lags to use for feature creation
tfreq (str) – Time frequency (‘D’ for daily, ‘W’ for weekly, ‘M’ for monthly)
debug (bool, optional) – Enable debug printing. Defaults to False
- Raises:
AssertionError – If lags is not a positive integer or tfreq is invalid
- abstract apply_ts_decomposition(ts)[source]
Apply time series decomposition.
- Parameters:
ts (pandas.Series) – Input time series
- Returns:
Transformed time series
- Return type:
pandas.Series
- property label
Feature label identifier
- Type:
str
- property lags
Number of time lags
- Type:
int
- make_lag_df(ts)[source]
Create lagged features dataframe.
- Parameters:
ts (pandas.Series) – Input time series
- Returns:
(lag_df, aligned_ts) - Lagged features and aligned original series
- Return type:
tuple
- Raises:
AssertionError – If series length is less than number of lags
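A lagged-features table of this kind can be sketched with pandas shift. The function name make_lag_frame and the lag_k column names are hypothetical, and dropping the incomplete leading rows is an assumption about how the lagged frame and the aligned series are kept the same length.

```python
import pandas as pd


def make_lag_frame(ts, lags):
    """Hypothetical helper: return (lag_df, aligned_ts) for a single series."""
    assert len(ts) >= lags, "series is shorter than the number of lags"
    # Column lag_k holds the value observed k steps earlier.
    lag_df = pd.concat(
        {f"lag_{k}": ts.shift(k) for k in range(1, lags + 1)}, axis=1
    ).dropna()  # drop leading rows that lack a full lag history
    # Keep only the observations that have a complete set of lags.
    return lag_df, ts.loc[lag_df.index]


ts = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])
lag_df, aligned_ts = make_lag_frame(ts, lags=2)
```

Each row of lag_df then serves as the feature vector for the matching entry of aligned_ts.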
- set_transform_request(*, stseries: bool | None | str = '$UNCHANGED$') TimeSeriesFeatures
Request metadata passed to the transform method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to transform if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to transform.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
Note
This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.
- Parameters:
stseries (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for the stseries parameter in transform.
- Returns:
self – The updated object.
- Return type:
object
- transform(stseries)[source]
Transform the input series into lagged features.
- Parameters:
stseries (pandas.Series) – Input time series with multi-index (time, places)
- Returns:
Transformed features
- Return type:
pandas.DataFrame
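One plausible way to turn a series indexed by (time, places) into lagged features is to build the lags separately for each place, so that values never leak from one place into another. The sketch below assumes the index levels are named time and places and uses a hypothetical helper lag_features_by_place; the actual level names and internals of transform are not given in the source.

```python
import pandas as pd


def lag_features_by_place(stseries, lags):
    """Hypothetical helper: lagged features computed independently per place."""
    frames = []
    for place, ts in stseries.groupby(level="places"):
        ts = ts.droplevel("places")  # plain time-indexed series for one place
        lagged = pd.concat(
            {f"lag_{k}": ts.shift(k) for k in range(1, lags + 1)}, axis=1
        ).dropna()
        lagged["places"] = place
        # Restore the (time, places) multi-index.
        frames.append(lagged.set_index("places", append=True))
    return pd.concat(frames).sort_index()


# Invented example: three days, two places.
index = pd.MultiIndex.from_product(
    [pd.date_range("2024-01-01", periods=3, freq="D"), ["a", "b"]],
    names=["time", "places"],
)
stseries = pd.Series([0.0, 1.0, 2.0, 3.0, 4.0, 5.0], index=index)
features = lag_features_by_place(stseries, lags=1)
```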
- class predspot.feature_engineering.Trend(lags, tfreq, debug=False)[source]
Bases: TimeSeriesFeatures
Trend decomposition features implementation.
- apply_ts_decomposition(ts)[source]
Extract trend component from time series.
- Parameters:
ts (pandas.Series) – Input time series
- Returns:
Trend component
- Return type:
pandas.Series
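A trend component is commonly approximated by smoothing the series. The sketch below uses a centred rolling mean as a stand-in; the function name trend_component, the window argument, and the choice of a rolling mean are all assumptions, since the source does not say how Trend extracts the trend.

```python
import pandas as pd


def trend_component(ts, window):
    """Hypothetical helper: centred rolling mean as a simple trend estimate."""
    # min_periods=1 keeps the edges defined by averaging the values available.
    return ts.rolling(window=window, center=True, min_periods=1).mean()


ts = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])
trend = trend_component(ts, window=3)
```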
- property label
Feature label for trend features
- Type:
str
- set_transform_request(*, stseries: bool | None | str = '$UNCHANGED$') Trend
Request metadata passed to the transform method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to transform if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to transform.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
Note
This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.
- Parameters:
stseries (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for the stseries parameter in transform.
- Returns:
self – The updated object.
- Return type:
object