Feature Engineering
Feature Engineering Module
This module provides classes for time series feature engineering and transformation, including autoregressive features, differencing, seasonality, and trend decomposition.
- class predspot.feature_engineering.AR(lags, tfreq, debug=False)[source]
Bases: TimeSeriesFeatures
Autoregressive features implementation.
- apply_ts_decomposition(ts)[source]
Apply the autoregressive transformation. This is the identity: the series is returned unchanged.
- Parameters:
ts (pandas.Series) – Input time series
- Returns:
Original time series
- Return type:
pandas.Series
- property label
Feature label for autoregressive features
- Type:
str
- set_transform_request(*, stseries: bool | None | str = '$UNCHANGED$') → AR
Request metadata passed to the transform method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to transform if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to transform.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
Note
This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.
- Parameters:
stseries (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for the stseries parameter in transform.
- Returns:
self – The updated object.
- Return type:
object
- class predspot.feature_engineering.Diff(lags, tfreq, debug=False)[source]
Bases: TimeSeriesFeatures
Difference features implementation.
- apply_ts_decomposition(ts)[source]
Apply difference transformation.
- Parameters:
ts (pandas.Series) – Input time series
- Returns:
Differenced time series
- Return type:
pandas.Series
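Differencing replaces each observation with its change from the previous one, which removes a constant level or linear drift. A minimal sketch with pandas, assuming first-order differencing (the order that Diff applies is not stated in this reference):

```python
import pandas as pd

idx = pd.date_range("2024-01-01", periods=5, freq="D")
ts = pd.Series([10.0, 12.0, 15.0, 15.0, 11.0], index=idx)

# First-order difference: ts[t] - ts[t-1].
# The first observation has no predecessor, so it is dropped.
diffed = ts.diff().dropna()

print(diffed.tolist())
```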
- property label
Feature label for difference features
- Type:
str
- set_transform_request(*, stseries: bool | None | str = '$UNCHANGED$') → Diff
Request metadata passed to the transform method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to transform if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to transform.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
Note
This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.
- Parameters:
stseries (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for the stseries parameter in transform.
- Returns:
self – The updated object.
- Return type:
object
- class predspot.feature_engineering.FeatureScaling(estimator, debug=False)[source]
Bases: TransformerMixin, BaseEstimator
Feature scaling transformer.
- Parameters:
estimator – Scikit-learn compatible scaling estimator
debug (bool, optional) – Enable debug printing. Defaults to False
- set_transform_request(*, x: bool | None | str = '$UNCHANGED$') → FeatureScaling
Request metadata passed to the transform method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to transform if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to transform.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
Note
This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.
- Parameters:
x (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for the x parameter in transform.
- Returns:
self – The updated object.
- Return type:
object
- transform(x)[source]
Transform features using the scaling estimator.
- Parameters:
x (pandas.DataFrame) – Input features
- Returns:
Scaled features
- Return type:
pandas.DataFrame
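Scikit-learn scalers return a NumPy array by default, so a wrapper that promises a pandas.DataFrame has to restore the index and column labels. A minimal sketch of that idea, assuming MinMaxScaler as the wrapped estimator and made-up column names; it does not reproduce the FeatureScaling source:

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

x = pd.DataFrame(
    {"lag_1": [0.0, 5.0, 10.0], "lag_2": [2.0, 4.0, 6.0]},
    index=["a", "b", "c"],
)

scaler = MinMaxScaler().fit(x)

# Rebuild the DataFrame so row and column labels survive the scaling step.
scaled = pd.DataFrame(scaler.transform(x), index=x.index, columns=x.columns)

print(scaled)
```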
- class predspot.feature_engineering.Seasonality(lags, tfreq, debug=False)[source]
Bases: TimeSeriesFeatures
Seasonal decomposition features implementation.
- apply_ts_decomposition(ts)[source]
Extract seasonal component from time series.
- Parameters:
ts (pandas.Series) – Input time series
- Returns:
Seasonal component
- Return type:
pandas.Series
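This reference does not say which decomposition method Seasonality uses. As an illustration only, the sketch below computes a naive additive seasonal estimate with pandas: average every position in the cycle, then centre the result on zero. The weekly period and the synthetic data are assumptions.

```python
import numpy as np
import pandas as pd

period = 7
pattern = np.array([3.0, -1.0, -2.0, 0.0, 1.0, 2.0, -3.0])  # sums to zero
idx = pd.date_range("2024-01-01", periods=4 * period, freq="D")
ts = pd.Series(50.0 + np.tile(pattern, 4), index=idx)

# Position of each observation within its cycle (0..period-1).
phase = np.arange(len(ts)) % period

# Naive additive estimate: per-phase mean minus the overall mean.
seasonal = ts.groupby(phase).transform("mean") - ts.mean()

print(seasonal.head(period).tolist())
```

The estimate is only reliable when the series has no trend; a trending series would need detrending first.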
- property label
Feature label for seasonal features
- Type:
str
- set_transform_request(*, stseries: bool | None | str = '$UNCHANGED$') → Seasonality
Request metadata passed to the transform method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to transform if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to transform.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
Note
This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.
- Parameters:
stseries (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for the stseries parameter in transform.
- Returns:
self – The updated object.
- Return type:
object
- class predspot.feature_engineering.TimeSeriesFeatures(lags, tfreq, debug=False)[source]
Bases: BaseEstimator, TransformerMixin
Base class for time series feature engineering.
- Parameters:
lags (int) – Number of time lags to use for feature creation
tfreq (str) – Time frequency (‘D’ for daily, ‘W’ for weekly, ‘M’ for monthly)
debug (bool, optional) – Enable debug printing. Defaults to False
- Raises:
AssertionError – If lags is not a positive integer or tfreq is invalid
- abstract apply_ts_decomposition(ts)[source]
Apply time series decomposition.
- Parameters:
ts (pandas.Series) – Input time series
- Returns:
Transformed time series
- Return type:
pandas.Series
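Because apply_ts_decomposition is abstract, each subclass supplies only that one step while the base class keeps the shared logic. The pattern can be sketched as follows; DecompositionBase, Identity, FirstDifference and run are hypothetical names chosen for illustration and are not predspot classes:

```python
from abc import ABC, abstractmethod

import pandas as pd


class DecompositionBase(ABC):
    """Hypothetical base class with a single abstract hook."""

    @abstractmethod
    def apply_ts_decomposition(self, ts: pd.Series) -> pd.Series:
        ...

    def run(self, ts: pd.Series) -> pd.Series:
        # Shared logic calls the hook; subclasses only supply the hook.
        return self.apply_ts_decomposition(ts)


class Identity(DecompositionBase):
    def apply_ts_decomposition(self, ts):
        return ts  # AR-style: the series passes through unchanged


class FirstDifference(DecompositionBase):
    def apply_ts_decomposition(self, ts):
        return ts.diff().dropna()


ts = pd.Series([1.0, 4.0, 9.0])
results = {
    type(t).__name__: t.run(ts).tolist()
    for t in (Identity(), FirstDifference())
}
print(results)
```

Instantiating the base class directly raises TypeError, which is how Python enforces the abstract hook.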
- property label
Feature label identifier
- Type:
str
- property lags
Number of time lags
- Type:
int
- make_lag_df(ts)[source]
Create lagged features dataframe.
- Parameters:
ts (pandas.Series) – Input time series
- Returns:
(lag_df, aligned_ts) - Lagged features and aligned original series
- Return type:
tuple
- Raises:
AssertionError – If series length is less than number of lags
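One common way to build such a lag frame is to shift the series once per lag and drop the rows that lack a full history. The sketch below follows the documented contract (a tuple of lagged features and the aligned series, with an AssertionError for short input). The function name make_lag_df_sketch and the lag_k column names are assumptions, not predspot's own.

```python
import pandas as pd


def make_lag_df_sketch(ts: pd.Series, lags: int):
    """Hypothetical helper: build `lags` shifted copies of `ts`."""
    assert len(ts) >= lags, "series is shorter than the number of lags"
    # Column lag_k holds the value observed k steps earlier.
    lag_df = pd.concat(
        {f"lag_{k}": ts.shift(k) for k in range(1, lags + 1)}, axis=1
    )
    lag_df = lag_df.dropna()  # the first `lags` rows lack a full history
    return lag_df, ts.loc[lag_df.index]


ts = pd.Series(
    [1.0, 2.0, 3.0, 4.0, 5.0],
    index=pd.date_range("2024-01-01", periods=5, freq="D"),
)
lag_df, aligned_ts = make_lag_df_sketch(ts, lags=2)
print(lag_df)
```

Returning the aligned series alongside the frame keeps features and targets on the same index, so they can be passed to an estimator together.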
- set_transform_request(*, stseries: bool | None | str = '$UNCHANGED$') → TimeSeriesFeatures
Request metadata passed to the transform method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to transform if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to transform.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
Note
This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.
- Parameters:
stseries (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for the stseries parameter in transform.
- Returns:
self – The updated object.
- Return type:
object
- transform(stseries)[source]
Transform the input series into lagged features.
- Parameters:
stseries (pandas.Series) – Input time series with multi-index (time, places)
- Returns:
Transformed features
- Return type:
pandas.DataFrame
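With a (time, places) multi-index, lagging has to happen within each place so that one location's history never fills another's features. A minimal sketch of that grouping with pandas; the level names, the lag_k column names and the data are assumptions, and the code is not taken from the transform source:

```python
import pandas as pd

times = pd.date_range("2024-01-01", periods=4, freq="D")
index = pd.MultiIndex.from_product(
    [times, ["A", "B"]], names=["time", "places"]
)
# Place A observes 1, 2, 3, 4; place B observes 10, 20, 30, 40.
stseries = pd.Series(
    [1.0, 10.0, 2.0, 20.0, 3.0, 30.0, 4.0, 40.0], index=index
)

lags = 2
by_place = stseries.groupby(level="places")

# Shift inside each group, then drop rows without a full lag history.
features = pd.concat(
    {f"lag_{k}": by_place.shift(k) for k in range(1, lags + 1)}, axis=1
).dropna()

print(features)
```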
- class predspot.feature_engineering.Trend(lags, tfreq, debug=False)[source]
Bases: TimeSeriesFeatures
Trend decomposition features implementation.
- apply_ts_decomposition(ts)[source]
Extract trend component from time series.
- Parameters:
ts (pandas.Series) – Input time series
- Returns:
Trend component
- Return type:
pandas.Series
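The method Trend uses to extract the trend is not stated in this reference. As an illustration only, a centred moving average whose window spans one full cycle cancels the seasonal pattern and leaves the trend. The weekly period, the slope and the synthetic data below are assumptions.

```python
import numpy as np
import pandas as pd

period = 7
pattern = np.array([3.0, -1.0, -2.0, 0.0, 1.0, 2.0, -3.0])  # sums to zero
idx = pd.date_range("2024-01-01", periods=4 * period, freq="D")
slope = 0.5
ts = pd.Series(slope * np.arange(len(idx)) + np.tile(pattern, 4), index=idx)

# A window of one full period averages the cycle out, leaving the trend.
# The first and last three points have no complete window and are dropped.
trend = ts.rolling(window=period, center=True).mean().dropna()

print(trend.head())
```

On this synthetic series the recovered trend is the straight line with the chosen slope, shortened by half a window at each end.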
- property label
Feature label for trend features
- Type:
str
- set_transform_request(*, stseries: bool | None | str = '$UNCHANGED$') → Trend
Request metadata passed to the transform method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to transform if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to transform.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
Note
This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.
- Parameters:
stseries (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for the stseries parameter in transform.
- Returns:
self – The updated object.
- Return type:
object