ML Modelling

Machine Learning Modelling Module

This module provides classes for machine learning model pipelines, feature selection, and prediction functionality for crime density forecasting.

class predspot.ml_modelling.FeatureSelection(estimator, debug=False)

Bases: TransformerMixin, BaseEstimator

Feature selection transformer.

Parameters:
  • estimator – Scikit-learn compatible feature selector

  • debug (bool, optional) – Enable debug printing. Defaults to False

fit(x, y=None)

Fit the feature selector.

Parameters:
  • x (pandas.DataFrame) – Input features

  • y (pandas.Series, optional) – Target variable

Returns:

The fitted instance

Return type:

FeatureSelection

set_fit_request(*, x: bool | None | str = '$UNCHANGED$') -> FeatureSelection

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

x (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for x parameter in fit.

Returns:

self – The updated object.

Return type:

object

set_transform_request(*, x: bool | None | str = '$UNCHANGED$') -> FeatureSelection

Request metadata passed to the transform method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to transform if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to transform.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

x (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for x parameter in transform.

Returns:

self – The updated object.

Return type:

object

transform(x)

Transform features using the feature selector.

Parameters:

x (pandas.DataFrame) – Input features

Returns:

Selected features

Return type:

pandas.DataFrame
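
The following is a minimal sketch of how a DataFrame-preserving wrapper around a scikit-learn selector could behave. It is not the predspot implementation: the class name DataFrameSelector, the choice of SelectKBest with f_regression, and the toy column names are all assumptions made for illustration.

```python
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin, clone
from sklearn.feature_selection import SelectKBest, f_regression


class DataFrameSelector(TransformerMixin, BaseEstimator):
    """Hypothetical wrapper (not predspot code) that keeps column labels."""

    def __init__(self, estimator):
        self.estimator = estimator

    def fit(self, x, y=None):
        # Fit a clone so the selector passed by the caller is left untouched.
        self.estimator_ = clone(self.estimator).fit(x, y)
        return self

    def transform(self, x):
        # get_support() returns a boolean mask over the input columns.
        mask = self.estimator_.get_support()
        return x.loc[:, mask]


# Toy data: "lag_1" tracks the target closely, the other columns do not.
features = pd.DataFrame({
    "lag_1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "lag_2": [6.0, 1.0, 4.0, 2.0, 5.0, 3.0],
    "noise": [0.3, 0.1, 0.4, 0.1, 0.5, 0.9],
})
target = pd.Series([1.1, 2.0, 2.9, 4.2, 5.1, 5.8])

selector = DataFrameSelector(SelectKBest(f_regression, k=1))
selected = selector.fit_transform(features, target)
```

Returning a pandas.DataFrame rather than a bare array keeps the feature names available to later pipeline steps.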

class predspot.ml_modelling.Model(estimator, debug=False)

Bases: RegressorMixin, BaseEstimator

Model wrapper for crime density prediction.

Parameters:
  • estimator – Scikit-learn compatible regression estimator

  • debug (bool, optional) – Enable debug printing. Defaults to False

fit(x, y=None)

Fit the regression model.

Parameters:
  • x (pandas.DataFrame) – Input features

  • y (pandas.Series, optional) – Target variable

Returns:

The fitted instance

Return type:

Model

predict(x)

Make predictions using the fitted model.

Parameters:

x (pandas.DataFrame) – Input features

Returns:

Predictions with ‘crime_density’ column

Return type:

pandas.DataFrame
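
As an illustration of the documented return shape, the sketch below wraps a regressor so that predict returns a pandas.DataFrame with a single 'crime_density' column. The class name DensityRegressor, the use of LinearRegression, and the decision to reuse the input index are assumptions, not details taken from predspot.

```python
import pandas as pd
from sklearn.base import BaseEstimator, RegressorMixin, clone
from sklearn.linear_model import LinearRegression


class DensityRegressor(RegressorMixin, BaseEstimator):
    """Hypothetical wrapper (not predspot code) returning labelled predictions."""

    def __init__(self, estimator):
        self.estimator = estimator

    def fit(self, x, y=None):
        # Fit a clone so the estimator passed by the caller is left untouched.
        self.estimator_ = clone(self.estimator).fit(x, y)
        return self

    def predict(self, x):
        values = self.estimator_.predict(x)
        # Assumption: predictions are indexed like the input rows.
        return pd.DataFrame({"crime_density": values}, index=x.index)


train = pd.DataFrame({"lag_1": [0.0, 1.0, 2.0, 3.0]}, index=["a", "b", "c", "d"])
density = pd.Series([1.0, 3.0, 5.0, 7.0], index=train.index)

model = DensityRegressor(LinearRegression()).fit(train, density)
predictions = model.predict(train)
```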

set_fit_request(*, x: bool | None | str = '$UNCHANGED$') -> Model

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

x (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for x parameter in fit.

Returns:

self – The updated object.

Return type:

object

set_predict_request(*, x: bool | None | str = '$UNCHANGED$') -> Model

Request metadata passed to the predict method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to predict.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

x (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for x parameter in predict.

Returns:

self – The updated object.

Return type:

object

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') -> Model

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

Returns:

self – The updated object.

Return type:

object

class predspot.ml_modelling.PredictionPipeline(mapping, fextraction, estimator, debug=False)

Bases: RegressorMixin, BaseEstimator

Complete pipeline for crime density prediction.

Parameters:
  • mapping – Spatial mapping transformer

  • fextraction – Feature extraction transformer

  • estimator – Scikit-learn compatible pipeline or estimator

  • debug (bool, optional) – Enable debug printing. Defaults to False

property dataset

Current dataset being used

Type:

Dataset

evaluate(scoring, cv=5)

Evaluate model performance using time series cross-validation.

Parameters:
  • scoring (str) – Scoring metric (‘r2’ or ‘mse’)

  • cv (int) – Number of cross-validation folds

Returns:

Scores for each fold

Return type:

list

Raises:

Exception – If the scoring metric is invalid
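
Time series cross-validation trains each fold on earlier observations and tests on later ones, so no future data leaks into training. The sketch below shows one way to do this with scikit-learn's TimeSeriesSplit. The helper name evaluate_time_series, the mapping of 'mse' to scikit-learn's 'neg_mean_squared_error' scorer, and the use of ValueError are assumptions for illustration; predspot's own evaluate may differ.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import TimeSeriesSplit, cross_val_score

# Assumed mapping from the documented metric names to scikit-learn scorers.
SCORERS = {"r2": "r2", "mse": "neg_mean_squared_error"}


def evaluate_time_series(estimator, X, y, scoring, cv=5):
    """Hypothetical helper: score an estimator on forward-chaining folds."""
    if scoring not in SCORERS:
        raise ValueError(f"Invalid scoring metric: {scoring!r}")
    # Each split trains on an initial segment and tests on the block after it.
    splitter = TimeSeriesSplit(n_splits=cv)
    scores = cross_val_score(estimator, X, y, scoring=SCORERS[scoring], cv=splitter)
    if scoring == "mse":
        # scikit-learn reports negated MSE; flip the sign back.
        scores = -scores
    return scores.tolist()


# Toy series: rows are ordered in time and the target is exactly linear.
X = np.arange(24, dtype=float).reshape(-1, 1)
y = 2.0 * X.ravel() + 1.0

fold_scores = evaluate_time_series(LinearRegression(), X, y, scoring="mse", cv=5)
```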

property feature_importances

Get feature importance scores.

Returns:

Feature importance scores

Return type:

pandas.DataFrame

Raises:

Exception – If the model has not been fitted or does not support feature importances
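
For estimators that expose scikit-learn's feature_importances_ attribute, a table of scores can be assembled as in the sketch below. The helper name importance_table, the 'feature' and 'importance' column names, and the descending sort are assumptions; the layout of the DataFrame returned by predspot is not specified here.

```python
import pandas as pd
from sklearn.ensemble import RandomForestRegressor


def importance_table(model, columns):
    """Hypothetical helper: tabulate importances of a fitted estimator."""
    if not hasattr(model, "feature_importances_"):
        # Unfitted models and models without importances both end up here.
        raise AttributeError("model is not fitted or has no feature importances")
    table = pd.DataFrame({
        "feature": list(columns),
        "importance": model.feature_importances_,
    })
    return table.sort_values("importance", ascending=False).reset_index(drop=True)


# Toy data: the target depends strongly on "lag_1" and weakly on "lag_2".
features = pd.DataFrame({
    "lag_1": [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0],
    "lag_2": [1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0],
})
target = 3.0 * features["lag_1"] + 0.1 * features["lag_2"]

forest = RandomForestRegressor(n_estimators=20, random_state=0).fit(features, target)
importances = importance_table(forest, features.columns)
```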

fit(dataset, y=None)

Fit the complete prediction pipeline.

Parameters:
  • dataset – Input dataset containing crimes and study area

  • y – Ignored, present for scikit-learn compatibility

Returns:

The fitted instance

Return type:

PredictionPipeline

Raises:

Exception – If fitting fails

property grid

Spatial grid used for mapping

Type:

GeoDataFrame

predict()

Make predictions for the next time step.

Returns:

Predictions for next time step

Return type:

pandas.DataFrame

set_fit_request(*, dataset: bool | None | str = '$UNCHANGED$') -> PredictionPipeline

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

dataset (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for dataset parameter in fit.

Returns:

self – The updated object.

Return type:

object

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') -> PredictionPipeline

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

Returns:

self – The updated object.

Return type:

object

property stseries

Spatio-temporal series

Type:

pandas.Series