Predspot¶
Overview¶
Predspot is an early-stage Python library for spatio-temporal crime prediction and hotspot detection. It combines machine learning techniques with spatial analysis to help predict and visualize crime patterns across time and space. To use the library, you will generally need a crime dataset and a study area.
The quality of the results is not guaranteed, but the library is a reasonable starting point for spatio-temporal crime prediction. The project is barely maintained, so if you want to collaborate, please open an issue or a PR.
Important Notice¶
Warning
This project was developed as part of a master’s thesis and is currently in an archived state. While the core functionality exists, you may encounter compatibility issues with newer Python package versions. The code can work with some effort, but please note:
This is not production-ready software
Some dependencies are outdated and may require specific versions
You might need to modify some code to work with newer package versions
The project was created for research purposes
However, we believe the methodologies and approaches used here are still valuable! If you’re interested in crime hotspot prediction, feel free to:
Use this as a reference implementation
Adapt the code to modern dependencies
Build upon these concepts for your own projects
Contribute to modernizing the codebase
We welcome anyone interested in reviving or learning from this project!
Key Features¶
Spatial and temporal crime mapping
Feature engineering for time series data
Machine learning-based prediction pipeline
Crime hotspot detection using Kernel Density Estimation
Visualization tools for crime patterns
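To make the hotspot detection idea concrete, here is a minimal sketch of Gaussian Kernel Density Estimation evaluated on a regular grid. It uses only NumPy and is not Predspot's own `KDE` class; the function name `gaussian_kde_on_grid`, the toy coordinates, and the bandwidth value are all assumptions made for illustration.

```python
import numpy as np


def gaussian_kde_on_grid(points, grid, bandwidth):
    """Evaluate a 2-D Gaussian kernel density at each grid point.

    points:    (n, 2) array of event coordinates
    grid:      (m, 2) array of grid point coordinates
    bandwidth: kernel standard deviation, in the same unit as the coordinates

    Hypothetical helper for illustration, not part of Predspot.
    """
    points = np.asarray(points, dtype=float)
    grid = np.asarray(grid, dtype=float)
    # Squared distance between every grid point and every event: shape (m, n)
    diff = grid[:, None, :] - points[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    # Isotropic 2-D Gaussian kernel, averaged over all events
    kernel = np.exp(-sq_dist / (2.0 * bandwidth ** 2)) / (2.0 * np.pi * bandwidth ** 2)
    return kernel.mean(axis=1)


# Toy data: a small cluster around (0, 0) and one isolated event at (5, 5)
events = np.array([[0.0, 0.0], [0.1, -0.1], [-0.1, 0.1], [5.0, 5.0]])

# Regular 8 x 8 grid covering the toy area
xs, ys = np.meshgrid(np.linspace(-1, 6, 8), np.linspace(-1, 6, 8))
grid = np.column_stack([xs.ravel(), ys.ravel()])

density = gaussian_kde_on_grid(events, grid, bandwidth=1.0)
hotspot = grid[np.argmax(density)]  # grid point with the highest estimated density
print("Hotspot grid point:", hotspot)
```

In this toy setup the densest grid point falls on the cluster rather than on the isolated event. With real data the coordinates would normally be projected to a metric system first, so that the bandwidth and grid resolution are expressed in metres.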
Quick Start¶
Basic usage example:
from predspot import Dataset, PredictionPipeline
from predspot.crime_mapping import KDE, create_gridpoints
from predspot.feature_engineering import Seasonality, Trend, Diff
# NOTE: PandasFeatureUnion (used below) must also be imported; its module is not shown in this example.
# Load and prepare data (crimes_df: crime DataFrame, study_area_gdf: study area GeoDataFrame)
dataset = Dataset(crimes_df, study_area_gdf)
# Create prediction pipeline
pipeline = PredictionPipeline(
    mapping=KDE(tfreq='M', grid=create_gridpoints(study_area_gdf, resolution=250)),
    fextraction=PandasFeatureUnion([
        ('seasonal', Seasonality(lags=12)),
        ('trend', Trend(lags=12)),
        ('diff', Diff(lags=12))
    ]),
    estimator=your_favorite_sklearn_model  # placeholder for a scikit-learn estimator
)
# Fit and predict
pipeline.fit(dataset)
predictions = pipeline.predict()
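The Quick Start configures `Seasonality`, `Trend`, and `Diff` with `lags=12`. As a rough intuition for what lag-based features look like, the sketch below builds lagged first differences of a monthly count series with plain pandas. It is not Predspot's `Diff` implementation; the helper name `lagged_diff_features`, the column naming, and the toy counts are assumptions.

```python
import pandas as pd


def lagged_diff_features(series, lags):
    """Build columns holding the first difference of `series` shifted by 1..lags periods.

    Hypothetical helper for illustration, not Predspot's Diff class.
    """
    diff = series.diff()
    features = pd.concat(
        {f"diff_lag_{k}": diff.shift(k) for k in range(1, lags + 1)},
        axis=1,
    )
    # Rows without a complete lag history contain NaN and are dropped
    return features.dropna()


# Toy monthly crime counts for a single grid cell (made-up values)
counts = pd.Series(
    [3, 5, 4, 8, 7, 9],
    index=pd.period_range("2020-01", periods=6, freq="M"),
    name="crimes",
)

features = lagged_diff_features(counts, lags=2)
print(features)
```

Each remaining row describes one month by how the count changed one and two months earlier, which is the kind of tabular input a scikit-learn estimator expects.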
Modules¶
The library consists of four main modules:
Dataset Preparation: Module for preparing and managing crime datasets and study areas.
Crime Mapping: Module for spatial and temporal crime mapping, including KDE-based hotspot detection.
Feature Engineering: Module for time series feature engineering, including seasonality, trend, and difference features.
ML Modelling: Module that implements the prediction pipeline and model evaluation.
And two utilities:
Installation¶
Create a conda environment and install the requirements:
conda create -n predspot python=3.8
conda activate predspot
conda install -y rtree geopandas # if this doesn't work, run `conda clean --all` and try again
pip install pandas statsmodels==0.10.2 geojsoncontour stldecompose scikit-learn matplotlib descartes
pip install .
Required dependencies:
pandas
geopandas
numpy
scikit-learn
scipy
stldecompose
matplotlib
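Because some dependencies may need specific versions, it can help to check what is installed in the active environment before debugging compatibility problems. The sketch below uses only the standard library (`importlib.metadata`, available since Python 3.8); the helper name `installed_versions` is an assumption, and the package list simply mirrors the dependencies named above.

```python
from importlib import metadata  # standard library since Python 3.8

# Distribution names taken from the dependency list above
REQUIRED = [
    "pandas",
    "geopandas",
    "numpy",
    "scikit-learn",
    "scipy",
    "stldecompose",
    "matplotlib",
]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


report = installed_versions(REQUIRED)
for name, version in report.items():
    print(f"{name}: {version or 'NOT INSTALLED'}")
```

The output only reports what is present; which versions actually work together with Predspot still has to be determined by trial.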
Input Data Format¶
The crime data should be a pandas DataFrame with the following required columns:
tag: Crime type
t: Timestamp
lon: Longitude
lat: Latitude
The study area should be a GeoDataFrame defining the geographical boundaries of interest.
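As an illustration of the crime data layout, the sketch below builds a small DataFrame with the four required columns and checks that none is missing. The values are made up, storing `t` as a pandas datetime is an assumption about what the library expects, and `check_crime_columns` is a hypothetical helper rather than a Predspot function. The study area GeoDataFrame is not covered here.

```python
import pandas as pd

REQUIRED_COLUMNS = ["tag", "t", "lon", "lat"]


def check_crime_columns(df):
    """Raise ValueError if a required column is missing (hypothetical helper)."""
    missing = [col for col in REQUIRED_COLUMNS if col not in df.columns]
    if missing:
        raise ValueError(f"missing required columns: {missing}")
    return df


# Made-up example records
crimes_df = pd.DataFrame({
    "tag": ["burglary", "robbery", "burglary"],            # crime type
    "t": pd.to_datetime([                                   # timestamp (assumed datetime dtype)
        "2019-01-03 22:15",
        "2019-01-05 08:40",
        "2019-02-11 19:05",
    ]),
    "lon": [-35.2094, -35.1987, -35.2110],                  # longitude
    "lat": [-5.7945, -5.8120, -5.7801],                     # latitude
})

check_crime_columns(crimes_df)
print(crimes_df.dtypes)
```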
Resources¶
For more information on the methods used in Predspot, consider reading further on the following topics:
Kernel Density Estimation for crime hotspot detection
Time series decomposition for feature engineering
Spatio-temporal crime prediction techniques
License¶
BSD 3-Clause.
Contributing¶
Contributions are welcome! Please feel free to submit a Pull Request.
Guidelines for contributing:
Fork the repository
Create your feature branch
Commit your changes
Push to the branch
Create a new Pull Request
Citation¶
If you use Predspot in your research, please cite us:
Araújo, A., & Cacho, N. (2019). Predspot: Predicting crime hotspots with machine learning.
Master's thesis, UFRN (Universidade Federal do Rio Grande do Norte), Natal, Brazil.
Araújo, A., Cacho, N., Bezerra, L., Vieira, C., & Borges, J. (2018, June).
Towards a crime hotspot detection framework for patrol planning.
In 2018 IEEE 20th International Conference on High Performance Computing and Communications;
IEEE 16th International Conference on Smart City;
IEEE 4th International Conference on Data Science and Systems (HPCC/SmartCity/DSS) (pp. 1256-1263). IEEE.