Project information

  • Category: Classification
  • Project date: Dec, 2019
  • Project URL: Github
Photo by Etienne Martin on Unsplash

Identifying Suspicious Activities In Financial Data

Exploring the potential of machine learning algorithms, compared with a regular ETL approach, to detect suspicious patterns of customer activity. This project was completed under Dr. Atif Farid Mohammad's guidance in the Machine Learning course at UNC Charlotte.

Project Requirements

Find suspicious activity patterns in historical customer transactions for the compliance department of a bank.

  • Find the best-suited supervised machine learning algorithm to reduce false positives.

Skills Required

Python
Pandas
NumPy
seaborn
Logistic Regression
Random Forest Classifier
Matplotlib

Techniques

Major Tasks

  • Checking the data summary, correlations, etc.
  • Creating visualizations to explore feature correlations.
  • Comparing the results of Logistic Regression and Random Forest against the naive ETL approach using recall and F1 scores.
  • Results: false positives were reduced significantly with the machine learning approach. The naive logic produced 17.47% false positives, versus 0.2817% for the Random Forest classifier.
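The comparison in the tasks above can be illustrated with a small, dependency-free sketch. Everything in it is an assumption made for illustration: the toy amounts and labels, the `naive_rule` threshold, and the hypothetical model predictions are not taken from the project's dataset or notebook. The false positive figure is computed here as FP / (FP + TN), which may differ from how the percentages above were derived.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Counts of a binary confusion matrix (1 = suspicious, 0 = normal)."""
    tp: int = 0
    fp: int = 0
    tn: int = 0
    fn: int = 0

    @property
    def precision(self) -> float:
        denom = self.tp + self.fp
        return self.tp / denom if denom else 0.0

    @property
    def recall(self) -> float:
        denom = self.tp + self.fn
        return self.tp / denom if denom else 0.0

    @property
    def f1(self) -> float:
        p, r = self.precision, self.recall
        return 2 * p * r / (p + r) if (p + r) else 0.0

    @property
    def false_positive_rate(self) -> float:
        # Assumed definition: share of normal transactions wrongly flagged.
        denom = self.fp + self.tn
        return self.fp / denom if denom else 0.0


def confusion(y_true, y_pred) -> Confusion:
    """Tally the confusion matrix from true and predicted labels."""
    c = Confusion()
    for t, p in zip(y_true, y_pred):
        if t and p:
            c.tp += 1
        elif p:
            c.fp += 1
        elif t:
            c.fn += 1
        else:
            c.tn += 1
    return c


def naive_rule(amount, threshold=9000):
    """Hypothetical ETL-style rule: flag any amount above a fixed threshold."""
    return int(amount > threshold)


# Hypothetical toy data, not the project's dataset.
amounts = [120, 9800, 15000, 300, 11000, 50, 20000, 700]
y_true = [0, 0, 1, 0, 0, 0, 1, 0]

naive_pred = [naive_rule(a) for a in amounts]
# Stand-in for a trained classifier's output on the same rows (hypothetical).
model_pred = [0, 0, 1, 0, 0, 0, 1, 0]

naive = confusion(y_true, naive_pred)
model = confusion(y_true, model_pred)

for name, c in (("naive rule", naive), ("model", model)):
    print(f"{name}: recall={c.recall:.3f} f1={c.f1:.3f} "
          f"fpr={c.false_positive_rate:.3f}")
```

On this toy data both approaches catch every suspicious row, so recall alone does not separate them; the difference shows up in F1 and in the false positive rate, which is why those metrics matter when the goal is reducing false alerts.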

Completed using Jupyter Notebook. I prepared the dataset myself to mimic real-world transaction data.