Project information

  • Category: Classification
  • Project date: March, 2020
  • Project URL: GitHub
Photo by Marcus Kauffman on Unsplash

Disaster Response Pipeline Project

This project analyzes disaster data from Figure Eight and builds a model for an API that classifies disaster messages. It was completed as a requirement of the Udacity Data Scientist Nanodegree.

Project Requirements

The project builds a disaster response pipeline system through the following tasks:

  • Create an ETL pipeline for the data engineering step and a machine learning pipeline that trains a model to read text data and predict 36 classification categories.
  • Create a front-end application with Flask that predicts disaster categories from message content.
  • Showcase visualizations and the model's disaster category predictions on a webpage.

Skills Required

Python
SQLAlchemy
pandas
Plotly
Matplotlib
NumPy
NLP
MultiOutputClassifier
AdaBoostClassifier
RandomForestClassifier
OneVsRestClassifier
LinearSVC
GridSearchCV
Pipeline
pickle
Flask
HTML
CSS

Techniques

ETL Pipeline

  • A Python script that implements the data cleaning pipeline.
  • It loads the messages and categories datasets, merges them, cleans the result, and saves it in an SQLite database.
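
The ETL steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual script: the function names `clean_data` and `save_data`, the `id` and `categories` column names, the `messages` table name, and the semicolon-delimited label format (for example `related-1;water-0`) are all assumptions. It also uses the standard-library `sqlite3` connection instead of a SQLAlchemy engine to keep the sketch self-contained.

```python
import sqlite3
from contextlib import closing

import pandas as pd


def clean_data(messages: pd.DataFrame, categories: pd.DataFrame) -> pd.DataFrame:
    """Merge messages with categories and expand labels into 0/1 columns."""
    df = messages.merge(categories, on="id")

    # Assumed label format: "related-1;water-0" -> one column per category.
    labels = df["categories"].str.split(";", expand=True)
    labels.columns = [value.split("-")[0] for value in labels.iloc[0]]
    for column in labels.columns:
        # Keep only the trailing digit and store it as an integer flag.
        labels[column] = labels[column].str.split("-").str[-1].astype(int)

    # Swap the raw label string for the expanded columns, then drop duplicates.
    df = pd.concat([df.drop(columns="categories"), labels], axis=1)
    return df.drop_duplicates()


def save_data(df: pd.DataFrame, database_path: str, table: str = "messages") -> None:
    """Write the cleaned frame to an SQLite file, replacing any existing table."""
    with closing(sqlite3.connect(database_path)) as conn:
        df.to_sql(table, conn, index=False, if_exists="replace")
```

A script built on this sketch would read the two CSV files with `pd.read_csv`, pass the frames to `clean_data`, and hand the result to `save_data` together with the database path.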

ML Pipeline

  • A Python script that implements the machine learning pipeline.
  • It loads data from the SQLite database and splits the dataset into training and test sets.
  • It builds a text processing and machine learning pipeline, then trains and tunes a model using GridSearchCV.
  • It outputs results on the test set and exports the final model as a pickle file.
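
A sketch of how the training steps above might fit together with scikit-learn is shown below. It is illustrative only: the choice of `TfidfVectorizer` with a `RandomForestClassifier` wrapped in `MultiOutputClassifier`, the parameter grid, and the helper names `build_model` and `train_and_export` are assumptions, not the project's confirmed configuration. Loading from SQLite is left out; the helper takes the message texts and the 0/1 label matrix directly.

```python
import pickle

from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.multioutput import MultiOutputClassifier
from sklearn.pipeline import Pipeline


def build_model(cv=3):
    """Text-processing + multi-output classifier pipeline wrapped in a grid search."""
    pipeline = Pipeline([
        ("tfidf", TfidfVectorizer()),
        ("clf", MultiOutputClassifier(RandomForestClassifier(random_state=0))),
    ])
    # Hypothetical, deliberately small grid; a real search would be wider.
    param_grid = {
        "tfidf__ngram_range": [(1, 1), (1, 2)],
        "clf__estimator__n_estimators": [10, 20],
    }
    return GridSearchCV(pipeline, param_grid, cv=cv)


def train_and_export(messages, labels, model_path, cv=3):
    """Split the data, tune the model, score it on the test set, and pickle it."""
    X_train, X_test, y_train, y_test = train_test_split(
        messages, labels, test_size=0.25, random_state=0
    )
    search = build_model(cv=cv)
    search.fit(X_train, y_train)

    # Default scoring here is exact-match accuracy across all label columns.
    test_accuracy = search.score(X_test, y_test)

    with open(model_path, "wb") as handle:
        pickle.dump(search.best_estimator_, handle)
    return search.best_estimator_, test_accuracy
```

Exact-match accuracy is strict for a 36-label problem, so a per-category report (for example `sklearn.metrics.classification_report` on each label column) is usually a more informative way to output test results.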

Flask Web App

  • Adds data visualizations using Plotly to the web app (deployed locally).
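
One way a Flask app could serve chart data for Plotly is sketched below. Everything specific here is hypothetical: the `/figure` route, the `category_figure` helper, and the hard-coded counts are placeholders, where the real app would compute counts from the SQLite table. The figure is built as a plain dictionary in the JSON layout that Plotly.js accepts, so the sketch needs only Flask.

```python
from flask import Flask, jsonify

app = Flask(__name__)

# Hypothetical counts; a real app would query these from the database.
CATEGORY_COUNTS = {"related": 120, "request": 45, "water": 17}


def category_figure(counts):
    """Describe a bar chart as a Plotly-style figure dictionary."""
    return {
        "data": [
            {"type": "bar", "x": list(counts), "y": list(counts.values())}
        ],
        "layout": {
            "title": {"text": "Messages per category"},
            "xaxis": {"title": {"text": "Category"}},
            "yaxis": {"title": {"text": "Count"}},
        },
    }


@app.route("/figure")
def figure():
    """Return the chart description as JSON for the front end to render."""
    return jsonify(category_figure(CATEGORY_COUNTS))


if __name__ == "__main__":
    # Local deployment only, matching the locally deployed app described above.
    app.run(host="127.0.0.1", port=5000, debug=True)
```

On the page, the returned JSON can be passed to `Plotly.newPlot` along with the id of the target element.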


Screenshots