Project information

  • Category: NLP
  • Project date: May 2020
  • Project URL: GitHub
Photo by Pascal Meier on Unsplash

Market Analysis of US Airline Industry with Emphasis on American Airlines

This project was completed under Dr. Sangkil Moon's guidance in the Innovation Analytics course at UNC Charlotte.

Project Requirements

This in-depth analysis of the airline industry uses customer review data to show what customers prefer when they fly and how those preferences affect airline revenue.

  • Writing a detailed report for company executives.
  • Explaining technical terms in plain language.
  • Working in a team of 5 people.

Skills Required

Python
Pandas
Plotly
Matplotlib
NumPy
seaborn
NLP
WordCloud
Logistic Regression
Random Forest Classifier
SAS Sentiment Analysis Studio
SAS Text Mining
SAS Enterprise Guide

Techniques

Major Tasks

  • Performed data cleaning.
  • Created visualizations using Plotly.
  • Predicted missing values for important features and engineered new features.
  • Generated word clouds from customer reviews using NLP.
  • Generated a list of the most frequent two-word phrases (bigrams) using TF-IDF.
  • Trained a sentiment analysis model using SAS Sentiment Analysis Studio to categorize .
  • Performed cluster analysis on customer reviews to identify groups of descriptive terms that share underlying characteristics.
  • Compared the top 3 airlines' ratings across each cluster to assess airline performance.
  • Performed text categorization on reviews to see what customers talk about most.
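
To illustrate the TF-IDF task in the list above, here is a minimal sketch of how two-word phrases (bigrams) could be ranked by TF-IDF weight. It is not the project's actual code: it uses only the Python standard library, the function names bigrams and top_bigrams_tfidf are hypothetical, the smoothed IDF formula is one common variant chosen for illustration, and the sample reviews are invented.

```python
import math
import re
from collections import Counter


def bigrams(text):
    """Lowercase the text, extract word tokens, and return adjacent word pairs."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [" ".join(pair) for pair in zip(tokens, tokens[1:])]


def top_bigrams_tfidf(reviews, k=5):
    """Return the k bigrams with the highest TF-IDF weight summed over all reviews."""
    docs = [Counter(bigrams(review)) for review in reviews]
    n_docs = len(docs)

    # Document frequency: the number of reviews that contain each bigram.
    doc_freq = Counter()
    for counts in docs:
        doc_freq.update(counts.keys())

    scores = Counter()
    for counts in docs:
        total = sum(counts.values())
        for gram, count in counts.items():
            tf = count / total  # term frequency within this review
            # Smoothed inverse document frequency (an assumed variant).
            idf = math.log((1 + n_docs) / (1 + doc_freq[gram])) + 1
            scores[gram] += tf * idf

    # Sort by score (descending), then alphabetically so ties are deterministic.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


if __name__ == "__main__":
    # Invented sample reviews, for illustration only.
    sample_reviews = [
        "The flight was delayed and the staff was rude",
        "Friendly staff and a comfortable seat",
        "The flight was delayed again with no explanation",
    ]
    for gram, score in top_bigrams_tfidf(sample_reviews, k=3):
        print(f"{gram}: {score:.3f}")
```

In practice, a library route such as scikit-learn's TfidfVectorizer with ngram_range=(2, 2) would likely be more convenient on a full review dataset. The hand-written version is shown only to make the weighting explicit.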

Initial data analysis was performed using Jupyter Notebook. Please visit GitHub for more details. GitHub