Project information

  • Category: Regression
  • Project date: March 2020
  • Project URL: GitHub
  • Blog Post: Medium
Photo by harnil patel on Unsplash

Stack Overflow Survey - Data Analysis

In this project, I used the 2017 Stack Overflow survey data to work through the steps of a data analysis. I completed it as a requirement of the Udacity Data Scientist Nanodegree.

Project Requirements

Find answers to the following questions using the survey data:

  • How did other developers suggest breaking into the field (what education to pursue)?
  • What factors about an individual contributed to salary?
  • What was the state of bootcamps for assisting individuals with breaking into developer roles?
  • According to EmploymentStatus, which group has the highest average job satisfaction?
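
A minimal sketch of how the last question could be approached with pandas. Only the `EmploymentStatus` column name comes from the question above. The `JobSatisfaction` column name, the helper name `mean_satisfaction_by_status`, and the toy rows are assumptions for illustration, not the notebook's actual code or survey results.

```python
import pandas as pd


def mean_satisfaction_by_status(df: pd.DataFrame) -> pd.Series:
    """Average job satisfaction per employment status, highest first.

    Assumes columns named "EmploymentStatus" and "JobSatisfaction".
    """
    return (
        df.dropna(subset=["JobSatisfaction"])   # ignore unanswered rows
          .groupby("EmploymentStatus")["JobSatisfaction"]
          .mean()
          .sort_values(ascending=False)
    )


# Toy rows for illustration only; these are not survey values.
toy = pd.DataFrame({
    "EmploymentStatus": [
        "Employed full-time",
        "Employed full-time",
        "Employed part-time",
        "Independent contractor",
        "Independent contractor",
    ],
    "JobSatisfaction": [7.0, 8.0, 6.0, 9.0, None],
})

result = mean_satisfaction_by_status(toy)
print(result)
```

The first index entry of the returned Series is the group with the highest average, which is the value the question asks for.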

Skills Required

Python
NumPy
pandas
Plotly
Matplotlib
Seaborn

Techniques

Major Tasks

  • Business Understanding: Started the analysis with the posed questions in mind.
  • Data Understanding: Made sense of the data points provided by Stack Overflow to decide how to use them, for example which columns help answer a particular question.
  • Data Preparation: Performed data wrangling and data transformation. Wrote a function that draws a Plotly bar chart to avoid repeating code, in keeping with the DRY principle.
  • Data Visualization: Used bar charts and pie charts to give a deeper understanding of the data and to convey my findings. Added a result statement after every visualization so the thought process is easy to follow.
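
The reusable bar chart helper mentioned under Data Preparation might look roughly like the sketch below. The function names `category_shares` and `bar_chart`, the axis label, and the toy data are hypothetical. The notebook's real helper may differ.

```python
import pandas as pd


def category_shares(column: pd.Series) -> pd.Series:
    """Share of non-missing responses per category, largest first."""
    # value_counts drops missing values by default and sorts descending.
    return column.value_counts(normalize=True)


def bar_chart(shares: pd.Series, title: str):
    """Build a Plotly bar chart from a Series of category shares."""
    # Imported here so category_shares can be used without Plotly installed.
    import plotly.graph_objects as go

    fig = go.Figure(go.Bar(x=shares.index.tolist(), y=shares.values.tolist()))
    fig.update_layout(title=title, yaxis_title="Share of responses")
    return fig


# Toy answers for illustration only; these are not survey values.
toy = pd.Series(["Bachelor's", "Master's", "Bachelor's", None, "Bachelor's"])
shares = category_shares(toy)
print(shares)

# In a notebook, the chart could then be displayed with:
# bar_chart(shares, "Formal education (toy data)").show()
```

Splitting the share calculation from the plotting keeps each question's chart down to two calls, which is the kind of repetition the DRY note refers to.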

I followed the CRISP-DM process throughout the project and completed it in a Jupyter Notebook.