Project information

Auto Insurance Portfolio Loss Ratio Prediction

The goal of this project is to predict the natural logarithm of the loss ratio of a portfolio of auto insurance policies. The testing data contains a set of 330 policy portfolios, each having at least 1,000 auto policies. The training data consists of a set of auto policies, including a number of policy-level attributes as well as Annual Premium and Loss Amount.

Project Requirements

The Loss Ratio of a policy is the Loss Amount divided by the Premium. The Loss Ratio of a portfolio of policies is the sum of the Loss Amounts of all the policies in the portfolio divided by the sum of all the Premiums in the portfolio. Your target is the natural logarithm of the Loss Ratio of a portfolio.
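
The definition above can be sketched in a few lines of pandas. This is only an illustration on made-up numbers: the column names (portfolio_id, annual_premium, loss_amount) and the helper portfolio_log_loss_ratio are hypothetical and are not taken from the competition data.

```python
import numpy as np
import pandas as pd

# Toy policy-level data. Column names and values are hypothetical.
policies = pd.DataFrame({
    "portfolio_id": [1, 1, 2, 2],
    "annual_premium": [1000.0, 500.0, 800.0, 1200.0],
    "loss_amount": [300.0, 0.0, 1000.0, 200.0],
})


def portfolio_log_loss_ratio(df: pd.DataFrame) -> pd.Series:
    """Return ln(sum of losses / sum of premiums) for each portfolio."""
    totals = df.groupby("portfolio_id")[["loss_amount", "annual_premium"]].sum()
    return np.log(totals["loss_amount"] / totals["annual_premium"])


log_lr = portfolio_log_loss_ratio(policies)

# The portfolio ratio is a premium-weighted quantity, so it generally
# differs from the plain average of the policy-level loss ratios.
policy_lr = policies["loss_amount"] / policies["annual_premium"]
mean_policy_lr = policy_lr.groupby(policies["portfolio_id"]).mean()
```

Note that a portfolio with zero total loss would give a logarithm of negative infinity; with at least 1,000 policies per portfolio this is unlikely, but a real pipeline would probably want to check for it.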

  • Competition on Kaggle platform.
  • Skills Required

    Python
    Pandas
    NumPy
    seaborn
    matplotlib
    Linear Regression
    RandomForestRegressor
    Ridge Regression
    Lasso Regression
    AdaBoostRegressor
    DecisionTreeRegressor
    winsorize
    StandardScaler

    Techniques

    Major Tasks

    • Cleaning the data: fixing missing values and reviewing summary statistics and correlations.
    • Creating visualizations to identify the features most useful for prediction.
    • Transforming features and applying one-hot encoding to convert categorical variables into numerical ones.
    • Comparing the results of several supervised regression algorithms to achieve the lowest possible Mean Absolute Error.
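
The tasks above can be sketched roughly as follows, using scikit-learn and SciPy. This is a minimal sketch on synthetic data, not the project's actual notebook: the feature names (driver_age, annual_premium, vehicle_type), the synthetic target, the winsorization limits, and all hyperparameters are assumptions chosen only for illustration.

```python
import numpy as np
import pandas as pd
from scipy.stats.mstats import winsorize
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import AdaBoostRegressor, RandomForestRegressor
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for the policy data (all names are hypothetical).
rng = np.random.default_rng(0)
n = 400
X = pd.DataFrame({
    "driver_age": rng.integers(18, 80, size=n).astype(float),
    "annual_premium": rng.lognormal(mean=7.0, sigma=0.5, size=n),
    "vehicle_type": rng.choice(["sedan", "suv", "truck"], size=n),
})
y = (0.01 * X["driver_age"]
     + 0.5 * (X["vehicle_type"] == "truck")
     + rng.normal(0.0, 0.1, size=n))

# Winsorize the skewed column: clip the lowest and highest 1% of values.
# Done on the full data here for brevity; a stricter workflow would derive
# the clipping limits from the training split only.
X["annual_premium"] = np.asarray(
    winsorize(X["annual_premium"].to_numpy(), limits=[0.01, 0.01])
)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Scale numeric columns and one-hot encode the categorical column.
preprocess = ColumnTransformer([
    ("num", StandardScaler(), ["driver_age", "annual_premium"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["vehicle_type"]),
])

models = {
    "LinearRegression": LinearRegression(),
    "Ridge": Ridge(alpha=1.0),
    "Lasso": Lasso(alpha=0.01),
    "DecisionTreeRegressor": DecisionTreeRegressor(max_depth=5, random_state=0),
    "RandomForestRegressor": RandomForestRegressor(n_estimators=50, random_state=0),
    "AdaBoostRegressor": AdaBoostRegressor(n_estimators=50, random_state=0),
}

# Fit each model behind the same preprocessing and record its test MAE.
scores = {}
for name, model in models.items():
    pipeline = make_pipeline(preprocess, model)
    pipeline.fit(X_train, y_train)
    scores[name] = mean_absolute_error(y_test, pipeline.predict(X_test))

best = min(scores, key=scores.get)
for name, mae in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name}: MAE = {mae:.4f}")
```

Wrapping the preprocessing and the model in one pipeline keeps the scaler and encoder fitted on the training split only, so every model is compared on the same footing.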

    This project's Jupyter Notebook is not available on my GitHub.