Employee Salary Prediction Model

                To view the code, please click on the GitHub repository link. Click Here 


PROBLEM STATEMENT?


  • In this competitive world, everyone holds high goals and expectations for their dream jobs. For an employer, predicting a justified salary for an employee is a challenging task. 
  • They cannot arbitrarily offer the expected salary to everyone. 
  • Now, the question arises: How can the right salary be determined for an employee?

SOLUTION:


  • Design a system/website that takes inputs such as employee details (experience, qualifications, skills, test score, interview score, etc.) and automatically determines/predicts the employee's salary.

OBJECTIVE:


  • To build a machine learning model that predicts the salary of an employee based on interview score, test score, and experience.

TASK PERFORMED


DATA UNDERSTANDING AND CLEANING : 


  • Data understanding is the first step in data science project. After properly understand the data, move to the next step i.e data cleaning.
  • To prepare the data model, proper data cleaning is necessary. In our case, one column contains some missing values. 
  • We have two methods to address these missing values: deletion and imputation. We will opt for the imputation method due to the limited size of our dataset. 
  • By calculating the mean of column values, missing values can be effectively imputed.

FEATURE TRANSFORMATION:


  • Feature transformation is a critical step in machine learning since, ultimately, building a regression model requires numeric data. 
  • In our dataset, the 'experience' column is a categorical variable, and we will transform it into a numerical format.

DATA SPLITTING:


  • Now that we have cleaned the data, it's time to split it into training and testing datasets. 
  • When splitting the dataset, we maintain a ratio of 0.8 to ensure that the model is trained on a substantial portion of the data.

MODEL BUILDING:


  • I used the linear regression algorithm to build the model because the target variable 'salary' in the dataset is of a numerical nature.

MODEL EVALUATION:


  • To evaluate the performance of the model, I utilized evaluation metrics such as mean absolute error and mean squared error, which indicate the extent of deviation between our predicted and actual values. 
  • Additionally, I employed the r2_score metric, which gauges the extent to which the target variable is explained by the independent variables

RESULTS:


  • Successfully built a machine learning model to predict employee salaries with an accuracy of 83%.