When a person borrows money from another person or an organization, the transaction is known in financial terms as a loan. Sanctioning a loan is not an easy job: a set of checks determines whether the applicant is eligible or not. With machine learning, one can predict whether the person borrowing the money is likely to default. Machine learning is in high demand in today's technical field and is used almost everywhere, from face detection and lane detection to product suggestions.
The objective of the problem is to identify which customers will be able to repay their debt and which customers are likely to default. This is clearly a classification problem, so we can use algorithms such as logistic regression, decision trees or random forests. The model should be as accurate as possible, with a low error rate.
Concepts That Will Be Used:
1. Download the loan prediction data set from Kaggle.
2. Import the necessary Python libraries: NumPy, pandas, Matplotlib and seaborn. The code is given below.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
Each library is imported under a short alias for simplicity.
3. Read the training data set and the test data set. The data has to be divided into two parts: one is the training data set and the other is the test data set. Preferably, use 80% of the data for training and 20% for testing.
df_train = pd.read_csv("training.csv")
df_test = pd.read_csv("test.csv")
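If the data arrives as a single file rather than two, the 80/20 split can be made in code. The following is a minimal sketch, assuming scikit-learn is installed; the synthetic table and its column names (`income`, `credit_history`, `loan`) are hypothetical stand-ins for the real data set.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Small synthetic stand-in for the loan data; the column names are hypothetical.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "income": rng.integers(1000, 10000, size=100),
    "credit_history": rng.integers(0, 2, size=100),
    "loan": rng.choice(["Y", "N"], size=100),
})

# Hold out 20% of the rows for testing. stratify keeps the share of
# approved and rejected loans roughly equal in both parts.
df_train, df_test = train_test_split(
    df, test_size=0.2, random_state=42, stratify=df["loan"]
)

print(df_train.shape, df_test.shape)
```

Fixing `random_state` makes the split reproducible, which helps when comparing models later.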
4. Check for missing values. Drop the rows and columns that are not needed in this project.
5. Instead of dropping them, you can also fill in the values that are missing or null in the data set, for example with the most frequent value of the column.
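Steps 4 and 5 could look roughly like the sketch below. The small table and its column names (`gender`, `loan_amount`, `customer_id`) are hypothetical, and filling with the mode or the median is only one common choice, not necessarily the one used for the original results.

```python
import numpy as np
import pandas as pd

# Tiny illustrative table; column names and values are hypothetical.
df = pd.DataFrame({
    "gender": ["Male", "Female", None, "Male"],
    "loan_amount": [120.0, np.nan, 95.0, 150.0],
    "customer_id": ["A1", "A2", "A3", "A4"],
})

# Count the missing values in each column.
print(df.isnull().sum())

# Drop a column that is not needed for the prediction.
df = df.drop(columns=["customer_id"])

# Fill categorical gaps with the most frequent value
# and numeric gaps with the median of the column.
df["gender"] = df["gender"].fillna(df["gender"].mode()[0])
df["loan_amount"] = df["loan_amount"].fillna(df["loan_amount"].median())
```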
6. Now let us visualize the data. We will plot the count on the x-axis and gender on the y-axis. The code is given below.
sns.countplot(y='gender', hue='loan', data=df_train)
Now repeat the same process for marital status, age, employment status, credit history and the other fields you have in the data set. Just change the y-axis variable accordingly; the x-axis remains the same. Credit history holds data on the customer's previous loans, including whether the person defaulted in earlier transactions.
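The numbers behind such count plots can also be inspected as tables. The sketch below uses `pd.crosstab` on a tiny made-up table to count loan outcomes per category; the column names (`gender`, `married`, `loan`) are assumptions and should be replaced with the ones in your data set.

```python
import pandas as pd

# Tiny made-up sample; column names and values are hypothetical.
df = pd.DataFrame({
    "gender": ["Male", "Female", "Male", "Female", "Male", "Male"],
    "married": ["Yes", "No", "Yes", "Yes", "No", "Yes"],
    "loan": ["Y", "N", "Y", "Y", "N", "Y"],
})

# One table per field: rows are the categories, columns are the loan outcomes.
counts = {col: pd.crosstab(df[col], df["loan"]) for col in ["gender", "married"]}

for col, table in counts.items():
    print(table, end="\n\n")
```

Each table holds the bar heights that the corresponding count plot would draw.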
7. Logistic regression is used for problems where the outcome is binary (either 0 or 1). A decision tree can also be used for classification problems. On this data set, the accuracy of logistic regression is 77%, whereas the accuracy of the decision tree is 64%, so logistic regression is the better choice here.
Manually predicting whether a person will be able to repay a debt is a tiring job, and it is not always as accurate as a machine learning model.
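A comparison of the two classifiers could be set up as in the sketch below, assuming scikit-learn. It runs on synthetic data from `make_classification`, so the accuracies it prints will not match the figures quoted for the loan data set.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary classification data standing in for the loan features.
X, y = make_classification(n_samples=500, n_features=6, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

models = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "decision tree": DecisionTreeClassifier(random_state=0),
}

# Fit each model on the training part and score it on the held-out part.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))
    print(f"{name}: {scores[name]:.2f}")
```

Scoring both models on the same held-out rows keeps the comparison fair.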