
Python ML Guided Project Credit Card Approvals Classification - Level 4, 40 min

Updated: Aug 21, 2023

In this supervised learning Python ML guided project, we will predict whether or not someone was approved for a credit card. We will use only LogisticRegression() from Sklearn for this supervised classification problem. Although the project would also work with RandomForestClassifier() or GradientBoostingClassifier(), our focus is on error analysis: understanding precision versus recall and why we would want to prioritize one over the other. Understanding which errors matter to us allows us to improve either the recall or the precision score, whichever is more important for our business use case.






Part 1



Part 2




Part 3


Part 4







Send data science teacher brandyn a message if you have any questions











Discrimination Threshold

As part of our error analysis in this supervised classification Python project, we will use Yellowbrick's DiscriminationThreshold() to understand how changing our decision threshold affects our final precision and recall scores. This is usually a give-and-take relationship: raising one tends to lower the other.
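
To make the give-and-take concrete, here is a minimal sketch that sweeps the decision threshold by hand with scikit-learn. It is not the project's notebook code: the data comes from `make_classification` as a synthetic stand-in for the credit card data, and the threshold values are arbitrary choices for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the credit card data (assumption, not the real dataset).
X, y = make_classification(n_samples=1000, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
proba = model.predict_proba(X_test)[:, 1]  # predicted probability of the positive class

# Arbitrary thresholds for illustration; 0.5 is what model.predict() uses.
thresholds = [0.2, 0.35, 0.5, 0.65, 0.8]
results = []
for t in thresholds:
    y_pred = (proba >= t).astype(int)  # predict positive only at or above the threshold
    p = precision_score(y_test, y_pred, zero_division=0)
    r = recall_score(y_test, y_pred, zero_division=0)
    results.append((t, p, r))
    print(f"threshold={t:.2f}  precision={p:.3f}  recall={r:.3f}")
```

Recall can only stay flat or fall as the threshold rises, because fewer rows are predicted positive. With Yellowbrick installed, the visual version of this sweep should come from fitting `DiscriminationThreshold(model)` on the data and calling `show()`.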



plot distributions in a for loop

We will use a for loop in Python to plot the distributions of many features at once.
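
A minimal sketch of that loop is below. The column names and values are hypothetical, since the dataset's real columns are not listed here, and plain matplotlib histograms are used; seaborn's `histplot` could be swapped in on the same axes.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical numeric features (assumption, not the project's columns).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "age": rng.normal(35, 10, 500),
    "income": rng.lognormal(10, 0.5, 500),
    "debt": rng.exponential(5, 500),
})

numeric_cols = list(df.select_dtypes(include="number").columns)

# One subplot per numeric column, filled in by the loop.
fig, axes = plt.subplots(1, len(numeric_cols), figsize=(4 * len(numeric_cols), 3))
for ax, col in zip(axes, numeric_cols):
    ax.hist(df[col], bins=30)
    ax.set_title(col)
fig.tight_layout()
# In a notebook, drop the Agg line above and call plt.show() here.
```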


print value_counts in a for loop

We will also use a for loop to print out the Pandas value_counts() of each categorical feature.
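
As a sketch of the idea, assuming a small made-up DataFrame (the column names and categories below are hypothetical):

```python
import pandas as pd

# Hypothetical data; the real dataset's columns are not listed here.
df = pd.DataFrame({
    "gender": ["a", "b", "b", "a", "b"],
    "married": ["y", "y", "n", "y", "n"],
    "approved": [1, 0, 1, 1, 0],
})

# Treat every non-numeric column as categorical.
categorical_cols = list(df.select_dtypes(exclude="number").columns)

counts = {}
for col in categorical_cols:
    counts[col] = df[col].value_counts()
    print(f"--- {col} ---")
    print(counts[col], "\n")
```

Passing `normalize=True` to `value_counts()` prints proportions instead of raw counts, which makes rare categories easier to spot.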


treating outliers

use .clip in pandas to truncate outliers

After identifying outliers, we will use .clip() in Pandas to truncate them and get our data ready for our machine learning model.
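
Here is a minimal sketch of clipping with `Series.clip()`. The 1.5 × IQR fences are an assumption chosen for illustration; the rule used to pick the cutoffs is not stated above, and the `income` values are made up.

```python
import pandas as pd

# Hypothetical feature with one extreme value.
income = pd.Series(
    [20.0, 25.0, 30.0, 32.0, 35.0, 38.0, 40.0, 45.0, 50.0, 400.0], name="income"
)

# IQR fences as the cutoffs (assumption; any domain-based limits work the same way).
q1, q3 = income.quantile([0.25, 0.75])
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Values outside [lower, upper] are pulled back to the nearest fence;
# no rows are dropped.
clipped = income.clip(lower=lower, upper=upper)
print(clipped)
```

Unlike filtering, clipping keeps every row, so the training set size does not change.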


function to build confusion matrix

confusion matrix

We will build a user-defined function in Python that plots the confusion matrix for both the train and test data sets.
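
One way such a function could look is sketched below. The function name `plot_confusion_matrices` is hypothetical, the data is synthetic, and scikit-learn's `ConfusionMatrixDisplay` is used for the drawing; the project's own helper may differ.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so the sketch runs without a display
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import ConfusionMatrixDisplay, confusion_matrix
from sklearn.model_selection import train_test_split


def plot_confusion_matrices(model, X_train, y_train, X_test, y_test):
    """Plot train and test confusion matrices side by side; return figure and arrays."""
    fig, axes = plt.subplots(1, 2, figsize=(9, 4))
    matrices = {}
    splits = [("train", X_train, y_train), ("test", X_test, y_test)]
    for ax, (name, X, y) in zip(axes, splits):
        cm = confusion_matrix(y, model.predict(X))  # rows: actual, columns: predicted
        ConfusionMatrixDisplay(cm).plot(ax=ax, colorbar=False)
        ax.set_title(name)
        matrices[name] = cm
    fig.tight_layout()
    return fig, matrices


# Synthetic stand-in for the credit card data (assumption).
X, y = make_classification(n_samples=600, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
fig, matrices = plot_confusion_matrices(model, X_train, y_train, X_test, y_test)
```

Seeing both matrices together makes it easier to tell whether errors on the test set also show up on the training set or only appear on unseen data.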

manual error analysis

We will use logical indexing in Pandas to isolate the rows with incorrect predictions and analyze how they differ from the dataset as a whole. The hope is that this will give us clues about features to engineer for better predictions.
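
A small sketch of the logical indexing step, using a made-up table of test rows. The column names (`income`, `prior_default`, `actual`, `predicted`) and all values are hypothetical.

```python
import pandas as pd

# Hypothetical test-set rows with true labels and model predictions.
df = pd.DataFrame({
    "income": [20.0, 85.0, 40.0, 60.0, 30.0, 95.0],
    "prior_default": [1, 0, 1, 0, 1, 0],
    "actual": [0, 1, 1, 1, 0, 1],
    "predicted": [0, 1, 0, 1, 1, 1],
})

# Boolean mask: keep only the rows where the prediction was wrong.
wrong = df[df["actual"] != df["predicted"]]
false_pos = wrong[wrong["predicted"] == 1]  # predicted approved, actually denied
false_neg = wrong[wrong["predicted"] == 0]  # predicted denied, actually approved

# Compare feature averages for the errors against the whole dataset.
features = ["income", "prior_default"]
comparison = pd.DataFrame({
    "all_rows": df[features].mean(),
    "errors": wrong[features].mean(),
})
print(comparison)
```

A large gap between the two columns for a feature is the kind of clue that can point toward a new engineered feature.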


Yellowbrick classification report

Use ClassificationReport from Yellowbrick to look at precision and recall from the perspective of each class in your predictions. By default, precision and recall scores are computed for the positive class only, so the per-class view shows what those single numbers leave out.
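
The per-class numbers can be sketched with scikit-learn's `classification_report`, used here as a plain-text stand-in for the Yellowbrick heatmap. The labels are made up, and mapping 0 to "denied" and 1 to "approved" is an assumption.

```python
from sklearn.metrics import classification_report, precision_score, recall_score

# Made-up labels: 1 = approved (positive class), 0 = denied (assumed mapping).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 0, 0, 1, 0, 0, 1]

report = classification_report(
    y_true, y_pred, target_names=["denied", "approved"], output_dict=True
)
for cls in ["denied", "approved"]:
    print(cls, "precision:", report[cls]["precision"], "recall:", report[cls]["recall"])

# The single-number scores default to the positive class (pos_label=1).
print("precision_score:", precision_score(y_true, y_pred))
print("recall_score:", recall_score(y_true, y_pred))
```

Here the "approved" class has precision 2/3 and recall 0.5, while the "denied" class has precision 0.6 and recall 0.75, so the two classes tell different stories about the same predictions.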


ROCAUC plot in Yellowbrick

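The ROCAUC plot in Yellowbrick can be valuable in understanding our Logistic Regression predictions, because it shows how the true positive rate trades off against the false positive rate across thresholds.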
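
The numbers behind a ROC plot can be sketched with scikit-learn's `roc_curve` and `roc_auc_score`. The data below is a synthetic stand-in, not the credit card dataset.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the credit card data (assumption).
X, y = make_classification(n_samples=1000, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
proba = model.predict_proba(X_test)[:, 1]  # probability of the positive class

# One (false positive rate, true positive rate) point per candidate threshold.
fpr, tpr, thresholds = roc_curve(y_test, proba)
auc = roc_auc_score(y_test, proba)
print(f"ROC AUC = {auc:.3f} over {len(thresholds)} thresholds")
```

An AUC of 0.5 corresponds to random guessing and 1.0 to perfect ranking. Yellowbrick's `ROCAUC` visualizer draws this curve after `fit()` on the training data and `score()` on the test data.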


PrecisionRecallCurve in Yellowbrick

Next, we will use Yellowbrick's PrecisionRecallCurve to better understand the relationship between precision and recall.
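
As a sketch of what the curve contains, scikit-learn's `precision_recall_curve` returns the same precision and recall pairs as arrays. The data is synthetic, and the "recall of at least 0.90" rule is a hypothetical business requirement chosen only to show how the curve can guide a threshold choice.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score, precision_recall_curve
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the credit card data (assumption).
X, y = make_classification(n_samples=1000, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
proba = model.predict_proba(X_test)[:, 1]

# precision and recall each have one more entry than thresholds:
# the final point is precision=1, recall=0 with no matching threshold.
precision, recall, thresholds = precision_recall_curve(y_test, proba)
ap = average_precision_score(y_test, proba)

# Hypothetical rule: keep recall at 0.90 or better, then take the highest
# threshold that still satisfies it (recall falls as the threshold rises).
ok = np.where(recall[:-1] >= 0.90)[0]
best = ok[-1]
print(f"average precision = {ap:.3f}")
print(f"threshold={thresholds[best]:.3f}  "
      f"precision={precision[best]:.3f}  recall={recall[best]:.3f}")
```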


Engineering Features to help our Machine Learning Model make Predictions

With our error analysis done, we can now engineer features from those insights to make our model predict better.

