
Datos: Simple Tips for Data Analysis

Explore Pandas, Seaborn, Yellowbrick, Plotly, and Shap in detail. Learn how to make beautiful plots and how to extract insights from your data analysis.  A data analyst needs to provide insights to business partners and to a machine learning engineer.  The insights each one needs can be very different, and the understanding of the data will be used in different ways.  Let's sharpen our Python data analysis and plotting skills with Seaborn, Pandas, Plotly, and Shap.

Python Data Analysis Guided Projects

Seaborn Data Analysis Tips in Python

Dive into data analysis with Seaborn.  This Python library creates beautiful plots and also improves your ability to extract insights from your data analysis.  Review tips, from beginner to advanced, on how to get the most out of your Python data analysis in Seaborn.

  Univariate Analysis


Drugged Islanders

 Level 1, 15 minutes

Use your Python data analysis skills to study what happened when isolated islanders were given drugs.  Perform data analysis on a medical dataset with Pandas and learn how to build your first workflow to extract insights that provide real-world understanding.





Learn how to use Seaborn's box plot to highlight the outliers that concern you.
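As a rough sketch of what a box plot flags: by default, Seaborn's `boxplot` draws its whiskers out to 1.5 times the interquartile range (the `whis` parameter), and points beyond them are shown as outliers. The snippet below reproduces that rule with Pandas so the flagged rows can be inspected directly. The values and variable names are made up for illustration and do not come from any project dataset.

```python
import pandas as pd

# Made-up measurements with one suspicious value (illustrative only).
values = pd.Series([10, 12, 11, 13, 12, 11, 40])

# Quartiles and interquartile range.
q1 = values.quantile(0.25)
q3 = values.quantile(0.75)
iqr = q3 - q1

# Fences at 1.5 * IQR beyond the quartiles.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Points outside the fences are the ones a box plot would draw as outliers.
outliers = values[(values < lower_fence) | (values > upper_fence)]
print(outliers.tolist())
```

Listing the flagged values next to the plot makes it easier to decide whether they are data errors or genuine extremes.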


Netflix Movies and TV Shows

Level 5, 33 minutes

We will use Pandas and Seaborn to perform our data analysis and better understand when new titles are released. Use WordCloud in Python to better understand the most common words in Movie and TV Show titles.  


Sri Lanka Economic Analysis

Level 7, 26 minutes

We will explore the Sri Lankan economy from 1966 to 2022 in our economic data analysis. We will attempt to develop a theory as to why the Sri Lankan economy fell into the 2022 economic crisis.  Use the data to understand this real-world economic situation.

Beginner Python Data Analysis of Dog Breed

Dog Breed Analysis

Level 2, 29 minutes

We will use our Python data analysis skills in this beginner data analysis project to understand the eye color, fur color, and height of common dog breeds.

We will start our Python data analysis project with a little processing to enable our analyses.  This is needed because of the semi-structured data format that results when each record holds a list of a different size.
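One common way to handle list-valued columns of uneven length is Pandas' `DataFrame.explode`, which gives every list element its own row. The sketch below assumes a hypothetical `fur_color` column; the breed names and colors are invented, and the project's actual columns may differ.

```python
import pandas as pd

# Hypothetical semi-structured data: each breed has a list of a different size.
breeds = pd.DataFrame({
    "breed": ["Beagle", "Husky"],
    "fur_color": [["brown", "white", "black"], ["gray", "white"]],
})

# One row per (breed, color) pair, so ordinary Pandas tools work again.
long_form = breeds.explode("fur_color").reset_index(drop=True)

# Example analysis that was awkward before: how often does each color appear?
color_counts = long_form["fur_color"].value_counts()
print(color_counts)
```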


We work toward a marketing strategy, or at least several different ideas for one, based on the data.  We practice collecting observations, and at the end of the project we combine these insights in creative ways to offer ideas for a potential marketing strategy.


Ukraine War Data Analysis

Level 5, 35 minutes

Write your own user-defined functions and use the .apply method in Pandas to apply them and provide valuable business insights to this supermarket chain using your data analysis skills.  Extract insights and compile them in interesting ways in the summary.
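A minimal sketch of the pattern, assuming a hypothetical sales table: the column names, the `order_size` helper, and its threshold are all invented for illustration.

```python
import pandas as pd

# Hypothetical supermarket sales (illustrative values only).
sales = pd.DataFrame({
    "unit_price": [2.5, 10.0, 4.0],
    "quantity": [4, 1, 5],
})

def order_size(total):
    """Label an order by its total value (the threshold is arbitrary)."""
    return "large" if total >= 15 else "small"

# Vectorized arithmetic for the total, then .apply for the custom labeling rule.
sales["total"] = sales["unit_price"] * sales["quantity"]
sales["size"] = sales["total"].apply(order_size)
print(sales)
```

Keeping the rule in a named function makes it easy to reuse the same labeling in several summaries.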


Data Analyst Job Listings

Level 6, 23 minutes

Use your data analysis skills to understand Data Analyst job listings and the data job market better.  Explore the job titles of different analyst positions (business analyst, data analyst, BI analyst) and see in which industries these job sub-groups are popular.


Spaceship Titanic Data Analysis Machine Learning Prep

Level 8, 23 minutes

Use Pandas and Seaborn in Python to analyze the Spaceship Titanic data from the Kaggle competition as if you were prepping it for a data scientist. Extract insights using Pandas and Seaborn, and create user-defined functions to keep your analysis code clean and to the point.

Python Real Data Analysis
Guided Projects

Real datasets have added difficulties that datasets found on Kaggle just don't have.  On the job you will be working with these issues, so in the Python Real Data Analysis projects we will use our own real datasets, such as Google Forms survey responses, Google Analytics, and more, to analyze real-world business problems.


Real World Data Analysis

We will use Pandas and Seaborn to understand the responses to a Google Forms survey.  The goal of this survey was to understand how to improve these guided projects.  However, to analyze the data we need to spend a little extra time cleaning it up, which is common with real-world datasets.

Python Machine Learning Guided Projects

Explore the many Machine Learning models in Python with Sklearn.  Machine Learning is very powerful in the range of tasks it can handle.  Let's look at regression and classification problems with Sklearn and use models like LinearRegression, ARDRegression, DecisionTrees, RandomForest, GradientBoosting, and NuSVR.


This starter project is great for those new to Sklearn and machine learning.  Learn how to set up an ML workflow.  Use pandas and seaborn in Python to perform your data analysis.  Then use Sklearn to do the train test split and make your final test predictions.
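The workflow described above might look roughly like the sketch below. It uses synthetic data from `make_regression` and a plain `LinearRegression` as stand-ins, since the project's dataset and model are not named here; the split size and random seeds are also assumptions.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (the real project would load its own dataset).
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

# Hold out 20% of the rows so the model is scored on data it never saw.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Fit on the training split only, then make the final test predictions.
model = LinearRegression().fit(X_train, y_train)
test_predictions = model.predict(X_test)

# R^2 on the held-out split.
test_r2 = model.score(X_test, y_test)
print(round(test_r2, 3))
```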


Mobile Phones Price Prediction

Level 3, 30 minutes

Ever wonder why cell phones are so expensive? In this Python ML guided project we will predict cell phone prices using Sklearn ensemble methods. In our supervised learning workflow, we will use RandomForest, Bagging, AdaBoost, and GradientBoosting.  It's important to understand how each ensemble method performs.


Credit Card Approvals 

Level 4, 40 minutes

In this Python project, we will use Sklearn for a supervised classification problem.  We will focus on error analysis: understanding precision versus recall and why we would want to focus on one versus the other.  We will be using the error analysis tools in Yellowbrick to try to improve our model's score.
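To make the precision versus recall trade-off concrete, here is a small hand computation on made-up labels (1 = approved, 0 = denied). The labels are invented for illustration and are not taken from the project data.

```python
# Hypothetical true labels and model predictions (1 = approved, 0 = denied).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 1, 0, 1, 0]

pairs = list(zip(y_true, y_pred))
tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # correctly approved
fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # approved but should be denied
fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # denied but should be approved

# Precision: of everything we approved, how much was right?
precision = tp / (tp + fp)
# Recall: of everything that should be approved, how much did we catch?
recall = tp / (tp + fn)

print(precision, recall)
```

A lender worried about bad approvals would watch precision, while one worried about turning away good applicants would watch recall.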



Shapley Values

Follow along with this Python regression project.  Here we will deal with a common problem in house price prediction: too many features, and how to choose which to use.  We will use Shapley values to help us determine the real impact of each feature on the final prediction, and then which features can be removed because they don't help.
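To build intuition for what a Shapley value measures, the sketch below computes exact values by brute force for a tiny, hypothetical linear price model. It averages each feature's marginal contribution over every order in which features can be revealed. This is not the Shap library's API, and the weights, feature names, and houses are invented for illustration.

```python
from itertools import permutations

# Hypothetical linear "house price" model: price = sum(weight * feature value).
weights = {"rooms": 50.0, "age": -2.0, "noise": 0.0}
baseline = {"rooms": 3.0, "age": 20.0, "noise": 5.0}  # reference (average) house
house = {"rooms": 5.0, "age": 10.0, "noise": 9.0}     # house to explain

def predict(features):
    return sum(weights[name] * value for name, value in features.items())

def shapley_values(instance, reference):
    """Average marginal contribution of each feature over all orderings."""
    names = list(instance)
    orderings = list(permutations(names))
    totals = {name: 0.0 for name in names}
    for order in orderings:
        current = dict(reference)           # start from the reference house
        previous = predict(current)
        for name in order:
            current[name] = instance[name]  # reveal one feature at a time
            updated = predict(current)
            totals[name] += updated - previous
            previous = updated
    return {name: total / len(orderings) for name, total in totals.items()}

contributions = shapley_values(house, baseline)
print(contributions)
```

In this toy model the `noise` feature contributes nothing to the prediction, which is the kind of signal that suggests a feature can be dropped.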

Python Simple Instructional ML Random Forest Project

In this Python guided project, you can follow along and build your first simple Random Forest machine learning model. We will use RandomForestClassifier from Sklearn. It is a good idea in an ML workflow to have a simple base model that your more robust model will try to beat. In this situation, Logistic Regression acts as our base model and Random Forest acts as our more complex, robust model.
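A minimal sketch of the baseline-versus-robust-model comparison, using synthetic data from `make_classification` in place of the project's dataset. The hyperparameters and seeds are assumptions, and which model scores higher will depend on the data.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (the real project would load its own dataset).
X, y = make_classification(
    n_samples=500, n_features=10, n_informative=5, random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Simple base model: the score to beat.
baseline = LogisticRegression(max_iter=1000).fit(X_train, y_train)
baseline_acc = baseline.score(X_test, y_test)

# More complex model, evaluated on the same held-out split.
forest = RandomForestClassifier(n_estimators=100, random_state=42)
forest.fit(X_train, y_train)
forest_acc = forest.score(X_test, y_test)

print(f"baseline={baseline_acc:.3f} forest={forest_acc:.3f}")
```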


Classic Car MPG

Level 3, 24 minutes

In this Python regression project, we will be predicting the MPG of classic cars.  Use ensemble methods like RandomForestRegressor and GradientBoostingRegressor in this supervised machine learning project.  This is a great beginner Python project to practice machine learning with ensemble methods.


Polish Car Price Regression

Level 5, 40 minutes

Predict the price of cars in Poland.  This supervised learning problem in Python is a regression problem.  In this project we will focus on linear regression techniques, including the PassiveAggressiveRegressor and ARDRegression models.  Ever wonder which is the best machine learning regression model?

coming soon



In the second part of the Python Guided Machine Learning Project, the data scientist picks up where the data analyst left off.  We use the data analyst's insights to guide the data scientist.
This is extremely helpful in a team setting, so the data scientist can focus on building the model. And as we will see, there is a lot to try when building a model.

Python Deep Learning
Guided Projects

Type of data for your deep learning model:

In the Python Deep Learning projects, we will explore how to build computer vision and NLP models in TensorFlow.  Neural networks can also handle a large number of regular machine learning tasks like classification and regression.  In our Python Deep Learning Projects we will explore all that is possible with artificial intelligence.


DataFrame / Spreadsheet Deep Learning Model

In the first part of our LSTM stock prediction series we will start by setting up our deep learning neural network. In our TensorFlow model we use the LSTM layer, but either the SimpleRNN or GRU layer can easily be substituted in the Python code.  This is a great opportunity to build intuition around the impact of each recurrent layer on our time series prediction.

LSTM Google Stock Prediction Part 2 - Seasonal Decomposition Time Series

Level 2, 22 minutes

This is the second part of our LSTM stock prediction series using TensorFlow.  Now that we have our deep learning model with an LSTM recurrent layer set up, we focus on the time series part of our project and dive deep into the seasonal decomposition of our MACD, RSI, Fast Stochastic, and pct_change indicators...
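Two of the indicators named above can be sketched in Pandas as follows. The prices are made up, and the MACD line here uses the conventional 12-period and 26-period exponential moving averages, which is an assumption about the project's settings.

```python
import pandas as pd

# Made-up closing prices (illustrative only).
close = pd.Series([100.0, 110.0, 99.0, 104.0, 108.0], name="close")

# Percentage change from one row to the next; the first row has no predecessor.
pct_change = close.pct_change()

# MACD line: fast EMA minus slow EMA (spans of 12 and 26 assumed).
ema_fast = close.ewm(span=12, adjust=False).mean()
ema_slow = close.ewm(span=26, adjust=False).mean()
macd = ema_fast - ema_slow

print(pd.DataFrame({"close": close, "pct_change": pct_change, "macd": macd}))
```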


In this Python deep learning guided project we will first predict with Sklearn's Random Forest to set a baseline prediction for our TensorFlow Sequential model. We will also set up model experiments to iteratively find the best architecture and try to beat the Sklearn models.

We continue our LSTM stock prediction Python guided deep learning project.  In part 4 of our TensorFlow stock prediction project, we start with the functional API in TensorFlow.  The functional API will allow us to ingest 3 different sources of data and extract higher-level features from each dataset before combining them with the next channel inside your neural network.


In the 3rd part of our stock market prediction series, we use the LSTM layer to predict Google stock, building a neural network in TensorFlow.  In this project, we pick up where we left off and add economic statistics such as Average Weekly Hours to our time series prediction.  We also focus on finding the right architecture for your neural network with TensorFlow in Python.




NLP Deep Learning Tensorflow

In this Deep Learning TensorFlow Guided Project, we will use the LSTM recurrent layer along with an embedding layer and a dense fully connected layer in Python to predict whether news headlines are sarcastic or not.

After we've done the essential NLP processing for deep learning we are ready to start building our architecture. We start with an embedding layer and next in our deep learning TensorFlow model we add a recurrent layer. You can use SimpleRNN, GRU, or LSTM layers but here we choose to use the LSTM layer.

In this simple TensorFlow multiclass prediction problem in Python, we will build an NLP model and attempt to classify poems as Affection, Death, Environment, or Love. Poems are very difficult to classify with NLP: the abstract nature of the writing makes it hard to get a high score on this dataset. Our goal will be an accuracy of 0.50, which would normally be very low but is very difficult to achieve on poem classification.


We will build our model today with the Sequential model in TensorFlow. The first layer in our model will be an Embedding layer. We will follow our embedding layer with a special type of dropout layer called Spatial Dropout, which drops units by chance rather than turning off a fixed percentage of each layer. This allows some epochs to have more or fewer total neurons turned off. This type of dropout works well for NLP problems.
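Assuming the layer meant here is Keras's `SpatialDropout1D`, its distinguishing feature is that it drops whole feature channels across every timestep instead of dropping individual values. The NumPy sketch below imitates that masking to show the idea; it is not TensorFlow's implementation, and the shapes, rate, and seed are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

# Pretend output of an embedding layer: (batch, timesteps, channels), all ones.
batch, timesteps, channels = 2, 4, 6
embedded = np.ones((batch, timesteps, channels))

rate = 0.5  # probability of dropping each channel

# One keep/drop decision per (sample, channel); the size-1 timestep axis
# broadcasts that decision across every timestep of the sequence.
keep_mask = rng.random((batch, 1, channels)) >= rate

# Scale the surviving channels so the expected activation stays the same.
dropped = embedded * keep_mask / (1.0 - rate)

print(dropped[0])
```

Because each channel is kept or dropped by chance, the number of channels turned off differs from one sample to the next.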

Follow along with this Free Guided TensorFlow project. In this free Python Project, we will use TensorFlow to build a deep learning model that will generate names. Here we will mix names of Indian Gods with Anime characters, why? Well because we wanted to have a little fun and be a little creative with our NLP TensorFlow Text Generation Project.


Name generation is actually easier than you would think: simply put your deep learning model, which predicts the next character, in a loop. The hard part is designing your deep learning model to be accurate but also a little crazy, so we get some interesting generated names.
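The loop itself can be sketched without any deep learning at all. Below, a simple table of character-to-next-character counts stands in for the trained model, so only the generation loop is illustrated. The training names, the `^` and `$` start and end markers, and the helper functions are all hypothetical.

```python
import random
from collections import Counter, defaultdict

# Tiny made-up training set and start/end markers (illustrative only).
training_names = ["rama", "ravi", "naruto", "sakura", "indra", "goku"]
START, END = "^", "$"

# Stand-in for the deep learning model: how often each character follows another.
next_char_counts = defaultdict(Counter)
for name in training_names:
    chars = [START] + list(name) + [END]
    for current, following in zip(chars, chars[1:]):
        next_char_counts[current][following] += 1

def predict_next_char(current, rng):
    """Sample the next character in proportion to how often it was seen."""
    options = next_char_counts[current]
    return rng.choices(list(options), weights=list(options.values()))[0]

def generate_name(rng, max_length=12):
    """Call the predictor in a loop until it emits END or the name gets long."""
    name, current = "", START
    while len(name) < max_length:
        current = predict_next_char(current, rng)
        if current == END:
            break
        name += current
    return name

rng = random.Random(7)
generated = [generate_name(rng) for _ in range(3)]
print(generated)
```

Sampling from the predicted distribution, rather than always taking the most likely character, is what gives the names their variety.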


Natural language projects are a little tricky and can be quite different from working with a typical dataset.  We explore the data with WordCloud to understand it and look for spelling errors.  We then use TensorFlow to sequence and pad the data to get it ready for our neural network.
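What "sequence and pad" means can be shown in plain Python: each word is mapped to an integer ID, and every sequence is brought to the same length. This is a hand-rolled illustration rather than TensorFlow's own utilities; the helper names, the choice to pad at the end, and the sample headlines are assumptions.

```python
def build_vocab(texts):
    """Assign each distinct word an integer ID, reserving 0 for padding."""
    vocab = {}
    for text in texts:
        for word in text.lower().split():
            vocab.setdefault(word, len(vocab) + 1)
    return vocab

def to_padded_sequences(texts, vocab, max_len):
    """Turn texts into ID lists, truncated or zero-padded to max_len."""
    padded = []
    for text in texts:
        ids = [vocab[word] for word in text.lower().split()][:max_len]
        padded.append(ids + [0] * (max_len - len(ids)))
    return padded

headlines = ["stocks rise again", "rain"]
vocab = build_vocab(headlines)
padded = to_padded_sequences(headlines, vocab, max_len=4)
print(padded)
```

Equal-length rows are what allow the texts to be stacked into a single array and fed to the network in batches.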


Computer Vision Deep Learning Model Tensorflow

In this simple TensorFlow Computer Vision Guided Project in Python, we build a convolutional neural network, a type of deep learning model, for a binary classification problem: predicting whether a chest X-ray image shows pneumonia or not.

Medical image classification is a common and important type of computer vision problem.  In future projects we will explore the various types of image preprocessing available to enhance our predictions.

In this Simple Computer Vision guided project in TensorFlow using Python, we will build a deep learning model to classify whether a dessert is an Apple Pie, French Toast, Donuts, or Cheesecake. In the near future, a company is building a machine to automate serving food to customers. The problem is that the AI needs to be able to tell which food it is serving to customers.


Follow along in the Simple Python Guided Project Multiclass Computer Vision and classify which dessert is in the images. In this deep learning project, we will use the Sequential model in TensorFlow. To our Sequential model, we will add a set of two Convolutional layers. Convolutional layers tend to work well as pairs. Each convolution is followed by a max-pooling layer and a dropout layer. We use max pooling because it has an effect similar to sharpening images in a simple photo editor. The dropout layers that follow help prevent overfitting.
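For intuition about the max-pooling step, here is a small NumPy sketch of 2x2 max pooling on a made-up feature map. It illustrates the operation only and is not TensorFlow's layer; the input values are arbitrary.

```python
import numpy as np

# Made-up 4x4 feature map (illustrative values only).
feature_map = np.array([
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 1, 5, 6],
    [2, 2, 7, 8],
])

# Split the map into non-overlapping 2x2 blocks and keep each block's maximum.
h, w = feature_map.shape
pooled = feature_map.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

print(pooled)
```

Only the strongest response in each block survives, which halves the height and width while keeping the most prominent features.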


The hardest part of Computer Vision isn't the architecture; it is setting up the data for the model.  More pictures are needed than can fit in memory.  TensorFlow's ImageDataGenerator allows you to connect your model with the folder that contains your images, pulling images and augmenting new ones each epoch.
