Descriptive statistical analysis helps you understand your data and is an important part of machine learning. Machine learning is all about making predictions, whereas statistics is about drawing conclusions from data, which is a necessary first step. In this post you will learn about the most important... Continue Reading →

# The Random Forest Algorithm

Random Forest is a flexible, easy-to-use machine learning algorithm that produces a great result most of the time, even without hyperparameter tuning. It is also one of the most widely used algorithms because of its simplicity and the fact that it can be used for both classification and regression tasks. In this post, you are going... Continue Reading →

# Binary Classification Project (Titanic Dataset)

In this post I will go through the whole process of creating a machine learning model on a given dataset. I used the Titanic dataset, which is well known in the machine learning community and used by many beginners all over the world. It provides information on the fate of passengers on the Titanic, summarized according... Continue Reading →

# Linear Regression

Linear regression is one of the most popular and best-understood algorithms in the machine learning landscape. Since regression tasks are among the most common problems in supervised learning, every machine learning engineer should have a thorough understanding of how it works. This blog post covers how the linear regression algorithm works, where it is... Continue Reading →

# Introduction to Pandas

In this blog post we will go through an introduction to the basic commands of Pandas. If you are using the Python stack for machine learning, there is probably no way around this useful tool. Pandas is one of the most popular open-source Python libraries for data analysis, providing high-performance and easy-to-use data structures.... Continue Reading →

# Time Series Forecasting

Time series forecasting is an important area of machine learning because many prediction tasks involve a time component. Examples are predicting a stock's closing price or forecasting a company's sales. After reading this post you will know about the basic concepts of time series forecasting and how... Continue Reading →

# Predicting Housing Prices with Linear Regression

In this post I will go through the workflow of a full machine learning project with the Ames housing dataset, using linear regression. This post was initially created with Jupyter Notebook. Unfortunately, WordPress can only display a Jupyter Notebook in a small window, as you can see below. Therefore I would recommend... Continue Reading →

# Converting categorical features (Label Encoding, One-Hot-Encoding)

Most machine learning algorithms can only process numerical values. Since many datasets contain categorical variables, a machine learning engineer needs to be able to convert these categorical values into numerical ones, using the right approach. They therefore need to know the tools that are out there and also... Continue Reading →
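As a rough illustration of the two techniques named in the title, here is a minimal sketch in plain Python. The helper names `label_encode` and `one_hot_encode` and the sample `colors` list are made up for this example and are not taken from the post; in practice you would more likely reach for a library such as Pandas or scikit-learn.

```python
def label_encode(values):
    """Map each distinct category to an integer, in order of first appearance."""
    mapping = {}
    for value in values:
        if value not in mapping:
            mapping[value] = len(mapping)
    return [mapping[value] for value in values], mapping


def one_hot_encode(values):
    """Turn each value into a 0/1 vector with one position per category."""
    categories = sorted(set(values))  # fixed column order
    rows = [[1 if value == cat else 0 for cat in categories] for value in values]
    return rows, categories


# Hypothetical sample data, for illustration only
colors = ["red", "green", "blue", "green"]

labels, mapping = label_encode(colors)
one_hot, categories = one_hot_encode(colors)

print(labels)      # integer code per row
print(categories)  # column order of the one-hot vectors
print(one_hot)     # one 0/1 vector per row
```

Label encoding yields a single integer column, which implies an ordering between categories that may not exist. One-hot encoding avoids that implied order at the cost of one column per category.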

# Visualizing data with Matplotlib

Data visualization is the process of understanding data better and gaining insights from it by placing it in a visual context. It has become one of the most sought-after skills. If you understand your data well, you will know what you have to do with it to build a cutting-edge machine learning... Continue Reading →