In this blog post, we will walk through the basic commands of Pandas. If you use the Python stack for machine learning, there is probably no way around this useful tool. Pandas is one of the most popular open-source Python libraries for data analysis, providing high-performance, easy-to-use data structures.
But first of all, what actually is data analysis? Data analysis is the process of inspecting, cleaning, modeling, and transforming data with the goal of gaining useful insights from it. With the help of Pandas, we will perform a quick transformation of the Rossmann Store Sales dataset, which is available on Kaggle.
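To make the idea of cleaning, transforming, and analyzing concrete, here is a minimal sketch on a tiny in-memory sample. The rows are invented purely for illustration and are not taken from the real dataset. The column names (`Store`, `Date`, `Sales`, `Customers`, `Open`) are assumed to follow the layout of the Kaggle training file, and `SalesPerCustomer` is a hypothetical derived column.

```python
import io

import pandas as pd

# Made-up sample rows; the column names are assumed to mirror the Kaggle file.
csv_text = """Store,Date,Sales,Customers,Open
1,2015-07-31,5263,555,1
2,2015-07-31,6064,625,1
1,2015-07-30,5020,546,1
2,2015-07-30,0,0,0
"""

# Load the CSV text into a DataFrame and parse the Date column as datetimes.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"])

# Cleaning: keep only the days on which a store was open.
open_days = df[df["Open"] == 1].copy()

# Transforming: derive a new (hypothetical) column from two existing ones.
open_days["SalesPerCustomer"] = open_days["Sales"] / open_days["Customers"]

# Analyzing: compute the average sales per store.
mean_sales = open_days.groupby("Store")["Sales"].mean()
print(mean_sales)
```

With the real data, you would point `pd.read_csv` at the downloaded CSV file instead of the in-memory string; the remaining steps would stay the same under the assumptions above.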
This post was originally created as a Jupyter Notebook on Kaggle.com as part of a competition. Unfortunately, WordPress can only display a Jupyter Notebook in a small window, as you can see below. I therefore recommend viewing it on GitHub, where it is displayed better.