{"version":"https:\/\/jsonfeed.org\/version\/1","title":"mathspp.com feed","home_page_url":"https:\/\/mathspp.com\/blog\/tags\/data-science","feed_url":"https:\/\/mathspp.com\/blog\/tags\/data-science.json","description":"Stay up-to-date with the articles on mathematics and programming that get published to mathspp.com.","author":{"name":"Rodrigo Gir\u00e3o Serr\u00e3o"},"items":[{"title":"Learn pandas and matplotlib with Pok\u00e9mon","date_published":"2023-12-17T00:00:00+01:00","id":"https:\/\/mathspp.com\/blog\/learn-pandas-and-matplotlib-with-pokemon","url":"https:\/\/mathspp.com\/blog\/learn-pandas-and-matplotlib-with-pokemon","content_html":"<p>This tutorial uses Pok&eacute;mon to introduce readers to data science with pandas and matplotlib.<\/p>\n\n<p>This tutorial will teach you the basics of data science with pandas and matplotlib, using Pok&eacute;mon as an example.\nWe will use Pok&eacute;mon data from the first 8 generations to learn what pandas series and dataframes are, what categorical data is, how broadcasting works, and more.\nWe'll also use matplotlib to learn about line and bar plots, scatter plots, and violin plots, all while studying the strengths and weaknesses of Pok&eacute;mon.<\/p>\n<p>By the time you're done with this tutorial, you'll know enough to get started with pandas on your own data science projects, you'll be able to use matplotlib to create publication-ready plots, and as a byproduct you will have learned a bit more about Pok&eacute;mon.<\/p>\n<p>You can also <a href=\"\/blog\/learn-pandas-and-matplotlib-with-pokemon\/analysis.ipynb\">download the notebook<\/a> if you want an easier time testing the code I'm showing, and <a href=\"https:\/\/github.com\/mathspp\/mathspp\/blob\/master\/pages\/02.blog\/learn-pandas-and-matplotlib-with-pokemon\/pokemon.csv\" target=\"_blank\" rel=\"nofollow noopener noreferrer\" class=\"external-link no-image\">be sure to download the data we'll be using<\/a>!<\/p>\n<h2 
id=\"objectives\">Objectives<a href=\"#objectives\" class=\"toc-anchor after\" data-anchor-icon=\"#\" aria-label=\"Anchor\"><\/a><\/h2>\n<p>In this tutorial, you will learn the basics of pandas and matplotlib.\nYou'll learn:<\/p>\n<ul><li>how to load data into pandas and get a first feel for what the data is;<\/li>\n<li>how pandas handles data types and how those differ from the built-in Python types;<\/li>\n<li>what a pandas series is;<\/li>\n<li>what a pandas dataframe is;<\/li>\n<li>how to manipulate columns of a dataframe; and<\/li>\n<li>what broadcasting is and how it works.<\/li>\n<\/ul><h2 id=\"setup\">Setup<a href=\"#setup\" class=\"toc-anchor after\" data-anchor-icon=\"#\" aria-label=\"Anchor\"><\/a><\/h2>\n<p>Pandas is the de facto standard library for data analysis in Python, and we'll be using it here.\nTo start off, make sure it's installed (<code>pip install pandas<\/code>) and then import it:<\/p>\n<pre><code class=\"language-python\">import pandas as pd<\/code><\/pre>\n<p>Importing pandas under the alias <code>pd<\/code> is a common convention, since you'll be using pandas a lot.<\/p>\n<p>Then, the best way to start is to grab some data (<a href=\"https:\/\/github.com\/mathspp\/mathspp\/blob\/master\/pages\/02.blog\/learn-pandas-and-matplotlib-with-pokemon\/pokemon.csv\" target=\"_blank\" rel=\"nofollow noopener noreferrer\" class=\"external-link no-image\">the file <code>pokemon.csv<\/code><\/a>) and load it in with the function <code>read_csv<\/code>:<\/p>\n<pre><code class=\"language-python\">pokemon = pd.read_csv(\"pokemon.csv\")<\/code><\/pre>\n<h2 id=\"first-look-at-the-data\">First look at the data<a href=\"#first-look-at-the-data\" class=\"toc-anchor after\" data-anchor-icon=\"#\" aria-label=\"Anchor\"><\/a><\/h2>\n<p>When you load a new dataset, the first thing you want to do is take a global look at the data to get an idea of what you have.<\/p>\n<p>A good first step is to inspect the &ldquo;head&rdquo; of the data, which corresponds to the 
first few rows:<\/p>\n<pre><code class=\"language-python\">pokemon.head()<\/code><\/pre>\n<div>\n<style scoped>\n    .dataframe {\n        display: block;\n        overflow-x: auto;\n        white-space: nowrap;\n    }\n\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead th {\n        text-align: right;\n    }\n<\/style><table border=\"1\" class=\"dataframe\"><thead><tr style=\"text-align: right;\"><th><\/th>\n      <th>national_number<\/th>\n      <th>gen<\/th>\n      <th>english_name<\/th>\n      <th>japanese_name<\/th>\n      <th>primary_type<\/th>\n      <th>secondary_type<\/th>\n      <th>classification<\/th>\n      <th>percent_male<\/th>\n      <th>percent_female<\/th>\n      <th>height_m<\/th>\n      <th>...<\/th>\n      <th>evochain_1<\/th>\n      <th>evochain_2<\/th>\n      <th>evochain_3<\/th>\n      <th>evochain_4<\/th>\n      <th>evochain_5<\/th>\n      <th>evochain_6<\/th>\n      <th>gigantamax<\/th>\n      <th>mega_evolution<\/th>\n      <th>mega_evolution_alt<\/th>\n      <th>description<\/th>\n    <\/tr><\/thead><tbody><tr><th>0<\/th>\n      <td>1<\/td>\n      <td>I<\/td>\n      <td>Bulbasaur<\/td>\n      <td>Fushigidane<\/td>\n      <td>grass<\/td>\n      <td>poison<\/td>\n      <td>Seed Pok&eacute;mon<\/td>\n      <td>88.14<\/td>\n      <td>11.86<\/td>\n      <td>0.7<\/td>\n      <td>...<\/td>\n      <td>Level<\/td>\n      <td>Ivysaur<\/td>\n      <td>Level<\/td>\n      <td>Venusaur<\/td>\n      <td>NaN<\/td>\n      <td>NaN<\/td>\n      <td>NaN<\/td>\n      <td>NaN<\/td>\n      <td>NaN<\/td>\n      <td>There is a plant seed on its back right from t...<\/td>\n    <\/tr><tr><th>1<\/th>\n      <td>2<\/td>\n      <td>I<\/td>\n      <td>Ivysaur<\/td>\n      <td>Fushigisou<\/td>\n      <td>grass<\/td>\n      <td>poison<\/td>\n      <td>Seed Pok&eacute;mon<\/td>\n      <td>88.14<\/td>\n      <td>11.86<\/td>\n      <td>1.0<\/td>\n   
   <td>...<\/td>\n      <td>Level<\/td>\n      <td>Ivysaur<\/td>\n      <td>Level<\/td>\n      <td>Venusaur<\/td>\n      <td>NaN<\/td>\n      <td>NaN<\/td>\n      <td>NaN<\/td>\n      <td>NaN<\/td>\n      <td>NaN<\/td>\n      <td>When the bulb on its back grows large, it appe...<\/td>\n    <\/tr><tr><th>2<\/th>\n      <td>3<\/td>\n      <td>I<\/td>\n      <td>Venusaur<\/td>\n      <td>Fushigibana<\/td>\n      <td>grass<\/td>\n      <td>poison<\/td>\n      <td>Seed Pok&eacute;mon<\/td>\n      <td>88.14<\/td>\n      <td>11.86<\/td>\n      <td>2.0<\/td>\n      <td>......<\/td><\/tr><\/tbody><\/table><\/div>","summary":"This tutorial uses Pok\u00e9mon to introduce readers to data science with pandas and matplotlib.","date_modified":"2025-07-23T16:49:02+02:00","tags":["data science","matplotlib","pandas","programming","python"],"image":"\/user\/pages\/02.blog\/learn-pandas-and-matplotlib-with-pokemon\/thumbnail.webp"}]}
