One of my favorite things about tech is the ability to work with data. It’s everywhere, and it has stories to tell. Data analysts, engineers, and scientists can peek at the data, shape the data, and create visualizations that tell those stories.
Kaggle is one of the many resources I use as I work on building a practical data engineering curriculum. Recently, I discovered the Ramen Ratings dataset. I grew up with Maruchan, and I married into a Chinese family, so I’ve been exposed to various other brands as well.
I wanted to poke around the data and generate some visualizations to see what stories I could find about ramen.
With 2,580 records in the dataset, I was curious to see which brands had the most entries. Using pandas, I created a bar graph of the 10 brands that appear most frequently.
Looking at this graph, Nissin had the most ratings by a long shot. Do we know why this is? No. But Nissin has a few hundred more ratings than any of the others. Also, keep in mind that I am looking at only the top 10 Brand values; there are many others. In fact, there are 355 different Brand values in this dataset.
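Here is a minimal sketch of how a brand count like this could be produced with pandas. It is not necessarily the exact code behind the graph: the toy rows are made up so the snippet runs on its own, and the file name in the comment is an assumption. Only the `Brand` column name comes from the dataset as described above.

```python
import pandas as pd

# Made-up rows for illustration only; the real dataset has 2,580 records.
# Loading the real file would look something like
# pd.read_csv("ramen-ratings.csv"), where the file name is an assumption.
ramen = pd.DataFrame(
    {"Brand": ["Nissin", "Nissin", "Nissin", "Maruchan", "Maruchan", "Nongshim"]}
)


def top_brands(df, n=10):
    """Count rows per Brand and keep the n most frequent."""
    return df["Brand"].value_counts().head(n)


def plot_top_brands(counts):
    """Draw the counts as a bar graph and return the matplotlib Axes."""
    # pandas delegates plotting to matplotlib, which must be installed.
    ax = counts.plot(kind="bar", title="Brands with the most reviews")
    ax.set_xlabel("Brand")
    ax.set_ylabel("Number of reviews")
    return ax


counts = top_brands(ramen)
# Number of distinct Brand values (the post reports 355 on the real data).
distinct_brands = ramen["Brand"].nunique()
```

`value_counts()` already sorts from most to least frequent, so `head(10)` is enough to get the top 10. Calling `plot_top_brands(counts)` draws the chart.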
I was curious to see if the country mattered when it came to 5-star rated ramen. Many countries in the list have 5-star rated ramen. The horizontal bar graph below shows that Japan had the most 5-star ratings, and that the countries with 5-star ratings are mostly Asian, with the USA as the exception.
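A count of 5-star ratings per country could be sketched roughly like this. The `Country` and `Stars` column names are assumptions about the dataset's layout, and the rows are made up for illustration. The ratings are converted with `pd.to_numeric` in case the column is stored as text or contains non-numeric entries.

```python
import pandas as pd

# Made-up rows for illustration only. "Country" and "Stars" are assumed
# column names, and "Unrated" stands in for any non-numeric entry.
ramen = pd.DataFrame(
    {
        "Country": ["Japan", "Japan", "USA", "Malaysia", "Japan", "USA"],
        "Stars": ["5", "5.0", "3.5", "5", "Unrated", "5"],
    }
)


def five_star_counts(df):
    """Count 5-star rows per Country, most frequent first."""
    # Values that cannot be parsed as numbers become NaN and never equal 5.
    stars = pd.to_numeric(df["Stars"], errors="coerce")
    return df.loc[stars == 5, "Country"].value_counts()


def plot_five_star_counts(counts):
    """Draw a horizontal bar graph and return the matplotlib Axes."""
    # Sorting ascending puts the largest bar at the top of a barh chart.
    ax = counts.sort_values().plot(kind="barh", title="5-star ratings by country")
    ax.set_xlabel("Number of 5-star ratings")
    return ax


five_star = five_star_counts(ramen)
```

Calling `plot_five_star_counts(five_star)` draws the chart; matplotlib needs to be installed for that step.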
I also wanted to see how the ratings were distributed in the dataset, so I created a histogram:
Looking at the histogram, the ratings are concentrated between 3.5 and 5.0.
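As a rough sketch, the histogram and the share of ratings in that 3.5 to 5.0 range could be computed like this. The `Stars` column name is an assumption and the rows are made up, so the resulting number says nothing about the real dataset.

```python
import pandas as pd

# Made-up rows for illustration only; "Stars" is an assumed column name.
ramen = pd.DataFrame(
    {"Stars": ["5", "4.25", "3.5", "3.75", "2", "0.5", "4", "Unrated"]}
)


def numeric_stars(df):
    """Return the ratings as floats, dropping entries that are not numbers."""
    return pd.to_numeric(df["Stars"], errors="coerce").dropna()


def plot_rating_histogram(stars, bins=10):
    """Draw a histogram of the ratings and return the matplotlib Axes."""
    ax = stars.plot(kind="hist", bins=bins, title="Distribution of ratings")
    ax.set_xlabel("Stars")
    return ax


stars = numeric_stars(ramen)
# Fraction of ratings from 3.5 to 5.0, both ends included.
share_high = stars.between(3.5, 5.0).mean()
```

`between` is inclusive on both ends by default, so a rating of exactly 3.5 or 5.0 counts toward the share.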
Since Style is one of the fields we can access, I was curious to see how the ratings stacked up for each style. I created a box plot, styled with Seaborn, to compare the ratings by Style:
Looking at this box plot, most of the styles rated between 3 and 4.5, with Box ramen having a lot of 5-star ratings. There is one Can entry in this dataset, with a 3.5-star rating – Pringles Nissin Top Ramen Chicken Flavor Potato Crisps. And that Bar… for ramen eaters who are thinking… “Ramen Bar?!? What…?!?” Komforte Chockolate released a Ramen Bar that has a 5-star rating in this dataset.
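A box plot by Style could be sketched along these lines. Treat it as an illustration under assumptions rather than the exact code behind the plot: the `Stars` column name is assumed, the rows are made up, and the `whitegrid` theme is just one possible Seaborn style.

```python
import pandas as pd

# Made-up rows for illustration only; "Stars" is an assumed column name.
ramen = pd.DataFrame(
    {
        "Style": ["Pack", "Pack", "Pack", "Cup", "Cup", "Box", "Box"],
        "Stars": ["3.5", "4", "5", "3", "4.5", "3.75", "Unrated"],
    }
)


def clean_ratings(df):
    """Return a copy with Stars as floats and non-numeric rows removed."""
    out = df.copy()
    out["Stars"] = pd.to_numeric(out["Stars"], errors="coerce")
    return out.dropna(subset=["Stars"])


def style_summary(df):
    """Per-style count and quartiles, the numbers a box plot draws."""
    return df.groupby("Style")["Stars"].describe()[["count", "25%", "50%", "75%"]]


def plot_by_style(df):
    """Draw a Seaborn box plot of Stars per Style and return the Axes."""
    # Imported here so the summary above still works without Seaborn installed.
    import seaborn as sns

    sns.set_theme(style="whitegrid")
    return sns.boxplot(data=df, x="Style", y="Stars")


rated = clean_ratings(ramen)
summary = style_summary(rated)
```

The `summary` table is a handy cross-check on the plot: the `50%` column is the line inside each box, and `25%` and `75%` are the box edges. A style with a single entry, like the Can and Bar mentioned above, shows up as a flat line rather than a box.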
This gives you a high-level overview of what to expect if you ever visit The Ramen Rater. It looks like he reviews a variety of brands, though Nissin is the most common by a long shot. Ramen comes in a variety of forms, and the form doesn't seem to matter: the reviews mostly land in the average-to-awesome range rather than below 3.
This also gives you some insight into creating visualizations: these were created using Python, pandas, and Seaborn.
If you want to learn more about how I created these visualizations, reach out to my teammates at Stage 3 Talent via email at firstname.lastname@example.org and ask them about the data engineering course that is coming soon!