import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from plotnine import *
Kamble Pushkar Sidharth
Kathan Vishal Shah
Ramji Purwar
The Grammar of Graphics is a systematic approach to creating data visualizations that breaks a chart into separate components such as data, aesthetics, and geometries.
The Grammar of Graphics allows users to build complex plots easily by layering components instead of hardcoding each visualization.
Let’s see that with an example:
| | mpg | cylinders | displacement | horsepower | weight | acceleration | model_year | origin | name |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 18.0 | 8 | 307.0 | 130.0 | 3504 | 12.0 | 70 | usa | chevrolet chevelle malibu |
| 1 | 15.0 | 8 | 350.0 | 165.0 | 3693 | 11.5 | 70 | usa | buick skylark 320 |
| 2 | 18.0 | 8 | 318.0 | 150.0 | 3436 | 11.0 | 70 | usa | plymouth satellite |
| 3 | 16.0 | 8 | 304.0 | 150.0 | 3433 | 12.0 | 70 | usa | amc rebel sst |
| 4 | 17.0 | 8 | 302.0 | 140.0 | 3449 | 10.5 | 70 | usa | ford torino |
The Grammar of Graphics approach allows us to easily add additional information (car origin) to our plot without significantly changing the code structure.
Plotnine is a Python library based on the Grammar of Graphics, providing a structured and systematic way to create data visualizations.
Open a command prompt or terminal and run: pip install plotnine
In this section, we’ll demonstrate each of the key features of Plotnine, showing how they contribute to creating powerful and flexible visualizations using the Grammar of Graphics approach.
The Grammar of Graphics approach allows us to build plots layer by layer. Because plotnine is built on the principles of the Grammar of Graphics, this layering is straightforward to apply.
Let’s start with a basic scatter plot and then add layers to it.
Notice how we can easily add layers to our plot using the + operator. This makes the syntax simple and intuitive.
plotnine supports various geometries through its geom_* functions, which allow you to create different types of plots. These geometries are based on the Grammar of Graphics concept, similar to ggplot2 in R. Some key geometries include:
These geometries can be combined and layered to create complex visualizations. Let’s combine points and lines in one plot.
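# Reproducible synthetic data: two normal variables and a random category
np.random.seed(42)
df = pd.DataFrame({
    'x': np.random.normal(0, 1, 100),
    'y': np.random.normal(0, 1, 100),
    'category': np.random.choice(['A', 'B', 'C'], 100)
})

# Points plus one linear trend line per category (color splits the groups)
p = (ggplot(df, aes(x='x', y='y', color='category'))
     + geom_point()
     + geom_smooth(method='lm', se=False)
     + labs(title="Scatter Plot with Trend Lines",
            x="X-axis",
            y="Y-axis",
            color="Category")
     + theme_minimal()
)
# show() renders the figure; a separate draw() call beforehand is not needed
p.show()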
Statistical transformations in plotnine aggregate and transform your data before plotting. They compute new values from the input data, so you can display summary statistics or derived metrics instead of raw data points. Here are some common transformations:
By leveraging statistical transformations, plotnine enables you to create informative visualizations that go beyond simply plotting raw data, allowing for more insightful data exploration and presentation. Here’s an example…
Faceting in plotnine is a powerful technique that allows you to create multiple subplots based on categorical variables in your dataset. This feature enables you to split your main plot into several smaller plots, each representing a different category or combination of categories.
Types of faceting:
Plotnine offers a variety of themes for customizing the appearance of your plots.
We can easily customize labels and titles for our plots.
As you’ve seen in all examples, Plotnine works seamlessly with Pandas DataFrames.
Plotnine allows you to save plots easily. Here’s how you can save a plot:
This demonstration showcases the power and flexibility of Plotnine in implementing the Grammar of Graphics. Each feature contributes to making data visualization more intuitive, customizable, and powerful in Python.
Plotnine brings the Grammar of Graphics to Python, offering a clear and flexible way to create data visualizations. It breaks down plots into components like data, aesthetics, and geometries, allowing users to build visualizations layer by layer.
With seamless integration into Pandas and support for statistical transformations, Plotnine is a valuable tool for data scientists. From simple scatter plots to complex faceted charts, it enables customization and clarity in data presentation.
As data visualization remains key to analysis and communication, Plotnine helps create clear, reproducible, and visually appealing charts with ease.