How to Use the pivot_table Function for Advanced Data Summarization in Pandas - KDnuggets

Image by Author | Midjourney

Let me guide you through using the Pandas pivot_table function for data summarization.

Preparation

Let’s start by installing the necessary packages.

pip install pandas seaborn

Then, we load the packages and the example dataset, which is Titanic.

import pandas as pd
import seaborn as sns

titanic = sns.load_dataset('titanic')

With the packages installed and the dataset loaded, let’s move on to the next section.

Pivot Table with Pandas

Pivot tables in Pandas allow for flexible data reorganization and analysis. Let’s look at some practical applications, starting with a simple one.

pivot = pd.pivot_table(titanic, values="age", index='class', columns="sex", aggfunc="mean")
print(pivot)

Output>>>
sex        female       male
class                       
First   34.611765  41.281386
Second  28.722973  30.740707
Third   21.750000  26.507589

The resulting pivot table displays average ages, with passenger classes on the vertical axis and sex categories across the top.

We can go even further with the pivot table and calculate both the mean and the sum of fares.
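To make the mechanics of that table more concrete, here is a minimal sketch showing that a mean pivot table is essentially a group-by on the two keys followed by an unstack. The small DataFrame below is a hypothetical stand-in with made-up values, not the real Titanic data, so the example runs without downloading anything.

```python
import pandas as pd

# Hypothetical miniature stand-in for the Titanic data (values are made up).
df = pd.DataFrame({
    "class": ["First", "First", "First", "Second", "Second", "Third", "Third", "Third"],
    "sex": ["female", "female", "male", "female", "male", "female", "male", "male"],
    "age": [30.0, 40.0, 50.0, 28.0, 32.0, 20.0, 24.0, 28.0],
})

# Pivot table: rows are classes, columns are sexes, cells are mean ages.
pivot = pd.pivot_table(df, values="age", index="class", columns="sex", aggfunc="mean")

# The same summary built by hand: group by both keys, average,
# then move the "sex" level from the row index into the columns.
grouped = df.groupby(["class", "sex"])["age"].mean().unstack("sex")

print(pivot)
print(grouped)
```

Both printouts should show the same numbers; pivot_table is mostly a more readable way to express this pattern.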

pivot = pd.pivot_table(titanic, values="fare", index='class', columns="sex", aggfunc=['mean', 'sum'])
print(pivot)

Output>>>
              mean                   sum           
sex         female       male     female       male
class                                              
First   106.125798  67.226127  9975.8250  8201.5875
Second   21.970121  19.741782  1669.7292  2132.1125
Third    16.118810  12.661633  2321.1086  4393.5865
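A related option, sketched below under the assumption that you want a different statistic per column, is to pass a dictionary to aggfunc that maps each value column to its own function. The DataFrame is again a hypothetical stand-in with made-up values.

```python
import pandas as pd

# Hypothetical miniature stand-in for the Titanic data (values are made up).
df = pd.DataFrame({
    "class": ["First", "First", "First", "Second", "Second", "Third", "Third", "Third"],
    "sex": ["female", "female", "male", "female", "male", "female", "male", "male"],
    "age": [30.0, 40.0, 50.0, 28.0, 32.0, 20.0, 24.0, 28.0],
    "fare": [100.0, 80.0, 60.0, 20.0, 25.0, 10.0, 8.0, 12.0],
})

# Mean of age and sum of fare in a single table.
pivot = pd.pivot_table(
    df,
    values=["age", "fare"],
    index="class",
    columns="sex",
    aggfunc={"age": "mean", "fare": "sum"},
)
print(pivot)

# Columns form a two-level index: (value column, sex).
print(pivot.loc["First", ("fare", "female")])
```

The column header becomes a two-level index, so individual cells are addressed with a tuple such as ("fare", "female").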

We can also supply our own function. For example, we create a function that takes the difference between the maximum and minimum values and divides it by two.

def data_div_two(x):
    return (x.max() - x.min())/2

pivot = pd.pivot_table(titanic, values="age", index='class', columns="sex", aggfunc=data_div_two)
print(pivot)

Output>>>
sex     female    male
class                 
First   30.500  39.540
Second  27.500  34.665
Third   31.125  36.790
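A custom function can also sit next to a built-in aggregation in the same call. The sketch below assumes a hypothetical helper named half_range (the same idea as the function above) and a made-up stand-in DataFrame; each group is passed to the function as a pandas Series.

```python
import pandas as pd

# Hypothetical miniature stand-in for the Titanic data (values are made up).
df = pd.DataFrame({
    "class": ["First", "First", "First", "Second", "Second", "Third", "Third", "Third"],
    "sex": ["female", "female", "male", "female", "male", "female", "male", "male"],
    "age": [30.0, 40.0, 50.0, 28.0, 32.0, 20.0, 24.0, 28.0],
})

def half_range(x):
    # x is the Series of ages for one (class, sex) group.
    return (x.max() - x.min()) / 2

# Mix a built-in aggregation name with the custom function.
pivot = pd.pivot_table(
    df,
    values="age",
    index="class",
    columns="sex",
    aggfunc=["mean", half_range],
)
print(pivot)
```

The top level of the column header takes the function's name, so the custom results appear under half_range. A group with a single row, such as First-class males in this toy data, yields 0.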

Lastly, you can add margins to compare the overall average with the average of each sub-group.

pivot = pd.pivot_table(titanic, values="age", index='class', columns="sex", aggfunc="mean", margins=True)
print(pivot)

Output>>>
sex        female       male        All
class                                  
First   34.611765  41.281386  38.233441
Second  28.722973  30.740707  29.877630
Third   21.750000  26.507589  25.140620
All     27.915709  30.726645  29.699118
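One detail worth checking for yourself is that the margin values are computed from the underlying rows, not by averaging the cells already in the table. The sketch below uses a hypothetical stand-in DataFrame with made-up values and the margins_name parameter to relabel the "All" row and column.

```python
import pandas as pd

# Hypothetical miniature stand-in for the Titanic data (values are made up).
df = pd.DataFrame({
    "class": ["First", "First", "First", "Second", "Second", "Third", "Third", "Third"],
    "sex": ["female", "female", "male", "female", "male", "female", "male", "male"],
    "age": [30.0, 40.0, 50.0, 28.0, 32.0, 20.0, 24.0, 28.0],
})

pivot = pd.pivot_table(
    df,
    values="age",
    index="class",
    columns="sex",
    aggfunc="mean",
    margins=True,
    margins_name="Overall",  # label used instead of the default "All"
)
print(pivot)

# Margin for First class: mean over all three First-class rows.
first_margin = pivot.loc["First", "Overall"]
# Naive alternative: average of the two cell means in that row.
mean_of_means = pivot.loc["First", ["female", "male"]].mean()
print(first_margin, mean_of_means)
```

In this toy data the First-class margin is the mean of 30, 40, and 50, while the average of the two cell means (35 and 50) is a different number, because the groups have different sizes.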

Mastering the pivot_table function will allow you to gain insight from your dataset.

Additional Resources

Cornellius Yudha Wijaya is a data science assistant manager and data writer. While working full-time at Allianz Indonesia, he loves to share Python and data tips via social media and writing media. Cornellius writes on a variety of AI and machine learning topics.
