Как найти процент pandas - Сайт, где вы сможете решить свои вопросы

У меня есть DataFrame, который содержит различные значения.

import pandas as pd 
df = pd.DataFrame({"data": [1, 1, 1, 1, 0, 0, 0, 2, 2, 3]})

Я хочу посчитать, сколько процентов от общих данных занимает каждое значение, то есть получить таблицу вида:

value | percent
_____________________
0     | 30 ( или 0.3)
1     | 40 ( или 0.4)
2     | 20 ( или 0.2)
3     | 10 ( или 0.1)

Я могу посчитать так:

# Добавляю еще одну колонку, чтобы нормально посчитать count()
df['column'] = 1
df2 = df.groupby('data').count()
df2['percent'] = df2['column'] / len(df.index)

И получаю искомое:

      column  percent
data                 
0          3      0.3
1          4      0.4
2          2      0.2
3          1      0.1

Однако, меня не покидает ощущение, что я все делаю не так. И подобные вопросы должны решаться намного проще.
Подскажите, как лучше решить мою задачу?

Источник

Improve Article

Save Article

Like Article

Read

Discuss

Improve Article

Save Article

Like Article

A Percentage is calculated by the mathematical formula of dividing the value by the sum of all the values and then multiplying the sum by 100. This is also applicable in Pandas Dataframes. Here, the pre-defined sum() method of pandas series is used to compute the sum of all the values of a column.

Syntax: Series.sum()

Return: Returns the sum of the values.

Formula:

df[percent] = (df['column_name'] / df['column_name'].sum()) * 100

Example 1:

Python3

import pandas as pd

import numpy as np

df1 = {

'Name': ['abc', 'bcd', 'cde',

'def', 'efg', 'fgh',

'ghi'],

'Math_score': [52, 87, 49,

74, 28, 59,

48]}

df1 = pd.DataFrame(df1,

columns = ['Name',

'Math_score'])

df1['percent'] = (df1['Math_score'] /

df1['Math_score'].sum()) * 100

df1

Output:

Example 2:

Python3

import pandas as pd

df1 = {

'Name': ['abc', 'bcd', 'cde',

'def', 'efg', 'fgh',

'ghi'],

'Math_score': [52, 87, 49,

74, 28, 59,

48],

'Eng_score': [34, 67, 25,

89, 92, 45,

86]

}

df1 = pd.DataFrame(df1,

columns = ['Name',

'Math_score','Eng_score'])

df1['Eng_percent'] = (df1['Eng_score'] /

df1['Eng_score'].sum()) * 100

df1

Output:

Last Updated :
28 Jul, 2020

Like Article

Save Article

Источник

17 авг. 2022 г.
читать 1 мин

Вы можете использовать следующий синтаксис для расчета процента от общего числа внутри групп в pandas:

df['values_var'] / df.groupby('group_var')['values_var']. transform('sum')

В следующем примере показано, как использовать этот синтаксис на практике.

Пример. Вычисление процента от общего числа внутри группы

Предположим, у нас есть следующий кадр данных pandas, который показывает очки, набранные баскетболистами в разных командах:

import pandas as pd

#create DataFrame
df = pd.DataFrame({'team': ['A', 'A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'B'],
 'points': [12, 29, 34, 14, 10, 11, 7, 36, 34, 22]})

#view DataFrame
print(df)

 team points
0 A 12
1 A 29
2 A 34
3 A 14
4 A 10
5 B 11
6 B 7
7 B 36
8 B 34
9 B 22

Мы можем использовать следующий синтаксис для создания нового столбца в DataFrame, который показывает процент от общего количества набранных очков, сгруппированных по командам:

#calculate percentage of total points scored grouped by team
df['team_percent'] = df['points'] / df.groupby('team')['points']. transform('sum')

#view updated DataFrame
print(df)

 team points team_percent
0 A 12 0.121212
1 A 29 0.292929
2 A 34 0.343434
3 A 14 0.141414
4 A 10 0.101010
5 B 11 0.100000
6 B 7 0.063636
7 B 36 0.327273
8 B 34 0.309091
9 B 22 0.200000

Столбец team_percent показывает процент от общего количества очков, набранных этим игроком в своей команде.

Например, игроки команды А набрали в сумме 99 очков.

Таким образом, игрок в первой строке DataFrame, набравший 12 очков, набрал в сумме 12/99 = 12,12% от общего количества очков для команды A.

Точно так же игрок во второй строке DataFrame, набравший 29 очков, набрал в общей сложности 29/99 = 29,29% от общего количества очков для команды A.

И так далее.

Примечание.Полную документацию по функции GroupBy можно найти здесь .

Дополнительные ресурсы

В следующих руководствах объясняется, как выполнять другие распространенные операции в pandas:

Pandas: как рассчитать совокупную сумму по группе
Pandas: как подсчитать уникальные значения по группам
Pandas: как рассчитать режим по группе
Pandas: как рассчитать корреляцию по группе

Источник

You can use the following syntax to calculate the percentage of a total within groups in pandas:

df['values_var'] / df.groupby('group_var')['values_var'].transform('sum')

The following example shows how to use this syntax in practice.

Example: Calculate Percentage of Total Within Group

Suppose we have the following pandas DataFrame that shows the points scored by basketball players on various teams:

import pandas as pd

#create DataFrame
df = pd.DataFrame({'team': ['A', 'A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'B'],
                   'points': [12, 29, 34, 14, 10, 11, 7, 36, 34, 22]})

#view DataFrame
print(df)

  team  points
0    A      12
1    A      29
2    A      34
3    A      14
4    A      10
5    B      11
6    B       7
7    B      36
8    B      34
9    B      22

We can use the following syntax to create a new column in the DataFrame that shows the percentage of total points scored, grouped by team:

#calculate percentage of total points scored grouped by team
df['team_percent'] = df['points'] / df.groupby('team')['points'].transform('sum')

#view updated DataFrame
print(df)

  team  points  team_percent
0    A      12      0.121212
1    A      29      0.292929
2    A      34      0.343434
3    A      14      0.141414
4    A      10      0.101010
5    B      11      0.100000
6    B       7      0.063636
7    B      36      0.327273
8    B      34      0.309091
9    B      22      0.200000

The team_percent column shows the percentage of total points scored by that player within their team.

For example, players on team A scored a total of 99 points.

Thus, the player in the first row of the DataFrame who scored 12 points scored a total of 12/99 = 12.12% of the total points for team A.

Similarly, the player in the second row of the DataFrame who scored 29 points scored a total of 29/99 = 29.29% of the total points for team A.

And so on.

Note: You can find the complete documentation for the GroupBy function here.

Additional Resources

The following tutorials explain how to perform other common operations in pandas:

Pandas: How to Calculate Cumulative Sum by Group
Pandas: How to Count Unique Values by Group
Pandas: How to Calculate Mode by Group
Pandas: How to Calculate Correlation By Group

Источник

Pandas is a powerful Python library for data manipulation and analysis, and one of its most useful features is the ability to create pivot tables. Pivot tables can help you summarize and analyze large datasets quickly and efficiently, and Pandas makes it easy to create them using the pivot_table() function. You can check out the Pandas Pivot Tutorial if you haven’t already

One common task when working with pivot tables is to calculate percentages. In this blog post, we’ll look at a few examples of how to do that using Pandas.

Lets create some data for us to then pivot

import pandas as pd

data = {'salesperson': ['Alice', 'Bob', 'Charlie', 'Alice', 'Charlie', 'Bob'],
        'product': ['A', 'B', 'C', 'A', 'B', 'C'],
        'sales_amount': [100, 200, 300, 150, 250, 200]}

df = pd.DataFrame(data)

Use Lambda to Create Percentage in Python Pandas Pivot Tables

We use a lambda function to calculate the percentage of column for each dimension in the row. The lambda function takes the values in the pivot table as input and applies the calculation that we can insert into aggfunc section of the pivot table. Here is an example of a lambda function lambda x: sum(x) / sum(df['column']) * 100) to them.

Suppose you have a dataset of sales data, with columns for the salesperson’s name, the product sold, and the sales amount. You want to create a pivot table that shows the percentage of total sales for each salesperson. Here’s how to do it:

Percent of Grand Total in a Pandas Pivot Table

pivot_table = pd.pivot_table(df, 
                             values='sales_amount',
                             index='salesperson',
                             columns='product',
                             fill_value=0, aggfunc=lambda x: round(sum(x)/sum(df['sales_amount']) * 100, 2))

pivot_table

Note that we are using the fill_value parameter to add zeros to avoid the NA values.

This the percent of all sales

Percent of Column Total in a Pandas Pivot Table

To get the percent of the column total we can use the same lambda function. However, we will need to apply it to the function to the columns. We can do this using the apply function. This needs to be applied after the table has been created. Use the pandas data frame from above to practice.

pd.pivot_table(data=df, index='Salesperson',
               columns='Product',
               values='Sales',
               aggfunc=sum,
               fill_value=0).apply(lambda x: x*100/sum(x))

Percent of Row Total in a Pandas Pivot Table

Now we will need to take a few extra steps to achieve this. We will create the pandas pivot table as we did in the first section. Once the table has been completed, we can now use the div() function. The div function will allow us specify how to divide the values in one DataFrame by the values in another DataFrame or a scalar value. The div() function takes several parameters, including other, fill_value, level, and axis. Since we can specify the axis we do take the percent of the row total. However, we need the row total which we can get with the margins function in our pivot table. Lets see how all this works together

table = pd.pivot_table(data=df, index='Salesperson',
               columns='Product',
               values='Sales',
               aggfunc=sum,
               fill_value=0,
               margins=True)
table.div(table.iloc[:,-1], axis=0 )

Using margins = True parameter will give us the total sum of the columns in a new column and specifying thata column using iloc[:,-1] within the divide function to get the percent of row total.

Источник

Python3

Python3

Пример. Вычисление процента от общего числа внутри группы

Дополнительные ресурсы

Example: Calculate Percentage of Total Within Group

Additional Resources

Use Lambda to Create Percentage in Python Pandas Pivot Tables

Percent of Grand Total in a Pandas Pivot Table

Percent of Column Total in a Pandas Pivot Table

Percent of Row Total in a Pandas Pivot Table

Вам также может понравиться

Как найти телевизионный сигнал

Как найти чужие деньги на улице

Как найти награды тружеников тыла

Добавить комментарий Отменить ответ