In this article, we will see how to search for a value within the rows of a Pandas DataFrame in Python.
Importing Libraries and Data
Here we import the required module and then read the data file into a DataFrame.
The link to dataset used is here
Python3
import pandas as pd
df = pd.read_csv("data.csv")
Output:
Searching a Value
Here we will search for a value within a column of the dataframe.
Syntax : df[df['column_name'] == value_you_are_looking_for]
where df is our DataFrame
We will select all rows that have the value "Yes" in the Purchased column.
Python3
df[df["Purchased"] == "Yes"]
Output:
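Since the dataset file is not reproduced here, the following is a minimal sketch on a small made-up DataFrame. The Age and Purchased column names mirror the ones used in this article, but the values are invented for illustration. It shows that the comparison first builds a boolean mask, which is then used to select rows.

```python
import pandas as pd

# Hypothetical stand-in for data.csv; the values are invented for illustration.
df = pd.DataFrame({
    "Age": [25, 35, 38, 40, 52],
    "Purchased": ["No", "Yes", "No", "Yes", "Yes"],
})

# Step 1: the comparison yields one True/False value per row.
mask = df["Purchased"] == "Yes"

# Step 2: indexing with the mask keeps only the rows marked True.
purchased = df[mask]

print(mask)
print(purchased)
```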
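We can also combine more than one condition in a search. Let's see an example that finds all rows whose Age value is between 35 and 40 inclusive. Each condition must be wrapped in parentheses, and the conditions are combined with & (and) or | (or).
Syntax : df[condition]
where df is our DataFrame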
Python3
df[(df["Age"] >= 35) & (df["Age"] <= 40)]
Output:
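For range checks like this one, pandas also offers two shorter spellings that should give the same rows: Series.between(), which is inclusive on both ends by default, and DataFrame.query(), which takes the condition as a string. The sketch below uses a small made-up DataFrame in place of the article's dataset, so the values are assumptions.

```python
import pandas as pd

# Hypothetical stand-in for data.csv; the values are invented for illustration.
df = pd.DataFrame({
    "Age": [25, 35, 38, 40, 52],
    "Purchased": ["No", "Yes", "No", "Yes", "Yes"],
})

# Explicit form: two comparisons joined with &.
explicit = df[(df["Age"] >= 35) & (df["Age"] <= 40)]

# between() includes both endpoints by default.
with_between = df[df["Age"].between(35, 40)]

# query() evaluates a string expression against the column names.
with_query = df.query("Age >= 35 and Age <= 40")

# Conditions on different columns combine the same way.
combined = df[(df["Age"] >= 35) & (df["Purchased"] == "Yes")]

print(explicit)
print(combined)
```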
Last Updated :
01 Dec, 2021
Prerequisites: pandas
In this article, let's discuss how to search a data frame for a specific value using pandas.
Function used
- where() checks a data frame against one or more conditions and returns a result of the same shape. By default, values in rows that do not satisfy the condition are replaced with NaN.
- dropna() drops rows or columns that contain null values. In this article it is used to remove the rows that where() filled with NaN because they did not satisfy the condition.
Approach
- Import modules
- Create data
- Traverse through the column looking for a specific value
- If matched, select
There is a basic difference between selecting a single specific value and selecting all rows that have a specific value. In the latter case, the indices to be retrieved have to be stored in a list. Both cases are implemented in this article:
Data frame in use:
Example 1: Select the row whose salary is 200
Python3
import pandas as pd

# Sample data
x = pd.DataFrame([["A", 100, "D"], ["B", 200, "E"], ["C", 100, "F"]],
                 columns=["Name", "Salary", "Department"])

# Walk through the Salary column and remember the position of the match
for i in range(len(x.Name)):
    if 200 == x.Salary[i]:
        indx = i

# Retrieve the matching row by position
x.iloc[indx]
Output:
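The loop above keeps only the last matching position and leaves indx undefined when nothing matches. As a hedged alternative, the same lookup can be sketched with a boolean mask, which returns every matching row and an empty result when there is none. The variable names mask, matches and match_labels are illustrative and not part of the original example.

```python
import pandas as pd

x = pd.DataFrame([["A", 100, "D"], ["B", 200, "E"], ["C", 100, "F"]],
                 columns=["Name", "Salary", "Department"])

# Boolean mask: True where Salary equals 200.
mask = x["Salary"] == 200

# All matching rows as a DataFrame (a single row for this data).
matches = x[mask]

# Index labels of the matching rows, in case the positions are needed.
match_labels = x.index[mask].tolist()

print(matches)
print(match_labels)
```

Because no Python-level loop is involved, this form also tends to scale better on large frames.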
Example 2: Search for people having salary of 100 and store the output in a dataframe again.
Python3
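import pandas as pd

# Sample data
x = pd.DataFrame([["A", 100, "D"], ["B", 200, "E"], ["C", 100, "F"]],
                 columns=["Name", "Salary", "Department"])

# Collect the positions of all rows whose Salary is 100
indx = []
for i in range(len(x.Name)):
    if 100 == x.Salary[i]:
        indx.append(i)

# Build a new DataFrame from the matching rows.
# DataFrame.append was removed in pandas 2.0, so select by position instead.
df = x.iloc[indx]

# Alternative: where() fills non-matching rows with NaN, dropna() removes them
df = x.where(x.Salary == 100)
df.dropna()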
Output:
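One side effect of the where() and dropna() route is worth checking: because where() temporarily fills non-matching rows with NaN, an integer column is converted to float. The sketch below compares it with plain boolean indexing on the same data. The names via_where and via_mask are illustrative only.

```python
import pandas as pd

x = pd.DataFrame([["A", 100, "D"], ["B", 200, "E"], ["C", 100, "F"]],
                 columns=["Name", "Salary", "Department"])

# Route 1: where() fills non-matching rows with NaN, dropna() removes them.
via_where = x.where(x.Salary == 100).dropna()

# Route 2: boolean indexing selects the matching rows directly.
via_mask = x[x.Salary == 100]

# Same rows either way, but the Salary dtype differs.
print(via_where.dtypes["Salary"])
print(via_mask.dtypes["Salary"])
```

Note also that dropna() would discard a matching row if that row already contained a missing value in the original data, so boolean indexing is usually the safer choice when the data may have gaps.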
Last Updated :
17 Dec, 2020
dframe['Last Name'] == 'Turner'
The line above produces a pandas.Series of boolean items that represent whether or not each entry in the 'Last Name' column matches 'Turner'.
You can use that pandas.Series of boolean items to index your dataframe:
dframe[dframe['Last Name'] == 'Turner']
That should leave you with your desired selection of rows.
Now, if you only wish to look at the 'First Name' for the selected rows, you can do
dframe[dframe['Last Name'] == 'Turner']['First Name']
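The chained form above works for reading values. A commonly recommended alternative is a single .loc call that takes the row mask and the column label together, which is also the form to use when assigning to the selection. A minimal sketch, assuming a dframe built from the same names as the session later in this answer:

```python
import pandas as pd

# Assumed sample data, mirroring the names used in the IPython session.
dframe = pd.DataFrame({
    "First Name": ["John", "Tom", "Fred", "Michael", "Andrew"],
    "Last Name": ["Turner", "Harden", "Bryant", "Davis", "Turner"],
})

# Row mask and column label in one indexing step.
first_names = dframe.loc[dframe["Last Name"] == "Turner", "First Name"]
print(first_names)
```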
If you want to do a compound search for both first name and last name, you need to perform a bitwise boolean operation between results of individual searches:
dframe[(dframe['First Name'] == 'John') & (dframe['Last Name'] == 'Turner')]
Finally, to give you a little bonus, if you wish to find all last names that contain 'Turner', say something like 'Turner-Jones', you can do the following:
dframe[dframe['Last Name'].str.contains('Turner')]
In the line above you are using the .str accessor on the pandas.Series, which gives you access to a set of very convenient string methods. You can read more about it in the documentation.
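A few optional parameters of str.contains often matter in practice: case for case-insensitive matching, na for how missing values are treated, and regex to switch off regular-expression interpretation. The sketch below uses a made-up Series (not data from the question) to illustrate them.

```python
import pandas as pd

# Hypothetical names, including a missing value and a literal dot.
names = pd.Series(["Turner", "turner-Jones", "Harden", None, "A.Turner"])

# case=False ignores case; na=False treats missing entries as non-matches,
# so the result is a clean boolean mask that can be used for indexing.
ci = names.str.contains("turner", case=False, na=False)

# regex=False searches for the literal text; otherwise "." matches any character.
literal = names.str.contains(".", regex=False, na=False)

print(names[ci])
print(names[literal])
```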
Below I show a working example from an IPython session:
In [1]: import pandas as pd
In [2]: import numpy as np
In [3]: first_names = ['John', 'Tom', 'Fred', 'Michael', 'Andrew']
In [4]: last_names = ['Turner', 'Harden', 'Bryant', 'Davis', 'Turner']
In [5]: df = pd.DataFrame(list(zip(first_names, last_names)), columns=['First Name', 'Last Name'])
In [6]: df
Out[6]:
  First Name Last Name
0       John    Turner
1        Tom    Harden
2       Fred    Bryant
3    Michael     Davis
4     Andrew    Turner
In [7]: df[df['Last Name'] == 'Turner']
Out[7]:
  First Name Last Name
0       John    Turner
4     Andrew    Turner
In [8]: df[(df['First Name'] == 'John') & (df['Last Name'] == 'Turner')]
Out[8]:
  First Name Last Name
0       John    Turner
In [9]: df[df['Last Name'].str.contains('r')]
Out[9]:
  First Name Last Name
0       John    Turner
1        Tom    Harden
2       Fred    Bryant
4     Andrew    Turner
In [10]: (df['Last Name'] == 'Turner').any()
Out[10]: True
Notice that in the input labeled In[10] I went ahead and verified if there were any matches by calling the any() method on the boolean pandas.Series. This can be a helpful way of debugging your search if you are having issues getting the results you expect.
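Along the same lines, the boolean Series can be summarised in other ways while debugging. The sketch below, which rebuilds the example frame so it runs on its own, counts the matches with sum() and lists the index labels where they occur.

```python
import pandas as pd

first_names = ["John", "Tom", "Fred", "Michael", "Andrew"]
last_names = ["Turner", "Harden", "Bryant", "Davis", "Turner"]
df = pd.DataFrame(list(zip(first_names, last_names)),
                  columns=["First Name", "Last Name"])

mask = df["Last Name"] == "Turner"

has_match = bool(mask.any())        # is there at least one match?
n_matches = int(mask.sum())         # True counts as 1, so this counts matches
where_found = df.index[mask].tolist()  # index labels of the matching rows

print(has_match, n_matches, where_found)
```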
17 Aug 2022
Often you may want to select the rows of a pandas DataFrame in which a certain value appears in any of the columns.
Fortunately, this is easy to do using the pandas .any() function. This tutorial explains several examples of how to use this function in practice.
Example 1: Find a Value in Any Column
Suppose we have the following pandas DataFrame:
import pandas as pd
#create DataFrame
df = pd.DataFrame({'points': [25, 12, 15, 14, 19],
'assists': [5, 7, 7, 9, 12],
'rebounds': [11, 8, 10, 6, 6]})
#view DataFrame
print(df)
#   points  assists  rebounds
#0      25        5        11
#1      12        7         8
#2      15        7        10
#3      14        9         6
#4      19       12         6
The following syntax shows how to select all rows of the DataFrame that contain the value 25 in any of the columns:
df[df.isin([25]).any(axis=1)]
#   points  assists  rebounds
#0      25        5        11
The following syntax shows how to select all rows of the DataFrame that contain the values 25, 9, or 6 in any of the columns:
df[df.isin([25, 9, 6]).any(axis=1)]
#   points  assists  rebounds
#0      25        5        11
#3      14        9         6
#4      19       12         6
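The mask returned by isin() has the same shape as the DataFrame, so it can also answer where a value was found, not only which rows contain it. A minimal sketch on the same data follows. The variable names hits, rows_with_hit and cols_with_hit are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'points': [25, 12, 15, 14, 19],
                   'assists': [5, 7, 7, 9, 12],
                   'rebounds': [11, 8, 10, 6, 6]})

# One True/False per cell: True where the cell equals 6 or 9.
hits = df.isin([6, 9])

# axis=1 reduces across columns: which rows contain a hit?
rows_with_hit = df.index[hits.any(axis=1)].tolist()

# axis=0 reduces down the rows: which columns contain a hit?
cols_with_hit = df.columns[hits.any(axis=0)].tolist()

print(rows_with_hit)
print(cols_with_hit)
```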
Example 2: Find a Character in Any Column
Suppose we have the following pandas DataFrame:
import pandas as pd
#create DataFrame
df = pd.DataFrame({'points': [25, 12, 15, 14, 19],
'assists': [5, 7, 7, 9, 12],
'position': ['G', 'G', 'F', 'F', 'C']})
#view DataFrame
print(df)
#   points  assists position
#0      25        5        G
#1      12        7        G
#2      15        7        F
#3      14        9        F
#4      19       12        C
The following syntax shows how to select all rows of the DataFrame that contain the character G in any of the columns:
df[df.isin(['G']).any(axis=1)]
#   points  assists position
#0      25        5        G
#1      12        7        G
The following syntax shows how to select all rows of the DataFrame that contain the values G or C in any of the columns:
df[df.isin(['G', 'C']).any(axis=1)]
#   points  assists position
#0      25        5        G
#1      12        7        G
#4      19       12        C
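Note that isin() tests for exact equality, so it would not find 'G' inside a longer string such as 'PG'. For substring matches across all columns, one possible approach is to convert every column to text and apply str.contains column by column. The sketch below assumes hypothetical two-letter position codes that are not part of the original data.

```python
import pandas as pd

# Hypothetical data: position codes are longer than a single character.
df = pd.DataFrame({'points': [25, 12, 15, 14, 19],
                   'assists': [5, 7, 7, 9, 12],
                   'position': ['PG', 'SG', 'SF', 'PF', 'C']})

# Exact match: no cell is exactly 'G', so nothing is selected.
exact = df[df.isin(['G']).any(axis=1)]

# Substring match: cast every column to text, test each column, combine per row.
as_text = df.astype(str)
partial_mask = as_text.apply(
    lambda col: col.str.contains('G', regex=False)
).any(axis=1)
partial = df[partial_mask]

print(len(exact))
print(partial)
```

Casting numbers to text means a search for '2' would also match 25 and 12, so this is best limited to the columns where substring matching makes sense.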
Python Pandas Code Example to Search for a Value in a DataFrame Column
When working with a large dataset on a machine learning or data science project, you often need to search for certain values in one feature and, for those values, retrieve the corresponding values from other features. Searching for values within a dataset might sound complicated, but Python Pandas makes it easy.
The Python Pandas Code below does the following:
1. Creates data dictionary and converts it into DataFrame
2. Uses the “where” function to select the desired values. The pandas.DataFrame.where() function works like an if-then idiom: it keeps a value where the condition is true and replaces it (with NaN by default) where the condition is false.
Python Pandas Sample Code to Find Value in DataFrame
Below is the pandas code in python to search for a value within a Pandas DataFrame column –
Step 1 – Import the library
import pandas as pd
We have only imported the python pandas library which is needed for this code example.
Step 2 – Setting up the Data
We have created a dictionary of data and passed it to pd.DataFrame to make a dataframe with columns ‘first_name’, ‘last_name’, ‘age’, ‘Comedy_Score’ and ‘Rating_Score’.
raw_data = {'first_name': ['Sheldon', 'Raj', 'Leonard', 'Howard', 'Amy'],
'last_name': ['Copper', 'Koothrappali', 'Hofstadter', 'Wolowitz', 'Fowler'],
'age': [42, 38, 36, 41, 35],
'Comedy_Score': [9, 7, 8, 8, 5],
'Rating_Score': [25, 25, 49, 62, 70]}
df = pd.DataFrame(raw_data, columns = ['first_name', 'last_name', 'age',
'Comedy_Score', 'Rating_Score'])
print(df)
Step 3 – Searching the Values in the DataFrame
We search the feature Rating_Score for values less than 50 and, for those rows, select the corresponding values in Comedy_Score. Rows that do not satisfy the condition appear as NaN in the result.
print(df['Comedy_Score'].where(df['Rating_Score'] < 50))
The output is as shown below –
  first_name     last_name  age  Comedy_Score  Rating_Score
0    Sheldon        Copper   42             9            25
1        Raj  Koothrappali   38             7            25
2    Leonard    Hofstadter   36             8            49
3     Howard      Wolowitz   41             8            62
4        Amy        Fowler   35             5            70
0    9.0
1    7.0
2    8.0
3    NaN
4    NaN
Name: Comedy_Score, dtype: float64
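Two variations on this step may be useful, sketched here on the same data. Selecting with .loc returns only the matching rows (so no NaN appears and the integer dtype is kept), while the other argument of where() supplies a replacement value in place of NaN, which completes the if-then-else reading. The variable names cond, only_matches and with_default are illustrative.

```python
import pandas as pd

raw_data = {'first_name': ['Sheldon', 'Raj', 'Leonard', 'Howard', 'Amy'],
            'last_name': ['Copper', 'Koothrappali', 'Hofstadter', 'Wolowitz', 'Fowler'],
            'age': [42, 38, 36, 41, 35],
            'Comedy_Score': [9, 7, 8, 8, 5],
            'Rating_Score': [25, 25, 49, 62, 70]}
df = pd.DataFrame(raw_data)

cond = df['Rating_Score'] < 50

# Only the rows where the condition holds; nothing is filled with NaN.
only_matches = df.loc[cond, 'Comedy_Score']

# Same length as the original column; non-matching rows get 0 instead of NaN.
with_default = df['Comedy_Score'].where(cond, other=0)

print(only_matches)
print(with_default)
```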