Which operator is used to fetch data when any one of the multiple conditions is true?

This article describes how to select rows of pandas.DataFrame by multiple conditions.

  • Basic method for selecting rows of pandas.DataFrame
  • Select rows with multiple conditions
  • The operator precedence

Two points to note are:

  1. Use &, |, ~ (not and, or, not)
  2. Enclose each conditional expression in parentheses when using comparison operators

Error when using and, or, not:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Error when no parentheses:

TypeError: cannot compare a dtyped [object] array with a scalar of type [bool]
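Both errors can be reproduced with a minimal sketch like the one below. The small table `df_demo` and the helper `raised` are hypothetical names introduced only for this illustration, and the exact exception type and message in the no-parentheses case may differ between pandas versions, so the helper reports whatever is raised.

```python
import pandas as pd

# Small stand-in table (hypothetical); column names mirror the article's sample data.
df_demo = pd.DataFrame({
    'age': [24, 42, 18],
    'state': pd.Series(['NY', 'CA', 'CA'], dtype=object),
})


def raised(expr):
    """Call expr() and return the name of the exception it raises, or None."""
    try:
        expr()
    except Exception as exc:  # broad on purpose: the exact type can vary by version
        return type(exc).__name__
    return None


# Python's `and` calls bool() on a Series, which is ambiguous.
err_and = raised(lambda: (df_demo['age'] < 30) and (df_demo['state'] == 'CA'))

# Without parentheses, & binds more tightly than < and ==, so this is parsed
# as df_demo['age'] < (30 & df_demo['state']) == 'CA'.
err_no_parens = raised(lambda: df_demo['age'] < 30 & df_demo['state'] == 'CA')

# Element-wise & with each comparison in parentheses works as intended.
ok = (df_demo['age'] < 30) & (df_demo['state'] == 'CA')

print(err_and)        # ValueError
print(err_no_parens)  # name of the exception raised by the unparenthesized form
print(ok.tolist())    # [False, False, True]
```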

In the sample code, the following CSV file is read and used.

  • sample_pandas_normal.csv

import pandas as pd

df = pd.read_csv('data/src/sample_pandas_normal.csv')
print(df)
#       name  age state  point
# 0    Alice   24    NY     64
# 1      Bob   42    CA     92
# 2  Charlie   18    CA     70
# 3     Dave   68    TX     70
# 4    Ellen   24    CA     88
# 5    Frank   30    NY     57

The sample code uses pandas.DataFrame, but the same applies to pandas.Series.

Basic method for selecting rows of pandas.DataFrame

Using a list, array, or pandas.Series of bool values, you can select the rows where the value is True.

mask = [True, False, True, False, True, False]
df_mask = df[mask]
print(df_mask)
#       name  age state  point
# 0    Alice   24    NY     64
# 2  Charlie   18    CA     70
# 4    Ellen   24    CA     88
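In practice the mask usually comes from a comparison rather than a hand-written list. The following is a minimal sketch, assuming the sample table is rebuilt inline from the printed output above instead of being read from the CSV file; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Rebuilt inline from the table printed above (assumption: same values as the CSV).
df = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie', 'Dave', 'Ellen', 'Frank'],
    'age': [24, 42, 18, 68, 24, 30],
    'state': ['NY', 'CA', 'CA', 'TX', 'CA', 'NY'],
    'point': [64, 92, 70, 70, 88, 57],
})

# A comparison on a column returns a pandas.Series of bool.
is_under_30 = df['age'] < 30
print(is_under_30.tolist())
# [True, False, True, False, True, False]

# The Series, or the equivalent NumPy array, can be used as the mask.
by_series = df[is_under_30]
by_array = df[np.array(is_under_30)]
print(by_series['name'].tolist())
# ['Alice', 'Charlie', 'Ellen']
```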

Select rows with multiple conditions

You can get a pandas.Series of bool that is the AND of two conditions by using &.

Note that == and ~ are used here as the second condition for the sake of explanation, but you can use != as well.

print(df['age']
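A minimal sketch of combining conditions with &, | and ~ follows, assuming the sample table is rebuilt inline from the output printed earlier. The specific thresholds and variable names are chosen only for illustration. The | operator is the one that keeps a row when at least one of the conditions is True.

```python
import pandas as pd

# Rebuilt inline from the sample table shown earlier (assumption: same values as the CSV).
df = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie', 'Dave', 'Ellen', 'Frank'],
    'age': [24, 42, 18, 68, 24, 30],
    'state': ['NY', 'CA', 'CA', 'TX', 'CA', 'NY'],
    'point': [64, 92, 70, 70, 88, 57],
})

# AND: both conditions must be True.
and_rows = df[(df['age'] < 35) & (df['state'] == 'NY')]
print(and_rows['name'].tolist())
# ['Alice', 'Frank']

# OR: at least one condition must be True.
or_rows = df[(df['age'] < 20) | (df['point'] > 90)]
print(or_rows['name'].tolist())
# ['Bob', 'Charlie']

# NOT: ~ negates a condition; here it is equivalent to using !=.
not_rows = df[(df['age'] < 35) & ~(df['state'] == 'NY')]
ne_rows = df[(df['age'] < 35) & (df['state'] != 'NY')]
print(not_rows['name'].tolist())
# ['Charlie', 'Ellen']
```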
