Skip to content
Related Articles

Related Articles

Improve Article
Save Article
Like Article

Python | Pandas DataFrame.where()

  • Last Updated : 17 Sep, 2018

Python is a great language for doing data analysis, primarily because of the fantastic ecosystem of data-centric python packages. Pandas is one of those packages and makes importing and analyzing data much easier.

Pandas where() method is used to check a data frame for one or more condition and return the result accordingly. By default, The rows not satisfying the condition are filled with NaN value.

 Attention geek! Strengthen your foundations with the Python Programming Foundation Course and learn the basics.  

To begin with, your interview preparations Enhance your Data Structures concepts with the Python DS Course. And to begin with your Machine Learning Journey, join the Machine Learning - Basic Level Course

Syntax:
DataFrame.where(cond, other=nan, inplace=False, axis=None, level=None, errors=’raise’, try_cast=False, raise_on_error=None)
 



Parameters:

cond: One or more condition to check data frame for.
other: Replace rows which don’t satisfy the condition with user defined object, Default is NaN
inplace: Boolean value, Makes changes in data frame itself if True
axis: axis to check( row or columns)

For link to the CSV file used, Click here.

Example #1: Single Condition operation

In this example, rows having particular Team name will be shown and rest will be replaced by NaN using .where() method.




# importing pandas package
import pandas as pd
  
# making data frame from csv file
data = pd.read_csv("nba.csv")
  
# sorting dataframe
data.sort_values("Team", inplace = True)
  
# making boolean series for a team name
filter = data["Team"]=="Atlanta Hawks"
  
# filtering data
data.where(filter, inplace = True)
  
# display
data


Output:

As shown in the output image, every row which doesn’t have Team = Atlanta Hawks is replaced with NaN.

 

Example #2: Multi-condition Operations

Data is filtered on the basis of both Team and Age. Only the rows having Team name “Atlanta Hawks” and players having age above 24 will be displayed.




# importing pandas package
import pandas as pd
  
# making data frame from csv file
data = pd.read_csv("nba.csv")
  
# sorting dataframe
data.sort_values("Team", inplace = True)
  
# making boolean series for a team name
filter1 = data["Team"]=="Atlanta Hawks"
  
# making boolean series for age
filter2 = data["Age"]>24
  
# filtering data on basis of both filters
data.where(filter1 & filter2, inplace = True)
  
# display
data


Output:
As shown in the output image, Only the rows having Team name “Atlanta Hawks” and players having age above 24 are displayed.




My Personal Notes arrow_drop_up
Recommended Articles
Page :

Start Your Coding Journey Now!