Drop multiple columns using dplyr package in R
In this article, we will discuss how to drop multiple columns using dplyr package in R programming language.
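Dataset in use: a data frame named data1 with three columns (id, name and address) and ten rows, created at the top of each example.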
Drop multiple columns by using the column name
We can remove columns with the select() function by prefixing their names with a minus sign.
Syntax:
select(dataframe, -c(column_name1, column_name2, ..., column_name_n))
Where dataframe is the input dataframe and -c(column_names) is the collection of names of the columns to be removed.
Example: R program to remove multiple columns by column name
R
# load the library
library(dplyr)

# create dataframe with 3 columns: id, name and address
data1 <- data.frame(
  id = c(1, 2, 3, 4, 5, 6, 7, 1, 4, 2),
  name = c('sravan', 'ojaswi', 'bobby', 'gnanesh', 'rohith',
           'pinkey', 'dhanush', 'sravan', 'gnanesh', 'ojaswi'),
  address = c('hyd', 'hyd', 'ponnur', 'tenali', 'vijayawada',
              'vijayawada', 'guntur', 'hyd', 'tenali', 'hyd')
)

# remove name and id columns
print(select(data1, -c(id, name)))

# remove name and address columns
print(select(data1, -c(address, name)))

# remove all columns
print(select(data1, -c(address, name, id)))
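Output: the first call prints only the address column, the second prints only the id column, and the third prints a data frame with 0 columns and 10 rows.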
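When the names to drop are stored in a character vector rather than typed inline, the tidyselect helpers all_of() and any_of() can be negated in the same way. The following is a minimal sketch, assuming a recent dplyr (1.0 or later); the shortened data frame and the vector name drop_cols are illustrative and not part of the original example.

```r
library(dplyr)

# Shortened version of the article's data frame (first three rows only)
data1 <- data.frame(
  id = c(1, 2, 3),
  name = c("sravan", "ojaswi", "bobby"),
  address = c("hyd", "hyd", "ponnur")
)

# Hypothetical vector holding the names of the columns to drop
drop_cols <- c("id", "name")

# all_of() raises an error if any listed column does not exist
strict <- data1 %>% select(-all_of(drop_cols))
print(names(strict))   # "address"

# any_of() silently ignores names that are not present ("zip" here)
lenient <- data1 %>% select(-any_of(c("id", "zip")))
print(names(lenient))  # "name" "address"
```

all_of() is the safer choice when a missing column should be treated as a mistake, while any_of() suits scripts that run against data frames with varying columns.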
Drop multiple columns by using column index
We can also remove columns with select() by their index (position). Indexing starts at 1.
Syntax:
select(dataframe, -c(column_index1, column_index2, ..., column_index_n))
Where dataframe is the input dataframe and c(column_indexes) holds the positions of the columns to be removed.
Example: R program to remove multiple columns by position
R
# load the library
library(dplyr)

# create dataframe with 3 columns: id, name and address
data1 <- data.frame(
  id = c(1, 2, 3, 4, 5, 6, 7, 1, 4, 2),
  name = c('sravan', 'ojaswi', 'bobby', 'gnanesh', 'rohith',
           'pinkey', 'dhanush', 'sravan', 'gnanesh', 'ojaswi'),
  address = c('hyd', 'hyd', 'ponnur', 'tenali', 'vijayawada',
              'vijayawada', 'guntur', 'hyd', 'tenali', 'hyd')
)

# remove the id and name columns by their positions (1 and 2)
print(select(data1, -c(1, 2)))
Output: only the address column is printed.
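Positions can also be given as a range, and the last column can be addressed without knowing how many columns there are. This is a small sketch under the assumption that the tidyselect helper last_col(), which dplyr re-exports, is available; the shortened data frame is illustrative.

```r
library(dplyr)

# Shortened version of the article's data frame (first three rows only)
data1 <- data.frame(
  id = c(1, 2, 3),
  name = c("sravan", "ojaswi", "bobby"),
  address = c("hyd", "hyd", "ponnur")
)

# Drop a range of positions: columns 1 through 2
without_first_two <- select(data1, -(1:2))
print(names(without_first_two))  # "address"

# Drop the last column, whatever its position is
without_last <- select(data1, -last_col())
print(names(without_last))       # "id" "name"
```

Dropping by position is brittle if the column order can change, so dropping by name is usually the more robust option.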
Drop columns whose names contain a substring or match a pattern
Let's see how to remove columns whose names contain a given character or string.
Method 1: Using contains()
contains() selects the columns whose names contain the given literal substring; negating it as -contains() removes those columns.
Syntax:
select(dataframe, -contains('sub_string'))
Here, dataframe is the input dataframe and the sub_string is the string present in the column name that will be removed.
Method 2: Using matches()
matches() selects the columns whose names match the given regular expression; negating it as -matches() removes those columns.
Syntax:
select(dataframe, -matches('pattern'))
Here, dataframe is the input dataframe and pattern is a regular expression; every column whose name matches it will be removed.
Example: R program that removes column using contains() method
R
# load the library
library(dplyr)

# create dataframe with 3 columns: id, name and address
data1 <- data.frame(
  id = c(1, 2, 3, 4, 5, 6, 7, 1, 4, 2),
  name = c('sravan', 'ojaswi', 'bobby', 'gnanesh', 'rohith',
           'pinkey', 'dhanush', 'sravan', 'gnanesh', 'ojaswi'),
  address = c('hyd', 'hyd', 'ponnur', 'tenali', 'vijayawada',
              'vijayawada', 'guntur', 'hyd', 'tenali', 'hyd')
)

# remove columns whose names contain "na"
print(select(data1, -contains('na')))

# remove columns whose names contain "re"
print(select(data1, -contains('re')))
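Output: the first call drops name and prints id and address; the second drops address and prints id and name.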
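The example above covers contains() only. As a complement, here is a minimal sketch of matches(), which treats its argument as a regular expression, next to contains(), which treats it literally. The pattern and the shortened data frame are illustrative choices, not taken from the original article.

```r
library(dplyr)

# Shortened version of the article's data frame (first three rows only)
data1 <- data.frame(
  id = c(1, 2, 3),
  name = c("sravan", "ojaswi", "bobby"),
  address = c("hyd", "hyd", "ponnur")
)

# matches() uses a regular expression: drop names starting with "id" or "na"
by_regex <- select(data1, -matches("^(id|na)"))
print(names(by_regex))    # "address"

# contains() is literal: no column name contains the text "^na",
# so nothing is dropped
by_literal <- select(data1, -contains("^na"))
print(names(by_literal))  # "id" "name" "address"
```

Both helpers ignore case by default; passing ignore.case = FALSE makes the comparison case-sensitive.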
Remove columns whose names start or end with certain characters
We can also remove columns based on the characters at the start or end of their names.
- starts_with() selects the columns whose names start with the given string, and -starts_with() removes them.
Syntax:
select(dataframe, -starts_with('substring'))
Where dataframe is the input dataframe and substring is the character or string that the column name starts with.
- ends_with() selects the columns whose names end with the given string, and -ends_with() removes them.
Syntax:
select(dataframe, -ends_with('substring'))
Where dataframe is the input dataframe and substring is the character or string that the column name ends with.
Example 1: R program to remove a column that starts with character/substring
R
# load the library
library(dplyr)

# create dataframe with 3 columns: id, name and address
data1 <- data.frame(
  id = c(1, 2, 3, 4, 5, 6, 7, 1, 4, 2),
  name = c('sravan', 'ojaswi', 'bobby', 'gnanesh', 'rohith',
           'pinkey', 'dhanush', 'sravan', 'gnanesh', 'ojaswi'),
  address = c('hyd', 'hyd', 'ponnur', 'tenali', 'vijayawada',
              'vijayawada', 'guntur', 'hyd', 'tenali', 'hyd')
)

# remove columns whose names start with "na"
print(select(data1, -starts_with('na')))

# remove columns whose names start with "ad"
print(select(data1, -starts_with('ad')))
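Output: the first call drops name and prints id and address; the second drops address and prints id and name.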
Example 2: R program to remove column that ends with character/substring
R
# load the library
library(dplyr)

# create dataframe with 3 columns: id, name and address
data1 <- data.frame(
  id = c(1, 2, 3, 4, 5, 6, 7, 1, 4, 2),
  name = c('sravan', 'ojaswi', 'bobby', 'gnanesh', 'rohith',
           'pinkey', 'dhanush', 'sravan', 'gnanesh', 'ojaswi'),
  address = c('hyd', 'hyd', 'ponnur', 'tenali', 'vijayawada',
              'vijayawada', 'guntur', 'hyd', 'tenali', 'hyd')
)

# remove columns whose names end with "d"
print(select(data1, -ends_with('d')))

# remove columns whose names end with "ss"
print(select(data1, -ends_with('ss')))
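Output: the first call drops id and prints name and address; the second drops address and prints id and name.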
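Several prefixes or suffixes can be handled in one call. The sketch below relies on two behaviours of the selection helpers: starts_with() and ends_with() accept a character vector and take the union of the matches, and select() accepts several negated helpers at once. The shortened data frame is illustrative.

```r
library(dplyr)

# Shortened version of the article's data frame (first three rows only)
data1 <- data.frame(
  id = c(1, 2, 3),
  name = c("sravan", "ojaswi", "bobby"),
  address = c("hyd", "hyd", "ponnur")
)

# Drop every column whose name starts with "na" or "ad"
by_prefixes <- select(data1, -starts_with(c("na", "ad")))
print(names(by_prefixes))  # "id"

# Combine helpers: drop names starting with "na" and names ending with "d"
combined <- select(data1, -starts_with("na"), -ends_with("d"))
print(names(combined))     # "address"
```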
Drop columns by name with a regular expression
Here we drop columns with base R instead of dplyr. The grepl() function tests each column name against a pattern and returns TRUE or FALSE; negating the result with ! and using it as a column index keeps only the columns that do not match.
Syntax:
dataframe[, !grepl("pattern", names(dataframe))]
Here, dataframe is the input dataframe and pattern is the expression to remove the column.
To remove the columns whose names start with a given letter, anchor the pattern with ^:
Syntax:
data[, !grepl("^letter", names(data))]
Example: R program to remove column that starts with a letter
R
# load the library
library(dplyr)

# create dataframe with 3 columns: id, name and address
data1 <- data.frame(
  id = c(1, 2, 3, 4, 5, 6, 7, 1, 4, 2),
  name = c('sravan', 'ojaswi', 'bobby', 'gnanesh', 'rohith',
           'pinkey', 'dhanush', 'sravan', 'gnanesh', 'ojaswi'),
  address = c('hyd', 'hyd', 'ponnur', 'tenali', 'vijayawada',
              'vijayawada', 'guntur', 'hyd', 'tenali', 'hyd')
)

# drop columns whose names start with n
print(data1[, !grepl("^n", names(data1))])

# drop columns whose names start with a
print(data1[, !grepl("^a", names(data1))])
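Output: the first call drops name and prints id and address; the second drops address and prints id and name.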
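One caveat with the bracket syntax: if only a single column survives, base R simplifies the result to a plain vector unless drop = FALSE is supplied. The sketch below illustrates this with a pattern that removes two of the three columns; the pattern, the variable names and the shortened data frame are illustrative.

```r
# Base R only; no packages are needed for this approach

# Shortened version of the article's data frame (first three rows only)
data1 <- data.frame(
  id = c(1, 2, 3),
  name = c("sravan", "ojaswi", "bobby"),
  address = c("hyd", "hyd", "ponnur")
)

# TRUE for the columns to keep: names that do not start with n or a
keep <- !grepl("^(n|a)", names(data1))

# Only "id" remains, so the default indexing returns a numeric vector
as_vector <- data1[, keep]
print(is.data.frame(as_vector))  # FALSE

# drop = FALSE keeps the result as a one-column data frame
as_frame <- data1[, keep, drop = FALSE]
print(is.data.frame(as_frame))   # TRUE
```

Adding drop = FALSE is a reasonable default whenever the number of remaining columns is not known in advance.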