How to Select Specific Columns in R dataframe?

  • Last Updated : 28 Nov, 2021

In this article, we will discuss how to select specific columns from a data frame in the R programming language.

Method 1: Selecting specific Columns Using Base R by column name

In this approach, the user passes a character vector of column names inside single square brackets after the name of the data frame. The result is a data frame containing only those columns, in the order given.
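As a minimal sketch of how name-based selection behaves, the snippet below uses a small hypothetical data frame `df` (not the article's `gfg`) to contrast single brackets, which keep the data frame structure, with `[[ ]]` and `$`, which extract the column itself.

```r
# Hypothetical data frame used only for this illustration
df <- data.frame(a = 1:3, b = c(10, 20, 30), d = c("x", "y", "z"))

# A character vector in single brackets returns a data frame,
# with the columns in the order requested
two_cols <- df[c("d", "a")]

# A single name in single brackets still returns a data frame
one_col_df <- df["b"]

# [[ ]] and $ extract the column itself as a vector
one_col_vec <- df[["b"]]

# Asking for a name that does not exist raises an error
missing_col <- tryCatch(df["not_a_column"], error = function(e) e)

print(names(two_cols))
print(class(one_col_df))
print(class(one_col_vec))
print(inherits(missing_col, "error"))
```

Single brackets are usually the safer choice when later code expects a data frame, even if only one column is selected.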

Syntax:

data_frame[c(column_name_1,column_name_2,...)]

Example:

R




# Creating DataFrame
gfg <- data.frame(a=c(5, 1, 1, 5, 6, 7, 5, 4, 7, 9),
                   b=c(1, 8, 6, 8, 6, 7, 4, 1, 7, 3),
                   c=c(7, 1, 8, 9, 4, 1, 5, 6, 3, 7),
                   d=c(4, 6, 8, 4, 6, 4, 8, 9, 8, 7),
                   e=c(3, 1, 6, 4, 8, 9, 7, 8, 9, 4))
  
# Selecting specific Columns Using Base
# R by column name
gfg[c('b', 'd', 'e')]


Output:

   b d e
1  1 4 3
2  8 6 1
3  6 8 6
4  8 4 4
5  6 6 8
6  7 4 9
7  4 8 7
8  1 9 8
9  7 8 9
10 3 7 4

Method 2: Selecting specific Columns Using Base R by column index

In this approach, the user passes a numeric vector of column positions inside single square brackets after the name of the data frame. Column indexing in R starts at 1, so index 2 refers to the second column.
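The following sketch, again on a hypothetical data frame `df`, illustrates a few variations on positional selection: a range of positions, a negative index to drop a column, and `match()` to look positions up from names so the code does not depend on column order.

```r
# Hypothetical data frame used only for this illustration
df <- data.frame(a = 1:3, b = 4:6, c = 7:9, d = 10:12)

# A range of positions: the second and third columns
by_range <- df[2:3]

# A negative index drops that column and keeps the rest
dropped <- df[-1]

# Look positions up from names, then select by position
pos <- match(c("b", "d"), names(df))
by_lookup <- df[pos]

print(names(by_range))
print(names(dropped))
print(pos)
```

Hard-coded positions break silently if columns are added or reordered, which is why looking them up from names can be worth the extra line.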

Syntax:

data_frame[c(column_index_1,column_index_2,...)]

Example:

R




# Creating DataFrame
gfg <- data.frame(a=c(5, 1, 1, 5, 6, 7, 5, 4, 7, 9),
                   b=c(1, 8, 6, 8, 6, 7, 4, 1, 7, 3),
                   c=c(7, 1, 8, 9, 4, 1, 5, 6, 3, 7),
                   d=c(4, 6, 8, 4, 6, 4, 8, 9, 8, 7), 
                   e=c(3, 1, 6, 4, 8, 9, 7, 8, 9, 4))
  
# Selecting specific Columns Using Base R 
# by column index
gfg[c(2, 4, 5)]


Output:

   b d e
1  1 4 3
2  8 6 1
3  6 8 6
4  8 4 4
5  6 6 8
6  7 4 9
7  4 8 7
8  1 9 8
9  7 8 9
10 3 7 4

Method 3: Selecting specific columns by subsetting data by column name

In this method, the data frame is subset with the two-argument form of the square brackets. The position before the comma is left empty so that all rows are kept, and a character vector of column names after the comma specifies the columns to extract.
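One detail of the comma form is worth a short sketch: when only one column is requested, base R simplifies the result to a vector unless `drop = FALSE` is supplied. The data frame `df` below is hypothetical and serves only to show this behaviour.

```r
# Hypothetical data frame used only for this illustration
df <- data.frame(a = 1:3, b = c(2.5, 3.5, 4.5))

# One column with the comma form is simplified to a plain vector
as_vector <- df[, "b"]

# drop = FALSE keeps the result as a one-column data frame
as_frame <- df[, "b", drop = FALSE]

# Two or more columns always come back as a data frame
multi <- df[, c("a", "b")]

print(class(as_vector))
print(class(as_frame))
print(class(multi))
```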

Syntax:

data_frame[,c(column_name_1,column_name_2,...)]

Example:

R




# Creating DataFrame
gfg <- data.frame(a=c(5, 1, 1, 5, 6, 7, 5, 4, 7, 9),
                   b=c(1, 8, 6, 8, 6, 7, 4, 1, 7, 3),
                   c=c(7, 1, 8, 9, 4, 1, 5, 6, 3, 7),
                   d=c(4, 6, 8, 4, 6, 4, 8, 9, 8, 7), 
                   e=c(3, 1, 6, 4, 8, 9, 7, 8, 9, 4))
  
# Selecting specific columns by subsetting 
# data by column name
gfg[, c('b', 'd', 'e')]


Output:

   b d e
1  1 4 3
2  8 6 1
3  6 8 6
4  8 4 4
5  6 6 8
6  7 4 9
7  4 8 7
8  1 9 8
9  7 8 9
10 3 7 4

Method 4: Selecting specific columns by subsetting data by column index

In this method, the data frame is again subset with the two-argument form of the square brackets. The position before the comma is left empty so that all rows are kept, and an integer vector of column positions after the comma specifies the columns to extract.
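Because the comma form has a slot for rows as well, rows and columns can be restricted in one step. The sketch below assumes a small hypothetical data frame `df` and filters rows with a logical condition while picking columns by position.

```r
# Hypothetical data frame used only for this illustration
df <- data.frame(a = c(5, 1, 7),
                 b = c(1, 8, 7),
                 c = c(7, 1, 3),
                 d = c(4, 6, 4))

# Rows where a is greater than 4, columns 2 and 4
filtered <- df[df$a > 4, c(2, 4)]

# First two rows, columns 2 through 4
head_part <- df[1:2, 2:4]

print(filtered)
print(dim(head_part))
```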

Syntax:

data_frame[,c(column_index_1,column_index_2,...)]

Example:

R




# Creating DataFrame
gfg <- data.frame(a=c(5, 1, 1, 5, 6, 7, 5, 4, 7, 9),
                   b=c(1, 8, 6, 8, 6, 7, 4, 1, 7, 3), 
                   c=c(7, 1, 8, 9, 4, 1, 5, 6, 3, 7),
                   d=c(4, 6, 8, 4, 6, 4, 8, 9, 8, 7), 
                   e=c(3, 1, 6, 4, 8, 9, 7, 8, 9, 4))
  
# Selecting specific columns by subsetting data
# by column index:
gfg[, c(2, 4, 5)]


Output:

   b d e
1  1 4 3
2  8 6 1
3  6 8 6
4  8 4 4
5  6 6 8
6  7 4 9
7  4 8 7
8  1 9 8
9  7 8 9
10 3 7 4

Method 5: Selecting specific columns by subsetting data with the select argument of the subset function

subset() function: This function returns the subset of a data frame that meets the given conditions. Its select argument chooses which columns to keep.

Syntax:

subset(x, subset, select, drop = FALSE, …)

Parameters:

  • x: object to be subsetted.
  • subset: logical expression indicating elements or rows to keep: missing values are taken as false.
  • select: expression, indicating columns to select from a data frame.
  • drop: passed on to [ indexing operator.
  • …: further arguments to be passed to or from other methods.
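To make the parameters above more concrete, here is a minimal sketch on a hypothetical data frame `df`. It shows that `select` also accepts unquoted column names, a range of names, and a negated name, and that it can be combined with the `subset` argument to filter rows at the same time.

```r
# Hypothetical data frame used only for this illustration
df <- data.frame(a = c(5, 1, 7),
                 b = c(1, 8, 7),
                 c = c(7, 1, 3),
                 d = c(4, 6, 4))

# Unquoted column names
by_bare <- subset(df, select = c(b, d))

# A range of adjacent columns, from b through d
by_range <- subset(df, select = b:d)

# Everything except column a
without_a <- subset(df, select = -a)

# Row condition and column selection together
rows_and_cols <- subset(df, subset = a > 4, select = c(b, d))

print(names(by_range))
print(rows_and_cols)
```

The unquoted forms are convenient at the console; when column names are held in a variable, the square-bracket methods are generally easier to reason about.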

Example:

R




# Creating DataFrame
gfg <- data.frame(a=c(5, 1, 1, 5, 6, 7, 5, 4, 7, 9),
                   b=c(1, 8, 6, 8, 6, 7, 4, 1, 7, 3), 
                   c=c(7, 1, 8, 9, 4, 1, 5, 6, 3, 7),
                   d=c(4, 6, 8, 4, 6, 4, 8, 9, 8, 7), 
                   e=c(3, 1, 6, 4, 8, 9, 7, 8, 9, 4))
  
# Selecting specific columns by Subsetting 
# Data with select Argument of subset Function
subset(gfg, select=c('b', 'd', 'e'))


Output:

   b d e
1  1 4 3
2  8 6 1
3  6 8 6
4  8 4 4
5  6 6 8
6  7 4 9
7  4 8 7
8  1 9 8
9  7 8 9
10 3 7 4

Method 6: Selecting specific columns using dplyr package by column name

In this approach, the user first needs to install and load the dplyr package, then call the select() function and pass the names of the required columns as its arguments.
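Beyond listing names one by one, select() works with selection helpers. The sketch below assumes dplyr is installed and uses a hypothetical data frame `df` with made-up column names to show `starts_with()`, `all_of()` for names stored in a character vector, and a minus sign to exclude a column.

```r
# Assumes the dplyr package is already installed
library(dplyr)

# Hypothetical data frame used only for this illustration
df <- data.frame(id = 1:3,
                 score_math = c(90, 80, 70),
                 score_art = c(60, 75, 88),
                 note = c("x", "y", "z"))

# Columns whose names begin with "score"
scores <- df %>% select(starts_with("score"))

# Column names held in a character vector
wanted <- c("id", "note")
by_vector <- df %>% select(all_of(wanted))

# Every column except note
no_note <- df %>% select(-note)

print(names(scores))
print(names(by_vector))
print(names(no_note))
```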

Syntax:

data_frame %>% select(column_name_1,column_name_2,...)   

Example:

R




# Importing dplyr library
library("dplyr")
  
# Creating DataFrame
gfg <- data.frame(a=c(5, 1, 1, 5, 6, 7, 5, 4, 7, 9),
                   b=c(1, 8, 6, 8, 6, 7, 4, 1, 7, 3),
                   c=c(7, 1, 8, 9, 4, 1, 5, 6, 3, 7),
                   d=c(4, 6, 8, 4, 6, 4, 8, 9, 8, 7),
                   e=c(3, 1, 6, 4, 8, 9, 7, 8, 9, 4))
  
# Selecting specific columns using dplyr 
# package by column name
gfg %>% select(b, d, e)


Output:

   b d e
1  1 4 3
2  8 6 1
3  6 8 6
4  8 4 4
5  6 6 8
6  7 4 9
7  4 8 7
8  1 9 8
9  7 8 9
10 3 7 4

Method 7: Selecting specific columns using dplyr package by column index

In this approach, the user first needs to install and load the dplyr package, then call the select() function and pass the positions of the required columns as its arguments. As in base R, positions start at 1.
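As a small sketch of positional selection with select(), the snippet below assumes dplyr is installed and uses a hypothetical data frame `df`. It shows individual positions, a range, the `last_col()` helper, and the fact that the order of the positions sets the order of the output columns.

```r
# Assumes the dplyr package is already installed
library(dplyr)

# Hypothetical data frame used only for this illustration
df <- data.frame(a = 1:3, b = 4:6, c = 7:9, d = 10:12)

# Individual positions
by_pos <- df %>% select(2, 4)

# A range of positions
by_range <- df %>% select(2:3)

# The last column, whatever its position happens to be
last_one <- df %>% select(last_col())

# Positions given out of order reorder the columns
reordered <- df %>% select(4, 1)

print(names(by_pos))
print(names(by_range))
print(names(last_one))
print(names(reordered))
```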

Syntax:

data_frame %>% select(column_index_1,column_index_2,...)  

Example:

R




# Importing dplyr library
library("dplyr")
  
# Creating DataFrame
gfg <- data.frame(a=c(5, 1, 1, 5, 6, 7, 5, 4, 7, 9),
                   b=c(1, 8, 6, 8, 6, 7, 4, 1, 7, 3), 
                   c=c(7, 1, 8, 9, 4, 1, 5, 6, 3, 7),
                   d=c(4, 6, 8, 4, 6, 4, 8, 9, 8, 7), 
                   e=c(3, 1, 6, 4, 8, 9, 7, 8, 9, 4))
  
# Selecting specific columns using dplyr 
# package by column index
gfg %>% select(2, 4, 5)


Output:

   b d e
1  1 4 3
2  8 6 1
3  6 8 6
4  8 4 4
5  6 6 8
6  7 4 9
7  4 8 7
8  1 9 8
9  7 8 9
10 3 7 4

