How to rename multiple columns in a PySpark DataFrame?

Last Updated : 04 Jul, 2021

This article shows how to rename multiple columns in a PySpark DataFrame.

Before starting, let's create a DataFrame using PySpark:


# importing module
import pyspark
from pyspark.sql.functions import col
# importing sparksession from pyspark.sql module
from pyspark.sql import SparkSession
# creating sparksession and giving an app name
spark = SparkSession.builder.appName('sparkdf').getOrCreate()
# list of student data
data = [["1", "sravan", "vignan"],
        ["2", "ojaswi", "vvit"],
        ["3", "rohith", "vvit"],
        ["4", "sridevi", "vignan"],
        ["1", "sravan", "vignan"],
        ["5", "gnanesh", "iit"]]
# specify column names
columns = ['student ID', 'student NAME', 'college']
# creating a dataframe from the lists of data
dataframe = spark.createDataFrame(data, columns)
print("Actual data in dataframe")
# show dataframe
dataframe.show()


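Method 1: Using withColumnRenamed()

Here we use withColumnRenamed() to rename an existing column. It returns a new DataFrame in which the given column has the new name; the original DataFrame is not modified.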

Syntax: withColumnRenamed(Existing_col, New_col)

Parameters:

  • Existing_col: Old column name.
  • New_col: New column name.

Example 1: Renaming a single column.


                            "College Name").show()


Example 2: Renaming multiple columns.


df2 = dataframe.withColumnRenamed("student ID",


Method 2: Using toDF()

This function returns a new DataFrame with the specified column names.

Syntax: toDF(*cols)

where cols are the new column names, one for each existing column, in the same order.

In this example, we create an ordered list of new column names and pass it to the toDF() function.


# new column names, in the same order as the existing columns
Data_list = ["College Id", "Name", "College"]
new_df = dataframe.toDF(*Data_list)
new_df.show()

