Cumulative Frequency Graph in R

• Last Updated : 28 Sep, 2022

In this article, we are going to plot a cumulative frequency graph using the R programming language.

Cumulative Frequency

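Cumulative frequency is the running total of class frequencies: the frequency of the first class interval is added to that of the second, this total is added to the frequency of the third, and so on. The cumulative frequency of a class is therefore the number of observations that fall in that class or in any class before it.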

Cumulative Frequency Graph

A graph that shows the cumulative frequency distribution of grouped data is called a cumulative frequency graph, or ogive. Plotting the data is one of the most effective ways to understand cumulative frequencies and draw conclusions from them, because the graph shows at a glance how many observations lie below a given value.

Functions Used

seq() Method

The seq() method generates a numeric vector of values running from a lower limit to an upper limit, with consecutive values separated by the step given in the “by” parameter.

Syntax: seq(from, to, by)

Parameters :

from – start of the sequence

to – end of the sequence

by – increment value of the sequence
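As a quick check of how the break points used later in this article are built, the following sketch generates them and counts the intervals they delimit. The variable name n_intervals is introduced here only for illustration.

```r
# Break points from 0 to 6 in steps of 1
break_points <- seq(0, 6, by = 1)
print(break_points)

# Seven break points delimit six intervals
n_intervals <- length(break_points) - 1
print(n_intervals)
```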

cut() Method

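The cut() method in R divides the range of a numeric vector into intervals and codes each value according to the interval in which it falls. The result is a factor whose levels are the intervals. By default the intervals are closed on the right, for example (0,1]; passing right=FALSE makes them closed on the left, for example [0,1).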

Syntax: cut(x, breaks)

Parameters :

x – The vector of data points.

breaks – The vector of break points.
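The effect of the interval boundaries can be seen in a minimal sketch. The values in x and breaks below are made up purely for illustration and are not the article's data.

```r
# Hypothetical values, used only to show how cut() labels each element
x <- c(0.5, 1, 1.9, 2)
breaks <- c(0, 1, 2, 3)

# Default: intervals are closed on the right, e.g. (0,1]
default_bins <- cut(x, breaks)
print(default_bins)

# right = FALSE: intervals are closed on the left, e.g. [0,1)
left_closed_bins <- cut(x, breaks, right = FALSE)
print(left_closed_bins)
```

Note that the value 1 lands in (0,1] with the default setting but in [1,2) with right = FALSE, so the choice changes the resulting frequencies for data that sit exactly on a break point.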

table(x) Method

The table() method counts how many times each value, or each level of a factor, occurs. Applied to the vector transformed by cut(), it gives the number of data points in each interval, that is, a frequency table. Intervals that contain no data points are still listed, with a frequency of 0.

Syntax: table(x)

Parameter :

x – The vector of values to be converted.
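A small sketch of this counting behaviour, using a hypothetical factor (the levels "a", "b" and "c" are invented for the example), shows that a level with no occurrences still appears in the table:

```r
# Hypothetical factor with three levels; "c" never occurs
bins <- factor(c("a", "b", "a", "a"), levels = c("a", "b", "c"))

# Count the occurrences of every level, including empty ones
freq_table <- table(bins)
print(freq_table)
```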

cumsum(x) Method

The cumulative frequencies can be generated using the cumsum() method for the specified vector. The cumulative frequency at the nth interval is the sum of the frequencies of the first n intervals, including the nth interval itself.

Syntax: cumsum(x)

Parameters :

x – A vector of data points.
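For instance, applying cumsum() to a plain vector of frequencies gives the running totals directly. The sketch below assumes the six frequencies 0, 4, 3, 3, 1, 2, which match the frequency table built later in this article.

```r
# Frequencies of six consecutive intervals
freq <- c(0, 4, 3, 3, 1, 2)

# Running total: each element is the sum of all elements up to and including it
cum_freq <- cumsum(freq)
print(cum_freq)
```

The last element equals the total number of observations, which is a convenient sanity check.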

plot() Method

The plot of cumulative frequencies can then be created using the plot() method in R. The method takes the break points as the x coordinates and the corresponding cumulative frequencies as the y coordinates.

Syntax: plot(x-coordinates, y-coordinates, xlab, ylab)

Parameters :

x-coordinates – The vector of x coordinates.

y-coordinates – The vector of y coordinates.

xlab – The labelling of x axis.

ylab – The labelling of y axis.
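A minimal sketch of such a call is shown below, with made-up coordinates. It also passes type = "o", an optional plot() argument not covered above, which draws the points and the connecting lines in a single call.

```r
# Hypothetical coordinates, used only to illustrate the call
x <- c(0, 1, 2, 3)
y <- c(0, 2, 5, 6)

# Points joined by lines, with both axes labelled
plot(x, y,
     xlab = "Upper class boundary",
     ylab = "Cumulative frequency",
     type = "o")
```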

Creating a frequency table

A frequency table shows how many data points fall in each interval. Here we store the data points in a variable “data_points” and create seven break points with the seq() method, which define six intervals of width 1. The data is then binned with cut() and counted with table().

```r
# declaring data points
data_points <- c(1, 2, 3, 5, 1, 1,
                 2, 4, 5, 1, 2, 3, 3)

# declaring the break points
break_points <- seq(0, 6, by = 1)

# transforming the data into left-closed intervals [a, b)
data_transform <- cut(data_points, break_points,
                      right = FALSE)

# creating the frequency table
freq_table <- table(data_transform)

# printing the frequency table
print("Frequency Table")
print(freq_table)
```

Output:

```
[1] "Frequency Table"
data_transform
[0,1) [1,2) [2,3) [3,4) [4,5) [5,6)
    0     4     3     3     1     2
```

Explanation :

The interval [1,2) includes 1 and excludes 2. It contains 4 data points, namely the four 1s in the vector. Similarly, the vector contains three 3s, so the frequency for [3,4) is 3. No data point lies in [0,1), so its frequency is 0.

Plotting the cumulative frequency graph

Continuing from the code above, we first build the frequency table, then compute the cumulative frequencies from it using the cumsum() method. Finally we plot the cumulative frequencies against the break points, labeling the x-axis as data points and the y-axis as cumulative frequency. The points are then connected using the lines() method.

```r
# declaring data points
data_points <- c(1, 2, 3, 5, 1, 1, 2,
                 4, 5, 1, 2, 3, 3)

# declaring the break points
break_points <- seq(0, 6, by = 1)

# transforming the data into left-closed intervals [a, b)
data_transform <- cut(data_points, break_points,
                      right = FALSE)

# creating the frequency table
freq_table <- table(data_transform)

# printing the frequency table
print("Frequency Table")
print(freq_table)

# calculating cumulative frequency;
# the leading 0 gives one value per break point
cumulative_freq <- c(0, cumsum(freq_table))
print("Cumulative Frequency")
print(cumulative_freq)

# plotting the data
plot(break_points, cumulative_freq,
     xlab = "Data Points",
     ylab = "Cumulative Frequency")

# creating line graph
lines(break_points, cumulative_freq)
```

Output:

```
[1] "Frequency Table"
data_transform
[0,1) [1,2) [2,3) [3,4) [4,5) [5,6)
    0     4     3     3     1     2
[1] "Cumulative Frequency"
      [0,1) [1,2) [2,3) [3,4) [4,5) [5,6)
    0     0     4     7    10    11    13
```

Explanation :

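A 0 is prepended to the cumulative frequencies so that there is one value for each of the seven break points, which makes the graph start at 0 at the lower boundary of the first interval. Every other value is the sum of the frequencies of all intervals up to that break point. The value for [5,6), which is 13, is therefore the sum of all the frequencies and equals the total number of data points.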
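The steps above can also be collected into a small reusable function. The sketch below is one possible arrangement, not part of base R: the function name ogive_points and the column names boundary and cumulative_freq are hypothetical and chosen only for this example.

```r
# Hypothetical helper: returns the points of the ogive as a data frame
ogive_points <- function(data_points, break_points) {
  # Bin the data into left-closed intervals [a, b)
  bins <- cut(data_points, break_points, right = FALSE)
  # Prepend 0 so there is one cumulative value per break point
  cumulative_freq <- c(0, cumsum(as.vector(table(bins))))
  data.frame(boundary = break_points,
             cumulative_freq = cumulative_freq)
}

data_points <- c(1, 2, 3, 5, 1, 1, 2, 4, 5, 1, 2, 3, 3)
points_df <- ogive_points(data_points, seq(0, 6, by = 1))
print(points_df)
```

Passing points_df$boundary and points_df$cumulative_freq to plot() and lines() should give the same graph as the code above.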
