R – Pareto Chart
A Pareto chart combines a bar chart and a line chart in a single visualization.
In a Pareto chart, the left vertical axis shows frequency and the right vertical axis shows cumulative frequency. The chart is based on the Pareto principle, which says that roughly 80% of effects come from 20% of causes.
The bars show how often the event occurs in each category, sorted in decreasing order from left to right, and the overlaid line shows the cumulative percentage of occurrences.
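To make the two layers concrete, here is a minimal base R sketch of the computation behind such a chart: sort the counts in decreasing order, then take the running total as a percentage. The category names and counts are hypothetical, invented only for illustration, and the plotting calls are one possible way to draw the result, not what pareto.chart() does internally.

```r
# Hypothetical complaint counts, invented for illustration only
counts <- c(Late = 12, Cold = 30, Rude = 5, Wrong = 53)

# Sort categories from most to least frequent (left-to-right bar order)
sorted <- sort(counts, decreasing = TRUE)

# Cumulative percentage of the total, one value per bar
cum_pct <- cumsum(sorted) / sum(sorted) * 100

# Bars on the left (frequency) axis; barplot() returns the bar midpoints
mids <- barplot(sorted, ylab = "Frequency", ylim = c(0, sum(sorted)))

# Overlay the cumulative line, rescaled onto the frequency scale
lines(mids, cum_pct / 100 * sum(sorted), type = "b")

# Right axis labelled in percent
axis(4, at = seq(0, 1, by = 0.25) * sum(sorted),
     labels = seq(0, 100, by = 25))
```

With these made-up counts the total is 100, so the cumulative percentages are simply the running sums of the sorted counts.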
pareto.chart(x, ylab = "Frequency", ylab2 = "Cumulative Percentage", xlab, cumperc = seq(0, 100, by = 25), ylim, main, col = heat.colors(length(x)))
x: a vector of values. names(x) are used for labelling the bars.
ylab: a string specifying the label for the y-axis.
ylab2: a string specifying the label for the second y-axis on the right side.
xlab: a string specifying the label for the x-axis.
cumperc: a vector of percentage values to be used as tickmarks for the second y-axis on the right side.
ylim: a numeric vector specifying the limits for the y-axis.
main: a string specifying the main title to appear on the plot.
col: a value for the color, a vector of colors, or a palette for the bars. See the help for colors and palette.
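As a rough illustration of how these parameters fit together, the sketch below passes most of them explicitly. It assumes pareto.chart() is loaded from the qcc package and that the installed version accepts the arguments listed above; argument handling can differ between qcc versions. The data and label strings are hypothetical.

```r
# Hypothetical counts, for illustration only
x <- c(A = 45, B = 30, C = 15, D = 10)

# Draw the chart only if the qcc package is installed
if (requireNamespace("qcc", quietly = TRUE)) {
  qcc::pareto.chart(
    x,
    ylab    = "Number of complaints",   # left axis label
    ylab2   = "Cumulative share (%)",   # right axis label
    xlab    = "Complaint type",         # x-axis label
    cumperc = seq(0, 100, by = 20),     # right-axis ticks every 20%
    main    = "Complaints by type",     # plot title
    col     = heat.colors(length(x))    # one colour per bar
  )
}
```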
Plotting Pareto Chart
The steps for plotting a Pareto chart are:
- A vector (defect <- c(Values…)) is created to hold the counts for the different categories.
- The names of that vector (names(defect) <- c(Values…)) are set to strings giving the category names.
- The vector “defect” is plotted using pareto.chart().
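The steps above can be sketched as follows. The counts and category names here are hypothetical stand-ins, not the values used for the article's chart, and the example assumes pareto.chart() comes from the qcc package.

```r
# Step 1: hypothetical counts per category (illustrative values only)
defect <- c(27, 40, 8, 19, 6)

# Step 2: category names, in the same order as the counts
names(defect) <- c("Scratch", "Dent", "Crack", "Stain", "Other")

# Step 3: plot the named vector; skipped if qcc is not installed
if (requireNamespace("qcc", quietly = TRUE)) {
  qcc::pareto.chart(defect,
                    ylab = "Frequency",
                    main = "Defects by category")
}
```

The input vector does not need to be sorted beforehand, since the chart shows the bars in decreasing order of frequency.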
In the chart here, the orange Pareto line shows that (789 + 621) / 1722, or approximately 80% of the complaints, come from 2 out of 10 (20%) of the complaint types: Overpriced and Small portions.
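The arithmetic behind that statement can be checked directly in R. Only the three numbers quoted above are used; the variable names are illustrative.

```r
# The two largest complaint counts quoted in the text
top_two <- 789 + 621

# Total number of complaints across all 10 types
total <- 1722

# Share of complaints covered by the top two types, in percent
share <- top_two / total * 100
round(share, 1)   # about 81.9
```

The exact share is slightly above 80%, which is why the text describes it as approximate.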