Skip to content
Related Articles

Related Articles

Improve Article
Save Article
Like Article

Python – Get word frequency in percentage

  • Difficulty Level : Easy
  • Last Updated : 30 Jun, 2021

Given a list of strings, the task is to write a Python program to get a percentage share of each word in the strings list.

Computational Explanation: (Occurrence of X word / Total words) * 100.

 Attention geek! Strengthen your foundations with the Python Programming Foundation Course and learn the basics.  

To begin with, your interview preparations Enhance your Data Structures concepts with the Python DS Course. And to begin with your Machine Learning Journey, join the Machine Learning - Basic Level Course

Example:



Input : test_list = [“Gfg is best for geeks”, “All love Gfg”, “Gfg is best for CS”, “For CS geeks Gfg is best”]

Output : {‘Gfg’: 0.21052631578947367, ‘is’: 0.15789473684210525, ‘best’: 0.15789473684210525, ‘for’: 0.10526315789473684, ‘geeks’: 0.10526315789473684, ‘All’: 0.05263157894736842, ‘love’: 0.05263157894736842, ‘CS’: 0.10526315789473684, ‘For’: 0.05263157894736842}

Explanation : Frequency percentage of each word wrt. all other words in list is computed. Gfg occurs 4 times. Total words = 19. 

Input : test_list = [“Gfg is best for geeks”, “All love Gfg”]

Output : {‘Gfg’: 0.25, ‘is’: 0.125, ‘best’: 0.125, ‘for’: 0.125, ‘geeks’: 0.125, ‘All’: 0.125, ‘love’: 0.125}

Explanation : Frequency percentage of each word wrt. all other words in list is computed.

Method #1: Using sum() + Counter()+ join() + split()

In this, we perform the task of getting each word using split() after joining each string using join(). Counter() gets the frequency of each word mapped. Post that all words size computed using sum(), can get the required share of each word, harnessing frequency stored in Counter.



Python3




# Python3 code to demonstrate working of
# Each word frequency percentage
# Using sum() + Counter()+ join() + split()
from collections import Counter
  
# initializing list
test_list = ["Gfg is best for geeks",
             "All love Gfg"
             "Gfg is best for CS",
             "For CS geeks Gfg is best"]
               
# printing original list
print("The original list is : " + str(test_list))
  
# concatenating using join 
joined = " ".join(ele for ele in test_list)
  
# mapping using Counter()
mappd = Counter(joined.split())
  
# getting total using sum 
total_val = sum(mappd.values())
  
# getting share of each word
res = {key: val / total_val for key,
       val in mappd.items()}
  
# printing result
print("Percentage share of each word : " + str(res))


Output:

The original list is : [‘Gfg is best for geeks’, ‘All love Gfg’, ‘Gfg is best for CS’, ‘For CS geeks Gfg is best’]

Percentage share of each word : {‘Gfg’: 0.21052631578947367, ‘is’: 0.15789473684210525, ‘best’: 0.15789473684210525, ‘for’: 0.10526315789473684, ‘geeks’: 0.10526315789473684, ‘All’: 0.05263157894736842, ‘love’: 0.05263157894736842, ‘CS’: 0.10526315789473684, ‘For’: 0.05263157894736842}

Method #2: Using combined one-liner 

Similar to the above method, just combining each segment to provide a compact one liner solution.

Python3




# Python3 code to demonstrate working of
# Each word frequency percentage
# Using combined one-liner 
from collections import Counter
  
# initializing list
test_list = ["Gfg is best for geeks", "All love Gfg"
            "Gfg is best for CS", "For CS geeks Gfg is best"]
               
# printing original list
print("The original list is : " + str(test_list))
  
# mapping using Counter()
mappd = Counter(" ".join(ele for ele in test_list).split())
  
# getting share of each word
res = {key: val / sum(mappd.values()) for key,
       val in mappd.items()}
  
# printing result
print("Percentage share of each word : " + str(res))


Output:

The original list is : [‘Gfg is best for geeks’, ‘All love Gfg’, ‘Gfg is best for CS’, ‘For CS geeks Gfg is best’]

Percentage share of each word : {‘Gfg’: 0.21052631578947367, ‘is’: 0.15789473684210525, ‘best’: 0.15789473684210525, ‘for’: 0.10526315789473684, ‘geeks’: 0.10526315789473684, ‘All’: 0.05263157894736842, ‘love’: 0.05263157894736842, ‘CS’: 0.10526315789473684, ‘For’: 0.05263157894736842}7894736842}




My Personal Notes arrow_drop_up
Recommended Articles
Page :

Start Your Coding Journey Now!