# Python – Get word frequency in percentage

• Difficulty Level : Easy
• Last Updated : 22 Jul, 2022

Given a list of strings, the task is to write a Python program to get a percentage share of each word in the strings list.

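Computational Explanation: (Occurrences of a word / Total number of words) * 100. Note that the programs in this article report the share as a fraction between 0 and 1, that is, without the final multiplication by 100.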

Example:

Input : test_list = ["Gfg is best for geeks", "All love Gfg", "Gfg is best for CS", "For CS geeks Gfg is best"]

Output : {'Gfg': 0.21052631578947367, 'is': 0.15789473684210525, 'best': 0.15789473684210525, 'for': 0.10526315789473684, 'geeks': 0.10526315789473684, 'All': 0.05263157894736842, 'love': 0.05263157894736842, 'CS': 0.10526315789473684, 'For': 0.05263157894736842}

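Explanation : The share of each word relative to all words in the list is computed. For example, Gfg occurs 4 times and there are 19 words in total, so its share is 4/19 ≈ 0.2105. Note that 'for' and 'For' are counted as different words.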

Input : test_list = ["Gfg is best for geeks", "All love Gfg"]

Output : {'Gfg': 0.25, 'is': 0.125, 'best': 0.125, 'for': 0.125, 'geeks': 0.125, 'All': 0.125, 'love': 0.125}

Explanation : The share of each word relative to all words in the list is computed. Here Gfg occurs 2 times out of 8 words, giving 2/8 = 0.25.
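The figures in the first example can be checked by hand. The sketch below is only an illustration of the formula, not one of the article's methods; the variable names (`words`, `gfg_count`, `share`, `percentage`) are chosen here for readability. It counts the words, takes the share of Gfg, and then multiplies by 100 to get the value on a 0 to 100 scale.

```python
# Worked check of the formula for the first example.
test_list = ["Gfg is best for geeks", "All love Gfg",
             "Gfg is best for CS", "For CS geeks Gfg is best"]

# Flatten the sentences into one list of words.
words = [word for sentence in test_list for word in sentence.split()]

total_words = len(words)         # 19 words in total
gfg_count = words.count("Gfg")   # "Gfg" appears 4 times

share = gfg_count / total_words  # fraction of all words, 4 / 19
percentage = share * 100         # same value on a 0-100 scale

print(total_words, gfg_count)    # 19 4
print(round(percentage, 2))      # 21.05
```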

Method #1: Using sum() + Counter() + join() + split()

In this method, the strings are first joined into a single string using join(), which is then broken into individual words using split(). Counter() maps each word to its frequency. The total number of words is obtained by applying sum() to the Counter values, and dividing each frequency by this total gives the required share of each word.

## Python3

```python
# Python3 code to demonstrate working of
# Each word frequency percentage
# Using sum() + Counter() + join() + split()
from collections import Counter

# initializing list
test_list = ["Gfg is best for geeks",
             "All love Gfg",
             "Gfg is best for CS",
             "For CS geeks Gfg is best"]

# printing original list
print("The original list is : " + str(test_list))

# concatenating using join
joined = " ".join(test_list)

# mapping using Counter()
mappd = Counter(joined.split())

# getting total using sum
total_val = sum(mappd.values())

# getting share of each word
res = {key: val / total_val for key, val in mappd.items()}

# printing result
print("Percentage share of each word : " + str(res))
```

Output:

```
The original list is : ['Gfg is best for geeks', 'All love Gfg', 'Gfg is best for CS', 'For CS geeks Gfg is best']
Percentage share of each word : {'Gfg': 0.21052631578947367, 'is': 0.15789473684210525, 'best': 0.15789473684210525, 'for': 0.10526315789473684, 'geeks': 0.10526315789473684, 'All': 0.05263157894736842, 'love': 0.05263157894736842, 'CS': 0.10526315789473684, 'For': 0.05263157894736842}
```
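The same steps can also be wrapped in a function. The following is a minimal sketch under some assumptions: `word_shares` is a hypothetical helper name that does not appear in the article, and it counts words straight from each string with `itertools.chain.from_iterable` rather than building a joined string first. It also returns an empty dictionary for input with no words, which the article's code does not handle.

```python
from collections import Counter
from itertools import chain


def word_shares(strings):
    """Return {word: share} for all whitespace-separated words in strings.

    Hypothetical helper for illustration; not part of the article's code.
    """
    # Count words directly from each string; no joined string is built.
    counts = Counter(chain.from_iterable(s.split() for s in strings))
    total = sum(counts.values())
    if total == 0:
        # No words at all: return an empty mapping instead of dividing by zero.
        return {}
    return {word: count / total for word, count in counts.items()}


test_list = ["Gfg is best for geeks", "All love Gfg"]
res = word_shares(test_list)
print(res)
```

For this two-string input the result should agree with the second example at the top of the article, where Gfg has a share of 0.25 and every other word 0.125.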

Method #2: Using combined one-liner

This is similar to the above method, but the individual steps are combined to give a compact one-liner solution.

## Python3

```python
# Python3 code to demonstrate working of
# Each word frequency percentage
# Using combined one-liner
from collections import Counter

# initializing list
test_list = ["Gfg is best for geeks", "All love Gfg",
             "Gfg is best for CS", "For CS geeks Gfg is best"]

# printing original list
print("The original list is : " + str(test_list))

# mapping using Counter()
mappd = Counter(" ".join(test_list).split())

# getting share of each word
res = {key: val / sum(mappd.values()) for key, val in mappd.items()}

# printing result
print("Percentage share of each word : " + str(res))
```

Output:

```
The original list is : ['Gfg is best for geeks', 'All love Gfg', 'Gfg is best for CS', 'For CS geeks Gfg is best']
Percentage share of each word : {'Gfg': 0.21052631578947367, 'is': 0.15789473684210525, 'best': 0.15789473684210525, 'for': 0.10526315789473684, 'geeks': 0.10526315789473684, 'All': 0.05263157894736842, 'love': 0.05263157894736842, 'CS': 0.10526315789473684, 'For': 0.05263157894736842}
```

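The time and space complexity of the two methods above is the same:

Time Complexity: O(n), where n is the total number of words. Strictly speaking, Method #2 re-evaluates sum(mappd.values()) for every distinct word, which adds O(u²) work for u distinct words.

Auxiliary Space: O(n)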
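One way to keep the compact style while making sure the total is computed only once is to take the length of the word list before building the dictionary. This is a sketch rather than one of the article's methods; the names `words` and `total` are illustrative.

```python
from collections import Counter

test_list = ["Gfg is best for geeks", "All love Gfg",
             "Gfg is best for CS", "For CS geeks Gfg is best"]

# Split once and keep the word list, so the total is simply its length.
words = " ".join(test_list).split()
total = len(words)  # computed once, outside the comprehension

# Each count is divided by the precomputed total.
res = {word: count / total for word, count in Counter(words).items()}

print(res["Gfg"])  # share of "Gfg", i.e. 4 / 19
```

Since the total is a plain integer here, the comprehension does constant work per distinct word.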

Method #3: Using join(), split() and count()

First, join all the elements of the list with a space, then split the resulting string on whitespace, which yields a list of words. Now iterate over this list and check whether each word is already present in the dictionary keys. If it is not, add the word as a key, with its number of occurrences (obtained using count()) divided by the length of the word list as the value, which is the frequency share of that word.

## Python3

```python
# Python3 code to demonstrate working of
# Each word frequency percentage
# Using count() and split()

# initializing list
test_list = ["Gfg is best for geeks",
             "All love Gfg",
             "Gfg is best for CS",
             "For CS geeks Gfg is best"]

# printing original list
print("The original list is : " + str(test_list))

# concatenating using join
joined = " ".join(test_list)

# splitting into individual words
p = joined.split()

# storing the share of each distinct word
d = dict()
for i in p:
    if i not in d:
        d[i] = p.count(i) / len(p)

# printing result
print("Percentage share of each word : " + str(d))
```

Output:

```
The original list is : ['Gfg is best for geeks', 'All love Gfg', 'Gfg is best for CS', 'For CS geeks Gfg is best']
Percentage share of each word : {'Gfg': 0.21052631578947367, 'is': 0.15789473684210525, 'best': 0.15789473684210525, 'for': 0.10526315789473684, 'geeks': 0.10526315789473684, 'All': 0.05263157894736842, 'love': 0.05263157894736842, 'CS': 0.10526315789473684, 'For': 0.05263157894736842}
```
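Because count() rescans the whole word list for every distinct word, Method #3 takes roughly O(n · u) time for n words and u distinct words. If that matters, a plain dictionary can be filled in a single pass instead. The sketch below is an alternative to the article's code, not a restatement of it, and the names `counts`, `total` and `shares` are illustrative.

```python
test_list = ["Gfg is best for geeks", "All love Gfg",
             "Gfg is best for CS", "For CS geeks Gfg is best"]

counts = {}  # word -> number of occurrences
total = 0    # running total of all words seen

# Single pass over every word: update its count and the grand total.
for sentence in test_list:
    for word in sentence.split():
        counts[word] = counts.get(word, 0) + 1
        total += 1

# Convert raw counts into shares of the total.
shares = {word: count / total for word, count in counts.items()}

print("Percentage share of each word : " + str(shares))
```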
