# Ambiguity in Context free Grammar and Context free Languages

Before reading this article, we recommend that you first read about Pushdown Automata and Context Free Languages.

Suppose we have a context free grammar G with production rules: S -> aSb | bSa | SS | ε, where ε denotes the empty string.

**Left Most Derivation (LMD) and Derivation Tree :** A leftmost derivation of a string from the start symbol S replaces, at every step, the leftmost non-terminal symbol with the RHS of one of its production rules. For example, a leftmost derivation of the string abab from the grammar G above is:

__S__ => a__S__b => ab__S__ab => abab

The symbols underlined are replaced using production rules.

Derivation Tree : A derivation tree (parse tree) shows how a string is derived from S using the production rules. The tree for the derivation above is shown in Figure 1.

**Right Most Derivation (RMD) :** A rightmost derivation of a string from the start symbol S replaces, at every step, the rightmost non-terminal symbol with the RHS of one of its production rules. For example, a rightmost derivation of the string abab from the grammar G above is:

__S__ => S__S__ => Sa__S__b => __S__ab => a__S__bab => abab

The symbols underlined are replaced using production rules. The derivation tree for abab obtained from this rightmost derivation is shown in Figure 2.

A derivation can be an LMD, an RMD, both, or neither. For example, __S__ => a__S__b => ab__S__ab => abab is an LMD as well as an RMD, because every intermediate form contains only one non-terminal, whereas __S__ => S__S__ => Sa__S__b => __S__ab => a__S__bab => abab is an RMD but not an LMD.
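The classification above can be checked mechanically. The following is a minimal sketch, not a general parser: it assumes single-character symbols, writes the empty string as `""`, and the helper names `step_positions` and `classify` are our own.

```python
# Grammar G from the article; "" stands for the empty string.
RULES = {"S": ["aSb", "bSa", "SS", ""]}

def step_positions(before, after):
    """Positions in `before` whose non-terminal, rewritten by one rule, gives `after`."""
    positions = set()
    for i, symbol in enumerate(before):
        for rhs in RULES.get(symbol, []):
            if before[:i] + rhs + before[i + 1:] == after:
                positions.add(i)
    return positions

def classify(forms):
    """Return (is_leftmost, is_rightmost) for a derivation given as a list of sentential forms."""
    leftmost = rightmost = True
    for before, after in zip(forms, forms[1:]):
        positions = step_positions(before, after)
        if not positions:
            raise ValueError(f"{before} => {after} is not a valid step")
        nonterminals = [i for i, s in enumerate(before) if s in RULES]
        leftmost = leftmost and nonterminals[0] in positions
        rightmost = rightmost and nonterminals[-1] in positions
    return leftmost, rightmost

print(classify(["S", "aSb", "abSab", "abab"]))              # (True, True)
print(classify(["S", "SS", "SaSb", "Sab", "aSbab", "abab"]))  # (False, True)
```

A step counts as leftmost (or rightmost) if rewriting the leftmost (or rightmost) non-terminal is one possible way to produce the next form, which is why the helper returns a set of positions rather than a single index.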

**Ambiguous Context Free Grammar :** A context free grammar is called ambiguous if some string generated by the grammar has more than one LMD or, equivalently, more than one RMD. Such a string also has more than one derivation tree. The grammar described above is ambiguous because the string abab has two derivation trees (Figure 1 and Figure 2). Correspondingly, abab has more than one RMD:

__S__ => S__S__ => Sa__S__b => __S__ab => a__S__bab => abab

__S__ => a__S__b => ab__S__ab => abab

**Ambiguous Context Free Languages :** A context free language is called ambiguous if no unambiguous grammar defines it, that is, every grammar that generates it is ambiguous. Such a language is also called an inherently ambiguous context free language.

e.g., L = {a^{n}b^{n}c^{m}} ∪ {a^{n}b^{m}c^{m}}
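To see informally where the trouble in this language comes from, the sketch below reports which of the two parts of L contains a given string. It assumes n, m ≥ 0 (the article does not state the bounds), and the function name `components` is hypothetical. Strings of the form a^n b^n c^n belong to both parts, and these are the strings for which a grammar for L ends up with two derivation trees.

```python
import re

def components(word):
    """Return (in_first, in_second): membership in a^n b^n c^m and in a^n b^m c^m.

    Assumption: n, m >= 0, so the empty string belongs to both parts.
    """
    match = re.fullmatch(r"(a*)(b*)(c*)", word)
    if match is None:          # not of the shape a...ab...bc...c
        return (False, False)
    a, b, c = (len(group) for group in match.groups())
    return (a == b, b == c)

print(components("aabbc"))   # (True, False)
print(components("abbcc"))   # (False, True)
print(components("aabbcc"))  # (True, True)
```

This only tests membership; it does not prove inherent ambiguity, which requires a separate argument about every grammar for L.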

**Note :**

- If a context free grammar G is ambiguous, the language L(G) generated by it may or may not be inherently ambiguous.
- It is not always possible to convert an ambiguous CFG to an unambiguous CFG. Only some ambiguous CFGs can be converted.
- There is no general algorithm to convert an ambiguous CFG to an unambiguous CFG.
- There always exists an unambiguous CFG for an unambiguous CFL.
- Deterministic CFLs are always unambiguous and can be parsed by LR parsers.

**Question :** Consider the following statements about the context free grammar

G = {S -> SS, S -> ab, S -> ba, S -> ε}

I. G is ambiguous

II. G produces all strings with equal number of a’s and b’s

III. G can be accepted by a deterministic PDA

Which combination below expresses all the true statements about G?

A. I only

B. I and III only

C. II and III only

D. I, II and III

**Solution :** The string abab has more than one LMD, for example:

__S__ => __S__S => __S__SS => ab__S__S => abab__S__ => abab

__S__ => __S__S => ab__S__ => abab

Since abab has two different LMDs, the grammar is ambiguous. So statement I is true.
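Leftmost derivations like these can also be found by a bounded search. The sketch below is an illustration under our own assumptions: the function name `leftmost_derivations` and the step limit `max_steps` are not part of the question, and the limit is needed because the rules S -> SS and S -> ε allow arbitrarily long derivations of the same string.

```python
# Right-hand sides for S in the question's grammar; "" stands for the empty string.
RULES = ["SS", "ab", "ba", ""]

def leftmost_derivations(target, max_steps):
    """Return all leftmost derivations of `target` using at most `max_steps` rule applications."""
    found = []

    def extend(forms):
        form = forms[-1]
        i = form.find("S")                 # position of the leftmost non-terminal
        if i == -1:                        # only terminals are left
            if form == target:
                found.append(forms)
            return
        # Prune: terminals to the left of the leftmost S can never change.
        if not target.startswith(form[:i]):
            return
        # Prune: terminals are never removed, so too many of them is a dead end.
        if len(form.replace("S", "")) > len(target):
            return
        if len(forms) - 1 == max_steps:    # step budget used up
            return
        for rhs in RULES:
            extend(forms + [form[:i] + rhs + form[i + 1:]])

    extend(["S"])
    return found

for derivation in leftmost_derivations("abab", 5):
    print(" => ".join(derivation))
```

Finding two or more distinct leftmost derivations of one string is enough to conclude that the grammar is ambiguous; finding only one within the limit proves nothing, since the search is bounded.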

Statement II claims that G produces all strings with an equal number of a’s and b’s, but G cannot generate the string aabb, which has two a’s and two b’s. So statement II is false.

Statement III is also true: the language generated by G is (ab + ba)*, which is regular, so it can be accepted by a deterministic PDA. So the correct option is (B).
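The reasoning for statements II and III can be sanity-checked on short strings. This is a minimal sketch under stated assumptions: the function `generated`, the length bound 6, and the regular expression `(ab|ba)*` used as a description of L(G) are ours, not part of the question.

```python
import re
from itertools import product

def generated(max_len):
    """Strings of length <= max_len derivable from S -> SS | ab | ba | empty string."""
    words = set()
    while True:
        # One round of the rules: the three terminal right-hand sides, plus S -> SS.
        new = {"", "ab", "ba"} | {
            u + v for u in words for v in words if len(u + v) <= max_len
        }
        if new == words:       # fixpoint reached, nothing more can be derived
            return words
        words = new

words = generated(6)
pattern = re.compile(r"(ab|ba)*")
all_strings = ("".join(p) for n in range(7) for p in product("ab", repeat=n))

# Up to length 6, the grammar and the regular expression describe the same strings.
assert words == {w for w in all_strings if pattern.fullmatch(w)}
print("aabb" in words)  # False
```

Agreement up to a fixed length is evidence rather than proof, but here the equality L(G) = (ab + ba)* also follows directly from the rules: S derives any concatenation of the blocks ab and ba, and nothing else.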

**Question :** Which one of the following statements is FALSE?

A. There exist context-free languages such that all the context-free grammars generating them are ambiguous.

B. An unambiguous context free grammar always has a unique parse tree for each string of the language generated by it.

C. Both deterministic and non-deterministic pushdown automata always accept the same set of languages.

D. A finite set of strings from one alphabet is always a regular language.

**Solution :** (A) is true because, for an inherently ambiguous CFL, every CFG that generates it is ambiguous.

(B) is also true, as an unambiguous CFG has a unique parse tree for each string of the language generated by it.

(C) is false, as some languages are accepted by a non-deterministic PDA but not by any deterministic PDA.

(D) is also true, as a finite set of strings is always regular.

So the answer is option (C).

Ambiguity is a common feature of natural languages, where it is tolerated and dealt with in a variety of ways. In programming languages, where there should be only one interpretation of each statement, ambiguity must be removed when possible. Often we can achieve this by rewriting the grammar in an equivalent, unambiguous form.

This article has been contributed by **Sonal Tuteja**.
