# Star Height of Regular Expression and Regular Language

The **star height** relates to the field of theoretical of computation (TOC). It is used to indicate the structural complexity of regular expressions and regular languages. Here complexity, relates to the maximum nesting depth of Kleene stars present in a regular expression.

It may be noted here that a regular language may be represented by regular expressions that are not unique, yet equivalent. These regular expressions may have different star heights depending on their structural complexity(i.e nesting). But **star height of a regular language is a unique number and is equal to the least star height of any regular expression representing that language.** In this context **generalized star height** is an appropriate terminology, that defines the minimum nesting depth of Kleene stars to describe the language by means of a generalized regular expression. For example:

The language “aba” over the set of alphabets {a, b} can be generated using regular expressions,

(a + b)* ...... (1) Star height = 1 (a* b*)* ...... (2) Star height = 2

But we consider the least star height. Therefore the star height of the regular language “aba” is one.

**Star height is also defined for regular expressions as the maximum nesting depth of Kleene stars appearing in that expression.** In order to state star height, “h” of a regular expression formally, one can write as,

h() = 0, where is the empty set

h() = 0, where is the empty string

h(t) = 0, where t may be any terminal symbol of an alphabet set

h(EF) = max(h(E), h(F)), where E, F denotes regular expressions

h(E*) = h(E) + 1

Some examples are:

- h(a*(b a*)*) = 2
- h((a b*) + (a* a b*)*b)*) = 3
- h(a) = 0

Prof. Eggan tried to give a relationship between the loop complexity of an automaton that accepts a language L, and the star height of the language, L.

**The star height of a language, L is equal to the minimal loop complexity of an automaton that accepts, L.** It can also be stated that the loop complexity of an automata that is both **accessible (i.e automata constructed by deleting all non-accessible states and any transitions to or from them)** and **co-accessible (i.e a state q of an automaton is said to be co-accessible if there is a string s that takes us from q to a marked state)** is the minimum of star heights of regular expressions obtained by different possible executions of the state elimination method (or BMC algorithm).

It is apparent that a regular language of star height zero can represent only a finite number of regular languages. The generalized star height considers that taking complement of a regular expression would not lead to an increase in star height. This consideration yields interesting results at its disposal.

For example – consider the set of alphabets {x, y}. The star height of the regular expression for the regular language “all strings beginning and ending with x”, i.e

h(x(x + y)*x+x) = 1, since only one level of Kleene nesting exists

But the same language can also be represented by the regular expression x ^c x + x, because ^c denotes set of all strings over the input alphabets.

Now, h(x ^c x + x) = 0, as no Kleene nesting present

**Therefore the generalized star height of the language is 0 even though its star height is 1.**