# Compilation of basic results related to regular language

* Results related to Pumping Lemma

Theorem.  A language $L$ accepted by a DFA $A$ of $n$ states is

1. nonempty if and only if $A$ accepts a sentence of length less lean $n$.
2. infinite if and only if $A$ accepts a sentence of length $\ell$ with $n \le \ell < 2n$.

The union of the following languages, $L_1 \cup L_2$, is non-regular, but nevertheless it cannot be proven using pumping lemma.

$L_1 = \{u_1 v_1 w v_2 u_2 : u_1,u_2,v_1,v_2,w \in [4]^*,$ and either $v_1=w$, $v_1=w$, or $v_2=v_2\}$

$L_2 = \{w : w\in [4]^*,$ and precisely $1/7$ of symbols in $w$ are 3’s $\}$

* Regular languages are precisely constant-space languages

Theorem.  A regular language can be recognized by a two-way read-only Turing machine.

Proof sketch.  Record the configuration of each time we first visit the $i$-th cell.  Define a finite automaton to simulate the same behavior.

Theorem.  $\textsc{Space}(o(\log\log n))$ is the same as $\textsc{Space}(1)$, both are the class of regular languages.  That is, having an algorithm that uses $o(\log\log n)$-space is equal to not using any space.

Proof sketch.  If a Turing machine uses $s(n)$ space, there are $2^{O(s(n))}$ possible configurations and $2^{2^{O(s(n))}}$ possible crossing sequences.  If $s(n) = o(\log\log n)$ then there are two crossing sequences that are the same, and we can further shorten a shortest string accepted by the machine by the following lemma:

Lemma.  Let $M$ be a machine and $x$ be the input.  Suppose two crossing sequence $C_i$ and $C_j$ are equal; then by removing the substring of $x$ from index $i+1$ to $j-i$ (assume $i < j$), we get another string that is accepted by the machine $M$.

The following language is non-regular but can be decided in $O(\log\log n)$ space:

$L = \{\#(0)_2\#(1)_2\# \ldots \#(2^k-1)_2\# : k \ge 0\}$,

where $(i)_2$ is the binary representation of $i$.

* Generating function of regular language

For a given language $L \subseteq \Sigma^*$,

$S_L(z) = \sum_{n\ge 0} |L \cap \Sigma^n| z^n$

is the generating function of $L$.  Using analytic combinatorics we can prove the following useful facts.

Theorem.  For a regular language $L$, the generating function $S_L(z)$ is rational.

Theorem.  Let $A=(Q,\Sigma,\delta,q_0,F)$ be a finite automaton of a language $L$.  Then the generating function of $L$ is the following rational function determined under matrix form

$S_L(z) = [q_0]^T(I-z[\delta])^{-1}[F]$,

where $[\delta](i,j) = | \{$ number of different labels between $q_i$ and $q_j \}|$; $[q_0]$ and $[F]$ as characteristic vectors.

* Minimizing finite automata

Brzozowski’s algorithm use only power set construction $\textsc{Power}$ and edge reversal $\textsc{Rev}$.  One can observe that reversing edges of a DFA gives an NFA of the reverse language.  Then the power set construction gives a minimum DFA.  This algorithm takes exponential time in the worst case.

Hopcroft’s algorithm is the fastest algorithm known; runs in $O(n \log n)$ time.  Try to partition the states using the Myhill-Nerode equivalence relation.

* Relation with $AC^0$

Theorem.  $\mathsf{REG}$ and $\mathsf{AC}^0$ are incomparable.

Proof.  Parity is not in $\mathsf{AC}^0$.  Palindrome/addition/$\{ww : w \in \Sigma^*\}$ are not regular.