Finite Playground

To verify is human; to prove, divine.


Leave a comment

Compilation of basic results related to regular language

* Results related to Pumping Lemma

Theorem.  A language L accepted by a DFA A of n states is

  1. nonempty if and only if A accepts a sentence of length less lean n.
  2. infinite if and only if A accepts a sentence of length \ell with n \le \ell < 2n.

The union of the following languages, L_1 \cup L_2, is non-regular, but nevertheless it cannot be proven using pumping lemma.

L_1 = \{u_1 v_1 w v_2 u_2 : u_1,u_2,v_1,v_2,w \in [4]^*, and either v_1=w, v_1=w, or v_2=v_2\}

L_2 = \{w : w\in [4]^*, and precisely 1/7 of symbols in w are 3’s \}

* Regular languages are precisely constant-space languages

Theorem.  A regular language can be recognized by a two-way read-only Turing machine.

Proof sketch.  Record the configuration of each time we first visit the i-th cell.  Define a finite automaton to simulate the same behavior.

Theorem.  \textsc{Space}(o(\log\log n)) is the same as \textsc{Space}(1), both are the class of regular languages.  That is, having an algorithm that uses o(\log\log n)-space is equal to not using any space.

Proof sketch.  If a Turing machine uses s(n) space, there are 2^{O(s(n))} possible configurations and 2^{2^{O(s(n))}} possible crossing sequences.  If s(n) = o(\log\log n) then there are two crossing sequences that are the same, and we can further shorten a shortest string accepted by the machine by the following lemma:

Lemma.  Let M be a machine and x be the input.  Suppose two crossing sequence C_i and C_j are equal; then by removing the substring of x from index i+1 to j-i (assume i < j), we get another string that is accepted by the machine M.

This is a contradiction.

The following language is non-regular but can be decided in O(\log\log n) space:

L = \{\#(0)_2\#(1)_2\# \ldots \#(2^k-1)_2\# : k \ge 0\},

where (i)_2 is the binary representation of i.

* Generating function of regular language

For a given language L \subseteq \Sigma^*,

S_L(z) = \sum_{n\ge 0} |L \cap \Sigma^n| z^n

is the generating function of L.  Using analytic combinatorics we can prove the following useful facts.

Theorem.  For a regular language L, the generating function S_L(z) is rational.

Theorem.  Let A=(Q,\Sigma,\delta,q_0,F) be a finite automaton of a language L.  Then the generating function of L is the following rational function determined under matrix form

S_L(z) = [q_0]^T(I-z[\delta])^{-1}[F],

where [\delta](i,j) = | \{ number of different labels between q_i and q_j \}|; [q_0] and [F] as characteristic vectors.

* Minimizing finite automata

Brzozowski’s algorithm use only power set construction \textsc{Power} and edge reversal \textsc{Rev}.  One can observe that reversing edges of a DFA gives an NFA of the reverse language.  Then the power set construction gives a minimum DFA.  This algorithm takes exponential time in the worst case.

Hopcroft’s algorithm is the fastest algorithm known; runs in O(n \log n) time.  Try to partition the states using the Myhill-Nerode equivalence relation.

* Relation with AC^0

Theorem.  \mathsf{REG} and \mathsf{AC}^0 are incomparable.

Proof.  Parity is not in \mathsf{AC}^0.  Palindrome/addition/\{ww : w \in \Sigma^*\} are not regular.


Leave a comment

Finite automata, monoids, and Myhill-Nerode theorem

Kleene-Buchi Theorem.  For a language L, the following are equivalent:
(1) L is recognized by a finite automaton,
(2) L is defined by a regular expression,
(3) L is definable in Monadic second order logic,
(4) L is recognized by a finite monoid.

The syntactic monoid M(L) = \Sigma^*/\cong_L is defined by the equivalence \cong_L with

u \cong_L v if and only if xuy \in L \iff xvy \in L for every x,y \in \Sigma^*.

Myhill Theorem.  A language is regular if and only if the syntactic monoid is finite.

If we define another equivalence relation \sim_L with

u \sim_L v if and only if uy \in L \iff vy \in L for every y \in \Sigma^*,

\Sigma^*/\sim_L may not be a monoid in general, but we have the following:

Myhill-Nerode Theorem.  A language L can be recognized by a finite automaton with k states if and only if the relation \sim_L has index k; that is, \Sigma^*/\sim_L has k classes.