Encoding of Turing Machines and the Universal Turing Machine

     We have presented a variety of Turing Machines (AnBn machine and AnBnCn machine) and in the essay on Machines and Automata briefly discussed the important idea of a Universal Turing Machine. Recall that a Universal Turing Machine (U) is one that can take as input any Turing Machine (T) together with T's input and provide the same result as T. Now there are an infinite number of Turing Machines that can be specified. (Any particular Turing Machine is of finite length, n; but the set of such machines is infinite.) Note that once we write down a Turing Machine its structure; the sets Q and S, the start state, Q0 and the set of Halt states, as well as the Tuples are fixed. The fixed structure that defines a Universal Turing Machine must be capable of interpreting any particular Turing Machine, T. Now, T may have finitely but arbitrarily large Q and S, and Tuples.

     The trick, of course, is to devise a language within which any T can be expressed. In this section we will

  • illustrate the encoding of the AnBn machine and AnBnCn machine into a common language;
  • and consider the possible use of this encoding as another type of measure of the complexity of a problem,

The development given below is adapted from the text: Formal Languages and Automata Theory by Vladimir Drobot. Computer Science Press: Rockville, MD, 1989. A brief discussion of the encoding of Turing Machines may also be found in: Introduction of Automata Theory, Languages, and Computation by John E. Hopcroft and Jeffrey D. Ullman. Addison-Wesley: Reading, MA, 1979.

The figure below reviews the definition of a Turing Machine that will be used in this development.

 

 
 

 Definition of a Turing Machine
 


Turing Machine Encoding

Next, we consider an input vocabulary, S, for the Universal Turing Machine, U, that is sufficient for encoding any Turing Machine, T.

The fixed vocabulary consists of 11 symbols:

  • {1 ,0} out of which are constructed names for each state and each symbol;
  • R, L, and S to name the moves of the tape head;
  • 6 additional symbols that are used as separators in the strings that describe the Turing Machine, T and the starting tape of T.

The string that describes a Turing Machine is of the form:


<seperator1><Nstates><seperator2><Nsymbols><seperator3><list of the tuples><seperator4>

where in our application:


<seperator1> :=: Ý
<Nstates> :=: the number of states of the machine in binary where 1 is
assigned to the start state and Nstates is assigned to the halt states
<seperator2> :=: !
<Nsymbols> :=: the number of tape symbols of the machine in binary where 1 is
assigned blank symbol. (# is the blank symbol in our application)
<seperator3> :=: @
<list of the tuples> is encoded as:
<statei><separator5><symboln><separator5><statej><separator5><symbolm><move><separator6>


where:


<statei> :=: the binary name of the state
<symboln> :=: the binary name of the tape symbol
<statej> :=: the binary name of the state
<symbolm> :=: the binary name of the tape symbol
<move> :=: R,L,or S
<separator5> :=: X
<separator6> :=: Y
<seperator4> :=:¦

Note that the order of the terms that define a tuple is somewhat different from the order used in our examples. Here the order is:

current state, current tape symbol, new state, new tape symbol, action
rather than
current state, current tape symbol, new tape symbol, action , new state

The leftmost portion of the starting tape is encoded as:


<Tseparator1><Nsymbols><seperator2><celli><Tseparator2><cellj>...<Tseparator2><celln><Tseparator3>

where:


<Tseparator1> :=: W
<Nsymbols> :=: as above
<seperator2> :=: as above
<celli> :=: the binary name of the tape symbol in the leftmost cell of the tape
<Tseparator2> :=: %
<cellj>... <celln> :=: the binary names of the tape symbols up to the last symbol on the tape
<Tseparator3> :=: $


     The table below provides the tuples of the AnBn Machine considered earlier. To the right of this table is the encoding of this machine into the fixed alphabet described above.

 

Encoding of the AnBn Turing Machine

Tuples of the AnBn Machine

current control state symbol under read head new symbol action  new control state
q0  A  #  R q2
q2  A  A  R q2
q2  B  B  R q2
q2  #  #  L q3 
q3  B  #  L q4
q4  A  A  L q4
q4  B  B  L q4
q4  #  #  R q1
q1  A  #  R q2
q1  #  # S  qh
The Number of States = 6 which in binary is 110
The Number of Symbols = 3 which in binary is 11

The mapping between state names and binary names is:

          
q0 ­> 1, q2 ­> 10, q3 ­> 11, q4 ­> 100,
          q1 ­> 101,
qh ­> 110

The mapping between symbols and binary names is:

          # ­> 1, A ­> 10, B ­> 11
And thus mapping for each tuple is:

    Tuple:
q0 A # R q2 --> 1X10X10X1XRY
    Tuple: q2 A A R q2 --> 10X10X10X10XRY
    Tuple: q2 B B R q2 --> 10X11X10X11XRY
    Tuple: q2 # # L q3 --> 10X1X11X1XLY
    Tuple: q3 B # L q4 --> 11X11X100X1XLY
    Tuple: q4 B B L q4 --> 100X11X100X11XLY
    Tuple: q4 A A L q4 --> 100X10X100X10XLY
    Tuple: q4 # # R q1 --> 100X1X101X1XRY
    Tuple: q1 A # R q2 --> 101X10X10X1XRY
    Tuple: q1 # # S
qh --> 101X1X110X1XSY
 
 

 This TM encoded as a string is:
     Ý
110!11@
    
1X10X10X1XRY10X10X10X10XRY10X11X10X11XRY10X1X11X1XLY11X11X100X1XLY100X11X100X11XLY
     
100X10X100X10XLY100X1X101X1XRY101X10X10X1XRY101X1X110X1XSY¦

where the different colorings of the elements of the string are used to help you see the structure of the machine encoded in this vocabulary.


The Length of the Encoded TM is 149

For this example we considered the input string #AB which is encoded as W11!1%10%11$

Thus the input to a U might be the string:

 W11!1%10%11$1X10X10X1XRY10X10X10X10XRY10X11X10X11XRY10X1X11X1XLY11X11X100X1XLY
100X11X100X11XLY100X10X100X10XLY100X1X101X1XRY101X10X10X1XRY101X1X110X1XSY¦


The Length of the Encoded Start String is 12
The Total length is 161


 
       The Next table below provides the tuples of the AnBnCn Machine considered earlier. To the right of this table is the encoding of this machine into the fixed alphabet described above.  
 

Encoding of the AnBnCn Turing Machine

Tuples of the AnBnCn TM

current control state symbol under read head new symbol action  new control state
q0  A  #  R q2
q2  A  A  R q2
q2  B  B  R q2

q2

C

C

R

q2

q2

%

%

R

q2
q2  #  #  L q5 

q5

C

#

L

q3

q3

C

C

L

q3

q3

%

%

L

q3
q3  B  %  L q4
q4  B B  L q4
q4  A  A  L q4
q4  #  #  R q1
q1  A  #  R q2

q1

#

#

R

q6

q1

%

%

R

q6

q6

%

%

R

q6
q6  #  # S  qh

The Number of States = 8 which in binary is 1000
The Number of Symbols = 5 which in binary is 101

The mapping between state names and binary names is:

          
q0 ­> 1, q2 ­> 10, q5 ­> 11, q3 ­> 100,
          q4 ­> 101,
q1 ­> 110, q6 ­> 111, qh ­> 1000

The mapping between symbols and binary names is:

          # ­> 1, A ­> 10, B ­> 11, C ­> 100, % ­> 101
And thus mapping for each tuple is:

Tuple: qo A # R q2 --> 1X10X10X1XRY
Tuple: q2 A A R q2 --> 10X10X10X10XRY
Tuple: q2 B B R q2 --> 10X11X10X11XRY
Tuple: q2 C C R q2 --> 10X100X10X100XRY
Tuple: q2 % % R q2 --> 10X101X10X101XRY
Tuple: q2 # # L q5 --> 10X1X11X1XLY
Tuple: q5 C # L q3 --> 11X100X100X1XLY
Tuple: q3 C C L q3 --> 100X100X100X100XLY
Tuple: q3 % % L q3 --> 100X101X100X101XLY
Tuple: q3 B % L q4 --> 100X11X101X101XLY
Tuple: q4 B B L q4 --> 101X11X101X11XLY
Tuple: q4 A A L q4 --> 101X10X101X10XLY
Tuple: q4 # # R q1 --> 101X1X110X1XRY
Tuple: q1 A # R q2 --> 110X10X10X1XRY
Tuple: q1 # # R q6 --> 110X1X111X1XRY
Tuple: q1 % % R q6 --> 110X101X111X101XRY
Tuple: q6 % % R q6 --> 111X101X111X101XRY
Tuple: q6 # # S qh --> 111X1X1000X1XSY

 
 

This TM encoded as a string is:

Ý1000!101@1X10X10X1XRY10X10X10X10XRY10X11X10X11XRY10X100X10X100XRY10X101X10X101XRY10X1X11X1XLY
11X100X100X1XLY100X100X100X100XLY100X101X100X101XLY100X11X101X101XLY101X11X101X11XLY
101X10X101X10XLY101X1X110X1XRY110X10X10X1XRY110X1X111X1XRY110X101X111X101XRY111X101X111X101XRY
111X1X1000X1XSY¦

where the different colorings of the elements of the string are used to help you see the structure of the machine encoded in this vocabulary.


The Length of the Encoded TM is 288

For this example we considered the input string #ABC which is encoded as W101!10%11%100%1$

The Length of this encoded input string is 17.

Thus the input to a U might be the string:

W101!10%11%100%1$1X10X10X1XRY10X10X10X10XRY10X11X10X11XRY10X100X10X100XRY10X101X10X101XRY
10X1X11X1XLY11X100X100X1XLY100X100X100X100XLY100X101X100X101XLY100X11X101X101XLY101X11X101X11XLY
101X10X101X10XLY101X1X110X1XRY110X10X10X1XRY110X1X111X1XRY110X101X111X101XRY111X101X111X101XRY
111X1X1000X1XSY¦

The Total length is 305.


 
       The final table below provides the tuples of the AnBn Machine where n takes on the values 1 though 6. This machine is restricted to be realizable as a Finite State Machine. It is included here to provide an example of the use of this encoding as a measue of the "complexity" of the machine. To the right of this table is the encoding of this machine into the fixed alphabet described above. Note that the length of this machine is 323 while the length of the entirely general AnBn Turing Machine was 161!  
 

Tuples of the AnBn Machine, n=1-6

current control state symbol under read head new symbol action  new control state
q0  A A R q1
q1 A A R q2
q2  A A R q3
q3  A  A R q4
q4  A  A R q5
q5  A  A R q6
q6  B  B R p1
p1  B  B R p2
p2  B B R p3
p3  B  B R p4
p4 B B R  p5
p5  B B R p6
p6   # # R qh
q1 B B R p6
q2   B  B R p5
q3 B B R p4
q4 B B R p3
q5 B B R p2
q6 B B R p1

The Number of States = 14 which in binary is 1110
The Number of Symbols = 3 which in binary is 11

The mapping between state names and binary name is:


q0 ­> 1, q1 ­> 10, q2 ­> 11, q3 ­> 100, q4 ­> 101,
q5 ­> 110, q6 ­> 111, p1 ­> 1000, p2 ­> 1001,
p3 ­> 1010, p4 ­> 1011, p5 ­> 1100, p6 ­>1101,
qh ­> 1110


The mapping between symbols and binary name is:

# ­> 1, A ­> 10, B ­> 11

And thus mapping for each tuple is:


Tuple: q0 A A R q1 --> 1X10X10X10XRY
Tuple: q1 A A R q2 --> 10X10X11X10XRY
Tuple: q2 A A R q3 --> 11X10X100X10XRY
Tuple: q3 A A R q4 --> 100X10X101X10XRY
Tuple: q4 A A R q5 --> 101X10X110X10XRY
Tuple: q5 A A R q6 --> 110X10X111X10XRY
Tuple: q6 B B R p1 --> 111X11X1000X11XRY
Tuple: p1 B B R p2 --> 1000X11X1001X11XRY
Tuple: p2 B B R p3 --> 1001X11X1010X11XRY
Tuple: p3 B B R p4 --> 1010X11X1011X11XRY
Tuple: p4 B B R p5 --> 1011X11X1100X11XRY
Tuple: p5 B B R p6 --> 1100X11X1101X11XRY
Tuple: p6 # # R qh --> 1101X1X1110X1XRY
Tuple: q1 B B R p6 --> 10X11X1101X11XRY
Tuple: q2 B B R p5 --> 11X11X1100X11XRY
Tuple: q3 B B R p4 --> 100X11X1011X11XRY
Tuple: q4 B B R p3 --> 101X11X1010X11XRY
Tuple: q5 B B R p2 --> 110X11X1001X11XRY
Tuple: q6 B B R p1 --> 111X11X1000X11XRY

 
 

This TM encoded as a string is:
     Ý
110!11@
    
1X10X10X1XRY10X10X10X10XRY10X11X10X11XRY10X1X11X1XLY11X11X100X1XLY100X11X100X11XLY
     
100X10X100X10XLY100X1X101X1XRY101X10X10X1XRY101X1X110X1XSY¦

Ý1110!11@
1X10X10X10XRY10X10X11X10XRY111X10X100X10XRY00X10X101X10XRY101X10X110X10XRY110X10X111X10XRY
111X11X1000X11XRY1000X11X1001X11XRY1001X11X1010X11XRY1010X11X1011X11XRY1011X11X1100X11XRY
1100X11X1101X11XRY1101X1X1110X1XRY10X11X1101X11XRY11X11X1100X11XRY100X11X1011X11XRY
101X11X1010X11XRY110X11X1001X11XRY111X11X1000X11XRY¦

where the different colorings of the elements of the string are used to help you see the structure of the machine encoded in this vocabulary.

The Length of the Encoded TM is 323

For this example we considered the input string #AB which is encoded as W11!1%10%11$

The Length of the Encoded Start String is 12

Thus the input to a U might be the string:

 W11!1%10%11$1X10X10X10XRY10X10X11X10XRY111X10X100X10XRY00X10X101X10XRY101X10X110X10XRY
110X10X111X10XRY111X11X1000X11XRY1000X11X1001X11XRY1001X11X1010X11XRY1010X11X1011X11XRY
1011X11X1100X11XRY1100X11X1101X11XRY1101X1X1110X1XRY10X11X1101X11XRY11X11X1100X11XRY
100X11X1011X11XRY101X11X1010X11XRY110X11X1001X11XRY111X11X1000X11XRY¦


The Total length is 335

 

AnBn Turing Machine

Machine Chapter

AnBnCn Example

 © Charles F. Schmidt

Formal Models of Computation - Table of Contents