Parsing Example

    Parsing a sentence involves the use of linguistic knowledge of a language to discover the way in which a sentence is structured. Exactly how this linguistic knowledge is represented and can be used to understand sentences is one of the questions that has engaged the interest of psycholinguists, linguists, computational linguists, and computer scientists.

    The figure to the right provides an example of the way in which the computational formalism of a context free grammar can be used to represent a fragment of the linguistic knowledge that any speaker of English would possess.

    Here linguistic knowledge is represented as a set of rules. A rule consists of two parts: the part to the left of the arrow (the left side) and the part to the right of the arrow (the right side). You will note that some of the rules have a vertical bar on the right side. For example, the second rule reads as NP --> Det N | Prop. The bar is a convention used to represent that the NP can be replaced either by Det N or by Prop. Thus, this is really two rules. The top-most rule states that the symbol S is replaced by the symbols NP and VP. Here S refers to 'sentence,' and NP and VP to 'noun phrase' and 'verb phrase' respectively. These are part of the non-terminal vocabulary of the grammar. Words that actually occur in a sentence, such as 'the,' 'home,' and 'go,' are part of the terminal vocabulary of the context free grammar.
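The rules just described can be written down directly as a small data structure, here sketched in Python. Only the rules actually quoted above are encoded; the word categories in the lexicon are assumptions added for illustration, since the full figure is not reproduced in the text.

```python
# Each non-terminal (a left side) maps to a list of right-side
# alternatives, so the single rule NP --> Det N | Prop becomes
# two entries under "NP".
GRAMMAR = {
    "S":  [["NP", "VP"]],            # S  --> NP VP
    "NP": [["Det", "N"], ["Prop"]],  # NP --> Det N | Prop
}

# Part of the terminal vocabulary: words that actually occur in
# sentences. The category labels are assumed for illustration.
LEXICON = {"the": "Det", "home": "N", "go": "V"}

# The bar convention really does stand for two rules:
assert len(GRAMMAR["NP"]) == 2
```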

     You may have noted the similarity between this formalism and the method of problem reduction that was described in the section on the computational approach to cognition. A context free grammar can be viewed as a special kind of problem reduction. Like the method of problem reduction, the context free grammar involves breaking a problem into subproblems until only primitive subproblems remain. Reading the first rule as "If the goal is to discover the syntactic structure of this string of words, then try to partition the string of words into one part that is a noun phrase and another part that is a verb phrase" may help you to see it as an example of a rule that decomposes a problem into subproblems. However, note that this is not a general problem solving system, but rather one that is dedicated to the single goal of assigning structure to sentences.
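This problem-reduction reading can be sketched as a small recursive program: each non-terminal is a goal that is reduced to a sequence of subgoals, and each terminal category is a primitive subproblem settled by looking a word up in the lexicon. The VP rule and the bare-noun NP alternative below are assumptions added so that the example sentence "The boy went home" can be parsed; the figure's full rule set is not reproduced in the text.

```python
# Assumed grammar fragment; the VP rule and the NP --> N alternative
# are illustrative additions, not rules quoted in the text.
GRAMMAR = {
    "S":  [["NP", "VP"]],
    "NP": [["Det", "N"], ["Prop"], ["N"]],
    "VP": [["V", "NP"]],
}
LEXICON = {"the": "Det", "boy": "N", "went": "V", "home": "N"}

def parse(symbol, words):
    """Try to recognize `symbol` at the front of `words`.
    Returns the words left over on success, or None on failure."""
    if symbol in GRAMMAR:
        # Non-terminal: reduce the goal to a sequence of subgoals.
        for alternative in GRAMMAR[symbol]:
            remaining = words
            for subgoal in alternative:        # pursued left to right
                remaining = parse(subgoal, remaining)
                if remaining is None:
                    break
            else:
                return remaining               # every subgoal succeeded
        return None
    # Terminal category: a primitive subproblem, settled by the lexicon.
    if words and LEXICON.get(words[0]) == symbol:
        return words[1:]
    return None

# An empty leftover list means S covers the whole sentence.
print(parse("S", "the boy went home".split()))   # prints []
```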

     We will use this simple set of syntactic rules expressed as a context free grammar to illustrate one of the questions that cognitive psychologists have pursued in their attempt to discover how our minds are able to use linguistic knowledge to understand sentences. The question concerns the interplay between the input and the process of attempting to "parse" the input; in this example, the process of identifying the syntactic structure that can be assigned to the sentence.

     You may recall that when we discussed the cryptarithmetic problem we noted that, in principle, any of the subproblems (assigning the correct integers to some subset of the letters) could be pursued at any point. However, in practice, it was advantageous to solve some of the subproblems before others. The parse tree shown in the center to the right represents the decomposition of the sentence into its components.

      Is there a particular order in which each of these subproblems should be pursued?

     Many orders are possible.

The Parse Tree of the Sentence "The boy went home"

Four of these possibilities are animated in the figures surrounding the parse tree:

    An animation of a Left to Right Top-Down Parse
    An animation of a Right to Left Top-Down Parse
    An animation of a Left to Right Bottom-Up Parse
    An animation of a Right to Left Bottom-Up Parse

In these animations, the symbol(s) to which a rule is applied are briefly underlined and shown in red. Then, a new line is added showing the result of applying the rule to the symbol(s).

     Each of these ways of parsing the sentence arrives at the same assignment of structure to the sentence. They differ only in the way in which the rules are used and the order in which they are applied. The top two animations illustrate what is referred to as a top-down parse because the expansion of the tree begins at the top, or root node. The bottom two animations illustrate what is referred to as a bottom-up parse because the construction of the tree begins from the terminal nodes (the words of the sentence). A top-down parse is realized by matching the left side of a rule to an appropriate symbol and then replacing that symbol with the symbols on the right side of the rule; for example, replacing S with NP VP. A bottom-up parse is realized by matching the right side of a rule to the appropriate symbol(s) and then replacing the matched symbol(s) with the symbol on the left side of the rule; for example, replacing NP VP with S.
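The bottom-up use of the rules can be sketched as a shift-reduce procedure: each word is shifted onto a stack as its category, and whenever the top of the stack matches the right side of a rule, it is replaced by the rule's left side. As before, the VP rule and the bare-noun NP rule are assumptions, and the greedy longest-match-first reduction strategy happens to suffice for this fragment; it is not adequate for grammars in general.

```python
# Rules stored right-side-first, since bottom-up parsing matches the
# RIGHT side and replaces it with the LEFT side.
RULES = [
    (("NP", "VP"), "S"),
    (("Det", "N"), "NP"),
    (("Prop",), "NP"),
    (("N",), "NP"),        # assumed bare-noun rule, so "home" is an NP
    (("V", "NP"), "VP"),
]
LEXICON = {"the": "Det", "boy": "N", "went": "V", "home": "N"}

def shift_reduce(words):
    stack = []
    for word in words:
        stack.append(LEXICON[word])    # shift: the word triggers its category
        reduced = True
        while reduced:
            reduced = False
            # try longer right sides first, so Det N is not split apart
            for rhs, lhs in sorted(RULES, key=lambda r: -len(r[0])):
                n = len(rhs)
                if tuple(stack[-n:]) == rhs:
                    stack[-n:] = [lhs]  # replace right side with left side
                    reduced = True
                    break
    return stack

print(shift_reduce("the boy went home".split()))   # prints ['S']
```

Note that no rule is applied until the words that trigger it have actually been shifted; the procedure never runs ahead of the input.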

     In addition to the way the rule is used, we can distinguish the order in which the operations are applied. The animations on the left illustrate a left to right order; that is, at a particular level of the tree, the subproblems on the left are chosen before the subproblems on the right. The animations on the right illustrate the reverse or right to left order.

     In language understanding it would seem most natural to parse an input sentence in a left to right order since that is the order in which the words are heard. However, in the presence of noise, it might be that the order chosen is determined by the ease with which a subproblem can be solved. Deviation from the left to right order of processing requires some means of remembering the words that have been received but temporarily ignored.

     The distinction between top-down and bottom-up processing is one that has aroused considerable interest. Although these terms were originally defined by the kind of parsing process illustrated here, they have been used in a broader sense, and it is perhaps better to refer to a bottom-up process as a postdictive process and a top-down process as a predictive process. Recall that the bottom-up process involves the triggering of the right-hand side of rules. In our example, the initial triggering of a rule depends entirely on the input string; in fact, a rule is not applied until the input required for its application is present. In this sense, the knowledge is utilized postdictively, and the process never goes beyond or gets 'ahead' of what the input justifies. Contrast this with the top-down process. Here verification of a rule occurs as the last step in the application of the rules. When the left-hand side of a rule is triggered, the contents of the right-hand side can be viewed as a prediction about what will be seen.

     A final distinction concerns the question of determinacy. Most of the time our intuitions support the idea that the sentences we hear have but one interpretation. But notice that even when this is true, it does not imply that at each point in the process of sentence understanding only one choice, the correct choice, was available. A parsing process that considers many different possible interpretations is referred to as a nondeterministic process. There are a variety of ways to realize a nondeterministic process. One way is to pursue all of the possible hypotheses at once. You may remember that when we studied induction we noted that the space of hypotheses could be quite large for most inductive tasks. It appears that this space can also be quite large in the language understanding task.
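Pursuing all of the hypotheses at once can be sketched as a parser that tries every alternative rule and every way of splitting the words, returning every complete parse tree rather than committing to one choice. The VP rule and the bare-noun NP rule are again illustrative assumptions.

```python
# Assumed grammar fragment, as in the earlier sketches.
GRAMMAR = {
    "S":  [["NP", "VP"]],
    "NP": [["Det", "N"], ["Prop"], ["N"]],
    "VP": [["V", "NP"]],
}
LEXICON = {"the": "Det", "boy": "N", "went": "V", "home": "N"}

def all_parses(symbol, words):
    """Every tree by which `symbol` derives exactly `words`."""
    trees = []
    if symbol in GRAMMAR:
        for alternative in GRAMMAR[symbol]:
            for children in covers(alternative, words):
                trees.append((symbol, children))
    elif len(words) == 1 and LEXICON.get(words[0]) == symbol:
        trees.append((symbol, words[0]))
    return trees

def covers(symbols, words):
    """Every way the sequence `symbols` can jointly derive `words`."""
    if not symbols:
        return [[]] if not words else []
    ways = []
    for i in range(1, len(words) + 1):   # every split point is a hypothesis
        for first in all_parses(symbols[0], words[:i]):
            for rest in covers(symbols[1:], words[i:]):
                ways.append([first] + rest)
    return ways

trees = all_parses("S", "the boy went home".split())
print(len(trees))   # prints 1: this fragment is unambiguous
```

With a richer grammar an ambiguous sentence would yield several trees, and the number of hypotheses explored along the way grows quickly with sentence length.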


© Charles F. Schmidt