Structuring the Hypothesis Space and Hypotheses Revision

   One of the interesting features of natural language is that it allows sentences to be constructed that can be ordered from general to specific. For example,

"Someone is going someplace next week." is more general than
either "Harry is going someplace next week." or "Someone is going to New York next week";

and all of these are more general than
"Harry is going to the Village on Tuesday."

When things can be partially ordered (or completely ordered) it allows us to "move" systematically from one point in the order to another.

   Logical statements as well as sets can also be partially ordered. In fact, they can be partially ordered in a special way - in a structure that is referred to as a lattice. This is important because it suggests that this order, when it exists in a concept space, might be exploited to our advantage in inductive learning.

   The figure to the right consists of 2 of the example instances that were used earlier; namely, a large black square and a small black square. I have chosen a single dimension - size so that the set of concepts is quite small. And, the power set of a 2 element set is also conveniently small - it contains only four elements.

   These four elements of the power set are the four sets shown in the figure. The top set consists of both the small and the large black squares. The bottom set is the "empty" set. The lines connecting the sets represent the fact that there is a subset relation between the sets connected by the line. The middle two sets are both subsets of the top set and the empty set is an element of each of the sets above it. (The empty set is considered to be an element of every set.)

   This arrangement illustrates the way in which these sets can be partially ordered. And, in fact, these sets form a lattice. The page Algebra of Sets and Definition of a Lattice provides a more detailed discussion of sets and lattices and you may want to refer to it later. But first work through the material on this page so that you can see the way in which these abstract ideas relate to inductive learning.

   For our purposes here, the important feature of a lattice is that its structure can be used to find the generalization of a pair of elements of the lattice that is the smallest generalization that must be made to include both elements. And, it can also be used to find the specialization of a pair of elements of the lattice that is the smallest specialization that must be made to exclude both elements. Recall that these are the two hypothesis revisions that might be made when a new instance in inconsistent with the existing hypothesis. If the new instance is positive and the hypothesis fails to identify it as such, then the hypothesis must be generalized. If the new instance is negative and the hypothesis fails to identify it as such, then the hypothesis must be specialized.

   The example above involves the minimum required to even talk of a space of concept examples. Recall that in our experimental example we used 4 dimensions each with two values. This resulted in a total of 2 to the 4 or 16 possible concept examples. When we know the number of concept examples, then we can actually set a bound on the number of possible concept hypothesis. These 16 possible instances can be grouped in 2 to the 16 or 65,536 ways. This is what is referred to as the power set. The power set is the set of all possible subsets of a set. The power set includes the empty set as well as the set that includes all items. These two bounding sets can be excluded as possible interesting concepts. Thus, we can reduce the space of possible hypotheses from 65,536 to 65,534!!

   In order to gain an appreciation of these ideas, it will be useful to consider a more realistic example. However, we will not use the full experimental space because I really have no desire to draw a lattice of 65,536 sets. Consequently, only half of the space is used by limiting our consideration to only the dimensions of color and size. We will fix the shape at the value 'square' and fix the value of position at 'left'. The power set of these 4 possible concepts instances yields a space of 16 possible concept hypothesis. The lattice of these 16 sets is shown below.

   The curly braces are used in this figure to enclose the elements of a set. The 5 horizontal levels along which the sets are organized reflects the partial order structure of this set of sets. The top set includes all four of the possible concept instances. The next level consists of four sets each of which contain a different combination of three of the four possible concept instances. Similarly, the next level consists of six sets each of which contain a different combination of two of the four possible concept instances. The fourth level consists of 4 sets each containing one of the four possible concept instances. And, finally, the bottom level consists of the empty set which contains none of the possible concept instances.

   The lines in the figure connect a set at level n - 1 with a set at level n if the set at level n - 1 is a subset of the set at level n. Now in order to see how the lattice property proves useful to hypothesis revision, let us choose one of the sets to represent our current hypothesis about the concept. Choose the set at the middle level at the far left. According to this hypothesis the concept is any instance that is black. And, we assume that this hypothesis is consistent with the training instances that have been seen to this point.

   Now assume that we encounter an instance that consists of a large white square. Further, suppose that we are told that this instance is a positive instance of the concept. Consequently, our hypothesis must be generalized. Note that there are 5 hypotheses in this space that are more general than the current hypothesis. How should we choose among this candidate set of revisions?

   Note first that the current hypothesis is a subset of only 3 of the 5. Thus, 2 of 5, the two at the far right of the figure, can be eliminated since adopting either of these would result in an hypothesis that was inconsistent with the training sequence that has been observed to this point.

   Consider next the generalizations of the current instance. There are seven generalizations that include this instance. However, only two of these seven generalizations include both the current instance and the current hypotheses. And one of these two is the most general hypothesis at the top level of the lattice. One other generalization of the hypothesis remains. That, is the set at the far left of the lattice at level 2. This hypothesis is preferred since it is the minimal generalization required to bring the hypothesis in lines with the training sequence.

   A similar line of argument could be illustrated if a negative instance were presented, such as a small black square. In this case the specialization of the hypothesis would result in the identification of the set at the far left of the lattice at level 4.


   The figure to the left shows a training sequence that is used to illustrate the movement in the hypothesis space in response to this training sequence. The training sequence consists of three positive instances of the concept that is to be learned. They are presented in order from left to right.

The animation below annotates the movement in the space using white ovals and redrawing the relevant 'subset line' in white.

 
   Here we have assumed that we began with the most specific hypothesis, namely the null set at the very bottom level of the lattice. Since the training set included only positive examples, this resulted in all hypothesis revisions involving generalization, or movement up the lattice.

The Concept Identification Experiment

 © Charles F. Schmidt

Learning - Table of Contents