LANGUAGE UNDERSTANDING

When we investigate language understanding by computer software, the questions that need to be answered are:

1 what domains of discourse are rich enough to be a vehicle for studying the central issues yet simple enough to enable progress;

2 what data representations are required when dealing with English sentences;

3 what procedures are needed to enable the computations to take place;

4 what is meant by understanding, and whether an intelligent response is adequate;

5 what can be done to overcome problems when software limitations are exposed.

FROM SENTENCES TO MODELS

Powerful language systems may require some of the following ideas:

1 facts about word arrangement are explicit in parse trees;

2 facts about the way acts relate to objects are explicit in thematic role frames;

3 facts about meaning are explicit in world models.

PARSE TREES

A syntactic specialist bounces about in a sentence with only the modest goal of segmenting it into meaningful phrases and sentence constituents arrayed in a parse tree. Consider the sentence:

The clever robot moved the red engine to the appropriate chassis.

The parse tree for such a sentence records that the sentence is composed of a noun phrase and a verb phrase with an embedded noun phrase and an embedded prepositional phrase with an embedded noun phrase.

PARSING SENTENCES

To embody syntactic constraints, we need some device that shows how phrases relate to one another and to words. One such device is the context-free grammar. Others are the transition-net grammar and the augmented transition-net grammar. Still another is the wait-and-see grammar. We will examine each briefly. First, however, we need a glossary. Linguists rarely write out the full names for sentence constituents. Instead, they mostly use abbreviations:

Full Name              Abbreviation
Sentence               S
Noun phrase            NP
Determiner             DET
Adjective              ADJ
Adjectives             ADJS
Noun                   NOUN
Verb phrase            VP
Verb                   VERB
Preposition            PREP
Prepositional phrase   PP
Prepositional phrases  PPS

CONTEXT-FREE GRAMMARS CAPTURE SIMPLE CONSTRAINTS

Context-free grammars consist of context-free rules like the following:

1 S -> NP VP-PPS

2 NP -> DET ADJS-NOUN

3 ADJS-NOUN -> ADJ ADJS-NOUN

4 ADJS-NOUN -> NOUN

5 VP-PPS -> VP-PPS PP

6 VP-PPS -> VP

7 VP -> VERB NP

8 PP -> PREP NP

9 DET -> a | the | this | that

10 ADJ -> silly | red | big | clever | appropriate

11 NOUN -> robot | pyramid | top | brick | engine | chassis

12 VERB -> moved

13 PREP -> to | of

The first rule means that a sentence is a noun phrase followed by something denoted by the compound VP-PPS symbol. The purpose of the VP-PPS symbol is revealed by the fifth and sixth rules, which show that VP-PPS can spin off any number of prepositional phrases, including none, before disappearing into a verb phrase.

The second rule says that a noun phrase is a determiner followed by whatever is denoted by ADJS-NOUN. The third and fourth rules deal with the ADJS-NOUN symbol, showing that it can spin off any number of adjectives, including none, before becoming a noun. Finally, the seventh rule says that a verb phrase is a verb followed by a noun phrase, and the eighth rule says that a prepositional phrase is a preposition followed by a noun phrase. The first eight rules involve only nonterminal symbols, which do not appear in completed sentences; the remaining rules determine how these uppercase symbols are associated with lowercase symbols that relate to words.

Because of the arrows, it is natural to think of using the rules generatively: starting with the sentence symbol S, replacing it by NP VP-PPS, and continuing until a string of terminal symbols is reached.

Scan the string from left to right until a nonterminal is reached; replace it using a rule; repeat until no nonterminals are left. Such grammars are known as context-free because the left-hand side of each rule consists only of the symbol to be replaced. All terminal-only strings produced by the grammar are well-formed sentences. The following derivation works top down, moving from the rules to the words:

S

NP VP-PPS

DET ADJS-NOUN VP-PPS

The ADJS-NOUN VP-PPS

The ADJ ADJS-NOUN VP-PPS

The clever ADJS-NOUN VP-PPS

The clever NOUN VP-PPS

The clever robot VP-PPS

The clever robot VP-PPS PP

The clever robot VERB NP PP

The clever robot moved NP PP

...

The clever robot moved the red engine to the appropriate chassis.
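The expand-and-replace procedure just traced can be sketched as a small top-down recogniser. The Python below is a minimal sketch, not taken from the text: the dictionary layout and function names are my own, the example sentence's words are included in the lexicon, and rule 5's left recursion is restated in an equivalent right-recursive form, since naive top-down search loops on left-recursive rules.

```python
# A minimal top-down recogniser for the context-free grammar above.
# Names and data layout are illustrative, not the text's.

GRAMMAR = {
    "S": [["NP", "VP-PPS"]],
    "NP": [["DET", "ADJS-NOUN"]],
    "ADJS-NOUN": [["ADJ", "ADJS-NOUN"], ["NOUN"]],
    # Rule 5 (VP-PPS -> VP-PPS PP) is left-recursive, which loops under
    # naive top-down search, so it is restated right-recursively:
    # a verb phrase followed by zero or more prepositional phrases.
    "VP-PPS": [["VP", "PPS"], ["VP"]],
    "PPS": [["PP", "PPS"], ["PP"]],
    "VP": [["VERB", "NP"]],
    "PP": [["PREP", "NP"]],
}

LEXICON = {
    "DET": {"a", "the", "this", "that"},
    "ADJ": {"silly", "red", "big", "clever", "appropriate"},
    "NOUN": {"robot", "pyramid", "top", "brick", "engine", "chassis"},
    "VERB": {"moved"},
    "PREP": {"to", "of"},
}

def derive(symbol, words, i):
    """Return every position reachable after deriving `symbol` from words[i:]."""
    if symbol in LEXICON:                      # terminal category: consume one word
        return {i + 1} if i < len(words) and words[i] in LEXICON[symbol] else set()
    positions = set()
    for expansion in GRAMMAR[symbol]:          # try each rule for this nonterminal
        frontier = {i}
        for part in expansion:
            frontier = {k for j in frontier for k in derive(part, words, j)}
        positions |= frontier
    return positions

def recognise(sentence):
    words = sentence.lower().rstrip(".").split()
    return len(words) in derive("S", words, 0)

print(recognise("The clever robot moved the red engine to the appropriate chassis."))  # True
print(recognise("The robot the moved engine."))  # False
```

The recogniser answers only yes or no; returning the sets of reachable positions, rather than a single position, is what lets it explore every choice of rule without backtracking machinery.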

Whilst it appears natural to move in this way, our purpose now is to progress from the sentence to the parse tree, using the appropriate rules bottom up. One way of doing this is to use the same rules backwards: instead of starting with S, we start with the words and end up with S, hopefully with no spare words left over.

The clever robot moved the red engine to the appropriate chassis.

DET clever robot moved the red engine to the appropriate chassis.

DET ADJ robot moved the red engine to the appropriate chassis.

DET ADJ NOUN moved the red engine to the appropriate chassis.

DET ADJS-NOUN moved the red engine to the appropriate chassis.

NP moved the red engine to the appropriate chassis.

NP VERB the red engine to the appropriate chassis.

NP VERB DET red engine to the appropriate chassis.

NP VERB DET ADJ engine to the appropriate chassis.

NP VERB DET ADJ NOUN to the appropriate chassis.

NP VERB DET ADJS-NOUN to the appropriate chassis.

NP VERB NP to the appropriate chassis.

NP VP to the appropriate chassis.

NP VP-PPS to the appropriate chassis.

NP VP-PPS PREP the appropriate chassis.

NP VP-PPS PREP DET appropriate chassis.

NP VP-PPS PREP DET ADJ chassis.

NP VP-PPS PREP DET ADJ NOUN.

NP VP-PPS PREP NP.

NP VP-PPS PP.

NP VP-PPS.

S.

SEE FIGURE 1
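The backwards use of the rules can likewise be sketched as an exhaustive bottom-up reducer. This is a sketch under assumptions: the names are my own, each word is first replaced by its category, and then any substring matching a rule's right-hand side may be replaced by that rule's left-hand side, succeeding if the whole string reduces to S.

```python
# Bottom-up recognition by exhaustive reduction, mirroring the trace above.
# The rule list follows the grammar in the text; names are illustrative.

RULES = [
    ("S", ("NP", "VP-PPS")),
    ("NP", ("DET", "ADJS-NOUN")),
    ("ADJS-NOUN", ("ADJ", "ADJS-NOUN")),
    ("ADJS-NOUN", ("NOUN",)),
    ("VP-PPS", ("VP-PPS", "PP")),
    ("VP-PPS", ("VP",)),
    ("VP", ("VERB", "NP")),
    ("PP", ("PREP", "NP")),
]

CATEGORY = {
    "a": "DET", "the": "DET", "this": "DET", "that": "DET",
    "silly": "ADJ", "red": "ADJ", "big": "ADJ",
    "clever": "ADJ", "appropriate": "ADJ",
    "robot": "NOUN", "pyramid": "NOUN", "top": "NOUN", "brick": "NOUN",
    "engine": "NOUN", "chassis": "NOUN",
    "moved": "VERB", "to": "PREP", "of": "PREP",
}

def reduces_to_s(symbols, seen=None):
    """Depth-first search over all possible reductions of the symbol string."""
    seen = set() if seen is None else seen
    if symbols == ("S",):
        return True
    if symbols in seen:                  # already explored this string: dead end
        return False
    seen.add(symbols)
    for lhs, rhs in RULES:               # apply each rule backwards, anywhere
        n = len(rhs)
        for i in range(len(symbols) - n + 1):
            if symbols[i:i + n] == rhs:
                if reduces_to_s(symbols[:i] + (lhs,) + symbols[i + n:], seen):
                    return True
    return False

def recognise(sentence):
    words = sentence.lower().rstrip(".").split()
    return reduces_to_s(tuple(CATEGORY[w] for w in words))

print(recognise("The clever robot moved the red engine to the appropriate chassis."))  # True
```

Because no rule lengthens the string and visited strings are remembered, the search always terminates; it simply tries every order of reductions until one reaches S.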

TRANSITION NETS CAPTURE SIMPLE SYNTACTIC CONSTRAINTS

All parser interpreters must involve mechanisms for building phrase-describing nodes and for connecting these nodes together. All parsers must consider the following questions:

1 when should a parser start work on a new node, creating it;

2 when should a parser stop work on an existing node, completing it;

3 where should a parser attach a completed node to the rest of the already built tree structure.

In a traditional parser built using context-free grammar rules, a simple procedure specifies how to start and stop nodes, and the grammar rules specify where each node is attached.

An alternative, equivalent method is the transition net. Transition nets are made up of nodes and directed links called arcs. Examples of rules using transition nets are shown in FIGURE 3.

The transition net interpreter gives straightforward answers to the start, stop, and attach questions: work on a new node starts whenever an attempt to traverse a net begins, work on an existing node stops whenever a net is traversed or a failure occurs, and a node is attached whenever any net other than the top-level net is traversed.

To move through the sentence net we must first traverse the noun phrase net, so the first word must be a determiner. This procedure is top-down parsing: it is so called because everything starts with the creation of a sentence node at the top of the parse tree and moves down toward an eventual examination of the words in the sentence.

Consider again the sentence

The clever robot moved the red engine to the appropriate chassis.

Figure 4 indicates what happens.

Moving through the sentence net, a sentence node is created. Next we encounter a noun phrase arc labelled T1. This creates a noun phrase node and initiates an attempt to traverse the noun phrase net, which in turn initiates an attempt on the determiner arc T3 in the noun phrase net. The first word, the, is a determiner, so a determiner node is created and attached to the noun phrase; the word the is also attached to the determiner node. Now we must make a choice: either the adjective arc T4 or the noun arc T5. There is an adjective, clever, so we take the adjective path. The arc T5 is then taken for the noun robot. We are now at the double-circle success node, which takes us back to the sentence node, and we move one stage further on. The next thing to look for is a verb phrase, T2. After the verb moved is consumed, we move quickly through the arcs T3, T4, and T5 with the phrase

the red engine

we return to the verb phrase net. We now have the option of a prepositional phrase; this is the arc T10 in the verb phrase transition net. We move to the prepositional phrase transition net, whose first arc requires a preposition, and the first word encountered, to, is indeed a preposition. Eventually the phrase to the appropriate chassis is claimed as a prepositional phrase, and we return to the sentence net with success at S3. As there are no more words left in the sentence, a successful parse has occurred. See FIGURE 2.
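The traversal just described can be sketched as a small interpreter over nets of states and arcs. This is a hedged reconstruction: the net shapes follow the prose (a noun phrase net with a determiner arc, an adjective loop, and a noun arc; a verb phrase net with a prepositional-phrase loop), but the state numbers and names are my own, not the figures' T-labels.

```python
# Top-down transition-net parsing. Each net maps states to arcs of the form
# (kind, label, next-state): "cat" arcs consume one word of the given
# category; "net" arcs recursively traverse a subnet. Illustrative names.

NETS = {
    "S":  {0: [("net", "NP", 1)], 1: [("net", "VP", 2)], 2: []},
    "NP": {0: [("cat", "DET", 1)],
           1: [("cat", "ADJ", 1),          # adjective loop
               ("cat", "NOUN", 2)],
           2: []},
    "VP": {0: [("cat", "VERB", 1)],
           1: [("net", "NP", 2)],
           2: [("net", "PP", 2)]},         # prepositional-phrase loop
    "PP": {0: [("cat", "PREP", 1)], 1: [("net", "NP", 2)], 2: []},
}
SUCCESS = {"S": {2}, "NP": {2}, "VP": {2}, "PP": {2}}   # double-circle states

CATEGORY = {
    "the": "DET", "a": "DET",
    "clever": "ADJ", "red": "ADJ", "appropriate": "ADJ",
    "robot": "NOUN", "engine": "NOUN", "chassis": "NOUN",
    "moved": "VERB", "to": "PREP", "of": "PREP",
}

def traverse(net, words, i):
    """Yield every input position reachable by traversing `net` from words[i:]."""
    def walk(state, i):
        if state in SUCCESS[net]:
            yield i
        for kind, label, nxt in NETS[net][state]:
            if kind == "cat":
                if i < len(words) and CATEGORY.get(words[i]) == label:
                    yield from walk(nxt, i + 1)
            else:                          # recursively attempt the subnet
                for j in traverse(label, words, i):
                    yield from walk(nxt, j)
    yield from walk(0, i)

def recognise(sentence):
    words = sentence.lower().rstrip(".").split()
    return any(j == len(words) for j in traverse("S", words, 0))

print(recognise("The clever robot moved the red engine to the appropriate chassis."))  # True
```

Yielding every reachable position, rather than committing to one, is what handles the choice points (the adjective loop, the optional prepositional phrases) that the text resolves by hand.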

AUGMENTED TRANSITION NETS

Context-free and transition-net grammars cannot capture all syntactic constraints; more generality is needed. One way of achieving this is to augment context-free grammars with transformations. Transformations can specify word rearrangement, for example:

when did the clever robot move the red engine to the appropriate chassis

Studying grammars that involve transformations takes us into an area mainly studied by linguists.

Alternatively, we can discuss augmented transition nets, or ATNs, where linguistic properties called features are attached to the current parse-tree node. Suppose we were traversing a noun phrase ATN and we encountered a or this; it would be reasonable to attach a singular feature to the noun phrase. Later, when the noun phrase's noun is met, the noun's features, held in a features dictionary, can be consulted to check that they are consistent with the features already encountered. Capturing all the syntactic constraints of English is a huge task, and although many grammars have been written and are being written, there is no completely satisfactory grammar yet available.
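The number-agreement idea can be made concrete with a toy sketch. All names and dictionary entries below are invented for illustration; a real ATN would carry such checks as conditions on its arcs.

```python
# A toy version of the ATN feature check described above: the determiner
# fixes a number feature on the noun-phrase node, and the noun, looked up
# in a small features dictionary, must be consistent with it.
# All entries are illustrative.

DET_NUMBER = {
    "a": "singular", "this": "singular",
    "these": "plural",
    "the": None,          # "the" is compatible with either number
}
NOUN_NUMBER = {
    "robot": "singular", "robots": "plural",
    "engine": "singular", "engines": "plural",
}

def noun_phrase_agrees(det, noun):
    """Accept the noun only if its number feature matches the determiner's."""
    required = DET_NUMBER[det]
    return required is None or required == NOUN_NUMBER[noun]

print(noun_phrase_agrees("this", "robot"))    # True
print(noun_phrase_agrees("these", "robot"))   # False
print(noun_phrase_agrees("the", "engines"))   # True
```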

PROBLEMS OF PREPOSITIONAL PHRASES

If we consider more involved sentences such as

the clever robot moved the red engine to the front of the appropriate chassis

we do not immediately know how to link each prepositional phrase to the correct part of the tree.

Clearly we would like the phrase of the appropriate chassis to be linked to front, not to the verb phrase as a whole. The obvious improvement is to add a prepositional phrase arc to the noun phrase transition net, but if this is used blindly it can lead to trouble. The problem is deciding, with the modified noun phrase net, where each prepositional phrase is linked. Sometimes a transition net parser should go around the noun phrase's prepositional loop, but at other times it should pop out of the noun phrase transition net, returning to the verb phrase in progress and taking the prepositional phrase loop there. Somehow that decision involves semantic knowledge, and this leads us to another lecture.

SUMMARY OF RULES

To parse a sentence using transition nets

1 create a parse tree node named S

2 determine whether it is possible to traverse a path of arcs from the initial node to a success node, denoted by a dotted circle. If so, and if all the sentence's words are consumed in the process, announce success; otherwise announce failure.

To parse phrases using transition nets

1 create a parse tree node with the same name as that of the transition net;

2 determine whether it is possible to traverse a path of arcs from the initial node to a success node, denoted by a dotted circle. If so, and if all the phrase's words are consumed in the process, announce success; otherwise announce failure.

To traverse an arc

1a If the arc has a lowercase symbol on it, the next word in the sentence must have that symbol as a feature; otherwise fail. The word is consumed as the arc is traversed.

1b If the arc has a downward-pointing arrow, go off and try to traverse the subnet named just after the arrow. If the subnet is successfully traversed, attach the subnet's node to the current node; otherwise fail.

TRANSITION NETWORK

A network consists of a set of states and a set of labelled arcs. Arcs have starting and ending states, and states can be initial or terminal. A transition network can be considered as a pattern for recognising a sequence of words: we proceed from one state to another along the arcs, recognising appropriate words from the sentence with the help of the dictionary.

Examples of the use of the network are shown in figures 2.13 to 2.18.

The network can be implemented as a parallel algorithm; a sequential implementation involves backtracking to find the path through the corresponding network that enables the sentence to be parsed.

RECURSIVE TRANSITION NETWORK

A transition network is equivalent to a type 3 (regular) grammar, whereas a recursive transition network is equivalent to a type 2 (context-free) grammar. An RTN grammar is made up of labelled networks. The main new feature of RTNs is that arcs may be labelled with syntactic categories, where each such label names another network. See figure 5.1, where in S we note an arc labelled NP and then proceed to that network. Figure 5.2 shows the weak equivalence between a context-free grammar and an RTN. Figures 5.3-5.5 give the scheme for parsing sentences using an RTN.

AUGMENTED TRANSITION NETWORK

The additional features here are the conditions and actions associated with the arcs, which enable a larger range of sentences to be handled, including passives and questions. The actions can involve transformations, meaning that the networks are modified so that the analysis begins at different points in the network; this is equivalent to changing the order of the words. Use is made of register structures, including register tables in which the features of the words are also included. Examples of this, and of the handling of the sentence

we have been given a firm deadline by the secretary

are given in figure 5.7.

ATNs attach feature dimensions and role names to phrases from the arc in the network handling the phrase. Conditions and actions are most important: for a particular arc to be followed, its condition must hold, and then the appropriate action is carried out. There are four types of arc. A category arc is used to match a single word. A seek arc initiates a recursive call to another network. A send arc indicates successful parsing, the termination of a network. A jump arc means that we proceed from one state to another without consuming any elements of the input.

Examples of the use of ATNs on passive and question input are shown in figures 5.15-5.26, with a dictionary in figure 5.12.
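The four arc types can be sketched as a tiny ATN-style interpreter. This is a sketch under assumptions: conditions, actions, and registers are omitted, and the net shapes, vocabulary, and names below are illustrative rather than taken from the figures.

```python
# Arcs: ("category", word-class, next)  consume one word of that class
#       ("seek", subnet, next)          recursive call to another network
#       ("jump", next)                  move on without consuming input
#       ("send",)                       successful termination of this network
# Net shapes and vocabulary are illustrative.

NETS = {
    "NP": {0: [("category", "DET", 1)],
           1: [("category", "ADJ", 1),   # optional adjectives...
               ("jump", 2)],             # ...or skip straight to the noun
           2: [("category", "NOUN", 3)],
           3: [("send",)]},
    "S":  {0: [("seek", "NP", 1)],
           1: [("category", "VERB", 2)],
           2: [("seek", "NP", 3)],
           3: [("send",)]},
}

CATEGORY = {"the": "DET", "a": "DET", "clever": "ADJ", "red": "ADJ",
            "robot": "NOUN", "engine": "NOUN", "moved": "VERB"}

def run(net, words, i):
    """Yield every position reachable when this network terminates with send."""
    def walk(state, i):
        for arc in NETS[net][state]:
            if arc[0] == "send":
                yield i
            elif arc[0] == "jump":
                yield from walk(arc[1], i)
            elif arc[0] == "category":
                _, word_class, nxt = arc
                if i < len(words) and CATEGORY.get(words[i]) == word_class:
                    yield from walk(nxt, i + 1)
            else:  # seek
                _, subnet, nxt = arc
                for j in run(subnet, words, i):
                    yield from walk(nxt, j)
    yield from walk(0, i)

def recognise(sentence):
    words = sentence.lower().split()
    return any(j == len(words) for j in run("S", words, 0))

print(recognise("the clever robot moved the red engine"))  # True
print(recognise("the robot moved"))                        # False
```

Note how the jump arc makes the adjectives optional without consuming input, and how send plays the role of the success states in the earlier nets.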