The Semantic Octopus

[Figure: octopus.jpg]

In dense, complicated text, grammar doesn’t get us very far (or not far enough, if the goal is high reliability). We can keep adding more and more complex grammatical patterns, but if this is the only approach we follow, we will eventually be defeated. Here are some examples.

Landlord hereby leases to Tenant, and Tenant hereby accepts and leases from Landlord the premises ("Premises") described in the plan attached hereto as Exhibit A in the group of buildings known as One Executive Place, 111 West 34th Street, New York, constructed on the land described on Exhibit B attached hereto (the "Land") for the term and upon the conditions provided in this lease ("Lease") to be occupied by Tenant for .......................

Here the "to be occupied" points back to "the premises", fifty words and ten possible targets away, with three parenthetical phrases confusing the issue. The parenthetical phrases will be chopped out, and sheathing will occur from the other prepositional phrases, but we need to know when to look. Having many long, rarely used patterns is confusing and wasteful, and no matter how many patterns we have, lawyerly constructions like this will always need one more.

Another example:

Landlord reserves and may exercise the following rights without affecting Tenant's obligations hereunder:
    (j) to make available to tenants in the building at a cost to be determined by Landlord, a high speed communication service....

Here, the form is "to make available a thing" (a lawyerly take on "to make something available"), but because the prepositional chain is long, a comma gets inserted, breaking up the grammatical pattern. We should only pivot on "make" or "keep", not on every infinitive, but the grammatical pattern will only operate at the level of Infinitive, because it needs everything else resolved to the appropriate level. We could use a WITHPROPERTY to make sure we are only using the pattern on Infinitives coming from ToMake and ToKeep, but this is wasteful too. Nor are infinitives enough: it could be "consider making available", or any other form that allows the "make" relation to have the same objects. The constant here is ToMake and the way it shapes its local context, so it might as well control finding the objects it needs in that context.

A further problem is that some text, such as hospital discharge records, is written in haste and is not grammatical, and it is still necessary to extract information with high reliability.

So what can be done?

One can mount an argument that, if the system finds no relevant pattern in its armory, it should be able to synthesise grammatical patterns from the symbols it encounters and its existing patterns. The case of ungrammatical text argues against this option, and any synthesised grammatical patterns would have to respect the meanings of the relations anyway.

We can implement a semantic octopus approach, where the active elements in the text are identified and allowed to search out their connections themselves, within less constraining grammatical rules. This fits in with the desire for structural invariance of the result.

[Figure: tolease.jpg]

Some examples of saying the same thing:

Fred obtained a lease of the premises from John
John leased the premises to Fred
the premises that were leased to Fred by John
John’s premises were leased to Fred
The premises leased to Fred belong to John
The leasing to Fred of premises in the building that John owns
Fred has become the lessee of the premises, where the lessor is John
Fred will occupy the newly leased premises, with John as landlord

There is the same active element in each sentence, the ToLease relation, and it can be made to seek out its connections, knowing what they need to be to satisfy any particular meaning (it doesn't know the meaning yet, so we have alternatives on meaning, and on the number and kinds of thing it should connect to). It shouldn’t matter whether the relation is found in a noun, an adjectival present participle (a gerund if you prefer), an adjectival past participle, an active verb, a passive verb, a participial, in a relative pronoun clause, or across several sentences: the same connections are required.

[Figure: toleaseloose.jpg]
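As a rough illustration of structural invariance, the sketch below lets a toy ToLease relation fill its roles from hand-tagged candidates. Everything in it is an assumption made for illustration: the Candidate class, the role names (lessor, lessee, premises), the kinds and the cue table are hypothetical, and the tagging that a real system would have to do itself is supplied by hand.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Candidate:
    text: str      # surface form, e.g. "John"
    kind: str      # hypothetical semantic kind: "person" or "property"
    cue: str = ""  # preposition immediately before it, if any

# Hypothetical tables: which kind each role needs, and which
# prepositions point at which role.
ROLE_KINDS = {"lessor": "person", "lessee": "person", "premises": "property"}
ROLE_CUES = {"lessee": {"to"}, "lessor": {"by", "from"}}

def connect_to_lease(candidates):
    """Let the relation pick its own connections, whatever the word order."""
    roles = {}
    unclaimed = list(candidates)
    # Pass 1: a preposition the relation knows about decides the role.
    for role, cues in ROLE_CUES.items():
        for c in list(unclaimed):
            if role not in roles and c.cue in cues and c.kind == ROLE_KINDS[role]:
                roles[role] = c.text
                unclaimed.remove(c)
    # Pass 2: remaining roles take the first unclaimed candidate of the right kind.
    for role, kind in ROLE_KINDS.items():
        if role in roles:
            continue
        for c in unclaimed:
            if c.kind == kind:
                roles[role] = c.text
                unclaimed.remove(c)
                break
    return roles

SENTENCES = [
    # John leased the premises to Fred
    [Candidate("John", "person"), Candidate("the premises", "property"),
     Candidate("Fred", "person", "to")],
    # the premises that were leased to Fred by John
    [Candidate("the premises", "property"), Candidate("Fred", "person", "to"),
     Candidate("John", "person", "by")],
    # Fred obtained a lease of the premises from John
    [Candidate("Fred", "person"), Candidate("the premises", "property", "of"),
     Candidate("John", "person", "from")],
]

results = [connect_to_lease(s) for s in SENTENCES]
print(results[0] == results[1] == results[2])
```

The three surface forms differ in word order and voice, yet the relation ends up with the same connections each time, which is the invariance being argued for. The remaining sentences in the list above would need more than this toy offers (possessives, "belong to", a separate "occupy" relation).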

Of course, it gets more complicated as more relations come into close proximity. Some relations will share a subject:

John failed to make Fred pay the money.

"failed" and "make" share the subject John.

The failure to make Fred pay the money proved expensive.

"failure" and "make" share the same unknown subject.

Some relations have different meanings, requiring different connections.

John made a doll.

John made Fred pay Olga what he owed her.

[Figure: fredpaymoney1.jpg]

"made" can mean brought into existence, like making a doll, but when John made Fred pay the money, we don’t just mean that John brought this relation into existence, perhaps by giving Fred the money so he could pay, we mean that John coerced Fred into paying the money (a meaning of ToMake may take us to some other relation, or structure of relations, to represent the meaning more accurately). There are other things happening here too - the "what" is a credible object of "pay" as it is also an object of "owe".

There is a stage where relations are searching out and attaching themselves to the objects in the text, but once the connections are made, they look much simpler and more regular in network form (the linear text has been unravelled, just as the network was ravelled to form the text). The relations, with their desire for particular objects, are far better at determining what goes where than a "one size fits all" grammar.

[Figure: fredpaymoney.jpg]

Bringing semantics in at an early stage allows us to start reducing the alternative meanings a relation can have. This happens either because possible objects are not there - John didn’t coerce the doll to do anything, so that meaning is stripped - or because several relations compete for objects. If no other relation is interested in the object, the existence of the object may force a particular meaning for the relation, which brings us back to our second example:

to make available to tenants in the Complex at a cost to be determined by Landlord, a private telephone service

Nothing else wants to pick up the telephone service, so "make" will have to find a meaning to cover it (and overlook the comma in the process).
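The stripping of meanings can be sketched in a few lines. This is only a toy under stated assumptions: the Meaning class, the two meanings "create" and "coerce", and the kinds "artefact" and "agent" are hypothetical names, and a real relation would weigh far more than one object and one flag.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Meaning:
    name: str
    object_kind: str    # kind the direct object must have (hypothetical)
    needs_action: bool  # whether a following action such as "pay" is required

# Two of the meanings of ToMake discussed above; the constraints are assumed.
MAKE_MEANINGS = [
    Meaning("create", object_kind="artefact", needs_action=False),
    Meaning("coerce", object_kind="agent", needs_action=True),
]

def strip_meanings(meanings, object_kinds, has_action):
    """Keep only the meanings whose requirements the local context can satisfy."""
    return [
        m.name
        for m in meanings
        if m.object_kind in object_kinds and m.needs_action == has_action
    ]

# "John made a doll": a doll is an artefact and no action follows.
doll = strip_meanings(MAKE_MEANINGS, {"artefact"}, has_action=False)
# "John made Fred pay": Fred is an agent and "pay" follows.
fred = strip_meanings(MAKE_MEANINGS, {"agent"}, has_action=True)
print(doll, fred)
```

The point of the sketch is the direction of control: the relation holds the list of what each meaning needs, and the context removes the meanings it cannot support.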

John made a doll to pass the time.

This is "John passed the time by means of making a doll", because dolls don’t pass the time and aren’t easily coerced.

Mussolini made the trains run on time.

Trains are not coercible things (you can be sure someone was coerced, but we don’t know who), so all we can say is that Mussolini made something happen (caused the trains to run on time). Another example using "make":

It makes the failure to grasp the opportunity much worse.

Only by the relations squabbling over who gets what can the "much worse" phrase be allocated correctly - grammar contributes nothing here.

We will run into a phasing problem when we have a number of relations, all looking for things to connect to. In the first example, "to be occupied" will run into a parenthetical when searching left. This is handled by the InterimChainInfinitive symbol (an infinitive preceded by a prepositional phrase) connecting itself to the parenthetical, and being woken again when the parenthetical is cut from the parse chain or subsumed into a higher level symbol. Again it will search left, and either encounter another symbol it can’t handle and have to suspend, or find all possible targets and start deciding which one it prefers. It will also need to wait if other relations may gobble up some of its targets – we potentially have a musical chairs situation, where each watches to see who grabs what, and complains if its only possibility is taken from it. It should have grabbed it when it had the chance, but now we are getting to the heart of the NLP problem: running multiple searches for meaning in parallel.
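The suspend-and-wake behaviour might be sketched as follows. This is a minimal toy, not the system's mechanism: ParseChain, search_left and cut are hypothetical names, symbols are plain strings, a leading "(" stands in for "a symbol the searcher can't handle yet", and the loop at the end plays the part of whatever other process eventually cuts each parenthetical.

```python
class ParseChain:
    """A toy parse chain in which a search can suspend on a blocking symbol."""

    def __init__(self, symbols):
        self.symbols = list(symbols)
        self.waiting = {}  # blocker -> searchers to wake when it is cut

    def search_left(self, searcher, is_target):
        """Collect targets to the left; suspend (return None) at a blocker."""
        idx = self.symbols.index(searcher)
        targets = []
        for sym in reversed(self.symbols[:idx]):
            if sym.startswith("("):
                # Connect to the parenthetical and wait to be woken.
                self.waiting.setdefault(sym, []).append(searcher)
                return None
            if is_target(sym):
                targets.append(sym)
        return targets

    def cut(self, blocker):
        """Remove a symbol from the chain and return the searchers to wake."""
        self.symbols.remove(blocker)
        return self.waiting.pop(blocker, [])


# A heavily reduced version of the first example.
chain = ParseChain(["premises", "(Premises)", "plan", "(Land)",
                    "term", "(Lease)", "to be occupied"])
nouns = {"premises", "plan", "term"}

suspensions = 0
targets = chain.search_left("to be occupied", nouns.__contains__)
while targets is None:
    suspensions += 1
    blocker = next(b for b, ws in chain.waiting.items() if "to be occupied" in ws)
    for woken in chain.cut(blocker):  # something else cuts the parenthetical
        targets = chain.search_left(woken, nouns.__contains__)

print(targets, suspensions)
```

Only one searcher is shown. The musical chairs problem appears as soon as several searchers share targets, because each must then also watch what the others claim before committing.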

Sometimes different meanings of the same relation will squabble over the same objects:

The traffic made John late.

Six years’ study made John a lawyer.

The different meanings expect different things, although both of these meanings of "make" expect the form

Make something to be something

but one adds an attribute, while the other adds a new property for the object, and may make another relation concerning John comprehensible. Sometimes the relation has to see a possibility for connection and decline it, as in

She made John a cup of tea.

This is an implied "for", nothing to do with "made" at all. It isn’t quite that simple –

John was made a cup of tea while we waited

But ToMake has to be able to see that John cannot be made into a cup of tea, so this is still:

Unknown person made a cup of tea for John to have.

Sometimes active operators will find there is no object - a common object is implied by the presence of the active operators, as in

the earlier to occur of 25th April 2008 and the date the work is complete

[Figure: tolease2.jpg]

"earlier" is looking for an object to attach to, so is "to occur", and so is "of", so one is fabricated to suit all three. The same can be done using grammatical patterns, but without the level of certainty a semantic collision allows. Here, each search is bringing meaning to what it is looking to find - a long way from "noun phrase".
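One way to picture the fabrication is as an intersection of what each seeker will accept. The sketch below assumes that reading; ImpliedObject, fabricate_shared_object and the kinds attached to each operator are hypothetical, chosen only so that the three searches collide on a single kind.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ImpliedObject:
    kinds: frozenset  # kinds that every seeker accepts
    shared_by: tuple  # the operators connected to the fabricated object

def fabricate_shared_object(seekers):
    """seekers maps each operator to the set of kinds it would accept."""
    if not seekers:
        return None
    common = frozenset.intersection(*(frozenset(k) for k in seekers.values()))
    if not common:
        return None  # the searches do not collide on anything
    return ImpliedObject(kinds=common, shared_by=tuple(seekers))

# Assumed requirements for "the earlier to occur of <date> and <date>".
SEEKERS = {
    "earlier": {"date", "event"},   # something that can be ordered in time
    "to occur": {"date", "event"},  # something that can happen
    "of": {"date"},                 # something of the same kind as the listed items
}

implied = fabricate_shared_object(SEEKERS)
print(implied)
```

Because the object is built from what the seekers demand, it arrives already carrying meaning, which is the contrast with a bare "noun phrase" slot.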

The octopus really starts waving its arms when prepositions are involved. The relation can have a direct connection -

He warned John the bridge was damaged.

or the same meaning can be communicated in other ways:

He gave a warning to John about the bridge.

Some prepositions are held directly in the relation structure, to point to subject, object and second object, and are typically:

Subject - By        Object - To      Second Object - Of

John was warned by Fred of the danger.

Other prepositions are found through linkage to the preposition, such as

[Figure: warnedagainstdanger.jpg]

The relation is directly connected to prepositions that are specific to it, and connected to others through its parents. For instance, "warned against" is specific to ToWarn, but "warned about" can inherit from a more general form for "spoke about", "reasoned about" etc. The preposition on its own cannot be expected to handle all the work.
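The lookup through parents can be sketched as a walk up an inheritance chain. The Relation class and role_for method are hypothetical, as are the parent name ToCommunicate and the role labels "hazard" and "topic"; only the By/To/Of assignments and the against/about distinction come from the text.

```python
class Relation:
    """A relation holding its own prepositions and inheriting the rest."""

    def __init__(self, name, prepositions, parent=None):
        self.name = name
        self.prepositions = prepositions  # preposition -> role it points to
        self.parent = parent

    def role_for(self, preposition):
        """Return (role, owning relation), searching this relation then its parents."""
        relation = self
        while relation is not None:
            if preposition in relation.prepositions:
                return relation.prepositions[preposition], relation.name
            relation = relation.parent
        return None  # this relation has no use for the preposition


# Hypothetical general form covering "spoke about", "reasoned about" etc.
to_communicate = Relation("ToCommunicate", {"about": "topic"})

to_warn = Relation(
    "ToWarn",
    {
        "by": "subject",
        "to": "object",
        "of": "second object",
        "against": "hazard",  # specific to ToWarn
    },
    parent=to_communicate,
)

print(to_warn.role_for("against"))
print(to_warn.role_for("about"))
print(to_warn.role_for("under"))
```

A preposition the chain does not recognise is left for some other relation to claim, which is why the preposition on its own cannot do all the work.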

In these cases, the object or second object may be constrained as a grammatical symbol, as well as in its meaning. That is, the relation is performing grammatical checks on its possible targets, and those it rejects can be snapped up by other relations. The relation is modifying the grammatical context in its vicinity, and by doing so can tolerate some lack of grammatical purity.

In simple cases like "John ate a banana", the semantic octopus approach and the grammatical approach would result in a dead heat, and for simple transitive verbs with an immediately following (and acceptable) object, the octopus approach is unnecessary (and doesn’t occur). In more complex cases, there just isn’t the grammatical depth (nor can there ever be) to handle the diversity of sentence structure. Put another way, grammar is fine for all the transitive verbs that have the form "thing verb thing" (John ate a banana), but as soon as we encounter special cases, we might as well let semantics do the work from scratch, because this is what grammar does: it throws up its hands when things get hard, because it is a generalisation.

This isn't the first time we have had to use the octopus approach, where operators make it their business to go out and find things that interest them - the problem typically arises in any complex dynamic scheduling problem. What is important is making sure they both cooperate and compete while doing so - one more aspect of a non-algorithmic approach.

What is described is just another example of Constraint Reasoning, but here everything is dynamic and has to do with the building of structure, not just the pruning of sets of numbers.

Some relation examples

See A Pipeline to Failure in Bioinformatics

    Active Grammar

    Dynamic Phasing


    Prepositional Maps