
NG50. Network Grammar is not a finite state machine

It’s encouraging that none of those who have kindly commented on Network Grammar seem able to answer the objections to mainstream generative grammar set out in NG1 to NG6.  However, two eminent academics have expressed deep scepticism about what follows in NG7 onwards.  Essentially the NG model is alleged to be a finite state machine.  FSMs are capable of handling regular languages but not, according to Chomsky himself, human language.

The sceptics offer no further comment on NG.  They presumably consider their argument sufficient and incontrovertible.  It’s unclear whether there will be other objections when NG is shown not to be an FSM.

This piece shows that NG should not be rejected as an FSM.

Finite state machine

The crux of what the two sceptics are saying is in a comment KA attached to LS7X.  KA uses ‘machine’ to mean a mathematical abstraction rather than a practical implementation, but also to mean what’s been described in LanguidSlog.

The machine consists of a finite network of nodes.  Since there are only finitely many, we can number them.  So let’s say that the machine consists of nodes n1 n2 … nk.

Each node can apparently be in seven discrete activation states: no activation (0) to full activation (6).  Therefore the state of the machine at any given time can be described by the k-digit number d1 d2 … dk in base seven, where digit di represents the activation state of node ni at that moment.  The state-space of the machine is therefore finite.

The machine accepts words from left to right.  These are drawn from a finite set of words, a set we might call the machine’s alphabet (A).  Whenever a word is encountered, the machine transitions deterministically from its current state into a new state.  The machine appears to start in a state where no node is activated, i.e. the machine is in the initial state 0, and it seems to return to this state at the end of a successful parse.
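For concreteness, here is a minimal sketch in Python of the state encoding this argument implies; the function and the example values are my illustrative assumptions, not anything drawn from NG or from KA’s comment.

```python
def encode_state(activations):
    """Encode k node activations (each a whole unit 0..6) as one
    base-7 integer, as KA's argument assumes."""
    state = 0
    for a in activations:
        assert 0 <= a <= 6, "the argument assumes whole-unit activation"
        state = state * 7 + a
    return state

# With k = 3 nodes the state-space has 7**3 = 343 possible states.
print(encode_state([0, 0, 0]))  # 0   -- the initial state, nothing activated
print(encode_state([6, 6, 6]))  # 342 -- every node fully activated
```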

KA goes on to describe mathematically the processing of a sentence and concludes that the ‘machine’ is an FSM.

Really?

The issue is whether NG is a finite state machine.  That an FSM cannot comprehensively parse human language is not disputed.  Centre-embedded sentences (e.g. ‘the rat the cat the dog chased bit escaped’) are often cited to exemplify the difficulties.  But NG has been shown (in LS43) to handle such sentences.  So what’s going on?

The contradiction most likely arises from attributing to NG a set of characteristics that lead to the conclusion that it is an FSM.

The claims that the network is finite and that the ‘alphabet’ (lexicon) is finite are uncontroversial.  They would be true of any ‘machine’ implemented in physiology or electronics.

The claim that the network has a finite number of states depends upon activation always being in whole units.  LanguidSlog has said several times that there must be decay of activation but has never said decay is in whole units.  Therefore the claim that state-space is finite is dubious.  Whatever physiology allows would apply to any machine implemented in the brain.  If the state-space of NG is finite then the state-space of any other such machine must also be finite.
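As a minimal sketch of why continuous decay breaks the finiteness argument, assume exponential decay (the particular decay function is my assumption; LanguidSlog says only that decay occurs):

```python
import math

def decayed(activation, elapsed, half_life=1.0):
    """Continuous exponential decay: activation becomes real-valued,
    so node states are no longer confined to the whole units 0..6."""
    return activation * math.exp(-math.log(2) * elapsed / half_life)

print(decayed(6, 0.5))  # ~4.243, between KA's whole-unit states 4 and 5
```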

The claim that NG ‘accepts words left to right’ is correct.  Phonological delivery is serial and unidirectional.  Any parsing machine must accept words from left to right.

Thus finite network, finite alphabet, finite state-space and left-to-right processing do not distinguish NG from any other machine.

Deterministic

The final claim is that NG ‘transitions deterministically…into a new state’.  This is a bit trickier to deal with.

‘Deterministic’ has a particular meaning in this context.  There are deterministic FSMs and non-deterministic FSMs.  In a deterministic FSM there is only one possible new state for a given existing state and a given input.  When the machine is in state A and encounters word X it always transitions to state B.

In a non-deterministic FSM there can be alternative new states.  When the machine is in state A and encounters word X it may transition to state C or to state D or …
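A toy illustration of the distinction, with made-up states and words (these are my assumptions, not NG’s actual states): a deterministic FSM maps (state, word) to exactly one next state, while a non-deterministic FSM maps it to a set of possible next states.

```python
DFA = {("A", "X"): "B"}            # in state A, word X always leads to B
NFA = {("A", "X"): {"C", "D"}}     # in state A, word X may lead to C or D

def dfa_step(state, word):
    return DFA[(state, word)]      # exactly one successor

def nfa_step(states, word):
    # Track every state the machine could currently be in.
    return set().union(*(NFA.get((s, word), set()) for s in states))

print(dfa_step("A", "X"))          # B
print(nfa_step({"A"}, "X"))        # {'C', 'D'} (order may vary)
```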

Processing human language needs alternatives to handle situations such as in sentence (5):

(5) Nero is giving Olivia to Poppaea

At Olivia it’s possible that what follows is a noun phrase or a preposition phrase.  If an NP then Olivia is GOAL and the following NP is THEME.  If a PP then Olivia is THEME and the complement of the preposition is GOAL.  The machine must provide paths to deal with either of these alternatives.

Selection from among these paths is essentially a matter of trying them out.

NG does of course have a ‘look ahead’ capability.  This works by leaving certain propositions incomplete (GIVE / ? / OLIVIA) pending the input of later words.  It’s therefore fair to say that NG is non-deterministic and KA’s critique is flawed.
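A speculative sketch (my data structure, not NG’s implementation) of the two paths NG must keep open at Olivia in sentence (5); the incomplete proposition GIVE / ? / OLIVIA from the text is modelled as a dict with an unfilled slot.

```python
hypotheses = [
    {"pred": "GIVE", "GOAL": "OLIVIA", "THEME": None},  # an NP follows
    {"pred": "GIVE", "THEME": "OLIVIA", "GOAL": None},  # a PP follows
]

def resolve(hypotheses, phrase, phrase_type):
    """Keep the hypothesis consistent with what actually follows."""
    if phrase_type == "NP":   # e.g. 'Nero is giving Olivia a necklace'
        chosen = hypotheses[0]
        chosen["THEME"] = phrase
    else:                     # PP, e.g. '... to Poppaea'
        chosen = hypotheses[1]
        chosen["GOAL"] = phrase
    return chosen

print(resolve(hypotheses, "POPPAEA", "PP"))
# {'pred': 'GIVE', 'THEME': 'OLIVIA', 'GOAL': 'POPPAEA'}
```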

It’s not clear whether KA would approve of NG if it’s a non-deterministic FSM.  I suspect not.  Conventional wisdom is that something higher up the Chomsky Hierarchy is needed for human language.

Chomsky Hierarchy

I’ll not try to define formal language, generative power, context-sensitive and so on.  Readers not already familiar with the subject can try Wikipedia or one of the many books on the subject; Jurafsky and Martin ‘Speech and Language Processing’ should be accessible to non-mathematicians.

The Chomsky Hierarchy accommodates grammars expressed as rewrite rules. Each rule governs the behaviour of a string of adjacent words or non-terminals; the latter are the input (in production) or the output (in comprehension) of another rule.
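As a toy illustration (my own context-free fragment, not an example from LanguidSlog), each rewrite rule expands a non-terminal into a string of words and/or other non-terminals:

```python
rules = {
    "S":  [["NP", "VP"]],
    "NP": [["Nero"], ["Olivia"]],
    "VP": [["sleeps"]],
}

def generate(symbol):
    """Rewrite top-down (the production direction), taking the first
    alternative for each non-terminal for brevity."""
    if symbol not in rules:                 # a word: nothing to rewrite
        return [symbol]
    return [w for part in rules[symbol][0] for w in generate(part)]

print(" ".join(generate("S")))              # Nero sleeps
```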

A level in the hierarchy is characterised by the complexity of the rules it allows.  A formal language can be devised for anywhere in the hierarchy.  Human language ought to fit somewhere in the hierarchy because generative power increases with each level and is unrestricted at the highest.  Many believe that human language fits between two of the levels Chomsky originally identified, context-free and context-sensitive; this is called ‘mildly context-sensitive’.  Defining the rules for this is difficult and laborious although progress has been made, for example on Tree Adjoining Grammar (Aravind Joshi) and Combinatory Categorial Grammar (Mark Steedman).

No one has been able to define a complete set of rules for a human language.  This suggests the methodology – rules expressed as strings of words and non-terminals – is inappropriate.  LanguidSlog has therefore speculated about mental architecture and the sort of rules it could allow.  The results are promising and ought to encourage more work – not scepticism.

It’s difficult and not very useful to assert that NG is at a particular level in the hierarchy.  NG doesn’t check that a string of words is a sentence in the language; its function is to deliver conceptual propositions to cognition.  NG doesn’t work on individual words but on pairs of words, not necessarily adjacent.  NG doesn’t use non-terminals; the idea of rewrite rules is irrelevant.

Be open-minded!

The sceptics assume there is only one way of looking at language.  Clearly NG is radically different.  This piece has shown that the differences shouldn’t be ignored in order to condemn NG.  It would be better if the sceptics got an understanding of how the P / M / C / R / C / M / P construct works and how multiple instances of it are linked to form sentences.  There are good prospects for capturing this data from corpora.  NG could quickly surpass rival grammars.
