
NG7x. Comments on NG7

This unscheduled post contains my responses to several points in a comment on NG7 by HK.  I’ve also included responses to points about anaphor binding and agreement in an earlier comment on NG6 by HK.

Answering some of the points needed a wider range of formatting options than the WordPress software gives me for comments.  (If anyone knows better, please let me know.)  Therefore this is a separate post.

I’ve separated HK’s points and put them in italics, and answered each one in regular type.

Thanks, HK.  We need more stuff like yours.

Anaphor binding

Consider for example, the c-command condition on reflexive binding. It is satisfied in:

HimSELF John admires most.

This is easily accounted for if there is a trace of movement (or similar). Without it, I’m not sure what you’d have to say about facts like these.

The junction John__admires uses PJ / CJ / MJ for John and PA / CA / MA for admires.  The rule CJ / R1 / CA then delivers the proposition ADMIRE / EXPERIENCER / JOHN.

The junctions admires__himself and himself__admires both use PA / CA / MA for admires and PH / CH / MH for himself; but while the canonical sequence uses the rule CA / R2 / CH, the fronted sequence uses rules CH / R3 / CA and CH / R4 / CA.

Rules CA / R2 / CH and CH / R3 / CA both create ADMIRE / SOURCE / (null) which, needing a concept to be complete, grabs JOHN (already linked to ADMIRE) to deliver ADMIRE / SOURCE / JOHN.

Rule CH / R4 / CA delivers ADMIRE / TOPICALISED / SOURCE.

Some of this goes beyond what is discussed in LS7 and 8.  But it’s straightforward and requires no separate, ghost-in-the-machine process.
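For concreteness, here is a toy rendering of those junctions in Python.  The rule names (R1 to R4) and the role labels EXPERIENCER, SOURCE and TOPICALISED are taken from the paragraphs above; the data structures and the null-filling step are my own illustrative assumptions, not a claim about how NG is actually implemented.

```python
# A toy rendering of the junctions for "HimSELF John admires most".
# Rule names and role labels follow the text above; everything else
# (the list layout, word_meanings, the null-filling loop) is an
# illustrative assumption.

word_meanings = {"JOHN", "ADMIRE"}        # the Ms of the content words
propositions = []

# John__admires via rule CJ / R1 / CA:
propositions.append(["ADMIRE", "EXPERIENCER", "JOHN"])

# himself__admires via rule CH / R3 / CA: the reflexive supplies no
# concept of its own, so the proposition is created with a null slot.
propositions.append(["ADMIRE", "SOURCE", None])

# himself__admires via rule CH / R4 / CA:
propositions.append(["ADMIRE", "TOPICALISED", "SOURCE"])

# The incomplete proposition grabs a concept already linked to ADMIRE.
already_linked = {m for pred, rel, m in propositions
                  if pred == "ADMIRE" and m in word_meanings and m != "ADMIRE"}
for prop in propositions:
    if prop[2] is None:
        prop[2] = next(iter(already_linked))      # -> "JOHN"

print(propositions)
# [['ADMIRE', 'EXPERIENCER', 'JOHN'], ['ADMIRE', 'SOURCE', 'JOHN'],
#  ['ADMIRE', 'TOPICALISED', 'SOURCE']]
```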

Agreement

Other examples show that the trace of this kind of movement is active for agreement:

MARY John says is/*are coming to the party.

Possible junctions are prioritised.  Processing is one-pass, left-to-right.  For the nth word in a sentence, Pn / Cn / Mn, the sequence in which possibilities are tried is in principle:

Cn-1__Cn

Cn-2__Cn

Cn-3__Cn

etc

But not all of these will exist as a rule in the ‘lexicon’.  And those that are in the lexicon may not be available because a word can occur in a sentence only once as dependent.  Subject to that, the backwards scan stops when a valid rule for Cn is found.  (No ghost-in-the-machine is required if a gradient of decaying activation of preceding words gives the right sequence.)

In the given sentence, Mary and John might turn out to be coordinated, but no junction is recognised without and.  John__says prevents Mary__says and John__is; but it allows Mary__is.  Mary__are is not in the lexicon.  Etc.
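A minimal Python sketch of that backwards scan, under some simplifying assumptions: the lexicon is reduced to a set of licit category pairs, the earlier word of each junction is taken to be the dependent, and the possibility that one word contributes through more than one rule is ignored.  None of the names below come from NG itself.

```python
# Simplified backwards scan: for the nth word try C(n-1)__Cn, C(n-2)__Cn, ...,
# skip words already used as a dependent, and stop at the first pair the
# lexicon licenses.  All identifiers here are illustrative assumptions.

def junctions(words, category, lexicon):
    found = []
    used_as_dependent = set()           # indices of words already dependent
    for n in range(len(words)):
        for k in range(n - 1, -1, -1):  # scan backwards from the previous word
            if k in used_as_dependent:
                continue                # a word is a dependent only once
            if (category[words[k]], category[words[n]]) in lexicon:
                found.append((words[k], words[n]))
                used_as_dependent.add(k)
                break                   # valid rule found: stop the scan
    return found

category = {"Mary": "Cnoun", "John": "Cnoun", "says": "Cverb", "is": "Cverb"}
lexicon = {("Cnoun", "Cverb")}          # toy rule: a noun may depend on a verb
print(junctions(["Mary", "John", "says", "is"], category, lexicon))
# [('John', 'says'), ('Mary', 'is')] -- Mary__says and John__is are blocked
```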

Rules

So, the lexicon contains rules that define licit joins of categories. It would help my understanding greatly if these rules were spelled out a bit. Give some examples. What kind of Rs are there?

C / R / C rules are a bit abstract and so I make no attempt to spell them out.  However there will be plenty of examples of the M / R / M propositions that result from incoming phonological words: see stuff in small caps.  I don’t venture much beyond predicate-argument relations.  Using labels from theta theory gives us some common ground and is not too misleading.

However I’m undecided on how much semantics is brought to the delivered proposition by the relation and how much by the predicate.  I tend to think of the relation as being syntactic – see my definition of R in LS7.  Trivially, that links back to the diagrams in earlier pieces.  But my instinct is to have the smallest number of relations for word__word junctions – to allow those Rs to be plausibly innate.

Rs are concepts and could be anything.  A wider range of Rs might occur in propositions derived from prosody, gesture etc.  These might be learned rather than innate.

Subcategorisation

And if Cs are just categories rather than words, then how can the rules capture basic subcategorization patterns? A key fact about language is that some relations are encoded between items that are not adjacent (e.g. John gave Mary a book, or LUCY John kissed).

This follows naturally from words and rules as in LS7.  For example:

[Figure: Subcategorisation]

Anything that has C2 can attach as AGENT to GIVE, LEND, SELL etc; any C3 as THEME; any C4 as GOAL.

Verbs give, lend and sell are listed as syntactically identical in Levin (1993).  But arguably give is different because it allows an inanimate AGENT:

(i) Bouillabaisse gave John salmonella

(ii) * Bouillabaisse sold John salmonella

This distinction can be treated as syntactic by assuming different Cs for the AGENT rules of GIVE and of LEND and SELL.  Attributing the unacceptability of (ii) vaguely to ‘semantic processing’ would be less satisfying.

So, while the C for BOUILLABAISSE is not allowed by the rule for SELL/AGENT, it is for SELL/THEME:

(iii) Bouillabaisse sold in Marseille…

Therefore the reduced relative form of (iii) can never give a ‘garden path’.  Interestingly, my own analysis of the British National Corpus showed that the reduced form is predominant for relative clauses, while garden paths are vanishingly rare.
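A hedged sketch of how that syntactic treatment might look.  The category names C2 and C3 come from the text above; the extra label C2a, standing for the wider AGENT category accepted by GIVE but not by LEND or SELL, is invented here purely for illustration.

```python
# Toy category assignments and AGENT rules.  C2a is an invented label for
# the extra category that GIVE (but not LEND or SELL) accepts as AGENT.
AGENT_CATEGORIES = {
    "GIVE": {"C2", "C2a"},
    "LEND": {"C2"},
    "SELL": {"C2"},
}
WORD_CATEGORIES = {
    "JOHN":          {"C2"},
    "BOUILLABAISSE": {"C2a", "C3"},   # licit AGENT of GIVE, and licit THEME
}

def licit_agent(noun, verb):
    return bool(WORD_CATEGORIES[noun] & AGENT_CATEGORIES[verb])

print(licit_agent("BOUILLABAISSE", "GIVE"))   # True  -- (i)
print(licit_agent("BOUILLABAISSE", "SELL"))   # False -- (ii)
```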

Another illustration:

(iv) Nero gave Olivia to Poppaea

(v) Nero restored Olivia to Poppaea

(vi) * Nero refused Olivia to Poppaea

(vii) Nero gave Poppaea Olivia

(viii) * Nero restored Poppaea Olivia

(ix) Nero refused Poppaea Olivia

The verb restore accommodates one object, as THEME.  A second object cannot attach anywhere and the incomplete proposition it leaves signals ‘ungrammaticality’.  There are rules allowing restored__to and to__Poppaea which together deliver RESTORE / GOAL / POPPAEA.

The verb refuse is ditransitive like give, except that the first object is always GOAL, not temporarily shared between GOAL and THEME (as shown in LS8).  Also there is no rule allowing refused__to.
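The rule differences just described can be summarised as a small table and checked mechanically.  The frame labels ‘NP to NP’ and ‘NP NP’ below are my own shorthand, not NG notation.

```python
# Which post-verbal frames each verb's rules license, per the discussion:
# 'NP to NP' needs a verb__to rule; 'NP NP' needs a second object slot.
# Purely an illustrative summary of examples (iv) to (ix).
LICENSES = {
    "gave":     {"NP to NP", "NP NP"},   # (iv) and (vii)
    "restored": {"NP to NP"},            # (v); no second object, so *(viii)
    "refused":  {"NP NP"},               # (ix); no refused__to rule, so *(vi)
}

def acceptable(verb, frame):
    return frame in LICENSES[verb]

print(acceptable("restored", "NP NP"))     # False, matching *(viii)
print(acceptable("refused", "NP to NP"))   # False, matching *(vi)
```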

Adjacency

Furthermore, (a counterpart) items may be adjacent in a string and not in an appropriate selection relation, where in your proposal perhaps they ought to be? (For example, the verb and adjective in “John kissed beautiful girls”).

This is covered above under Agreement.  Nowhere does LanguidSlog say that paired words must be adjacent.  LS8 discusses a sentence with give, showing non-adjacent pairings.

Relation

The picture looks nice, but the proposal advanced here remains very unclear. If I’ve understood it correctly, R is a semantic relation (say ‘patient’), not a syntactic relation.

P, M, C and R concepts and P / C / M and C / R / C propositions are logically fundamental to language.  (Indeed concepts and propositions must be fundamental to all mental processes and to the storing of ‘knowledge’, innate and acquired.)  They are implemented directly in the ‘bits and bytes’ of the neurophysiological hardware.  Therefore it’s not really appropriate to characterise R as ‘syntactic’ or as ‘semantic’.

But it would not be too misleading to say ‘an R is syntactic in the C / R / C rule and semantic in the M / R / M proposition delivered to cognition’.

For an explanation of my tactics, see Rules above.

Constituency

 ‘No junction can include another junction’ implies that the proposal will fail to take account of constituency.

The proposal accounts for empirical data that are accounted for elsewhere by constituency.  LS2 to 6 (plus my initial response to HK’s comment on LS6) show why those other accounts describe but fail to explain.

Of course the amount of empirical data addressed using the proposal to date is a tiny fraction of what has been addressed elsewhere.  LS7 is only to discourage instant dismissal of LS2 to 6 because ‘there is no alternative’.  And the whole of LanguidSlog will not answer every mystery of language revealed by the many thousands of man-years that have been expended on generative grammar.

Generative capacity

Dependency grammars typically have the generative capacity of context-free phrase structure grammars.  Grammars that fit your proposed computational arrangement probably have no more generative capacity than a regular grammar (which would be far too weak for natural language). But I may not have understood the proposal very well.

The generative capacity of the proposal is constrained by the M / P / C / R / C / P / M sub-assembly and by the backwards scan (see Agreement above).  There is no algebra to express that and no plan to formulate one.  But my strong impression is that the limits of the proposal and of acceptable English sentences are closely aligned, although long-distance junctions have not been tackled yet.

I don’t understand ‘Grammars that fit…regular grammar’ and wonder if this repeats the misunderstanding about Adjacency (above)?  Ironically, parsing a regular language would need a stored-program computer.

Prestored sentences

Your ‘every possible sentence of an idiolect is pre-stored’ flies in the face of the well-known observation that we can understand sentences we have never heard before, as long as we know the words that are contained in it.

‘Pre-stored’ doesn’t mean the sentence has already been heard or voiced.  It means that the mental network has a great many paths through it – infinitely many because of recursion.  For once I’m not disputing the orthodoxy, but simply making a point about real-time computation.

Mr Nice-Guy

2 comments

  1. KA says:

    I’m going to pick up this very old thread here, simply because, from the perspective of many syntacticians, the issue will be important. HK, a long time ago, raised the question of the generative capacity of the proposal:

    “Dependency grammars typically have the generative capacity of context-free phrase structure grammars. Grammars that fit your proposed computational arrangement probably have no more generative capacity than a regular grammar (which would be far too weak for natural language). But I may not have understood the proposal very well.”

    Mr NG answered in two paragraphs as follows:

    “The generative capacity of the proposal is constrained by the M / P / C / R / C / P / M sub-assembly and by the backwards scan (see Agreement above). There is no algebra to express that and no plan to formulate one. But my strong impression is that the limits of the proposal and of acceptable English sentences are closely aligned, although long-distance junctions have not been tackled yet.”

    The second paragraph reads: “I don’t understand ‘Grammars that fit…regular grammar’ and wonder if this repeats the misunderstanding about Adjacency (above)? Ironically, parsing a regular language would need a stored-program computer.”

    The second sentence in Mr NG’s second paragraph is wrong unless Mr NG thinks that finite state machines require stored programs: This is so, because the set of regular languages (viewed as strings) is equivalent to the set of languages that can be recognized by a finite state machine. I can’t understand what Mr NG is claiming here and I will have to set this last sentence aside.

    This leaves us with the first paragraph, in which Mr NG professes no interest in studying the formal properties of the proposed grammar and in which Mr NG offers up a hunch. We thus have two conflicting hunches, HK’s and Mr NG’s: HK suspects that the machine has only the power of a finite state machine; Mr NG suspects that the limits of the proposal are closely aligned with those of acceptable English.

    Let’s move beyond hunches.

    The machine consists of a finite network of nodes. Since there are only finitely many, we can number them. So let’s say your machine consists of nodes n_1… n_k (underscore is intended to be read as subscripting). Each node can apparently be in one of seven discrete activation states: no activation (0) to full activation (6). Therefore the state of the machine at any given time can be described by the k-digit number d_1 d_2…d_k in base seven, where each digit d_i represents the state of node n_i at that moment. The state-space of the machine is therefore finite.

    The machine accepts words from left to right. These are drawn from a finite set of words, a set we might call the machine’s alphabet (A). Whenever a word is encountered, the machine transitions deterministically from its current state into a new state. The machine appears to start in a state where no node is activated, i.e., the machine is in the initial state 0, and it seems to return to this state at the end of a successful parse.

    Thus, the machine is characterized completely by the following quintuple:

    M=<A, S, T, s, e>

    where A is the finite alphabet (the set of words), S is the finite set of states, T is the finite set of state transitions t from SxAxS, s is the initial state, and e is the final state (where it seems that s=e=0). This is a finite state machine.

    We can now say that the machine accepts a sentence w_1…w_j iff there is a sequence D=<t_1,…,t_j>, such that all t_i are from T, t_1 is a member of {s}x{w_1}xS, t_j is a member of Sx{w_j}x{e}, and, for all pairs t_i, t_i+1: t_i is in Sx{w_i}xS, t_i+1 is in Sx{w_i+1}xS, and if t_i is in SxAx{s_i} then t_i+1 is in {s_i}xAxS.

    The language recognized by any machine that meets these descriptions is a regular language.

    Unless there is a mistake in the proof or the assumptions built into it, HK’s hunch is upheld. The machine recognizes a regular language. By common agreement in the field, this is not a very good fit for natural languages.
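    (For concreteness, a tiny executable rendering of the quintuple just defined, with an invented two-state machine and toy alphabet standing in; it illustrates the definition of a finite state acceptor, not NG itself.)

    ```python
    # A toy instance of M = <A, S, T, s, e>.  The alphabet, states and
    # transitions below are invented for illustration only.
    A = {"Mary", "says", "is", "coming"}              # finite alphabet
    S = {0, 1}                                        # finite set of states
    T = {(0, "Mary", 1), (1, "says", 1),
         (1, "is", 1), (1, "coming", 0)}              # transitions from SxAxS
    s, e = 0, 0                                       # initial and final state

    def accepts(sentence):
        state = s
        for word in sentence:
            nxt = [q for (p, w, q) in T if p == state and w == word]
            if not nxt:
                return False                          # no licit transition
            state = nxt[0]
        return state == e

    print(accepts(["Mary", "says", "is", "coming"]))  # True
    print(accepts(["Mary", "Mary"]))                  # False
    ```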

    • Mr Nice-Guy says:

      Thanks, KA. Did HK misunderstand what, since LS9, I’ve been calling ‘NG’? His comment on adjacency suggests he assumed the words in a sentence are simply processed one at a time in left-to-right sequence, each of these interacting with the state left after the previous word. Points about NG he seems to have missed are: (1) the lexicon (‘alphabet’ if you like) contains rules as well as words – see LS7; (2) a particular word may contribute to the analysis (‘parse’ if you really must) more than once via more than one rule; (3) ‘word’ in NG means a word in the context of another word, the junction (‘dependency’ if you like) being allowed by a rule; (4) because a word may retain activation, it may form a junction with another word in the sentence any number of places to the right; (5) the propositions created by different rules may interact; (6) what is delivered to cognition is a bundle of propositions, not a ‘parse’ as conventionally envisaged for PSGs.

      Does that fit your definition of ‘finite state machine’?

      Other points … Re ‘stored-program computer’, see Wikipedia on FSMs, section on Implementation. Re ‘algebra’, this was merely a comment on my priorities and the fact that NG was and is work-in-progress requiring a lot of fresh thinking.

      It’s understandable that devout Chomskyans tackle heresy by attacking alternatives rather than defending mainstream generative grammar. Even if NG proves to be wrong the issues discussed in LS1 to LS6 – structure, mental architecture, lexicon – will remain.
