History of the Spanish clitic pronouns

1. Etymology
The Spanish clitic pronouns are the weak object pronouns me, te, lo(s), la(s), le(s), nos and os (previously vos), together with reflexive se. Of these, the items with first- or second-person reference, together with reflexive se, descend from the corresponding accusative forms of the Latin personal pronoun:

mē > me
tē > te
nōs > nos
vōs > vos > os
sē > se

Latin had no bespoke non-reflexive personal pronoun in the third-person category, and instead used its demonstrative pronouns for this purpose. In the variety from which Spanish later emerged, usage must have converged on the distal demonstrative ĭlle ‘that one’, as it is from accusative and dative forms of this item that the Spanish third-person object pronouns descend, as is shown in Table 1 below.

Table 1  Origin of the Spanish third-person (non-reflexive) object clitics

Direct object                     Indirect object
Masculine        Feminine
ĭllum > lo       ĭllam > la       ĭllī > le
ĭllōs > los      ĭllās > las      ĭllīs > les

Neither the forms of the Latin personal pronoun nor those of the demonstrative ĭlle were originally clitics, i.e. linguistic expressions lacking prosodic independence and needing to attach phonologically to an adjacent word. Self-evidently, however, these items must have evolved into clitics when they occurred as the direct or indirect object of the verb. As part of this process, the relevant forms of ĭlle also underwent apheresis of their initial syllable: (ĭl)lum > lo, (ĭl)lam > la etc.

An additional point to note is that, when immediately followed by ĭllum, ĭllam, ĭllōs or ĭllās, as in e.g. ĭllī ĭllum dedit ‘he/she gave it to him/her’, the singular dative form ĭllī gives ge rather than le in Old Spanish. This is due to the final vowel of ĭllī evolving into the semivowel [j] before the initial /e/ (= ĭ) of the following pronoun, so that, for example, ĭllī ĭllum became [ljelo] once apheresis of ĭllī’s initial syllable had occurred. The sequence [lj] developed regularly to /ʒ/ in Old Spanish, whence the form ge, pronounced /ʒe/. With the devoicing of the sibilants, this latter pronunciation gave way to /ʃe/, whereupon ge was amended to se, by analogy with the reflexive pronoun se (< sē). The overall development is shown below, using ‘†’ to indicate an analogical form:

ĭllī ĭllum > gelo > †se lo
ĭllī ĭllam > gela > †se la
ĭllī ĭllōs > gelos > †se los
ĭllī ĭllās > gelas > †se las

In this use, se is functionally equivalent to either le or les, depending on the context. For example, se lo di can mean either ‘I gave it to him/her’ or ‘I gave it to them’. The same was true of Old Spanish ge, implying that at some point in the pre-Old Spanish period, the singular dative pronoun ĭllī displaced its plural counterpart ĭllīs in this particular context. Had this not occurred, the final /s/ of ĭllīs would have blocked any semivocalization of the ī in its final syllable and, as a consequence, its reflex before lo, la etc. would be les, as it is in all other contexts.

2. Clitic placement in Old Spanish
The placement of object pronouns in Classical Latin gives little indication of what was to follow in Old Romance. In the example below, from Cicero, the pronoun me appears sentence-initially and is separated from its verb adduxerunt by three syntactic constituents, viz. the subject tuae litterae, the negative adverb numquam and the prepositional complement in tantam spem. On both counts, its placement would infringe the constraints that governed the positioning of object pronouns in the medieval Romance languages.

Me tuae litterae numquam in tantam spem adduxerunt quantam aliorum;
‘Your letters never brought me into as much hope as did those of others’
(Cic. Att. 3.19.2)

In later Latin, or more generally in spoken Latin, it is possible that a tendency emerged whereby object pronouns that were not pragmatically salient, for example as foci or as contrastive topics, were positioned immediately after the first element in the sentence, excluding any sentence-initial peripheral elements, such as hanging topics or certain types of temporal or locative adverbial (cf. Salvi 2011: 363). If so, this tendency must have been progressively consolidated, as it is manifested rather more clearly by the weak object pronouns of Old Spanish and indeed of Old Romance generally.

2.1 Main clauses
In Old Spanish there was a fundamental distinction between main clauses and finite subordinate clauses as regards the linearization or placement of clitic pronouns. While proclisis (immediate preverbal placement of the clitic) was the rule in finite subordinate clauses, weak pronouns in main clauses were often enclitic on the verb (i.e. the finite verb). Enclisis was in fact obligatory whenever the clitic would otherwise be sentence-initial, as in (1) below.

(1)        Embiaron le entonce los griegos de macedonja mandado que los acorriesse apriessa.
             ‘The Greeks of Macedonia then sent him a message to come to their assistance quickly.’
             (General estoria V, fol. 120 v.)

The prohibition on sentence-initial clitics in Old Spanish, and indeed Old Romance generally, is known as the Tobler-Mussafia law, which has been much discussed in the syntactic literature. In its conventional format, the Tobler-Mussafia law is actually too weak, because although enclisis was obligatory when the clitic would otherwise be the first item in the sentence, it was not restricted to this type of case. Indeed, enclisis was actually quite common in cases in which proclitic placement would not have resulted in the clitic being in sentence-initial position (see examples (6) to (8) below).

In reality, certain types of preverbal element were invisible to the clitic linearization algorithm, in the sense that even though their presence would have prevented the clitic from being sentence-initial had it been placed before the verb, it was still positioned after the verb. To understand the distinction between elements that were ‘invisible’ for clitic linearization and those that were not, it is probably easier to consider the latter class first, i.e. the class of elements that were proclisis attractors.

The most consistent proclisis attractor was the negation marker no(n), the presence of which invariably caused the clitic to be preverbal in main clauses, as is illustrated in (2):

(2)       Non les pudo dar la tierra queles prometiera.
            ‘He was unable to give them the land that he had promised them.’
            (General estoria I, fol. 213r)

In the same way, objects and prepositional complements, together with certain types of adverb, that were moved from a postverbal position to a preverbal one (i.e. before the finite verb) also triggered obligatory proclisis. In (3) to (5) below, the item which has been moved from its usual postverbal position is underlined (and both it and the proclitic weak pronoun are in bold font):

(3)        E esta alcaria les damos en camio de Solucar de Albayda e de Brenes
             ‘And this hamlet we give them in exchange for Solúcar de Albaida and Brenes’
             (Alfonsine privilegio, 14 March 1272, Murcia; Seville: Arch. Cat. IX. 3. 56)

(4)        Mas por este nombre le llama sant bernaldo en la glosa
             ‘But Saint Bernard calls him by this name in the gloss.’
             (General estoria IV, fol. 182v)

(5)        Ca el juyzio que es cosa muy derecha manifiestamientre se deue dar. & no en encubierto.
             ‘For a trial, which is a very just event, must take place openly and not in obscurity.’
             (Libro de las leyes, fol. 48v)

Note that the structure in (3) is distinct from the common modern structure known as clitic left dislocation, in which a sentence-initial phrase is resumed by a co-referential clitic. In (3), the object esta alcaria is not resumed by a clitic (or by anything else); it has simply been moved from its usual postverbal position to a preverbal one, an operation which is now quite rare but by no means impossible. Interestingly, while object fronting as in (3) triggered obligatory proclisis in Old Spanish, clitic left dislocation, which was much less common than in the modern language, usually correlated with enclisis (see Mackenzie 2019: 35).

The common feature of proclisis attractors in Old Spanish is that they were syntactic elements that were highly integrated in clause structure: items like the negation marker, together with objects and other complements, are the basic syntactic building blocks of any language. In contrast, the elements that were ‘invisible’ for clitic linearization were syntactically peripheral, i.e. less tightly integrated in clause structure. This category includes items such as naturally preverbal locative or temporal adverbials, sentence-initial topics and the like, all items which can usually (though not always) be separated from the rest of the clause by an intonation break.

In the example below, the adverbial phrase otro dia is a peripheral element of the relevant kind. That is to say, it is not part of the clause’s core syntax, which is reflected in the fact that a pause can naturally be inserted after this element. As a consequence, it is invisible for clitic linearization purposes and the weak pronoun le comes after the verb enviaron, exactly as it would do if nothing preceded the verb:

(6)         Otro dia enuiaron le dezir que saliesse al campo a lidiar con ellos.
             ‘On another day they sent word asking him to come out to the battlefield to fight with them.’
             (Estoria de España II, fol. 196v)

Similarly, in (7) below the preverbal conditional clause is peripheral to the core syntax, the following finite verb faze then coming before its clitic rather than after it. Notice that the scribe appears to have sensed the syntactic separateness of the preverbal conditional clause, as he inserted a semicolon immediately after it, which in the medieval manuscripts usually indicates a break in the intonation.

(7)        Et si la mezclaren con mestranto; faze se della emplastro muy bono pora la ferida del alacran.
             ‘And if it is mixed with wild mint, it makes a very good poultice for a scorpion’s sting.’
             (Lapidario, fol. 72r)

An interesting case is that of the preverbal subject, which has a variable effect on clitic linearization in Old Spanish. Frequently this item co-occurred with enclisis, implying that it was structurally peripheral:

(8)         Et ell Emperador recibio lo muy omildosa mientre. & muy sancta.
             ‘And the emperor received it with great humility and piety.’
             (Estoria de España II, fol. 272 r.)

On the other hand, it also occurred commonly with proclisis, suggesting full integration within clause structure:

(9)         Ca la mugier pare al Rey. la mugier le aduze a uida.
             ‘For woman gives birth to the king. Woman brings him into life’
             (General estoria IV, fol. 125 v.)

The variation indicated by (8) versus (9) can perhaps be linked to the fact that Old Spanish (like modern Spanish) was a null subject language, which implies that the true syntactic subject is frequently an unpronounced subject pronoun rather than an overt linguistic element. If that was the case in (8), the preverbal phrase ell Emperador would turn out not to be the true syntactic subject, but rather a left-peripheral topic, resumed by an unpronounced subject pronoun; i.e. the meaning would be ‘And the emperor, he received it . . .’. Under that analysis, pronominal enclisis would be expected. The supposition as regards (9) would then be that, in the latter type of case, the apparent subject is the true syntactic subject and not a left-peripheral topic, with proclisis following accordingly. For a more detailed discussion, including quantitative data, of clitic linearization with preverbal subjects in Old Spanish, see Mackenzie 2019 (pp. 87–89 and p. 95).

Adapting some of the ideas in the literature (see e.g. Rivero 1991, 1993 and Fontana 1993), the pattern outlined so far can be analysed in terms of the clitic having a fixed position to the left of the finite verb’s customary position (designated as T in the theoretical literature). This yields the proclitic linearization exemplified by (2) to (5), together with (9). To derive the enclitic linearization illustrated by example (1), together with (6) to (8), the verb can be analysed as moving leftwards across the clitic, as shown schematically in (10), where the copy of the verb in strikethrough font indicates its position prior to reordering and the brackets demarcate the core clause:

(10)        Otro dia [enuiaron le enuiaron dezir que saliesse . . .]

In the spirit of the Tobler-Mussafia law, the operation schematized in (10) can be seen as a device to prevent a clitic from being the first element in the core clause, i.e. excluding peripheral items such as sentence-initial topics, framing adverbials and the like.

This does not, of course, explain why the medieval clitic shunned that position so consistently. And there is, in fact, no general agreement as to why this was the case. However, a plausible conjecture would be that the Tobler-Mussafia typology associated with the high Middle Ages is the residue of an earlier stage in the language’s evolution, late spoken Latin, say, or pre-literary Spanish, during which the weak pronouns were, by hypothesis, inherently enclitic or ‘left-leaning’. In this view, they originally had to attach phonologically to a host expression, not necessarily the verb, to their left. Over time, this constraint lost its phonological basis, but the requirement for there to be a word or phrase to the clitic’s left persisted, although reinterpreted in syntactic terms, i.e. with phonological attachment between the clitic and the expression to its left being superseded by structural proximity.

The conjunction et, y, e, &

Many clauses in the Old Spanish manuscripts are introduced by et, y, e or &, all meaning ‘and’. When such clauses are not part of a larger subordinate clause, they behave exactly like independent sentences as regards clitic linearization, although the coordinating conjunction itself is invisible to the linearization algorithm. This invisibility is a consequence of the fact that words meaning ‘and’ are coordinating conjunctions, implying that they are syntactically external to the constituents which they link together. Pronominal enclisis was thus mandatory (to begin with, at least) in main clauses in which proclisis would bring the clitic into direct adjacency with et, y, e or &:

(11)       Et fizieron les y luego en valladolit las bodas.
             ‘And they held the wedding for them there in Valladolid.’
             (Estoria de España II, fol. 289 v.)

(12)       & al tercero dia tornos le la enfermedat. & touieron lo por muerto.
             ‘And on the third day the sickness returned and they held him for dead.’
             (Judizios de las estrellas, fol. 200 v.)

Note that in (12) the first clause also exhibits enclisis, specifically of the array or combination se le, the se component being reduced to s and being orthographically attached to the verb form torno. In this case, however, the clitic array would not be immediately adjacent to & if it was proclitic. The reason for the enclisis is that the preverbal phrase al tercero dia is the same type of element as otro dia in example (6), i.e. it is a peripheral element that is invisible to the linearization algorithm.

2.2 Finite subordinate clauses
In contrast to main clauses, finite embedded or subordinate clauses in Old Spanish exhibit systematic proclisis from the earliest documented period, implying that the basic linearization pattern in this context has remained unchanged. Example (13) below provides a typical illustration:

(13)      Et enuiaron le pedir que les diesse cada anno .L. donzellas de las mas fijas dalgo
            ‘And envoys were sent to request that he give them each year 50 young women of the highest birth’
            (Estoria de España II, fol. 23v)

For additional illustrations, see the embedded que-clauses in examples (1) and (2), as well as the preverbal si-clause in (7).

Note that the requirement for proclisis in subordinate clauses overrides the principle that clitics did not occur immediately after et, y, e or &. Thus in (14) below, la is proclitic in relation to touiere, despite coming immediately after et. This is because the coordinate clause containing the clitic is itself contained within a subordinate clause, viz. the conditional clause si fuere negro et la touiere alguno consigo.

(14)       Et si fuere negro et la touiere alguno consigo. recabdara todo lo que quisiere con los omnes.
             ‘And if it is black and someone carries it on himself, he will achieve anything he wants with other men.’
            (Lapidario, fol. 111v)
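
The placement patterns described in sections 2.1 and 2.2 can be summarized as a small decision procedure. The Python sketch below is only an idealized illustration, not an analysis taken from the literature cited here: the category labels (conjunction, peripheral, negation, fronted, subject) and the names Preverbal and place_clitic are hypothetical, introduced for this sketch alone. It treats every pattern as categorical, whereas the text above notes that some of them were only tendencies, and it leaves to the caller the decision whether a preverbal subject is a left-peripheral topic (labelled peripheral) or the true syntactic subject (labelled subject).

```python
from dataclasses import dataclass

# Hypothetical labels, introduced only for this sketch.
INVISIBLE = {"conjunction", "peripheral"}        # et/y/e/&; framing adverbials, topics
ATTRACTORS = {"negation", "fronted", "subject"}  # elements integrated in the core clause


@dataclass
class Preverbal:
    form: str  # the word or phrase as it appears before the finite verb
    kind: str  # one of the labels in INVISIBLE or ATTRACTORS


def place_clitic(preverbal, subordinate=False):
    """Return 'proclisis' or 'enclisis' for a finite clause (idealized)."""
    for item in preverbal:
        if item.kind not in INVISIBLE | ATTRACTORS:
            raise ValueError(f"unknown kind: {item.kind!r}")
    # Finite subordinate clauses: proclisis, whatever precedes the verb.
    if subordinate:
        return "proclisis"
    # Main clauses: proclisis only if some visible (integrated) element
    # precedes the verb; otherwise the verb comes before the clitic.
    if any(item.kind in ATTRACTORS for item in preverbal):
        return "proclisis"
    return "enclisis"


# Example (1): nothing precedes the verb (Embiaron le ...).
print(place_clitic([]))
# Example (2): negation (Non les pudo dar ...).
print(place_clitic([Preverbal("Non", "negation")]))
# Example (12): conjunction plus framing adverbial (& al tercero dia tornos le ...).
print(place_clitic([Preverbal("&", "conjunction"),
                    Preverbal("al tercero dia", "peripheral")]))
# Example (14): coordinate clause inside a conditional clause (et la touiere ...).
print(place_clitic([Preverbal("et", "conjunction")], subordinate=True))
```

The subordinate check comes first because, as (14) shows, the requirement for proclisis in subordinate clauses overrides the principle that clitics did not immediately follow et, y, e or &.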

2.3 Infinitival clauses
The two most frequent types of infinitival clause in Spanish are those that follow a preposition, such as para ‘for/to’, de ‘of/from’, a ‘to/at’ or en ‘in/by’, and those that follow a so-called restructuring or clause-union verb such as querer ‘want’, deber ‘have to’, poder ‘be able to’ or causative hacer ‘make’. Verbs of this type form a relatively tight syntactic unit with the following infinitive, and hence may attract a clitic that is structurally associated with the infinitive into their own orbit, producing a phenomenon known as clitic climbing (also: clitic promotion).

As regards infinitival clauses that were introduced by a preposition, these exhibited variation in Old Spanish between proclitic and enclitic placement of the weak pronoun. This is illustrated in the two examples below.

(15)      Ca estos siempre punnan de los embargar que se no saluen.
            ‘For they continually strive to prevent them from being saved.’
            (Libro de las leyes, fol. 78r)

(16)       et uenran contra las uillas del Rey por acercar las. & conquerir las.
              ‘and they will attack the king’s towns in order to surround them and to conquer them.’
              (Libro de las cruzes, fol. 41v)

In infinitives governed by clause-union verbs, the clear tendency in Old Spanish was for the clitic to be placed immediately before the finite verb, as in (17) below, rather than for it to attach enclitically to the infinitive (as is common in modern Spanish):

(17)       Mas los franceses non lo quisieron fazer.
              ‘But the French did not want to do it.’
              (Estoria de España II, fol. 195 v.)

An additional possibility was for the clitic to go between the finite and non-finite verbs, as in (18) below:

(18)        Et el quisiera lo fazer mas yo nol quis oyr.
               ‘And he had sought to do it but I refused to listen to him.’
               (General estoria II, fol. 85 r.)

Note that in the type of case illustrated by (18) the weak pronoun should be analysed as being enclitic on the finite verb rather than proclitic on the infinitive. This can be inferred from the fact that the ‘clitic-medial’ pattern shown in (18) is almost never found in Old Spanish when the finite verb is preceded by a categorical proclisis attractor, such as negation (see Mackenzie 2019: 77–78). This gap in the data would be unexpected if the medial clitic was capable of being proclitic on the infinitive.

3. Transition to the modern system
In modern Spanish, enclisis on a finite form of the verb is only possible in (non-negative) imperative clauses. This implies that the reordering operation schematized in (10) no longer occurs in either declarative or interrogative finite clauses. Conversely, the modern Spanish clitic is able to occur freely in sentence-initial position, as is illustrated in the example below:

(19)        Lo iban a mandar a Portugal [. . .]
               ‘They were going to send him to Portugal . . .’
               (Salvador Garmendia, Los pies de barro, 1973)

Thus weak pronouns in Spanish have undergone a fundamental transformation. The old Tobler-Mussafia clitic, with variable linearization conditioned by syntax, has been replaced in all finite contexts (other than positive imperatives) by what amounts to an inherently proclitic item.

This transformation was largely complete by the early seventeenth century, as the unbroken black line in Figure 1 below (from Mackenzie 2019: 92) indicates.

In infinitival clauses, there have been two major developments. First, proclisis has been entirely lost. In the specific context of infinitival clauses that follow a preposition (cf. (15) above), the history of this change is actually quite intriguing, because the usage trend appears to have moved significantly in one direction in the late Middle Ages and then in quite the opposite direction in the early modern period. As the unbroken grey line in the above figure shows, proclisis increased in this context from about the 1270s until the 1430s, after which time it declined quite quickly and was almost extinct by the beginning of the seventeenth century. For extensive discussion of this and other clitic-related ‘failed changes’, see Mackenzie 2019: 97–108.

The other major development affecting infinitives has been a significant increase in the frequency of enclisis on the infinitival component of clause union structures (see Davies 1997). This has been at the expense of both the pattern illustrated by (17) and that illustrated by (18), the latter having in fact completely died out (in declarative and interrogative clauses). The loss of the latter pattern should be seen as a direct consequence of the loss of enclisis in finite clauses (other than positive imperatives), given that the medial placement option in clause union structures implies finite enclisis rather than infinitival proclisis.

