Philippe Blache (CNRS & Universite d'Aix-Marseille, France)
Evaluating complexity in syntax: a computational model for a cognitive architecture
Measuring language complexity requires a precise definition of the criteria to be evaluated. Unfortunately, this question is far from simple. In this talk we propose a method that focuses on syntactic complexity. Classically, such complexity is calculated as a function either of the syntactic structure (the number of nodes, their embedding) or of the computational complexity of the parsing procedures. However, such proposals depend on the syntactic formalism, whatever the theoretical framework (be it constituency-based or dependency-based). In fact, these approaches do not measure syntactic complexity directly; rather, they evaluate the complexity of building a syntactic structure, which is an indirect evaluation. The question is then: is it possible to identify syntactic properties that can be evaluated directly, independently of a given structure, and that thus contribute to the elaboration of a complexity model? We present an approach implementing such a solution by means of constraints. We will first present the formal framework, then the computational model for the evaluation of syntactic complexity. This model can be used to predict the difficulty of language processing by human subjects. We will present different experiments illustrating the relevance of the model and propose a cognitive architecture for language processing.
Alex Housen (Vrije Universiteit Brussel, Belgium)
L2 complexity – A Difficult(y) Matter
In this talk I offer a critical evaluation of current second language acquisition (SLA) research on complexity. First I discuss the general purposes for analysing complexity in L2 studies to demonstrate that, although it figures as a prominent research variable in various strands of L2 research, it has rarely been investigated for its own sake. Then I take stock of the different ways in which complexity has been defined, interpreted, operationalised and measured in L2 research, and point to the often contradictory claims, circular reasoning, impoverished measurement practices and inconsistent empirical findings that characterize much L2 complexity research. Next, I suggest ways in which this state of affairs can at least partly be redressed by proposing a narrower conceptualisation and a more rigorous operationalisation of L2 complexity as a research construct, and by distinguishing between two complexity-related constructs which in my view are particularly relevant to theory construction in L2 research: structural complexity, which relates to the 'what' of SLA, and cognitive complexity (or: difficulty), which speaks to the 'how' of SLA. These two constructs will be characterized in more detail and valid operationalisations for each will be considered.
Frederick J. Newmeyer (University of Washington, University of British Columbia, Simon Fraser University)
The question of linguistic complexity: historical perspective
The standard view among non-linguists is that languages can differ tremendously in relative complexity. However, in the late 20th century a consensus developed among the world’s professional linguists that all languages are equally complex. Such is still probably the majority position, though it has become increasingly challenged. Three factors led to this consensus. The first factor is humanistic: Since all people of the world are in some sense equal, their languages must be as well. The second is based on language use: In order for languages to be useable, any complexity in one area must be counterbalanced by simplicity in another area. The third is theory-internal: Universal Grammar, by its very nature, demands equal complexity. All three arguments for equal complexity have been subject to serious critiques in recent years. As far as the first is concerned, it has been argued that nothing can be concluded about the cognitive abilities or culture of a people on the basis of whether their language’s morphological system, say, is simple or complex. Secondly, many examples have been brought forward to show that increasing simplicity in one area of grammar does not necessarily lead to increasing complexity in another. For example, languages with large vowel inventories tend to have large consonant inventories as well. Third, the Universal Grammar-based arguments for equal complexity do not hold up when scrutinized carefully. Even models of formal syntax which attribute a great deal of innate knowledge to language users, like the Principles-and-Parameters approach, for example, allow for language-particular syntactic and morphological detail, which could well increase the degree of complexity.
After a brief discussion of the difficulties in actually measuring how one language might be judged as more or less complex than another, the presentation concludes with a discussion of the historical and sociological factors that have been invoked as influencing increases and decreases in relative complexity. The factors adduced include the relative size of the linguistic community, its degree of isolation from other communities, and the amount of contact that the society has had with its neighbours, in particular whether this contact has primarily involved children or adults.
Advaith Siddharthan (University of Aberdeen, UK)
Automatic Text Simplification and Linguistic Complexity Measurements
Notions of linguistic complexity are used in two ways in research on automatic text simplification: (a) to inform decisions on what linguistic constructs a system should simplify, and (b) to evaluate the extent to which a text has been simplified by a system. In this talk, I'll explore both aspects, and summarise how methodologies from a range of disciplines inform research on automated text simplification. For instance, the literature on language impairment (studies on deafness, aphasia, dyslexia, etc.) identifies linguistic constructs that are known to impair comprehension for particular target populations. Similarly, the literature on language acquisition provides evidence about the order of acquisition of linguistic phenomena. An analysis of manually simplified texts gives us the distributions of different linguistic constructs in simplified language, and in some domains manuals exist for producing controlled language. The evidence base that informs automated text simplification systems is therefore quite diverse, and reading comprehension has been shown to improve when texts have been manually modified to make the language more accessible (e.g. through reduction of pre-verb length and complexity, number of verb inflections, number of pronouns, number of ellipses, number of embedded clauses and conjunctions, number of infrequent or long words), to make the content more transparent (e.g. by making discourse relations explicit), or to add redundancy (e.g. by paraphrasing, elaborating, using analogies and examples).
While systems can be evaluated by quantifying the extent to which such simplifications are performed and the number of errors made in the process, in practice most published work uses human judgements to rate simplified texts on a Likert scale for simplicity, fluency, and the extent to which meaning is preserved. More recently, there have been attempts to use online and offline techniques from the psycholinguistic literature to investigate sentence processing and text comprehension. It is clear that in the context of computer-generated or regenerated text, linguistic complexity stems not just from the use of specific syntax and vocabulary (which can be easily quantified), but also from disfluencies introduced in the process that can hinder comprehension, or even alter meaning. In this regard, automated text simplification has the potential to provide a rather interesting test set to evaluate automated readability assessments.
Benedikt Szmrecsanyi (KU Leuven, Belgium)
Measuring complexity in contrastive linguistics and contrastive dialectology
This presentation surveys a range of measures and metrics that are popular in functional-typological, contact-linguistic, and sociolinguistic research concerned with the comparison of languages and dialects in terms of language complexity. The presentation sets the scene by summarizing some crucial themes that permeate this literature, including the distinction between global and local complexity and between absolute and relative complexity. I then go on to sketch (i) absolute-quantitative measures of complexity (e.g., the number of phonemic contrasts in a language, the length of the minimal description of a linguistic system, or information-theoretic complexity), (ii) “baroque complexity” measures (about linguistic elements that are in the language for no apparent reason other than historical accident), (iii) irregularity-centered complexity measures (about, e.g., the outcome of irregular inflectional and derivational processes), and (iv) L2 acquisition complexity, which is about the degree to which a language or language variety – or some aspect of a language or language variety – is difficult to acquire for adult language learners (for example, we know from the SLA literature that adult language learners avoid inflectional marking whenever they can). The presentation concludes with a few remarks on future directions in complexity research in contrastive linguistics and contrastive dialectology.
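One of the absolute-quantitative measures listed in this abstract, information-theoretic complexity, is often approximated in practice by file compression. The sketch below assumes that approach and uses Python's standard zlib module; it is not a method attributed to the talk. The function name and both sample strings are hypothetical, and a compressed size is only a rough upper bound on the length of a minimal description.

```python
import random
import zlib


def compression_ratio(text: str) -> float:
    """Compressed size divided by raw size, in bytes.

    A lower ratio means the text is more predictable (more compressible).
    """
    raw = text.encode("utf-8")
    if not raw:
        raise ValueError("text must not be empty")
    return len(zlib.compress(raw, 9)) / len(raw)


# Two artificial samples of equal length (illustrative, not real language data).
repetitive = "ab" * 500
rng = random.Random(0)  # fixed seed so the sample is reproducible
varied = "".join(rng.choice("abcdefghijklmnopqrstuvwxyz") for _ in range(1000))

print(compression_ratio(repetitive) < compression_ratio(varied))
```

On real corpora such ratios are only comparable when the texts have similar size, content and orthographic conventions, for instance parallel translations of the same source.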