Distinguishing language documentation and language description, revisited: LIP discussion

Ruth Singer recaps last night’s Linguistics in the Pub, a monthly informal gathering of linguists in Melbourne to discuss topical areas in our field.

Last Monday’s debate about the distinction between language documentation and language description was inspired by a recent article on the topic by Nikolaus Himmelmann in Language Documentation & Conservation:
Himmelmann, Nikolaus P. 2012. Linguistic Data Types and the Interface between Language Documentation and Description. Language Documentation & Conservation 6. 187-207.

The discussion was led by Rachel Nordlinger and Nick Thieberger from the University of Melbourne. Also commenting on the debate was Eva Schultze-Berndt, visiting Melbourne Uni from the University of Manchester, via Kununurra and Cairns.


A big audience turned out at the new venue of Naughton’s hotel for Linguistics in the Pub on a rather chilly night. But the warming food and fiery discussion made everybody forget about Melbourne’s rather persistent winter weather. People from La Trobe (Centre for Research into Linguistic Diversity), Monash and Melbourne Universities all attended. In addition to teaching staff, research fellows, postdocs and PhD students there were a good number of enthusiastic undergraduates from Melbourne and Monash.

We all came together to debate some of the issues raised by the most recent paper from Nikolaus Himmelmann, on the distinction between language description and language documentation. In this debate we returned to the topic of the second ever Linguistics in the Pub, held at the Recreation Hotel in North Fitzroy in January 2010. It is interesting to note that the opinion of Linguistics in the Pub participants seems to have changed in the two and a half years that have passed, perhaps reflecting a growing confidence among those practising language documentation. When I led the earlier debate, arguing that there is no distinction between language documentation and description, all the other participants argued strongly against this position. However, in Monday night’s debate, most participants strongly opposed the position of Himmelmann (2012): that language documentation is a subdiscipline of linguistics that can be clearly distinguished from language description. Either opinions among the group have changed or this just shows that participants in a debate will always argue for a middle ground when presented with an extreme view.

The suggestion that there is a distinction between language documentation and description argued for by Himmelmann in his 1998 paper was a significant event in the development of the field of language documentation. So, our discussion of Himmelmann (2012) involved not only the arguments made in that paper but also a general debate about the endeavour of language documentation and exactly what it is:
1. Whether language documentation should be considered a separate field or discipline of linguistics, or
2. Whether language documentation is just a process that is part of good descriptive linguistics work, and
3. Whether language documentation purely conceived is ‘real’ linguistics or just a form of cultural documentation that does not rely in any way on insights from the discipline of linguistics

Himmelmann’s paper is a response to Evans’ (2008) review of Gippert, Himmelmann & Mosel (2006) Essentials of Language Documentation. In his review Evans criticised some of the conceptions of language documentation in Gippert et al. Himmelmann (2012) extends the argument in Himmelmann (1998) that language documentation is a discipline separate from, though interrelated with, descriptive linguistics. He develops a typology of data types using the example of how philologists approach historical texts. The multiple versions of an ancient text are the raw data, and the critical edition is the primary data. Further analysis, in linguistics, results in structural data. In this typology of three data types, Himmelmann makes a distinction between elicited data and observable data. However, as participants commented, there are many more types of data than just these two, which could be viewed as genres of performance. A number of participants questioned Himmelmann’s typology, arguing that the distinction between the three types of data does not serve any useful purpose and is just categorisation for its own sake. Most also criticised the extent to which Himmelmann argues for the separation between language documentation on the one hand and descriptive linguistics on the other, while simultaneously asserting their interdependence.

Language documentation, as presented in Himmelmann (2012), includes recording and transcribing language data as well as rich annotation on context, speakers, cultural relevance etc., but excludes interlinear glossing, which is dependent on linguistic analysis and so is part of the descriptive linguistics endeavour. There was fierce debate about whether language documentation, described in this way, is ‘real’ linguistics. Some argued that it isn’t really, as it does not involve linguistic analysis. However, participants disagreed on whether language documentation purely conceived contributes to linguistics as a discipline. Whether a richly annotated corpus of texts, compiled for example by a community member and native speaker, should be considered a suitable PhD thesis was debated. In the field of literature, apparently, a richly annotated corpus or a critical edition of a work is a valid PhD thesis. Nick Thieberger mentioned that this debate had occurred at the University of Hawai’i, where many native speaker linguists are trained, although no such PhDs have been done as yet.

Whether such a project should be funded as language documentation was also debated, although examples of such (often community-led) projects were discussed. Some participants argued that a richly annotated corpus compiled by a native speaker without linguistic analysis could be just as valuable to linguistics as one compiled by a trained linguist. Corpora produced by linguists can be quite narrow in scope if the linguist is only interested in a few aspects of the language and so has no interest in documenting the social and cultural context of the language. A related question was how much a linguist can get from such a richly annotated corpus. Opinion was quite divided on this matter: some argued that a richly annotated corpus can provide the data for research, while others argued it is necessary to do fieldwork on the language yourself. This divide no doubt reflects the different types of research questions linguists have.

The debate then returned to the status of language documentation: is it a practice that is part of descriptive linguistics, a subdiscipline of linguistics separate from descriptive linguistics, or something else? Some pointed out that Himmelmann’s claim that language documentation is a subdiscipline, although a bit extreme, is a backlash against the lack of attention paid to fieldwork and data in the majority of linguistics departments around the world until the 1990s. The question was raised whether language documentation is better considered a kind of corpus linguistics. Corpus linguistics is considered a subdiscipline of linguistics and, apparently, the creation of a well-designed corpus is considered a suitable PhD project within corpus linguistics.

Finally the discussion returned to Himmelmann’s (2012) distinction between raw and primary data and their subtypes and whether he placed them into the categories of language documentation or descriptive linguistics. Most participants were not in favour of the distinction and raised these criticisms of the typology:
● Primary data need richer description of their context etc. than Himmelmann mentions.
● Why is the audio recording included as part of raw data but not primary data?
● Psycholinguistic experiments are placed in the category of ‘data based on metalinguistic skills’ – why does this go into documentary linguistics rather than descriptive linguistics?
● Many more data types were suggested than those listed by Himmelmann. For example, negative evidence can also be richly annotated by making comments on the native speaker’s response, such as how strongly they ruled out the suggested form.

It is likely that debate about the place of language documentation within the discipline of linguistics will continue for many years to come, both at Linguistics in the Pub and elsewhere.

References cited above:
Evans, Nicholas. 2008. Review of Essentials of language documentation. Language Documentation & Conservation 2(2):340–350.
Himmelmann, Nikolaus P. 1998. Documentary and descriptive linguistics. Linguistics, 36.161–195.
Himmelmann, Nikolaus P. 2006. Language documentation: what is it and what is it good for? In J. Gippert, N. P. Himmelmann and U. Mosel (eds) Essentials of language documentation, 1–30. Berlin: Mouton de Gruyter.
Himmelmann, Nikolaus P. 2008. Reproduction and preservation of linguistic knowledge: linguistics’ response to language endangerment. Annual Review of Anthropology 37:337–350.
Himmelmann, Nikolaus P. 2012. Linguistic Data Types and the Interface between Language Documentation and Description. Language Documentation & Conservation 6:187–207.
Woodbury, Tony. 2003. Defining documentary linguistics. In P.Austin (ed.) Language documentation and description, Vol. 1, 35-51. London: SOAS.

Linguistics in the Pub September
Most agreed that the new pub was cosier and the food, if not better, then at least a nice change. So the next Linguistics in the Pub, on Tuesday 25th September, will be held at Naughton’s Parkville hotel again. Bruce Birch (ANU) will show a demo of his Iwaidja dictionary and phrasebook smartphone app. Following that, Rob Mailhammer (UWS) will lead a discussion on authenticity in language revitalisation.
