Research, records and responsibility conference: Ten years of PARADISEC

The conference celebrating ten years of PARADISEC, held in early December, had a suitably interdisciplinary mix of presentations. Joining in the reflection on building records of the world’s languages and cultures were musicologists, linguists, and archivists from India, Hong Kong, Poland, Canada, Alaska, Hawai’i, Australia, the UK and Russia. The range of topics covered can be seen in the program: http://paradisec.org.au/RRRProgram.html

The conference ended with a discussion of what was missing in our current tools and methods. While it is clear that linguists have done pretty well at using appropriate tools for transcribing and annotating text, and building repositories to provide long-term citation and access to the material, there is still a long way to go. Problems that are impeding the current work of documenting languages include:

A lack of uptake of existing tools and methods by field linguists (despite training programs)
High barrier to archiving (we could make it easier to get things into an archive and encourage upload of recordings by speakers)
Failure of the academic community to value the creation of suitable tools as being necessary to creating scholarly output
Failure to value properly constituted collections of primary records
The failure of current tools to allow simple interoperation of the data they create (so that, for example, ELAN output may not work in FieldWorks, or Toolbox output does not open in FieldWorks)
Lack of standard formats for common datatypes (interlinear glossed text, linguistic paradigms), which makes it difficult to create collections of texts that can be shared with other researchers

Solutions suggested included the following:

A common task list from which programming teams could select to build interoperating tools (transcription, annotation, parsing, lexical database construction, visualisation, analysis of primary data)
It was observed that linguistics departments used to have administrative assistants who kept them running, and that we now need programmers who can keep our software and websites working properly. Typically, however, we do not have that level of support.

Emerging or imaginable future possibilities include:

Using mobile apps to increase the records available for each language (cf. Steven Bird’s paper at the conference)
Crowdsourcing models that allow shared annotation of primary records
Automatic transcription based on speech recognition in any language
Unlimited portable storage, permitting multiple copies of large collections

Proceedings were opened by Professor Mark Considine (Dean of Arts, University of Melbourne) and Kevin Bradley (National Library of Australia). The keynote speakers were Shubha Chaudhuri (American Institute of Indian Studies, New Delhi), who spoke on “Communities and Archives”, and Alexandre Arkhipov (Moscow State University), who opened Wednesday’s Digital Tools Workshop with his talk on “XML databases and XSL transforms in language documentation workflows”. Other authors and presenters included Linda Barwick, Andrea Berez, Bruce Birch, Steven Bird, Cathy Bow, Kevin Bradley, Rona Googninda Charles, Michael Christie, Brian Devlin, Lynn Drapeau, Andrea Emberly, Joanna Filipczak, Sébastien Flavier, Victor Friedman, Ulrike Glavitsch, Lelia Green, John Hajek, Amanda Harris, Gary Holton, Cat Hope, Daniela Kaleva, Joanna Łacheta, Renée Lambert-Brétière, Sylwia Łozińska, Lisa MacKinney, Tos Mahoney, Paul Monaghan, Stephen Morey, Piotr Mostowski, Simon Musgrave, David Nathan, Mark W. Post, Jennifer Post, Martin Raymond, Paweł Rutkowski, Guillaume Segerer, Jozsef Szakos, Nick Thieberger, Andrew Treloar, Sally Treloyn, Michael Walsh, and Peter Withers.

All presentations were recorded, and MP3 versions will be made available soon, together with PDFs.

A visualisation of the program in a wordle is presented below.