Concluding the ELIIP workshop

In a few weeks’ time, reports and PowerPoint slides from the ELIIP workshop will be up on the ELIIP website for discussion.
I took away memories of the beauty of the mountains and salt lakes, the strange comfortableness of bison, and a slight increase in knowledge about the Latter-day Saints – how can one not feel sympathetic to the nineteenth-century Welsh Mormon who set sail for Zion equipped with an English and Welsh dictionary?
There’s a lively group of people at the University of Utah working on Native American languages (from Brazil north to Ojibway). One project that especially struck me was a Shoshone outreach program. Several Shoshone were at the ELIIP workshop. Last year 10 Shoshone high school students came to the Center for a six-week summer camp funded by a donation from a local mining company. In the program they learned some Shoshone language, as well as crafts from Shoshone elders. The students worked as paid interns to do some work on language documentation and prepare language learning material in Shoshone. It was a great introduction, not only to language documentation but to university life generally. What a good idea!
Back to the workshop. Yes we need something like ELIIP – a list of endangered languages with information about them and pointers to other sources about them. But it won’t work unless it is aimed at more than just linguists. And it must point to rich information. And it must be inclusive. And it must be simple to use. And, since there is very little money around, it must be designed to have as low maintenance costs as possible.
Summing up, I’d say the workshop allowed various ideas to gel about what the one-stop shop for languages would look like. I thought the most important were:

  • Avoid duplication. A lot of work has already gone into collecting material. Don’t waste it.
  • Data-freshness. People will be drawn to the site if they believe that the data is fresh, rich and reliable.
  • Low maintenance. Whatever’s built has to be updatable and maintainable at minimal cost – maintaining links, even with a web crawler, is beyond many sites.
  • Buy-in. If it’s to work, lots of communities, archives and linguists need to be able to add material easily and to feel that it belongs to all of us.
  • Simple interface for searching AND for uploading. This means paying for good design and testing with a range of users. Maybe there’ll be several interfaces for different types of user.
  • Wish-things
    • There was a strong swell of opinion in favour of digital archives where people could deposit digital data files and update information easily.
    • Snapshots in time. People will want to know what a language was like 10 or 20 years ago – how many speakers, whether children spoke it, and so on.
    • Localisation. How do we translate the material into other languages for countries where outreach on the importance of helping speakers keep their languages is really needed? Spanish, Chinese, Russian, Pidgins and French may be the main lingua francas for some of these areas.


A divide was proposed by Gary Simons between curated web services (where people create and manage data) – like Wikipedia – and aggregating web services (where automatic harvesters gather data from archives, libraries and so on) – like Google. I think the consensus was that we needed both – linking to information that is out there, and filling in the gaps. (The OLAC metadata standard is at http://www.language-archives.org/OLAC/metadata.html.)


Aggregation means work for existing archives as well as for ELIIP. If an archive’s data is to be harvested, it has to be accessible to data harvesters. And access has many levels – first, knowing that it is there (e.g. via a URL which builds in the ISO code for the language). Getting language cataloguing information (metadata) into a shape that is harvestable is hard and time-consuming, as was noted by researchers and archivists wrestling with OLAC and IMDI metadata. And then if you want to go beyond a link, there’s extracting the information from the page itself. (I liked the way WALS (The World Atlas of Language Structures) allows going to actual references on Google Books or equivalent.)
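To make the harvesting side concrete, here is a minimal sketch of what an aggregator does with one OLAC metadata record once it has been fetched over OAI-PMH. The record below is invented for illustration (the title and collection are hypothetical), but the OLAC and Dublin Core namespaces are the real ones, and the ISO 639-3 code in the `olac:code` attribute is what lets a harvester file the record under the right language:

```python
# Sketch: extracting the language code and title from a single OLAC
# metadata record, of the kind an OAI-PMH harvester pulls from an archive.
import xml.etree.ElementTree as ET

NS = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "olac": "http://www.language-archives.org/OLAC/1.1/",
}

# Invented example record; namespaces are the real OLAC/DC ones.
record = """<olac:olac
    xmlns:olac="http://www.language-archives.org/OLAC/1.1/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <dc:title>Shoshoni field recordings</dc:title>
  <dc:subject xsi:type="olac:language" olac:code="shh"/>
</olac:olac>"""

root = ET.fromstring(record)
title = root.find("dc:title", NS).text
# The ISO 639-3 code lives in a namespaced attribute, so we look it up
# with the full namespace URI in Clark notation.
subject = root.find("dc:subject", NS)
code = subject.get("{http://www.language-archives.org/OLAC/1.1/}code")
print(title, code)  # → Shoshoni field recordings shh
```

The point of the exercise: once every archive exposes records in this shape, an aggregator can index resources by language code without scraping free-form web pages – which is exactly why getting the metadata into harvestable shape is worth the effort.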
What kind of interface? It has to be simple – for searching, for uploading data, and for commenting on existing data. We suggested a basic interface (possibly offered by another organisation, e.g. Sorosoro). Doug Whalen suggested a hinged model – one underlying database which could be presented as a UNESCO list for policymakers, an interface for ELIIP researchers, one for the general public, and lots of community portals for communities.
On community portals, I was impressed by the way the DoBeS people can semi-automatically generate community portals from the material in their archive – an advantage of having highly structured data in the first place. For example, the Dane-zaa Community Portal facilitates the use of the archive collected by the DoBeS team together with the elders. An interactive community portal that community members could customise and manage would be great.
Simple interactivity is important – free-form comments are easier than web forms, but the information has to go to the right people, and this can be tricky when there are thousands of different right people for different questions. Hans-Jörg Bibiko brought up the WALS database, where thousands of comments can be made on any data point or set of data points. He thinks that roughly 60% of commenters are linguists, 30% noise and 10% native speakers. It is a blog system. The 65 authors of WALS are linked via RSS feeds, and because it’s their chapters they have some incentive to correct mistakes. Having public tracking keeps the administrator honest. And it turned out to be simple to implement.
Wikipedia cropped up many times. It has superb page rank and data freshness. BUT … a number of drawbacks were noted, many by Doug Whalen. Regular Wikipedia doesn’t support heaps of links, and for data richness we need that. It doesn’t go for original research (it wants citations), and people are loath to put work into something which some non-specialist can then change. Only 700 or so languages have pages, and who would create the others? Cold hard truth crept in here: researchers live on recognition. Getting them to maintain a language site requires some incentive other than a sense of virtue. So there needs to be some minimal recognition of people who do contribute information – whether by authoring chapters as in WALS, or by having a public list of regional editors which people can then cite as indicative of community/research service.
Wikipedia has only one level of access, with widely varying types of information, and so it can be rather daunting for Joe User. Various suggestions were batted around. One was having basic and advanced interfaces. Another was creating a template for language entries in Wikipedia, with automatic links to ELIIP and Ethnologue, and suggestions of archives. A promising line of enquiry for more reliability is a more controlled type of wiki, such as Wikispecies (also easier to translate, as you can see from its language list, which includes Eald Englisc).
Anyway, many ideas – watch ELIIP’s space!
