50 words of Australian languages project

The Research Unit for Indigenous Language is running a project in 2019/2020 to collect and present words in as many Australian Indigenous languages as possible. Please consider contributing to this project.

This project aims to provide resources for schools to teach at least fifty words in their local language.

We are asking for contributions of at least fifty words in as many Australian Indigenous languages as possible. The typed words need to be listed in a spreadsheet, with audio file recordings attached. Full instructions on capturing the details are on this website.

The words will be used in teacher resources, in apps, and in online language resources (such as the Gambay map by First Languages Australia).

Splitting Elan files

Part of the project relies on getting an audio or video recording of the fifty words, transcribing it in Elan, and entering the English word as the transcription of each word in the recording. We can then use the Elan file to split the media file into a set of fifty files, each named with the language code. For example, the language Windaga has the AIATSIS code A111, so the file for the word for dog would be called A111_dog.

The Elan splitter is a set of Python scripts provided by Ben Foley and can be downloaded from this GitHub repository. It reads the Elan file and lets the user choose which tier to use to chop the audio file into chunks.
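To make the splitting step concrete, here is a minimal sketch of the idea in Python. It is not Ben Foley's splitter, just an illustration using the standard library. It assumes the usual Elan (.eaf) XML layout of TIME_SLOT elements and tiers of ALIGNABLE_ANNOTATION elements, and it assumes the audio is an uncompressed WAV file. The function names, the tier name "English" and the sample data are all made up for this example.

```python
import wave
import xml.etree.ElementTree as ET

# A tiny hand-written Elan document, for illustration only.
SAMPLE_EAF = """<ANNOTATION_DOCUMENT>
  <TIME_ORDER>
    <TIME_SLOT TIME_SLOT_ID="ts1" TIME_VALUE="500"/>
    <TIME_SLOT TIME_SLOT_ID="ts2" TIME_VALUE="1400"/>
    <TIME_SLOT TIME_SLOT_ID="ts3" TIME_VALUE="2000"/>
    <TIME_SLOT TIME_SLOT_ID="ts4" TIME_VALUE="3100"/>
  </TIME_ORDER>
  <TIER TIER_ID="English">
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a1" TIME_SLOT_REF1="ts1" TIME_SLOT_REF2="ts2">
        <ANNOTATION_VALUE>dog</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a2" TIME_SLOT_REF1="ts3" TIME_SLOT_REF2="ts4">
        <ANNOTATION_VALUE>water</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
  </TIER>
</ANNOTATION_DOCUMENT>"""


def plan_clips(eaf_xml, tier_id, language_code):
    """Return (file_stem, start_ms, end_ms) for each annotation on one tier."""
    root = ET.fromstring(eaf_xml)
    # Elan stores each time point once, as a TIME_SLOT with a value in milliseconds.
    slots = {
        slot.get("TIME_SLOT_ID"): int(slot.get("TIME_VALUE"))
        for slot in root.iter("TIME_SLOT")
        if slot.get("TIME_VALUE") is not None
    }
    clips = []
    for tier in root.iter("TIER"):
        if tier.get("TIER_ID") != tier_id:
            continue
        for ann in tier.iter("ALIGNABLE_ANNOTATION"):
            word = (ann.findtext("ANNOTATION_VALUE") or "").strip()
            start = slots.get(ann.get("TIME_SLOT_REF1"))
            end = slots.get(ann.get("TIME_SLOT_REF2"))
            if not word or start is None or end is None:
                continue  # skip empty or unaligned annotations
            # Name each clip as <language code>_<English word>, e.g. A111_dog.
            stem = f"{language_code}_{word.lower().replace(' ', '_')}"
            clips.append((stem, start, end))
    return clips


def cut_wav(src_path, dst_path, start_ms, end_ms):
    """Copy the frames between start_ms and end_ms into a new WAV file."""
    with wave.open(src_path, "rb") as src:
        rate = src.getframerate()
        src.setpos(start_ms * rate // 1000)
        frames = src.readframes((end_ms - start_ms) * rate // 1000)
        params = src.getparams()
    with wave.open(dst_path, "wb") as dst:
        dst.setparams(params)  # the frame count is corrected when the file is closed
        dst.writeframes(frames)


if __name__ == "__main__":
    for stem, start, end in plan_clips(SAMPLE_EAF, "English", "A111"):
        print(stem, start, end)
        # With a real recording: cut_wav("recording.wav", stem + ".wav", start, end)
```

Keeping the planning step separate from the cutting step means the list of file names and times can be checked before any audio is written.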

Chief Investigators: Professor Rachel Nordlinger and Associate Professor Nick Thieberger, resourced by the Research Unit for Indigenous Language.

This research project has been approved by the Human Research Ethics Committee of The University of Melbourne. Project number: 1853186.1


  1. This is a wonderful project! I like the idea a lot. What is your timeframe for submitting words into the database? Like many other academics, I work best with deadlines :).

  2. Nick Thieberger says:

    We are now ready for contributions. Deadline is next week! Or the week after! https://go.coedl.net/50WordsProject

  3. Nick Thieberger says:

    We now have a draft site here: http://50words.online/
    It will keep growing as we get more contributions.

  4. Nick Thieberger says:

    There are 15 languages in the website now so please feel free to add more!

  5. Nick Thieberger says:

    There are now 30 languages on the map, and more are on the way: Alyawarr, Barngarla, Bilinarra, Burarra, Djamparrpuyngu, Djinang, Eastern Arrernte, Gurindji, Jaru (Balgo), Jaru (Yaruman), Kaurna, Kaytetye, Kukatja, Kunbarlang, Kuninjku, Mathi Mathi, Mawng, Murrinhpatha, Ndjébbana, Ngaanyatjarra, Ngalia, Ngardi, Nyangumarta, Pitjantjatjara, Tiwi, Wajarri, Warlmanpa, WikMungkan, YanNhangu, and Yawuru.
