OpenSpeaks is a toolkit for making audiovisual documentation of low-resource languages. Speakers of indigenous, endangered and other languages with very little audiovisual recording can use different frameworks and best practices provided in OpenSpeaks to enhance their documentation strategies. OpenSpeaks also contains different tools and techniques in addition to editable templates that can be localized and used during the documentation.


Introduction

OpenSpeaks was conceived in 2017 as a standalone website[B 1] to be used as an open toolkit for any Citizen Language Audiovisual Archivist (CLAVA). It is designed with indigenous, endangered, other low-resource and first languages in mind, but it can be used by anyone for documenting any language. There are currently four chapters in this Open Educational Resource, each designed with beginner and intermediate-level archivists in mind. Different frameworks help these archivists assess their own documentation environments and capture nuanced narratives while recording. These frameworks are collectively known as the "OpenSpeaks Framework", which can accommodate a wide range of environments with a "community-first" approach.[A 1] Some basic understanding of audiovisual documentation is required to use this toolkit.

A language can be documented in many different environments. On the spectrum of digital documentation environments, natural and conversational use of a language and recording in a controlled environment (such as a scripted recording inside a studio) lie at the two extremes. The linguistic discipline of documentary linguistics, or language documentation, helps keep historical audiovisual records of a society and protect cultures. Language documentation also furthers the growth of languages through practical use. Recording a language while people are having a conversation helps in understanding its informal use, and can even be useful for designing a better multimedia literacy program. Similarly, recording pronunciations of words and phrases inside a noise-free studio can help create a speech synthesis system. Factors such as age, gender, the influence of non-native languages on one's native language, and the socioeconomic stratum of an interviewee also have a deep impact on a person's overall speech. For a language to survive and grow, it is essential to make a wide range of recordings, keeping in mind the aforementioned human and environmental factors.

OpenSpeaks can be useful for framing three major aspects. For each aspect (who or what will affect the documentation and who will be affected by it), the list below gives the things that will affect it and the scenarios, each with a potential problem and a possible way to address it.

1. Your own status quo as an archivist
Potential problem: You might not be able to afford a crew.
Possible way to address: You need to edit the footage yourself.
Potential problem: You might not have access to expensive equipment.
Possible way to address: You need to record using a phone.

2. Your interviewee
Things that will affect: Their age, gender, mental and socioeconomic state, what kind of questions would be ethical and what you should not ask.
Potential problem: Your interviewee is grieving the loss of a family member/relative.
Possible ways to address:
  • You might cancel the recording entirely.
  • You must not schedule the recording of a celebratory custom, which would seem disrespectful.
Potential problem: Your interviewee has to take time out from their work.
Possible ways to address:
  • Discuss and find a way to avoid such situations.
  • Try to compensate for their time if possible.

3. Environment
Things that will affect: Physical or social environment.
Potential problem: Your interviewee has small children whom they need to attend to constantly.
Possible ways to address:
  • You need to work with them on their availability.
  • You should be ready to stop the interview immediately when they need to attend to the children.

The process of documentation varies from archivist to archivist and with all the above factors. For instance, documentary linguistics as a practice focuses on collecting the rich linguistic data of a language,[B 2] such as by recording everyday conversation in a marketplace, whereas documentary filmmakers focus on aesthetics and storytelling. Training volunteers to create informational videos, as the virALLanguages project does to create COVID-19 awareness videos in indigenous/minority languages, can also be counted as language documentation.[B 3]

The resources of this toolkit are divided into four interrelated chapters:

  1. Chapter 1: Consent, Rights, Copyright and Open Licensing
  2. Chapter 2: Multimedia (audiovisual) recording
  3. Chapter 3: Metadata collection and publication
  4. Chapter 4: Accessibility considerations.

Chapter 1: Consent, Content Rights and Content Licensing

While making audio and video documentations of language, you encounter questions related to consent, rights and copyright, and licensing. Some of the most asked questions can be:

  • How do I take permission for an interview?
  • Do I take permission from an interviewee in writing or verbally?
  • Who owns the rights when I make an audio or video recording?
  • What kinds of ownership rights exist?
  • What is copyright and who owns copyright when I make a recording?
  • Do I need to register for copyright?
  • Is there a license for publishing the recording?

You might not find direct answers to every such question, because there are no easy answers and each situation is unique. Instead, this chapter offers ways to approach them. The contexts and background provided below will hopefully help you assess your own situation and make a judgement.

Here are some of the common terms you will encounter below:

  • Documentation: Recording any information so that it can be used later. Reciting a poem to someone (who can remember it later), writing it on paper and making an audio/video recording of the narration are different kinds of documentation.
  • Media: Channels and tools for sharing information. The same information can be printed in a book (print media), shared over a chat application like WhatsApp or Signal (digital media), written to a CD (an older digital medium) or recorded on a cassette tape (a much older, analog medium).
  • Content or media content: Information and experiences for end users. Content is often documented in different mediums (e.g. physical mediums include a paper note or a book, and digital mediums include an SD card or the internal memory of a phone).
  • Consent: Voluntary agreement by a person to the proposal of another person. It can be given verbally, through physical gestures or in writing. Consent is generally sought before legal, medical (e.g. vaccination), research and sexual activities.
  • License or licence: An official permission or permit, or the proof of the same, that allows someone to use or own something. That "something" can be a manuscript, a recording of a narration or even a doctor's license to practice medicine. A license can be provided by an individual (such as a software developer who creates new software) or by an organization (such as a medical board licensing a doctor).
  • There are more terms, like copyright, moral rights and open licensing. You will learn about them in detail below.

Consent for documentation

In the context of language documentation, consent is given voluntarily by the interviewee to the interviewer. It indicates prior approval for the recording. The interviewer requests the interviewee's permission for the recording. The interviewee needs to understand the request before giving explicit permission for the recording and the subsequent publication of the recorded media content. This chapter discusses the how, when, where and who of acquiring consent.

In many interviews, indirect consent is assumed when the archivist sets up recording equipment such as a camera and microphone.[B 4] It is assumed that a person who is being interviewed becomes aware of the recording by seeing that equipment. But this is not universally applicable. The interviewee might be visually impaired, or they might be unfamiliar with the recording process. Also, in case of a conflict, it is hard to prove legally or ethically at a later date that such consent was good enough.

Written consent

It is always recommended to use a written or printed consent form. If using a printed form, make two copies: one for the interviewer and the other for the interviewee. The interviewee should be a literate adult who can understand the text in the form and provide consent by signing it. If the interviewee cannot provide consent, discuss who should provide consent on their behalf. A parent or an adult guardian can provide consent for a minor. A caretaker or a family member can provide consent for an interviewee with a physical disability. It is strongly advised not to interview a person with a mental disability, for ethical reasons.

See an example form below. You can also copy the content, modify if needed and translate into your preferred language (preferably an official language used in the respective jurisdiction).

                            RELEASE FOR CONSENT AND RIGHTS


• I agree to publish the "WORK" under the LICENSE below.

• LICENSE: I acknowledge that by doing so, I grant anyone the right to use the work in all ways permitted by the chosen License (Creative Commons Attribution-ShareAlike 4.0 International, CC BY-SA 4.0), which might include use in a commercial product or otherwise, and to modify it according to their needs, provided that they abide by the terms of the license and any other applicable laws.

• I am aware that this agreement is not limited to the "PRODUCER".

• I am aware that the copyright holder always retains ownership of the copyright as well as the right to be attributed in accordance with the license chosen.

• I acknowledge that I cannot withdraw this "Agreement".

• I am aware and I agree that this Agreement document can be annotated, subtitled, translated, published, distributed and broadcast, without any further approval from myself or my representatives.

• I affirm that: (a) I have the full power and authority to grant the rights and releases set forth in this Release. If any third party claims that the use of the Content violates its rights, I agree to cooperate fully with the PRODUCER to defend against or otherwise respond to such claim.

Share the name and/or explain about the "WORK" *
(If it is an interview, you can write "My interview on TOPIC-NAME" where "TOPIC-NAME" is what you spoke about. If you submitted a video/audio/picture, you can share the name of the file or describe what the video/audio/picture is about.)

• Place where this was signed:
• Date of signing*
• Language(s) spoken:
• Media kind (digital audio, video, photograph):

                        INTERVIEWEE DETAILS

• Full name:

• Email or other contact information: 

• Do you agree to the aforementioned CLAUSES explained?*
- Yes, I AGREE to all the above ☐

• Signature:

                        INTERVIEWER DETAILS
• Full name:

• Email or other contact information:

• Do you agree that:
a. you will provide open access to the content, especially to the native speakers?
b. you will use the content primarily for the promotion, protection and preservation of the language?
d. you will never claim ownership over the wisdom shared in the content but will attribute the interviewee and/or their community?

- Yes, I AGREE to all the above ☐

• Signature:

A document like the one above becomes a mutual agreement of consent. It might not be a legal document. It also contains terms like "Open Access" and "Creative Commons", which are explained later in this chapter. If written consent is not feasible for any reason, recorded verbal consent can be asked for.

Verbal consent

It is not always possible to acquire written consent. For instance, the interviewer and the interviewee might speak different languages. The interviewee might be illiterate or have a disability that prevents them from understanding a written consent form. In such a case, it would not be ethical to acquire written consent even if the interviewee is willing to sign. Let us look at some scenarios that will help while acquiring consent.

Each entry below gives the recording scenario, how consent is acquired, and the consent type (verbal, written).

Recording scenario: Large group activity, either in public or in a closed space (more than four or five people), like singing, dancing or even having a meal
How: The interviewer can plainly ask if it is okay to record and publish, and record the collective verbal agreement
Consent type: Verbal recorded (optional if the group is very large and the activity is public)

Recording scenario: Small group activity (four or five people, or fewer)
How: Each consenting adult has to provide consent
Consent type: Verbal recorded or signed written consent

Recording scenario: Individual interview
How:
a) the interviewee, if they are an adult and can consent
b) a parent, or a guardian when a parent is unavailable, if the interviewee is a minor
c) a guardian, if the interviewee is not able to consent because of a physical/mental disability
Consent type: Written preferred

Recording scenario: Video/image without any personally identifiable information (i.e. human faces, any personal information such as name, address, location and other personal data; also applicable to audio exposing similar personal information)
How: Consent is generally not required
Consent type: A verbal agreement (good to keep on record) is recommended if the recording includes religious or culturally sacred sites, rituals and other such elements of a society that are guarded carefully

There can be endless scenarios beyond the above. There can also be situations where acquiring prior consent is not possible. In such a case, keep the recording strictly private and acquire consent as soon as possible. If needed, mention the delay in the consent form for transparency. If you cannot acquire consent, you should not use the content in your final production and should destroy the recording immediately. If the recording is vital to your production, you have to ensure that all personally identifiable information is redacted from it. Overall, it is strongly advised not to publish content that was acquired without consent.

It is important to note that consent is not just a legal matter, but it is very much a social, ethical and moral subject. It has to be done in a careful manner with mutual agreement.

Rights: Copyright, Moral Right and Other Ownership Rights

Copyright is the legal ownership of content. It empowers the legal owner of any work (e.g. text, image, audio and video, data and software) to decide how others can use that work. In simple words, copyright protects the content from unlawful use. Copyright is a complex subject. In language documentation, there is often confusion about the ownership of the documented content. There are different levels of rights, of which moral rights and copyright are the two most often discussed. A moral right is the right that the original creator of a piece of content has over their work.

Case study: Recording of folk songs in Colombia

“Who Owns The Content” is a short film that revolves around a central question about the ownership of oral culture.

Let us take the example of an incident that happened in Colombia to understand more about the rights over a work. A young man discovered cassette tapes that included recordings of folk songs sung by his late father. A European researcher had made the recordings. The songs are known to his local community. After finding the tapes, the young man made digital versions of the songs and uploaded them to the internet. But this made his siblings furious. There are three to four levels of "ownership" here:

a) moral right co-owned by the community where the folk songs are sung and the late father (as the singer/narrator of the songs that provide evidence of the songs)

b) copyright of only the recordings owned either by the researcher (if he self-sponsored) or by the late father (who might have commissioned the recording) or by the institution that might have employed the researcher

c) the physical copies (cassette tapes) of the recording owned by the young man and his siblings, who are the legal heirs of the father

In the above case, the copyright is unknown. One has to investigate further to find out any evidence to support the copyright claim.

Short film "Public Domain Day" explaining how Public Domain works.

Copyright generally lasts for the lifetime of the original creator and a certain number of years after their death. After that, the work enters the Public Domain. The number of years in the copyright term differs from country to country. No registration is required to acquire copyright: any work with originality becomes a copyrighted work by default. The symbol "©" is usually used to denote a copyrighted work.
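The "lifetime plus a certain number of years" rule above can be sketched as simple arithmetic. This is only an illustration and not legal advice: the function name and the 70-year term in the usage note are hypothetical, and the sketch assumes that protection runs until the end of the calendar year in which the term expires, which is common but not universal. Always check the rule that applies in your own jurisdiction.

```python
def public_domain_year(death_year: int, term_years: int) -> int:
    """Return the first calendar year in which a work is in the Public Domain.

    Assumption (not from the toolkit): the term is counted in full calendar
    years after the creator's death, so the work becomes free to use on
    1 January of the year after the term ends.
    """
    if term_years < 0:
        raise ValueError("term_years must not be negative")
    return death_year + term_years + 1
```

For example, under these assumptions, `public_domain_year(1950, 70)` gives 2021 for a creator who died in 1950 in a country with a term of 70 years after death.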

A legal contract or agreement helps decide who the copyright owner will be. Simply put: if an employee of an organization is paid by the employer to do a certain original work, then the employee has a moral right over the produced work whereas the employer has the copyright. Most organizations include a blanket agreement with their employees for a right over all the work produced by the employee in the latter's official capacity. Similarly, a commissioned work would be copyrighted by the individual/organization who commissions the work.

Originality is a grey area. Taking a picture of a natural landscape will result in an "original work" and hence copyrighted whereas taking a picture of a painting will not be counted as original work. Look at the examples below to get some insights on copyright in the context of language documentation.

Each entry below gives the content and its owners (both copyright and moral rights).

Content: Video/audio recording of an individual singing a folk song
Owners: Multiple owners:
a) if the original author of the song is unknown, then the song itself will be assumed to be in the Public Domain
b) the narrator or singer will have a moral right
c) copyright of only the recording will be owned by the archivist (in case of non-commissioned work) or by the individual/organization that has commissioned it

Content: Video/audio recording of a cultural or social event
Owners:
a) the people involved in the event and/or the larger community have a moral right
b) copyright of only the recording is owned by the archivist or the commissioning individual/organization

Content: Photograph of an artwork, painting, mural, etc.
Owners: Original artist of the artwork

During the production of audio or video, supporting content (e.g. newspaper clippings, stock images, audio or video footage) is generally used. You must acquire permission from the copyright holder to use such works if you are creating something for commercial purposes. Many use supporting works copyrighted by others under "fair use" (in the United States) or "fair dealing" (in several other jurisdictions). However, this is a considerable grey area and involves the risk of copyright infringement.

The sample release form above has a provision to include clarity on copyright. While recording audio/video in a language, you can seek written permission (or verbal permission, preferably recorded as video) to use and distribute the work. The form includes the Creative Commons Attribution-ShareAlike 4.0 International (CC-BY-SA 4.0) License as a suggestion, but you can use a license that works in your particular case. More details on "open" licenses (meaning licenses that allow for a wider and open dissemination of information) are discussed in the next section.

Licensing: Public Licenses and Open Licenses

In the previous section you read about how content is protected by copyright. Generally, you need to take permission to use copyrighted content. Languages are usually documented for public use. Most importantly, such documentation must be accessible to all the native speakers. Imposing strict copyright restrictions might not allow open and public access to recorded content in a language. Not all users know the legal terms of using copyrighted content. Similarly, not all copyright owners (artists, authors, performers, etc.) know which license to use for their work. If no specific license is mentioned when the work is published, all rights are reserved by default; "All Rights Reserved" is commonly written on such works. However, accidentally restricting language documentation in this way might keep many native speakers from accessing it. This would also hamper the growth of low-resource languages. So, you are encouraged to use open licenses whenever possible. There are also some non-open public licenses. But let us learn about these license definitions first.

Public Licenses

Public licenses or public copyright licenses are for the general public. By using such a license, the copyright owner grants a universal permission. Any permission that is specific (for some individuals, for some organizations or only for legal residents of a country) cannot be called a public license. The Open Knowledge Foundation has recommended the use of Open Licenses for creative work, guided by the Open Definition.

Free/Open Licenses

These licenses are commonly known as open licenses. They are inspired by the Four Essential Freedoms of Free Software. Such licenses allow everyone to use, modify, share and improve the work. Different open licenses grant different degrees of these four freedoms.

Creative Commons Licenses

A set of licenses known as Creative Commons licenses (abbreviated as "CC" licenses) is generally used for copyrighted works. CC licenses apply to all such works, including text, images, other multimedia content like audio and video, and even datasets. There are seven main Creative Commons Licenses. The overview below gives more clarity about what these licenses are.

Each entry below gives the license name and shortened name, what is allowed, what is NOT allowed, and whether commercial use is allowed.

CC-Zero or CC0

What is allowed:
  • Copyright Owner allows others to use, distribute, remix and use this work to create other works
  • A user can re-distribute for free or make money
  • The user does not need to give credit to the Original Owner
What is NOT allowed: There are no such restrictions for users
Commercial use allowed: Yes

CC Attribution or CC-BY

What is allowed: Copyright Owner allows others to:
  1. use the work
  2. distribute it
  3. adapt or remix or create derivative works and re-distribute such works both for free or for money
The user MUST credit the original Copyright Owner for the above use.
This is the most accommodating of the licenses offered. Recommended for maximum dissemination and use of licensed materials.
What is NOT allowed: The user cannot claim the original work as their own. This means the user has to give credit to the original Copyright Owner in all places where they use the work. This applies to all the licenses listed below.
Commercial use allowed: Yes

CC Attribution-ShareAlike or CC-BY-SA

What is allowed: Copyright Owner allows others to:
  1. use the work
  2. distribute it
  3. adapt or remix or create derivative works and re-distribute such works both for free or for money
The user MUST credit the original Copyright Owner for the above use and they must release the new creations under the license the Owner used.
What is NOT allowed: User cannot release works derived from the original work under a different license (than the one used in the original work)
Commercial use allowed: Yes

CC Attribution-NonCommercial or CC-BY-NC

What is allowed: Copyright Owner allows others to:
  1. use the work
  2. distribute it
  3. adapt or remix or create derivative works and re-distribute such works ONLY for non-commercial purposes (user can NOT earn money from the original work or new works using the original work)
The user MUST credit the original Copyright Owner for the above use.
What is NOT allowed: User cannot make money from the original work or from new works derived from it
Commercial use allowed: No

CC Attribution-NonCommercial-ShareAlike or CC BY-NC-SA

What is allowed: Copyright Owner allows others to:
  1. use the work
  2. distribute it
  3. adapt or remix or create derivative works and re-distribute such works ONLY for non-commercial purposes
For the above use, the user
  • MUST credit the original Copyright Owner
  • MUST share the new creation under the same license the original Copyright Owner had used
What is NOT allowed:
  • User cannot release works derived from the original work under a different license (than the one used in the original work)
  • User cannot make money from new works derived from the original work
Commercial use allowed: No

CC Attribution-NoDerivatives or CC Attribution-NoDerivs or CC BY-ND

What is allowed: Copyright Owner allows others to:
  1. use the work
  2. re-distribute it both for free or for money
The user MUST credit the original Copyright Owner for the above use.
What is NOT allowed: User cannot re-distribute new creations using the original work
Commercial use allowed: Yes

CC Attribution-NonCommercial-NoDerivatives or CC BY-NC-ND

What is allowed: Copyright Owner allows others to:
  1. use the work
  2. distribute it
For the above use, the user MUST credit the original Copyright Owner.
What is NOT allowed: User cannot use original works to create new creations or re-distribute such creations using the original work
Commercial use allowed: No

The above overview is inspired by the "seven regularly used licenses" section of Wikipedia.
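The license conditions listed above can also be written down as data, which makes it easier to see which license matches what a copyright owner wants to allow. The following is a minimal sketch and not an official Creative Commons tool: the class name, the field names and the `find_licenses` helper are hypothetical, and the values are transcribed from the descriptions above.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class CCLicense:
    short_name: str
    attribution: bool   # user must credit the Copyright Owner
    share_alike: bool   # derived works must use the same license
    commercial: bool    # commercial use is allowed
    derivatives: bool   # derived works may be re-distributed


# Conditions transcribed from the license descriptions above.
LICENSES = [
    CCLicense("CC0", False, False, True, True),
    CCLicense("CC-BY", True, False, True, True),
    CCLicense("CC-BY-SA", True, True, True, True),
    CCLicense("CC-BY-NC", True, False, False, True),
    CCLicense("CC BY-NC-SA", True, True, False, True),
    CCLicense("CC BY-ND", True, False, True, False),
    CCLicense("CC BY-NC-ND", True, False, False, False),
]


def find_licenses(commercial: bool, derivatives: bool,
                  share_alike: bool = False,
                  attribution: bool = True) -> list[str]:
    """Return the short names of licenses whose conditions match exactly."""
    wanted = (attribution, share_alike, commercial, derivatives)
    return [
        lic.short_name
        for lic in LICENSES
        if (lic.attribution, lic.share_alike,
            lic.commercial, lic.derivatives) == wanted
    ]
```

For instance, `find_licenses(commercial=True, derivatives=True, share_alike=True)` returns `["CC-BY-SA"]`, the license suggested in the sample consent form. A combination that no license offers, such as ShareAlike together with NoDerivatives, returns an empty list.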

Here are some recommended tools and other resources that you can use to identify which Creative Commons License you need to use:

  • CC Chooser (see legacy version): a form where you can fill the options to find an appropriate license for your work. You can simply copy the License text (or code if using in a website) and use it.
  • Internet Archive: a free repository that is strongly recommended for uploading your language documentation work. It supports a wide range of file types (images, documents, audio and video) and formats, as well as Creative Commons Licenses. If you want your file to be used on Wikipedia and other Wikimedia projects, you need to upload it to Wikimedia Commons. Among the Creative Commons Licenses, only CC0, CC-BY and CC-BY-SA are allowed there.

Creative Commons Licenses are not the only kind of free/open licenses. The GNU Free Documentation License is a popular free license that is used for many text materials like books and manuals. It generally allows a user to use the original work, make a copy, redistribute, and even modify it. However, the original document or source code MUST be included in the new work if more than 100 copies of the same are published.

Chapter 2: Audiovisual recording

This chapter helps address the following questions:

  • How to prepare for recording an interview as audio or video?
  • What hardware and software are required for the recording?
  • What are the safety considerations?

Module 1: before recording

Important things to keep in mind during an audio/video interview recording and potential issues

Before you record

1. Prepare before recording
Recording a language that you know might seem easier than recording a language that you do not know. However, some preparation and research are always needed. The preparation before recording includes gathering information about two major areas:
  • who you are interviewing (their personal background such as gender and profession, their social background such as economic status or the surrounding community, political and religious factors, and any other factors that could influence their speech)
  • what you are recording (the language or dialect)

Start by looking for published materials covering the language on the internet, as that is the fastest route, and then look for other audiovisual content, blogs, websites, books, magazines, journals, etc. Try searching in different languages that might be relevant locally (a dominant language or an official language). For instance, if you are trying to document the Sasak language of Indonesia, try searching for resources in Bahasa Indonesia and Balinese. Take notes of your findings and observations while researching. Notes help frame better interview questions.

2. Frame model questions

Asking yourself some questions during your research can be helpful. Asking better questions leads to more nuanced responses. For instance, if you are going to interview a farmer, you can learn more about the local weather and soil quality of the region, and frame better questions. On the other hand, imagine you are going to interview a transgender person in a society where transgender people are discriminated against. You need to be extra careful not to record anything that might be problematic for them later. You can learn about the factors that make your interviewee marginalized by discussing with them prior to recording. This will help frame better questions. But, more than that, their nuanced responses can contribute significantly to improving the state of their human rights.

Ask yourself while researching:

  • What kind of language documentation exists for the language I am going to record? What is missing?
  • Who am I going to interview?
  • What dialect/variation do my interviewees speak?
  • What are different religious, cultural and social influences on the language/dialect of my interviewee?

Interviewing for audio/video needs an informal and friendly but respectful approach from you. While you have to be very empathetic, humble and sensitive with your behavior, you also need to ask many direct questions politely. You need to realize what their constraints are. While preparing, try to frame some broad questions. Go into details based on what the interviewee shares. Those follow-up questions often help uncover more about a topic that they are speaking of. Their answers also lead to new topics. But if you feel strongly about asking a question, write it down in a notebook or a piece of paper and take a picture of that in case you might miss it accidentally. Ask such questions in the beginning.

3. Keep a notebook, not just a phone

A notebook comes in very handy during interviews. You can not only write down the questions you really want to ask, you can also take relevant notes during the interview. Smartphones are also useful for note-taking, but you might be using the smartphone to record audio/video during the interview. In addition, communication devices such as smartphones sometimes interfere with other audio and video recording equipment. It is quite normal for audio recorders to pick up an interfering signal if you happen to receive a call during the interview. Also, looking at a phone during an interview might be a bit distracting to the interviewee. Phones are still useful for some note-taking. If you are recording video with a camera and audio separately with an audio recorder, taking pictures of the camera/recorder display is useful for post-production. The displays might include key information such as timestamps and file names.

4. Manage a rough plan

It is always useful to make a high-level plan for the interview and share it with the key people involved. A spreadsheet works well for this. It can contain details such as the name of the place of interview, the people you will interview, their language, dialect and other influencing factors (more below) that you found out in your research. If you have a short recording window, a plan helps manage time and expectations. But do not make your plan too detailed with very precise information. It is important to balance the planned schedule with what makes sense practically.

What factors you might need to document in the plan

Factor: Location
Purpose (what questions can be framed based on this): Questions about environment. e.g. If the place is in the middle of a desert, the daily lives of people there are certainly affected. You could frame questions on contemporary narratives (how the local community manages cooking and other needs with less water) and cultural narratives (folklore, folk songs).

Factor: Personal information about the interviewee
Purpose (what questions can be framed based on this):
  • Age group and gender: you can ask gender-sensitive questions (you can check with any contact person in the interviewee's community if there are gender-related questions one must not ask or should ask in a particular way). On the other hand, you can frame more specific questions. A married woman with children might be better positioned to sing a lullaby than a male farmer, who might or might not know lullabies.
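The rough plan described above can be kept in any spreadsheet application. As a minimal sketch, the same plan can also be written as a CSV file that any spreadsheet application can open. The column names and the `write_plan` helper are hypothetical choices based on the details listed above, and the row values are placeholders rather than real data.

```python
import csv
import io

# Hypothetical column names based on the plan details described above.
PLAN_COLUMNS = ["place", "interviewee", "language", "dialect",
                "influencing_factors"]


def write_plan(rows, out):
    """Write plan rows (dicts keyed by PLAN_COLUMNS) as CSV to a text stream."""
    writer = csv.DictWriter(out, fieldnames=PLAN_COLUMNS)
    writer.writeheader()
    for row in rows:
        writer.writerow(row)


# Placeholder values only; replace them with your own research notes.
plan = [
    {
        "place": "PLACE-NAME",
        "interviewee": "INTERVIEWEE-NAME",
        "language": "LANGUAGE-NAME",
        "dialect": "DIALECT-NAME",
        "influencing_factors": "age group; gender; profession",
    },
]

buffer = io.StringIO()
write_plan(plan, buffer)
plan_csv = buffer.getvalue()
```

To save the plan to disk instead of a string, pass a file opened with `open("plan.csv", "w", newline="", encoding="utf-8")` as the `out` argument. The file name is also only an example.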
4. Know your devices and software
It is critical to include in your plan the hardware and software that are needed for the recording (more in the next module). It is also important that you know them well. See a simple checklist below:
Hardware/software checklist
Recording device/software Check
Smartphone (if using for recording)
  • Keep the phone fully charged right before the recording.
  • Take a backup of files and free up space as audio/video files take up a lot of space.
  • Keep the screen brightness at the lowest level you need to see the display.
  • Use the best settings for audio/video. 1080p for video and 44.1 kHz/48 kHz with stereo for audio are recommended.
  • Carry a spare microSD card if you can and if your phone has an option to transfer data to one. This helps free up space if the phone memory fills up.
  • Carry a charging cable in case the battery gets too low.
  • It is advisable to take a 5-10 minute break after recording 20 minutes of video (or 45 minutes to 1 hour of audio) to keep the phone from heating up.
Camera
  • Shoot in raw if your camera allows it, but learn the setting well beforehand.
  • If shooting in a standard mode, try to use the best quality settings (e.g. .mov over .mp4 file extension).
  • Keep a fully charged spare battery if possible.
  • It is advisable to pause every 20 minutes. This helps both to keep the camera from overheating and to make editing easier during post-production. Very long recordings are hard to work with.
Audio recorder


  • Try to use a "deadcat" windscreen. You could also make one from old furry toys. The most important thing is to secure it with a rubber band or a hook-and-loop fastener (popularly known as velcro) so that it does not move during the recording. Windscreens are important during outdoor filming as you can never guess how windy it will be outside.

Check which apps are best for your recording workflow if you are planning to use your phone for audio and video recording. It is advisable to use apps (e.g. Open Camera for Android devices, Filmic Pro for iOS devices) that show the audio levels on screen while recording, so you know for sure that the audio is indeed being recorded.
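The checklist above notes that audio/video files take up a lot of space. As a rough illustration, the size of uncompressed audio can be estimated from the sample rate, bit depth and number of channels. The sketch below is only an estimate under stated assumptions: the function names are hypothetical, the defaults (48 kHz, 24-bit, stereo) are one possible setting, and the small file header is ignored.

```python
def wav_size_bytes(seconds, sample_rate=48_000, bit_depth=24, channels=2):
    """Approximate size of uncompressed PCM audio (file header ignored)."""
    bytes_per_second = sample_rate * (bit_depth // 8) * channels
    return seconds * bytes_per_second


def minutes_that_fit(free_bytes, sample_rate=48_000, bit_depth=24, channels=2):
    """Whole minutes of uncompressed audio that fit in the given free space."""
    bytes_per_minute = wav_size_bytes(60, sample_rate, bit_depth, channels)
    return free_bytes // bytes_per_minute


# Illustrative values only: one hour of recording and 8 GB of free space.
one_hour = wav_size_bytes(60 * 60)
print(f"1 hour at 48 kHz / 24-bit / stereo: about {one_hour / 1_000_000:.0f} MB")
print(f"Minutes that fit in 8 GB: {minutes_that_fit(8_000_000_000)}")
```

Video needs far more space than audio and depends heavily on the codec, so for video it is safer to make a short test recording on your own device and check the resulting file size.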

5. Keep a notebook/note-taking app to capture some important data
Physical/digital note-taking while recording always helps during post-production. You also need to capture some metadata (more in Chapter 3), for which you can use the notebook or a printed template. But please keep in mind that the noise you make while writing might get recorded, so choose your pen carefully.
6. Ensure you get to record in a quiet place
The most challenging aspect of any recording is a quiet place for clean audio and a well-lit place for good quality video. Check below to know what to avoid:
Noise sources Possible solutions
Ambient noise (Audio)
  1. Talk to the interviewee before recording to check what could be the least noisy place where you are going to record
  2. If you can, get a lavalier microphone (also known as lav mic, lapel mic, clip mic, etc.) so that you get a nice clean sound as it is placed close to the interviewee's face
LED and other home electric lights (Video) Most home lights, when captured on camera, appear to flicker, which is distracting. You will learn more about solutions for such issues in the next module. For now, avoid home lighting and, if you can afford it, use lights that are recommended (more here) for filming. Alternatively, if you are filming during the day, you can sit close to a window so that the interviewee's face is lit with natural light.

Module 2. During recording

The recording process can be very planned or very spontaneous. You should be prepared well in advance to capture as much as possible when your interviewee is ready. Please note that recording interviews can be tricky as the people you are going to interview are mostly not actors. When you ask them questions, their reaction will be natural. The previous module can help you prepare well.

There are some guidelines that apply to almost all interview recording situations:

  • Start by asking for consent and record it (see how to ask in Chapter 1). This should be at the beginning of the recording even if you have taken written consent. A recorded verbal consent is useful in case you accidentally lose the signed written consent. Both you asking for permission and your interviewee responding should be recorded as audio/video. A common format for asking for consent verbally*:
You say Interviewee says*
  • My name is ...... (say your own name)
  • Today is ...... (recording calendar date) and it is ...... (recording time)
  • I am making a recording for ...... (purpose of recording e.g. documenting the Sasak language)
  • With me present ...... (name of interviewee)
(they might nod or agree verbally or ask you any question to clarify)
INTERVIEWEE-NAME, will it be okay if I continue recording? Yes
Can you please say your name? (if they are comfortable sharing, otherwise skip this step) (they say their name)
I am going to publish this recording under LICENSE-NAME (see Chapter 1 for licenses).

(simplify the clauses so that they can understand) It means that anyone can use the recording for any purpose, share it with others, and even make commercial use of it (if the license is CC-BY or CC-BY-SA or CC0). When others use it, they will have to give credit to you (in the case of CC-BY or CC-BY-SA). Are you okay with that?


No (If they do not agree, they might ask you for more clarification or suggest changes to the clauses. You will need to confirm with them a license that they agree with.)

Please note that the format above might vary from language to language, culture to culture and even person to person. Many might not know the name of the license, so you need to explain it to them in very simple language. The focus is to ensure that your interviewee understands why you are interviewing them, that the interview is being recorded, and that you will publish it later.

  • Use clapping or some other marker sound before each recording.[A 2] You are going to edit the recording later, so a marker helps distinguish one recording from another. A simple and effective way to do that is to clap near your microphone. In the audio waveform, a clap looks like a sudden spike (see picture), which makes it easy to identify while editing.
  • The best emotion is captured when your interviewee trusts you. Try to be empathetic and friendly, relate to them in a human way and keep a check on their comfort level. They will open up and share something they care about only when they feel they can trust you. Trust is built over time. How do you build it in a short interview?
  • Warm-up questions: You can always ask some casual/informal questions in the beginning to warm up and slowly move towards asking more personal questions.
  • Body language: In a physical interview, your body language matters much more than in a telephone or voice/video call. Positive body posture can set the mood of the subject. So a rule of thumb is to be a good listener and show curiosity to learn from the interviewee. When you are interviewing someone who speaks an endangered language that is unfamiliar to you, you can still start with the same body posture. Even though you won't understand the vocabulary, you can be empathetic and try to relate by observing the interview's emotional flow. You could reflect that with the right kind of camera moves.
  • Motion is emotion: Documenting a language is not just about placing a camera on a tripod and interviewing someone, though that is a good starting point. If your interviewee is talking about their life, you need to capture that life on camera. If a picture is worth a thousand words, a video is worth a million! So, take ample time to shoot some b-roll. For instance, if your interviewee has narrated a bedtime story during the interview, capture some relevant shots, such as kids sitting around an old person, or parents with kids. B-roll clips are generally short, so shoot really short videos (30 seconds to 1 minute at most) and cover a wide range of subjects because you never know where you can use them. You can use the b-roll as cutaway shots.
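The clap marker described in the guidelines above shows up as a sudden spike in the audio, and that idea can be sketched in a few lines of code. This is a minimal illustration, not how any particular editor detects markers: the function name, the `ratio` and `min_gap_seconds` defaults and the synthetic samples are all assumptions made for the example.

```python
def find_clap_markers(samples, sample_rate, ratio=8.0, min_gap_seconds=1.0):
    """Return times (in seconds) where the signal jumps far above its average level.

    ratio and min_gap_seconds are illustrative defaults, not tuned values.
    """
    if not samples:
        return []
    average = sum(abs(s) for s in samples) / len(samples)
    threshold = max(average * ratio, 1)
    min_gap = int(min_gap_seconds * sample_rate)
    markers = []
    last_index = -min_gap  # allows a marker at the very start of the recording
    for index, value in enumerate(samples):
        if abs(value) >= threshold and index - last_index >= min_gap:
            markers.append(index / sample_rate)
            last_index = index
    return markers


# Synthetic example: 5 seconds of quiet signal with two loud "claps".
sample_rate = 1000
samples = [10 if i % 2 == 0 else -10 for i in range(5 * sample_rate)]
samples[500] = 20_000    # clap at 0.5 s
samples[3000] = -18_000  # clap at 3.0 s
print(find_clap_markers(samples, sample_rate))
```

In real recordings, loud speech or a door slamming can also cross the threshold, so any detected markers are best checked by ear in an audio editor.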

Audio recording

Selecting audio recording device:

In language documentation, the audio is much more important than the moving images in a video. Some issues with video can be tolerated as long as the audio is loud and clear. If you can afford it, purchase a good quality microphone ("mic" for short). Select the microphone based on your production setup, as different mics are built for different purposes. For instance, a mic used for stage programs (say, a singer performing) captures sound from a very close distance and rejects sound from far away (e.g. an audience shouting in a crowd). It also requires a soundboard to convert analog sound into digital. Such a mic might not be that useful for recording inside a small, closed space. Similarly, mics that are designed to be used indoors (say, studio mics) would capture more unwanted sound if used outdoors.

Additional equipment:

  • Shotgun microphone: A shotgun microphone looks like a long stick covered with foam. It is attached to the top of the camera (or is plugged into the audio port of a phone). It captures audio from one direction. While recording with a shotgun microphone attached to a camera/phone, you always need to point it towards the speaker unless you are capturing some other sound.
  • Boom microphone: These are similar to shotgun microphones but are generally handheld. They are used a lot by television reporters while covering outdoor news. In most productions, a boom mic operator holds the mic stand and points the mic towards the speaker.
  • Tripod: A tripod is used to stabilize a camera. As the name suggests, it has three feet to stand steadily on the ground. The top of a tripod has a mount for the camera to be fixed. Most cameras have a threaded hole below them for fixing with a "tripod plate" or "mount". You probably need to purchase an additional holder if using a phone with a tripod. Most tripods have a handle to move sideways (panning) or up and down (tilting). Some tripods have no handles and are used for static shots.


  1. Home studio: If you are recording at home, try to create a minimal setup. You need a microphone to be able to record the audio. If you can, I would suggest recording in a small home studio setup like the picture above (consists of a USB microphone, a computer, and a monitor headphone).
  2. Field recording with a recorder or phone: The recording setup will vary a lot if you are meeting someone outside your home for a field recording. In that case you will need to carry an audio recorder or a smartphone (with a recording app installed) and earphones. If you're using a portable recorder, make sure you cover the top of the mic with a soft cotton cloth or fake fur to a) keep dust from getting inside, and b) reduce the sound of the wind during outdoor recording. Use a rubber band to tighten the base and never touch the cloth/fur while recording. Mics can pick up even very small movements, which can completely distort the audio.
  3. Recording from phone: Earphones that come with mobile phones can record good quality audio. They work for both phones and computers provided both devices have the same port. However, indoor recording is preferred over outdoor recording when it comes to recording voice. Record outdoors when that is important to establish context. If you are recording outdoors, make sure the microphone is facing the speaker and is close to them. Earphone and phone microphones are designed to capture audio from a short range. If the interviewee is moving during the recording, you can try to clip the phone to their clothes. You can also try to hide the earphone under their shirt/top (by taping it from inside) so that the earphone is not visible if it is a video. All of this will vary from place to place and from person to person. For instance, an interviewee might not be comfortable accepting help from an interviewer who is a stranger to place the mic under their shirt/top if they are of different genders. Try taking help from others whom your interviewee knows.
  4. Audio editing software: If you are editing on a computer, Audacity, a free and open source audio editor, is the first choice for many seasoned recording artists. It is robust, easy to use and runs on multiple platforms. If you are using your phone or tablet to record and edit the audio, use your native recording app or try to find a good free alternative in your app store. Ideally the recording/editing app should allow you to record in a decent lossless quality (minimum requirement is 44100 Hz, above 16 bit PCM i.e. 24 or 32 bit, above 220 kbps; check your settings to find these). Save the audio in .WAV or .FLAC (Audacity supports both). If your recorder/phone does not support these formats, try to use an app/online converter like this (MP3→FLAC or M4A→FLAC) to convert the audio into .FLAC.
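The quality settings mentioned above can be checked on a saved .WAV file. The sketch below uses Python's standard `wave` module, which reads plain PCM WAV files. The function names are hypothetical, and treating 16-bit as the acceptable floor is an assumption; raise `MIN_BIT_DEPTH` to 24 if you read the guideline strictly. The in-memory test file exists only so the example runs without a real recording.

```python
import io
import wave

MIN_SAMPLE_RATE = 44_100  # Hz, the minimum named in the text
MIN_BIT_DEPTH = 16        # bits per sample (assumed floor, see note above)


def check_wav(file_or_path):
    """Return a list of problems found in a PCM WAV file (empty list = passes)."""
    problems = []
    with wave.open(file_or_path, "rb") as wav:
        sample_rate = wav.getframerate()
        bit_depth = wav.getsampwidth() * 8
    if sample_rate < MIN_SAMPLE_RATE:
        problems.append(f"sample rate {sample_rate} Hz is below {MIN_SAMPLE_RATE} Hz")
    if bit_depth < MIN_BIT_DEPTH:
        problems.append(f"bit depth {bit_depth} is below {MIN_BIT_DEPTH} bits")
    return problems


def make_test_wav(sample_rate, sample_width_bytes):
    """Build a tiny silent mono WAV in memory so the sketch can run on its own."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(sample_width_bytes)
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00" * sample_width_bytes * 100)
    buffer.seek(0)
    return buffer


print(check_wav(make_test_wav(48_000, 3)))  # 48 kHz, 24-bit
print(check_wav(make_test_wav(22_050, 1)))  # 22.05 kHz, 8-bit
```

With a real recording, you would pass the file path instead, for example `check_wav("interview_01.wav")`, where the file name is a placeholder.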

Video recording

Selecting camera

Videos are captured as a series of moving images. Each image is called a "frame". When these frames are played quickly, we process them as a video. This simple theory is the same for all videos. For language documentation, there are some recommended practices:

  • If you are purchasing a camera, purchase one that can capture at least full HD (1920x1080 px). Most phones today can capture in full HD. Sometimes phones can record even better quality images than some cameras. Choose the camera or phone that best suits your requirements.
  • The camera application (app) that comes with your phone sometimes automatically adjusts the video to produce an optimum image quality. However, it might also add unnecessary features. If you are able to download a more professional app, please consider doing so. Professional apps might require some practice as they have more options for fine adjustment of settings. If you are starting out, you can use the default camera app and gradually learn about professional apps.

Editing recorded videos

You need to compress the video using free software such as HandBrake, and upload it to YouTube or a similar service without making it public. We will download it and ask you to delete it so that you don't have to worry about the amount of space it takes up on your hard drive.

Chapter 3: Metadata collection and publication

Metadata is the information that is collected to provide more information about any particular data. When you record a language as audio or video, each file is saved in your recorder or camera or phone as a unique file. This chapter would help you collect and organize the essential information for your documentation project.

The following questions would help identify what you need to collect and describe in your metadata:

  • Language:
    • What language and variation (dialect, other variation) am I recording?
    • What are the different names people have given to this language?
    • What writing systems (scripts) are used to write it?
    • Which communities/tribes use each script?
  • Speaker/interviewee:
    • What is the name of the interviewee, their age, gender?
    • What could have influenced the speaker's dialect? (e.g. If your interviewee migrated outside their home region, then it is important to mention that. Think of other such factors that might be useful. Do ask for their consent while noting such details.)
  • Technical:
    • Audio: You will see these details for each audio file by opening "Properties" (Windows) or "Get Info" (macOS) or the icon that looks like three dots (known as a "kebab menu") on a phone. The details might look like:
      • Size of the file: 10 MB
      • File extension: .wav (not compressed, high quality, and larger size) "OR" .mp3 (lower file size and lower quality)
      • Duration of the audio: 00:06:15 (in a HH:MM:SS format; "00" - number of hours, "06" - "number of minutes" and "15" - number of seconds in this example)
      • Channel: Stereo "OR" mono (in a stereo recording you can have independent audio both on left and right side, mono has both left and right channels combined into one)
      • Bit rate: 256kbps (The higher the bit rate, the higher the quality and the higher the file size. Reducing the bit rate would reduce both quality and file size)
      • Sampling rate: 48kHz "OR" 44.1kHz (either of these two is recommended; default settings in most devices record with one of these sampling rates)
      • Last modified: June 23, 2021 4:10 PM (automatically created)
    • Video: Video metadata is very similar to audio metadata, with some additional fields, and can be displayed in the same way. The additional fields can look like this:
      • Dimensions: 1920x1080 (known as "fullHD", one of the common dimensions. It is advisable to set your camera or phone camera to record in fullHD to ensure a good quality video.)
      • Codecs: AAC, H.264 (This is advisable to have. Your camera/phone camera mostly record in this format by default.)
      • Audio channels: Stereo "OR" mono
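The technical fields listed above can be gathered into one record per file, which makes them easy to copy into a metadata sheet. The sketch below is one possible way to do that; the function names, the field labels and the example file `interview_01.mp3` with its values are hypothetical and chosen only for illustration.

```python
import os


def format_duration(total_seconds):
    """Format a duration in whole seconds as HH:MM:SS."""
    hours, remainder = divmod(int(total_seconds), 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


def audio_metadata(file_name, size_bytes, duration_seconds, channels,
                   bit_rate_kbps, sample_rate_hz):
    """Collect the technical fields for one audio file into a plain dict."""
    return {
        "File name": file_name,
        "File extension": os.path.splitext(file_name)[1].lower(),
        "Size of the file": f"{size_bytes / 1_000_000:.1f} MB",
        "Duration": format_duration(duration_seconds),
        "Channel": {1: "mono", 2: "stereo"}.get(channels, f"{channels} channels"),
        "Bit rate": f"{bit_rate_kbps} kbps",
        "Sampling rate": f"{sample_rate_hz / 1000:g} kHz",
    }


# Hypothetical file and values, for illustration only.
record = audio_metadata("interview_01.mp3", 12_000_000, 375, 2, 256, 48_000)
for field, value in record.items():
    print(f"{field}: {value}")
```

The values still have to be read from the file's properties or from your recorder; the sketch only shows a consistent way to write them down.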

Annotation is the process of collecting additional information that provides background to a particular situation. For instance, in one indigenous community a particular alcoholic beverage is offered to the local deity first before drinking. A video that shows people drinking, with subtitles/captions of the conversation they are having, might not provide enough context. Such nuances are generally added as text or audio along with a timestamp (e.g. refer to 01:36: Lakshmi and Babu are showing a gesture of respect to each other before drinking "rasi"). Audio/video content will also need subtitles in widely spoken languages like English for wider coverage. Transcriptions are generally created to have a verbatim version of the interview. Ideally, you need to work with a native speaker after the interview to create the transcription to ensure that no information is lost in the process. However, a full transcription is not easy for everyone to follow. So you need to create summaries for each section of the interview that capture the highlights and sometimes details (for instance, a game or a story).
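Timestamped annotations and subtitles like the ones described above are often stored in a timed-text format such as SubRip (.srt), where each entry has a number, a start and end time, and the text. The sketch below writes such entries from a simple list. The function names are hypothetical, and the end time of the example cue is invented for illustration; only the 01:36 start time comes from the example in the text.

```python
def srt_timestamp(seconds):
    """Format seconds as the HH:MM:SS,mmm timestamp used in SubRip files."""
    milliseconds = round(seconds * 1000)
    hours, remainder = divmod(milliseconds, 3_600_000)
    minutes, remainder = divmod(remainder, 60_000)
    secs, millis = divmod(remainder, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(cues):
    """Turn (start_seconds, end_seconds, text) tuples into SubRip text."""
    blocks = []
    for number, (start, end, text) in enumerate(cues, start=1):
        blocks.append(
            f"{number}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    return "\n".join(blocks)


# One annotation starting at 01:36 (96 seconds); the end time is an assumption.
cues = [
    (96.0, 100.5, 'Lakshmi and Babu show a gesture of respect before drinking "rasi".'),
]
print(to_srt(cues))
```

Saving the returned text to a file ending in .srt, encoded as UTF-8, gives a caption file that many video players and platforms can load alongside the video.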

Download: Content Release form (editable document in .odt and .docx, fillable form in .pdf); Metadata Documentation Sheet (in .ods and .xlsx)

Chapter 4: Accessibility

Accessibility considerations ensure that everyone can access the published digital media with little or no hassle. The underlying principle of accessibility is ensuring that no one is excluded and making a conscious effort to avoid critical barriers for people with disabilities. Some of the most important considerations are the use of subtitles/captions in audio and video, the use of typefaces/fonts in visual media with proper contrast, size and alignment, and the use of colors that are friendly to the eyes of people with color blindness. To check whether the media you have published is accessible, you could use the checklist below.

Yes/no, How to Recommendations
A. Video captioning

Do your audio/video have subtitles/caption?

Yes Closed captioning (CC) is more preferred for web applications as the caption is not "burned in" (hardcoded) on the video but is displayed separately. It also helps for translation of captions if you could release it as Timed text formats such as SubRip (files ending with a .srt suffix). Open captioning means that the captions appear as images that are "burned in" on the video. You can only watch it whereas you can select different language versions available in case of Closed captioning.
No Adding captions to videos is an essential requirement when it comes to linguistic documentation. There are many ways to add captions. For computers, a highly recommended program is Aegisub (user manual) as it supports all major platforms (Windows, Mac and other Unix operating systems). Many modern video editors also support captioning. If you are collaborating with remote translators then Amara is a recommended option. It is an Open Source video subtitling platform (learn how to use it from here). Popular platforms like Internet Archive, Vimeo and YouTube are supported on Amara. YouTube also supports built-in Closed Captioning. We strongly recommend the comprehensive guides that the BBC has created (short version here, long version here) to learn how to create accessible captioning.
B. Audio/video transcriptions

Do you upload a transcription file separately along with your audio documentations?

Yes Verbatim transcriptions often retain stutters and fillers such as "umm..", "hmm.." that are a part of human speech. As the primary purpose of transcriptions is accessibility, verbatim transcriptions help. Non-verbatim transcriptions either omit stutters and fillers entirely or they are replaced with explanatory text. You might have seen in (English-language) movie subtitles how they write [MUSIC][A 3] when there is background music playing. Similarly, you can use different explanatory texts based on the context. (see below for how to transcribe)
How to Please see the Transcripts resource page on W3C for more recommendations. Here is a step-by-step guide to create audio-to-text transcription that might be useful in some cases.
No Written languages: You must consider adding transcriptions to your audio and video. Simply put, transcriptions are the text version of what is heard in an audio or video recording. They are essential for people with full/partial blindness, who use screen reader software to convert text into audio and listen to it to access the content. Transcriptions are also helpful when a particular word is not very clearly pronounced. It is important to note that many written languages might not yet have speech synthesis software, but language documentation has a long life. So, if you transcribe today and upload the transcription, it might be useful someday. A transcription is often uploaded separately as a text file along with the audio file. YouTube shows the transcription separately when the option is selected on the right side of the video (only when the video is captioned). Spoken/oral languages: As oral languages do not have a writing system, you might consider translating the content first into a well-known language that is relevant in your context, and making that transcription available.
C. Color contrast How to High contrast text is easily readable by people with low vision. So, it is always preferred over any aesthetics corrections. In your titles/captions, credits in the case of videos, documents shared along with audio/video, and web displays (websites, blogs, articles), try to use high contrast text. Extremely light-shaded text over a light-shaded background (e.g. grey over a sky background like this) are hard to read for many.
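One way to check whether text has high enough contrast is the contrast ratio defined in the W3C's Web Content Accessibility Guidelines (WCAG 2.x), which compares the relative luminance of the text color and the background color. The sketch below follows that formula; the function names and the two example colors (a light grey and a sky blue) are assumptions chosen to mirror the grey-over-sky example above. WCAG level AA asks for a ratio of at least 4.5:1 for normal-size text.

```python
def _linear(channel):
    """Convert an 8-bit sRGB channel value to linear light (WCAG 2.x formula)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb):
    """Relative luminance of an (R, G, B) color with 0-255 channel values."""
    r, g, b = (_linear(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(foreground, background):
    """Contrast ratio between two colors, from 1 (no contrast) to 21 (black on white)."""
    lum_a = relative_luminance(foreground)
    lum_b = relative_luminance(background)
    lighter, darker = max(lum_a, lum_b), min(lum_a, lum_b)
    return (lighter + 0.05) / (darker + 0.05)


AA_NORMAL_TEXT = 4.5  # WCAG level AA minimum for normal-size text

# Example colors are assumptions: light grey text on a sky-blue background.
grey, sky = (170, 170, 170), (135, 206, 235)
for name, fg, bg in [("black on white", (0, 0, 0), (255, 255, 255)),
                     ("grey on sky", grey, sky)]:
    ratio = contrast_ratio(fg, bg)
    verdict = "passes" if ratio >= AA_NORMAL_TEXT else "fails"
    print(f"{name}: {ratio:.2f}:1 ({verdict} AA for normal text)")
```

Text placed over a photograph or video frame sits on many background colors at once, so a check like this is only a starting point; a solid or semi-opaque box behind captions is a common way to keep contrast predictable.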

FAQ

  1. What is OpenSpeaks? OpenSpeaks is a set of free and open resources that are intended to help anyone who is documenting a language. It includes guides on asking for consent before recording, how to record a language in a multimedia format, the process of selecting a copyright license for a recording, recording metadata (important information that is useful for archival), and publishing the content. It also contains downloadable forms and other templates.
  2. How is this project maintained? OpenSpeaks was originally started on Wikimedia Commons, a sister project of Wikipedia, by Subhashish Panigrahi. Later, it was housed here at the O Foundation. To make it a truly open project, it was mirrored on Wikiversity so that anyone can edit and improve it. Both versions are synchronized on a regular basis.
  3. How is it different from other resources? We think of OpenSpeaks as a directory of resources and a platform that is complementary to other similar platforms. Many useful resources that are developed by language documentation organizations and other leaders are included and attributed here as well.
  4. I am interested in contributing to OpenSpeaks. How can I help? You can certainly help grow OpenSpeaks. No skill is a small skill and your contribution would be valuable. Please go to the Wikiversity site and log in using your existing Wikipedia or other Wikimedia project credentials, or create a new account there (a different Privacy Policy applies there as Wikiversity is a non-OFDN site).
  5. Will I be attributed when someone uses my contributed work? Yes. Both the License terms and the attribution guide below encourage attributing the authors with a hyperlink to the list of authors.
  6. Can I use the content of this website or the OpenSpeaks page on Wikiversity? Yes. We encourage everyone to make use of this content in their own work, translate it, and even distribute it for commercial reproduction. However, when you do that, please attribute it properly (see the next answer for details).
  7. What license is OpenSpeaks available under, and how do I attribute it when I use any content? OpenSpeaks shares the same Creative Commons License, CC-BY 3.0, that Wikiversity uses. See the Attribution section for details on attributing.

Attribution

English version

Panigrahi, Subhashish; contributors, Wikiversity (2021) [First published 2017]. "OpenSpeaks". O Foundation. India: O Foundation.

BibTeX entry
   author = "Subhashish {Panigrahi} and Wikiversity {contributors}",
   title = "OpenSpeaks",
   publisher = "O Foundation",
   year = "{{CURRENTYEAR}}",
   url = "",
   note = "[Online; accessed {{CURRENTDAY}}-{{CURRENTMONTHNAME}}-{{CURRENTYEAR}}]"
Santali version

Panigrahi, Subhashish; contributors, Wikiversity (2021) [First published 2017]. "OpenSpeaks" [ᱚᱯᱮᱱᱥᱯᱤᱠ]. O Foundation (in Santali). Translated by Murmu, R Ashwani Banjan; Baskey, Fagu; Murmu, Joy Sagar. India: O Foundation.

BibTeX entry for Santali version
   author = "Subhashish {Panigrahi} and Wikiversity {contributors} and R Ashwani Banjan {Murmu} and Fagu {Baskey} and Joy Sagar {Murmu}",
   title = "OpenSpeaks",
   publisher = "O Foundation",
   year = "{{CURRENTYEAR}}",
   url = "",
   note = "[Online; accessed {{CURRENTDAY}}-{{CURRENTMONTHNAME}}-{{CURRENTYEAR}}]"

Acknowledgements

OpenSpeaks has been enriched by a range of major projects, readings and interactions. It might not be possible to attribute all of them in chronological order, but some of the individuals and organizations include, but are not limited to:

Voluntary declaration

The Chapter "Chapter 1: Consent, Content Rights and Content Licensing" was created and expanded with a grant from Creative Commons. More details in this page.

Glossary

  • CLAVA: Citizen Language Audiovisual Archivist (also, language archivist or simply an archivist) is used to loosely describe an individual who is recording a language as audio or video for archival purposes.
  • OER: Open educational resources are freely accessible, openly licensed text, media, and other digital assets that are useful for teaching, learning, and assessing as well as for research purposes (from Wikipedia).
  • Phone: Phone is used across this toolkit to refer to a smartphone.

Notes

  1. "Community" in "community-first" approach is a reference to the community of native language speakers whose language is intended to be documented.
  2. Please inform your interviewee in advance of the purpose of such clapping. A clapping sound might be an issue for a person with a disability. Clapping during a conversation might be disrespectful in certain cultures.
  3. The vocabulary, format and style for transcriptions vary from platform to platform. For instance, some use [NAME OF SONG IN BACKGROUND] whereas others use icons such as ♬ NAME OF SONG IN BACKGROUND ♬ for representing the same thing.

References

  1. "OpenSpeaks". O Foundation. 2021-05-13. Archived from the original on 2021-06-18. Retrieved 2021-06-18. Original domain name currently deprecated.
  2. Seyfeddinipur, Mandana; Rau, Felix (2020-09). "Keeping it real: Video data in language documentation and language archiving". Language Documentation & Conservation 14: 503–519. ISSN 1934-5275. 
  3. Panigrahi, Subhashish (2020-05-11). "Promoting coronavirus education through indigenous languages". Global Voices. Retrieved 2021-05-05.
  4. Maass, Peter (2013-08-13). "How Laura Poitras Helped Snowden Spill His Secrets". The New York Times Magazine. Retrieved 2021-05-12.

Other recommended resources

  1. [Course] “Archiving for the Future: Simple Steps for Archiving Language Documentation Collections“. Accessed 30 September 2020.
  2. [Online guide] “Language Sustainability Toolkit“. Living Tongues Institute. Accessed 30 September 2020. (Archive, also see other recommended educational resources by Living Tongues)
  3. "Resources · Language Digital Activism Toolkit". Language Digital Activism Toolkit. Retrieved 2021-05-15.