Do you know how to create accessible audio and video media? Are you familiar with the difference between audio-only, video-only, and synchronized media? This resource summarizes relevant Web Content Accessibility Guidelines (WCAG) and Section 508 requirements and highlights specific considerations for content, planning, and development. It also clarifies how content creators should work together with designers and developers to ensure that federal websites and digital products meet the Revised 508 Standards.
Table of Contents
- User Benefit
- Audio Descriptions
- Planning Accessible Audio & Video Media Projects
- User Controls for Captions and Audio Descriptions
Reach out to your agency Section 508 Program Manager or to the Government-wide IT Accessibility Program if you have any additional questions.
User Benefit
According to the 2010 U.S. Census, nearly 19% of the U.S. population can be classified as having a disability—whether from birth, by accident or via old age. This means many of your customers may be blind or low-vision, deaf or hard-of-hearing, or have a cognitive, dexterity or other physical disability. For these individuals, being able to access content on the internet greatly affects their quality of life and, for federal employees, the ability to perform their duties.
Considering that many disabilities may not be obvious, it’s common for people to believe that they have yet to meet or work with a person with a disability. Yet, statistically, one out of every five people you meet and work with has acquired a disability. And as disabilities can be temporary (e.g., broken wrist, ear infection) or situational (e.g., loud meeting space), products that are accessible become beneficial to even more people.
Captions
Captions provide a textual display of spoken dialogue and indicate other sounds on visual displays, such as television monitors, computer screens and projected video. Captions are designed for people who are deaf or have hearing loss and enable full participation when viewing video or multimedia productions that have audio. Captions identify speakers on and off camera. Sound effects are typically indicated by text, such as "telephone ringing" or "footsteps," and symbols are used to indicate sounds such as music. Captions are also beneficial to people who speak a foreign language, are learning how to read, or are watching the media in a noisy area, as well as those who understand best by processing visual information.
Depending on whether the media is live or pre-recorded, be aware of the synchronized media standards laid out in WCAG when planning the production of the media product:
- 1.2.2 Captions (Prerecorded) - Captions are provided for all pre-recorded audio content in synchronized media, except when the media is a media alternative for text and is clearly labeled as such.
- 1.2.4 Captions (Live) - Captions are provided for all live audio content in synchronized media.
Examples of Captioned Media
In this example, note how the producers created a well-crafted captions file, with proper synchronization, spelling, and grammar, to ensure that the text alternative for the pre-recorded content provides an equivalent experience for those unable to hear the audio track.
Duration: 0h 0:31m
What is the difference between open and closed captions?
Open captions are displayed as a permanent part of the video, can never be turned off or hidden, and do not have to be selected by the user. Closed captions can be turned on and off and are not a permanent part of the video display.
What is the difference between captions and subtitles?
Subtitles are used to translate dialogue into a different language. They are intended for hearing audiences and do not indicate audio information important to understanding the program. Subtitles rarely identify speakers or nonverbal sounds such as music and sound effects. Subtitles are not an acceptable method for conforming with the synchronized media standards.
What is the difference between transcripts and captions?
Transcripts are the output of a process in which speech or audio is converted into a written, plain text document. Transcripts do not have time-coded information associated with them. Captions divide transcripts into blocks of text, known as “caption frames,” which are time-coded and synchronized with the audio of a video.
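To make the distinction concrete, the following is a minimal sketch of how time-coded caption frames might be written out as a WebVTT captions file. WebVTT is used here only as one common caption format; the source does not prescribe a format. The `CaptionFrame` class, its field names, and the "Speaker: text" labeling style are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class CaptionFrame:
    """One caption frame (hypothetical structure, for illustration only)."""
    start: float        # seconds from the start of the media
    end: float          # seconds from the start of the media
    text: str           # spoken words, or a sound such as "[telephone ringing]"
    speaker: str = ""   # optional speaker name shown before the text


def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as a WebVTT timestamp (HH:MM:SS.mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def to_webvtt(frames: list[CaptionFrame]) -> str:
    """Serialize caption frames as WebVTT text: a header, then one cue per frame."""
    lines = ["WEBVTT", ""]
    for frame in frames:
        lines.append(f"{format_timestamp(frame.start)} --> {format_timestamp(frame.end)}")
        # Assumed style choice: identify the speaker as a visible text prefix.
        lines.append(f"{frame.speaker}: {frame.text}" if frame.speaker else frame.text)
        lines.append("")  # a blank line separates cues
    return "\n".join(lines)


if __name__ == "__main__":
    frames = [
        CaptionFrame(0.0, 2.5, "Welcome to the park.", speaker="Ranger"),
        CaptionFrame(2.5, 4.0, "[footsteps]"),
    ]
    print(to_webvtt(frames))
```

A plain transcript would contain only the text; the start and end times on each frame are what let a player display the text in sync with the audio.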
Can’t we just use auto-caption technology for pre-recorded media?
While auto-captioning technology has improved considerably over the years, auto-captioning does not yet provide an equivalent experience for people who are deaf, hard of hearing, or who may otherwise use captions for understanding audio information. Speech-to-text output can be inaccurate, omit speaker changes, lack correct grammar and punctuation, and is often not synchronized with the narration in the audio track. Auto-captions also fail to describe other audible information, such as “cars crashing” or “fire alarm sounding,” that could provide important non-visual context. As such, current auto-captioning technology alone would not allow your agency to meet the minimum standard for pre-recorded media.
What are some tips for the production of captioned media?
The following are a few key items you should consider:
- Avoid relying solely on auto-captioning for pre-recorded media.
- Plan for captioning and avoid inefficiencies when producing media. Use speaker scripts, automated/manual captioning from live media events, and other text-based materials in your pre-recorded media captioning procedures.
- Synchronized media content with speech should have accessible captions that appear at approximately the same time as the corresponding audio and that are equivalent to the spoken words and other audio information.
- Spoken words should have text equivalents. Sounds and other audio elements necessary to understand and enjoy the entertainment experience should be captioned. Audio elements that are often overlooked by captioners include additional information provided verbally for a text element on a presentation slide.
- The style of captioned elements (e.g., speaker names, sounds) should be consistent throughout a project.
Audio Descriptions
Audio description (AD) is audio-narrated descriptions of a synchronized media program's key visual elements. These descriptions are inserted into natural pauses in the program's dialogue.
Audio description is a way to provide access to visual information by means of a verbal representation of visual elements in a video program for audience members who are blind or have a visual impairment. Audio description often cannot convey all of the visual information included in each scene of a video program; therefore, content creators and audio describers necessarily make choices to prioritize the information ultimately included in the description. Those choices seek to convey the intent of the program’s creator by presenting audiences with a description that illustrates the visual elements of a story in a manner that provides a comparable experience to that of sighted viewers.
Once again, depending on whether the media is live or pre-recorded, be aware of the WCAG synchronized media standards when planning the production of the media product:
- 1.2.3 Audio Description or Media Alternative (Prerecorded) - An alternative for time-based media or audio description of the prerecorded video content is provided for synchronized media, except when the media is a media alternative for text and is clearly labeled as such.
- 1.2.5 Audio Description (Prerecorded) - Audio description is provided for all prerecorded video content in synchronized media.
Examples of Audio Described Media
In this example of audio description, the producers planned for the secondary audio track by lengthening the video between segments of narration, leaving room for the descriptions. This results in a better, more equivalent media experience for the viewer.
Duration: 0h 0:42m
In this example, the audio description was not planned for originally, and has been inserted into the media afterwards by stopping the motion of the video and allowing time for a secondary audio track description of meaningful information.
Duration: 0h 1:40m
What are common approaches to adding audio descriptions to videos?
- Build the descriptions into the media file - The easiest way to create audio descriptions in your video is to plan for it and have your speakers identify themselves verbally (rather than just displaying their name on screen) and describe any visual information. This way, anyone, whether or not they are visually impaired, will know who is speaking and what they are referencing.
- Make a second media file - The Revised Section 508 Standard requires federal agencies to use a video player that provides user controls for captioning and audio descriptions. Some players may play an alternate audio track which contains the primary and secondary narration; others may simply load an alternate audio described version of the media file. Alternate media files must be labeled as the audio described version. Unless you planned for the time needed by the secondary audio track, you may find it difficult to reuse the original video file without pausing the video at natural breaks in the narration to keep the description synchronized.
What are some tips for the production of synchronized media?
An audio description is ultimately a creative process regardless of style, implementation or quality. Not all visual content can be described thoroughly and creators must make decisions regarding what is important to describe, the vocabulary used, timing and method of delivery. If possible, audio description should occur within the production process rather than external to it. If it does not occur within the production process, post-production remediation should be overseen by a member of the creative team, if possible. The following are a few key items that should be considered:
- Plan for accessibility to avoid retrofitting audio descriptions into already-produced media. All audio elements should be identified prior to editing video content.
- Visual elements necessary to understand and enjoy the entertainment experience are described. Visual elements that are often overlooked by describers include title, speakers and speaker changes, and end credits.
- Opportunities for redundancies that clarify comprehension and enjoyment should be considered where possible.
- The style of description should be consistent throughout a project.
- Description should only occur during non-dialogue pauses; description should never occur over dialogue, musical numbers or sound effects unless absolutely necessary.
- Describers should ensure that elements important to the narrative are described before additional details are provided. If time allows, the describer can include additional descriptions about the setting, a character's physical appearance and/or clothing to enhance the experience.
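The tip about describing only during non-dialogue pauses can be sketched as a small planning aid. The function below is an illustration, not an established tool: given the time spans where dialogue, musical numbers, or key sound effects occur, it lists the pauses long enough to hold a description. The function name, the `(start, end)` tuple format, and the 2-second minimum are all assumptions.

```python
def find_description_gaps(busy_spans, min_gap=2.0, media_end=None):
    """Return (start, end) pauses of at least min_gap seconds.

    busy_spans: (start, end) times in seconds where dialogue, music, or
    important sound effects occur (hypothetical input format).
    min_gap: shortest pause worth using for a description (assumed value).
    media_end: total media length in seconds, if a trailing pause should count.
    """
    gaps = []
    cursor = 0.0  # end of the latest busy span seen so far
    for start, end in sorted(busy_spans):
        if start - cursor >= min_gap:
            gaps.append((cursor, start))
        cursor = max(cursor, end)  # handles overlapping spans
    if media_end is not None and media_end - cursor >= min_gap:
        gaps.append((cursor, media_end))
    return gaps


if __name__ == "__main__":
    dialogue = [(0.0, 4.0), (5.0, 9.0), (13.0, 20.0)]
    for start, end in find_description_gaps(dialogue, media_end=25.0):
        print(f"Description could fit between {start:.1f}s and {end:.1f}s")
```

If the pauses found this way are too short for the essential descriptions, that is a signal to plan longer pauses during production rather than retrofit them afterwards.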
Planning Accessible Audio & Video Media Content
When creating new audio and video media content, producers must plan for captions, audio descriptions, and text transcripts to ensure that all users have access to the information. Fortunately for producers, determining which of the alternate methods is required is as quick and simple as answering this two-question workflow (see also, Figure 1):
- Will we be using sound?
  - Yes, we will be using sound.
    - Will we be using video?
      - Yes, we will be using sound and video together. This is called Synchronized Media and requires both captioning and audio description.
      - No, we will be using sound only. This is called Audio-only Media, and requires a transcript** be published alongside the audio media player.
  - No, I am not using sound (just video).
    - Is the video b-roll*?
      - Yes, the video we are using is considered b-roll. This is the raw source media and does not have to meet any Section 508 requirements.
      - No, the video we are using is content. This is called Video-only Media and requires either a transcript or audio description.
* B-roll is supplemental or alternative footage (including photographs and animation) captured for creating the video.
** A transcript is a text description of the video content to ensure equal understanding of information.
Figure 1: Planning Accessible Audio & Video Media Projects
NOTE: As federal agencies can establish policy and design guidelines which meet or exceed the minimum standards of Section 508 (for example, an agency requires open captions for all videos), producers should confirm agency requirements when planning new media content.
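The two-question workflow above can also be written as a short function. This is a minimal sketch of the minimum requirements as described on this page; the function name and return format are assumptions, and it does not account for agency policies that exceed the minimum.

```python
def required_alternatives(has_sound: bool, has_video: bool,
                          is_broll: bool = False) -> tuple[str, list[str]]:
    """Return (media type, required alternatives) per the planning workflow.

    is_broll only matters for video without sound.
    """
    if has_sound and has_video:
        return "Synchronized Media", ["captions", "audio description"]
    if has_sound:
        return "Audio-only Media", ["transcript"]
    if has_video:
        if is_broll:
            # Raw source footage; no Section 508 requirements apply.
            return "B-roll", []
        return "Video-only Media", ["transcript or audio description"]
    # The workflow assumes the project has sound, video, or both.
    raise ValueError("Media must include sound, video, or both")


if __name__ == "__main__":
    media_type, needs = required_alternatives(has_sound=True, has_video=True)
    print(media_type, "requires:", ", ".join(needs))
```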
User Controls for Captions and Audio Descriptions
The Revised Section 508 Standards include a requirement (503.4) that, where a digital product displays video with synchronized audio, the product provide user controls for closed captions and audio descriptions at the same menu level as the user controls for volume or program selection. This requirement works in concert with WCAG 2.0 success criteria and conformance requirements:
- 503.4 User Controls for Captions and Audio Description. Where ICT displays video with synchronized audio, ICT shall provide user controls for closed captions and audio descriptions.
- 503.4.1 Caption Controls. Where user controls are provided for volume adjustment, ICT shall provide user controls for the selection of captions at the same menu level as the user controls for volume or program selection.
- 503.4.2 Audio Description Controls. Where user controls are provided for program selection, ICT shall provide user controls for the selection of audio descriptions at the same menu level as the user controls for volume or program selection.
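As a rough illustration of the "same menu level" idea, the sketch below models a player's controls as a nested dictionary, where a nested dictionary stands for a submenu, and reports controls that sit deeper than the volume or program selection controls. The control names (`volume`, `program_selection`, `captions`, `audio_description`) and the dictionary layout are hypothetical, and a check like this is a simplification, not a conformance test.

```python
def control_levels(menu: dict, level: int = 0) -> dict[str, int]:
    """Map each control name to its menu depth; nested dicts are submenus."""
    levels = {}
    for name, child in menu.items():
        levels[name] = level
        if isinstance(child, dict):
            levels.update(control_levels(child, level + 1))
    return levels


def check_503_4(menu: dict) -> list[str]:
    """Return a list of problems found against 503.4.1 and 503.4.2."""
    levels = control_levels(menu)
    # Menu levels at which volume or program selection controls appear.
    reference = {levels[n] for n in ("volume", "program_selection") if n in levels}
    problems = []
    if "volume" in levels and levels.get("captions") not in reference:
        problems.append("503.4.1: caption control missing or at a different menu level")
    if "program_selection" in levels and levels.get("audio_description") not in reference:
        problems.append("503.4.2: audio description control missing or at a different menu level")
    return problems


if __name__ == "__main__":
    buried = {
        "play": None,
        "volume": None,
        "program_selection": None,
        "settings": {"captions": None, "audio_description": None},
    }
    for problem in check_503_4(buried):
        print(problem)
```

In the example, both controls are inside a settings submenu, one level below the volume control, so both checks report a problem.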
Example of Accessible User Controls
The Department of the Interior has developed an open-source media player which conforms with the Revised Section 508 Standard for user controls by ensuring that the caption control (CC) and audio description control (AD) buttons are visible alongside the other user controls (Figure 2).
Figure 2: User controls of the Department of the Interior media player
In this example, the National Park Service uses a keyboard accessible media player which provides user-selectable controls for closed caption and audio description at the same menu level as the volume. Note how the captions include descriptions of sounds, and the secondary audio track integrates the description of meaningful video content.
Duration: 0h 3:07m
| No. | Requirement |
|---|---|
| 1.0 | Video Program with Sound |
| 1.1 | Captions - Media has captions (open or closed) that are synchronized with video which includes other meaningful sounds and audio information. |
| 1.2 | Audio Description - Media has visual information described by either the primary or a secondary speaker; either in the original or alternate media which is clearly labeled and associated. |
| 2.0 | Video Program without Sound |
| 2.1 | Transcript or Audio Description - Media has either an associated text transcript, or a narrated audio track. |
| 3.0 | Audio Program without Video |
| 3.1 | Transcript - Media, such as a podcast, has a text transcript which includes other meaningful sounds and audio information. |
Below is a consolidated list of the resources referenced on this page along with some additional resources.
- Guide to Accessible Web Design & Development: Synchronized Media
- How to Meet WCAG (Quick Reference) - W3C
- Captions, Transcripts, and Audio Descriptions - WebAIM
- Making Multimedia Section 508 Compliant and Accessible - Digital.gov
- The Ultimate Guide To Closed Captioning - 3PlayMedia
- Resources developed by DigitalGov University
- Descriptive Video Production And Presentation Best Practices Guide For Digital Environments - Media Access Canada
- The Description Key - The Described and Captioned Media Program
- The Audio Description Project - American Council of the Blind
- Audio Description Core Concepts (PDF) - Harpers Ferry Center of the National Park Service
- Transcription vs. Captioning – What’s the Difference? - 3PlayMedia
- 508 Accessible Videos – How to Make Audio Descriptions - Digital.gov
User Controls for Captions and Audio Descriptions
- The Department of the Interior (DOI) media player is available on GitHub
- 508 Accessible Videos – Use a 508-Compliant Video Player - Digital.gov
Reviewed/Updated: January 2021