Caption Quality Standards

We recommend using our professional captioning service to caption your videos. However, if you're captioning your own videos, follow the guidelines below to ensure an inclusive, accessible experience for everyone.

Should Captions Be Verbatim?

In general, try to use the same wording as the speaker. Minor edits can sometimes make the captions more usable, but be careful not to change the meaning of what was said. Below are some guidelines to consider:

You may remove filler words such as “you know,” “well,” or “um,” and other non-essential information. However, a stutter or filler word can sometimes be meaningful (showing nervousness, for example), in which case it should be kept.
When a speaker uses grammatically incorrect language or a dialect, it should be reflected in the captions.
Words in another language should be captioned as spoken (formatted in italics if possible). If you cannot transcribe them, a “[speaking French]” or similar caption will suffice. Never translate words from another language into English.
If speech is cut off or low quality and is impossible to understand, use the word inaudible in square brackets: [inaudible].
Hesitations should be replicated in the captions: long pauses should be shown using an ellipsis (...), and shorter pauses should be shown with commas or dashes.

Line Length and Breaks

Putting in line breaks at the correct places makes a huge difference in the quality of the experience for the viewer. The following are guidelines to consider when adding line breaks and assessing line length:

There should be a maximum of 2 lines on the screen at a time.
Don’t include more than 42 characters on a line (including spaces).
Where possible, both lines should be about the same length.
Don’t end one sentence and begin another on the same line unless the sentences are extremely short (three words or fewer, as a general rule).
Avoid breaking between a word and its modifier, in prepositional phrases, between first and last names or associated titles, or immediately after a conjunction. Instead, lines should be broken at punctuation or natural pauses in speech wherever possible. This way, the captioned experience is similar to how we naturally process language. See the Captioning Key - Line Division section for some helpful examples.
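
The two measurable limits above (at most 2 lines, at most 42 characters per line including spaces) can be checked automatically. The following is a minimal Python sketch; the function name check_caption and the idea of passing a caption as a string with newline-separated lines are assumptions made for illustration, not part of any captioning tool. It does not judge where a line is broken, which still needs a human eye.

```python
MAX_LINES = 2   # maximum lines on screen at a time
MAX_CHARS = 42  # maximum characters per line, including spaces


def check_caption(text: str) -> list:
    """Return a list of problems found in one caption (empty if none).

    `text` is a single caption whose lines are separated by newlines.
    """
    problems = []
    lines = text.split("\n")
    if len(lines) > MAX_LINES:
        problems.append(
            f"{len(lines)} lines on screen (maximum is {MAX_LINES})"
        )
    for number, line in enumerate(lines, start=1):
        if len(line) > MAX_CHARS:
            problems.append(
                f"line {number} has {len(line)} characters "
                f"(maximum is {MAX_CHARS})"
            )
    return problems


if __name__ == "__main__":
    print(check_caption("Thanks for joining us today.\nLet's get started."))
    print(check_caption("This single line is much too long to fit on one caption line."))
```

An empty list means the caption passes both limits. Balanced line lengths and good break points are not covered by this check.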

Time on Screen

The speed of the speech can affect how long a caption is on the screen. In general, we want to stick as close to the speaker’s timing as possible. However, if the time on screen is too short to be readable (some people talk fast when nervous!), you can stretch the caption slightly to make it more readable. Use caution, though: we don't want to create a domino effect where many other captions would have to be adjusted or would end up off-timing.
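
One way to stretch a caption without starting a domino effect is to never extend it past the start of the next caption. The sketch below illustrates that idea in Python. The function name stretch_caption and the 1.5-second min_duration default are hypothetical values chosen for the example; this guide does not prescribe a specific minimum.

```python
from typing import Optional


def stretch_caption(
    start: float,
    end: float,
    next_start: Optional[float],
    min_duration: float = 1.5,  # hypothetical target, not from the guide
) -> float:
    """Return a new end time (in seconds) for a caption.

    The caption is lengthened toward `min_duration`, but never past the
    start of the next caption, so later captions keep their timing.
    """
    if end - start >= min_duration:
        return end  # already on screen long enough
    wanted_end = start + min_duration
    if next_start is None:
        return wanted_end  # last caption: nothing to collide with
    # Stop at the next caption's start, and never shorten the caption.
    return max(end, min(wanted_end, next_start))


if __name__ == "__main__":
    print(stretch_caption(10.0, 10.5, next_start=13.0))
    print(stretch_caption(10.0, 10.5, next_start=11.0))
```

Because the end time is capped at the next caption's start, only the caption being adjusted changes.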

Formatting

Screaming and shouting should be shown using ALL CAPS. Capitalize proper nouns such as names, places, and organizations.

Italicized text should be used:

When a speaker is quoting someone else.
For words and phrases that aren’t in the primary language of the video.
When a person is thinking or daydreaming.
For offscreen speech or sounds.

Note: Some programs or file types may not allow you to add italics.

Numbers:

Spell out numbers from one to ten; use numerals for numbers over ten.
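
If you are cleaning up an automatic transcript, the number rule can be applied with a small script. The Python sketch below is an illustration only: the function name spell_small_numbers is an assumption, and the pattern deliberately leaves decimals, times such as 10:30, and numbers attached to letters untouched. It does not capitalize a spelled-out number at the start of a sentence, so review the result by hand.

```python
import re

# Words for the numbers the guideline says to spell out.
NUMBER_WORDS = {
    "1": "one", "2": "two", "3": "three", "4": "four", "5": "five",
    "6": "six", "7": "seven", "8": "eight", "9": "nine", "10": "ten",
}

# A run of digits that is not part of a larger token such as 3.5,
# 1,000, 10:30, or A4.
STANDALONE_NUMBER = re.compile(r"(?<![\w.,:])\d+(?![\w:]|[.,]\d)")


def spell_small_numbers(text: str) -> str:
    """Spell out standalone 1-10; leave every other number as numerals."""
    def replace(match):
        digits = match.group(0)
        return NUMBER_WORDS.get(digits, digits)

    return STANDALONE_NUMBER.sub(replace, text)


if __name__ == "__main__":
    print(spell_small_numbers("I have 3 cats and 12 dogs."))
```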

Speaker Identification/Changes

It is important for viewers to understand who is speaking in the captions. Here are some guidelines to consider:

Start a new line or use labels when a new speaker begins talking.
If a speaker is visible on screen, introduces themselves, or their name is shown on screen, no identification is needed.
If it is unclear who is speaking, consider identifying the speaker on a line before their speech:
MR. SMITH: Let's eat lunch.

OFFSCREEN NARRATOR: This is the next video in our series.
Once a speaker has been identified, we recommend using two arrows to indicate when a speaker changes:
>> Hi Doctor, nice to meet you.
>> General, pleasure.
If the speaker has been identified but is not on-screen, be sure to still identify them.

Non-Speech Elements

Non-speech elements include music without lyrics, sounds, or the absence of sound. These elements can be an important part of conveying a video’s meaning. Here are some general principles to follow:

Generally, music or sound effects should be included in their own caption box so they appear separately on a screen, rather than within text.
Non-speech elements should be presented within square brackets “[ ]”.
If a non-speech element, such as music or no audio, continues through the entire video, simply add one caption box with the non-speech element at the beginning of the video followed by an ellipsis. This type of caption should remain on the screen for 15 seconds.
[Upbeat music playing...]
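
As an illustration, the following Python sketch builds that single 15-second caption as a cue in the SubRip (.srt) format. The choice of SRT is an assumption, since your program may use a different file type, and the function names format_timestamp and non_speech_cue are made up for this example.

```python
def format_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def non_speech_cue(
    description: str,
    index: int = 1,
    start: float = 0.0,
    duration: float = 15.0,  # on-screen time recommended above
) -> str:
    """Build one SRT cue for a non-speech element that runs all video long.

    The description is wrapped in square brackets and followed by an
    ellipsis, for example "[Upbeat music playing...]".
    """
    text = f"[{description.rstrip('.')}...]"
    end = start + duration
    return (
        f"{index}\n"
        f"{format_timestamp(start)} --> {format_timestamp(end)}\n"
        f"{text}\n"
    )


if __name__ == "__main__":
    print(non_speech_cue("Upbeat music playing"))
```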

Sound Effects

Captioning sound effects can be very important for helping someone who is deaf or hard of hearing understand a video. However, be careful to only include sound effects that are necessary or helpful in understanding or enjoying the video.

When deciding whether to include a sound effect, you may want to ask yourself if you think most people would want to read about that sound effect or if it adds any value to the video.

We recommend reviewing the Sound Effects section of the DCMP Captioning Key for specific examples.

Music

Music is often used to communicate a mood or theme.

When possible, the title and composer of a musical work should be included.

[Enya playing "Orinoco Flow"]
For unknown music, the style or presentation of the music should be described objectively.

[Calm piano music]
If the message of song lyrics is important, they should be captioned word for word. Surround lyrics with music icons (♪).
Nonessential background music that doesn’t contribute to the mood can be ignored.

The Music section of the DCMP Captioning Key offers additional guidance that may be helpful.

Other Non-Speech Elements

There may be other non-speech communication in a video that is important to include in captions. For example:

If there is speech that contains an important emotional cue that is not shown, you can indicate it in the captions. For example:

[sorrowful]
My dog died.
If it appears that people are talking, but they actually aren’t, it would be important to let the viewer know that no speech is happening:

[silence]

Learn More

If you would like to learn more about any of the caption standards above or about other specific situations, we recommend the resources below for additional reading:

DCMP Guidelines and Best Practices for Captioning Educational Video

Video reviewing the DCMP Standards