The State of Subtitling for the Deaf and Hard of Hearing

In today’s world, subtitles and captions are becoming ever more important as companies in the media industry seek to target a wider audience and promote inclusivity for all. As a result, visual media is becoming more accessible and dynamic.

Subtitles are a valuable addition to video content for a wide variety of viewers. They improve accessibility for the deaf and hard of hearing as well as for individuals with autism, ADHD, and dyslexia. Viewers also benefit from subtitles when a programme's dialogue is not in their native language.

Even among hearing viewers watching content in their own language, many prefer to keep subtitles on, as it helps them follow the speech and better retain the information presented. A 2006 study by the UK's Office of Communications (Ofcom) found that as many as 80% of people who used captions were not deaf or hard of hearing.

Subtitles and closed captions help viewers comprehend dialogue that is spoken rapidly. They also allow viewers to enjoy content in noisy or sound-sensitive environments, such as libraries, without having to turn the volume up. Many social media users watch video content on auto-play without audio, and captions help ensure that the message isn't lost.

Without a doubt, subtitles enable media companies to reach a wider audience and increase user engagement on their content. Videos with captions usually have longer viewing times and are likely to be shared more. It’s no wonder, then, that the subtitling industry is rapidly growing.

Subtitled Content is on the Rise

According to a report by the European Federation of Hard of Hearing People, most countries are improving their provision of subtitles on television. The rise of on-demand services such as Netflix, Hulu, and Disney+ has also led to an increase in subtitled content.

Services such as BBC iPlayer and Netflix do a good job of providing subtitles. However, there is still room for improvement in the number of languages in which captions are offered.

While progress has been made in the amount of content being subtitled, some still believe that subtitling is an expensive, slow, and laborious process. However, subtitles cost very little compared to the overall production costs of programming.

What’s more, Automatic Speech Recognition (ASR) technology has improved dramatically over the years, making it much easier to create accurate subtitles. Nowadays, voice services understand accents better and make more informed decisions to accurately distinguish between similar-sounding words and phrases.

On the other hand, speech recognition software still has a lot of room for improvement.

While it has certainly grown by leaps and bounds over the decades, reliably recognizing speech in real-world acoustic environments is still a challenge for most systems. Also, most ASR technologies require well-trained language models as well as input from experts to keep the Word Error Rate (WER) to a minimum.
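The Word Error Rate mentioned above has a standard definition: the minimum number of word substitutions, deletions, and insertions needed to turn the system's output into the reference transcript, divided by the length of the reference. A minimal sketch in Python (the example sentences are invented for illustration):

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref, hyp = reference.split(), hypothesis.split()
    # Levenshtein distance over words, via dynamic programming.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[len(ref)][len(hyp)] / len(ref)

# Two substitutions out of six reference words -> WER of about 0.33.
print(word_error_rate("the drivers pit on lap five",
                      "the driver spit on lap five"))
```

Even one such error in six words yields a 33% error rate, which is why production subtitling holds ASR output to much stricter standards than casual dictation.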


ASR needs Human Intervention

Accuracy is a vital component of subtitling, as it's key to ensuring that individuals who rely on captions get a faithful representation of the original spoken content. Word errors often result from less-than-ideal conditions, including poor audio quality, background noise, overlapping speech, and multiple speakers.

Formatting errors are also common with ASR technologies. These include incorrect punctuation, misleading speaker labels, bad grammar, and missing [inaudible] tags and other notations.

According to a report on the state of AI-driven subtitling, demand for subtitling solutions grew rapidly in 2020. The study noted that several advancements in ASR technologies, as well as in training, have been made in recent years. However, it concluded that current ASR technologies are still inadequate for use as standalone systems without human input.

Thus, it’s essential that media production companies work closely with experts to consistently improve the language model and to reduce word and formatting errors. This will result in less time needed to review transcriptions, lower correction costs, and ultimately, lower subtitling costs.


Experts can Help Lower AI-Driven Subtitling Costs

To reduce word error rates, custom dictionaries can be used to help the Automatic Speech Recognition system recognize domain-specific words. For example, a talk show about Formula 1 racing may contain the names of F1 drivers such as Hamilton or Verstappen, as well as terms like DRS (Drag Reduction System) or tankslapper.

Since these terms aren’t frequently used in regular day-to-day speech, an ASR system might fail to correctly transcribe them, leading to an increased word error rate. However, if these terms are uploaded into the ASR system beforehand, they’re more likely to be recognized. This will lead to fewer errors and reduce the need for a human operator to ensure accuracy, making AI-driven subtitling even cheaper.
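Real ASR services expose custom dictionaries through their own APIs (often under names like "phrase hints" or "custom vocabulary"). As a rough illustration of the idea, a fuzzy-matching post-processing pass can be sketched in Python; the term list and the matching approach below are assumptions for the sake of the example, not any particular vendor's implementation:

```python
import difflib

# Hypothetical domain dictionary for an F1 programme.
CUSTOM_TERMS = ["Hamilton", "Verstappen", "DRS", "tankslapper"]

def apply_custom_dictionary(transcript: str, terms: list, cutoff: float = 0.8) -> str:
    """Replace words that closely match a domain term with the canonical spelling."""
    canonical = {t.lower(): t for t in terms}
    corrected = []
    for word in transcript.split():
        # Find the closest dictionary entry above the similarity cutoff, if any.
        match = difflib.get_close_matches(word.lower(), canonical, n=1, cutoff=cutoff)
        corrected.append(canonical[match[0]] if match else word)
    return " ".join(corrected)

print(apply_custom_dictionary("verstapen defends from hamilton with drs", CUSTOM_TERMS))
# -> "Verstappen defends from Hamilton with DRS"
```

Supplying the vocabulary to the recognizer itself, rather than patching its output afterwards, is even more effective, since the system can then prefer the domain term at decoding time.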

Experts are also needed to ensure proper speaker diarization. Speaker diarization is the process of splitting an audio stream into different segments according to the identity of the speakers. Some ASR systems do not support this process and only produce large blocks of text, thus requiring a human operator to manually perform the diarization.

However, the best ASR systems handle this process well and can accurately indicate the switches between speakers in an audio stream. As a result, the work of a human operator is greatly reduced, making the process much faster and cheaper. Automatic speaker diarization also makes it easier to sync subtitles with speech.
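Diarization output is typically a list of time-stamped segments labelled by speaker. As a rough sketch (the segment format and the speaker labels are invented for the example), such segments map directly onto speaker-tagged subtitle blocks in the common SRT format:

```python
def to_srt(segments):
    """Render (start_sec, end_sec, speaker, text) segments as SRT caption blocks,
    with one block per diarized segment and a visible speaker label."""
    def stamp(t):
        h, rem = divmod(int(t), 3600)
        m, s = divmod(rem, 60)
        ms = round((t - int(t)) * 1000)
        return f"{h:02}:{m:02}:{s:02},{ms:03}"

    blocks = []
    for i, (start, end, speaker, text) in enumerate(segments, 1):
        blocks.append(f"{i}\n{stamp(start)} --> {stamp(end)}\n[{speaker}] {text}")
    return "\n\n".join(blocks)

segments = [
    (0.0, 2.5, "HOST", "Welcome back to the show."),
    (2.5, 5.0, "GUEST", "Thanks for having me."),
]
print(to_srt(segments))
```

Because each caption block inherits its timestamps from the diarized segment, speaker switches and subtitle timing fall out of the same data, which is what makes syncing easier.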

When companies work with experts to improve the language model, implement custom dictionaries, and give human operators a custom-designed user interface informed by studies of operator workflows, the cost of subtitles per program is reduced even further.
