What are subtitles? What are captions?

First of all, it's worth noting that the distinction I'm about to make here is not all that widespread. I think it's a useful one, though, and it's been advanced by, e.g., the HTML5 standard, so I decided to present some of my thoughts on the matter.

At first glance, subtitles and captions seem to be pretty much the same thing. They are a kind of timed text, presented alongside video or audio. However, the words subtitle and caption have come to have quite different meanings, with different standards and best practices used in their creation and presentation. To borrow the wording of the above-mentioned HTML standard, subtitles are "Transcription or translation of the dialogue, suitable for when the sound is available but not understood," whereas captions are "Transcription or translation of the dialogue, sound effects, relevant musical cues, and other relevant audio information, suitable for when sound is unavailable or not clearly audible."

Subtitles and captions both seek to address the accessibility of a piece of media. Even people who don't use them in their daily lives are probably familiar with closed captions from the muted televisions in places like dentists' offices and cafes. These captions are onscreen text, timed to speech, that make audiovisual media accessible to anyone who is, for whatever reason, unable to listen to or hear the media. This includes people who are deaf or hard of hearing, someone who doesn't want to use their phone's speaker on the bus, or anyone who's trying to follow a sports game in a crowded bar. The main point is that captions make the media accessible to anyone who would otherwise be unable to hear what's being presented.

Subtitles, though just as concerned with making a piece of media more accessible, do not aim to be a replacement or supplement for the audio track. Instead, subtitles refer to timed text that offers a translation of the media in question. In other words, subtitles make the media accessible to an audience that would otherwise not be able to enjoy it because of a language barrier. As an anime fansubber for an English-language audience, this is the perspective I come from.

So what's the difference, really?

While you might assume that subtitles would simply be captions that happen to be in a different language from the source material, that's not quite the case. Because of their different purposes, they tend to have different standards in terms of how they're displayed, timed, styled, etc. Furthermore, the two kinds of timed text have significantly different institutional histories. The standards for captions are written down in policy, lobbied for by advocacy groups, and usually codified by public broadcasting organizations like the BBC. Good subtitle style, on the other hand, has largely been pioneered by much smaller organizations, as well as subcultural communities like fansubbers who are, for better or for worse, much more flexible in terms of how they do things. In contrast to the "accessibility first" ethos of captioning best practices, when subtitlers have seriously thought about their craft, they have tended to emphasize a "seamless" experience: one where the target-language viewer has as close as possible to the same viewing experience as the native-language audience. This has resulted in some standard practices that aren't ideal for accessibility.

If you compare professional closed captions and good, modern fansubs, the differences are stark. Things like subtitle flash, which fansubbers avoid for aesthetic reasons, are actually mandated by the style guides used by organizations like Netflix. Take this video from Mark Brown at Game Maker's Toolkit, where he talks about subtitles [captions] in video games. He presents several "golden rules" for game developers on how to create accessible captions. Plenty of them are excellent best practices for both kinds of timed text, like making sure the text is large enough, using a clean sans-serif font, ensuring good contrast, etc. However, he also advocates certain techniques that would be entirely out of place in good subtitles: putting a semi-transparent black frame around the text, as is common in YouTube captions; the several-frame gap mandated by Netflix mentioned above; and colour-coding characters' speech.
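To make the gap disagreement concrete, here's a rough Python sketch of the two timing philosophies. Everything in it is my own illustration: the `Cue` type, the function names, and the default values of 2 and 12 frames are placeholders I picked for the example, not numbers taken from any particular style guide or tool.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    start: int  # frame on which the line appears
    end: int    # frame on which the line disappears
    text: str


def enforce_min_gap(cues, min_gap=2):
    """Caption-style: trim a cue so at least min_gap frames separate it from the next."""
    out = []
    for cur, nxt in zip(cues, cues[1:] + [None]):
        end = cur.end
        if nxt is not None and nxt.start - end < min_gap:
            # Pull the end back, but never make the cue shorter than one frame.
            end = max(cur.start + 1, nxt.start - min_gap)
        out.append(Cue(cur.start, end, cur.text))
    return out


def close_small_gaps(cues, max_gap=12):
    """Fansub-style: stretch a cue to the next start when the gap is short, so nothing flashes."""
    out = []
    for cur, nxt in zip(cues, cues[1:] + [None]):
        end = cur.end
        if nxt is not None and 0 < nxt.start - end <= max_gap:
            end = nxt.start
        out.append(Cue(cur.start, end, cur.text))
    return out
```

The same input comes out of the two functions timed in opposite directions: the first opens a visible blink between back-to-back lines, while the second removes any blink short enough to look like a flicker.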

The topic of the second half of that video brings me to my next point: captions, assuming that you're unable to hear the media's audio track, make a habit of captioning just about as much as is reasonable. Exclamations, doors being knocked on, whistling, and so on. Good subtitles, on the other hand, avoid subtitling anything that isn't speech. It's common for new subbers to want to subtitle things like screams, breathing, and so on. For any human audience, anyways, those sounds already have all the meaning they're going to have. Assuming the audience can hear the sounds, a text caption adds no meaning whatsoever. So, in order to provide the "seamless" experience that subtitles aim towards, these sounds should not be represented in the text at all.
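As a toy illustration of what gets left out, here's a small Python sketch that strips sound descriptions from caption-style lines. It assumes non-speech cues are written in square brackets or parentheses, which is a common convention but by no means universal, and the `speech_only` name is just something I made up for the example.

```python
import re

# Assumption: sound descriptions look like "[door knocks]" or "(whistling)".
# Real caption files vary, so treat this pattern as illustrative only.
NON_SPEECH = re.compile(r"\[[^\]]*\]|\([^)]*\)")


def speech_only(lines):
    """Remove sound descriptions and drop any line that has no speech left."""
    kept = []
    for line in lines:
        # Replace each description with a space, then collapse the whitespace.
        stripped = " ".join(NON_SPEECH.sub(" ", line).split())
        if stripped:
            kept.append(stripped)
    return kept
```

Fed `["[door knocks]", "Who is it?", "(sighs) Not again."]`, this keeps only the two spoken lines, which is roughly the subtitler's rule of thumb expressed as a filter.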

Subtitles, however, have something similar they have to reckon with: signs. A sign is, in subtitler lingo anyways, any piece of text shown in the video itself. If you're translating Japanese anime, and there's a Japanese sign, or book, or namecard, or video game HUD element on-screen, you'll be letting your audience down if they can't understand it. What's common in captions and professional subtitles is to simply overlay text over the sign. This solution, while simple and effective, flies in the face of the "seamless" subtitle ideal.

In DVD or Blu-ray subtitles, where the subs are actually just a series of bitmap images, you could overlay an image. In old-school hard-subbed fansub releases, you could use professional compositing tools to modify the video itself. In modern anime fansubbing, however, where hardsubbing is eschewed in favour of clean video with subtitles rendered in the video player, fansubbers have pioneered the art of typesetting using only the features provided in the ASS (Advanced SubStation Alpha v4+) subtitle format. This is, all things considered, a pain in the ass and everyone regrets it.
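For a taste of what that typesetting looks like, here's a minimal Python sketch that builds a single ASS `Dialogue` event pinning a translation over a sign. The override tags it uses (`\an`, `\pos`, `\fs`, `\frz`) are standard ASS tags, but the helper names, the coordinates, and the "Sign" style are hypothetical: that style would have to be defined in the script's styles section, and real typesetting usually involves far more than one line.

```python
def ass_time(seconds):
    """Format seconds as an ASS timestamp, H:MM:SS.cc (centisecond precision)."""
    cs = round(seconds * 100)
    h, rem = divmod(cs, 360000)
    m, rem = divmod(rem, 6000)
    s, cs = divmod(rem, 100)
    return f"{h}:{m:02d}:{s:02d}.{cs:02d}"


def sign_line(start, end, text, x, y, size, angle=0.0, style="Sign"):
    """Build one Dialogue event that places translated text over an on-screen sign."""
    # \an5 anchors the text at its centre, \pos puts that anchor at (x, y) in
    # script coordinates, \fs sets the font size, \frz rotates around the z axis.
    tags = rf"{{\an5\pos({x},{y})\fs{size}\frz{angle:g}}}"
    # Fields: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
    return f"Dialogue: 0,{ass_time(start)},{ass_time(end)},{style},,0,0,0,,{tags}{text}"


print(sign_line(12.5, 15.0, "Bakery", 640, 120, 48, -5))
```

Matching the sign's font, colour, perspective, and motion from frame to frame is where the real work (and the regret) comes in.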


There's quite a bit more depth I could go into, like talking about the placement of the text on screen, and how text is styled in various standards, but I think that's something for another time. For now, suffice it to say that subtitles and captions are different, with different purposes and taking different forms. Captions strictly prioritize accessibility and assume the viewer is unable to hear the audio, whereas subtitles prioritize creating an experience as close as possible to the native audience's. I do want to leave you with some questions, though: Does this mean that subtitles are inaccessible? Particularly in anime, the sphere I'm most familiar with, there's a broad swath of subtitled media that doesn't strictly meet caption accessibility standards. Is the inequity great enough that it's something we should worry about, or are subtitle standards adequate, if not perfect? Should something be done? If so, what?

