November 7, 2018

How to Handle Multiple Languages When Tagging For Accessibility

Jen Goulden | Project and Quality Manager
Document Accessibility Whitepapers

When you think of accessible PDF files, elements such as lists, headings and tables probably come to mind. If you’re testing files, you’ll check that complex tables have been properly tagged, that the heading hierarchy is correct and that alt text has been applied where needed. This is a good place to start, but there are other important things to consider.

One issue that people often ask about is document language. If the content of a file is entirely in English you can select the language in the document properties. But how should you handle files in other languages, particularly those that are bilingual or multilingual?

If the file is unilingual, the language must be selected in the document properties, whether the content is English, Spanish or Swahili. If the file contains content in more than one language, each block of text must also be tagged with its own language. For example, let’s say you are tagging the dinner menu for a Mediterranean cruise ship (one of my favourite destinations). This document contains English, French and Spanish versions of the menu. You would need to tag the English portion as English text, the French portion as French text, and the Spanish portion as Spanish text. This matters because screen reader users can set their software to detect language. When this feature is activated, the screen reader switches pronunciation to match the language tagged on each block, so the text is spoken as it should be. Note that end users may need to download extensions for their screen reader software in order to access languages that don’t use the Roman alphabet (such as Mandarin, Russian and Japanese).
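The menu example can be sketched in a few lines of Python. This is only an illustration of the idea, assuming that a block without its own language tag falls back to the language of its parent and ultimately to the document language. The `Tag` class, the `spoken_languages` function and the sample menu text are all hypothetical and are not part of any PDF tool.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Tuple


@dataclass
class Tag:
    """Hypothetical stand-in for a tag in a PDF tag tree (not a real library type)."""
    role: str
    text: str = ""
    lang: Optional[str] = None  # None means "no language set on this tag"
    children: List["Tag"] = field(default_factory=list)


def spoken_languages(tag: Tag, inherited: str) -> Iterator[Tuple[str, str]]:
    """Yield (language, text) pairs in reading order.

    Assumption: a tag without its own language uses the one inherited
    from its parent, and the top level inherits the document language.
    """
    lang = tag.lang or inherited
    if tag.text:
        yield lang, tag.text
    for child in tag.children:
        yield from spoken_languages(child, lang)


# Document language as it would be set in the document properties.
document_language = "en"

# Illustrative trilingual menu: only the French and Spanish sections
# need their own language, because English is the document default.
menu = Tag("Document", children=[
    Tag("Sect", children=[
        Tag("H1", "Dinner Menu"),
        Tag("P", "Grilled sea bass"),
    ]),
    Tag("Sect", lang="fr", children=[
        Tag("H1", "Menu du dîner"),
        Tag("P", "Bar grillé"),
    ]),
    Tag("Sect", lang="es", children=[
        Tag("H1", "Menú de la cena"),
        Tag("P", "Lubina a la parrilla"),
    ]),
])

result = list(spoken_languages(menu, document_language))
for lang, text in result:
    print(f"[{lang}] {text}")
```

If the `lang` values were left off the French and Spanish sections in this sketch, every block would come out as `en`, which is the situation that produces French and Spanish read aloud with English pronunciation.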

One final consideration regarding multiple languages in accessible PDF files relates to accented letters. Screen readers cannot always process accented characters accurately, which can result in some pretty strange pronunciations. Usually the best approach is to make sure each accented character is represented by its proper Unicode value, so that the screen reader receives the letter the author intended.
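One possible reading of this advice, offered here as an assumption rather than a rule, is that each accented letter in the source text should be stored as a single Unicode character instead of a plain letter followed by a separate accent mark. A minimal Python sketch of that clean-up step, using the standard `unicodedata` module, might look like this. The `to_precomposed` helper and the sample text are hypothetical.

```python
import unicodedata


def to_precomposed(text: str) -> str:
    """Combine each base letter and its accent mark into one character (NFC form)."""
    return unicodedata.normalize("NFC", text)


# "crème brûlée" written with plain letters followed by combining accents,
# as text sometimes arrives from other applications.
decomposed = "cre\u0300me bru\u0302le\u0301e"
composed = to_precomposed(decomposed)

# The visible text is the same, but the character counts differ.
print(len(decomposed), len(composed))

# List the Unicode values of the accented letters after clean-up.
accented = [f"U+{ord(ch):04X}" for ch in composed if ord(ch) > 127]
print(accented)
```

Whether this step helps depends on how the authoring tool and the screen reader handle text, so it is worth testing the exported PDF with a screen reader rather than relying on the source text alone.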

Bilingual and multilingual documents are more common than they used to be, and this trend is likely to continue. Setting the document language and accurately tagging foreign-language text will not only make your content more accessible but will greatly improve the user experience as well.