RNIB Media Briefing on Accessible PDFs
Tuesday, October 25, 2005
These are my notes and recollections of the RNIB media briefing on accessible PDFs. The Royal National Institute for the Blind hosted a media briefing on the evening of the 20th October 2005. The topic was the accessibility of PDFs. This event was organised in conjunction with Adobe, specifically because Adobe's US-based Accessibility Manager, Greg Pisocky, was in the UK for a conference.
Brief chats with Adobe
I had a brief chat with the three Adobe delegates - David Stephenson and Marc Straat, both based in the UK, as well as Greg Pisocky - before the main briefing. The one point that caught me by surprise was the observation that people deal with many different formats every day, from spreadsheets to Word documents and PowerPoint presentations. Trying to simplify accessibility to just HTML isn't a realistic option.
I quizzed the Adobe guys about dynamically generating PDFs using XSL-FO. At the moment FOP cannot generate tagged PDFs, which makes creating accessible PDFs that way impossible. Although the number of people that have this particular problem is small, they acknowledged that these customers are asking for a solution to exactly this problem. This is part of their future plans. One of PDF's advantages is creating a tagged PDF from an XML source.
One interesting piece of information is that there is now a PDF Usability and Accessibility group looking to create and publish guidelines for accessible PDFs.
I just had to ask Greg about his thoughts on Joe Clark's A List Apart article on PDFs. The response was unambiguous: Joe Clark was spot on.
Julie Howell, the RNIB's Digital Development Manager, opened proceedings. PDFs have a long history of use in the real world. Many people say the RNIB shouldn't be talking with Adobe; the RNIB disagrees. It would be folly for the RNIB and others to turn their backs on what is happening in the real world. We are not here to sing the praises of Adobe, but to test what their software can do in terms of creating accessible documents. Ideally we'd like to see software dealing with all the problems of accessibility, rather than laying it all on the developer. This briefing is about looking at the results so far.
The two speakers this evening are Hugh Huddy, the Digital Policy Development for the RNIB, and Greg Pisocky, Adobe's Accessibility Manager.
Hugh Huddy overviews the briefing
Although both the RNIB and Adobe are presenting during this briefing, the presentations have been developed separately. There are no joint goals. What you are seeing is something fresh: 2 organisations fighting head to head.
Hugh's history of surfing the web included regularly increasing the default font sizes in his browser. At one point, when he was up to 20 pixel sized Arial fonts, he switched to screen reading, and he has been using assistive technology for several years.
We are all interested in PDFs because we are interested in information. Information is about lives, it empowers people, even allows them to vote. PDF is starting to be important in the free flow of information.
There has been an explosion of interest in PDFs. Searching for the keywords download PDF on Google returns 400 million hits. Adobe must be delighted, but I'm not.
Hugh Huddy on problems and difficulties with PDFs
True documents can't be web pages. True documents can be archived, searched, contain signatures and identification, saved and printed. A document is a thing, a web page is something different. We communicate using documents, they are certificates of information exchanges. Web documents can't do that.
Is there such a thing as an accessible PDF? It depends on what we call the standard it is based on. Convergence has happened with PDFs. Millions of PDFs are in use, and millions of PDFs are not accessible.
Publishers love PDFs. They churn them out as if they were the very latest thing people want. They produce documents that look exactly how they want them to look (to people without visual impairments), and they have fun creating them.
PDFs are a new runaway train. Searching for PDFs returns 360 million hits, all of that without having accessible standards. PDF authors (professionals, lawyers and doctors) are getting on with their lives, and they haven't spotted these accessibility problems.
Hugh demonstrates a screen magnifier using an old Yahoo page. A screen magnifier user can't get an eyeful of a page in one go, the usefulness of layout is limited. Structure is important, however.
Using a screen reader is like listening to someone on the telephone. The positioning of information is of critical importance. It takes ten to fifteen seconds for a screen reader user to find something a non-visually impaired person would spot instantly.
No-one spends their days browsing the web. We deal with multiple document formats every day. There was a trend in accessibility of limiting the number of formats, by limiting the use of formats with serious accessibility problems. But today we manipulate spreadsheets, PDFs and Word documents. There was a trend that said "let's stay away from troublesome formats and anyone pushing these formats". Adobe well deserved the reputation for the poor accessibility of its formats. It's lucky to be given another chance.
At the moment, the number of tools that can generate accessible PDFs is small. We will extend the ability to create accessible PDFs beyond Adobe Acrobat 7 and Adobe LiveCycle Designer.
Greg: File formats and user agents
We need to distinguish between the PDF file format and the PDF user agent. There is also another set of user agents we need to consider - assistive technologies. Popular authoring tools can't produce accessible PDFs. Third party organisations are jumping on the bandwagon of accessible PDFs. Acrobat Reader version 7 has made a number of accessibility improvements.
The PDF format is the result of an open, public file specification labelled as being from Adobe. The majority of PDFs created are not from Adobe. Adobe Acrobat 7 is compatible with screen readers - this has been tested with Supernova, JAWS and Window-Eyes, as well as VoiceOver on the Mac and Gnopernicus on Linux.
Greg: Defining accessibility
The definition of accessibility has changed. Vendors have paid attention to advocacy groups, but it was not until e-government requirements forced access as part of law that accessibility was taken seriously. Government requirements changed the situation from an issue that affects a low number of people to a big user with an accessibility requirement. A major factor in improving accessibility is the forced procurement of accessible products.
Accessibility is not just an issue of functionality, but also economical access. Acrobat 7 users don't have to have a screen reader to access a PDF, since Acrobat 7 offers features like Read Aloud, Reflow content and high contrast displays. We don't do a good job of telling users the key presses they need to activate these features. Acrobat 7 works with screen readers.
Greg: Creating accessible PDFs
PDF is typically a destination format: a document is normally born as a word processor file which is then translated into a PDF. Scanned PDFs are the worst possible thing - it's a sheet of paper run through a scanner. It looks fine, but it is not accessible. It's just an image.
Anyone who can hit a print button can produce a PDF. We create PDFs without thinking about how it sounds, without thinking about it in the same way we think about spelling, grammar and visual layout. Accessibility is the 4th editorial component of creating PDFs.
If you want to create PDFs, then use applications that create tagged PDFs. The PDF Maker macro for Microsoft Office is a good tool. Microsoft have announced their intent to produce PDF tools that create accessible PDFs natively.
The PDF specification is the other scenario for accessible PDFs - it is a collection of mandatory and optional chapters. The accessibility chapter is optional. Open Office did something that no other vendor has done - they supported the accessibility chapter. Other vendors are not doing this, and they should.
A list of tools that generate tagged PDFs:
- Adobe GoLive
- Microsoft Office
- Star Office
- Open Office
Tools that currently do not generate tagged PDFs:
- QuarkXPress
- Corel WordPerfect
- Adobe Illustrator
Greg: Accessible PDFs with Adobe Acrobat 7
In Adobe Acrobat the accessibility controls are hidden, but they are there. [Greg demonstrates adding alternative text to an image]. The mechanisms are there, but we are not taking full advantage of them. Adobe's LiveCycle Designer is the only tool that can make accessible PDF forms.
Quick list of tips for creating accessible PDFs with Acrobat 7.
- Review and correct the reading order
- Review and correct tagging of figures and diagrams
- Add Alternative text to images
- Tag simple table data
It's possible to bolt on accessibility to previously created PDF documents - it's like playing a game of pinning the tail on the donkey. For example, we draw a box around a block of text and then mark it as a paragraph.
Acrobat 7 offers the following accessibility authoring features:
- An accessibility checker
- Recognising text using OCR - useful to tackle the scanned page problem
- Accessible forms (using LiveCycle)
- MakeAccessible - an XML-to-PDF workflow is the best approach to creating PDFs, since XML provides a logical order and structure as a basis for PDF accessibility
Hugh demonstrates PDF accessibility
Authors still have difficult headjams to think their way through. Authoring program designs limit the approach to writing content. Software vendors have an obligation to improve their software.
Hugh demonstrates how a screen reader handles typical PDFs. Adobe Acrobat offers an option to infer a reading order, which on inaccessible PDFs reads out "blank blank blank blank". If that was a legal document for me to sign, or a job application, or a company report - it's not good news, it's a waste of my time.
Running an accessibility quick check reports that the document appears to contain no text, and that it may be scanned text. "How many documents did we scan in and stick on our website?"
Hugh tests a tagged PDF
A tagged PDF document is then demonstrated, reading out the contents successfully. Reading the content isn't sufficient, being able to navigate the document is also important. Accessibility is about finding and opening a document, as well as getting a feel of the overall architecture of a document. A non-visually impaired person would glance at a document to understand the overall architecture, a visually impaired person would need some sort of overview navigation.
Hugh then tests the "How to use PDF reader" PDF from the Adobe website. It's navigable because the document is tagged and structured. When the PDF is tagged, a visually impaired person can then move through the sections and open or collapse sections. Without these navigation features a document cannot be accessible since it will just be reams and reams of text.
Many PDF documents are totally inaccessible. It's a common assumption that electronic documents are alternate formats and are accessible. Millions of people believe a screen reader can read any document even if it's a picture.
Greg: Assessing the accessibility of PDFs
When assessing the accessibility of a PDF, ask yourself this series of questions:
- Is the PDF a scanned image?
- Is it intended to be a form?
- Is the PDF tagged?
- Are the items properly tagged?
- Verify the reading order
- Add proper tagging (e.g. to figures and tables)
- Add alternative description to graphics
- Have I missed something? Run an accessibility checker and make the appropriate repairs it recommends.
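The checklist above can be read as a small triage procedure. The sketch below is only an illustration of that reading: the `PdfFacts` class, its field names and the wording of the steps are all hypothetical, and the facts are assumed to come from a person inspecting the PDF, not from any Adobe API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PdfFacts:
    """Hypothetical summary of what a reviewer has observed about one PDF."""
    is_scanned_image: bool
    is_form: bool
    is_tagged: bool
    reading_order_verified: bool
    figures_have_alt_text: bool


def triage(facts: PdfFacts) -> List[str]:
    """Return the outstanding repair steps, in the order the checklist asks them."""
    steps = []
    if facts.is_scanned_image:
        steps.append("run OCR to recognise the text")
    if facts.is_form:
        steps.append("check the form fields are accessible")
    if not facts.is_tagged:
        steps.append("add tags")
    if not facts.reading_order_verified:
        steps.append("verify the reading order")
    if not facts.figures_have_alt_text:
        steps.append("add alternative text to graphics")
    # The final question always applies: have I missed something?
    steps.append("run an accessibility checker")
    return steps


if __name__ == "__main__":
    scanned = PdfFacts(True, False, False, False, False)
    for step in triage(scanned):
        print("-", step)
```

Even a PDF that passes every question still ends with the checker step, which mirrors the last item on Greg's list.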
Greg: Problems with making PDFs accessible
There's a large burden on the author for creating accessible PDFs: they need to know where the preferences can be set. Adobe's specific direction is to automate this process. If a PDF isn't tagged there's no chance of it being accessible. Realtime (on-the-fly) tagging triggers a reflow which can try to guess the tagging structure.
Tables are the holy grail when it comes to accessible PDFs. An author must set his sights on improving the markup and access to data inside the table. This is a high priority of Adobe, refinements and improvements are taking place.
There are two specific areas of focus. The User Agent needs to automate more of what needs to be done to make a PDF more accessible. And we need to focus on accessible tables (as well as any RNIB recommendations).
Hugh on the RNIB recommendations
Hugh now demonstrates the RNIB Annual Report - a financial report containing 70 pages and lots of tabular data. This is a high profile document for the RNIB. The authors really didn't know what to do to create accessible PDFs. Many people are afraid to get it wrong, and more afraid to even try. We got lots of technical help, and the end result: for the first time ever you could navigate a document with a screen reader that you could also print out. The ends are not quite joined up but we are so close.
JAWS reads out the table of contents as "Page has seven links", and prefaces each link by reading out the word "link", which allows the reader to jump around the structure. Here we have navigation and an understanding of the document structure.
Hugh: PDF Holy Grail: Tabular data
Tables are frightening, but it's not always true that they are inaccessible. Page 34 of the RNIB Annual Report is a balance sheet. The screen reader announces the table size and allows the user to switch into table navigation. It reads headers like HTML table browsing mode. This works very well with uniform tables - but designers dislike uniform tables, probably because they look boring. Designers prefer nesting subheadings and titles.
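To see why uniform tables are the easy case, here is a minimal sketch of table navigation as described: announce the size, then pair every cell with its column header. It is an assumption-laden illustration, not how JAWS or Adobe Reader actually work; the function name, the phrasing of the announcements and the sample figures are all invented.

```python
from typing import List


def announce_table(headers: List[str], rows: List[List[str]]) -> List[str]:
    """Return the phrases a listener might hear: table size, then header plus cell."""
    phrases = [f"Table, rows: {len(rows)}, columns: {len(headers)}"]
    for row in rows:
        # A uniform table has exactly one cell per column header in every row.
        if len(row) != len(headers):
            raise ValueError("not a uniform table: row length differs from header count")
        for header, cell in zip(headers, row):
            phrases.append(f"{header}: {cell}")
    return phrases


if __name__ == "__main__":
    # Invented figures for illustration; not taken from the RNIB Annual Report.
    for phrase in announce_table(["Item", "This year", "Last year"],
                                 [["Fixed assets", "10", "9"]]):
        print(phrase)
```

A row with nested subheadings or merged cells no longer lines up with the column headers, which is where this simple pairing breaks down.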
Inclusivity is the only way forward. Adobe Reader 7 and JAWS render a PDF nicely. It sounds normal, and that is a great success. The average publisher won't be able to create such an accessible PDF at this point. Thanks to Adobe (and Greg) for making such a big difference. Adobe can make these accessible features simpler to use. That's the holy grail: making accessible PDFs part of the design itself. Speed up Adobe, you have the power to make a big difference.
Authors, buy products that can make accessible PDFs. This is critical: create a demand on the market. Only buy products that are generating accessible markup. Get wise about producing accessible PDFs. Take the time to understand why and what needs to be done to make a PDF accessible, otherwise it won't be accessible, and that's a legal problem.
The RNIB and other disability related organisations, put pressure on each other to look at and adopt these new technologies. Throw back inaccessible PDFs and complain.
Authors, sort out the problems when they are reported. Buy accessible products.
Adobe, fix the problems that authors raise when they are trying to make accessible PDFs.
Questions and Answers
A question and answer session followed. One interesting point was how to test that the reading order of a document was correct. Greg suggests raising with Adobe the idea of displaying a read order view numbering each section of text, and that can be used to visually verify the reading order is correct. Also, saving the PDF as plain text is a good way of checking the reading order, and also as a way of allowing PDF content to be accessible to a greater variety of devices.
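The numbered read order view Greg describes can be imitated on paper. The sketch below assumes a hypothetical representation in which each block of text is a pair of its text and its distance from the top of a single-column page, listed in the order the tags would be read. It numbers the blocks and flags any block that is read after one positioned lower on the page.

```python
from typing import List, Tuple

# (text, distance from top of page); a hypothetical representation for illustration
Block = Tuple[str, float]


def numbered_reading_order(blocks: List[Block]) -> List[str]:
    """Number the blocks in the order they would be read."""
    return [f"{i}. {text}" for i, (text, _top) in enumerate(blocks, start=1)]


def out_of_order(blocks: List[Block]) -> List[str]:
    """Return the blocks that are read after a block sitting lower on the page."""
    flagged = []
    lowest_so_far = float("-inf")
    for text, top in blocks:
        if top < lowest_so_far:
            flagged.append(text)
        lowest_so_far = max(lowest_so_far, top)
    return flagged


if __name__ == "__main__":
    page = [("Title", 0.0), ("Second paragraph", 200.0), ("First paragraph", 100.0)]
    print("\n".join(numbered_reading_order(page)))
    print("Check these:", out_of_order(page))
```

The single-column assumption matters: on a multi-column page a block can legitimately be read after one that sits lower, so a flag here is a prompt to look, not proof of an error.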
- A List Apart: Facts and Opinions about PDF Accessibility
- IT-Analysis: Accessible PDF documents for the blind
- Joe Clark: Here we go again with untagged PDFs