Weblogs: Web Accessibility

The Price of omitting the alt

Monday, September 10, 2007

There's a loud and sometimes fraught argument going on in the HTML 5 arena about whether to allow the alt attribute of an img to be omitted. The alt attribute is one of the main ways of specifying a short text-equivalent for an image, and one of two ways of directly relating an image to its equivalent textual representation.

A textual equivalent to an image is absolutely essential so that the content conveyed by the image is accessible to human senses that cannot process images. Where we have content in a non-textual form, we need to provide an equivalent representation of that content in a textual form. On that point there is no argument.

Why have the parties in the WHATWG decided that making the alt attribute optional is the way forward? What problem or issue does it solve?

The identified problem

The redoubtable Mark Pilgrim weighs in with the following observation:

For the record, I wholeheartedly support [...] this decision in particular. People who truly care about accessibility will do alternate text properly (or at least try), no matter what the spec says. People who do not care about accessibility will not do alternate text properly, no matter what the spec says. In between, we have a real, demonstrated problem of software vendors who favor validation over accessibility to the point of actively hurting the latter to satisfy the former.

So what's the real, demonstrated problem? It seems to be photo galleries, like Flickr, particularly images that are uploaded without text descriptions. In typical situations, no alternative text can be made available. This is different to the situation where the alternative text is already present in the surrounding text (where alt="" is the correct approach).

The examples where authoring tools are failing to capture meaningful text equivalents for images are Flickr and Photobucket. Perusing the WHATWG mailing list brings out the examples of an email client (when dragging-and-dropping in an image), as well as a wiki.

These are good examples - of high profile Web 2.0 applications - where the collection of text-equivalents is either poor, or optional in the interests of usability. They are also cases where well-meaning, web-standards-aware developers have attempted to fix the problem of invalid markup and yet, in the WHATWG's rough consensus opinion, have created alt attributes that are misleading, poor quality, or human-unfriendly.

Weighing up the solution

From understanding the problem, next we need to understand why the current solutions are worse than omitting the alt attribute. What's particularly confusing is Lachlan Hunt's choice of examples:

Flickr, for example, repeats the images title; Photobucket appears to combine the image's filename, title and the author's username; and Wikipedia redundantly repeats the image caption. The problem with these approaches is that using such values does not provide any additional or useful information about the image and, in some cases, this is worse than providing no alternate text at all.

The cited examples of Flickr, Wikipedia and Photobucket all demonstrate the characteristic of the alt attribute being redundant.

Dealing with redundant `alt` attributes

These examples are interesting because they conceivably fall into the already documented category of "A graphical representation of some of the surrounding text", and hence the recipe of alt="" is already the suggested approach to avoiding such redundancy.

Or is it? The draft specification describes these images in such a way that it's the surrounding text that's the key content of the page, and the images merely complement or reflect it. Perhaps the guidance should be different in situations where the photograph is the central piece of content, as in a photo gallery, and the surrounding text merely complements it?

Either way, the alt attribute is clearly present. There is a tiny overhead to accessibility for redundant alternative content, but I don't consider it to be a case of actively hurting accessibility.

Dealing with non-existent alternative text

Maybe the real argument is about cases where the alternative text cannot be provided? Without a clear example or use case, it is a little difficult to establish exactly what Lachlan Hunt is referring to.

In the previous examples, the web sites have adopted the solution of duplicating existing content to make up for the alt attribute. What Hunt seems to suggest is that these images should be introducing additional information, and that information needs to be in the alt, and it's that information that websites like Flickr cannot generate.

I'm skeptical of this position. Surely if there is a title, then that title is sufficient, with the applied tags and optional description offering more detail, however low quality those titles are.

The only scenario I can think of where there is an issue is when the user doesn't fill in a title, description or tags. That is quite common on photo-sharing sites, since by its nature it's a social visual experience. In this circumstance, there is no information about the image, apart from the user who uploaded the picture, some metadata from the camera itself, and maybe a generic tag if it's part of a photoset.

This scenario is a problem. The solution of a null alt doesn't correctly express the text equivalent of the image. It's not saying "this image offers no content", or "the content of this image already exists in the surrounding text".

Far from it: the text equivalent is saying "errr... we don't have a text equivalent for this image". And that is an accessibility problem. Patching it over with a null alt attribute merely disguises the problem, rather than highlighting it as a problem.

As I understand it, making the alt attribute optional allows this very scenario to stand out from the "we do have alternative text for this image" case. And that is a needed distinction.
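
To make that distinction concrete, here is a minimal sketch of how a photo-sharing template might choose its markup if the alt attribute were optional. It is my own illustration rather than anything from the draft or from the sites discussed here: render_img and its parameters are hypothetical names, and the three-way rule is an assumption about how the cases above could map onto markup.

```python
from html import escape


def render_img(src, title=None, title_shown_nearby=False):
    """Return an img tag whose alt state reflects what is actually known."""
    safe_src = escape(src, quote=True)
    text = (title or "").strip()
    if not text:
        # Nothing is known about the image: leave alt off so the gap stays visible
        return f'<img src="{safe_src}">'
    if title_shown_nearby:
        # The title is already in the surrounding text: avoid repeating it
        return f'<img src="{safe_src}" alt="">'
    # The title is the only text equivalent available: use it as the alt
    return f'<img src="{safe_src}" alt="{escape(text, quote=True)}">'
```

An upload with no title, description or tags would come out as a bare img element, which a checker or screen reader could then treat differently from a deliberate alt="".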

Meaning of null `alt` attributes

In modern web development practices alt="" can mean one of the following things:

This image offers no content.
The content of this image is already on the page in a textual form.

But in practice, a null alt attribute has been applied to mean:

This is the default set by the web-standards aware authoring application.
The author has no idea what content this image provides, but the page needs to validate.
The author cannot be bothered to insert an appropriate alternative text here, but the page needs to validate.

These intentions need to be cleanly separated and highlighted as problems that require fixing or addressing. And in that case, allowing the alt attribute to be omitted, and reducing the corresponding validation logic from failing a page to raising a warning does help the accessibility practitioner.
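
As a rough sketch of what that separation might look like in a checker, the following Python uses the standard library's html.parser to sort img elements into three states and to raise a warning, not a failure, when alt is absent. This is an illustration under my own assumptions, not the behaviour of any existing validator; ImgAltAudit, audit and report are hypothetical names.

```python
from html.parser import HTMLParser


class ImgAltAudit(HTMLParser):
    """Collect a (src, state) pair for every img element fed to the parser."""

    def __init__(self):
        super().__init__()
        self.findings = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        src = attributes.get("src") or ""
        if "alt" not in attributes:
            state = "missing"  # no text equivalent is known
        elif not (attributes["alt"] or "").strip():
            state = "empty"    # author asserts: no content, or already in the text
        else:
            state = "text"     # a text equivalent was supplied
        self.findings.append((src, state))


def audit(html):
    """Return the alt state of each image in document order."""
    parser = ImgAltAudit()
    parser.feed(html)
    parser.close()
    return parser.findings


def report(html):
    """Warn, without failing the page, about images that have no alt at all."""
    return [
        f"warning: {src} has no alt attribute; manual check needed"
        for src, state in audit(html)
        if state == "missing"
    ]
```

Note that a checker like this still cannot tell whether an alt="" or a supplied text is appropriate; it only separates the "nothing was provided" case from the other two.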

The meme of the lazy author

Roger Johansson emphatically states that allowing the alt attribute to be omitted will only lead to lazy and ignorant authors and tool vendors ignoring it completely. That comment is off the mark, since these very same people are already ignoring text equivalents for images - so there's no deterioration there.

As Mark Pilgrim points out, and I agree: People who want to build accessible websites will not be hindered by this particular change. People who don't care about accessibility won't be affected by this particular change. But people who are trying to satisfy validators at the expense of accessibility now have a way of satisfying a validator without actively harming accessibility. They can accept that there are warnings raised (that require manual checks), knowing that there are no errors in the markup.

The oversimplification and automated testing

People who understand web accessibility know that the presence of an alt attribute does not prove that the content of an image is accessible. They also understand that the lack of an alt attribute does not mean the content of the image is not available.

Too many web developers are fixated on the opinion that an image must have an alt attribute if it's to be accessible. This oversimplification is what fuels the automated accessibility checkers like SiteMorse.

Omitting the `alt` attribute

According to the WHATWG edits to the HTML 5 draft, when an alt attribute is omitted from an image, it can mean the following things:

This image may be critical content, but there is no alternative text available.
There may be content in the image, but the author hasn't specified it.

User Agent Accessibility Guidelines fallback

When the alt attribute is omitted, the user-agent is left to its own devices to figure out how to render the image or associated data. The User Agent Accessibility Guidelines suggest a method of repairing this lack of content (Techniques for checkpoint 2.7: Repair missing content):

Allow configuration to generate repair text when the user agent recognizes that the author has not provided conditional content required by the format specification.

Sufficient techniques

The user agent may satisfy this checkpoint by basing the repair text on any of the following available sources of information: URI reference (as defined in [RFC2396], section 4), content type, or element type. Note, however, that additional information that would enable more helpful repair might be available but not "near" the missing conditional content. For instance, instead of generating repair text on a simple URI reference, the user agent might look for helpful information near a different instance of the URI reference in the same document object, or might retrieve useful information (e.g., a title) from the resource designated by the URI reference.

Roger Johansson quibbles that in many cases the filename contained in the src attribute is pure nonsense that does not provide the user with any useful information about the image. Unfortunately, that is what the W3C User Agent Accessibility Guidelines suggest should be done. The implication is that reading out the URL of an image is more beneficial than skipping over it - and that's something I find paradoxical.

If reading the src attribute of an image is a problem, why would UAAG have specified it as a mechanism for repair? Surely they would have realised that the chances of something digestible coming from a punctuation-crazy URL would be extremely low?
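
For illustration, here is a minimal sketch of the kind of URI-based repair the technique describes. It is an assumption about how a user agent might do it, not how any particular browser or screen reader does; repair_text is a hypothetical name, and the example.com URLs in the usage note are made up.

```python
import posixpath
import re
from urllib.parse import unquote, urlsplit


def repair_text(src, element_type="image"):
    """Derive repair text from the last path segment of an image URL.

    Falls back to the element type when the URL yields nothing usable.
    """
    path = unquote(urlsplit(src).path)
    stem, _extension = posixpath.splitext(posixpath.basename(path))
    # Treat common filename separators as word breaks
    words = re.sub(r"[-_+.]+", " ", stem).strip()
    return words if words else element_type
```

With a hand-named file such as https://example.com/photos/sunset-over-bay.jpg this gives "sunset over bay", which is passable. With a camera default such as https://example.com/img/IMG_0042.JPG it gives "IMG 0042", which tells the listener nothing about the picture.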

It's a puzzling idea, and one that Steve Faulkner has tested: he found that low-quality alternative content in an alt attribute does a better job than reading out the image source URL. His conclusion is that the accessibility effect of omitting the alt attribute isn't well understood, and requires more practical testing.

In conclusion

People who build accessible websites won't be affected by this change, and neither will developers who don't want to know about accessibility. It does provide a solution to the situation where there is no text-equivalent present, and by lowering the validation result from a fail to a generated warning, authors in that situation can continue building web-standards compliant websites without harming web accessibility.

I agree with Steve Faulkner that it hasn't been established that there's no impact on accessibility in allowing the alt attribute to be omitted. I'm concerned about the techniques in the User Agent Accessibility Guidelines - that's where the real accessibility problem is being manifested. Suggesting that the URL of an image could be used to provide a text-equivalent isn't a safe option, and screen reader users who have spent hours trawling through Amazon's obidos URLs, plus a massive number of similarly patterned URLs, will be frustrated by this.

My feeling is that this problem is an error in the User Agent Accessibility Guidelines. That needs to be fixed, and the HTML 5 suggestion should not be implemented in isolation. It will require a coordinated effort between the HTML 5 Working Group, the User Agent Accessibility Guidelines Working Group and screen reader vendors to understand whether this is an idea that does ultimately improve the accessibility experience of disabled people.

We certainly need some justification from the UAAG Working Group as to why they suggest using the image source URL as a means of determining a text equivalent when it's not explicitly provided. It's a peculiar suggestion that fails more often than it can succeed.

The bigger problem - and one Flickr, Photobucket and Wikipedia need to get to grips with - is how to encourage text-equivalents for images in a way that doesn't impact the usability and addictiveness of the website. It's a difficult problem to solve - it's not merely an issue of education (not with the millions of Flickr users - that's unscalable). Perhaps a beyond-the-box solution is needed that can tap into the community itself.

Related information

Investigating the proposed alt attribute recommendations in HTML 5, by Steve Faulkner. A fine example of tackling an issue by constructive and practical means.
Gez Lemon: The HTML 5 image element
WHATWG Blog: Why the alt attribute may be omitted
Simon Willison's link blog discussion
UAAG Techniques Document: Repair missing content - what a user-agent can do when no text-equivalent is present
W3C ATAG: Do not automatically generate equivalent alternatives - guidelines for authoring tools
HTML 5: Current editor draft: Images
