Accessibility of text-only websites
Summary
Creating text-only versions of websites to improve accessibility is a double-edged sword. It can remove a limited range of accessibility problems, but at the cost of creating new accessibility problems and destroying other accessibility benefits. It can't correct a number of high-profile accessibility problems.
The Betsie approach, as used by both the BBC and Norwich Union, takes an existing web page as input and strips out features it considers an accessibility problem. Its success largely depends on the initial accessibility of the existing pages, which in both cases is fairly good. Betsie fails on websites dependent on JavaScript, makes tabular data inaccessible to its users, and cannot improve websites with bloated navigation.
The problem: an inaccessible website
The starting point is an inaccessible website which breaches the Disability Discrimination Act. This raises the requirement either to make the current website accessible, or to create an accessible alternative. The analysis and estimates show that making the current website accessible is costly, owing to the work required to remove JavaScript dependencies.
The proposed solution: text-only version of the website
Create a replica of the current website which is accessible. This is referred to colloquially as a text-only version of a website, although it is worth pointing out that accessible websites can use media other than text.
The advantages of a text-only version of the website
The look and the feel of the main website is untouched.
The disadvantages of a text-only version of the website
A text-only version of the website effectively segregates the audience it is targeted at. History demonstrates that segregation does not lead to equality, but to stigmatisation. South Africa under the regime of Apartheid is a clear example of this stigmatisation.
From the perspective of content, having two identical websites leads to a problematic search engine strategy. Which website should take priority in search engine listings? There is no reliable means of controlling this. The text-only version of the website is likely to score better on search engines because of its higher content density per page (removing inaccessible markup reduces page size).
The accessible website is hidden behind its inaccessible counterpart. People needing to use the accessible version of the website still need to be able to deal with the inaccessible one. They have to search through an inaccessible page to find the link to the accessible page. This problem could be reduced by ensuring that the "skip to accessible version" link is the first piece of information on the page.
Each inaccessible page needs to be updated to point to its accessible counterpart.
Creating a text-only version of the website
There are three general ways of producing this particular website.
Handcraft each page
Copy each page of a website and strip out all the markup that produces either non-textual effects or presentation. The effort required is similar to that of making the existing website accessible.
There is potential for a comfortable level of accessibility when done manually. The downsides (above and beyond the disadvantages of text-only versions in general) are:
Time consuming, thus expensive to do.
Keeping the two websites in step requires duplicate effort.
Generated by CMS
Get the installed content management system to take one page of content and produce both the inaccessible "hi-tech" version and the "low-tech" accessible version.
The downsides and complications (above and beyond the problems with text-only versions of websites) are:
The content management system has to already be in place, otherwise it is an expensive proposition to install a CMS and populate it with the content. This also needs to include the cost of defining and rolling out a process of how content changes are managed.
Producing both versions from a CMS requires two separate templates. Where the main content isn't accessible, either two copies of the same content need to be created, or the accessible copy is used for both versions.
Generated from inaccessible markup on the fly
Use a server-side script to strip out selected markup from pages on request and on the fly. This is the solution Norwich Union and the BBC are using with Betsie. It takes the markup of a web page and strips out what the script considers to be inaccessible markup - this includes images, colours, widths, font styles, JavaScript and tables. We'll investigate this approach in more detail.
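To make the idea of rule-based stripping concrete, here is a minimal sketch in Python. It is not Betsie's code (Betsie is written in Perl), and the rule sets below are assumptions chosen to mirror the kinds of markup listed above. In particular, replacing an image with its alt text when it has some is an assumption of this sketch.

```python
import html
from html.parser import HTMLParser

# Assumed rules for illustration only; Betsie's real rule set is not reproduced here.
DROP_WITH_CONTENT = {"script", "style"}  # remove the tag and everything inside it
UNWRAP = {"table", "thead", "tbody", "tr", "th", "td", "div", "font"}  # drop tag, keep text
STRIP_ATTRS = {"style", "color", "bgcolor", "width", "height", "face", "size"}


class TextOnlyFilter(HTMLParser):
    """Re-emits a page with the 'inaccessible' markup removed."""

    def __init__(self):
        super().__init__()
        self.out = []
        self.skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag in UNWRAP:
            return
        if tag == "img":
            # Assumption: keep the alt text if present, otherwise drop the image.
            alt = dict(attrs).get("alt")
            if alt:
                self.out.append(html.escape(alt, quote=False))
            return
        kept = [(k, v) for k, v in attrs if k not in STRIP_ATTRS]
        rendered = "".join(
            f" {k}" if v is None else f' {k}="{html.escape(v)}"' for k, v in kept
        )
        self.out.append(f"<{tag}{rendered}>")

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> have no separate end tag to emit.
        if tag not in DROP_WITH_CONTENT:
            self.handle_starttag(tag, attrs)

    def handle_endtag(self, tag):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        if self.skip_depth or tag in UNWRAP or tag == "img":
            return
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(html.escape(data, quote=False))


def to_text_only(markup: str) -> str:
    parser = TextOnlyFilter()
    parser.feed(markup)
    parser.close()
    return "".join(parser.out)
```

Note that every rule here only removes or unwraps something. Nothing in a filter of this kind can decide whether a table holds data or layout, or what a missing alt text should say.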
The Betsie approach
Both Norwich Union and the BBC use a script called Betsie to do the automated text-only conversion. Betsie was developed by the BBC for their group of websites. It is a Perl script released under an Open Source compatible licence, with no warranty or attached liability. The script has no built-in accessibility intelligence and uses a very simple set of rules to decide which markup can be stripped out.
Betsie removes all JavaScript, table markup, divisions, and images without alt text. This has the side-effect (particularly with tables and divisions) of removing accessibility-friendly markup.
Betsie can't correct a whole host of accessibility problems, notably replacing images with adequate alternative text and ensuring documents are properly structured.
For Betsie to work well, the original input page must already be reasonably accessible, and its remaining accessibility problems must be ones best solved by removing markup. Betsie cannot insert markup to improve accessibility. An examination of both the Norwich Union homepage and pages within the BBC indicates that a fair number of accessibility changes have already been made. The main accessibility problem remaining on their main websites is fixed font sizes and colour schemes - something Betsie can handle by removing stylesheets, font tags and colour attributes.
Who benefits from Betsie?
In the way the BBC have implemented Betsie on their website, the content is freed from inaccessible practices like fixed font sizes, and from the bad practice of a fixed-width layout. The BBC have opted for a black background with yellow and magenta text as the default Betsie presentation. This caters acceptably for people with low or reduced vision.
The improvement in accessibility to people with total sight loss is negligible. It may actually be a disservice to this group of our audience, since this group tend to use speech browsers that take advantage of properly structured markup to help them through the page. Their speech browsers do a far better job of gaining access to inaccessible content, and Betsie removes crucial chunks of markup that speech browsers can use to positive effect.
People with reduced or erratic motor-skills may benefit from the Betsie approach since the size of text, thus the size of clickable regions, can be increased.
People with cognitive or learning disabilities, such as dyslexia or attention deficit disorder, could find the Betsie version more difficult to access, since the end result is just a page of big text with no images to add understanding.
Solved problems
The Betsie approach avoids the duplication of content so there are no additional man-day overheads and storage costs in creating new pages.
The script removes markup it deems inaccessible, so the resulting page should be easier to access. The principle of the script is that if it spots something inaccessible, it removes it entirely instead of presenting it to the user.
Remaining problems
"Garbage In. Garbage Out."
The Betsie script can only remove elements it considers accessibility problems. It can't add in material to make a page more accessible.
For example, an image without alternative text is a major accessibility problem. The correct fix is to add adequate alternative text in an alt attribute on the image. Betsie's solution is to remove the image completely. It can't determine what the alternative text should be - succinctly describing the purpose or functionality of an image is largely a human task.
The result of this drawback is that the generated page is by no means guaranteed to be accessible (nor to count as equal treatment in the eyes of the law). The level of accessibility achieved depends heavily on the accessibility of the original page.
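What a script can do is find the images that need human attention. The following is a minimal sketch, assuming the pages are available as HTML strings. The class and function names are made up for illustration. It flags only images with no alt attribute at all, on the assumption that an empty alt="" has been set deliberately for a decorative image.

```python
from html.parser import HTMLParser


class MissingAltFinder(HTMLParser):
    """Records every <img> that has no alt attribute at all."""

    def __init__(self):
        super().__init__()
        self.missing = []  # list of (line number, src) pairs

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr_map = dict(attrs)
        if "alt" not in attr_map:
            line, _column = self.getpos()
            self.missing.append((line, attr_map.get("src") or ""))


def find_images_missing_alt(markup: str) -> list[tuple[int, str]]:
    finder = MissingAltFinder()
    finder.feed(markup)
    finder.close()
    return finder.missing
```

The output is a to-do list for an editor, not a fix: someone still has to write each description.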
Search engine indexing
Which website should a search engine index? The pages it indexes are the pages that will be returned when a visitor searches. The most accessible approach is to ensure that search engines index the accessible version - that way the results returned by a search engine have a far better chance of being accessible to the visitor.
Search engines like good content, and stripping out presentational-type markup gives search engines better material to index (in that the content hasn't been diluted with useless markup). As such accessible pages are likely to score higher on search engines because of their focus on accessible content over presentation.
In terms of search engine indexing, the inaccessible website is basically a second-class citizen for keyword searches. We would have to rely on external websites of good standing linking to the inaccessible version of our pages to have a chance of the inaccessible (branded) pages appearing before their accessible (non-branded) counterparts.
Inaccessible website still a barrier
The Betsie approach involves putting a link on every inaccessible page to the "text-only" version. So the only way for a visitor to get access to the accessible page is to actually sift through an inaccessible page in the hope of finding the right link to the accessible version. The inaccessible website is still presenting a barrier - it stands between the visitor and the accessible page.
Also, new users have to know that there is an accessible option. This would need to be signposted quite clearly and immediately on every page.
Every page needs manual changes
Incorporating Betsie would require adding a unique link to the very top of every page of our website, linking the current page to its Betsie-generated equivalent. This change would also need to include a clear signpost to visitors of this functionality.
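If the pages are static files, part of this change could be scripted. The sketch below inserts a link immediately after the opening body tag. The script address in TEXT_ONLY_PREFIX and the link wording are hypothetical, and the simple pattern assumes no attribute value on the body tag contains a ">" character.

```python
import re
from urllib.parse import quote

# Hypothetical address of the conversion script; adjust to the real installation.
TEXT_ONLY_PREFIX = "/cgi-bin/text-only.pl?url="

BODY_OPEN = re.compile(r"<body\b[^>]*>", re.IGNORECASE)


def add_text_only_link(markup: str, page_url: str) -> str:
    """Return the page with a text-only link as the first item in the body."""
    href = TEXT_ONLY_PREFIX + quote(page_url, safe="")
    link = f'<p><a href="{href}">Text-only version of this page</a></p>'
    match = BODY_OPEN.search(markup)
    if match is None:
        # No <body> tag found: leave the page untouched for manual review.
        return markup
    return markup[: match.end()] + link + markup[match.end():]
```

Pages that the function returns unchanged, and any pages assembled from templates or scripts, would still need the link added by hand.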
Newly created problems
Accessibility-beneficial markup is removed
A table is used in HTML to structure tabular data (like product comparison tables), but it is also misused by site developers for page layout. So far, no script can determine whether a table is correctly used, or used as layout.
Since most of the tables in use at the moment are for layout purposes, all table markup gets stripped out. So in the case of proper table usage, the script removes all accessibility and readability structures, making the content within the table close to impossible to access.
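The loss can be illustrated with a small Python sketch. It reads the same markup twice: once using the table structure to pair each cell with its column header, roughly as a speech browser might announce it, and once keeping only the text, which is what remains after the table tags are stripped. The sample table and all names are invented for illustration, and the reader assumes a single simple table with headers in the first row.

```python
from html.parser import HTMLParser


class TableReader(HTMLParser):
    """Collects cell text row by row (no rowspan/colspan support)."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._cell = None  # list of text pieces while inside a cell

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("th", "td") and self.rows:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("th", "td") and self._cell is not None:
            self.rows[-1].append("".join(self._cell).strip())
            self._cell = None


class TextCollector(HTMLParser):
    """Keeps only the text, which is all that survives once tags are stripped."""

    def __init__(self):
        super().__init__()
        self.pieces = []

    def handle_data(self, data):
        if data.strip():
            self.pieces.append(data.strip())


def read_with_headers(markup):
    reader = TableReader()
    reader.feed(markup)
    reader.close()
    if not reader.rows:
        return []
    headers, *body = reader.rows
    return [f"{h}: {c}" for row in body for h, c in zip(headers, row)]


def read_flattened(markup):
    collector = TextCollector()
    collector.feed(markup)
    collector.close()
    return collector.pieces


# Invented sample data.
SAMPLE = """<table>
<tr><th>Policy</th><th>Excess</th><th>Monthly cost</th></tr>
<tr><td>Basic</td><td>250</td><td>12</td></tr>
<tr><td>Plus</td><td>100</td><td>19</td></tr>
</table>"""
```

With the structure intact, read_with_headers(SAMPLE) yields items such as "Excess: 250". read_flattened(SAMPLE) yields only a run of bare values, and the listener has to remember which column each number belonged to.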
Does it really make a website more accessible?
The limitations of Betsie result in the removal of markup even where it is used properly. Since useful markup is removed (for example table markup), assistive technology tools like speech browsers can no longer take advantage of tabular structures to make information in tables easier to access.
Incorporating Betsie into a website
If we went ahead and implemented Betsie on a website, the work required would be:
Install Perl and configure the webserver to use Perl as a server-side scripting language.
Copy the script to the webserver.
Go through every page of the website and add a link to the Betsie script at the top of each page. At the same time, change the navigation to remove its JavaScript dependency, and reduce the 156 links to something reasonable, like 9 links.
At the same time, incorporate a list of accessibility fixes that Betsie cannot deal with: adding alternative text to all images, expanding all acronyms and abbreviations at their first occurrence, and fixing the situation where two links with the same text lead to two different pages.
Redo all pages using forms (application forms) to remove their JavaScript dependency, and ensure that all form validation is also performed on the server. (Otherwise none of the application forms will work when delivered through Betsie.) This is where considerable effort is required for accessibility. Betsie offers nothing to reduce this piece of work.
Move all applications from the secure server to the non-secure webserver. (Betsie does not work with secure web addresses yet). This raises some important Data Protection Act issues that would need to be resolved.
Comparing the above list of changes with the list of changes needed to make the main website accessible suggests that the time saving we can expect from using Betsie would be in the region of two to five percent of the total effort.
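Two of the manual checks in that list, the number of links on a page and identical link text leading to different pages, can at least be located by script before anyone starts editing. The following is a minimal sketch in Python with made-up names. It compares link text case-insensitively and ignores links that have no href.

```python
from collections import defaultdict
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects (link text, href) pairs for every <a href=...> on a page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None  # href of the link currently being read, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self._href = href
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            text = " ".join("".join(self._text).split())  # normalise whitespace
            self.links.append((text, self._href))
            self._href = None


def audit_links(markup: str):
    """Return (number of links, {link text: [hrefs]} for ambiguous link text)."""
    collector = LinkCollector()
    collector.feed(markup)
    collector.close()
    targets = defaultdict(set)
    for text, href in collector.links:
        targets[text.lower()].add(href)
    ambiguous = {t: sorted(h) for t, h in targets.items() if len(h) > 1}
    return len(collector.links), ambiguous
```

A report like this only identifies where the work is. Deciding which links to drop and how to reword the ambiguous ones remains an editorial job.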