6 Thoughts on accessibility of equation layout
Of course it is more complicated than that.
Isn't it always?
, if you prefer.
The first rule of equation layout accessibility
Don't talk about "math accessibility" when you mean equation layout accessibility.
Mathematics is an ancient domain of human knowledge, formula or equation layout a visual layout technique developed primarily in the 19th and 20th century.
Mathematical content is far more than content that needs equation layout and equation layout appears in many more domains than just mathematics.
We will fail to make equation layout accessible if we think we can treat equation layout identically across mathematics, physics, chemistry, computer science, biology etc (and their various sub-fields). For example, voicing a superscript 2 as "squared" may be a reasonable heuristic for middle school mathematics but a miserable heuristic for chemistry.
We will fail to make mathematical content accessible if we only make equation layout accessible and vice versa.
2. Augmentation is a must (or: aria-labels must work)
Manually overriding accessible name calculations (e.g., via aria-label) on text(-like) content is generally considered a last resort and it is clearly not a long term strategy for accessibility.
But we have aria-labels because we know from experience that there's always a situation where you need them.
Currently, it is extremely hard to augment equation layout with aria-labels let alone anything beyond simple overrides. This is a problem of authoring but much more of rendering and assistive technologies. MathML-based solutions in browsers and AT are particularly bad at this and to some degree this cannot be fixed (we should get to that later).
Whatever solutions might arise in the future of equation layout, like everything else on the web, they must be able to work together with interspersed aria-labels, together with potentially many interspersed aria-labels, and ultimately with only aria-labels.
In other words, deep aria labels aka aria labels all the way down must work as well.
This is a problem as ARIA has limitations when it comes to exposing custom tree structures and making them explorable.
More general author-driven augmentation must also work. Equation layout may have a natural tree structure derived from the DOM but we must be able to modify that. Aria-owns might be a solution here but right now it appears to too limited (either by spec or implementations).
3. Visual quality must come first (and that's a problem)
If equation layout is visually inadequate, it cannot be considered accessible. The problem is: we have no solid basis for measuring this.
While TeX-style layout is generally considered the highest quality among heavy users of equation layout, there seems to be no research evaluating this from an accessibility point of view. For example, TeX layout is largely unconcerned with K12 content and (as a print technology) has no concept for the kind of dynamic modifications we can realize on the web. While there are minor variations, e.g., elementary education preferring sans-serif fonts and requiring fonts with an open glyph for 4, it's unclear to me in how far these preferences are evidence-based (pointers very welcome); in any case, they are also deeply rooted in print and might be moot on the web, e.g., there might be better ways for get whatever effect such variations are supposed to achieve.
On the web, it probably means that equation layout must be flexible enough to allow all kinds of (user or author enabled) customizations. Some obvious questions: What should happen when the line-height changes? The letter spacing? When a transform or animation is applied to a descendant? A color changes on something with a specific color? A font (style) changes on a mathvariant? These all might be useful for accessibility purposes (e.g., with visual impairments, learning disabilities). And there are likely many more things we cannot imagine yet.
I think we greatly lack research as to what features in visual layout might be important for accessibility on the web. Without such research, we have no adequate basis on which to discuss improvements to equation layout and the standards that enable it on the web.
4. Heuristics should be used at authoring time
Heuristics are important across the assistive technology chain to recover from bad input (content). However, heuristics should not be necessary with good markup, e.g., markup that is ideal according to specs.
Equation layout is tremendously ambiguous, primarily due to its history and the limitations of print technology. For example, it is near impossible to differentiate the typical vertical fraction layout from the various other notations that share a similar vertical stack (2-3 children, depending on a line in between).
In other words, even high quality markup for equation layout requires heavy use of heuristics to guess the semantics.
Heuristics for non-visual representation of equation layout have existed for a very long time (and before the web). Today, some assistive technologies integrate heuristics for equation layout when done as MathML markup.
This is a problem because most of these heuristics are fairly bad. Heuristics for equation layout are largely of low quality at scale. It is unsurprising when (e.g.) Nemeth's math speak rules were invented for a person reading to another; such limitations could easily be overcome in that situations. At the scale of the web, it is much easier to run into edge cases where heuristics are too coarse or too fine grained. An almost universal limitation is the restrictions to individual fragments of equation layout, ignoring the context (both equational and otherwise).
Relying on heuristics in AT for good content poses a serious issue (hinted to earlier): heuristics interfere with augmentation. For example, the commonly used heuristic that every superscript 2 is voiced square
, then no override may be possible; even if it is, will an override override the phrasing (e.g., to the power of two
) or also consider the superscript position? Do you need two overrides to clarify? If a heuristic identifies (1+2+3) as a summation and provides summary information (e.g. sum with three summands
), what happens when an author augment one of the + signs (say, with aria-label="times"
)? We would need to augment both the content and the heuristics. As the saying goes: and now we have two problems.
Much like with textual descriptions of other visual content (e.g., image alttext, video captions, SVG descriptions) heuristical tools (human or machine guided) should be used at authoring time.
5. Describing layout is a red herring
There is a position that I encounter every once in a while: just give non-visual users access to the visual layout and stop. That's a very appealing proposition: instead of trying to make sense of visual layout semantically, we just provide information about what letter/word is where on a two-dimensional canvas. It also appeals to a basic idea of equality: visual users only have visual access, why would non-visual users get additional information?
Unfortunately, it is a red herring: it is neither easy nor helpful nor what anyone has ever done.
It is obviously not easy on the web because layout is dynamic; if two users read a document on two different devices, they might have a very different idea as to what is where. Even within equation layout, automatic linebreaking/reflow can shift parts around, more advanced methods (e.g., MathJax's collapsing feature) can vary even more greatly.
But ignoring this larger problem, traditional equation layout has a few odd concepts that make this even more difficult. For example, movable limits can move elements from sub/superscript positions to under/overscript positions based on the context of a subexpression, without any change in the markup; reversely, this can happen when changing text content (thanks to things like the operator dictionary). Another example is the concept of embellished operators which make it difficult to identify reasonable layout blocks to describe; similarly, brackets may or may not be marked up in a way that groups open and closing brackets. Essentially, you will still need to analyze the layout semantically to identify what really belongs where and together with what because that is ultimately a question of why.
As a consequence, the idea of just describing layout is not helpful to anyone, no matter what assistive technologies they might use (even if it's just a screen). Even more so, when you have to do the same analysis as you would for identifying semantics of an expression.
What's more important is that describing layout is not what anyone has ever done (so I would surmise nobody wanted to do it). Some layout information is always ignored, other layout is always inferred as semantic. As way back as Nemeth's math speak rules for print we have had heuristics that will read any superscript 2 as squared, inferring semantics where there are none. Reversely, no assistive technology for equation layout will tell you about the dimension of a stretchy character (neither directly nor transformatively e.g. via audio cues). Again, a good example is a movable limit where rule sets get around describing (unreliable) layout in favor of heuristically determined semantics, e.g., in a sum they might voice sum from [subscript] to [superscript]
, neatly avoiding the layout. Above all, human beings do not speak equation layout as layout. Nobody says vertical bar A vertical bar
they say absolute value or determinant or something completely different, nobody says C O subscript 2
they say CO2 or carbon dioxide or possibly some more precise wording if it appears inside a checmical equation.
Of course, describing visual layout is nevertheless a decent fallback mechanism, e.g., when semantic heuristics fail but you can still recover layout information and it is important to be able to enable users to explore the layout if they need to (e.g., so as to reproduce it for their own work). But edge cases should not limit the expressiveness of accessible equation layout in general.
(An independent issue is to expose layout so that a user can guess how something was authored (e.g., when voicing gives you determinant A
, the questions may be if it was represented visually as det A
or |A|
.)
6. We must strive for multi-directionality
Accessibility is not a one way street, equation layout even more so. Accessibility must handle directionality on many axes.
Accessibility means to improve access to content no matter whether a user can access it with all theoretically possible human capacity or only using a small fragment thereof at a time. Due to its highly compressed form, equation layout requires more back and forth across a particular equation fragment as well as the entire document than most other forms of content. This is both a bug and a feature but either way it won't go away any time soon.
Accessibility means to improve interaction with content so as to allow all users to transfer knowledge better. Equation layout has a huge discrepancy between authoring formats and rendering. We must strive to improve this.
Acccessibility means improving the interaction between human beings. If two students explore a document, they should be able to do so together so that they can engage each other. Therefore, the effects of exploring content should be equivalent between different exploration methods. At the AIM workshop earlier this year, Sam Dooley told the story of a young blind person joyously celebrating that their parent could read my math
as they used an accessible authoring and rendering environment together.
Interaction in these multiple directions will provide more information to more people, enabling wider accessibility, whether people identify as AT users or not. More importantly, it will show the path towards what the web can really do for the knowledge traditionally represented in equation layout.