Introduction to MathML-in-DAISY in 10 Small Chapters

Original Author(s): Michael Zacherle

This document describes MathML and its use within DAISY books. It is meant to be a short starter with further references. The target audience is primarily teachers and students, as well as publishers and developers of DAISY reading products and services. I hope that this document provides information helpful to the audience!

I’d like to acknowledge the valuable assistance of the members of the MathML-in-DAISY working group, especially Dennis Leas, Emilia Persoon and Dr. Neil Soiffer. Some of the content here is derived from References 1 and 5 below.

This document is Copyright © Michael Zacherle (Michael@Zacherle.de), September 15, 2008. It is freely distributable under the Creative Commons Attribution-Share Alike 3.0 Unported License (http://creativecommons.org/licenses/by-sa/3.0/).

1.  Introduction to the issues

Readers with print disabilities have used audio books for a long time. First introduced on cassette tapes for leisure reading, audio books have also been used for educational purposes. With the adoption of the DAISY Standard, digital talking books have become the de facto standard and content production on CDs has been gradually replacing the cassette tape medium. DAISY books may contain audio and/or text, as well as images.

One main problem of conventional audio books and paper braille books affecting educational content is that mathematics and other formulae were treated as either text or images. Hence, there wasn’t enough structural information to enable accessible technology to present the formulae appropriately via audio or braille. Since the formal approval of the MathML-in-DAISY Specification in February 2007 as the first extension to the DAISY Standard, it has been possible to produce and use books that present mathematical content in a synchronized, structured, and therefore accessible way.

2. What is the DAISY/NISO Standard?

DAISY is an acronym which stands for Digital Accessible Information SYstem. The term is used to refer to a standard for producing accessible and navigable multimedia documents. In current practice, these documents are digital talking books, digital text books, or a combination of synchronized audio and text books.

DAISY is a globally recognized technical standard to facilitate the creation of accessible content. It was originally developed to benefit people who are unable to read print due to a visual impairment, but it also has broad applications for improved access to printed material and other media in the mainstream. The DAISY Standard has been evolving over the last several years and was officially recognized by an American standards-making body in 2005. Although books produced in the DAISY 2.02 standard are by far the most common, the use of math is only possible in the new DAISY/NISO Standard that was introduced in 2005. This new standard is officially called “ANSI/NISO Z39.86-2005” but is commonly known as “DAISY 3”.

The DAISY Consortium has been selected by the National Information Standards Organization (NISO) as the official maintenance agency for the DAISY/NISO Standard, officially, the ANSI/NISO Z39.86, Specifications for the Digital Talking Book. A more thorough description of the DAISY/NISO Standard is given in Reference 6, below.

3. What is the MathML Standard?

MathML is a W3C recommendation. The W3C is the world-wide organization that creates the standards for the Web. The MathML specification is, as a consequence, a normative document, which helps MathML implementations stay compatible with one another. It was also created by the Math Working Group, composed of people from several countries and diverse scientific fields, so MathML takes into account the needs of many different professions, countries, and uses.

MathML is an application of XML (eXtensible Markup Language). This means that new features can be added as needs arise (see, for instance, Arabic mathematical notation) or deprecated if experience shows they are not useful. MathML can also be used in combination with other Web languages. As a bonus, MathML can easily be created with existing formula editors and exported to or imported from computer algebra systems like Maple and Mathematica. It can also be processed by search engines, which provides further benefits to the user. Please note that DAISY/NISO uses so-called “presentation MathML”, in contrast to “content MathML”.

An example of the formula “- a/b” expressed in presentation MathML is:

<math xmlns="http://www.w3.org/1998/Math/MathML">
  <mrow>
    <mo>-</mo>
    <mfrac>
      <mi>a</mi>
      <mi>b</mi>
    </mfrac>
  </mrow>
</math>

As you can see, no one would want to read or write MathML directly. The use of additional tools is therefore necessary.

One of the biggest advantages of XML is that it has to be well-formed, so that if you open a tag you have to close it later on. This way, malformed MathML may be spotted immediately during the validation process; inconsistencies are therefore avoided.
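The well-formedness check described above can be sketched in a few lines of Python using the standard library's XML parser. This is only an illustration, not part of any DAISY tool: the function name `is_well_formed` and the two sample strings are assumptions made for this example. Note that it tests well-formedness only (every opened tag is closed and properly nested); it does not validate the markup against the MathML schema.

```python
import xml.etree.ElementTree as ET

# The "- a/b" example from this chapter, written on one line.
GOOD = (
    '<math xmlns="http://www.w3.org/1998/Math/MathML">'
    "<mrow><mo>-</mo><mfrac><mi>a</mi><mi>b</mi></mfrac></mrow>"
    "</math>"
)

# The same markup with the closing </mfrac> tag left out (illustrative error).
BAD = GOOD.replace("</mfrac>", "")


def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text parses as well-formed XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


print(is_well_formed(GOOD))  # True
print(is_well_formed(BAD))   # False
```

A production tool would typically report the position of the error as well, which `ET.ParseError` exposes through its `position` attribute.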

4. How do DAISY and MathML act together?

The MathML extension is the first extension of the DAISY specification. The approach taken makes use of the existing extension mechanism specified by Z39.86-2005.

There are many problems associated with the use of images by authors and readers with and without visual disabilities when working with digital documents containing math. These include:

  • the inability to magnify the image or change its colors
  • fixed speech (based on alt text) that cannot be tailored to an individual’s needs
  • no local navigation and exploration of the mathematical structure
  • no synchronized highlighting of active parts of the image and audio
  • inability to be translated to a braille math code

MathML offers a solution to these problems. Because it is an XML application and has been designed to work with XHTML, using MathML in Z39.86-2005 was the direction that the MathML Modular Extension Working Group pursued.

MathML is not directly read. Especially for a blind person using a braille display, the DAISY player has to convert MathML into a notation suitable for that person. Currently, there are literally dozens of different math notations used worldwide. The use of LAMBDA, LaTeX and Nemeth is planned by different organizations working on DAISY players. Using MathML within DAISY documents allows for unified and well-defined storage, while the presentation of the math content to the user depends on the player used. For some examples of how math content may be presented to the user, see chapter 9¾.
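To make the idea of "one storage format, several presentations" concrete, here is a minimal Python sketch that renders the same presentation MathML in two different ways. It is a toy under stated assumptions, not how any real DAISY player works: the functions `to_linear` and `to_speech`, the linear output format, and the spoken wording are all invented for this illustration, and only a handful of elements (`mi`, `mn`, `mo`, `mrow`, `mfrac`, `msqrt`) are handled.

```python
import xml.etree.ElementTree as ET

# The "- a/b" example from chapter 3.
SAMPLE = (
    '<math xmlns="http://www.w3.org/1998/Math/MathML">'
    "<mrow><mo>-</mo><mfrac><mi>a</mi><mi>b</mi></mfrac></mrow>"
    "</math>"
)

# Hypothetical spoken words for a few operators.
SPOKEN_OPERATORS = {"-": "minus", "+": "plus", "=": "equals"}


def local_name(element):
    """Strip the XML namespace from a tag such as '{namespace}mfrac'."""
    return element.tag.rsplit("}", 1)[-1]


def to_linear(element):
    """Render the element as a compact one-line text notation."""
    name = local_name(element)
    if name in ("mi", "mn", "mo"):
        return (element.text or "").strip()
    parts = [to_linear(child) for child in element]
    if name == "mfrac" and len(parts) == 2:
        return f"({parts[0]})/({parts[1]})"
    if name == "msqrt":
        return "sqrt(" + "".join(parts) + ")"
    return "".join(parts)  # math, mrow, and anything not handled above


def to_speech(element):
    """Render the element as words that a speech synthesizer could read."""
    name = local_name(element)
    if name in ("mi", "mn", "mo"):
        text = (element.text or "").strip()
        return SPOKEN_OPERATORS.get(text, text)
    parts = [to_speech(child) for child in element]
    if name == "mfrac" and len(parts) == 2:
        return f"fraction {parts[0]} over {parts[1]} end fraction"
    if name == "msqrt":
        return "square root of " + " ".join(parts) + " end root"
    return " ".join(parts)


root = ET.fromstring(SAMPLE)
print(to_linear(root))  # -(a)/(b)
print(to_speech(root))  # minus fraction a over b end fraction
```

The stored MathML never changes; only the rendering function does. A player supporting a braille math code would add one more function of the same shape.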

5. What if my player doesn’t support MathML (fallback issues)

The modular MathML extension is meant to encourage advanced MathML players to provide a rich experience when reading mathematics. However, the extension also recognizes that mathematics may not be a focus for all vendors, and so it provides a fallback mechanism. For the common case of an audio-only player, a predefined audio rendering is provided. That rendering contains no local navigation points, which is something an advanced MathML player might provide. An advanced MathML player could allow a user to explore the structure of the expression tactilely using a refreshable braille display and/or with audio, without having to listen to the expression in its entirety. A future version of the DAISY MathML specification may add finer SMIL granularity within the math audio stream.

For players that do not support MathML, an alternate image is provided as part of the MathML. Basic MathML players must either recognize MathML well enough to locate the image reference provided on the MathML element, or they must support XSLT, a language for transforming XML documents into other XML documents, and apply a supplied transform indicated in the metadata of the DTB Package file.

The MathML-in-DAISY Specification therefore groups players into 3 categories:

  • Players that do not comply with this specification. These players know nothing about MathML. They do not extract the altimg and/or alttext from a <math> tag nor do they apply a stylesheet to transform the math to an image group. They ignore MathML and use only the audio as their fallback behavior. These players are referred to as MathML-unaware players.
  • Players that conform to this specification but cannot natively render the MathML. They fall back to using either the XSL transform or grab the alttext or altimg attributes from the <math> tag. These players are referred to as Basic MathML players.
  • Players that natively support MathML. They therefore offer the ability to magnify the equation or change its colors, the tailoring of the speech to an individual’s needs, the ability to navigate and explore the mathematical structure, synchronized highlighting of audio and text, and the ability to translate to a braille math code for use on a refreshable braille display. These players are referred to as Advanced MathML players.
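The three player categories can be illustrated with a short Python sketch that decides what to present for a given <math> element. This is a simplified illustration, not the normative behavior: the function `choose_rendering`, the category strings, the sample attribute values, and the choice to prefer `altimg` over `alttext` are all assumptions made for this example, and the XSLT route open to Basic MathML players is left out.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Illustrative markup; the altimg file name and alttext wording are made up.
SAMPLE = (
    f'<math xmlns="{MATHML_NS}" altimg="fraction.png" alttext="minus a over b">'
    "<mrow><mo>-</mo><mfrac><mi>a</mi><mi>b</mi></mfrac></mrow>"
    "</math>"
)


def choose_rendering(player_category, math_xml):
    """Return a (kind, value) pair describing what the player would present."""
    if player_category == "unaware":
        # MathML-unaware players ignore the markup and use only the audio.
        return ("audio", None)

    root = ET.fromstring(math_xml)
    if root.tag != f"{{{MATHML_NS}}}math":
        raise ValueError("expected a MathML <math> element")

    if player_category == "advanced":
        # Advanced players render the MathML tree natively.
        return ("mathml", root)

    if player_category == "basic":
        # Assumption: try the image first, then the text, then the audio.
        if root.get("altimg"):
            return ("image", root.get("altimg"))
        if root.get("alttext"):
            return ("text", root.get("alttext"))
        return ("audio", None)

    raise ValueError(f"unknown player category: {player_category}")


print(choose_rendering("unaware", SAMPLE))  # ('audio', None)
print(choose_rendering("basic", SAMPLE))    # ('image', 'fraction.png')
```

Because every category ends up with something to present, a book produced with the math extension remains usable on players of all three kinds.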

6. Where do we stand now?

The MathML-in-DAISY extension has been formally approved and is therefore ready for use. Tools for producing DAISY math books are currently in development. One DAISY software player was demonstrated displaying math content at an international conference in March 2008. As new products and services are developed by DAISY Member organizations, they are announced on the DAISY Consortium Web site, and MathML capabilities are specifically tracked.

The best way to overcome the chicken and egg problem is to produce DAISY books using the math extension. With the fallback capabilities, their use would be possible today with existing players, while new developments would add more possibilities and features.

7. Summary for students (users)

If you want to use digital books to learn math, physics or chemistry, DAISY books with MathML are just for you. The first books are now becoming generally available. In the meantime, keep yourself informed, ask your library about digital math books and get yourself a new advanced MathML DAISY player.

8. Summary for teachers (producers)

Ask your software vendor about DAISY production tools with math capabilities. If you don’t know exactly what DAISY books are, then have a look at the DAISY Consortium home page (http://www.daisy.org/) and watch out especially for these new production tools.

9. Summary for libraries and publishers

Familiarize yourself with DAISY books and MathML. Explore the references and start producing and testing DAISY books without math just to get a feeling for the issues at hand. Get yourself some new production tools with math capabilities as soon as they are available, and make sure that all computers in the library are equipped with DAISY reading software. Perhaps you can attend some conferences or meetings?

9¾. Math for the blind

This chapter is a bit out of the ordinary and will provide more in-depth technical material about math notations, a sort of extra credit reading if you really didn’t have enough so far.

By separating the storage from the presentation, both the user and the publisher gain major benefits. In regular digital texts, everybody generating or converting literature for the blind has to choose how to store math content by picking a certain notation (LaTeX, Nemeth, AMS, Marburg, LAMBDA, or other). For some examples of these notations, please see Table 1. Because the blind user reads the text exactly as it was written, the reader has to know that same notation, and all of the books from a given producer usually contain math in this notation (and this notation alone). Students from other universities trying to access different digital libraries have had to learn the math notation of each library as well, if only to read a single book per notation. Student exchange and combined digital repositories for the blind have been substantially limited by this factor.

With DAISY books and MathML, the publisher now only has to be concerned about proper MathML within the document. The reader may then choose a DAISY player, either a hardware player, a software player on a computer, or a mobile phone/PDA, which supports the math notation of choice. Reading DAISY books from organizations unable to support specific math notations is now possible because of the uniform storage of math content as MathML.

Coding                 Example
Traditional notation   Picture showing the two-dimensional, graphical math formula
LaTeX                  1+\sqrt{ ( (x**2-y**2) / (x+y) ) * (x-y) }=0
AMS                    1+( ( ( x**2-y**2) / (x+y) ) * (x-y) ) //2=0
Nemeth                 #1+>?X^2"-Y^2"/X+Y#(X-Y)] .K #0

Table 1: Some math notations

10. Further information (standards, links, articles, books)

  1. DAISY 3 version of this article, “Introduction to MathML-in-DAISY in 10 Small Chapters”
    NOTE: To fully access the DAISY 3 version, you’ll need MathML-compliant DAISY reading software. You can download a 30-day trial version of the gh PLAYER 2.2 Premium.
  2. DAISY Consortium: http://www.daisy.org
  3. MathML-in-DAISY Project: http://www.daisy.org/project/mathml
  4. MathML-in-DAISY Structure Guidelines: http://www.daisy.org/z3986/structure/SG-DAISY3/part2-math.html
  5. W3C MathML recommendation: http://www.w3.org/Math/
  6. Maths, Informatique, Jeux: http://www.maths-informatique-jeux.com/international/why_you_should_use_mathml.php
  7. “DAISY 3: The Standard for Accessible Multimedia Books”: IEEE MultiMedia, ISSN 1070-986X, Vol. 15, No. 4, October-December 2008
  8. The MathML Handbook: Pavi Sandhu (Paperback, 2002, ISBN 1-58450-249-5)