LaTeX to HTML to CANVAS, with LaTeXML.
Workflow
1. Introduction
These notes, available at [14], show how to convert a complex LaTeX file to HTML with LaTeXML and then use it in CANVAS, with the purpose of making the content accessible to users with visual impairment. Below we refer to the score given by the CANVAS Ally Accessibility tool as CAA; these notes and the associated example receive a high CAA.
1.1. Motivation
As is well known, PDF files created directly from LaTeX are, in general, not accessible unless they have no math content (see [12] for more information on how to make PDF from such LaTeX files accessible).
Workarounds to this issue are possible, as tested and reported by several colleagues. These workarounds might involve additional (manual) conversion steps such as from PDF to Word documents, or the use of third party software such as github actions. One can also use tools such as CANVAS Rich Text Editor (for allowable math content). These steps might work very well for courses where only one formula at a time would need to be rendered, or for content creators satisfied with learning a new tool or depending on the third party software.
However, these manual tasks do not allow the reverse compatibility back to LaTeX and thus may not easily allow reuse of the CANVAS content in the future courses. They also do not support the reuse of legacy or new LaTeX documents with complex structure, a trademark of LaTeX in the form of theorem-like environments, automatic referencing, rich equation writing compatibility, and similar flexibility appreciated by the many LaTeX users. Finally, the author prefers to not be dependent on third party software.
The author has compiled these notes for her own reference. We sincerely hope that as new accessibility tools continue to be developed and tested, these notes might become obsolete. I hope that day comes soon. For now, read on.
1.2. About this document.
In this document we show the steps and test some LaTeX style elements for class notes with mathematical content converted to HTML to be uploaded to CANVAS. These might be useful for those who create LaTeX documents with multiple structured equations, theorems and proofs, tables, and graphics. We do not claim to cover any mathematical topic in full, nor the slew of possibilities that LaTeX or LaTeXML offer.
1.3. Basic information.
- (1)
- (2)
-
(3)
LaTeXML is specifically designed to produce semantic HTML5 from LaTeX, rather than just reproducing the visual layout. It aims to preserve the structural and semantic meaning of the document, making it suitable for accessibility, search, and machine-readable applications.
- (4)
-
(5)
Once you produce your HTML (and other) files, you can upload these to CANVAS, or copy their content to CANVAS. See Section 2.4 for these two modes.
Generally, CANVAS finds the HTML version of this document accessible with a high CAA. One can also use various screen readers to interpret this document. See Section 1.4 for more.
Throughout the document, I make some recommendations.
1.4. CANVAS and other accessibility tools
CANVAS provides various tools to test and support accessibility. See, in particular, [1] (Access to CANVAS might be required).
In particular, we use CANVAS Ally Accessibility tool. The score assigned by this tool to your CANVAS content (page, html file, other files) is abbreviated below as CAA score.
Among other tools, CANVAS offers a Rich Text Editor, which can be used to input directly small portions of LaTeX code to use as mathematical symbols.
Another opportunity is to use CANVAS Editor in its HTML mode (View HTML, raw HTML) as a way to embed content created by LaTeXML directly in CANVAS.
CANVAS documentation [1] refers to various screen readers, including NVDA, which we tested.
Microsoft Immersive Reader is also available to read CANVAS pages and Assignments; we abbreviate it below as CANVAS Reader. This tool becomes active after you press a button in the top right corner, or a prompt that appears over your document when you scroll.
While CANVAS Reader reads CANVAS pages, unfortunately it cannot read HTML files uploaded to CANVAS; for these, you can use an external reader such as NVDA.
1.5. Screen readers
One can also use screen readers external to CANVAS; I have installed and tested NVDA on Windows 11. The installation is easy, but there are some challenges. For example, NVDA (once installed) does not work “on demand” but rather runs (always) in the background. If used only for testing purposes, it should be installed with hot keys that help to turn it off and on after restart. I used CapsLock Q.
2. The use of LaTeXML package
As extensively discussed by LaTeXML creators, the tool can be installed on Windows, MAC, and Linux systems.
In this document we report on the installation of LaTeXML and its use on a Linux system followed by some work in CANVAS. For some steps, I provide scripts, all are in the zip file [12].
My understanding is that the LaTeXML tools can also be installed on a Mac with Homebrew and similar package managers, but I do not have experience with these. LaTeXML should also work on a Windows computer, but I have not had success with this so far (see Section 4.2.2).
2.1. Install LaTeXML
On Linux Ubuntu 22.04 I used the command line environment
sudo apt install latexml
As an optional step, identify the location of the file latexml.sty. You may need it for processing on your computer or with Overleaf. On my system it is at
/usr/share/texmf/tex/latex/latexml/latexml.sty
Another file you might need is LaTeXML.css; it can be downloaded from the Internet and is also available in [12].
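If the path above does not match your system, a search can locate latexml.sty. The following is a minimal sketch and not part of the author's scripts; the helper name find_latexml_sty and the default directory list are assumptions.

```shell
# Hypothetical helper: print the first latexml.sty found under the given
# directories. With no arguments it searches a few common TeX tree roots;
# these defaults are assumptions and may differ on your system.
find_latexml_sty() {
  if [ "$#" -eq 0 ]; then
    set -- /usr/share/texmf /usr/share/texlive /usr/local/share/texmf
  fi
  find "$@" -type f -name latexml.sty 2>/dev/null | head -n 1
}
```

Calling find_latexml_sty with no arguments prints a path such as the one listed above, or nothing if the file is not found. On Debian or Ubuntu, dpkg -L latexml lists all files installed by the package and can serve the same purpose.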
2.2. Before you work with LaTeXML for a given file
The tool LaTeXML works similarly to LaTeX but may be less forgiving. If you use its command line version, you might drown in an extensive list of warnings and errors.
Recommendation 1.
Ensure first that your file processes cleanly in LaTeX (e.g., by pdflatex or in Overleaf).
You might also change your LaTeX style a little bit. In particular, swallow your pride and your style prejudice, and adhere to the style restrictions described in [13], those listed in Section 3, and those described extensively in [6]. In fact, some elements of LaTeX files may or may not work well when processed by LaTeXML. See Section 3.1. You can also embed LaTeXML specific conditional statements, if you wish.
2.3. Convert to XML and HTML
Convert the LaTeX file Handout.tex to html using a two-step process described, e.g., in [8] as follows.
Assume you are working in a command line Linux environment.
Let your file be called Handout.tex
Take STEP1 and STEP2, or variants of the latter.
2.3.1. STEP1: LaTeX to XML
STEP1 produces an xml file
latexml Handout.tex --dest=Handout.xml
I provide STEP1.csh that does the job. This step may produce errors, so I recommend you keep it separate from others.
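To keep STEP1 separate and its messages out of the way, one can wrap it in a small function. This is a sketch only, not the content of STEP1.csh: the function name step1 and the log file name are assumptions, and it assumes that latexml writes its messages to standard error and returns a nonzero exit status when the conversion fails. Depending on the LaTeXML version, messages may also go to a log file of its own.

```shell
# Sketch of a STEP1 wrapper (not the author's STEP1.csh).
# Usage: step1 Handout   -> reads Handout.tex, writes Handout.xml
step1() {
  base=$1
  # Keep warnings and errors in a log file instead of the terminal.
  if latexml "$base.tex" --dest="$base.xml" 2> "$base.step1.log"; then
    echo "STEP1 ok: $base.xml"
  else
    echo "STEP1 failed; see $base.step1.log" >&2
    return 1
  fi
}
```

After step1 Handout, inspect Handout.step1.log before moving on to STEP2.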
2.3.2. STEP2: XML to HTML
Next, we go to STEP2, which produces an HTML file, a CSS file, and other files.
The options for this step require that you type one long line. You may want to copy and paste it to a file without line breaks. Alternatively, I provide scripts STEP2.csh (or STEP2nosec.csh, STEP2nosecnonav.csh, and STEP2best.csh) that do the job. See Recommendation 4.
The code below produces one monolithic HTML file without navigation between sections.
latexmlpost Handout.xml --dest=Handout.html --javascript="https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-mml-chtml.js?config=MML_HTML" --urlstyle=file --timestamp=0
Optionally, for a long document divided into sections, you can split the document so that each section is placed in a separate file (such as S1.html, S2.html… for this document). You can also have a set of navigation lines in the HTML code (similar to a table of contents) on top. This is done as follows
latexmlpost Handout.xml --dest=Handout.html --split --splitat=section --navigation=context --format=html5 --javascript="https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-mml-chtml.js?config=MML_HTML" --urlstyle=file --timestamp=0
Finally, LaTeXML produces a lot of auxiliary files in its conversion steps. To not get lost, I recommend that all the needed output goes to a folder handout-folder and that the format enforced is HTML5
latexmlpost Handout.xml --dest=handout-folder/Handout.html --split --splitat=section --navigation=context --format=html5 --javascript="https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-mml-chtml.js?config=MML_HTML" --urlstyle=file --timestamp=0
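The variants of STEP2 above differ only in the splitting and navigation options, so they can be collected in one function. The following is a sketch under that assumption; it is not the content of any of the STEP2*.csh scripts, and the function name step2 and its mode argument are hypothetical. All latexmlpost options are copied from the commands above.

```shell
# Sketch of a STEP2 wrapper (the author's STEP2*.csh scripts may differ).
# Usage: step2 Handout          -> one monolithic handout-folder/Handout.html
#        step2 Handout split    -> one HTML file per section, with navigation
step2() {
  base=$1
  mode=${2:-single}
  outdir=handout-folder
  mathjax="https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-mml-chtml.js?config=MML_HTML"
  mkdir -p "$outdir" || return 1
  # Options common to both modes.
  set -- "$base.xml" --dest="$outdir/$base.html" --format=html5 \
         --javascript="$mathjax" --urlstyle=file --timestamp=0
  # Splitting and navigation are added only on request.
  if [ "$mode" = split ]; then
    set -- "$@" --split --splitat=section --navigation=context
  fi
  latexmlpost "$@"
}
```

Keeping the long MathJax URL in one variable avoids the copy-and-paste line-break problem mentioned above.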
Recommendation 2.
Use script STEP2best.csh which produces clean output well suited for CANVAS.
2.3.3. Testing
You should now test your HTML files by opening
Handout.html
in your browser. Make sure all the files created by STEP2 are in the same folder, including your section files such as S1.html as well as LaTeXML.css.
If needed, redo your LaTeX files and redo the HTML. See Recommendation 1.
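Whether all files needed by Handout.html are in the same folder can also be checked from the command line. The helper below is a hypothetical sketch: it scans href and src attributes, skips remote URLs and in-page anchors, and reports local files that do not exist. It assumes attribute values in double quotes and file names without spaces.

```shell
# Hypothetical helper: list local files referenced by href="..." or src="..."
# in an HTML file that are missing from the folder containing that file.
check_local_refs() {
  html=$1
  dir=$(dirname "$html")
  status=0
  for ref in $(grep -oE '(href|src)="[^"#]+' "$html" \
                 | sed -E 's/^(href|src)="//' \
                 | grep -vE '^(https?:|mailto:|data:|//)' \
                 | sort -u); do
    if [ ! -e "$dir/$ref" ]; then
      echo "missing: $ref"
      status=1
    fi
  done
  return "$status"
}
```

For example, check_local_refs handout-folder/Handout.html prints one line per missing file and returns a nonzero status if any file is missing.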
2.4. Upload and use in CANVAS
The resulting html file(s) can be used in CANVAS in one of two ways: as HTML files or within CANVAS pages.
2.4.1. Upload the HTML files to CANVAS
You can upload Handout.html (and any other files such as S1.html, x1.jpg, and css files that LaTeXML produced …) directly to CANVAS and point to these from your course page.
The file(s) receive a high CANVAS score.
The file(s) can be read by an external screen reader tool such as NVDA. The screen reader reads the file properly, including the math, but reads the math in a way somewhat inferior to the CANVAS Immersive Reader. For example, it reads u_x as “u x”, whereas the Immersive Reader reads it as “u sub X”.
Note that if you chose to split the file into sections, and they are all uploaded to CANVAS, they will be visible with appropriate navigation links. However, the screen reader might not be able to uncover the links to the individual sections.
2.4.2. Embed the HTML code by copying and pasting in CANVAS page
This step requires that you open the HTML file with your text editor, and copy and paste the file contents to a CANVAS “Page” within CANVAS (click Edit, then View, then HTML Editor, then raw HTML).
This file can be read by the Immersive Reader. You can click the Immersive Reader button to test. The math is read the same way as math content created by Rich Text Editor.
This page might receive a high CAA, but we are not able to see the score directly. Canvas Ally reports concerns in a summary page.
However, the Canvas Reader is designed to read the content of the current page only and does not navigate to the “next section” within a single Canvas page.
2.4.3. CAA score notes
The files Handoutnosec.html we tested receive CAA scores of 88% and 98%. The concern reported is that “The HTML content does not have a language set”; for this issue the tool states “guidance not available yet. We are updating the guidance for this issue.”
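One possible way to address the missing language for uploaded HTML files is to add a lang attribute to the opening html tag after STEP2. The following is a hedged sketch, not something tested against CAA in these notes; the helper name add_html_lang and the default language code en are assumptions.

```shell
# Hypothetical helper: add lang="en" (or another code) to the opening <html>
# tag of a file, unless the tag already carries a lang attribute.
add_html_lang() {
  file=$1
  lang=${2:-en}
  if grep -qE '<html[^>]*[[:space:]]lang=' "$file"; then
    return 0
  fi
  sed -E "s/<html([[:space:]>])/<html lang=\"$lang\"\\1/" "$file" > "$file.tmp" \
    && mv "$file.tmp" "$file"
}
```

For example, add_html_lang handout-folder/Handout.html changes the opening tag to carry lang="en" and leaves a file that already declares a language untouched. Whether this raises the CAA score would need to be checked in CANVAS.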
2.5. Scripts and utilities
Handout.tex
Handout.html
LaTeXML2CANVAS.tex
Handout.tex
STEP1.csh
STEP2.csh
STEP2nosec.csh
STEP2nosecnonav.csh
STEP2best.csh
3. LaTeX elements that work or not with LaTeXML
In addition to the LaTeX style elements already discussed in [13], here are notes on LaTeX elements as they relate to LaTeXML. We also mention other issues.
3.1. Style of LaTeX preferred by LaTeXML
-
(1)
Develop LaTeXML specific commands/environments or non-LaTeXML specific commands. You can do this by including in the preamble
\usepackage{latexml}
\iflatexml
% Code to be executed ONLY when processed by LaTeXML
\else
% Code to be executed ONLY when processed by standard LaTeX
\fi
You will need the file latexml.sty for this (Section 2.1).
-
(2)
Avoid using extra (font) styles such as boldface and italics, and so on. While this strategy contradicts my own style preferences, it accelerates the efforts towards accessibility presented here.
Recommendation 3.
Stick to plain LaTeX as much as possible, e.g., let the floats be floats. Use section headers and headers for tables and other style elements as recommended in [13].
-
(3)
Section numbering.
To make sure your sections are numbered correctly, include in the preamble the statement
\setcounter{secnumdepth}{3}
Without this trick, I have seen that the code does not process cleanly with LaTeXML.
The unnumbered versions such as subsection* also work well.
-
(4)
Splitting sections: during conversion to HTML, you can choose whether or not to split the HTML code into separate HTML files, one per section, as described in Section 2.3.2.
In Canvas, the HTML files for all sections of the document must be uploaded at the same time.
If you upload multiple documents at different times, and each has several sections, you have to disambiguate the file names for each of the documents by editing the HTML files directly. This might be complicated and fragile.
Recommendation 4.
Do not split the document into multiple HTML files, and use the STEP2nosec.csh script rather than STEP2.csh.
-
(5)
When parsing equations, LaTeXML is more picky than standard LaTeX. For example, you may get warnings for some math that is not set up right, e.g. when a parenthesis is missing. See Recommendation 1.
-
(6)
References to equations processed by LaTeXML work well, but LaTeXML is more picky than, e.g., pdflatex.
Environment equation works well.
Environment eqnarray works well.
Environment subequations works well.
However, LaTeXML is picky: a warning will be given if the environment eqnarray is used for only one equation.
-
(7)
The tables process fine with LaTeXML and the resulting HTML works very well. However, you need to have LaTeXML.css for the vertical and horizontal lines to appear in the resulting HTML file. All the tables are also placed where they appear in the text.
Finally, their CAA score is problematic in the two options listed in Section 2.4.
WARNING 5.
The output of tables and the use of HTML code with the tables in CANVAS is still being investigated.
-
(8)
The processing of pictures, as announced in LaTeXML manual [2], should work. One can obtain HTML code with newly created images (ImageMagick produces them) as output of the process. You should use alt-text of course as recommended.
However, it is not clear how to take advantage of this in CANVAS. In fact, copying an image file required by the HTML code to CANVAS will result in a low CAA.
WARNING 6.
Working with images is still being tested in CANVAS. While LaTeXML processes these fine, their upload to CANVAS as raw HTML requires some further study. Currently images cannot be processed in this option.
-
(9)
There are many environments I have not used or tested, because they are reported to have issues when processed with LaTeXML and because they might present issues due to low accessibility.
These include tikz, the use of color, shaded, mdframed, and complex tables. Also, beamer, alas. (Section 4.2.3)
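Items (3) and (6) of the list above can be collected in a small test file. The following is a minimal sketch in plain LaTeX; the theorem name, labels, and equations are placeholders chosen for illustration and are not taken from the example handout.

```latex
\documentclass{article}
\usepackage{amsmath}
\setcounter{secnumdepth}{3}   % item (3): number down to subsubsection
\newtheorem{theorem}{Theorem}

\begin{document}
\section{Example}
\subsection*{An unnumbered subsection}

\begin{theorem}\label{thm:example}
Equation \eqref{eq:example} is referenced from a theorem-like environment.
\end{theorem}

% Item (6): a single numbered equation goes in equation, not eqnarray.
\begin{equation}\label{eq:example}
  u_x + u_y = 0
\end{equation}

% Several related equations can share a parent number.
\begin{subequations}\label{eq:system}
\begin{equation} u_x = v \end{equation}
\begin{equation} v_y = 0 \end{equation}
\end{subequations}
\end{document}
```

Processing such a file first with pdflatex and then with STEP1 is a quick way to see which warnings come from LaTeXML itself rather than from the content.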
4. Future steps for use of LaTeXML for documents in CANVAS
For now, we are optimistic regarding the short-term efficiency and stability of our workflow using LaTeXML.
4.1. Short-term
The LaTeX files converted to html by LaTeXML and uploaded to CANVAS have a high CAA. The LaTeX style modification requires some effort, but for what we tested, it seems like this is a reasonable effort.
4.2. Mid-term and long-term planning
I am unsure about the long-term prospects of the various installations of LaTeXML for widespread use as a tool for converting LaTeX to accessible CANVAS math materials. Work is needed from both communities.
4.2.1. Development team of LaTeXML and their success is amazing, but what about the resources?
The LaTeXML tool is great. The LaTeXML package is developed and maintained by a small group of developers listed at github. Some changes to the repo are from over 2 years ago. The package is not available on CTAN or through Overleaf. Some materials are available on github [4].
4.2.2. Installation on Windows
This is a sore spot for the author.
For installation on Windows, the LaTeXML manual recommends the use of the packages Chocolatey or Strawberry Perl. While I was able to install both on my Windows 11 system, and seemingly was able to install LaTeXML, some components of the installation were reported as broken and incomplete. This might be due to my particular system, with its long life, too many operating systems, and too many software installation changes and instances, including Cygwin, MobaXterm, and Windows Subsystem for Linux.
The author plans to take a more aggressive pass at this when time allows.
4.2.3. The lack of integration of beamer with LaTeXML
Well, this favorite package for presentations has also been very useful for slides which constituted class notes.
Beamer is not integrated at this time with LaTeXML. In particular, the basic environments such as
\begin{frame}
...
\end{frame}
\frame{
\frametitle{my frame title}
... my frame content...
}
do not work right off the bat.
However, there are numerous posts in the community requesting this integration which suggest there might be opportunities in the future.
Perhaps one can also find workarounds based on the conditional iflatexml statement. For example, define some environments which will work with both LaTeXML and plain LaTeX.
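As an illustration of such a workaround, the sketch below defines a stand-in environment in an article-class file. It is not beamer and has not been tested as a replacement for it; the environment name noteframe and both definitions are assumptions made for this example.

```latex
\documentclass{article}
\usepackage{latexml}   % defines \iflatexml; needs latexml.sty

% Hypothetical environment noteframe{<title>}: a stand-in for a titled
% slide inside an article-class file.
\iflatexml
  % Under LaTeXML: an unnumbered subsection, so the title becomes a heading.
  \newenvironment{noteframe}[1]{\subsection*{#1}}{}
\else
  % Under standard LaTeX: the same heading followed by an indented block.
  \newenvironment{noteframe}[1]{\subsection*{#1}\begin{quote}}{\end{quote}}
\fi

\begin{document}
\begin{noteframe}{my frame title}
... my frame content...
\end{noteframe}
\end{document}
```

The same source then processes with both tools, at the price of giving up beamer features such as overlays.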
4.3. Other options
Every now and then we encounter a mention of other tools, in particular for LaTeX to HTML conversion. For each such tool, I recommend checking how it handles the math.
4.3.1. Pandoc
Pandoc can convert from LaTeX to HTML, and to and between many other file formats. Pandoc is easy to install. However, its ability to parse complex math is very limited, and the documents cannot be easily read by screen readers.
4.3.2. MathML
There is a community around W3 developers who created the gold standards for all we have today. See their current activity at https://www.w3.org/Math/.
4.3.3. BookML
BookML [9] is a recent add-on to LaTeXML (it is powered by LaTeXML). It can work in several modes: standalone (after installation), and on Overleaf through github.
Pros are as follows: citing from the available information, “BookML is a fully automated solution for the production of accessible html content straight from LaTeX, based on LaTeXML for the widest LaTeX compatibility and bookdown tool for a modern and accessible look. Integration with Overleaf is provided via a GitHub action. Outputs are also packaged as SCORM for ease of use in higher education. Created by and maintained for maths lecturers at the University of Leeds.”
BookML has potential as a blueprint for future community efforts. However, as of today, I have the concerns listed below.
-
(1)
In the Overleaf-github functionality, BookML requires a paid subscription to Overleaf for the github connection to work, and one receives an email with the processed document.
-
(2)
BookML is not a lightweight tool; it is designed to convert an entire folder of LaTeX files rather than one file alone.
-
(3)
The installation is not immediate. I installed and tested bookml using a distribution from https://github.com/vlmantova/bookml/releases
The tool requires LaTeXML version 0.8.8, to which one can upgrade from sources [4]; see Section 2.
The tool also requires mutool. On my system it is installed with
sudo apt install mupdf-tools
(I do not recommend AppImage from 2020)
-
(4)
The tool bookml produces a lot of converted files in folder auxdir (also, PDF) rather than just HTML and thus is not lightweight.
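Before trying bookml, one can check that the prerequisites named in item (3) are available. The helper below is a hypothetical sketch; it only tests whether the commands are on the PATH and does not check their versions.

```shell
# Hypothetical helper: print the names of the given commands that are
# not found on the PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}
```

For example, missing_tools latexml latexmlpost mutool prints nothing when all three are installed. The installed LaTeXML version can then be read from latexml --VERSION.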
References
- [1] https://community.instructure.com/en/kb/articles/662723-what-are-the-canvas-accessibility-standards
- [2] https://math.nist.gov/~BMiller/LaTeXML/
- [3] Wikipedia information on LaTeXML https://en.wikipedia.org/wiki/LaTeXML
- [4] https://github.com/brucemiller/LaTeXML
- [5] https://github.com/brucemiller/LaTeXML/tree/master/lib/LaTeXML/texmf
- [6] Arxiv https://info.arxiv.org/help/submit_latex_best_practices.html
- [7] e.g., https://arxiv.org/html/1404.6549v1
- [8] Ian Price, Converting LaTeX notes to HTML https://www.universityofgalway.ie/media/accessibility/files/Converting-LaTeX-notes-to-HTML.pdf
- [9] Vincenzo Mantova, BookML: LaTeX to html, made easy(ish). Powered by LaTeXML. 14th January 2026 https://vlmantova.github.io/bookml/docs.pdf
- [10] https://ctan.org/pkg/accessibility?lang=en
- [11] https://math.oregonstate.edu/~mpesz/latex/accessible/
- [12] https://math.oregonstate.edu/~mpesz/latex/accessible/accessible.zip
- [13] https://math.oregonstate.edu/~mpesz/latex/accessible/accessible-more-more.pdf
- [14] https://math.oregonstate.edu/~mpesz/latex/accessible/latexml-workflow.html
- [15] https://math.oregonstate.edu/~mpesz/latex/accessible/latexml-handout.html