Accessible PDF (Tagged PDF) and PDF/A
SelectPdf can generate tagged (accessible) PDF documents that conform to the PDF/UA accessibility standard, and PDF/A documents (including PDF/A-3) suitable for long-term archiving.
A tagged PDF carries a logical structure tree describing the meaning of the content - headings, paragraphs, lists, tables, figures with alternate text, links and the reading order. Assistive technologies such as screen readers use this structure to present the document to users with disabilities. Decorative content (backgrounds, rules, watermarks) is marked as an artifact so it is skipped by assistive technologies.
Tagged (accessible) PDF and PDF/A-3 generation are available only in the full commercial SelectPdf library. They are not available in the free SelectPdf community edition.
Tagged (PDF/UA) output and PDF/A-3a (level A) require the Blink or Chromium rendering engine, because only these engines produce the logical structure tree. The default WebKit engine (and WebKit Restricted) cannot generate tagged content and raises an error if Tagged or PdfStandard.PdfA3A is requested, so set RenderingEngine to RenderingEngine.Chromium (or RenderingEngine.Blink) for these outputs. The untagged archival levels PdfStandard.PdfA3B and PdfStandard.PdfA3U are produced by every engine, including the default WebKit. See Rendering Engines.
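The engine rule above can be captured in a small helper. This is only a minimal sketch: the class EngineChooser and its method names are hypothetical and not part of the SelectPdf API. It assumes that leaving RenderingEngine untouched keeps the default WebKit engine.

```csharp
using SelectPdf;

// Hypothetical helper (not part of SelectPdf): switches to an engine that
// can build the logical structure tree only when the output needs one.
public static class EngineChooser
{
    // Tagged output and PDF/A-3a (level A) both need the structure tree.
    public static bool NeedsStructureTree(bool tagged, PdfStandard standard)
    {
        return tagged || standard == PdfStandard.PdfA3A;
    }

    public static void Apply(HtmlToPdf converter, bool tagged, PdfStandard standard)
    {
        if (NeedsStructureTree(tagged, standard))
        {
            // Blink would work as well according to the text above
            converter.Options.RenderingEngine = RenderingEngine.Chromium;
        }

        if (tagged)
        {
            converter.Options.Tagged = true;
        }

        converter.Options.PdfStandard = standard;
    }
}
```

For example, EngineChooser.Apply(converter, false, PdfStandard.PdfA3B) leaves the engine unchanged, while requesting PdfStandard.PdfA3A switches it to Chromium.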
Set Tagged to true. The converter takes the document structure (headings, lists, tables, figures, links, reading order) directly from the rendering engine's accessible output, so a well-authored HTML page produces a well-structured PDF. Set Title and Language to describe the document (a title and a language are required for an accessible PDF).
HtmlToPdf converter = new HtmlToPdf();

// tagged / PDF/UA output requires the Blink or Chromium engine
converter.Options.RenderingEngine = RenderingEngine.Chromium;

// produce a tagged, accessible (PDF/UA) document
converter.Options.Tagged = true;
converter.Options.Title = "Quarterly Report";
converter.Options.Language = "en-US";

PdfDocument doc = converter.ConvertUrl("https://www.example.com");
doc.Save("accessible.pdf");
doc.Close();
The quality of the resulting accessibility depends on the source HTML: images should have an alt attribute, headings should be used in order, tables should use proper table markup, and so on.
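As an illustration of source markup that follows these guidelines, the sketch below converts an inline HTML string with ConvertHtmlString. The class name, the markup and the image URL are placeholders chosen for this example, not taken from a real page.

```csharp
using SelectPdf;

public static class AccessibleHtmlExample
{
    // Illustrative markup: document language, ordered headings,
    // an image with alt text and a table with header cells.
    public const string Html = @"<!DOCTYPE html>
<html lang=""en-US"">
<head><title>Quarterly Report</title></head>
<body>
  <h1>Quarterly Report</h1>
  <h2>Sales</h2>
  <img src=""https://www.example.com/chart.png"" alt=""Sales grew 20% year over year"">
  <table>
    <tr><th>Region</th><th>Sales</th></tr>
    <tr><td>North</td><td>100</td></tr>
  </table>
</body>
</html>";

    public static void Run()
    {
        HtmlToPdf converter = new HtmlToPdf();

        // tagged output requires the Blink or Chromium engine
        converter.Options.RenderingEngine = RenderingEngine.Chromium;
        converter.Options.Tagged = true;
        converter.Options.Title = "Quarterly Report";
        converter.Options.Language = "en-US";

        PdfDocument doc = converter.ConvertHtmlString(Html);
        doc.Save("accessible-from-string.pdf");
        doc.Close();
    }
}
```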
Set PdfStandard to the desired PdfStandard value. For archiving, PDF/A-3 is available at three conformance levels:
PdfStandard.PdfA3B - level B (visual reproduction).
PdfStandard.PdfA3U - level U (all text has Unicode mapping).
PdfStandard.PdfA3A - level A (accessible: includes the tagged logical structure). Selecting this level automatically produces a tagged document, and therefore requires the Blink or Chromium engine. Levels B and U are produced by every engine.
HtmlToPdf converter = new HtmlToPdf();

// PDF/A-3A is archival AND accessible (tagging is enabled automatically);
// like all tagged output it requires the Blink or Chromium engine
converter.Options.RenderingEngine = RenderingEngine.Chromium;
converter.Options.PdfStandard = PdfStandard.PdfA3A;
converter.Options.Title = "Invoice 2026-0042";
converter.Options.Language = "en-US";

PdfDocument doc = converter.ConvertUrl("https://www.example.com/invoice");
doc.Save("invoice-pdfa3a.pdf");
doc.Close();
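For comparison, an archive-only document at level B does not need a structure tree, so no engine has to be selected. The following sketch relies on that statement. The class name, the HTML string and the output file name are illustrative only.

```csharp
using SelectPdf;

public static class ArchiveOnlyExample
{
    public static void Run()
    {
        HtmlToPdf converter = new HtmlToPdf();

        // RenderingEngine is left at its default: PDF/A-3B is untagged,
        // so every engine can produce it
        converter.Options.PdfStandard = PdfStandard.PdfA3B;
        converter.Options.Title = "Invoice 2026-0042";

        PdfDocument doc = converter.ConvertHtmlString(
            "<html><body><h1>Invoice 2026-0042</h1><p>Total: 100</p></body></html>");
        doc.Save("invoice-pdfa3b.pdf");
        doc.Close();
    }
}
```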
When building a PDF document with the SelectPdf creation API, enable tagging on the PdfDocument and assign a structure type to each page element through its TagType property. Use AlternateText to provide a textual description for figures, and mark purely decorative elements with Artifact.
In a tagged document every content element you add to a page must set either TagType (real content) or Artifact = true (decorative). Adding an element with neither raises an error rather than silently producing an inaccessible document.
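The rule can be observed with a short experiment. The sketch below adds an element that sets neither TagType nor Artifact. The text above does not say which exception type is thrown or whether the error surfaces on Add or on Save, so the code catches the general Exception around both calls. The class and method names are hypothetical.

```csharp
using System;
using SelectPdf;

public static class UntaggedElementCheck
{
    // Returns true if the document was saved, false if SelectPdf
    // rejected the element that has neither TagType nor Artifact set.
    public static bool TryAddUntagged()
    {
        PdfDocument doc = new PdfDocument();
        doc.Tagged = true;
        doc.Language = "en-US";
        doc.Title = "Untagged element check";

        PdfPage page = doc.AddPage();
        PdfFont font = doc.AddFont(new System.Drawing.Font("Arial", 11), true);

        try
        {
            // deliberately no TagType and no Artifact
            page.Add(new PdfTextElement(10, 10, "No tag set", font));
            doc.Save("untagged-check.pdf");
            return true;
        }
        catch (Exception ex)
        {
            // exception type is an assumption; the message is printed for inspection
            Console.WriteLine("Rejected: " + ex.Message);
            return false;
        }
        finally
        {
            doc.Close();
        }
    }
}
```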
Tagged and PDF/A documents require that every font is embedded. Add fonts created from a System.Drawing.Font with the embed parameter set to true (the non-embedded standard base fonts cannot be used).
// create an accessible PDF/A-3A document
PdfDocument doc = new PdfDocument(PdfStandard.PdfA3A);
doc.Tagged = true;
doc.Language = "en-US";
doc.Title = "Accessible Document";

PdfPage page = doc.AddPage();

// embedded fonts are required for tagged / PDF/A output
PdfFont headingFont = doc.AddFont(new System.Drawing.Font("Arial", 16, System.Drawing.FontStyle.Bold), true);
PdfFont bodyFont = doc.AddFont(new System.Drawing.Font("Arial", 11), true);

// a heading
PdfTextElement heading = new PdfTextElement(10, 10, "Annual Report", headingFont);
heading.TagType = PdfTagType.Heading1;
page.Add(heading);

// a paragraph
PdfTextElement para = new PdfTextElement(10, 40, 480, "This document is tagged for accessibility.", bodyFont);
para.TagType = PdfTagType.Paragraph;
page.Add(para);

// a figure with alternate text
PdfImageElement figure = new PdfImageElement(10, 90, "chart.png");
figure.TagType = PdfTagType.Figure;
figure.AlternateText = "Sales grew 20% year over year";
page.Add(figure);

// a decorative rule (excluded from the structure)
PdfLineElement rule = new PdfLineElement(10, 80, 480, 80);
rule.Artifact = true;
page.Add(rule);

doc.Save("created-accessible.pdf");
doc.Close();
Tables and lists need a nested structure (a table contains rows, a row contains header and data cells; a list contains list items). Open a container with BeginTag(PdfTagType) and close it with EndTag. Elements added, and further containers opened, before the matching EndTag become children of that container.
doc.BeginTag(PdfTagType.Table);
doc.BeginTag(PdfTagType.TableRow);
page.Add(new PdfTextElement(10, 40, "Region", bodyFont) { TagType = PdfTagType.TableHeader });
page.Add(new PdfTextElement(80, 40, "Sales", bodyFont) { TagType = PdfTagType.TableHeader });
doc.EndTag();
doc.BeginTag(PdfTagType.TableRow);
page.Add(new PdfTextElement(10, 60, "North", bodyFont) { TagType = PdfTagType.TableDataCell });
page.Add(new PdfTextElement(80, 60, "100", bodyFont) { TagType = PdfTagType.TableDataCell });
doc.EndTag();
doc.EndTag();
The PdfStandard enumeration selects the standard of the generated document. The archival values include PdfA (PDF/A-1b), PdfA2B (PDF/A-2b), and the PDF/A-3 family: PdfA3B, PdfA3U and PdfA3A. For accessibility only (without the archival constraints), set Tagged to true on the converter options or on the PdfDocument and keep the default PdfStandard.Full.
PDF/A does not permit certain interactive features: sound, movie and screen annotations, and link actions such as launching an external file or running JavaScript. Adding one of these to a PDF/A document raises an error that names the unsupported feature, rather than producing a non-conformant file.