[PATCH 0/9] Fix broken HTML in API docs

While checking API docs after recent migration to gitlab-pages I've noticed that the footer is not properly rendered. A deeper dig showed that the issue is that empty <div> elements, while formatted by the XML output version of XSLT conversion is turned into the non-pair empty variant. This contradicts the HTML standard which we declare to use for our pages and thus is mis-parsed into dom, where every empty div is treated as a start of a div, thus nesting everything. This series fixes the above and also many other errors pointed out by the HTML validator at: https://validator.w3.org/ Compare: https://libvirt.org/html/libvirt-libvirt-domain.html vs https://pipo.sk.gitlab.io/-/libvirt/-/jobs/6288946946/artifacts/website/html... and the validation: https://validator.w3.org/nu/?doc=https%3A%2F%2Flibvirt.org%2Fhtml%2Flibvirt-... vs https://validator.w3.org/nu/?doc=https%3A%2F%2Fpipo.sk.gitlab.io%2F-%2Flibvi... Also non-API pages validation: https://validator.w3.org/nu/?doc=https%3A%2F%2Flibvirt.org%2Fdocs.html vs https://validator.w3.org/nu/?doc=https%3A%2F%2Fpipo.sk.gitlab.io%2F-%2Flibvi... Peter Krempa (9): docs: site: Don't generate '<?xml' header for HTML documents docs: page: Add 'lang="en"' for all HTML output documents docs: page: Fix declaration of main javascript source docs: index: Fix import of blog planet javascript docs: newapi: Don't generate empty <div> in template for ACL permissions docs: newapi: Avoid empty <div>s when there is no description docs: newapi: Avoid table where every row has an cell with 'colspan' docs: newapi: Properly skip ACL entries if empty docs: newapi: Fix generation of type definition tables docs/index.rst | 2 +- docs/newapi.xsl | 158 ++++++++++++++++++++++++++---------------------- docs/page.xsl | 7 ++- docs/site.xsl | 5 +- 4 files changed, 93 insertions(+), 79 deletions(-) -- 2.43.0

Skip the XML header as it's invalid with <!DOCTYPE HTML> both for the RST-generated pages and for the API docs generated from the API XML. Additionally remove the spurious xsl:output directive from newapi.xsl which is ignored and thus misleading. Signed-off-by: Peter Krempa <pkrempa@redhat.com> --- docs/newapi.xsl | 7 +++---- docs/site.xsl | 5 +---- 2 files changed, 4 insertions(+), 8 deletions(-) diff --git a/docs/newapi.xsl b/docs/newapi.xsl index d6f8d88170..38cefb2ca8 100644 --- a/docs/newapi.xsl +++ b/docs/newapi.xsl @@ -17,8 +17,6 @@ <!-- Import the main part of the site stylesheets --> <xsl:import href="page.xsl"/> - <xsl:output method="xml" encoding="UTF-8" indent="yes"/> - <!-- Build keys for all symbols --> <xsl:key name="symbols" match="/api/symbols/*" use="@name"/> @@ -811,8 +809,9 @@ <xsl:document href="{concat($htmldir, '/libvirt-', @name, '.html')}" method="xml" - indent="yes" - encoding="UTF-8"> + omit-xml-declaration="yes" + encoding="UTF-8" + indent="yes"> <xsl:apply-templates select="exsl:node-set($subpage)" mode="page"> <xsl:with-param name="timestamp" select="$timestamp"/> <xsl:with-param name="link_href_base" select="$href_base"/> diff --git a/docs/site.xsl b/docs/site.xsl index c0b56be72f..5a5fea2350 100644 --- a/docs/site.xsl +++ b/docs/site.xsl @@ -22,10 +22,7 @@ <xsl:apply-templates select="exsl:node-set($inchtml)/html:html/html:body/*" mode="content"/> </xsl:template> - <xsl:output - method="xml" - encoding="UTF-8" - indent="yes"/> + <xsl:output method="xml" omit-xml-declaration="yes" encoding="UTF-8" indent="yes"/> <xsl:template match="/"> <xsl:apply-templates select="." mode="page"> -- 2.43.0

Per the w3 HTML validator the 'lang' attribute is suggested. Signed-off-by: Peter Krempa <pkrempa@redhat.com> --- docs/page.xsl | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/page.xsl b/docs/page.xsl index a51587db6c..7dcbc2d7a4 100644 --- a/docs/page.xsl +++ b/docs/page.xsl @@ -21,7 +21,7 @@ <xsl:param name="asset_href_base"/> <xsl:text disable-output-escaping="yes"><!DOCTYPE html> </xsl:text> - <html data-sourcedoc="{$pagesrc}"> + <html lang="en" data-sourcedoc="{$pagesrc}"> <xsl:comment> This file is autogenerated from <xsl:value-of select="$pagesrc"/> Do not edit this file. Changes will be lost. -- 2.43.0

Per the w3 html validator a HTML/XML comment is not allowed inside the <script> tag, use a space instead as it must be a pair tag. Additionally drop the 'type' attribute as it's not needed (validator warns about it). Signed-off-by: Peter Krempa <pkrempa@redhat.com> --- docs/page.xsl | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/docs/page.xsl b/docs/page.xsl index 7dcbc2d7a4..71c7cfba9d 100644 --- a/docs/page.xsl +++ b/docs/page.xsl @@ -44,8 +44,9 @@ <meta name="go-import" content="{/html:html/html:head/html:meta[@name='go-import']/@content}"/> </xsl:if> - <script type="text/javascript" src="{$asset_href_base}js/main.js"> - <xsl:comment>// forces non-empty element</xsl:comment> + <!-- force non-empty script tag --> + <script src="{$asset_href_base}js/main.js"> + <xsl:text> </xsl:text> </script> </head> <body onload="pageload()"> -- 2.43.0

Similarly to previous commit drop the 'type' attribute which is frowned upon by the HTML standard. Signed-off-by: Peter Krempa <pkrempa@redhat.com> --- docs/index.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/index.rst b/docs/index.rst index 79afae65da..6b2f2386ee 100644 --- a/docs/index.rst +++ b/docs/index.rst @@ -65,7 +65,7 @@ Blog Planet .. raw:: html - <script type="text/javascript" src="js/virt-tools-blog-planet.js"> + <script src="js/virt-tools-blog-planet.js"> </script> <div id="planet"> </div> -- 2.43.0

If an API has no ACLs an empty <div class='acl'/> would be generated which is mis-interpreted by browsers when creating DOM to nest any subsequent elements under it. Don't generate the ACL section div unless it will be filled. Best viewed with 'git show -w' Signed-off-by: Peter Krempa <pkrempa@redhat.com> --- docs/newapi.xsl | 60 +++++++++++++++++++++++++------------------------ 1 file changed, 31 insertions(+), 29 deletions(-) diff --git a/docs/newapi.xsl b/docs/newapi.xsl index 38cefb2ca8..b60680ae97 100644 --- a/docs/newapi.xsl +++ b/docs/newapi.xsl @@ -26,30 +26,34 @@ <xsl:template name="aclinfo"> <xsl:param name="acl"/> - <xsl:if test="count($acl/check) > 0"> - <h5>Access control parameter checks</h5> - <table> - <thead> - <tr> - <th>Object</th> - <th>Permission</th> - <th>Condition</th> - </tr> - </thead> - <xsl:apply-templates select="$acl/check" mode="acl"/> - </table> - </xsl:if> - <xsl:if test="count($acl/filter) > 0"> - <h5>Access control return value filters</h5> - <table> - <thead> - <tr> - <th>Object</th> - <th>Permission</th> - </tr> - </thead> - <xsl:apply-templates select="$acl/filter" mode="acl"/> - </table> + <xsl:if test="count($acl/check) > 0 or count($acl/filter) > 0"> + <div class="acl"> + <xsl:if test="count($acl/check) > 0"> + <h5>Access control parameter checks</h5> + <table> + <thead> + <tr> + <th>Object</th> + <th>Permission</th> + <th>Condition</th> + </tr> + </thead> + <xsl:apply-templates select="$acl/check" mode="acl"/> + </table> + </xsl:if> + <xsl:if test="count($acl/filter) > 0"> + <h5>Access control return value filters</h5> + <table> + <thead> + <tr> + <th>Object</th> + <th>Permission</th> + </tr> + </thead> + <xsl:apply-templates select="$acl/filter" mode="acl"/> + </table> + </xsl:if> + </div> </xsl:if> </xsl:template> @@ -702,11 +706,9 @@ </xsl:if> </dl> </xsl:if> - <div class="acl"> - <xsl:call-template name="aclinfo"> - <xsl:with-param name="acl" select="acls"/> - </xsl:call-template> - </div> + <xsl:call-template name="aclinfo"> + <xsl:with-param name="acl" select="acls"/> + </xsl:call-template> </xsl:template> <xsl:template match="exports" mode="toc"> -- 2.43.0

The various objects we generate API for may have empty description in which case an empty div would be generated when processing the API description. As we're using XML output mode the generator would shorten such divs to the non-pair empty element version, which doesn't work well with HTML5 parsers requiring a pair tag for <div> Avoid empty description <div> elements altogether by skipping it if the description is empty. Signed-off-by: Peter Krempa <pkrempa@redhat.com> --- docs/newapi.xsl | 40 +++++++++++++++++++++++++--------------- 1 file changed, 25 insertions(+), 15 deletions(-) diff --git a/docs/newapi.xsl b/docs/newapi.xsl index b60680ae97..539e0a4175 100644 --- a/docs/newapi.xsl +++ b/docs/newapi.xsl @@ -222,6 +222,19 @@ </xsl:if> </xsl:template> + <xsl:template name="formattextdiv"> + <xsl:param name="text"/> + <xsl:param name="divclass"/> + + <xsl:if test="$text"> + <div class="{$divclass}"> + <xsl:call-template name="formattext"> + <xsl:with-param name="text" select="$text"/> + </xsl:call-template> + </div> + </xsl:if> + </xsl:template> + <xsl:template match="macro" mode="toc"> <span class="directive">#define</span><xsl:text> </xsl:text> <a href="#{@name}"><xsl:value-of select="@name"/></a> @@ -287,11 +300,10 @@ <xsl:template match="typedef[@type = 'enum']"> <xsl:variable name="name" select="string(@name)"/> <h3><a id="{$name}"><code><xsl:value-of select="$name"/></code></a></h3> - <div class="description"> - <xsl:call-template name="formattext"> + <xsl:call-template name="formattextdiv"> <xsl:with-param name="text" select="info"/> + <xsl:with-param name="divclass">description</xsl:with-param> </xsl:call-template> - </div> <div class="api"> <pre> <span class="keyword">enum</span><xsl:text> </xsl:text> @@ -343,11 +355,10 @@ <xsl:text>;</xsl:text> </pre> </div> - <div class="description"> - <xsl:call-template name="formattext"> + <xsl:call-template name="formattextdiv"> <xsl:with-param name="text" select="info"/> + <xsl:with-param name="divclass">description</xsl:with-param> </xsl:call-template> - </div> </xsl:template> <xsl:template match="struct" mode="toc"> @@ -452,11 +463,11 @@ <xsl:variable name="name" select="string(@name)"/> <h3><a id="{$name}"><code><xsl:value-of select="$name"/></code></a></h3> <pre class="api"><span class="directive">#define</span><xsl:text> </xsl:text><xsl:value-of select="$name"/></pre> - <div class="description"> - <xsl:call-template name="formattext"> + <xsl:call-template name="formattextdiv"> <xsl:with-param name="text" select="info"/> + <xsl:with-param name="divclass">description</xsl:with-param> </xsl:call-template> - </div><xsl:text> + <xsl:text> </xsl:text> </xsl:template> @@ -604,11 +615,10 @@ <xsl:text>) </xsl:text> </pre> - <div class="description"> - <xsl:call-template name="formattext"> + <xsl:call-template name="formattextdiv"> <xsl:with-param name="text" select="info"/> + <xsl:with-param name="divclass">description</xsl:with-param> </xsl:call-template> - </div> <xsl:if test="arg | return"> <dl class="variablelist"> <xsl:for-each select="arg"> @@ -680,11 +690,11 @@ </xsl:for-each> <xsl:text>)</xsl:text> </pre> - <div class="description"> - <xsl:call-template name="formattext"> + <xsl:call-template name="formattextdiv"> <xsl:with-param name="text" select="info"/> + <xsl:with-param name="divclass">description</xsl:with-param> </xsl:call-template> - </div><xsl:text> + <xsl:text> </xsl:text> <xsl:if test="arg | return/@info"> <dl class="variablelist"> -- 2.43.0

The HTML standard requires that a table column must include at least one row which defines it exclusively, thus having a table where all rows unite it via 'colspan' is illegal. Modify the enum value generator to always output the description field even when it's empty rather than uniting it, as in case when each value doesn't have a description the generated document would violate the standard. Signed-off-by: Peter Krempa <pkrempa@redhat.com> --- docs/newapi.xsl | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/newapi.xsl b/docs/newapi.xsl index 539e0a4175..36bb41c877 100644 --- a/docs/newapi.xsl +++ b/docs/newapi.xsl @@ -317,9 +317,9 @@ <tr> <td><a id="{@name}"><xsl:value-of select="@name"/></a></td> <td><xsl:text> = </xsl:text></td> + <td class="enumvalue"><xsl:call-template name="enumvalue"/></td> <xsl:choose> <xsl:when test="@info != ''"> - <td class="enumvalue"><xsl:call-template name="enumvalue"/></td> <td> <div class="comment"> <xsl:call-template name="dumptext"> @@ -329,7 +329,7 @@ </td> </xsl:when> <xsl:otherwise> - <td colspan="2" class="enumvalue"><xsl:call-template name="enumvalue"/></td> + <td><xsl:comment> </xsl:comment></td> </xsl:otherwise> </xsl:choose> </tr> -- 2.43.0

The source document can contain an empty '@flags' attribute which passes the test but generates an empty element. Check that flags is non-empty to trigger the fallback. Signed-off-by: Peter Krempa <pkrempa@redhat.com> --- docs/newapi.xsl | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/newapi.xsl b/docs/newapi.xsl index 36bb41c877..51f159a2f8 100644 --- a/docs/newapi.xsl +++ b/docs/newapi.xsl @@ -62,7 +62,7 @@ <td><a href="../acl.html#object_{@object}"><xsl:value-of select="@object"/></a></td> <td><a href="../acl.html#perm_{@object}_{@perm}"><xsl:value-of select="@perm"/></a></td> <xsl:choose> - <xsl:when test="@flags"> + <xsl:when test="@flags != ''"> <td><xsl:value-of select="@flags"/></td> </xsl:when> <xsl:otherwise> -- 2.43.0

Ensure that all rows have 3 columns and avoid generation of emtpy elements which would be turned by the XML formatter into non-pair td/tr tags which don't work properly with HTML5 parsers. Signed-off-by: Peter Krempa <pkrempa@redhat.com> --- docs/newapi.xsl | 45 +++++++++++++++++++++++++-------------------- 1 file changed, 25 insertions(+), 20 deletions(-) diff --git a/docs/newapi.xsl b/docs/newapi.xsl index 51f159a2f8..aff4cf0d4e 100644 --- a/docs/newapi.xsl +++ b/docs/newapi.xsl @@ -384,7 +384,7 @@ <xsl:for-each select="field"> <xsl:choose> <xsl:when test='@type = "union"'> - <tr><td><span class="keyword">union</span> {</td></tr> + <tr><td colspan="3"><span class="keyword">union</span> {</td></tr> <tr> <td><table> <xsl:for-each select="union/field"> @@ -397,31 +397,35 @@ </span> </td> <td><xsl:value-of select="@name"/></td> - <xsl:if test="@info != ''"> - <td> + <td> + <xsl:if test="@info != ''"> <div class="comment"> <xsl:call-template name="dumptext"> <xsl:with-param name="text" select="@info"/> </xsl:call-template> </div> - </td> - </xsl:if> + </xsl:if> + <xsl:comment> </xsl:comment> + </td> </tr> </xsl:for-each> </table></td> - <td></td></tr> - <tr><td>}</td> - <td><xsl:value-of select="@name"/></td> - <xsl:if test="@info != ''"> + <td colspan="2"><xsl:comment> </xsl:comment></td> + </tr> + <tr> + <td>}</td> + <td><xsl:value-of select="@name"/></td> <td> - <div class="comment"> - <xsl:call-template name="dumptext"> - <xsl:with-param name="text" select="@info"/> - </xsl:call-template> - </div> + <xsl:if test="@info != ''"> + <div class="comment"> + <xsl:call-template name="dumptext"> + <xsl:with-param name="text" select="@info"/> + </xsl:call-template> + </div> + </xsl:if> + <xsl:comment> </xsl:comment> </td> - </xsl:if> - <td></td></tr> + </tr> </xsl:when> <xsl:otherwise> <tr> @@ -433,15 +437,16 @@ </span> </td> <td><xsl:value-of select="@name"/></td> - <xsl:if test="@info != ''"> - <td> + <td> + <xsl:if test="@info != ''"> <div class="comment"> <xsl:call-template name="dumptext"> <xsl:with-param name="text" select="@info"/> </xsl:call-template> </div> - </td> - </xsl:if> + </xsl:if> + <xsl:comment> </xsl:comment> + </td> </tr> </xsl:otherwise> </xsl:choose> -- 2.43.0

On Thu, Feb 29, 2024 at 03:58:14PM +0100, Peter Krempa wrote:
While checking API docs after recent migration to gitlab-pages I've noticed that the footer is not properly rendered.
A deeper dig showed that the issue is that empty <div> elements, while formatted by the XML output version of XSLT conversion is turned into the non-pair empty variant. This contradicts the HTML standard which we declare to use for our pages and thus is mis-parsed into dom, where every empty div is treated as a start of a div, thus nesting everything.
This series fixes the above and also many other errors pointed out by the HTML validator at:
Compare:
https://libvirt.org/html/libvirt-libvirt-domain.html
vs
https://pipo.sk.gitlab.io/-/libvirt/-/jobs/6288946946/artifacts/website/html...
and the validation:
https://validator.w3.org/nu/?doc=https%3A%2F%2Flibvirt.org%2Fhtml%2Flibvirt-...
vs
https://validator.w3.org/nu/?doc=https%3A%2F%2Fpipo.sk.gitlab.io%2F-%2Flibvi...
Also non-API pages validation:
https://validator.w3.org/nu/?doc=https%3A%2F%2Flibvirt.org%2Fdocs.html
vs
https://validator.w3.org/nu/?doc=https%3A%2F%2Fpipo.sk.gitlab.io%2F-%2Flibvi...
Peter Krempa (9): docs: site: Don't generate '<?xml' header for HTML documents docs: page: Add 'lang="en"' for all HTML output documents docs: page: Fix declaration of main javascript source docs: index: Fix import of blog planet javascript docs: newapi: Don't generate empty <div> in template for ACL permissions docs: newapi: Avoid empty <div>s when there is no description docs: newapi: Avoid table where every row has an cell with 'colspan' docs: newapi: Properly skip ACL entries if empty docs: newapi: Fix generation of type definition tables
docs/index.rst | 2 +- docs/newapi.xsl | 158 ++++++++++++++++++++++++++---------------------- docs/page.xsl | 7 ++- docs/site.xsl | 5 +- 4 files changed, 93 insertions(+), 79 deletions(-)
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
participants (2)
-
Pavel Hrdina
-
Peter Krempa