Description
Converting the RST input file https://www.kernel.org/doc/Documentation/process/license-rules.rst
to HTML with pandoc (current version 3.8) badly breaks the formatting of a so-called "simple table" at the end of the file (the source can also be seen e.g. at https://fossies.org/linux/kernel/v6.17/linux-6.17-rc6.tar.gz/linux-6.17-rc6/Documentation/process/license-rules.rst). The expected result (as converted by sphinx) can be seen e.g. at https://www.kernel.org/doc/html/v6.17-rc6/process/license-rules.html (also at the end, under "MODULE_LICENSE").
Since I am not familiar with RST and pandoc, I could not determine whether the error is in the input file or in pandoc's rendering. To me, the problem seems to be related to the whitespace at the beginning of the table rows and to the mix of spaces and tabs. Unfortunately, I have not yet found a specification for how whitespace should be handled in this context.
By the way, the "simple table" example within the pytablewriter’s documentation
also did not format correctly with pandoc.
To narrow the problem down, I created four simpler test cases (T=tab, expanded here to 8 spaces) and checked them with "sphinx" and "pandoc" (by the way, "rst2html" from the "docutils" package gave the same results as "sphinx"):
Example 1 (starting point):
The width of the first column is deliberately chosen as 9 (1 for a character + 8 for an expanded tab)
to avoid further problems.
Source:
========= ======
1 abcdef
2 abcdef
========= ======
1a) "sphinx" HTML output rendered by "firefox" (only a basic representation):
1 abcdef
2 abcdef
Okay, result as expected.
1b) "pandoc" HTML output rendered by "firefox" (only a basic representation):
1 abcdef
2 abcdef
Hmm, result basically okay, but the width of column 1 is ignored.
Example 2 (before the "2" is a single space):
Source:
========= ======
1 abcdef
2 abcdef
========= ======
2a) "sphinx" HTML output rendered by "firefox" (only a basic representation):
1 abcdef
2 abcdef
Okay, result identical to example 1.
2b) "pandoc" HTML output rendered by "firefox" (only a basic representation):
1 abcdef
2 abcdef
Hmm, an added empty line and formatting problems.
Example 3 (before the "2" is a single tab):
Source:
========= ======
1 abcdef
T2 abcdef
========= ======
3a) "sphinx" HTML output rendered by "firefox" (only a basic representation):
1 abcdef
2 abcdef
Okay, result identical to example 1
(it appears that "sphinx" expands existing tabs to 8 characters each before converting).
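The tab-expansion guess above can be illustrated with a small Python sketch. It is not the code of "sphinx", "docutils" or "pandoc"; the helper name `split_simple_table_row` is hypothetical, and the exact whitespace of the test row (one tab, then "2", one space, "abcdef") is an assumption, since the report only describes it in words. The sketch expands tabs to a given tab width and then cuts the row at the column positions given by the `=` border line.

```python
def split_simple_table_row(border: str, row: str, tab_width: int = 8) -> list[str]:
    """Cut one simple-table row at the column positions of its '=' border line.

    Illustrative helper only (hypothetical name), not taken from any real parser.
    """
    # Expand each tab to the next multiple of tab_width.
    expanded = row.expandtabs(tab_width)

    # Collect the (start, end) character span of every run of '=' in the border.
    spans = []
    start = None
    for i, ch in enumerate(border + " "):
        if ch == "=" and start is None:
            start = i
        elif ch != "=" and start is not None:
            spans.append((start, i))
            start = None

    cells = []
    for n, (s, e) in enumerate(spans):
        # Let the last column run to the end of the line.
        end = None if n == len(spans) - 1 else e
        cells.append(expanded[s:end].strip())
    return cells


border = "========= ======"
row = "\t2 abcdef"  # assumed whitespace of row 2 in example 3

print(split_simple_table_row(border, row, tab_width=8))
print(split_simple_table_row(border, row, tab_width=4))
```

With a tab width of 8 the "2" lands in column 1 and "abcdef" in column 2. With a tab width of 4 the same row is cut in the middle of "abcdef", which at least resembles the broken output in 3b). Whether a different tab width is really the cause in "pandoc" is not confirmed here.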
3b) "pandoc" HTML output rendered by "firefox" (only a basic representation):
1 abcdef
2 abcd ef
Hmm, an added empty line and more formatting problems.
Example 4 (closer to the problematic part of the Linux license file mentioned above; after the "1" is a single tab; column 1 of row 2 is empty, i.e. contains no character -> column spanning?):
Source:
========= =============
1T abcdef
T abcdef
========= =============
4a) "sphinx" HTML output rendered by "firefox" (only a basic representation):
1 abcdef abcdef
Okay (?), the "spanned" text from row 2 is added to column 2 of row 1.
4b) "pandoc" HTML output rendered by "firefox" (only a basic representation):
1 abcd ef
abcd ef
Hmm, not the expected single line with a spanned column 2, but an added empty line and more formatting problems.
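The "sphinx" result in 4a) would fit a rule where a row with an empty first column is treated as a continuation of the previous row (as far as I understand the reStructuredText simple-table description) rather than as column spanning. A minimal Python sketch of that reading follows. The function name `parse_simple_table_body` is hypothetical, this is not the real "docutils" code, and the exact whitespace of the two rows (a tab followed by two spaces before "abcdef") is an assumption.

```python
def parse_simple_table_body(border: str, lines: list[str], tab_width: int = 8) -> list[list[str]]:
    """Parse simple-table body lines, merging continuation lines into the previous row.

    Illustrative sketch only (hypothetical name), not the docutils implementation.
    """
    # Column spans are the runs of '=' in the border line.
    spans = []
    start = None
    for i, ch in enumerate(border + " "):
        if ch == "=" and start is None:
            start = i
        elif ch != "=" and start is not None:
            spans.append((start, i))
            start = None

    rows: list[list[str]] = []
    for line in lines:
        expanded = line.expandtabs(tab_width)
        cells = []
        for n, (s, e) in enumerate(spans):
            # The last column is allowed to run to the end of the line.
            end = None if n == len(spans) - 1 else e
            cells.append(expanded[s:end].strip())

        if rows and cells[0] == "":
            # Empty first column: append the text to the previous row's cells.
            rows[-1] = [" ".join(part for part in (old, new) if part)
                        for old, new in zip(rows[-1], cells)]
        else:
            rows.append(cells)
    return rows


border = "========= ============="
body = ["1\t  abcdef", "\t  abcdef"]  # assumed whitespace of example 4

print(parse_simple_table_body(border, body))
```

Under these assumptions the sketch yields a single row with "1" in column 1 and "abcdef abcdef" in column 2, matching 4a). For comparison, the "pandoc" manual documents a `--tab-stop` option with a default of 4, so rerunning the test cases with `--tab-stop=8` might be worth a try; I have not verified that this changes the result.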
Sorry for the somewhat incomplete report.