-
Couldn't load subscription status.
- Fork 160
[Formatter] Recognise MarkDown headings, Lists, Comment Snippets & Tables in formatter #4418
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
|
Hi @mateusz-matela, please check this once you are available |
|
I don't think a setting for this is necessary. It made more sense for html tags, as one can put spaces and newlines inside/around them in many different ways that make sense, so one may not want to use the formatter's standard. But for markdown elements there's pretty much only one sensible notation, wouldn't you agree? Note that markdown also supports tables which require a similar approach: https://docs.oracle.com/en/java/javase/23/javadoc/using-markdown-documentation-comments.html#GUID-7269D6B1-BAA2-4260-A295-EBC3DDA3E69C After a glance at the code I can say that the new code would better fit in a separate |
Then I guess I'll remove the UI and option changes 👍 |
maybe this point is actually even stronger - if the setting is turned off, the formatter would mangle the headers and lists into one line, resulting in completely different javadoc content? Nobody would want that! As for tables - I guess they are much more complicated due to columns alignment - maybe that could be a separate issue (with an on/off setting this time?), but for now we'd need to at least detect them to make sure they are not touched. |
Yes, and different types of comments too https://docs.oracle.com/en/java/javase/23/javadoc/using-markdown-documentation-comments.html#GUID-7E6D6A81-1176-4CF2-85EE-97A86ACDA351 this I will do in another PR. I had hard time implementing this |
Agreed :D
Lists where complicated too due to indent style, but was manageable. For table pattern recognition do you have any suggestions ? |
f92dfce to
bf1e64b
Compare
org.eclipse.jdt.core/formatter/org/eclipse/jdt/internal/formatter/CommentsPreparator.java
Fixed
Show fixed
Hide fixed
org.eclipse.jdt.core/formatter/org/eclipse/jdt/internal/formatter/CommentsPreparator.java
Fixed
Show fixed
Hide fixed
ce5d986 to
b628847
Compare
b628847 to
eb48ce3
Compare
84dda38 to
86b0e4f
Compare
|
Hi @mateusz-matela SnippetsSnippetMarkdown.mp4Tablesfor tables currently the formatter recognise the pattern, however it still have some issues in processing column arrangement but it wont break the table structure now. TableMarkdown.mp4Could you please check the new changes ? |
|
Unrelated test failure |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for tables currently the formatter recognise the pattern, however it still have some issues in processing column arrangement but it wont break the table structure now.
I think full support would mean parsing the whole table, reformatting contents of each cell and determining width of each column as max width of column's cells in order to align everything. This would require a setting as someone my want to keep their unconventional alignment for some reason. I'm not sure it makes sense to implement a partial solution for now, it would be enough to leave tables untouched (as if the new setting is off)
....jdt.core.tests.model/src/org/eclipse/jdt/core/tests/formatter/FormatterRegressionTests.java
Outdated
Show resolved
Hide resolved
....jdt.core.tests.model/src/org/eclipse/jdt/core/tests/formatter/FormatterRegressionTests.java
Show resolved
Hide resolved
org.eclipse.jdt.core/formatter/org/eclipse/jdt/internal/formatter/CommentsPreparator.java
Show resolved
Hide resolved
org.eclipse.jdt.core/formatter/org/eclipse/jdt/internal/formatter/TokenManager.java
Show resolved
Hide resolved
|
|
||
| if (!markerCharFound) { | ||
| this.buffer.append(isMarkdown ? "/// " : " * "); //$NON-NLS-1$ //$NON-NLS-2$ | ||
| this.buffer.append(isMarkdown ? "/// " : token.isSnippetForMarkdown() ? "/// " : " * "); //$NON-NLS-1$ //$NON-NLS-2$ //$NON-NLS-3$ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this looks suspicious - isSnippetForMarkdown() can only be true if isMarkdown is also true, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While debugging I found markdown tokens sent through CommentsPreparator.formatCode(int, int, boolean) for snippet formatting are somehow converted to normal javaDoc types
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it would make more sense to fix these tokens creation so they have the matching type
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hi @mateusz-matela
On further debugging TextEditsBuilder.bufferLineSeparator(Token, boolean) with snippets of both javaDoc <pre> and markdown ``` , I found in both only snippet start is of type Javadoc & Markdown rest all the actual code snippets are of TokenNameclass TokenNamepublic .. TokenNameRBRACE etc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, so maybe isMarkdown can be based on parent rather than token? Then it would be the same for all tokens in the same parent and isSnippetForMarkdown would not be needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried this approach but it made FormatterCommentsBugsTest.testBug236230d() , FormatterCommentsBugsTest.testBug236230b() & FormatterCommentsBugsTest.testBug236230c() tests failed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made this change:
Token parentToken = this.parent.tm.get(this.parentTokenIndex);
boolean isTextBlock = parentToken.tokenType == TokenNameTextBlock;
boolean isMarkdown = parentToken.tokenType == TokenNameCOMMENT_MARKDOWN;
and removed calls to token.isSnippetForMarkdown() and all the formatter tests passed.
Is this how you tried?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried a different way, this way worked 👍
....jdt.core.tests.model/src/org/eclipse/jdt/core/tests/formatter/FormatterRegressionTests.java
Outdated
Show resolved
Hide resolved
| int tokenIndexLast = this.ctm.findIndex(endPos, ANY, true); | ||
| if (this.ctm.size() - 1 != tokenIndexLast) { | ||
| Token closingToken = this.ctm.get(tokenIndexLast); | ||
| closingToken.putLineBreaksBefore(2); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this forces empty line before every closing tick? I don't think it should.
Why is it not visible in tests? maybe code formatting overwrites it. There should be a test making sure that snippets that are not java code are not reformatted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually this was used to create a blank line after a closing a snippet but closingToken.putLineBreaksBefore(2); was not need, I guess I might've missed to remove it.
There should be a test making sure that snippets that are not java code are not reformatted.
This breaks, need to handle this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There should be a test making sure that snippets that are not java code are not reformatted.
handled 👍
org.eclipse.jdt.core/formatter/org/eclipse/jdt/internal/formatter/CommentsPreparator.java
Outdated
Show resolved
Hide resolved
org.eclipse.jdt.core/formatter/org/eclipse/jdt/internal/formatter/CommentsPreparator.java
Outdated
Show resolved
Hide resolved
org.eclipse.jdt.core/formatter/org/eclipse/jdt/internal/formatter/CommentsPreparator.java
Outdated
Show resolved
Hide resolved
085f2f3 to
76a755f
Compare
733a819 to
961f133
Compare
| setComplianceLevel(CompilerOptions.VERSION_23); | ||
| String input = """ | ||
| /// Markdown Snippet | ||
| /// ``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what if the snippet is marked as another language, say python, will the formatter ignore it (as it should)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it does not with using CommentsPreparator.formatCode(int, int, boolean)
Sounds like an another issue
61a1718 to
1c2f389
Compare
I have reverted custom formatting to keep current one with
This can be done as a separate issue after this pr 👍 |
|
Hi @mateusz-matela, could u please re-review this once you're available ? |
aab4a64 to
d6611e8
Compare
| public void testMarkdownMultiSnippetCommentsWithoutCode() throws JavaModelException { | ||
| setComplianceLevel(CompilerOptions.VERSION_23); | ||
| String input = """ | ||
| /// `` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://docs.oracle.com/en/java/javase/23/javadoc/using-markdown-documentation-comments.html links to https://spec.commonmark.org/0.31.2/#fenced-code-block which says there must be at least 3 ticks to consider it a code snippet. Oh, and it's also allowed to used tildes instead of ticks - another case to handle!
The javadoc popup in Eclipse is inconsistent with 2 ticks - it shows this as a block with monospace font, but doesn't preserve linebreaks, so I'm not sure what the formatter should do - theoretically it should treat it as normal text and join everything
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
....jdt.core.tests.model/src/org/eclipse/jdt/core/tests/formatter/FormatterRegressionTests.java
Outdated
Show resolved
Hide resolved
org.eclipse.jdt.core/formatter/org/eclipse/jdt/internal/formatter/CommentsPreparator.java
Outdated
Show resolved
Hide resolved
|
|
||
| if (!markerCharFound) { | ||
| this.buffer.append(isMarkdown ? "/// " : " * "); //$NON-NLS-1$ //$NON-NLS-2$ | ||
| this.buffer.append(isMarkdown ? "/// " : token.isSnippetForMarkdown() ? "/// " : " * "); //$NON-NLS-1$ //$NON-NLS-2$ //$NON-NLS-3$ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, so maybe isMarkdown can be based on parent rather than token? Then it would be the same for all tokens in the same parent and isSnippetForMarkdown would not be needed.
| if (this.ctm.size() - 1 != tokenIndexLast) { | ||
| closingToken.putLineBreaksAfter(2); | ||
| } | ||
| if (this.options.comment_format_source) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if this option is off, the also disableFormattingExclusively.
And there's also this optional language marker right after ticks - is this available here? If it's something different than java then we should disable formatting even if the code happens to correctly parse as java.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Handled language marker ( ```java ) and disableFormattingExclusively in else part 👍
org.eclipse.jdt.core/formatter/org/eclipse/jdt/internal/formatter/CommentsPreparator.java
Show resolved
Hide resolved
org.eclipse.jdt.core/formatter/org/eclipse/jdt/internal/formatter/CommentsPreparator.java
Outdated
Show resolved
Hide resolved
org.eclipse.jdt.core/formatter/org/eclipse/jdt/internal/formatter/CommentsPreparator.java
Outdated
Show resolved
Hide resolved
| int i = matcher.start(); | ||
| while (text.charAt(i) != '/') { | ||
| if (text.charAt(i) == '\t') { | ||
| currentIndent += 2; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will this always be 2? Shouldn't it depend on tab width and how spaces align with tab-columns?
maybe TokenManager.getLength(int, int, int) will help here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In your example the result in javadoc popup would be the same if you replaced the tab with either 2 or 4 spaces, so it's not conclusive.
Here's an example where the tab is definitely 4 spaces wide thus item C is on a deeper level than item B. If you replaced the tab with 2 spaces, item C would be on the same level as item B.

So to get the precise length of whitespace I this should work (how about renaming tokenIndex and listToken to bullet*?):
int bulletPosInLine = this.ctm.findSourcePositionInLine(bulletToken.originalStart);
int slashPosInLine = this.ctm.findSourcePositionInLine(this.ctm.get(bulletIndex - 1).originalEnd);
int bulletIndent = bulletPosInLine - slashPosInLine - 1;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried this, but its breaking many other cases..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh right, I thought that all beginning slashes are included in the comment token structure and bulletIndex - 1 would point to them. But only the first /// is included, which looks like a mistake, it would be more consistent to not include any. It's a leftover from classical javadoc tokenization where it was necessary to handle line breaks after /**, but for markdown the first token should be skipped (it probably will affect some other places in code, but should make them simpler).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Anyway, the fix for slash position: int slashPosInLine = this.ctm.findSourcePositionInLine(text.lastIndexOf('/', matcher.start()) + node.getStartPosition());
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh right, I thought that all beginning slashes are included in the comment token structure and
bulletIndex - 1would point to them. But only the first///is included, which looks like a mistake, it would be more consistent to not include any. It's a leftover from classical javadoc tokenization where it was necessary to handle line breaks after/**, but for markdown the first token should be skipped (it probably will affect some other places in code, but should make them simpler).
Tried this but its affecting entire markdown now. looks like a major architecture change
but regardless of first '///' token is not added in the comment structure in CommentsPreparator.tokenizeMultilineComment(Token), the pattern and whitespace is calculated based on the String text = this.tm.toString(node);
here the text will always skip first /// giving
- Item 1
/// - Item 1
/// - Item 1
Anyway, the fix for slash position:
Tried this, but still its giving invalid results
Speaking of tokenization of markdown comment, I also noticed that it will skip all the beginning slashes,
fixed
| String text = this.tm.toString(node); | ||
| Matcher matcher = MARKDOWN_LIST_PATTERN.matcher(text); // Check for MarkDown lists [Ordered & Unordered]) | ||
| int previousLevel = 0; | ||
| Map<Integer, Token> tokenIndents = new HashMap<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I finally get the idea how this works, though it's quite complicated. I found an example where it doesn't work:
/// - Item A
/// - Item B
/// - Item C
/// - Item D
Here item D in javadoc popup is at the same level as item B, but formatter puts it at the same indent as item A.
mybe it's just my personal brain patterns, but I think a simpler and more robust approach would be to have ArrayDeque<Integer> indentPerLevel instead of this map. For each level of the list it contains srcIndent of the first item on that level. Then with next item you just compare currentIndent with the end of the deque - if current srcIndent is greater, you add next level. Otherwise you remove from the deque until you find the same or smaller indent and that is the level of current item. So current level is equal to indentPerLevel.size() and the indent to set on listToken is 2 * indentPerLevel.size().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mateusz-matela, thanks for the new algo.
it fixes all the indent issue, however I noticed indent doest get applied on 1st markdown token, which is not related to current fix, but need to be fixed later. (Currently added test cases with some heading followed by lists)
I guess other number related issues can be related to indents too
https://www.markdownguide.org/basic-syntax/#adding-elements-in-lists
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, so it looks like if currentIndent is too big compared to the last level then we should treat it as not a separate item and probably not preserve the line break.
I wonder if we have to reverse engineer all these rules or are there written down somewhere? And are they even strictly defined for all edge cases, or could some of them produce inconsistent results depending on the javadoc implementation?
org.eclipse.jdt.core/formatter/org/eclipse/jdt/internal/formatter/CommentsPreparator.java
Outdated
Show resolved
Hide resolved
org.eclipse.jdt.core/formatter/org/eclipse/jdt/internal/formatter/CommentsPreparator.java
Outdated
Show resolved
Hide resolved
org.eclipse.jdt.core/formatter/org/eclipse/jdt/internal/formatter/CommentsPreparator.java
Show resolved
Hide resolved
org.eclipse.jdt.core/formatter/org/eclipse/jdt/internal/formatter/CommentsPreparator.java
Outdated
Show resolved
Hide resolved
org.eclipse.jdt.core/formatter/org/eclipse/jdt/internal/formatter/CommentsPreparator.java
Outdated
Show resolved
Hide resolved
org.eclipse.jdt.core/formatter/org/eclipse/jdt/internal/formatter/CommentsPreparator.java
Outdated
Show resolved
Hide resolved
79c1cc1 to
a9bfc3c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and remember to reformat edited methods, I saw a few missing spaces and empty lines
....jdt.core.tests.model/src/org/eclipse/jdt/core/tests/formatter/FormatterRegressionTests.java
Outdated
Show resolved
Hide resolved
org.eclipse.jdt.core/formatter/org/eclipse/jdt/internal/formatter/CommentsPreparator.java
Outdated
Show resolved
Hide resolved
113c905 to
e8868cf
Compare
|
Reformatted and squashed 👍 |
|
It's weird, when I checkout commit e8868cf and run the formatter tests, there are 3 failures and 2 errors. But all github checks have passed. Do the tests also pass for you? |
This commits adds the support to recognize markdown Headings, Lists, Fences & Tables for formatter fix : eclipse-jdt#4337
|
e8868cf to
ec618b4
Compare
ec618b4 to
5221deb
Compare







This commits adds the support to recognize markdown headings, lists, comment fences & tables for formatter and provides constants for their enablement
fixes :
#4337
Lists
Lists.mp4
Headings
Headings.mp4
What it does
How to test
Author checklist