Skip to content

Conversation

mazer-ai
Copy link

@mazer-ai mazer-ai commented Jan 23, 2025

Problem: The numpy format parser from docstring_parser doesn't remove the default value information from type_name, resulting in a mismatch between the name in the function signature and the docstring for any parameters specifying a default value in the docstring.

The numpy style doc is vague about "correct" way to specify default values, this patch handles two common patterns:

Parameters
-----------
foo: Optional[int] = 10
    description of foo here

bar: Optional[int], default is 10
    descriptiopn of bar here

Both of which appear in numpy code and numpy-related projects.

Note that this could also be handled in docstring_parser, but since current pydoclint depends on.a forked version of the package, I opted to handle here it here, where it seemed simpler.

Also, this could be done using a regex to search for and exclude the default info, but poking around, the general consensus seems to be that use of in or .find() for short, static strings is faster than using re.match for
something like this.

Finally, this problem doesn't occur for ReST formatted docstrings, and the specification for default values in google style docstrings is even more poorly defined than for numpy, but from the examples I found on-line, the current parser works fine. So this seems to really just be a numpy issue.

@jsh9
Copy link
Owner

jsh9 commented Jan 26, 2025

Thanks! Let me take a look.

@mazer-ai
Copy link
Author

Shoot -- are those CI errors from my changes? The tests run clean on my local python 3.12 install (macOS ventura).
I'm having a bit of trouble parsing the errors -- seems to have something to do with a \_ escape sequence in test_args.py, but can't figure out how anything I touched would change that.

# bar: int = 10 # noqa: E800
for k, metadata in enumerate(self.parsed.meta):
if metadata.args[0] == 'param':
# use of `in` can be replaced with a pre-compiled `re`, but
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @mazer-ai , could you double check your comment here? Because I don't see the use of in here. Thanks!

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry -- originally used in and then replaced with .find so I could get the substring position in one go.. Both seem faster than trying to use an regex here.. Will update comments.

@jsh9
Copy link
Owner

jsh9 commented Jan 29, 2025

Hi @mazer-ai , could you add some test cases to check your code changes in this PR? Thank you!

# supports a couple different specs:
# Parameters
# ----------
# foo: int, default 10
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we support all the 3 styles mentioned here?

image

And could you also add a note in the documentation (at least in docs/notes_for_users.md, and preferably also in Section 2.7 of README)?

Thanks!

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are all supported. I added some examples to the new test to confirm.

@jsh9
Copy link
Owner

jsh9 commented Jan 29, 2025

Shoot -- are those CI errors from my changes? The tests run clean on my local python 3.12 install (macOS ventura). I'm having a bit of trouble parsing the errors -- seems to have something to do with a \_ escape sequence in test_args.py, but can't figure out how anything I touched would change that.

Hi @mazer-ai , after you make changes, you can run pre-commit run --all-files to auto-format the code, and also run tox to check the CI pipeline locally. (You'd need to install pre-commit and tox in your development environment.)

@mazer-ai
Copy link
Author

mazer-ai commented Jan 30, 2025

Shoot -- are those CI errors from my changes? The tests run clean on my local python 3.12 install (macOS ventura). I'm having a bit of trouble parsing the errors -- seems to have something to do with a \_ escape sequence in test_args.py, but can't figure out how anything I touched would change that.

Hi @mazer-ai , after you make changes, you can run pre-commit run --all-files to auto-format the code, and also run tox to check the CI pipeline locally. (You'd need to install pre-commit and tox in your development environment.)

@jsh9 When I run tox locally, I get those same errors from test_args.py (sorry, first time I've used tox -- used to just running pytest by hand):

tests/utils/test_arg.py
<unknown>:19: SyntaxWarning: invalid escape sequence '\_'
<unknown>:208: SyntaxWarning: invalid escape sequence '\_'
<unknown>:209: SyntaxWarning: invalid escape sequence '\_'
<unknown>:322: SyntaxWarning: invalid escape sequence '\_'
<unknown>:323: SyntaxWarning: invalid escape sequence '\_'

Not sure why these escape sequences are present in test_args.py, but they seem to be illegal escape sequences:

mazer@bridger $ python
Python 3.12.4 (main, Jun  6 2024, 18:26:44) [Clang 15.0.0 (clang-1500.1.0.2.5)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> 'arg1\_\_'
<stdin>:1: SyntaxWarning: invalid escape sequence '\_'
'arg1\\_\\_'
>>>

Any idea if I'm missing something obvious here? Or is this something strange about my dev env?

Sorry about turning something simple into something complicated...

@jsh9
Copy link
Owner

jsh9 commented Jan 30, 2025

Shoot -- are those CI errors from my changes? The tests run clean on my local python 3.12 install (macOS ventura). I'm having a bit of trouble parsing the errors -- seems to have something to do with a \_ escape sequence in test_args.py, but can't figure out how anything I touched would change that.

Hi @mazer-ai , after you make changes, you can run pre-commit run --all-files to auto-format the code, and also run tox to check the CI pipeline locally. (You'd need to install pre-commit and tox in your development environment.)

@jsh9 When I run tox locally, I get those same errors from test_args.py (sorry, first time I've used tox -- used to just running pytest by hand):

tests/utils/test_arg.py
<unknown>:19: SyntaxWarning: invalid escape sequence '\_'
<unknown>:208: SyntaxWarning: invalid escape sequence '\_'
<unknown>:209: SyntaxWarning: invalid escape sequence '\_'
<unknown>:322: SyntaxWarning: invalid escape sequence '\_'
<unknown>:323: SyntaxWarning: invalid escape sequence '\_'

Not sure why these escape sequences are present in test_args.py, but they seem to be illegal escape sequences:

mazer@bridger $ python
Python 3.12.4 (main, Jun  6 2024, 18:26:44) [Clang 15.0.0 (clang-1500.1.0.2.5)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> 'arg1\_\_'
<stdin>:1: SyntaxWarning: invalid escape sequence '\_'
'arg1\\_\\_'
>>>

Any idea if I'm missing something obvious here? Or is this something strange about my dev env?

Sorry about turning something simple into something complicated...

Hi, don't worry about those warnings. Those existed before your PR.

@mazer-ai
Copy link
Author

@jsh9 I think this should address your comments. Let me know if you see anything else.

@mazer-ai mazer-ai requested a review from jsh9 January 30, 2025 22:15
```python
Parameters
----------
arg1 : int, optional
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @mazer-ai , I think we probably need more work to specify the behavior here.

Because I think int, optional is a valid way in the docstring for arg1: int | None = None in the function signature, so we should add the logic to check this in the code.

````

The portion following the type hints are ignored and not checked for
congruence with the function signature.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think if we want to allow annotating default values in the docstring, we need to check the congruence (between default value in the docstring and the default value in the function signature). Otherwise the behavior is surprising and implicit (i.e., we selectively check something, but not others).

Could you think of a way to also check the congruence of the default values as well?

@jsh9
Copy link
Owner

jsh9 commented Feb 16, 2025

Reminder:

We need to update this section in README:

pydoclint/README.md

Lines 187 to 193 in 2e8af22

#### 2.7.3. Pitfall: default values of arguments
_pydoclint_ does not like adding default values of arguments in the docstring,
even if this style is allowed in the numpy docstring style guide.
For more rationale, please read
[this page](https://jsh9.github.io/pydoclint/notes_for_users.html#3-notes-on-writing-type-hints).

@jsh9
Copy link
Owner

jsh9 commented Sep 20, 2025

Hi @mazer-ai , in some earlier code changes (5b850f1, and #258), I added this feature for both numpy style and google style. Newer versions (such as 0.7.3) supports this feature.

Users need to set --check-arg-defaults to True, and currently, only one style of specifying defaults is supported:

my_arg : int, default=0

Styles like "default is 0", "default = 0", "default: 0", etc. are not recognized. I think this restriction is a sensible trade-off to ensure uniform style within the same project.

Therefore, I'm closing this PR. Thank you for your efforts!

If you have other comments/suggestions, please feel free to open a new issue.

@jsh9 jsh9 closed this Sep 20, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants