gh-82045: Correct and deduplicate "isprintable" docs; add test.#130118
Merged
Conversation
We had the definition of what makes a character "printable" documented in three places, giving two different definitions. The definition in the comment on `_PyUnicode_IsPrintable` was inverted; correct that. With that correction, the two definitions turn out to be equivalent -- but to confirm that, you have to go look up, or happen to know, that those are the only five "Other" categories and only three "Separator" categories in the Unicode character database. That makes it hard for the reader to tell whether they really are the same, or if there's some subtle difference in the intended semantics. Fix that by cutting the C API docs' and the C comment's copies of the subtle details, in favor of referring to the Python-level docs. That ensures it's explicit that these are all meant to agree, and also lets us concentrate improvements to the wording in one place. Speaking of which, borrow some ideas from the C comment, along with other tweaks, to hopefully add a bit more clarity to that one newly-centralized copy in the docs. Also add a thorough test that the implementation agrees with this definition. Author: Greg Price <gnprice@gmail.com> Date: Tue Jul 30 22:55:00 2019 -0700 Co-authored-by: Stan Ulbrych <stanulbrych@gmail.com>
|
Thanks @StanFromIreland for the PR, and @encukou for merging it 🌮🎉.. I'm working now to backport this PR to: 3.12. |
|
Thanks @StanFromIreland for the PR, and @encukou for merging it 🌮🎉.. I'm working now to backport this PR to: 3.13. |
|
Sorry, @StanFromIreland and @encukou, I could not cleanly backport this to |
|
Sorry, @StanFromIreland and @encukou, I could not cleanly backport this to |
StanFromIreland
added a commit
to StanFromIreland/cpython
that referenced
this pull request
Feb 14, 2025
…pythonGH-130118) We had the definition of what makes a character "printable" documented in three places, giving two different definitions. The definition in the comment on `_PyUnicode_IsPrintable` was inverted; correct that. With that correction, the two definitions turn out to be equivalent -- but to confirm that, you have to go look up, or happen to know, that those are the only five "Other" categories and only three "Separator" categories in the Unicode character database. That makes it hard for the reader to tell whether they really are the same, or if there's some subtle difference in the intended semantics. Fix that by cutting the C API docs' and the C comment's copies of the subtle details, in favor of referring to the Python-level docs. That ensures it's explicit that these are all meant to agree, and also lets us concentrate improvements to the wording in one place. Speaking of which, borrow some ideas from the C comment, along with other tweaks, to hopefully add a bit more clarity to that one newly-centralized copy in the docs. Also add a thorough test that the implementation agrees with this definition. Author: Stan Ulbrych <stanulbrych@gmail.com> Co-authored-by: Greg Price <gnprice@gmail.com> (cherry picked from commit 3402e13)
StanFromIreland
added a commit
to StanFromIreland/cpython
that referenced
this pull request
Feb 14, 2025
…pythonGH-130118) We had the definition of what makes a character "printable" documented in three places, giving two different definitions. The definition in the comment on `_PyUnicode_IsPrintable` was inverted; correct that. With that correction, the two definitions turn out to be equivalent -- but to confirm that, you have to go look up, or happen to know, that those are the only five "Other" categories and only three "Separator" categories in the Unicode character database. That makes it hard for the reader to tell whether they really are the same, or if there's some subtle difference in the intended semantics. Fix that by cutting the C API docs' and the C comment's copies of the subtle details, in favor of referring to the Python-level docs. That ensures it's explicit that these are all meant to agree, and also lets us concentrate improvements to the wording in one place. Speaking of which, borrow some ideas from the C comment, along with other tweaks, to hopefully add a bit more clarity to that one newly-centralized copy in the docs. Also add a thorough test that the implementation agrees with this definition. Author: Greg Price <gnprice@gmail.com> Co-authored-by: Greg Price <gnprice@gmail.com> (cherry picked from commit 3402e13)
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Finishing this up for Petr.
We had the definition of what makes a character "printable" documented in three places, giving two different definitions.
The definition in the comment on
_PyUnicode_IsPrintablewas inverted; correct that.With that correction, the two definitions turn out to be equivalent -- but to confirm that, you have to go look up, or happen to know, that those are the only five "Other" categories and only three "Separator" categories in the Unicode character database. That makes it hard for the reader to tell whether they really are the same, or if there's some subtle difference in the intended semantics.
Fix that by cutting the C API docs' and the C comment's copies of the subtle details, in favor of referring to the Python-level docs. That ensures it's explicit that these are all meant to agree, and also lets us concentrate improvements to the wording in one place.
Speaking of which, borrow some ideas from the C comment, along with other tweaks, to hopefully add a bit more clarity to that one newly-centralized copy in the docs.
Also add a thorough test that the implementation agrees with this definition.
Author: Greg Price gnprice@gmail.com
Date: Tue Jul 30 22:55:00 2019 -0700
📚 Documentation preview 📚: https://cpython-previews--130118.org.readthedocs.build/