gh-82045: Correct and deduplicate "isprintable" docs; add test. #130118

Merged
encukou merged 2 commits into python:main from StanFromIreland:is-printable-finish on Feb 14, 2025

Conversation

@StanFromIreland StanFromIreland commented Feb 14, 2025

Member

Finishing this up for Petr.


We had the definition of what makes a character "printable" documented in three places, giving two different definitions.

The definition in the comment on _PyUnicode_IsPrintable was inverted; correct that.

With that correction, the two definitions turn out to be equivalent -- but to confirm that, you have to go look up, or happen to know, that those are the only five "Other" categories and only three "Separator" categories in the Unicode character database. That makes it hard for the reader to tell whether they really are the same, or if there's some subtle difference in the intended semantics.
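The category counts mentioned above can be checked directly rather than looked up. The following is a minimal sketch, not part of this PR: `categories_in_group` is a hypothetical helper name, and the result reflects whichever Unicode database version ships with the running interpreter's `unicodedata` module.

```python
import sys
import unicodedata


def categories_in_group(prefix: str) -> set:
    """Collect every general category starting with `prefix`
    that is assigned to at least one code point."""
    found = set()
    for codepoint in range(sys.maxunicode + 1):
        category = unicodedata.category(chr(codepoint))
        if category.startswith(prefix):
            found.add(category)
    return found


# "Other" categories start with "C"; "Separator" categories start with "Z".
other_categories = categories_in_group("C")
separator_categories = categories_in_group("Z")

print(sorted(other_categories))
print(sorted(separator_categories))
```

If the two definitions are equivalent, this should list exactly five "Other" categories and three "Separator" categories.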

Fix that by cutting the C API docs' and the C comment's copies of the subtle details, in favor of referring to the Python-level docs. That ensures it's explicit that these are all meant to agree, and also lets us concentrate improvements to the wording in one place.

Speaking of which, borrow some ideas from the C comment, along with other tweaks, to hopefully add a bit more clarity to that one newly-centralized copy in the docs.

Also add a thorough test that the implementation agrees with this definition.
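As an illustration of what such a check can look like, here is a minimal sketch that compares `str.isprintable()` against the documented definition for every code point. It is not the test added by this PR; `is_printable_by_definition` and `find_mismatches` are hypothetical names chosen for this example, and the definition assumed is "not in an Other or Separator category, except for the ASCII space".

```python
import sys
import unicodedata


def is_printable_by_definition(char: str) -> bool:
    """Return True if `char` is printable per the assumed definition:
    not in a Unicode "Other" (C*) or "Separator" (Z*) category,
    with the ASCII space as the single exception."""
    if char == " ":
        return True
    return unicodedata.category(char)[0] not in ("C", "Z")


def find_mismatches() -> list:
    """List code points where str.isprintable() disagrees with the definition."""
    return [
        codepoint
        for codepoint in range(sys.maxunicode + 1)
        if chr(codepoint).isprintable()
        != is_printable_by_definition(chr(codepoint))
    ]


mismatches = find_mismatches()
print(f"{len(mismatches)} mismatching code points")
```

Iterating over the whole code point range takes a moment, but it leaves no room for a category that the prose definition forgot to mention.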

Author: Greg Price gnprice@gmail.com
Date: Tue Jul 30 22:55:00 2019 -0700


📚 Documentation preview 📚: https://cpython-previews--130118.org.readthedocs.build/

Co-authored-by: Stan Ulbrych <stanulbrych@gmail.com>
@encukou encukou changed the title from "gh-15300: Correct and deduplicate "isprintable" docs; add test." to "gh-82045: Correct and deduplicate "isprintable" docs; add test." on Feb 14, 2025
@encukou encukou merged commit 3402e13 into python:main Feb 14, 2025
@encukou encukou added the "needs backport to 3.12 only security fixes" and "needs backport to 3.13 bugs and security fixes" labels on Feb 14, 2025
@miss-islington-app

Thanks @StanFromIreland for the PR, and @encukou for merging it 🌮🎉.. I'm working now to backport this PR to: 3.12.
🐍🍒⛏🤖

@miss-islington-app

Thanks @StanFromIreland for the PR, and @encukou for merging it 🌮🎉.. I'm working now to backport this PR to: 3.13.
🐍🍒⛏🤖

@miss-islington-app

Sorry, @StanFromIreland and @encukou, I could not cleanly backport this to 3.12 due to a conflict.
Please backport using cherry_picker on command line.

cherry_picker 3402e133ef26736296c07992266a82b181a5d532 3.12

@miss-islington-app

Sorry, @StanFromIreland and @encukou, I could not cleanly backport this to 3.13 due to a conflict.
Please backport using cherry_picker on command line.

cherry_picker 3402e133ef26736296c07992266a82b181a5d532 3.13

@StanFromIreland StanFromIreland deleted the is-printable-finish branch February 14, 2025 17:23
StanFromIreland added a commit to StanFromIreland/cpython that referenced this pull request Feb 14, 2025
…pythonGH-130118)

Author: Stan Ulbrych <stanulbrych@gmail.com>

Co-authored-by: Greg Price <gnprice@gmail.com>
(cherry picked from commit 3402e13)
StanFromIreland added a commit to StanFromIreland/cpython that referenced this pull request Feb 14, 2025
…pythonGH-130118)

Author:    Greg Price <gnprice@gmail.com>

Co-authored-by: Greg Price <gnprice@gmail.com>
(cherry picked from commit 3402e13)
@terryjreedy terryjreedy removed the "needs backport to 3.12 only security fixes" and "needs backport to 3.13 bugs and security fixes" labels on Feb 18, 2025
4 participants