re: documentation claim that special characters lose their special meaning inside […] seems wrong

# Documentation

The claim at:
https://github.com/python/cpython/blob/d0c6ba956fca28785ad4dea6423cd44fd1124cad/Doc/library/re.rst?plain=1#L253-L255
seems wrong at least for `\`.

Consider the following example:
```
>>> import re
>>> bool(re.search(string=b"a\\b", pattern=b"[\\\n\r]"))
False
```

My expectation would be that after backslash-unescaping the `b"…"`-string, `pattern` is assigned the sequence of:<br>
literal `\`, the line-feed "character", the carriage-return "character"

If it were true that "Special characters lose their special meaning inside sets.", then the resolved `\` in the unescaped `pattern` should match the one in my test string `b"a\\b"`; however, it does not.

I guess what Python actually "sees" is:<br>
backslash-escaped line-feed "character", the carriage-return "character"<br>
which probably effectively yields:<br>
the line-feed "character", the carriage-return "character"
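The guess above can be checked with a small sketch. It assumes a current CPython 3 `re` module; the helper `matches` and the variable names are only illustrative and not part of any API.

```python
import re

# Bytes the re module actually receives: '[', '\', LF, CR, ']'.
pattern = b"[\\\n\r]"


def matches(pat, data):
    """Return True if pat is found anywhere in data (hypothetical helper)."""
    return re.search(pat, data) is not None


# The subject b"a\\b" contains a single backslash byte.
backslash_hit = matches(pattern, b"a\\b")
linefeed_hit = matches(pattern, b"a\nb")
carriage_return_hit = matches(pattern, b"a\rb")

# Doubling the backslash at the regex level: '[', '\', '\', LF, CR, ']'.
escaped_pattern = b"[\\\\\n\r]"
escaped_hit = matches(escaped_pattern, b"a\\b")

print(backslash_hit, linefeed_hit, carriage_return_hit, escaped_hit)
```

If the set only matches the line feed and the carriage return, and the backslash is matched only once it is doubled, that would be consistent with `\` still acting as an escape character inside `[…]`.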

Now you could argue that the `\` is not considered a special character in terms of the regular expression syntax... but it is, if only because of:
https://github.com/python/cpython/blob/d0c6ba956fca28785ad4dea6423cd44fd1124cad/Doc/library/re.rst?plain=1#L504-L507
and the lines that follow.

Also, even the section that explains `[…]` mentions the escaping function of `\`:
https://github.com/python/cpython/blob/d0c6ba956fca28785ad4dea6423cd44fd1124cad/Doc/library/re.rst?plain=1#L249-L250


I think:
https://github.com/python/cpython/blob/d0c6ba956fca28785ad4dea6423cd44fd1124cad/Doc/library/re.rst?plain=1#L253-L255
should be improved to document that:
- `\` is exempt from this
- whether or not this is only the case for characters that are actually special with respect to the RE bracket expression, i.e. `[0\-9]` is `0`, `-` and `9`, because the `-` was special in that position. But what about `[\-9]`? Here, the `-` would not have been special, so is the result `\`, `-` and `9` or just `-` and `9`?
- or whether this is simply the case for any character following the `\` ... ones that are special **outside** an RE bracket expression, like `\$`, `\D`, `\w` or `\number`... and/or ones that are never special, like `\ü`.
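The open questions in this list can be probed empirically. The following is a minimal sketch, assuming a current CPython 3 `re` module; `set_members` and the candidate list are hypothetical names chosen for illustration, and the result only shows observed behaviour, not what the documentation guarantees.

```python
import re


def set_members(pattern, candidates):
    """Return the candidates that a one-character set pattern fully matches."""
    compiled = re.compile(pattern)
    return [ch for ch in candidates if compiled.fullmatch(ch)]


# Single characters to test each set against, including a lone backslash.
candidates = ["0", "5", "9", "-", "\\", "$", "a", "_"]

escaped_hyphen_mid = set_members(r"[0\-9]", candidates)
escaped_hyphen_first = set_members(r"[\-9]", candidates)
escaped_dollar = set_members(r"[\$]", candidates)
non_digit_class = set_members(r"[\D]", candidates)
word_class = set_members(r"[\w]", candidates)

for name, members in [
    (r"[0\-9]", escaped_hyphen_mid),
    (r"[\-9]", escaped_hyphen_first),
    (r"[\$]", escaped_dollar),
    (r"[\D]", non_digit_class),
    (r"[\w]", word_class),
]:
    print(name, members)
```

If the backslash never shows up as a member of `[0\-9]`, `[\-9]` or `[\$]`, and `[\D]` and `[\w]` behave like the classes they denote outside a set, then `\` appears to keep its escaping role for any following character, not just for ones that are special in that position.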

Thanks,
Chris.


### Linked PRs
* gh-106517
* gh-132365

	The special sequences consist of ``'\'`` and a character from the list below.
	If the ordinary character is not an ASCII digit or an ASCII letter, then the
	resulting RE will match the second character. For example, ``\$`` matches the
	character ``'$'``.


	* Special characters lose their special meaning inside sets. For example,
	``[(+*)]`` will match any of the literal characters ``'('``, ``'+'``,
	``'*'``, or ``')'``.

	``[0-9A-Fa-f]`` will match any hexadecimal digit. If ``-`` is escaped (e.g.
	``[a\-z]``) or if it's placed as the first or last character
