Feature or enhancement
Implement the set operations of Unicode Technical Standard #18 RL1.3 in re character classes, together with nested sets.
gh-74534 added FutureWarnings in Python 3.7 for the ambiguous constructs (--, &&, ~~, ||, and a leading [) as preparation for this syntax; this issue turns them into operators.
Two character sets are combined by an operator, where an operand may be a nested set in brackets:
[A--B] — difference: a character in A but not in B.
[A&&B] — intersection: a character in both A and B.
[A||B] — union: a character in A or B.
[A~~B] — symmetric difference: a character in A or B but not both.
Operators have no precedence and apply left to right; nested sets are used to group. A leading ^ complements the whole result.
For example, [a-z--[aeiou]] matches an ASCII lowercase consonant and [\w&&[a-z]] matches an ASCII lowercase letter.
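The intended semantics can be emulated today with ordinary Python sets, which may help when reviewing the proposal. The following is only a sketch restricted to ASCII: `members` and `combine` are hypothetical helpers made up for this illustration, not part of `re`, and the code does not parse the proposed syntax. It only models how operands are combined strictly left to right.

```python
import re
import string

# Restrict the model to ASCII so each class can be enumerated (an assumption
# of this sketch; the real feature would work on all of Unicode).
ASCII = [chr(i) for i in range(128)]

# Hypothetical helper: the set of ASCII characters matched by an ordinary,
# already supported character class or class escape.
def members(char_class):
    pattern = re.compile(char_class)
    return {c for c in ASCII if pattern.fullmatch(c)}

# The four proposed operators mapped onto Python set methods.
OPERATORS = {
    "--": set.difference,
    "&&": set.intersection,
    "||": set.union,
    "~~": set.symmetric_difference,
}

# Hypothetical helper: apply (operator, operand) pairs left to right,
# with no precedence between operators.
def combine(first, *rest):
    result = set(first)
    for op, operand in rest:
        result = OPERATORS[op](result, operand)
    return result

# Models [a-z--[aeiou]]: ASCII lowercase consonants.
consonants = combine(members("[a-z]"), ("--", members("[aeiou]")))

# Models [\w&&[a-z]]: ASCII lowercase letters.
lower = combine(members(r"\w"), ("&&", members("[a-z]")))

# Models [a-c||d--a]: evaluated as ([a-c] || d) -- a, so "a" is removed.
left_to_right = combine(members("[a-c]"), ("||", {"d"}), ("--", {"a"}))

# Models a leading ^: complement of the whole result within ASCII.
not_consonants = set(ASCII) - consonants
```

In this model `left_to_right` is `{"b", "c", "d"}`. If `--` bound tighter than `||`, the result would still contain `"a"`, which is why nested sets are needed for grouping.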
Linked PRs