feat(921200): move regexp to regex-assembly#4409
Conversation
Signed-off-by: Felipe Zipitria <felipe.zipitria@owasp.org>
|
📊 Quantitative test results for language: |
There was a problem hiding this comment.
Pull request overview
This pull request moves the LDAP injection detection regex (rule 921200) from inline format to regex-assembly format to enhance readability and maintainability. The PR also adds comprehensive test coverage with descriptive test names and organizes tests into negative (false positive) and positive (should trigger) categories.
Changes:
- Created new regex-assembly file (921200.ra) with documented LDAP injection detection patterns
- Updated the rule configuration to use the newly assembled regex
- Enhanced test coverage from 10 to 30 test cases with descriptive names and better organization
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| regex-assembly/921200.ra | New regex-assembly file with LDAP injection patterns, character class definitions, and comprehensive documentation of attack patterns |
| rules/REQUEST-921-PROTOCOL-ATTACK.conf | Updated regex pattern for rule 921200 compiled from the regex-assembly file |
| tests/regression/tests/REQUEST-921-PROTOCOL-ATTACK/921200.yaml | Expanded from 10 to 30 test cases, improved test descriptions, reorganized with clear negative/positive test sections |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…r class definitions (#4411) * Initial plan * fix(921200): use separate character class definitions to preserve original regex semantics - Created ldap-prefix-chars definition (excludes : but NOT comma) - Created ldap-pattern-chars definition (excludes comma and = but NOT colon) - Updated prefix to use ldap-prefix-chars - Updated pattern 1 to use ldap-pattern-chars - Updated pattern 3 to use ldap-pattern-chars for first char class and ldap-prefix-chars for second char class - Generated regex is identical to original, preserving matching behavior - Validated with crs-toolchain regex compare and format Co-authored-by: fzipi <3012076+fzipi@users.noreply.github.com> * docs(921200): improve clarity of character class documentation - Reordered character class definitions to highlight differences (colon first vs comma first) - Clarified that ldap-prefix-chars excludes colon and special chars but NOT comma - Clarified that ldap-pattern-chars excludes comma and equals but NOT colon Co-authored-by: fzipi <3012076+fzipi@users.noreply.github.com> --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: fzipi <3012076+fzipi@users.noreply.github.com>
franbuehler
left a comment
There was a problem hiding this comment.
I found a small difference between the old and new regex. See comment.
But all attacks are detected with both regexes.
Tests look good too.
| # * https://github.com/SpiderLabs/owasp-modsecurity-crs/issues/276#issue-126581660 | ||
|
|
||
| SecRule REQUEST_COOKIES|REQUEST_COOKIES_NAMES|ARGS_NAMES|ARGS|XML:/* "@rx ^[^:\(\)\&\|\!<>\~]*\)\s*(?:\((?:[^,\(\)\=\&\|\!<>\~]+[><~]?=|\s*[&!|]\s*(?:\)|\()?\s*)|\)\s*\(\s*[\&\|\!]\s*|[&!|]\s*\([^\(\)\=\&\|\!<>\~]+[><~]?=[^:\(\)\&\|\!<>\~]*)" \ | ||
| SecRule REQUEST_COOKIES|REQUEST_COOKIES_NAMES|ARGS_NAMES|ARGS|XML:/* "@rx ^[^!&\(\):<>\|~]*\)[\s\x0b]*(?:\((?:[^!&\(\),<->\|~]+[<>~]?=|[\s\x0b]*[!&\|][\s\x0b]*[\(\)]?[\s\x0b]*)|\)[\s\x0b]*\([\s\x0b]*[!&\|][\s\x0b]*|[!&\|][\s\x0b]*\([^!&\(\),<->\|~]+[<>~]?=[^!&\(\):<>\|~]*)" \ |
There was a problem hiding this comment.
There is a small difference in the second NOT character class:
Old version: [^,\(\)\=\&\|\!<>\~]
New version: [^!&\(\),<->\|~]. -> Difference is the new - instead of the old =.
I don't know if this is on purpose or not.
Besides of that, the regex is identical.
I'll now have a look at the tests, the research papers and if the attacks are detected.
There was a problem hiding this comment.
I've just seen, this is also the same as <-> means <,> and =. So this looks good to me.
franbuehler
left a comment
There was a problem hiding this comment.
This looks good to me.
what
why
refs