fix(942410): cleaning of duplicates with 942151 #4336

Merged
fzipi merged 35 commits into coreruleset:main from touchweb-vincent:patch-10
Jan 27, 2026

Conversation

@touchweb-vincent
Contributor

@touchweb-vincent touchweb-vincent commented Nov 13, 2025

Hello,

This PR initially started as a simple typo fix, but then @fzipi pointed out a duplication that didn’t make much sense either.

As a result, the sequences in regex-assembly/942410.ra that also appear in regex-assembly/include/sql-injection-function-names.ra but not in regex-assembly/exclude/sql-injection-function-names-fps-pl1.ra (that is, sequences already covered by 942151) have been removed.

The unit tests for rule 942410 that were no longer relevant have been moved to the unit tests for 942151.

Two sequences (likelihood and unlikely) that were present in regex-assembly/exclude/sql-injection-function-names-fps-pl1.ra but missing from regex-assembly/942410.ra have been added back to the set.

Comments have also been added at the top of the unit test files to explain how to handle false positives across the different rules.

@github-actions
Contributor

github-actions Bot commented Nov 13, 2025

📊 Quantitative test results for language: eng, year: 2023, size: 10K, paranoia level: 1:
🚀 Quantitative testing did not detect new false positives

fzipi
fzipi previously approved these changes Nov 13, 2025
@fzipi
Member

fzipi commented Nov 13, 2025

I wonder why we don't use the same regexp on this one 🤔

@touchweb-vincent
Contributor Author

touchweb-vincent commented Nov 13, 2025

You’re right, there are duplicate sequences - they should probably be cleaned up. I’ll take a look at this as soon as possible.

This rule contains more high false-positive risk sequences (including “mod”) https://www.diffchecker.com/fr/1ZnlZDrj/

@touchweb-vincent
Contributor Author

I think there’s another typo in “cr32”; it should be “crc32” in MySQL. But maybe “cr32” exists in a database system I’m not thinking of (like Oracle or PostgreSQL).

@touchweb-vincent
Contributor Author

touchweb-vincent commented Nov 13, 2025

@fzipi I cleaned up the duplicates. If you want to check, this is the script I used:

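<?php

// Load each rule's sequence list, one sequence per line.
$data_942151 = explode("\n", file_get_contents('942151'));
$data_942410 = explode("\n", file_get_contents('942410'));

// Index the 942151 sequences by name for fast lookups.
$array_origin = [];
foreach ($data_942151 as $d) {
    $array_origin[trim($d)] = 1;
}

// Keep only the 942410 sequences that 942151 does not already contain.
$array_942410 = [];
foreach ($data_942410 as $d) {
    if (!isset($array_origin[trim($d)])) {
        $array_942410[] = trim($d);
    }
}
file_put_contents('942410_new', implode("\n", $array_942410));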

I fixed another typo, in last_insert_id, in #4337, which is why you will not find it in this PR.

@touchweb-vincent touchweb-vincent changed the title from "fix(942410): wrong function name" to "fix(942410): cleaning of doublon with 942151" Nov 13, 2025
@touchweb-vincent touchweb-vincent changed the title from "fix(942410): cleaning of doublon with 942151" to "fix(942410): Cleaning of duplicates with 942151" Nov 13, 2025
@touchweb-vincent
Contributor Author

I had to reorganize the unit tests. Sorry in advance for the headache it’ll give to whoever does the review…

Comment thread rules/REQUEST-942-APPLICATION-ATTACK-SQLI.conf
@fzipi
Member

fzipi commented Jan 24, 2026

@touchweb-vincent Can you fix conflicts?

@touchweb-vincent
Contributor Author

@fzipi done

Comment thread regex-assembly/942410.ra Outdated
fzipi
fzipi previously approved these changes Jan 27, 2026
Member

@fzipi fzipi left a comment

This PR cleans up duplicate SQL function names between rules 942151 (PL1) and 942410 (PL2), removing 174 entries from 942410 (171 duplicates and 3 obsolete names) while maintaining all detection capabilities.


Rule Architecture

Rule 942151 (Paranoia Level 1)

  • Formula: include/sql-injection-function-names.ra MINUS exclude/sql-injection-function-names-fps-pl1.ra
  • Contains: 268 SQL functions
  • Excludes: 10 high-FP functions (convert, left, lower, position, likelihood, unlikely, etc.)
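The formula above is a plain set difference. A minimal sketch in Python, assuming the names have already been read into lists; the helper `pl1_functions` is hypothetical (not part of the CRS tooling) and the sample names are an illustrative subset, not the real file contents:

```python
def pl1_functions(include_names, exclude_names):
    """Return the names left after include MINUS exclude, sorted."""
    return sorted(set(include_names) - set(exclude_names))


# Illustrative subset only; the real lists live in the .ra files.
include = ["convert", "crc32", "last_insert_id", "left"]
exclude = ["convert", "left"]

print(pl1_functions(include, exclude))
```

Any name present in both lists drops out of the PL1 set, which is why those names need to be picked up by another rule at a higher paranoia level.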

Rule 942410 (Paranoia Level 2)

  • Purpose: Detect high-FP SQL functions that are too risky for PL1
  • Before: 239 functions (171 duplicates + 68 unique)
  • After: 67 unique functions
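The before and after counts can be reconciled with the removal and addition figures reported elsewhere in this review (171 duplicates removed, 3 obsolete entries removed, 2 names added). A small sketch of that arithmetic; the function name `final_count` is made up for illustration:

```python
def final_count(before, duplicates_removed, obsolete_removed, added):
    """Entries left in the list after the cleanup described in the review."""
    return before - duplicates_removed - obsolete_removed + added


# Figures taken from this review.
print(final_count(before=239, duplicates_removed=171, obsolete_removed=3, added=2))
```

The result matches the stated total of 67, and also the sum of the two categories listed in the final state (10 + 57).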

Changes Made

✓ Removes Duplicates (174 functions)

  • Removed 171 SQL functions already covered by rule 942151
  • Removed 3 obsolete entries (typos like cr32 instead of crc32)

✓ Adds Missing Functions (2 functions)

  • likelihood and unlikely (were in exclude list but missing from old 942410)

✓ Test Redistribution (50 tests)

  • Moved 50 tests from 942410.yaml to 942151.yaml
  • Tests for functions that now belong to 942151
  • Perfect match: 50 removed = 50 added

✓ Documentation Improvements

  • Added comment to 942410.ra explaining it's an extended version with high-FP sequences
  • Added comment to exclude/sql-injection-function-names-fps-pl1.ra noting additions should also go to 942410.ra
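The second comment describes an invariant: every name excluded from PL1 should also be listed in 942410.ra. A sketch of how that could be checked, assuming both lists are available as plain names; `missing_from_pl2` is a hypothetical helper, and the "old" 942410 list here is reduced to the exclude-list names it reportedly already had:

```python
def missing_from_pl2(exclude_names, pl2_names):
    """Names excluded at PL1 that the PL2 list does not pick up."""
    return sorted(set(exclude_names) - set(pl2_names))


# The 10 exclude-list names reported in this review.
exclude = ["convert", "degrees", "elt", "left", "likelihood",
           "lower", "position", "quarter", "space", "unlikely"]

# Illustrative stand-in for the old 942410 list, before the two additions.
old_942410 = ["convert", "degrees", "elt", "left",
              "lower", "position", "quarter", "space"]

print(missing_from_pl2(exclude, old_942410))
```

Run against the pre-PR state, a check like this would flag likelihood and unlikely, the two names this PR adds back.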

Final State: 942410 Now Contains 67 Functions

Category 1: From Exclude List (10 functions)

High FP at PL1, checked at PL2:

convert, degrees, elt, left, likelihood, lower, position, quarter, space, unlikely

Category 2: Other Unique Functions (57 functions)

Very common words that would cause massive FPs at PL1:

abs, acos, avg, bin, cast, char, charset, chr, count, date, day, default,
field, floor, format, hour, if, in, is, last, length, ln, local, log, max,
min, minute, mod, month, now, password, pi, power, rawtonhex, rawtonhextoraw,
repeat, replace, reverse, right, round, second, sign, sleep, stddev, sum,
tan, time, to_char, to_days, to_nchar, to_seconds, upper, user, values,
version, week, year

Rationale: These are legitimate SQL functions but also extremely common English words/programming terms.
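To illustrate the rationale, here is a deliberately simplified sketch. The pattern below (a listed name followed by an opening parenthesis) is an assumption made for the example; it is not the regex that the rule actually assembles:

```python
import re


def build_pattern(names):
    """Hypothetical detector: any listed name followed by '('."""
    # Longest names first so longer alternatives are tried before shorter ones.
    ordered = sorted(names, key=len, reverse=True)
    alternation = "|".join(re.escape(name) for name in ordered)
    return re.compile(r"\b(?:" + alternation + r")\s*\(", re.IGNORECASE)


benign = "Pick a date (required) and a time (optional)."

# Everyday words from the list above match ordinary prose.
common = build_pattern(["date", "time", "if", "count"])
hits = [m.group(0) for m in common.finditer(benign)]
print(hits)

# A rarer function name does not match the same sentence.
rare = build_pattern(["crc32"])
print(rare.search(benign))
```

Under this simplified model, the benign sentence triggers two matches for the common words and none for the rarer name, which is the kind of difference that motivates checking such words only at PL2.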


Validation Results

Check               Details
Logic               No detection capabilities lost
Duplicates          All duplicates properly removed
PL1/PL2 separation  Maintains correct paranoia level distinction
Tests               50 tests cleanly migrated, none lost
Documentation       Clear explanations added
False positives     Quantitative testing passed (no new FPs)

Benefits

  1. Eliminates 172 lines of duplicate regex patterns - smaller compiled rules
  2. Improves maintainability - clear separation of concerns between PL1/PL2
  3. Better documentation - explains relationship between files
  4. Prevents future duplication - guidance for contributors
  5. Maintains functionality - all existing detections still work

Co-authored-by: Felipe Zipitría <3012076+fzipi@users.noreply.github.com>
@fzipi fzipi added this pull request to the merge queue Jan 27, 2026
Merged via the queue into coreruleset:main with commit 7ff533a Jan 27, 2026
8 checks passed
@touchweb-vincent touchweb-vincent deleted the patch-10 branch January 27, 2026 14:51