feat: update restricted files and file extensions by EsadCetiner · Pull Request #4299 · coreruleset/coreruleset

EsadCetiner · 2025-10-21T04:48:07Z

This PR was compiled from multiple sources:

Nikto Scan
Files commonly accessed by bots
CrowdSec's Sec Lists (https://github.com/crowdsecurity/sec-lists/blob/master/web/sensitive_data.txt, https://github.com/crowdsecurity/sec-lists/blob/master/web/path_traversal.txt)
https://github.com/github/gitignore
Sensitive files I personally found
GitLab omnibus config files

I've removed entries that were not relevant for web applications to keep false positives to a minimum, and commented out ones that cause false positives.

These rules should be overhauled in the medium to long term, but this should be fine for now. I've opened an issue in #4300 to discuss further improvements to these rules.

closes #4248

github-actions · 2025-10-21T04:48:44Z

📊 Quantitative test results for language: eng, year: 2023, size: 10K, paranoia level: 1:
🚀 Quantitative testing did not detect new false positives

azurit

Looks good but it is a bigger PR so i suggest also someone else looks at it.

touchweb-vincent · 2025-10-24T10:06:27Z

A heartfelt thought for all the Ms. and Mr. PACs who will no longer be able to register on sites using email addresses like XX.pac@

This file is one of those I plan to propose splitting, if you agree with the approach detailed in the other PR.

Believe it or not, we’ve actually seen some “Mister Boto” accounts on the E-commerce sites we manage.

Indeed, you can say that OWASP seniors will create proper exclusions, since the scope of these rules is far from arguments such as email addresses.

However, the vast majority of hosting professionals will simply disable the rule in a hurry after receiving a telling-off from their e-commerce clients - and never re-enable it again

EsadCetiner · 2025-10-24T10:44:53Z

@touchweb-vincent Thanks for picking that up, turns out .pac doesn't have any security impact for web apps so I've removed it.

touchweb-vincent · 2025-10-25T05:30:41Z

Thanks.

However, the issue remains the same with all sequences of fewer than 5 characters that match the pattern

^[\.\w]{1,4}$

And this dataset is also unusable as-is in e-commerce contexts because of base64/high entropy inputs (bank signature / marketing cookie etc), where we need to isolate

^[\w\/]{1,6}$

That's why it would be great if you allowed me to split theses datasets.

I understand that this may be an unpleasant moment for the long-time members - ourselves included - but we’re preparing for the future by making things easier for newcomers.

EsadCetiner · 2025-10-25T05:45:26Z

@touchweb-vincent That's because mainly (In my opinion at least) we're reaching the limit as to what can be done using the @PmFromFile operator. I opened an issue at #4248 to discuss moving these rules to using a regex (Among other things) which should allow us to build more complex rules and maybe create a suffix we can add to all entries to reduce false positives to a minimum.

I'm happy to make a small change to this PR to reduce false positives, but bigger changes will take more effort.

fzipi · 2025-10-25T14:03:24Z

So, .pac files are for proxy autoconfiguration. I've used them in the past and they might disclose information about internal services. Just my 2 cents.

azurit · 2025-10-26T10:06:45Z

Should be readded.

touchweb-vincent · 2025-10-27T07:45:30Z

I agree that switching from PmFromFile to regex assembly can help make the rules more precise and therefore reduce false positives.

However, splitting becomes much harder for third parties - especially for newcomers. It’s far easier to split and manage a simple file than to break apart a complex regex assembly.

dune73 · 2025-10-27T07:52:34Z

Do you see users splitting rules so they can exclude some of the patterns, but not all, as part of their local integration?

I would argue that only PROs with an advanced understanding of CRS are doing that and for them making the step to regex assembly is doable. Also based on the comments provided with the rule.

touchweb-vincent · 2025-10-27T07:57:42Z

After talking extensively with many of them while running the diagnoser, the overwhelming majority simply disable the rule because of false positives. But for the small percentage who try to do it properly, they take the time to rationalize the rules based on the risk of false positives.

For those users, moving from PmFromFile to regex assembly makes things more complicated than it actually solves.

At least in its current state, and without splitting those datasets by false positive risk - typically against base64 or high-entropy inputs.

dune73 · 2025-10-27T08:00:50Z

These insights are very interesting. Thank you.

azurit · 2025-10-27T08:27:07Z

I still don't get why writing a proper exclusion is not a solution here.

touchweb-vincent · 2025-10-27T08:38:04Z

Azurit - please don’t take this as criticism or anything unpleasant; that’s really not my intention. And if it comes across that way, it’s only because of my limited command of English.

Have you ever deployed these rules across more than a hundred PHP e-commerce sites, each generating over €1M per year, and therefore relying on a very large volume of third-party integrations, including ad networks?

I can assure you, it’s an abject chaos. The people trying to deploy OWASP rules don’t have the time to do it properly unless we help them at least a little by rationalizing these datasets by false positive risk.

azurit · 2025-10-27T08:42:07Z

I'm running CRS globally, without option for users to disable it, on my own webhosting services which i'm providing for ~10 thousands of domains (all sort of web software - WordPress, Joomla, Drupal, OpenCart, Magento, Prestashop, ...).

touchweb-vincent · 2025-10-27T08:48:03Z

How many purely e-commerce sites generating over one million per year do you manage - sites that therefore very likely have dozens of dependencies with third-party services such as logistics providers, mailing companies, or retargeting/marketing platforms?

Have you enabled all the rules up to at least PL2? If so, at what score do you start blocking? 30? 60?

I’d be curious to run our WAF diagnoser on your infrastructure to better understand your tolerance levels.

azurit · 2025-10-27T08:49:22Z

I'm running on PL1 which is recommended if you don't want to mess up with FPs too much. Blocking from 5.

azurit · 2025-10-27T08:53:31Z

touchweb-vincent · 2025-10-27T09:08:53Z

If you are who you claim to be, Azurit, then please understand that you’re either among those who have shown total dedication for years to reach that level.

We’re trying to be that way as well. But we also see the difficulties our peers are facing - and many of them would really appreciate a bit of help.

If you have a SIEM that alerts you on every single block, then you know that high-entropy inputs are everywhere on ecommerce context - and that means thousands of exclusions to create.

If we rationalized this based on false positive risk, our peers would have significantly fewer issues, since they could more broadly disable the rule sets with a high false positive risk on high-entropy inputs.

I completely understand that it’s frustrating to think we’re adding more work for ourselves with additional exclusions - since we separate these datasets, we’ll also have extra work here, simply because third parties don’t have the time to handle things properly and end up disabling everything hastily at the first false positive.

Now, there’s a deeper reflection to be had about whether OWASP should acknowledge that the vast majority of people simply don’t have the time to manage this cleanly - and, as such, try to approach the problem differently.

Disabling a fragmented rule on a more limited scope is still less harmful than disabling the entire rule altogether.

This approach helps our peers take ownership of the project, and as a result, we help open-source solutions better protect themselves.

touchweb-vincent · 2025-10-27T09:23:25Z

We don’t have the statistical data here to quickly isolate the exclusions related only to PL1 - it’s true that this volume is mainly due to the fact that we strive to work at PL2, and partly at PL3.

We currently have more than 4,956 iterations of the keyword ctl:ruleRemove on our shared segment, and 5,003 on client-specific segments.

Please understand that when most third parties see dozens or even hundreds of exclusions to configure, they completely panic.

EsadCetiner · 2025-10-27T09:24:24Z

@azurit If you have 740 rule exclusions at PL-1 then I can see rule exclusions being in the thousands if your running at PL-2, anecdotally the false positives roughly double at PL-2. I don't see why @touchweb-vincent would exaggerate that claim, I can easily see that happening.

@touchweb-vincent @azurit Can you please at least move this discussion about splitting rules into an issue instead of discussing this in PRs that have absolutely no relevance to the subject.

azurit · 2025-10-27T09:27:33Z

Running on higher PL than 1 comes with FPs and exclusions. That's how it is (official from us, developers).

fzipi · 2025-10-27T10:04:03Z

@touchweb-vincent @azurit The discussions on experience, usage and other rules in different PLs are relevant but not here. Please create a different issue to discuss 🙏 🙏 🙏 .

fzipi · 2025-10-27T10:22:12Z

@touchweb-vincent The restricted extensions apply on rule 920440 only, and the rule applies to REQUEST_BASENAME only, not on any ARGS. When you mentioned about Mr.pac, is this a common finding? E.g. applications are sending POST /register/surname/Mr.pac as REST parameters?

touchweb-vincent · 2025-10-27T10:35:56Z

@fzipi .pac (lfi-os-files.data) was in lfi-os-files.data which has a broader scope :

coreruleset/rules/REQUEST-930-APPLICATION-ATTACK-LFI.conf

Line 99 in dc750a6

    
           SecRule REQUEST_COOKIES|REQUEST_COOKIES_NAMES|ARGS_NAMES|ARGS|XML:/* "@pmFromFile lfi-os-files.data" \

fzipi · 2025-10-27T10:54:31Z

Perfect, now it makes more sense.

EsadCetiner · 2026-01-14T05:14:41Z

I updated this PR to include files for GitLab omnibus.

Copilot

Pull request overview

This PR updates the restricted files and file extensions lists used by the OWASP Core Rule Set (CRS) to detect and block access to sensitive files. The updates are compiled from multiple security sources including Nikto Scan, CrowdSec's security lists, GitHub's gitignore templates, and GitLab omnibus configuration files.

Changes:

Added numerous sensitive configuration files, credentials files, cache directories, and system files to the restricted lists
Added new restricted file extensions (.jks, .swap, .swo, .tfstate) to prevent access to Java keystores, swap files, and Terraform state files
Consolidated some directory paths (e.g., replaced specific var/www subdirectories with the parent var/www)

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 8 comments.

Show a summary per file

File	Description
rules/restricted-upload.data	Added sensitive files and directories to upload restrictions; contains duplicate entries and missing cross-file consistency
rules/restricted-files.data	Added sensitive files, GitLab configs, Python cache, Windows system files, MySQL configs, log files, and directory roots; has alphabetical ordering issues
rules/lfi-os-files.data	Added sensitive files and /etc directories for various services (GitLab, Docker, mail servers, etc.); has alphabetical ordering issue and a typo
rules/REQUEST-901-INITIALIZATION.conf	Updated default restricted extensions list to include .jks, .swap, .swo, and .tfstate
crs-setup.conf.example	Updated documentation to reflect new restricted extensions

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

fzipi

I would go on and approve this one, as it is compliant on how we have managed things up to now.

There is context in this PR that should go to an issue for further discussion, as I think there is value in trying to figure it out what is next for the project.

Who will take the lead and follow up?

EsadCetiner · 2026-01-26T01:53:31Z

@fzipi The discussion that happened in this was completely unrelated to this PR, it was about splitting CRS rules based on their character length to different rules. There originally was this PR which proposed to do such a thing for one rule, but then this PR was hijacked to discuss that topic #4304.

I did open another issue though that was generally about improving the local file inclusion rules to improve the detection of true positives: #4300

feat: update restricted files and file extensions

34db29b

EsadCetiner added the release:new-detection In this PR we introduce a new detection label Oct 21, 2025

EsadCetiner and others added 2 commits October 23, 2025 14:58

fix false positive with json password fields

52e9c01

Merge branch 'main' into feat-update-restricted-files

014f82b

azurit reviewed Oct 24, 2025

View reviewed changes

Comment thread rules/lfi-os-files.data Outdated

azurit reviewed Oct 24, 2025

View reviewed changes

Comment thread rules/lfi-os-files.data Outdated

apply code review suggestions

61025f9

azurit previously approved these changes Oct 24, 2025

View reviewed changes

fix: remove .pac

124dd78

EsadCetiner dismissed azurit’s stale review via 124dd78 October 24, 2025 10:44

EsadCetiner and others added 2 commits October 27, 2025 14:19

add .pac

ab195e2

Merge branch 'main' into feat-update-restricted-files

635f69a

theseion mentioned this pull request Nov 3, 2025

Monthly Chat Agenda November (2025-11-03) #4317

Closed

Merge branch 'main' into feat-update-restricted-files

c361827

github-actions Bot added the Stale label Dec 18, 2025

EsadCetiner removed the Stale label Dec 22, 2025

theseion mentioned this pull request Jan 5, 2026

Monthly Chat Agenda January 2026 (2026-01-05) #4392

Closed

feat: add gitlab omnibus files

df78319

Merge branch 'main' into feat-update-restricted-files

51a3dc1

fzipi requested a review from Copilot January 19, 2026 19:44

Copilot started reviewing on behalf of fzipi January 19, 2026 19:45 View session

Copilot AI reviewed Jan 19, 2026

View reviewed changes

EsadCetiner and others added 4 commits January 20, 2026 10:18

Apply suggestions from code review

92b5dc8

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

apply copilot suggestions

8836ec8

Merge branch 'main' into feat-update-restricted-files

1c96d42

Merge branch 'main' into feat-update-restricted-files

734b011

fzipi approved these changes Jan 26, 2026

View reviewed changes

EsadCetiner added this pull request to the merge queue Jan 26, 2026

Merged via the queue into coreruleset:main with commit 3f83126 Jan 26, 2026
8 checks passed

EsadCetiner deleted the feat-update-restricted-files branch January 26, 2026 01:55

theseion mentioned this pull request Feb 2, 2026

Monthly Chat Agenda February 2026 (2026-02-02) #4438

Closed

S0obi mentioned this pull request Feb 10, 2026

False positive for 930130 with .pac #4452

Closed

Uh oh!

Conversation

EsadCetiner commented Oct 21, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions Bot commented Oct 21, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Uh oh!

azurit left a comment

Choose a reason for hiding this comment

Uh oh!

touchweb-vincent commented Oct 24, 2025

Uh oh!

EsadCetiner commented Oct 24, 2025

Uh oh!

touchweb-vincent commented Oct 25, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

EsadCetiner commented Oct 25, 2025

Uh oh!

fzipi commented Oct 25, 2025

Uh oh!

azurit commented Oct 26, 2025

Uh oh!

touchweb-vincent commented Oct 27, 2025

Uh oh!

dune73 commented Oct 27, 2025

Uh oh!

touchweb-vincent commented Oct 27, 2025

Uh oh!

dune73 commented Oct 27, 2025

Uh oh!

azurit commented Oct 27, 2025

Uh oh!

touchweb-vincent commented Oct 27, 2025

Uh oh!

azurit commented Oct 27, 2025

Uh oh!

touchweb-vincent commented Oct 27, 2025

Uh oh!

azurit commented Oct 27, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

azurit commented Oct 27, 2025

Uh oh!

touchweb-vincent commented Oct 27, 2025

Uh oh!

touchweb-vincent commented Oct 27, 2025

Uh oh!

EsadCetiner commented Oct 27, 2025

Uh oh!

azurit commented Oct 27, 2025

Uh oh!

fzipi commented Oct 27, 2025

Uh oh!

fzipi commented Oct 27, 2025

Uh oh!

touchweb-vincent commented Oct 27, 2025

Uh oh!

fzipi commented Oct 27, 2025

Uh oh!

EsadCetiner commented Jan 14, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

fzipi left a comment

Choose a reason for hiding this comment

EsadCetiner commented Oct 21, 2025 •

edited

Loading

github-actions Bot commented Oct 21, 2025 •

edited

Loading

touchweb-vincent commented Oct 25, 2025 •

edited

Loading

azurit commented Oct 27, 2025 •

edited

Loading