feat: update restricted file extensions #4316
RedXanadu
left a comment
I think the title of this PR is misleading. This looks like a rework of these rules, not just an 'update'.
There seem to be 6 new rules, by my count, and many rule changes happening here. The PR is a bit difficult to read as one big PR and to really see what is being proposed…
Could you please list the changes you are proposing so they can be better and more easily considered/reviewed? For example: "Add to default block list: .foo, .bar", "Add new rule to block: .baz", "Move .tar detection to new rule at PL3", etc.
I'm not sure I agree with some of the changes here, but it's genuinely difficult to see exactly what the changes are / what the reasoning is here.
|
@RedXanadu I edited the description of the PR. Does it look OK? |
EsadCetiner
left a comment
@touchweb-vincent Your changes to PL-1 look fine, but I'm not entirely sure I agree with CRS blocking the .sh file extension.
I don't like blocking compressed files. I think a targeted approach, such as here, is a much better way to handle issues related to compressed files. That should remain a suggestion inside crs-setup.conf.
Same goes for blocking office documents, txt files, json, etc. This might make sense for a specific environment but I don't like it being a default.
|
We’ve been seeing daily scans targeting archive files for years, with many different patterns coming from various bots. I don’t have a better proposal than suggesting a rule that filters them by default - it’s set to PL2 because we all know there will be some false positives, although they should normally be quite limited. I don’t see how you could specifically filter archives like .zip files - their names are always unpredictable. I don’t find these lists any more “specific” than the one already defined in the restricted_extensions variable. These two new lists simply complement it, with varying levels of paranoia depending on the false positives observed. |
Do you know what they've been targeting specifically? That would help make a better informed decision.
Some of these entries just don't make sense to me. How often are you really placing office documents within the webroot? For something like Nextcloud or SharePoint, these documents would be behind authentication. This doesn't seem like something CRS should be blocking out of the box. |
|
Here’s a sample of what we’ve seen passing through. Many self-adapting patterns based on the hostname are also observed.
__MACOSX.zip
It’s actually quite common for us to find .csv or .pdf files exposed to everyone under predictable paths, in directories not protected by .htaccess, and filled with personal data from a GDPR standpoint. These types of documents are particularly prone to that kind of exposure - and since there would inevitably be (many) edge cases to handle, we’ve placed this rule in PL3. We can move it to PL4 if you’d feel more comfortable with that. |
|
@touchweb-vincent Thanks, some of those entries look tricky to block without modifying the restricted file extensions. I'll add this to the agenda for discussion. |
|
I have started splitting this PR to clarify things. |
|
Hello. For what it's worth, I think it is good that we regularly evaluate blocked files, and I'm generally in favour of expanding the set of blocked file extensions and patterns. It's true that malicious bots and crackers are constantly scanning the web for many of the aforementioned file extensions. They do this to harvest whatever data they can, and I'm sure they find a lot of confidential data by doing so.
One may be surprised and dismayed to learn how often developers put and leave database dumps, deploy scripts, sensitive log files, documentation, and other sensitive information in the document root directories of Internet applications. I see this happen a lot, even though I warn against doing so. For that reason, at my workplace, we have our own custom rules to block many of these extensions that are not included in CRS rules.
That said, I think this list includes combinations that are very rarely associated with any files that actually exist (examples: wp-content2023.tgz, staging.rar, webapps.sqlite). We have to remember that while we do want to have outstanding coverage, CRS and all WAFs are intended to be a generalised form of front-line defence, rather than something that single-handedly blocks every possible threat. As well, WAF administrators are welcome to make their own rules to expand coverage based on their risk profiles, attack surfaces, and applications, and I personally encourage others to do so.
I don't think it is necessary or preferable to include some of the more unusual patterns listed above in CRS rules, because maintaining such a large list would be an administrative burden for the CRS team that would very likely yield negligible benefit. It's also worth remembering that just because scripts are scanning for certain patterns does not mean those patterns correspond to any real file or location on any given endpoint.
Instead, I would be a lot more comfortable blocking all .tgz, .rar, and .sqlite files, rather than patterns like 'wp-content2023.tgz'. Not only is this method easier for CRS maintainers to manage, it also makes it more straightforward for WAF administrators to write exclusions. If we make CRS administration overly challenging or confusing, we make it more likely that administrators will make wide exclusions, stick to PL1, or simply switch to cloud solutions like Cloudflare, which degrades security for everyone. |
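To make the trade-off in the comment above concrete, here is a minimal Python sketch contrasting the two approaches. It is only an illustration, not how CRS evaluates requests: the function names, the `allowed_paths` parameter, and the example paths other than the three filenames quoted in the thread are all hypothetical.

```python
from pathlib import PurePosixPath

# Filenames quoted in the thread as examples of scanned patterns.
SCANNED_NAMES = {"wp-content2023.tgz", "staging.rar", "webapps.sqlite"}

# The extension-only alternative suggested in the comment.
BLOCKED_EXTENSIONS = {".tgz", ".rar", ".sqlite"}


def blocked_by_name(path: str) -> bool:
    """Block only exact filenames from a maintained list (hypothetical helper)."""
    return PurePosixPath(path).name.lower() in SCANNED_NAMES


def blocked_by_extension(path: str, allowed_paths: frozenset = frozenset()) -> bool:
    """Block by final extension, with an explicit per-path exclusion (hypothetical helper)."""
    if path in allowed_paths:
        return False
    return PurePosixPath(path).suffix.lower() in BLOCKED_EXTENSIONS


# A name list misses trivial variations that an extension match catches.
print(blocked_by_name("/wp-content2024.tgz"))       # False
print(blocked_by_extension("/wp-content2024.tgz"))  # True

# An administrator can exclude one legitimate download without touching the list.
print(blocked_by_extension("/downloads/release.tgz",
                           frozenset({"/downloads/release.tgz"})))  # False
```

In this sketch the extension set stays three entries long however many filename variants scanners invent, which is the maintenance argument made above.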
EDIT:
This PR has been split to clarify things.
Hello,
Here’s a proposal to add some detections in the main tx.restricted_extensions list:
.back / .bck / .bk / .bkp – backup variants
.sav
.sh
I’ve created two new datasets: restricted_extensions_extended (PL2) and restricted_extensions_document (PL3).
restricted_extensions_extended - PL2 : .7z/ .br/ .bz/ .bz2/ .cab/ .cpio/ .gz/ .gzip/ .jar/ .json/ .lz/ .lzh/ .rar/ .tar/ .tbz/ .tbz2/ .tgz/ .txz/ .txt/ .wpress/ .xml/ .xz/ .yaml/ .yml/ .zip/ .zst/
restricted_extensions_document - PL3 : .doc/ .docm/ .docx/ .csv/ .odg/ .odp/ .ods/ .odt/ .oxps/ .pdf/ .pptm/ .pptx/ .xlf/ .xls/ .xlsm/ .xlsx/ .xsd/ .xslt/
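The tiering proposed above can be sketched in Python as follows. This is a rough model under stated assumptions, not the CRS implementation: the `TIERS` mapping and `blocked_at` function are hypothetical names, the PL1 set contains only the additions proposed here (not the existing tx.restricted_extensions list), and only the final extension of the path is considered.

```python
from pathlib import PurePosixPath

# Hypothetical model of the proposal: paranoia level -> extensions added at that level.
TIERS = {
    # Proposed additions to the main list (the existing entries are omitted here).
    1: {".back", ".bck", ".bk", ".bkp", ".sav", ".sh"},
    # restricted_extensions_extended
    2: {".7z", ".br", ".bz", ".bz2", ".cab", ".cpio", ".gz", ".gzip", ".jar",
        ".json", ".lz", ".lzh", ".rar", ".tar", ".tbz", ".tbz2", ".tgz", ".txz",
        ".txt", ".wpress", ".xml", ".xz", ".yaml", ".yml", ".zip", ".zst"},
    # restricted_extensions_document
    3: {".doc", ".docm", ".docx", ".csv", ".odg", ".odp", ".ods", ".odt", ".oxps",
        ".pdf", ".pptm", ".pptx", ".xlf", ".xls", ".xlsm", ".xlsx", ".xsd", ".xslt"},
}


def blocked_at(path: str, paranoia_level: int) -> bool:
    """Return True if the path's final extension is listed at or below the given level."""
    ext = PurePosixPath(path).suffix.lower()
    return any(ext in exts
               for level, exts in TIERS.items()
               if level <= paranoia_level)


print(blocked_at("/deploy.sh", 1))           # True: proposed for the main list
print(blocked_at("/backup/site.tar.gz", 1))  # False: archives only apply from PL2
print(blocked_at("/backup/site.tar.gz", 2))  # True
print(blocked_at("/files/report.PDF", 2))    # False: documents only apply from PL3
print(blocked_at("/files/report.PDF", 3))    # True
```

Each higher paranoia level inherits the lists below it, which matches the idea that the two new datasets complement the main list rather than replace it.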