
feat: update restricted file extensions #4316

Closed
touchweb-vincent wants to merge 19 commits into coreruleset:main from touchweb-vincent:patch-9

Conversation

@touchweb-vincent
Contributor

@touchweb-vincent touchweb-vincent commented Nov 3, 2025

EDIT:

This PR has been split to clarify things.


Hello,

Here’s a proposal to add some detections in the main tx.restricted_extensions list:

.back / .bck / .bk / .bkp – backup variants
.sav
.sh

I’ve created two new datasets: restricted_extensions_extended (PL2) and restricted_extensions_document (PL3).

restricted_extensions_extended (PL2): .7z/ .br/ .bz/ .bz2/ .cab/ .cpio/ .gz/ .gzip/ .jar/ .json/ .lz/ .lzh/ .rar/ .tar/ .tbz/ .tbz2/ .tgz/ .txz/ .txt/ .wpress/ .xml/ .zst/ .xz/ .yaml/ .yml/ .zip/

restricted_extensions_document (PL3): .doc/ .docm/ .docx/ .csv/ .odg/ .odp/ .ods/ .odt/ .oxps/ .pdf/ .pptm/ .pptx/ .xlf/ .xls/ .xlsm/ .xlsx/ .xsd/ .xslt/
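
To make the tiering easier to reason about, here is a minimal Python sketch of how the three lists would behave at each paranoia level. It is only an illustration, not the CRS rule logic: `parse_list`, `final_extension`, `is_restricted` and `TIERS` are hypothetical names, the existing default list is not reproduced, and matching is assumed to look only at the final extension of the request basename.

```python
def parse_list(raw: str) -> frozenset:
    """Parse a '/'-separated extension list such as '.zip/ .tar/'."""
    return frozenset(item.strip().lower() for item in raw.split("/") if item.strip())


# Lists copied from the PR description (assumption: only the proposed additions
# are modelled; the existing default restricted_extensions list is left out).
PL1_ADDITIONS = parse_list(".back/ .bck/ .bk/ .bkp/ .sav/ .sh/")
PL2_EXTENDED = parse_list(
    ".7z/ .br/ .bz/ .bz2/ .cab/ .cpio/ .gz/ .gzip/ .jar/ .json/ .lz/ .lzh/ "
    ".rar/ .tar/ .tbz/ .tbz2/ .tgz/ .txz/ .txt/ .wpress/ .xml/ .zst/ .xz/ "
    ".yaml/ .yml/ .zip/"
)
PL3_DOCUMENT = parse_list(
    ".doc/ .docm/ .docx/ .csv/ .odg/ .odp/ .ods/ .odt/ .oxps/ .pdf/ .pptm/ "
    ".pptx/ .xlf/ .xls/ .xlsm/ .xlsx/ .xsd/ .xslt/"
)

# Each list is active from its paranoia level upward.
TIERS = ((1, PL1_ADDITIONS), (2, PL2_EXTENDED), (3, PL3_DOCUMENT))


def final_extension(path: str) -> str:
    """Return the last extension of the last path segment, lowercased ('' if none)."""
    basename = path.rsplit("/", 1)[-1]
    dot = basename.rfind(".")
    return basename[dot:].lower() if dot != -1 else ""


def is_restricted(path: str, paranoia_level: int) -> bool:
    """True if the extension is in any list enabled at this paranoia level."""
    ext = final_extension(path)
    return any(level <= paranoia_level and ext in exts for level, exts in TIERS)


if __name__ == "__main__":
    for path in ("/deploy.sh", "/backup.zip", "/exports/users.csv"):
        print(path, [is_restricted(path, pl) for pl in (1, 2, 3)])
```

Under these assumptions, an archive request is only flagged from PL2 and a document request only from PL3, which is the trade-off the paranoia levels are meant to express.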

@github-actions
Contributor

github-actions Bot commented Nov 3, 2025

📊 Quantitative test results for language: eng, year: 2023, size: 10K, paranoia level: 1:
🚀 Quantitative testing did not detect new false positives

Member

@RedXanadu RedXanadu left a comment

I think the title of this PR is misleading. This looks like a rework of these rules, not just an 'update'.

There seem to be 6 new rules, by my count, and many rule changes happening here. The PR is a bit difficult to read as one big PR and to really see what is being proposed…

Could you please list the changes you are proposing so they can be better and more easily considered/reviewed? For example: "Add to default block list: .foo, .bar", "Add new rule to block: .baz", "Move .tar detection to new rule at PL3", etc.

I'm not sure I agree with some of the changes here, but it's genuinely difficult to see exactly what the changes are / what the reasoning is here.

@touchweb-vincent
Contributor Author

@RedXanadu I edited the description of the PR - does it look OK?

Member

@EsadCetiner EsadCetiner left a comment

@touchweb-vincent Your changes to PL-1 look fine, but I'm not entirely sure I agree with CRS blocking the .sh file extension.

I don't like blocking compressed files; I think a targeted approach, such as here, is much better for handling issues related to compressed files. That should remain a suggestion inside crs-setup.conf.

Same goes for blocking office documents, txt files, json, etc. This might make sense for a specific environment but I don't like it being a default.

@touchweb-vincent
Contributor Author

We’ve been seeing daily scans targeting archive files for years, with many different patterns coming from various bots.

I don’t have a better proposal than suggesting a rule that filters them by default - it’s set to PL2 because we all know there will be some false positives, although they should normally be quite limited.

I don’t see how you could specifically filter archives like .zip files - their names are always unpredictable.

I don’t find these lists any more “specific” than the one already defined in the restricted_extensions variable.

These two new lists simply complement it, with varying levels of paranoia depending on the false positives observed.

@EsadCetiner
Member

@touchweb-vincent

> We’ve been seeing daily scans targeting archive files for years, with many different patterns coming from various bots.

Do you know what they've been targeting specifically? That would help make a better-informed decision.

> I don’t find these lists any more “specific” than the one already defined in the restricted_extensions variable.

Some of these entries just don't make sense to me. How often are you really placing office documents within the webroot? For something like Nextcloud or SharePoint, these documents would be behind authentication, so this doesn't seem like something CRS should be blocking out of the box.

@touchweb-vincent
Contributor Author

touchweb-vincent commented Nov 4, 2025

Here’s a sample of what we’ve seen passing through. Many self-adapting patterns based on the hostname are also observed.

__MACOSX.zip
_dump.sql
.lz
.sql.lz
.sql.rar
.sql.tar
.sql.zip
.sqlite
_bak.sql.zip
_wp-content.zip
1.rar
1.tar
1.txt
1.zip
123.txt
123.zip
123456.tar.gz
2022.tar.gz
2024.lz
2024.rar
2024.sql
2024.sqlite
2024.tar
2024.zip
2017_backup.bz
2017_backup.gz
2017_backup.tar
2017_backup.tgz
2017_backup.tmp
a.bz
a.tar
a.txt
a.zip
aa.tgz
access_tokens.db
account.rar
account.tar
account.zip
admin.lz
admin.rar
admin.sql
admin.sqlite
admin.tar
admin.txt
admin.zip
administrator.zip
admins.zip
adminweb.tar
all.tar
api.lz
api.rar
api.sql
api.tar
api.zip
app.lz
app.sqlite
app.rar
app.tar
app.zip
application.rar
application.tar
application.zip
april_backup.sql.tar
arc.zip
archive.zip
assest.rar
assest.tar
assest.zip
assets.rar
assets.tar
assets.zip
b.txt
b.zip
back.rar
back.tar.gz
back.zip
back2024.zip
back/Archive.zip
back/backup.sql.tar
back/mysql.sql
back/public_html.tar.gz
back/web.zip
back/website.zip
backend.rar
backend.tar
backend.zip
backoffice.rar
backoffice.tar
backoffice.zip
backup.7z
backup.bak
backup.gzip
backup.lz
backup.rar
backup.sql
backup.tar
backup.tar.gz
backup.zip
backup_1.lz
backup_1.tar
backup_1.sql
backup_1.zip
backup_2.lz
backup_2.tar
backup_2.sql
backup_2.zip
backup_2022.bkp
backup_3.lz
backup_3.sql
backup_3.tar
backup_3.zip
backup_4.lz
backup_4.sql
backup_4.tar
backup_4.zip
backup_archive_db.sql
backup_archive_wp.sql
backup_db.tar
backup_final.sql
backup_site_wp.sql
backup-db.tar
backup-sql-wordpress.sql
backup-wordpress-db.gz
backup-wordpress-db.tar
backup1.zip
backup123.sql.tar
backup2.zip
backup2022.sql
backup2023.sql
backup2023.zip
backup2024.zip
backup3.zip
backupdb.tgz
backupfiles.tar
backuplocalhost.sql.tar
backups.lz
backups.rar
backups.sql
backups.sqlite
backups.tar
backups.zip
backup/
backups/
bak/
bak.lz
bak.sql
bak.tgz
bak.zip
base.sql
base2024.zip
billing.zip
bill.zip
bills.zip
bin.lz
bin.rar
bin.sql
bin.tar
bin.zip
bk.sql
blog.rar
blog.tar
blog.zip
boutique.tar
cards.rar
cards.zip
cc.rar
cc.zip
clickandbuilds.zip
clients.rar
clients.tar
clients.zip
com.bz
com.gz
com.rar
com.sql
com.tar
com.tgz
com.zip
common.rar
common.tar
common.zip
conf.rar
conf.tar
conf.zip
conf/conf.lz
conf/conf.rar
conf/conf.sql
conf/conf.sqlite
conf/conf.zip
configurations.rar
configurations.tar
configurations.zip
credentials.txt
content.zip
cron.rar
cron.tar
cron.zip
crons.rar
crons.tar
crons.zip
d.zip
dashboard.rar
dashboard.tar
dashboard.zip
data.lz
data.rar
data.sql.lz
data.sql.rar
data.sql.tar
data.sql.zip
data.sqlite
data.tar
data.zip
database.lz
database.rar
database.sql
database.zip
database_02.gz
database_02.sql
database_structure.tar
database-backup.sql
database-backup.tar
database-backup.tgz
database-backup.zip
database-backup-wp.sql
database-full-backup.sql
databasedump.tar
databases.sql
databases.tar
databases.zip
daily.sql
db.lz
db.rar
db.sql
db.tar
db.tgz
db.zip
db_archive_wp.sql
db_backup.tar
db_config.sql
db_copy_backup.gz
db_data.sql
db-archive.sql
db-backup-wp.sql
db-dump.sql
db-full-backup.zip
db2024.zip
dbdump.sql
dbs.rar
dbs.zip
dev.txt
dev.zip
directory.rar
directory.zip
docs.zip
domain.zip
domains.zip
download.zip
dump.lz
dump.rar
dump.sql
dump.tar
dump.zip
entire-wp-backup.sql
environments.rar
environments.tar
environments.zip
eshop.zip
export_data.sql
export_db.sql
export_db.tar
file.rar
file.sql
file.tar
file.zip
files.rar
files.sql
files.tar
files.zip
final_backup.sql
final_wordpress_backup.sql
full_backup.sql
fr.rar
fr.tar
fr.tgz
fr.sql
fr.zip
ftp.lz
ftp.rar
ftp.sql
ftp.sqlite
ftp.tar
ftp.zip
ftp1.zip
function.rar
function.tar
function.zip
functions.rar
functions.tar
functions.zip
gate.zip
hdocs.bz
hdocs.lz
hdocs.rar
hdocs.sql
hdocs.sqlite
hdocs.tar
hdocs.zip
htdocs.lz
htdocs.rar
htdocs.sql
htdocs.sqlite
htdocs.tar
htdocs.tgz
htdocs.zip
html.lz
html.rar
html.sqlite
html.sql
html.tar
html.tar.gz
html.zip
html_backup.sql.tar.gz
htmlbackup.sql.tar
httpdocs.rar
httpdocs.tar
httpdocs.tgz
httpdocs.zip
includes.zip
import.rar
import.zip
index.zip
inetpub.lz
inetpub.rar
inetpub.sql
inetpub.sqlite
inetpub.zip
joomla.zip
js.rar
js.tar
js.zip
land.7z
last.rar
last.zip
Latest.tgz
local-backup.sql.tgz
localhost.sql
localhost_backup.sql.gz
localhost_database.sql
localhost-backup.tar
localhostdump.sql
localhostdump.tar
localhostdump.zip
old.tar
orders.tar
orders.tar.gz
package.lz
package.rar
package.sql
package.sqlite
package.tar
package.zip
panel.zip
pass.zip
password.zip
passwords.zip
payment.rar
payment.tar
payment.zip
payments.zip
pma.sql
private.zip
prod.zip
public.lz
public.sql
public.tar
public.zip
public_html.bz
public_html.lz
public_html.rar
public_html.sql
public_html.sqlite
public_html.tar
public_html.txt
public_html.zip
public2023.tar
magento.sql
magento.txt
magento.zip
main.sql
main.zip
manage.rar
manage.tar
manage.zip
master.rar
master.tar
master.zip
mdb.txt
mdb.zip
media.lz
media.rar
media.sql
media.sqlite
media.tar
media.zip
member.rar
member.tar
member.zip
members.rar
members.tar
members.zip
module.rar
module.tar
module.zip
modules.rar
modules.tar
modules.zip
my.rar
my.tar
my.zip
mydb_backup.tar
mysql.rar
mysql.zip
new.rar
new.tar.gz
new.sql
new.zip
NON.tar.gz
old.lz
old.rar
old.sql
old.tar.gz
old.zip
old/.bash_history
old/Archive.zip
old/backup.sql.tar
old/bak.gz
old/directory.rar
old/index.zip
old/public_html.zip
old/www.tar
old-2024.tgz
oldbackup.tgz
oldbkp.tar
oldbkp.tgz
oldsite.tar.gz
order.rar
order.tar
order.zip
orders.rar
orders.zip
output.lz
output.sql
output.sqlite
output.rar
output.zip
plugin.rar
plugin.tar
plugin.zip
plugins.rar
plugins.tar
plugins.zip
pub/1.rar
pub/123.rar
pub/b.rar
pub/back.rar
pub/bk.rar
pub/cfg.rar
pub/config.rar
pub/contact.rar
pub/cron.rar
pub/db.rar
pub/file.rar
pub/fm.rar
pub/i.rar
pub/in.rar
pub/index2.rar
pub/indexx.rar
pub/indx.rar
pub/ini.rar
pub/info.rar
pub/install.rar
pub/load.rar
pub/log.rar
pub/m.rar
pub/manager.rar
pub/online.rar
pub/php.rar
pub/phpinfo.rar
pub/pm.rar
pub/s.rar
pub/seo.rar
pub/sh.rar
pub/up.rar
pub/upload.rar
pub/v.rar
pub/ve.rar
public_backup.sql
public_shtml.tar
Release.lz
Release.rar
Release.sql
Release.sqlite
Release.zip
resources.rar
resources.tar
resources.zip
restore/backup.gz
restore/public_html.rar
restore/www.sql
restore/www.zip
ROOT.lz
ROOT.rar
ROOT.sql
root.sql
ROOT.sqlite
ROOT.tar
root.tar
root.tgz
ROOT.zip
root.zip
secure_wp_db.sql.gz
server.zip
site.rar
site.sql
site.tar
site.zip
site_data_backup.sql
site1.tgz
site-full-backup.sql
sitepack.tar.gz
sites.zip
sh.zip
shell.zip
shop.rar
shop.tar
shop.zip
sql.lz
sql.rar
sql.sql
sql.tar
sql.zip
sql/backup.sql
sql/database.sql
sql/dump.sql
sql/localhost.sql
sql/public_html.gz
sql/sql.sql
sql_dump.sql
sql_export_wp.gz
sqldump.tgz
smiff.rar
smiff.zip
snif.rar
snif.zip
src.lz
src.rar
src.sql
src.sqlite
src.tar
src.zip
ssh.zip
ssl.zip
stage.rar
stage.tar
stage.zip
staging.rar
staging.tar
staging.zip
static.zip
static1.zip
store.rar
store.tar
store.zip
structure.sql
system.rar
system.tar
system.zip
temp.lz
temp.rar
temp.sql
temp.sqlite
temp.tar
temp.zip
test.lz
test.rar
test.sql
test.sqlite
test.tar
test.zip
tmm_db_migrate.sql
tmp.lz
tmp.rar
tmp.sql
tmp.sqlite
tmp.tar
tmp.zip
valid.rar
valid.zip
upload.lz
upload.sqlite
upload.rar
upload.sql
upload.tar
upload.zip
uploads.lz
uploads.sql
uploads.tar
uploads.zip
users.txt
users.zip
w.txt
w.zip
web.lz
web.rar
web.sql
web.tar
web.zip
web1.zip
webapps.lz
webapps.rar
webapps.sql
webapps.sqlite
webapps.tar
webapps.zip
webbak.tar
webbak.tgz
webbak.zip
website.lz
website.rar
website.tar
website.tgz
website.zip
website_backup.sql
website-db-backup.sql
wordpress.sql
wordpress.tar.gz
wordpress.zip
wordpress_export_archive.sql
wordpress_final_secure_backup.sql
wordpress_latest_secure_backup.sql
wordpress-database-backup.sql
wordpress-database-gzip.tar
wordpress-database-zip.sql
wordpress-sql-backup.zip
wordpress-tar-gz-db-backup.tgz
wp.sql
wp.tar
wp.tgz
wp.zip
wp_database.sql.zip
wp_db_export.zip
wp_encrypted_archive.sql
wp_snapshot.sql
wp_sql_dump.sql
wp_user_backup.sql
wp_users.sql
wp-back.tar
wp-back.tgz
wp-content%20(1).zip
wp-content.tar
wp-content_2022.tar
wp-content_2023.tgz
wp-content2023.tgz
wp-content.zip
wp-database.zip
wp-database-tar-backup.tar
wp-db.gz
wp-db-snapshot.tgz
wp-content/1.sql
wp-content/admin.sql
wp-content/admin2023.sql
wp-content/admin-backup.tgz
wp-content/all.sql
wp-content/all.tar
wp-content/all.tgz
wp-content/all.zip
wp-content/alldb_backup.tar
wp-content/back.sql
wp-content/back.tar
wp-content/backup/backup.sql
wp-content/backup-site.sql
wp-content/backups.sql
wp-content/backupwp.sql
wp-content/bk.sql
wp-content/data.sql
wp-content/database.sql
wp-content/database_dump.sql
wp-content/databse.tar
wp-content/db.sql
wp-content/db_backup.tar
wp-content/dump.sql
wp-content/file-backup.tar
wp-content/files.sql
wp-content/htdocs.tar
wp-content/htmlback.sql
wp-content/latest.zip
wp-content/localhost.sql
wp-content/localhost_dump.sql
wp-content/localhostdb.sql
wp-content/localhostdb.zip
wp-content/old.sql
wp-content/oldbackup.zip
wp-content/oldbckp.sql
wp-content/plugins%20(1).zip
wp-content/public_html.sql
wp-content/public_html.tar
wp-content/result.tar
wp-content/server.sql
wp-content/site.sql
wp-content/site.zip
wp-content/site_backup.tgz
wp-content/site1.tar
wp-content/site2.tar
wp-content/sql.sql
wp-content/src.tar
wp-content/tmm_db_migrate.sql
wp-content/websitebackup.tar
wp-content/web.sql
wp-content/wordpress.sql
wp-content/wordpress.tar
wp-content/wordpress.tgz
wp-content/wordpress.zip
wp-content/wp-includes.zip
wp-content/www.sql
wp-content/wwwroot2022.tgz
wp-includes.tgz
wp-site-database.sql
wpdb_backup.sql
wpdb_snapshot.sql
wp/backup(1).tar
www.bz
www.bz2
www.lz
www.rar
www.sql
www.tar
www.tar.bz2
www.tar.gz
www.zip
www_dump.sql
wwwback.tar
wwwdata.tar
wwwroot.lz
wwwroot.rar
wwwroot.sql
wwwroot.sqlite
wwwroot.tar
wwwroot.zip
wwwroot2022.tgz
wwwroot2024.zip
wwwrootback.tgz
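
One way to read the sample above is to group the probed names by their final extension. The sketch below does that for a handful of entries copied from the list; `SAMPLE`, `final_extension` and `tally_extensions` are hypothetical helper names, and the subset was chosen by hand for illustration, so the counts say nothing about the real distribution in our logs.

```python
from collections import Counter
from urllib.parse import unquote

# A hand-picked subset of the names listed above (illustration only).
SAMPLE = [
    "__MACOSX.zip",
    "_wp-content.zip",
    "wp-content%20(1).zip",
    "back2024.zip",
    "123456.tar.gz",
    "back/public_html.tar.gz",
    "oldsite.tar.gz",
    "april_backup.sql.tar",
    "wp-content/alldb_backup.tar",
    "wp-content/wwwroot2022.tgz",
    "wp-db-snapshot.tgz",
    "pub/phpinfo.rar",
]


def final_extension(path: str) -> str:
    """Percent-decode the path and return the last extension of its basename."""
    basename = unquote(path).rsplit("/", 1)[-1]
    dot = basename.rfind(".")
    return basename[dot:].lower() if dot != -1 else ""


def tally_extensions(paths) -> Counter:
    """Count how many probed names end in each extension."""
    return Counter(final_extension(p) for p in paths)


if __name__ == "__main__":
    counts = tally_extensions(SAMPLE)
    for ext, n in counts.most_common():
        print(f"{ext}\t{n}")
```

The file names vary from one bot to the next, but they end in a small set of extensions, which is why the proposal filters on the extension rather than on the name.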

It’s actually quite common for us to find .csv or .pdf files exposed to everyone under predictable paths, in directories not protected by .htaccess, and filled with personal data from a GDPR standpoint.

These types of documents are particularly prone to that kind of exposure - and since there would inevitably be (many) edge cases to handle, we’ve placed this rule in PL3.

We can move it to PL4 if you’d feel more comfortable with that.

@EsadCetiner
Member

@touchweb-vincent Thanks, some of those entries look tricky to block without modifying the restricted file extensions.

I'll add this to the agenda for discussion.

@touchweb-vincent
Contributor Author

touchweb-vincent commented Nov 7, 2025

@michelamarie

Hello. For what it's worth, I think it is good that we regularly evaluate blocked files, and I'm generally in favour of expanding the set of blocked file extensions and patterns.

It's true that malicious bots and crackers are constantly scanning the web for many of the aforementioned file extensions. They do this to harvest whatever data they can, and I'm sure they find a lot of confidential data by doing so. One may be surprised and dismayed to learn how often developers put and leave database dumps, deploy scripts, sensitive log files, documentation, and other sensitive information in the document root directories of Internet applications -- I see this happen a lot, even though I warn against doing so. For that reason, at my workplace, we have our own custom rules to block many of these extensions that are not included in CRS rules.

That said, I think this list includes combinations that are very rarely associated with files that actually exist (examples: wp-content2023.tgz, staging.rar, webapps.sqlite). We have to remember that while we do want outstanding coverage, CRS and all WAFs are intended to be a generalised form of front-line defence, rather than something that single-handedly blocks every possible threat. As well, WAF administrators are welcome to write their own rules to expand coverage based on their risk profiles, attack surfaces, and applications, and I personally encourage others to do so.

I don't think it is necessary or preferable to include some of the more unusual patterns listed above in CRS rules, because maintaining such a large list would be an administrative burden for the CRS team that would very likely yield negligible benefit. It's also worth remembering that just because scripts are scanning for certain patterns does not mean they correspond to any real file or location on any given end-point. Instead, I would be a lot more comfortable blocking all .tgz, .rar, and .sqlite files, rather than patterns like 'wp-content2023.tgz' -- not only is this method easier for CRS maintainers to manage, it also makes it more straightforward for WAF administrators to write exclusions. If we make CRS administration overly challenging or confusing, we make it more likely that administrators will make wide exclusions, stick to PL1, or simply switch to cloud solutions like Cloudflare, which degrades security for everyone.
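
To illustrate the difference, here is a rough Python sketch comparing a list of exact file names with a short extension list plus one administrator exclusion. Everything in it is hypothetical (the helper names, the excluded path `/downloads/release.tgz`, the request paths), and it is not how CRS rules or exclusions are actually written; it only models the two approaches.

```python
# Approach A: enumerate exact file names seen in scans.
EXACT_NAMES = frozenset({"wp-content2023.tgz", "staging.rar", "webapps.sqlite"})

# Approach B: block by extension, with a per-site exclusion (hypothetical path).
BLOCKED_EXTENSIONS = frozenset({".tgz", ".rar", ".sqlite"})
EXCLUDED_PATHS = frozenset({"/downloads/release.tgz"})


def basename(path: str) -> str:
    """Return the last segment of a request path, lowercased."""
    return path.rsplit("/", 1)[-1].lower()


def blocked_by_name(path: str) -> bool:
    """Approach A: block only names that are already on the list."""
    return basename(path) in EXACT_NAMES


def blocked_by_extension(path: str) -> bool:
    """Approach B: block any name with a listed extension unless excluded."""
    if path in EXCLUDED_PATHS:
        return False
    name = basename(path)
    return any(name.endswith(ext) for ext in BLOCKED_EXTENSIONS)


if __name__ == "__main__":
    for path in ("/wp-content2023.tgz", "/wp-content2025.tgz", "/downloads/release.tgz"):
        print(path, blocked_by_name(path), blocked_by_extension(path))
```

In this sketch, a name that is not yet on the exact-name list passes Approach A but is still caught by Approach B, and the legitimate download needs only a single exclusion entry.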


4 participants