
fix: handle case-mismatched zip entry names in DOCX pre-processing #1967

Open
hanhan761 wants to merge 1 commit into microsoft:main from hanhan761:fix-1812-docx-badzipfile-casing

@hanhan761
Summary

Add _fix_zip_name_casing() to pre_process.py. It patches the file names in the zip's local file headers to match the central directory before the archive is opened. This prevents BadZipFile crashes on .docx files whose entry names differ in casing between the local headers and the central directory (common with legal document systems and older Word versions).
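
For readers unfamiliar with the zip layout, here is a minimal sketch of the technique described above. It is not the code from this PR: the function name `fix_zip_name_casing`, its bytes-in/bytes-out signature, and the constants are assumptions made for illustration. It handles only plain (non-ZIP64) archives and patches a local name only when it has the same length as the central directory name and differs in ASCII letter case alone, so no offsets move.

```python
import io
import struct
import zipfile

EOCD_SIG = b"PK\x05\x06"   # end of central directory record
CD_SIG = b"PK\x01\x02"     # central directory file header
LOCAL_SIG = b"PK\x03\x04"  # local file header


def fix_zip_name_casing(data: bytes) -> bytes:
    """Hypothetical sketch: copy central-directory names over local header
    names that differ from them only in ASCII letter case."""
    buf = bytearray(data)
    eocd = buf.rfind(EOCD_SIG)
    if eocd < 0 or eocd + 22 > len(buf):
        return data  # not a recognisable zip; leave it untouched
    # EOCD: total entry count at +10, central directory offset at +16.
    total_entries, _cd_size, pos = struct.unpack_from("<HII", buf, eocd + 10)
    for _ in range(total_entries):
        if pos + 46 > len(buf) or buf[pos:pos + 4] != CD_SIG:
            break  # truncated or ZIP64 layout; out of scope for this sketch
        name_len, extra_len, comment_len = struct.unpack_from("<HHH", buf, pos + 28)
        (local_off,) = struct.unpack_from("<I", buf, pos + 42)
        cd_name = bytes(buf[pos + 46:pos + 46 + name_len])
        start = local_off + 30  # the name follows the 30-byte fixed local header
        if start + name_len <= len(buf) and buf[local_off:local_off + 4] == LOCAL_SIG:
            (local_len,) = struct.unpack_from("<H", buf, local_off + 26)
            local_name = bytes(buf[start:start + local_len])
            # Patch in place only when the lengths match, so no offsets shift.
            if (local_len == name_len and local_name != cd_name
                    and local_name.lower() == cd_name.lower()):
                buf[start:start + name_len] = cd_name
        pos += 46 + name_len + extra_len + comment_len
    return bytes(buf)


if __name__ == "__main__":
    src = io.BytesIO()
    with zipfile.ZipFile(src, "w") as zf:
        zf.writestr("word/document.xml", b"<w:document/>")
    # Simulate the mismatch: change the casing in the local header only.
    broken = src.getvalue().replace(b"word/document.xml", b"Word/Document.xml", 1)
    with zipfile.ZipFile(io.BytesIO(fix_zip_name_casing(broken))) as zf:
        print(zf.read("word/document.xml"))
```

Treating the central directory as authoritative matches how Python's `zipfile` looks entries up: it lists names from the central directory and only compares them against the local header when an entry is opened.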

Issue

Fixes #1812

Verification

  • Existing docx tests (test_docx_comments, test_docx_equations) pass
  • _fix_zip_name_casing() runs as the first step in pre_process_docx with no side effects on normal files
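
The checks above exercise existing tests only. A dedicated regression test could start from a fixture like the one sketched here, which builds a small archive with a case-mismatched local header and confirms that the standard library refuses to read it. The helper names `make_case_mismatched_zip` and `reads_cleanly` are hypothetical and are not part of this PR or of the repository's test suite.

```python
import io
import zipfile


def make_case_mismatched_zip(name: str = "word/document.xml",
                             payload: bytes = b"<w:document/>") -> bytes:
    """Build a zip whose local header name differs in case from the central directory."""
    src = io.BytesIO()
    with zipfile.ZipFile(src, "w") as zf:
        zf.writestr(name, payload)
    raw = src.getvalue()
    # The local file header comes first in the archive, so replacing only the
    # first occurrence leaves the central directory entry untouched.
    return raw.replace(name.encode("ascii"), name.upper().encode("ascii"), 1)


def reads_cleanly(data: bytes, name: str = "word/document.xml") -> bool:
    """Return True if the standard library can read the named entry."""
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as zf:
            zf.read(name)
        return True
    except zipfile.BadZipFile:
        return False
```

Opening the archive succeeds because `zipfile.ZipFile()` reads only the central directory. The `BadZipFile` error surfaces when the entry itself is read and the local header name is compared against it, which is why the fixture check calls `read()`.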

Development

Successfully merging this pull request may close these issues.

BadZipFile crash on .docx files with case-mismatched zip entry names
