Drop and vendor-in lockfile and python-daemon dependencies#65513
jscheffl wants to merge 2 commits into apache:main
Conversation
Pull request overview
This PR removes the deprecated lockfile and python-daemon dependencies from Airflow core by vendoring minimal PID-file and daemonization utilities into airflow.utils, and updates affected providers/tests to use the new implementations (with version-based fallbacks in providers).
Changes:
- Drop `lockfile` and `python-daemon` from `airflow-core` dependencies and introduce vendored replacements (`airflow.utils.pidfile`, `airflow.utils.daemon`).
- Switch core daemon CLI utilities and process PID checks to the vendored implementations, updating related unit tests.
- Update Celery and Edge3 providers to conditionally import PID-file helpers based on the running Airflow version.
Reviewed changes
Copilot reviewed 16 out of 17 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| providers/edge3/src/airflow/providers/edge3/version_compat.py | Adds an Airflow 3.2.2+ compatibility constant used for conditional imports. |
| providers/edge3/src/airflow/providers/edge3/cli/worker.py | Uses vendored PID-file removal helper on Airflow 3.2.2+, falls back to lockfile otherwise. |
| providers/edge3/src/airflow/providers/edge3/cli/signalling.py | Uses vendored PID-file helpers on Airflow 3.2.2+, falls back to lockfile otherwise. |
| providers/celery/tests/unit/celery/cli/test_celery_command.py | Updates daemon-related mocks to patch DaemonContext directly. |
| providers/celery/src/airflow/providers/celery/version_compat.py | Adds an Airflow 3.2.2+ compatibility constant used for conditional imports. |
| providers/celery/src/airflow/providers/celery/cli/celery_command.py | Uses vendored PID-file helpers on Airflow 3.2.2+, falls back to lockfile otherwise. |
| airflow-core/tests/unit/utils/test_pidfile.py | Adds unit tests for the new vendored PID-file utilities. |
| airflow-core/tests/unit/utils/test_daemon.py | Adds unit tests for the new vendored daemon context implementation. |
| airflow-core/tests/unit/cli/commands/test_kerberos_command.py | Updates daemon-related mocks to patch DaemonContext. |
| airflow-core/tests/unit/cli/commands/test_api_server_command.py | Updates daemon-related mocks to patch DaemonContext. |
| airflow-core/src/airflow/utils/process_utils.py | Switches PID lock import from lockfile to vendored PIDLockFile. |
| airflow-core/src/airflow/utils/pidfile.py | New vendored PID-file utilities replacing deprecated lockfile. |
| airflow-core/src/airflow/utils/daemon.py | New vendored daemon context replacing python-daemon. |
| airflow-core/src/airflow/cli/commands/daemon_utils.py | Switches CLI daemonization to vendored DaemonContext and TimeoutPIDLockFile. |
| airflow-core/pyproject.toml | Removes lockfile and python-daemon from core dependencies. |
| airflow-core/newsfragments/65513.significant.rst | Documents dependency removal and provider compatibility implications. |
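The Celery and Edge3 provider changes listed above share the same version-gated import pattern: use the vendored helpers on Airflow 3.2.2+, fall back to lockfile otherwise. A minimal sketch of that selection logic, with hypothetical function names (the actual providers make this decision once, via a boolean constant in `version_compat.py`):

```python
def _parse_version(version: str) -> tuple[int, ...]:
    """Parse a plain 'X.Y.Z' version string into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


def select_pidfile_module(airflow_version: str) -> str:
    """Pick where the PID-file helpers should come from.

    On Airflow 3.2.2+ the vendored airflow.utils.pidfile module exists;
    older versions still need the deprecated lockfile package.
    """
    if _parse_version(airflow_version) >= (3, 2, 2):
        return "airflow.utils.pidfile"
    return "lockfile.pidlockfile"
```

Call sites then only branch on the precomputed flag, so the deprecated import is never even attempted on new Airflow versions.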
```python
def test_creates_file_with_correct_permissions(self, tmp_path):
    pidfile = str(tmp_path / "test.pid")
    write_pid_to_pidfile(pidfile)
```
test_creates_file_with_correct_permissions asserts an exact mode of 0o644, but file creation mode is masked by the process umask. This makes the test environment-dependent/flaky. Set a known umask in the test (and restore it) or assert based on the current umask.
Suggested change:

```python
previous_umask = os.umask(0o022)
try:
    write_pid_to_pidfile(pidfile)
finally:
    os.umask(previous_umask)
```
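The other option the comment mentions, asserting against the current umask instead of pinning it, could look roughly like this (a sketch, not the PR's test code):

```python
import os


def expected_mode(requested_mode: int = 0o644) -> int:
    """Permission bits a newly created file will actually get,
    once the process umask is applied to the requested mode."""
    current_umask = os.umask(0)  # os.umask sets a new mask and returns the old one,
    os.umask(current_umask)      # so read it destructively and restore it right away
    return requested_mode & ~current_umask
```

The test could then assert `stat.S_IMODE(os.stat(pidfile).st_mode) == expected_mode()`, which holds in any environment.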
```python
def test_writes_current_pid(self, tmp_path):
    pidfile = str(tmp_path / "test.pid")
    write_pid_to_pidfile(pidfile)
    content = open(pidfile).read()
```
This test reads the pidfile using open(...).read() without a context manager, leaving the file handle to be closed by GC. Use a with open(...) block to avoid leaking file descriptors during the test run.
Suggested change:

```python
with open(pidfile) as file:
    content = file.read()
```
```python
import sys
```
Redundant inner import sys inside test_redirects_to_devnull_when_none (the module already imports sys at top). Please remove the inner import to keep imports consistent and avoid function-local imports.
Suggested change: delete the function-local `import sys` line.
```python
import atexit
import os
import pwd
import resource
import signal
import sys
```
airflow.utils.daemon unconditionally imports Unix-only modules (pwd, resource) and is imported at module import time by airflow.cli.commands.daemon_utils. This will raise ModuleNotFoundError on non-POSIX platforms even when --daemon isn’t used. Consider guarding these imports / limiting this module to POSIX (e.g., import-time fallback that raises a clear error only when daemonization is attempted).
```python
def _is_detach_process_context_required() -> bool:
    if os.getppid() == 1:
        return False
    try:
        if sys.__stdin__ is not None and hasattr(sys.__stdin__, "fileno"):
            fd = sys.__stdin__.fileno()
            import socket

            sock = socket.fromfd(fd, socket.AF_INET, socket.SOCK_RAW)
            sock.getsockopt(socket.SOL_SOCKET, socket.SO_TYPE)
            return False
    except (OSError, ValueError):
```
_is_detach_process_context_required does an inline import socket (against the repo guidance to keep imports at module level) and creates a socket via socket.fromfd(...) without closing it. Please move the import to module scope and ensure any temporary socket objects are closed to avoid leaking file descriptors.
```python
if target_stream is None:
    target_fd = os.open(os.devnull, os.O_RDWR)
else:
    target_fd = target_stream.fileno()
os.dup2(target_fd, system_stream.fileno())
```
_redirect_stream opens os.devnull when target_stream is None but never closes the returned fd after dup2. This leaks a file descriptor for each redirected stream. Close the temporary fd after duplication (taking care not to close stdio fds if they’re reused).
Suggested change:

```python
system_fd = system_stream.fileno()
if target_stream is None:
    target_fd = os.open(os.devnull, os.O_RDWR)
    try:
        if target_fd != system_fd:
            os.dup2(target_fd, system_fd)
    finally:
        if target_fd != system_fd:
            os.close(target_fd)
    return
os.dup2(target_stream.fileno(), system_fd)
```
The lockfile library is deprecated, and python-daemon has carried that same deprecated transitive dependency for over five years without a fix (https://pagure.io/python-daemon/issue/42), even though a PR addressing it has been open for more than three years (https://pagure.io/python-daemon/pull-request/81). This PR therefore pragmatically vendors the logic in: each piece is under 500 LoC, and cutting the dependency is easier than waiting on what looks like a dead horse.
More problematically, both the Celery and Edge3 provider packages also depend on the daemon tooling in Airflow core, so this PR adjusts those providers as well. Known limitation:
Marking with milestone 3.2.2 assuming the core parts of the PR will be back-ported.
Was generative AI tooling used to co-author this PR?
Claude Opus 4.6