feat: Add cloud deployment example with event replication #389
Summary of Changes

Hello @kabir, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request delivers a cloud deployment example for A2A agents, showcasing a scalable and resilient architecture on Kubernetes. It integrates PostgreSQL for task persistence and Kafka for real-time event replication, allowing multiple agent instances to operate cohesively. The example comes with deployment scripts and a verification client, making it easy to set up and test a distributed A2A agent environment. This directly addresses issues #281 and #373 by providing a concrete implementation for cloud-native A2A agents.
11dc2c1 to
f82bcd3
Compare
Code Review
This pull request introduces a cloud deployment example for the A2A agent using Kubernetes, PostgreSQL, and Kafka. It includes deployment scripts, Kubernetes configurations, and a test client to demonstrate multi-pod behavior with load balancing and event replication. The changes involve adding new files for the deployment setup and modifying the pom.xml to include the new module. The review focuses on ensuring the deployment scripts are robust, the Kubernetes configurations are secure and efficient, and the documentation is clear and comprehensive.
if ! kubectl get crd kafkas.kafka.strimzi.io > /dev/null 2>&1; then
    echo "Installing Strimzi operator..."
    kubectl create -f 'https://strimzi.io/install/latest?namespace=kafka' -n kafka
The script is installing the Strimzi operator without checking if the kafka namespace already exists. This can lead to errors if the namespace was previously created with different configurations. It's recommended to check for the namespace's existence before attempting to create it.
Suggested change:
- if ! kubectl get crd kafkas.kafka.strimzi.io > /dev/null 2>&1; then
-     echo "Installing Strimzi operator..."
-     kubectl create -f 'https://strimzi.io/install/latest?namespace=kafka' -n kafka
+ if ! kubectl get namespace kafka > /dev/null 2>&1; then
+     echo "Creating kafka namespace..."
+     kubectl create namespace kafka
+ fi
We actually have this earlier, on lines 167-170. Also it looks like your suggestion skips installing the operator :-D
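For readers following this thread, here is a minimal sketch of how the namespace check and the operator install could sit side by side so that neither step is skipped. It is not the PR's actual script (lines 167-170 are not quoted in this conversation); the function names `ensure_namespace` and `ensure_strimzi_operator` are hypothetical, and only the `kubectl` commands already quoted above are used.

```shell
#!/usr/bin/env bash
# Sketch only: function names are hypothetical, not taken from the PR's script.

# Create the namespace only if it does not exist yet.
ensure_namespace() {
    local ns="$1"
    if ! kubectl get namespace "$ns" > /dev/null 2>&1; then
        echo "Creating $ns namespace..."
        kubectl create namespace "$ns"
    fi
}

# Install the Strimzi operator only if its Kafka CRD is missing.
# The namespace is ensured first, so the install never targets a missing namespace.
ensure_strimzi_operator() {
    local ns="$1"
    ensure_namespace "$ns"
    if ! kubectl get crd kafkas.kafka.strimzi.io > /dev/null 2>&1; then
        echo "Installing Strimzi operator..."
        kubectl create -f "https://strimzi.io/install/latest?namespace=${ns}" -n "$ns"
    fi
}
```

Calling `ensure_strimzi_operator kafka` would then cover both checks in one place; rerunning it against a cluster that already has the namespace and the CRD performs no `create` calls.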
echo "Waiting for Kafka to be ready (using KRaft mode, typically 2-3 minutes)..."

# Monitor progress while waiting
for i in {1..60}; do
    echo "Checking Kafka status (attempt $i/60)..."
    kubectl get kafka -n kafka -o wide 2>/dev/null || true
    kubectl get pods -n kafka -l strimzi.io/cluster=a2a-kafka 2>/dev/null || true

    if kubectl wait --for=condition=Ready kafka/a2a-kafka -n kafka --timeout=10s 2>/dev/null; then
        echo -e "${GREEN}✓ Kafka deployed${NC}"
        break
    fi

    if [ $i -eq 60 ]; then
        echo -e "${RED}ERROR: Timeout waiting for Kafka${NC}"
        kubectl describe kafka/a2a-kafka -n kafka
        kubectl get events -n kafka --sort-by='.lastTimestamp'
        exit 1
    fi
done
- We loop 60 times (10 minutes total)
- We display status on each attempt
- On timeout, we run kubectl describe and show recent events
- We exit with error code 1
We DO exit on timeout (line 232: exit 1). This is already robust.
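The bounded-retry pattern described in the list above can be factored into a small helper. This is a hedged sketch rather than code from the PR: `retry_until` is a hypothetical name, and the attempt count and delay are passed in by the caller.

```shell
#!/usr/bin/env bash
# Sketch only: `retry_until` is a hypothetical helper, not part of the PR.

# retry_until <max_attempts> <delay_seconds> <command...>
# Runs the command until it succeeds.
# Returns 0 on the first success, 1 once max_attempts have all failed.
retry_until() {
    local max_attempts="$1"
    local delay="$2"
    shift 2
    local attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        echo "Attempt ${attempt}/${max_attempts}..."
        if "$@"; then
            return 0
        fi
        # Pause between attempts, but not after the final failure.
        if [ "$attempt" -lt "$max_attempts" ]; then
            sleep "$delay"
        fi
        attempt=$((attempt + 1))
    done
    return 1
}
```

With such a helper, the wait could be written as `retry_until 60 0 kubectl wait --for=condition=Ready kafka/a2a-kafka -n kafka --timeout=10s`, with the caller running `kubectl describe` and `exit 1` when it returns non-zero. The delay is 0 in that call because `kubectl wait --timeout=10s` already blocks for up to ten seconds per attempt.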
if [ "$($CONTAINER_TOOL inspect -f '{{.State.Running}}' "${REG_NAME}" 2>/dev/null || true)" != 'true' ]; then
    echo "Creating registry container..."
    $CONTAINER_TOOL run \
        -d --restart=always -p "127.0.0.1:${REG_PORT}:5000" --network bridge --name "${REG_NAME}" \
        registry:2
    echo -e "${GREEN}✓ Registry container created${NC}"
The script checks if the registry container is running, but it doesn't verify if the registry is healthy and accessible before pushing images. This can lead to image push failures if the registry is in a bad state. Add a check to verify the registry's health before pushing images.
Suggested change:
- if [ "$($CONTAINER_TOOL inspect -f '{{.State.Running}}' "${REG_NAME}" 2>/dev/null || true)" != 'true' ]; then
-     echo "Creating registry container..."
-     $CONTAINER_TOOL run \
-         -d --restart=always -p "127.0.0.1:${REG_PORT}:5000" --network bridge --name "${REG_NAME}" \
-         registry:2
-     echo -e "${GREEN}✓ Registry container created${NC}"
+ if [ "$($CONTAINER_TOOL inspect -f '{{.State.Running}}' "${REG_NAME}" 2>/dev/null || true)" != 'true' ]; then
+     echo "Creating registry container..."
+     $CONTAINER_TOOL run \
+         -d --restart=always -p "127.0.0.1:${REG_PORT}:5000" --network bridge --name "${REG_NAME}" \
+         registry:2
+     echo -e "${GREEN}✓ Registry container created${NC}"
+ else
+     echo -e "${GREEN}✓ Registry container already running${NC}"
+     # Verify registry is accessible
+     if ! curl -s http://localhost:${REG_PORT}/v2/ > /dev/null 2>&1; then
+         echo -e "${RED}ERROR: Registry not accessible${NC}"
+         exit 1
+     fi
+ fi
We check this on lines 127-135
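The check on lines 127-135 is not quoted in this conversation, so the following is only a sketch of what a registry readiness probe might look like, not the PR's code. It polls the registry's `/v2/` base endpoint, which the suggestion above also targets; `wait_for_registry` and its default attempt count and delay are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: `wait_for_registry` and its defaults are hypothetical.

# wait_for_registry <port> [max_attempts] [delay_seconds]
# Polls the registry's /v2/ endpoint on 127.0.0.1 until it answers successfully.
# Returns 0 once reachable, 1 if every attempt fails.
wait_for_registry() {
    local port="$1"
    local max_attempts="${2:-10}"
    local delay="${3:-1}"
    local attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        # -f makes curl return non-zero on HTTP errors, not only on connection failures.
        if curl -sf --max-time 2 "http://127.0.0.1:${port}/v2/" > /dev/null 2>&1; then
            return 0
        fi
        # Pause between attempts, but not after the final failure.
        if [ "$attempt" -lt "$max_attempts" ]; then
            sleep "$delay"
        fi
        attempt=$((attempt + 1))
    done
    return 1
}
```

Polling matters most right after the container is created, since the registry may take a moment to start listening; a single probe at that point can fail even though the container is healthy.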
kubectl describe kafka/a2a-kafka -n kafka
kubectl get events -n kafka --sort-by='.lastTimestamp'
exit 1
We check this on line 241
Add Kubernetes deployment example demonstrating multi-instance A2A agents with database persistence and Kafka-based event replication.
3b51b16 to
bd665fd
Compare
bd665fd to
78f807d
Compare
…#389)

Add Kubernetes deployment example demonstrating multi-instance A2A agents with database persistence and Kafka-based event replication.

Fixes a2aproject#281 and a2aproject#373

---------

Co-authored-by: Farah Juma <fjuma@redhat.com>