# feat: add GCP Infrastructure Manager Terraform modules [9.3] #3880
base: 9.3
## Conversation
### Summary of your changes

Replaces the deprecated GCP Deployment Manager with the modern Infrastructure Manager (Terraform) for deploying the Elastic Agent CSPM integration. Provides identical resources with improved tooling and user experience.

#### New Directory: `deploy/infrastructure-manager/gcp-elastic-agent/`

Files added:

- `main.tf` - Main infrastructure configuration (compute instance, network, service account, IAM bindings)
- `variables.tf` - Input variable definitions
- `outputs.tf` - Deployment outputs
- `service_account.tf` - Standalone service account deployment for agentless mode
- `terraform.tfvars.example` - Example configuration for the main deployment
- `service_account.tfvars.example` - Example configuration for SA-only deployment
- `README.md` - Comprehensive deployment guide

#### Resources Created

Identical to the Deployment Manager implementation:

- Compute instance (Ubuntu, n2-standard-4, 32 GB disk) with Elastic Agent pre-installed
- Service account with `roles/cloudasset.viewer` and `roles/browser`
- VPC network with auto-created subnets
- IAM bindings (project or organization scope)
- Optional SSH firewall rule

#### Compatibility

The new deployment script `infrastructure-manager/deploy.sh` is compatible with a Kibana deployment command of the form:

```bash
gcloud config set project elastic-security-test && \
FLEET_URL=https://a6f784d2fb4d48bea7724fbe41ef17d3.fleet.us-central1.gcp.qa.elastic.cloud:443 \
ENROLLMENT_TOKEN=<REDACTED> \
STACK_VERSION=9.2.3 \
./deploy.sh
```

### Related Issues

- Resolves: elastic#3132

(cherry picked from commit fdf76cc)
### Summary of your changes

Adds a new method for deploying GCP service account credentials using GCP Infrastructure Manager (Terraform-based) as an alternative to the existing Deployment Manager approach in `deploy/deployment-manager/`. The key improvement is that service account keys are now stored securely in Secret Manager rather than being exposed in deployment outputs.

The script creates a service account with the `cloudasset.viewer` and `browser` roles, stores the JSON key in Secret Manager, and retrieves it locally to `KEY_FILE.json` for use in the Elastic Agent GCP integration. It supports both project-level and organization-level deployments via the `ORG_ID` environment variable.

### Checklist

- [ ] I have added tests that prove my fix is effective or that my feature works
- [ ] I have added the necessary README/documentation (if appropriate)

#### Introducing a new rule?

- [ ] Generate rule metadata using [this script](https://github.com/elastic/cloudbeat/tree/main/security-policies/dev#generate-rules-metadata)
- [ ] Add relevant unit tests
- [ ] Generate relevant rule templates using [this script](https://github.com/elastic/cloudbeat/tree/main/security-policies/dev#generate-rule-templates), and open a PR in [elastic/packages/cloud_security_posture](https://github.com/elastic/integrations/tree/main/packages/cloud_security_posture)

(cherry picked from commit e20e115)
…y script (elastic#3865)

Refactors the error handling in the GCP Elastic Agent Infrastructure Manager deployment script to use a more idiomatic shell pattern: the explicit exit-code capture and conditional check is replaced with a direct `if !` pattern around the `gcloud` command, making the script more readable and in line with shell scripting best practices.

(cherry picked from commit 59c61ae)
Pull request overview
This PR adds comprehensive GCP Infrastructure Manager Terraform modules for deploying Elastic Agent on GCP. It provides two deployment options: a full elastic-agent deployment with VM and service account, and a lightweight credentials-json module for creating service accounts with JSON keys for credential-based authentication.
Changes:
- Added modular Terraform configurations for GCP Infrastructure Manager deployments
- Implemented automated deploy scripts with environment setup and validation
- Created startup validation mechanism using GCP guest attributes to ensure successful agent installation
Reviewed changes
Copilot reviewed 23 out of 23 changed files in this pull request and generated 15 comments.
| File | Description |
|---|---|
| deploy/infrastructure-manager/gcp-elastic-agent/variables.tf | Defines input variables for elastic-agent deployment including fleet configuration and validation settings |
| deploy/infrastructure-manager/gcp-elastic-agent/main.tf | Main Terraform configuration orchestrating service account, compute instance, and validation modules |
| deploy/infrastructure-manager/gcp-elastic-agent/outputs.tf | Exposes deployment outputs including instance details and service account information |
| deploy/infrastructure-manager/gcp-elastic-agent/setup.sh | Bootstrap script for enabling required GCP APIs and configuring service accounts |
| deploy/infrastructure-manager/gcp-elastic-agent/deploy.sh | Main deployment script handling environment variables and Infrastructure Manager deployment |
| deploy/infrastructure-manager/gcp-elastic-agent/deploy_service_account.sh | Convenience wrapper for credentials-json deployment |
| deploy/infrastructure-manager/gcp-elastic-agent/README.md | Comprehensive documentation for deployment options and troubleshooting |
| deploy/infrastructure-manager/gcp-elastic-agent/modules/compute_instance/* | Module for creating GCP compute instance with Elastic Agent startup script |
| deploy/infrastructure-manager/gcp-elastic-agent/modules/service_account/* | Module for creating service account with appropriate IAM bindings |
| deploy/infrastructure-manager/gcp-elastic-agent/modules/startup_validation/* | Module for validating startup script completion via guest attributes |
| deploy/infrastructure-manager/gcp-credentials-json/* | Standalone module for creating service account with JSON key credentials |
```bash
response=$(curl -s -H "Authorization: Bearer $TOKEN" \
  "https://compute.googleapis.com/compute/v1/projects/${PROJECT_ID}/zones/${ZONE}/instances/${INSTANCE_NAME}/getGuestAttributes?queryPath=elastic-agent/$key" \
  2>/dev/null || echo '{}')
```
**Copilot AI** commented on Jan 22, 2026:
The validation script passes sensitive information (TOKEN) to curl without proper precautions. While the script uses 'set +x' to avoid logging, the token is still visible in process listings. Consider using a more secure method like storing it in a temporary file with restricted permissions or using stdin for the Authorization header.
Suggested change:

```diff
-response=$(curl -s -H "Authorization: Bearer $TOKEN" \
-  "https://compute.googleapis.com/compute/v1/projects/${PROJECT_ID}/zones/${ZONE}/instances/${INSTANCE_NAME}/getGuestAttributes?queryPath=elastic-agent/$key" \
-  2>/dev/null || echo '{}')
+response=$(printf 'Authorization: Bearer %s\n' "$TOKEN" | \
+  curl -s -H @- \
+  "https://compute.googleapis.com/compute/v1/projects/${PROJECT_ID}/zones/${ZONE}/instances/${INSTANCE_NAME}/getGuestAttributes?queryPath=elastic-agent/$key" \
+  2>/dev/null || echo '{}')
```
```bash
# Accept parameters
PROJECT_ID="$1"
SERVICE_ACCOUNT="$2"
```
**Copilot AI** commented on Jan 22, 2026:
The script doesn't validate that required parameters (PROJECT_ID and SERVICE_ACCOUNT) are provided before using them. If these are empty or missing, the script will fail with unclear error messages. Consider adding validation at the beginning of the script to check if parameters are provided.
Suggested change:

```diff
 SERVICE_ACCOUNT="$2"
+
+# Validate required parameters
+if [ -z "${PROJECT_ID}" ] || [ -z "${SERVICE_ACCOUNT}" ]; then
+  echo "Usage: $0 <PROJECT_ID> <SERVICE_ACCOUNT_NAME>" >&2
+  echo "Error: PROJECT_ID and SERVICE_ACCOUNT are required." >&2
+  exit 1
+fi
```
```bash
# Required environment variables (no defaults - must be provided)
# FLEET_URL, ENROLLMENT_TOKEN, STACK_VERSION
```
**Copilot AI** commented on Jan 22, 2026:
The script doesn't validate required environment variables (FLEET_URL, ENROLLMENT_TOKEN, STACK_VERSION) before using them. While there are comments indicating they're required, the script should explicitly check these variables and provide clear error messages if they're missing before attempting deployment.
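A minimal sketch of the fail-fast check this comment asks for, assuming the script runs under bash (the `${!name}` indirect expansion is bash-specific); `require_env` is a hypothetical helper name, not part of the PR:

```shell
#!/usr/bin/env bash
# Hypothetical helper: complain clearly when a required environment
# variable is unset or empty.
require_env() {
  local name="$1"
  if [ -z "${!name}" ]; then   # bash indirect expansion
    echo "Error: required environment variable ${name} is not set." >&2
    return 1
  fi
}

# Check every required variable up front so the user sees all missing
# names at once instead of a cryptic failure mid-deployment.
missing=0
for v in FLEET_URL ENROLLMENT_TOKEN STACK_VERSION; do
  require_env "$v" || missing=1
done
if [ "$missing" -ne 0 ]; then
  echo "Set the variables above and rerun the deploy script." >&2
fi
```

In the real `deploy.sh` the final branch would `exit 1` instead of only printing, so deployment never starts with an incomplete environment.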
```hcl
# Determine install command based on version
install_command = startswith(var.elastic_agent_version, "9.") ? "sudo ./elastic-agent install --non-interactive --install-servers" : "sudo ./elastic-agent install --non-interactive"
```
**Copilot AI** commented on Jan 22, 2026:
The version check uses 'startswith' to distinguish version 9.x from earlier versions, but this approach may break if Elastic Agent introduces breaking changes in future major versions (e.g., 10.x, 11.x). Consider a more robust version comparison or explicitly document which versions are supported.
Suggested change (the pattern below drops the capture group, since Terraform's `regex()` returns a list when the pattern contains unnamed capture groups, which would make `tonumber()` fail):

```diff
-# Determine install command based on version
-install_command = startswith(var.elastic_agent_version, "9.") ? "sudo ./elastic-agent install --non-interactive --install-servers" : "sudo ./elastic-agent install --non-interactive"
+# Extract the numeric major version from the Elastic Agent version string
+elastic_agent_major = tonumber(regex("^[0-9]+", var.elastic_agent_version))
+
+# Determine install command based on major version
+install_command = local.elastic_agent_major == 9 ? "sudo ./elastic-agent install --non-interactive --install-servers" : "sudo ./elastic-agent install --non-interactive"
```
```bash
KEY_FILE="KEY_FILE.json"
if ! gcloud secrets versions access latest --secret="${SECRET_NAME}" --project="${PROJECT_ID}" | base64 -d >"${KEY_FILE}"; then
  echo -e "${RED}Error: Failed to retrieve key from Secret Manager.${RESET}"
  exit 1
fi
```
**Copilot AI** commented on Jan 22, 2026:
The KEY_FILE.json is written to the current directory without checking if the file already exists or setting restrictive permissions. This could accidentally overwrite existing files or leave sensitive credentials with overly permissive file permissions. Consider checking for file existence, prompting before overwriting, and setting file permissions to 600 after creation.
Suggested change:

```diff
 KEY_FILE="KEY_FILE.json"
+
+# Prevent accidental overwrite of an existing key file
+if [ -e "${KEY_FILE}" ]; then
+  echo -e "${RED}Error: ${KEY_FILE} already exists. Refusing to overwrite existing file.${RESET}"
+  echo "Please move or remove the existing file and rerun this script."
+  exit 1
+fi
+
+# Ensure the key file is created with restrictive permissions (600-equivalent)
+OLD_UMASK=$(umask)
+umask 077
 if ! gcloud secrets versions access latest --secret="${SECRET_NAME}" --project="${PROJECT_ID}" | base64 -d >"${KEY_FILE}"; then
+  umask "${OLD_UMASK}"
   echo -e "${RED}Error: Failed to retrieve key from Secret Manager.${RESET}"
   exit 1
 fi
+umask "${OLD_UMASK}"
```
```diff
@@ -0,0 +1,4 @@
+output "validated" {
+  description = "Whether validation was performed and succeeded"
```
**Copilot AI** commented on Jan 22, 2026:
The output value always returns var.enabled, which simply reflects whether validation was enabled, not whether it actually succeeded. If validation is enabled but fails, this output would still show true. Consider tracking the actual validation result through the terraform_data resource or documenting this limitation clearly.
Suggested change:

```diff
-  description = "Whether validation was performed and succeeded"
+  description = "Whether validation was requested (var.enabled); does not indicate whether validation actually succeeded"
```
```bash
INSTANCE_NAME="$3"
TIMEOUT="$4"

MAX_ATTEMPTS=$((TIMEOUT / 10))
```
**Copilot AI** commented on Jan 22, 2026:
If TIMEOUT is set to a value less than 10 (e.g., 5), the calculation MAX_ATTEMPTS=$((TIMEOUT / 10)) will result in 0, causing the while loop to never execute. Consider adding validation to ensure TIMEOUT is at least 10, or adjust the logic to handle small timeout values appropriately.
Suggested change:

```diff
-MAX_ATTEMPTS=$((TIMEOUT / 10))
+# Validate TIMEOUT and ensure at least one attempt
+if ! [[ "$TIMEOUT" =~ ^[0-9]+$ ]] || [ "$TIMEOUT" -le 0 ]; then
+  echo "Invalid TIMEOUT value: '$TIMEOUT'. It must be a positive integer (seconds)." >&2
+  exit 1
+fi
+if [ "$TIMEOUT" -lt 10 ]; then
+  MAX_ATTEMPTS=1
+else
+  MAX_ATTEMPTS=$((TIMEOUT / 10))
+fi
```
```hcl
resource "google_secret_manager_secret_version" "sa_key" {
  secret      = google_secret_manager_secret.sa_key.id
  secret_data = google_service_account_key.elastic_agent_key.private_key
```
**Copilot AI** commented on Jan 22, 2026:
The service account key private_key is stored directly in Secret Manager. While Secret Manager is secure, the key is stored in base64-encoded format in the Terraform state file. Ensure that remote state with encryption is configured for production use, or document this security consideration in the README.
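One way to act on this, sketched with a hypothetical bucket name: configure a GCS remote backend so the state file (and the base64-encoded key inside it) lives in a bucket protected by IAM and at-rest encryption rather than on a local disk. Note that backend blocks cannot reference variables; a customer-supplied encryption key can instead be passed to the `gcs` backend via the `GOOGLE_ENCRYPTION_KEY` environment variable.

```hcl
terraform {
  backend "gcs" {
    # Hypothetical bucket; GCS encrypts objects at rest by default and
    # access is gated by the bucket's IAM policy.
    bucket = "example-org-terraform-state"
    prefix = "infrastructure-manager/gcp-credentials-json"
    # For an additional customer-supplied AES-256 key, export
    # GOOGLE_ENCRYPTION_KEY rather than hard-coding it here.
  }
}
```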
```diff
@@ -0,0 +1,61 @@
+#!/bin/bash
+set -e
```
**Copilot AI** commented on Jan 22, 2026:
The script uses 'set +x' on line 3 after 'set -e' on line 2, which disables debug output. This appears intentional to avoid logging the sensitive GCP_ACCESS_TOKEN, which is good for security. However, the comment should clarify this is for security reasons to prevent token exposure in logs.
Suggested change:

```diff
 set -e
+# Disable xtrace to avoid logging sensitive values such as GCP_ACCESS_TOKEN in shell debug output
```
```bash
# Download Elastic Agent
ElasticAgentArtifact=elastic-agent-${var.elastic_agent_version}-linux-x86_64
ARTIFACT_URL="${var.elastic_artifact_server}/$ElasticAgentArtifact.tar.gz"
log "Downloading Elastic Agent from $ARTIFACT_URL"
if ! curl -f -L -O --connect-timeout 30 --max-time 300 "$ARTIFACT_URL"; then
  report_failure "Failed to download Elastic Agent from $ARTIFACT_URL"
fi
log "Download successful"

# Verify download
if [ ! -f "$ElasticAgentArtifact.tar.gz" ]; then
  report_failure "Downloaded file not found: $ElasticAgentArtifact.tar.gz"
fi

# Extract archive
log "Extracting $ElasticAgentArtifact.tar.gz"
if ! tar xzvf "$ElasticAgentArtifact.tar.gz"; then
  report_failure "Failed to extract $ElasticAgentArtifact.tar.gz"
fi

# Verify extraction
if [ ! -d "$ElasticAgentArtifact" ]; then
  report_failure "Extracted directory not found: $ElasticAgentArtifact"
fi
cd "$ElasticAgentArtifact"

# Install Elastic Agent
log "Installing Elastic Agent with command: ${local.install_command}"
if ! ${local.install_command} --url=${var.fleet_url} --enrollment-token=${var.enrollment_token}; then
  report_failure "Elastic Agent installation command failed"
fi
```
**Copilot AI** commented on Jan 22, 2026:
The startup script for the compute instance downloads and executes the Elastic Agent binary directly from a mutable URL without any checksum or signature verification. If var.elastic_artifact_server is pointed at a malicious endpoint or the artifact host/DNS/TLS is compromised, an attacker could deliver and run arbitrary code as root on this VM via curl + tar + elastic-agent install. Add explicit integrity verification (e.g., pin the expected artifact hash or verify a vendor signature before extraction and installation) and restrict elastic_artifact_server to trusted artifact hosts only.
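A sketch of one possible mitigation, assuming the artifact server publishes a companion `<artifact>.tar.gz.sha512` checksum file next to each tarball (as Elastic's release artifact server does); the demo verifies a local stand-in file rather than a real download:

```shell
#!/usr/bin/env bash
# Sketch: verify a downloaded artifact against its published .sha512 file
# before extracting or executing anything from it.
verify_artifact() {
  local artifact="$1"
  # sha512sum -c reads "<hash>  <filename>" lines and re-hashes the file.
  if ! sha512sum -c "${artifact}.sha512"; then
    echo "Checksum mismatch for ${artifact}; refusing to install." >&2
    return 1
  fi
}

# Demo with a local file standing in for the downloaded tarball.
printf 'agent-bytes' > demo-agent.tar.gz
sha512sum demo-agent.tar.gz > demo-agent.tar.gz.sha512
verify_artifact demo-agent.tar.gz
```

In the startup script, this check would run between the `curl` download and `tar xzvf`, calling `report_failure` instead of printing a plain error; pinning the expected hash in a Terraform variable would additionally protect against a compromised artifact host.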
Adds GCP Infrastructure Manager Terraform modules for deploying Elastic Agent on GCP, providing two deployment options: elastic-agent with a service account for VM-based deployments, and credentials-json for credential-based authentication. The implementation includes modular Terraform configurations with compute instance, service account, and startup validation components, along with comprehensive deploy scripts that handle environment setup, terraform initialization, and deployment automation with proper error handling.