Conversation

@gustavolira
Member

Description

Extract common functions from utils.sh into focused modules to improve maintainability and reduce code duplication.

Which issue(s) does this PR fix?

PR acceptance criteria

Please make sure that the following steps are complete:

  • GitHub Actions are completed and successful
  • Unit Tests are updated and passing
  • E2E Tests are updated and passing
  • Documentation is updated if necessary (requirement for new features)
  • Add a screenshot if the change is UX/UI related

How to test changes / Special notes to the reviewer

@gustavolira
Member Author

/review

@rhdh-qodo-merge

rhdh-qodo-merge bot commented Dec 8, 2025

PR Reviewer Guide 🔍

(Review updated until commit b3ffc23)

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
🏅 Score: 86
🔒 Security concerns

Sensitive information handling:
common::oc_login echoes "OCP version: $(oc version)". If oc is configured to print server URLs, ensure no tokens are included. Also, several functions manipulate secrets (Postgres credentials) and patch files using sed; confirm no secrets are logged. Add explicit checks to avoid echoing sensitive env vars (like K8S_CLUSTER_TOKEN) on failures.
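
A minimal sketch of the kind of guard suggested here, assuming the same variable names as the function under review (the redaction wording is illustrative):

common::oc_login() {
  # Fail early, without printing values, if the required variables are unset.
  if [[ -z "${K8S_CLUSTER_TOKEN:-}" || -z "${K8S_CLUSTER_URL:-}" ]]; then
    log::error "K8S_CLUSTER_TOKEN or K8S_CLUSTER_URL is not set (values intentionally not printed)"
    return 1
  fi

  if ! oc login --token="${K8S_CLUSTER_TOKEN}" --server="${K8S_CLUSTER_URL}" --insecure-skip-tls-verify=true; then
    # Report the failure without echoing the token.
    log::error "oc login to ${K8S_CLUSTER_URL} failed (token redacted)"
    return 1
  fi

  echo "OCP version: $(oc version)"
}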

🔀 Multiple PR themes

Sub-PR theme: Extract common, wait, and operator modules and wire into utils.sh

Relevant files:

  • .ibm/pipelines/lib/common.sh
  • .ibm/pipelines/lib/k8s-wait.sh
  • .ibm/pipelines/lib/operators.sh
  • .ibm/pipelines/utils.sh

Sub-PR theme: Refactor jobs to use common::oc_login and new lib utilities

Relevant files:

  • .ibm/pipelines/jobs/auth-providers.sh
  • .ibm/pipelines/jobs/ocp-nightly.sh
  • .ibm/pipelines/jobs/ocp-operator.sh
  • .ibm/pipelines/jobs/ocp-pull.sh
  • .ibm/pipelines/jobs/upgrade.sh

Sub-PR theme: Documentation updates for modular pipeline libraries

Relevant files:

  • .ibm/pipelines/README.md
  • .ibm/pipelines/lib/README.md

⚡ Recommended focus areas for review

Matching Pods

k8s_wait::deployment picks the first pod whose name matches the resource string via grep. This can select unintended pods (partial matches, multiple replicas) and miss the readiness of Deployments/StatefulSets. Consider scoping by label/owner or using rollout status for Deployments/StatefulSets to avoid flaky waits; a sketch follows the excerpt below.

for ((i = 1; i <= max_attempts; i++)); do
  local pod_name
  pod_name=$(oc get pods -n "$namespace" | grep "$resource_name" | awk '{print $1}' | head -n 1)

  if [[ -n "$pod_name" ]]; then
    local pod_ready_status
    pod_ready_status=$(oc get pod "$pod_name" -n "$namespace" -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}')

    if [[ "$pod_ready_status" == "True" ]]; then
      log::success "Resource '$resource_name' is ready in namespace '$namespace'"
      return 0
    fi

    local container_statuses
    container_statuses=$(oc get pod "$pod_name" -n "$namespace" -o jsonpath='{.status.containerStatuses}')
    log::debug "Pod '$pod_name' status: Ready=$pod_ready_status, Containers=$container_statuses"
  else
    log::debug "No pods found matching '$resource_name' in namespace '$namespace'"
  fi
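
A minimal sketch of the rollout-based alternative described above; k8s_wait::rollout is a hypothetical helper name, and the caller is assumed to pass the resource kind (deployment or statefulset):

k8s_wait::rollout() {
  local kind=$1 name=$2 namespace=$3 timeout=${4:-300s}

  # 'oc rollout status' tracks the workload's own readiness instead of grepping
  # pod names, so partial matches and extra replicas no longer cause flaky waits.
  if oc rollout status "${kind}/${name}" -n "$namespace" --timeout="$timeout"; then
    log::success "${kind}/${name} is ready in namespace '${namespace}'"
    return 0
  fi

  log::error "Timed out waiting for ${kind}/${name} in namespace '${namespace}'"
  return 1
}
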
📚 Focus areas based on broader codebase context

Missing Dependency Check

The function common::oc_login calls log::error, but this module does not source the logging library. Ensure lib/log.sh is sourced or avoid using log:: calls here to prevent runtime errors when used standalone. (Ref 8)

common::oc_login() {
  if ! command -v oc &> /dev/null; then
    log::error "oc command not found. Please install OpenShift CLI."
    return 1
  fi

  oc login --token="${K8S_CLUSTER_TOKEN}" --server="${K8S_CLUSTER_URL}" --insecure-skip-tls-verify=true
  echo "OCP version: $(oc version)"
  return 0
}

Reference reasoning: Other bash utilities in the referenced scripts rely on local logging helpers before calling log functions, demonstrating the pattern of ensuring dependencies are available within the script context.
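
A minimal sketch of that guard, assuming lib/log.sh sits next to lib/common.sh (the fallback definitions are illustrative):

# Source the logging helpers if they are not already loaded, so log:: calls
# keep working when common.sh is used standalone.
if ! declare -F log::error > /dev/null; then
  LIB_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
  if [[ -f "${LIB_DIR}/log.sh" ]]; then
    # shellcheck source=/dev/null
    source "${LIB_DIR}/log.sh"
  else
    # Minimal fallbacks keep the module usable without the logging library.
    log::error() { echo "ERROR: $*" >&2; }
    log::info()  { echo "INFO: $*"; }
  fi
fi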

📄 References
  1. redhat-developer/rhdh-operator/hack/validate-image-digests.sh [1-55]
  2. redhat-developer/rhdh-operator/hack/validate-image-digests.sh [56-69]
  3. redhat-developer/rhdh-operator/hack/validate-image-digests.sh [104-145]
  4. redhat-developer/rhdh-operator/hack/validate-image-digests.sh [71-103]
  5. redhat-developer/rhdh-operator/hack/validate-image-digests.sh [146-157]
  6. redhat-developer/rhdh-operator/hack/validate-image-digests.sh [159-190]
  7. redhat-developer/rhdh-operator/hack/validate-image-digests.sh [192-199]
  8. redhat-developer/rhdh-operator/config/profile/rhdh/plugin-infra/plugin-infra.sh [1-49]

@gustavolira
Member Author

/review
-i
--pr_reviewer.require_score_review=true
--pr_reviewer.require_can_be_split_review=true
--pr_reviewer.require_soc2_ticket=false
--pr_reviewer.num_code_suggestions="2"

@rhdh-qodo-merge

Persistent review updated to latest commit b3ffc23

@gustavolira
Member Author

/improve
--pr_code_suggestions.num_code_suggestions_per_chunk="2"
--pr_code_suggestions.commitable_code_suggestions=true

@github-actions
Contributor

github-actions bot commented Dec 8, 2025

@github-actions
Contributor

github-actions bot commented Dec 8, 2025

@zdrapela
Member

zdrapela commented Dec 9, 2025

Just testing my privileges /approve

@zdrapela
Member

zdrapela commented Dec 9, 2025

/approve

@openshift-ci openshift-ci bot added the approved label Dec 9, 2025
@zdrapela
Member

zdrapela commented Dec 9, 2025

/unapprove

@zdrapela
Member

zdrapela commented Dec 9, 2025

/remove-approve

@openshift-ci openshift-ci bot removed the approved label Dec 9, 2025
@zdrapela
Member

zdrapela commented Dec 9, 2025

/lgtm

gustavolira and others added 21 commits January 21, 2026 08:15
Replaced the hardcoded OPENSHIFT_OPERATORS_NAMESPACE with OPERATOR_NAMESPACE in the cluster_setup_ocp_helm() and cluster_setup_ocp_operator() functions. This change improves flexibility and consistency in the deployment checks for OpenShift Pipelines.
Enhanced the configmap retrieval process by implementing a wait mechanism that checks for the existence of the default dynamic plugins configmap created by the operator. This change allows the script to wait for up to 2.5 minutes before failing, improving reliability in scenarios where the configmap may take time to be created. Additionally, updated error logging to provide more context if the configmap is not found after the wait period.
Removed unnecessary blank lines and adjusted spacing in the configmap retrieval logic to enhance code readability. These changes contribute to a cleaner codebase without altering functionality.
Added detailed diagnostic information to the k8s_wait::deployment function. When a timeout occurs, the script now logs pod status, pod description, pod logs, and recent events in the specified namespace. This improvement aids in troubleshooting deployment issues by providing more context on the state of resources at the time of failure.
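
A hedged sketch of the diagnostics described in this commit; the variable names are assumptions, not the PR's exact code:

# On timeout, dump enough context to debug the failed wait.
log::error "Timed out waiting for '${resource_name}' in namespace '${namespace}'"
oc get pods -n "$namespace" -o wide
oc describe pods -n "$namespace" | tail -n 100
oc logs "deployment/${resource_name}" -n "$namespace" --all-containers --tail=100 || true
oc get events -n "$namespace" --sort-by=.lastTimestamp | tail -n 30
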
Refactored the plugin merging process to intelligently combine custom and default plugins, ensuring that custom plugins take precedence while avoiding conflicts. This change enhances the flexibility of plugin management and preserves the operator's default plugin states.
Updated the plugin merging process to extract default plugins into a separate array and ensure deduplication by package name. This change improves the clarity of the merging strategy and enhances the robustness of plugin management while maintaining custom plugin precedence.
Removed the complex merging process for orchestrator plugins and streamlined the function to focus on waiting for the Backstage resource to be ready. Updated logging to reflect the new approach, enhancing clarity and maintainability of the code.
…nt function

Implemented a wait mechanism to ensure that the external PostgreSQL database is fully ready before proceeding with the RBAC instance deployment. This change enhances the reliability of the deployment process by allowing immediate connection for the database creation job, and includes error logging for deployment failures.
… and deployment

Enhanced the `enable_orchestrator_plugins_op` function to include error handling for the Backstage resource check, logging an error if the resource is not found. Additionally, implemented a wait mechanism in the `deploy_rhdh_operator` function to ensure the Backstage deployment is created by the operator, with appropriate logging for success and warnings for potential asynchronous creation.
…urce in operator deployment

Enhanced the `deploy_rhdh_operator` function to verify the availability of the PostgresCluster CRD before deploying the Backstage resource. Implemented a wait mechanism to ensure the PostgresCluster resource is created by the operator, with detailed logging for success and error scenarios. This change improves the reliability of the deployment process and aids in troubleshooting.
…rator deployment

Updated the `deploy_rhdh_operator` function to wait for either a PostgresCluster or StatefulSet resource to be created by the operator. Improved logging to provide clarity on which resource is being checked and added error handling for cases where neither resource is created within the specified wait time. This change enhances the reliability of the deployment process and aids in troubleshooting.
…sources

Refined the `deploy_rhdh_operator` function to eliminate unnecessary whitespace and improve the clarity of the wait mechanism for database resource creation. This update enhances the readability of the code while maintaining the existing functionality and logging for resource checks.
Refined the `enable_orchestrator_plugins_op` function to improve the process of enabling orchestrator plugins. This update includes extracting and merging custom and default dynamic plugins, applying the merged configmap, and restarting the Backstage deployment. Enhanced logging and error handling were added to ensure clarity and reliability during the plugin enabling process.
…r Backstage resource

Enhanced the `deploy_rhdh_operator` function to log an error if the Backstage deployment is not created within the specified wait time. Added additional logging to check the status of the Backstage CR and the operator logs for better troubleshooting. This change improves the reliability of the deployment process and aids in identifying issues during the Backstage resource creation.
…p function

Updated the plugin merging process in the `enable_orchestrator_plugins_op` function to utilize yq for improved clarity and efficiency. The new implementation merges default and custom plugins while ensuring deduplication by package name, enhancing the robustness of plugin management and maintaining custom plugin precedence.
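
A hedged sketch of the yq-based merge described here, assuming mikefarah yq v4, placeholder file names, the usual dynamic-plugins layout (a top-level plugins list keyed by package), and that unique_by keeps the first occurrence:

# Concatenate the plugins arrays from both files, then keep one entry per
# package name, letting the custom entry (listed last) win on conflicts.
yq eval-all '
  . as $doc ireduce ({}; . *+ $doc)
  | .plugins |= (reverse | unique_by(.package) | reverse)
' default-dynamic-plugins.yaml custom-dynamic-plugins.yaml > merged-dynamic-plugins.yaml
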
…utils.sh

Removed the wait logic for Backstage deployment readiness after enabling orchestrator plugins. Updated logging to clarify that deployment verification will occur in subsequent calls, enhancing the clarity of the process.
Replace timeout bash -c subshells with proper polling loops to fix
log:: functions not working inside subprocesses.

Functions fixed:
- wait_for_svc(): rewritten with polling loop
- wait_for_endpoint(): rewritten with polling loop
- check_operator_status(): rewritten with polling loop
- waitfor_crunchy_postgres_*(): now uses k8s_wait::crd
- waitfor_tekton_pipelines(): now uses k8s_wait::crd
- install_pipelines_operator(): now uses k8s_wait::crd
- delete_tekton_pipelines(): rewritten with polling loop

Benefits:
- log::info/success/error now work correctly
- Consistent polling pattern across all wait functions
- Reduced code duplication by using k8s_wait::crd
- Variables now use 'local' keyword (DISPLAY_NAME → display_name)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
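
A minimal sketch of the polling pattern this commit introduces, using wait_for_svc as the example (timeout and interval values are illustrative):

wait_for_svc() {
  local svc_name=$1 namespace=$2 timeout=${3:-300}
  local interval=5 elapsed=0

  while (( elapsed < timeout )); do
    # Runs in the current shell (no 'timeout bash -c' subshell), so the log::
    # helpers sourced by the pipeline remain available here.
    if oc get service "$svc_name" -n "$namespace" &> /dev/null; then
      log::success "Service '${svc_name}' found in namespace '${namespace}'"
      return 0
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done

  log::error "Timed out after ${timeout}s waiting for service '${svc_name}' in '${namespace}'"
  return 1
}
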
Update all Backstage Custom Resource manifests from v1alpha4 to v1alpha5.

Changes:
- apiVersion: rhdh.redhat.com/v1alpha4 → v1alpha5
- Remove spec.application.image (not supported in v1alpha5)
- Add spec.deployment.patch to override container images
- Configure dynamic-plugins-root volume with 10Gi storage

The v1alpha5 API requires using deployment patches to customize
the container image instead of the direct image field.

Files updated:
- rhdh-start.yaml
- rhdh-start-rbac.yaml
- rhdh-start-runtime.yaml

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
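
A hedged sketch of the v1alpha5 shape this commit describes; only the spec.deployment.patch mechanism is taken from the commit message, while the namespace variable, CR name, container name, and image are illustrative assumptions:

# Apply a Backstage CR that overrides the image via a deployment patch
# instead of the removed spec.application.image field.
oc apply -n "$namespace" -f - <<EOF
apiVersion: rhdh.redhat.com/v1alpha5
kind: Backstage
metadata:
  name: rhdh            # illustrative CR name
spec:
  deployment:
    patch:
      spec:
        template:
          spec:
            containers:
              - name: backstage-backend                    # assumed container name
                image: quay.io/rhdh/rhdh-hub-rhel9:next    # illustrative image
EOF
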
Update the function comment to accurately describe behavior:
- The operator DOES create backstage-dynamic-plugins-* configmap
- The function merges operator defaults with custom plugins
- Custom plugins override defaults on package conflicts

The previous comment incorrectly stated the operator doesn't create
the default configmap, which contradicted the actual code behavior.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The ephemeral volume patch for dynamic-plugins-root was conflicting
with the operator's default volume configuration. Removed the volume
patch and let the operator handle the volume creation automatically
(as it did in v1alpha4).

This fixes the deployment timeout issue in e2e-ocp-operator-nightly
where the backstage pod failed to become ready after enabling
orchestrator plugins.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Temporarily disable the SonarQube dynamic plugin in the showcase sanity values file to address potential issues related to its integration. This change is part of ongoing adjustments to plugin management.

# Conflicts:
#	.ibm/pipelines/value_files/diff-values_showcase-sanity-plugins.yaml
@github-actions
Contributor

The image is available at:

/test e2e-ocp-helm

@sonarqubecloud

@github-actions
Contributor

@openshift-ci

openshift-ci bot commented Jan 23, 2026

@gustavolira: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-aks-helm-nightly 7fef4a8 link false /test e2e-aks-helm-nightly
ci/prow/e2e-ocp-operator-nightly f957dde link false /test e2e-ocp-operator-nightly
ci/prow/e2e-ocp-helm 0d5ff9d link true /test e2e-ocp-helm

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
