The CI system can be controlled with a few commands as comments in pull requests:
[citest]- Trigger a re-test for all machines[citest bad]- Trigger a re-test for all machines with an error or failure status[citest pending]- Trigger a re-test for all machines with a pending status[citest commit:<sha1>]- Whitelist a commit to be tested if the submitter is not trusted
It is also possible to whitelist all commits in a PR with the needs-ci tag.
However, this should be only used with trusted submitters since they can still
change the commits in the PR.
To make changes to the CI environment, the changes should be first committed
to the master branch, and deployed to the staging environment. Once they
are confirmed to be working, the changes should be merged to the production
branch and deployed to the production environment. For the centos7 based
tests, use linux-system-roles-centos7 for production and use
linux-system-roles-centos7-staging for staging.
Steps:
- Make changes to
masterbranch. - Git commit.
- Git push to your github fork.
- Deploy changes to the
stagingenvironment. - If you want to test changes to the run-tests script that have not been
merged into the upstream
masterbranch:
$ oc edit bc linux-system-roles-staging
# look for the source.git.ref - change the uri to point to your fork
$ oc start-build --follow linux-system-roles-staging
... build logs ...
# if the build succeeds, a new staging pod should rollout - the
# dc has a trigger on new images - if not, do this
$ oc rollout latest dc/linux-system-roles-staging
- Test one or more PRs to ensure that the staging changes are working. Be sure to check the staging pod logs for problems too.
- If there are errors in building or testing, repeat from Step 1.
- Submit a PR against
test-harnessin github for themasterbranch. - Once the PR is merged,
git pullthe changes to your localmasterbranch. - Deploy the changes to the
stagingenvironment. (If you edited the buildconfig to test your unmerged changes, this step will replace them) - Merge the changes from the
masterbranch toproductionbranch and submit a PR againsttest-harnessin github for theproductionbranch. - Once the PR is merged,
git pullthe changes to your localproductionbranch. - Deploy the changes to the
productionenvironment.
A podman container which runs integration tests for open pull requests on
linux system roles repositories. It runs
all playbooks matching tests/tests*.yml (intentionally identical to
the pattern used in the Fedora Standard Test Interface) of the
repository against various virtual machines.
If using OpenShift, see below.
To build a container for local testing, run
buildah [--log-level debug] bud -f Dockerfile[.ext] -t lsr:{latest,$name} .
If buildah seems to hang, Ctrl-C, then rerun with the log level. For example
buildah --log-level debug bud -f Dockerfile -t lsr:latest .
You can also use podman or docker.
podman build -t linuxsystemroles/test-harness:latest .
If podman isn't working, try docker:
sudo docker build -t linuxsystemroles/test-harness:latest .
Multiple of these containers can be started in parallel. They try to not step on each others feet by synchronizing via GitHub commit statuses.
The test runner needs a config.json in the /config mount, which specifies
the repositories to monitor for open pull requests, the cloud images to use,
a location where to copy (via ssh) the test results to, and a publicly
accessible URL for these results. For example:
{
"repositories": [
"linux-system-roles/network",
"linux-system-roles/selinux",
"linux-system-roles/timesync",
"linux-system-roles/tuned",
"linux-system-roles/kdump",
"linux-system-roles/firewall",
"linux-system-roles/postfix"
],
"images": [
{
"name": "fedora-27",
"source": "https://download.fedoraproject.org/pub/fedora/linux/releases/27/CloudImages/x86_64/images/Fedora-Cloud-Base-27-1.6.x86_64.qcow2",
"upload_results": true,
"setup": "dnf install -yv python2 python2-dnf libselinux-python"
},
{
"name": "fedora-28",
"source": "https://download.fedoraproject.org/pub/fedora/linux/releases/28/Cloud/x86_64/images/Fedora-Cloud-Base-28-1.1.x86_64.qcow2",
"upload_results": true,
"min_ansible_version": "2.7",
"setup": [
{
"name": "Setup",
"hosts": "all",
"become": true,
"gather_facts": false,
"tasks": [
{"raw": "dnf install -yv python2 python2-dnf libselinux-python"}
]
}
]
},
{
"name": "centos-6",
"source": "https://cloud.centos.org/centos/6/images/CentOS-6-x86_64-GenericCloud-1804_02.qcow2c",
"upload_results": true
},
{
"name": "centos-7",
"source": "https://cloud.centos.org/centos/7/images/CentOS-7-x86_64-GenericCloud-1804_02.qcow2c",
"upload_results": true
}
],
"results": {
"destination": "user@example.org:public_html/logs",
"public_url": "https://example.com/logs"
},
"logging": {
"level": "WARNING",
"filename": "/var/log/test-harness_HOSTNAME.log",
"format": "%(asctime)s: %(levelname)s: %(message)s",
"datefmt": "%Y-%m-%dT%H:%M:%S%z",
"style": "%",
"file_level": "NOTSET",
"stderr_level": "WARNING"
}
}The setup key contains either a list of Ansible plays which will be saved as
a playbook and executed before the test run, or a single shell command which
will be executed using the Ansible raw module (so the two examples above are
exactly equivalent).
Use the min_ansible_version key to indicate the minimum version of ansible
that can be used. For example, if you need to use ansible version 2.8 or
later to manage Fedora 32 hosts, use something like this:
{
"name": "fedora-32",
"source": "http://download.fedoraproject.org/pub/fedora/linux/releases/32/Cloud/x86_64/images/Fedora-Cloud-Base-32-1.6.x86_64.qcow2",
"upload_results": true,
"min_ansible_version": "2.8"
},This means the CI environment will run tests using only ansible 2.8 or later to manage Fedora 32 hosts.
The container needs a /secrets mount, which must contain these files:
github-token: A GitHub access token withpublic_repo,repo:statusandread:orgscopes. Theread:orgpermission is required to identify users as members to organiziations in case they set their organization membership to private. Without it, the test harness can still update CI status but might ignore pull requests or command comments unexpectedly.id_rsaandid_rsa.pub: A ssh key with to access the server inresults.destination.known_hosts: A sshknown_hostsfile which contains the public key of that server.
The test runner writes images into the /cache mount and reuses existing ones.
The container must have access to /dev/kvm for fast virtualization.
In this section we first describe how to setup an OpenShift environment and run the CI system on it. Then we give several tips concerning maintenance the entire system and backing up important data.
In our use case, we create OpenShift project with two deployments. The first one will be used for testing and will be based on stable version of the CI. The second one, based on staging version of the CI, will be used to test new features of the CI before they will become part of the stable version (i.e. before merging them into the master branch).
Before start, get the config.json and config-staging.json. Then create an
empty directory and put the secrets inside it (files github-token, id_rsa,
id_rsa.pub and known_hosts discussed before). The requirements put on
config{,-staging}.json and secrets were discussed above.
In what follows, suppose you have set following environment variables:
PROJECT_NAME: A name of OpenShift project.CONFIG_PATH: A path to directory withconfig.jsonandconfig-staging.jsonSECRETS_PATH: A path to the directory with secrets.SCC_NAME: The name of the SecurityContextConstraints that you will add your ServiceAccount to for privileged operations (e.g.privileged)
First, you need to be logged on (replace [URL] with a real URL of your
OpenShift server):
$ oc login [URL]
NOTE: Ansible will be running all of the commands on localhost, instead of
executing the tasks remotely in the cluster. That is, the Ansible playbook
will be using the oc OpenShift command line tool, and the python OpenShift
API library, to execute API calls in the OpenShift cluster to provision and
configure the resources in the OpenShift cluster. So make sure you are logged
into the OpenShift cluster first:
$ oc login [URL]
Then execute the playbook:
$ ansible-playbook [-e test_harness_secrets_dir=$SECRETS_PATH] \
[-e test_harness_config_dir=$CONFIG_PATH] \
-e test_harness_scc=$SCC_NAME \
ansible/openshift-playbook.yml
Parameters:
test_harness_secrets_dir- Default$HOME/rhel-system-roles/test-harness-config/secrets- SeeSECRETS_PATHtest_harness_config_dir- Default$HOME/rhel-system-roles/test-harness-config/config- SeeCONFIG_PATHtest_harness_scc- Required - SeeSCC_NAMEtest_harness_namespace- Defaultlsr-test-harnesstest_harness_sa- Defaultsystem:serviceaccount:{{ test_harness_namespace }}:testertest_harness_need_node_selector- Defaulttruetest_harness_run_as_root- Defaulttruetest_harness_node_selector- Default{"system-roles-ci": "true"}test_harness_use_staging- Defaulttrueif on the staging branch,falseotherwisetest_harness_use_production- Defaultfalse- Must explicitly set and be on the production/master branch
The environment, staging or production, that will be deployed/updated depends
on which git branch you have checked out. If you are on the staging branch
(currently master), then the playbook will deploy/update the staging
environment. If you are on the production branch (currently production),
and have explicitly set test_harness_use_production to true, then the
playbook will deploy/update the production environment. You will get an error
if not on one of these branches. You will get an error if you have
uncommitted git changes (see git status -uno --porcelain). It will also
change the DeploymentConfig to use the nodeSelector above, and will configure
the DeploymentConfig to run as root.
Make sure you are logged into the OpenShift cluster first:
$ oc login [URL]
Create a fresh OpenShift project:
$ oc new-project ${PROJECT_NAME} --display-name=${PROJECT_NAME}
The --display-name parameter is optional and it is what is displayed in
OpenShift web console.
If you have already OpenShift project created and you want to use it, you can select it by typing:
$ oc project ${PROJECT_NAME}
Now its time to create objects. Type:
$ oc create -f ./openshift-objects.yml
This will create ServiceAccount object named tester and ImageStream and
DeploymentConfig objects both named linux-system-roles for us, just as they
are specified in ./openshift-objects.yml. Be careful: if some of the object
does exist so far, the create command ends with an error. In this case, you
can use replace command to just replace the existing object (suppose that
./openshift-object.yml contains a specification of object to be replaced):
$ oc replace -f ./openshift-object.yml
Because we need also staging deployment, lets create DeploymentConfig object
named linux-system-roles-staging:
$ oc create -f ./openshift-objects-staging.yml
Note: If ImageStream was created from an older version of
./openshift-objects.yml, it is possible that staging tag is missing. To
check this, please type (is refers to ImageStreams resource,
linux-system-roles is the name of ImageStream we want to inspect):
$ oc get is linux-system-roles -o jsonpath='{range .spec.tags[*]}{.name}{"\n"}{end}'
latest
staging
$
If you are not seeing staging in the above command output, then you must add
staging tag by yourself. To do this, type:
$ ./add-staging-tag linux-system-roles
and now you should have staging tag in your linux-system-roles image
stream.
The CI system launches virtual machines (VMs for short), so the container
needs to have /dev/kvm device (in other words, the container must be
privileged). Containers are launched using service account named tester.
Such account must be present in a list of privileged account. If you are
a cluster-admin, you can simply type
$ oc edit scc $SCC_NAME
and then add under the users list a user with the name tester. Please keep
the following format for items in the users list:
system:serviceaccount:${PROJECT_NAME}:tester
Tip: oc edit will open file to be edited in vi, which is a default
editor. To change this behavior, set OC_EDITOR environment variable to point
to your favorite editor.
If you are done with editing, the snippet from your editor should looks like
this (if you named your project lsr-test-harness):
users:
- system:serviceaccount:lsr-test-harness:testerAfter you save and quit the editor, tester should become the new privileged
user. You can check this with
$ oc get scc -o jsonpath='{range .items[?(.allowPrivilegedContainer==true)].users[*]}{@}{"\n"}{end}' | grep -Ee ':tester$'
system:serviceaccount:lsr-test-harness:tester
$
If you are not a cluster-admin, ask someone more powerful to make tester
privileged user for you.
Some OpenShift servers selects nodes in which deployments should run according
to node selectors. For example, if you need to run your deployments on bare
metal node and OpenShift server selects such a node for deployments that have
node selector set to system-roles-ci = "true", run the following command for
both linux-system-roles and linux-system-roles-staging deployment configs:
$ ./apply-node-selectors linux-system-roles
$ ./apply-node-selectors linux-system-roles-staging
If you need to run containers as a root, run:
$ ./grant-root-privileges linux-system-roles
$ ./grant-root-privileges linux-system-roles-staging
Now its time to create ConfigMap and Secret objects. This will create two
ConfigMap objects named config and config-staging and one Secret object
named secrets:
$ oc create configmap config --from-file=config.json=${CONFIG_PATH}/config.json
$ oc create configmap config-staging --from-file=config.json=${CONFIG_PATH}/config-staging.json
$ oc create secret generic secrets --from-file=${SECRETS_PATH}
Once all required objects are created, import the images to kick up our deployments:
$ oc import-image linux-system-roles
$ oc import-image linux-system-roles:staging
Congratulations! Now you can check via OpenShift web console that the things are properly set up. You can browse the statuses, scale a number of pods in each deployment up or down and do many other operations.
If you want to turn on debug logging for a particular deployment, just set
TEST_HARNESS_DEBUG=true in the deployment config e.g.
oc set env dc/linux-system-roles TEST_HARNESS_DEBUG=true
If you need to do more fine-grained log level setting, use oc edit configmap config or oc edit configmap config-staging and edit the "logging" section -
change both the "level" and the "stderr_level" or "file_level". You will
need to rollout the dc for the change to go into effect (there is no trigger
for configmap changes):
$ oc edit configmap config
...
$ oc rollout latest dc/linux-system-roles
$ oc get pods -w
For staging:
$ oc edit configmap config-staging
...
$ oc rollout latest dc/linux-system-roles-staging
$ oc get pods -w
When using file logging e.g.
"logging": {
"filename": "/var/log/test-harness_HOSTNAME.log",
"file_level": "DEBUG",
The file is written inside the container. If you want to see the file, you will need
to use oc exec or oc rsh:
$ oc get pods
NAME READY STATUS RESTARTS AGE
linux-system-roles-12-ddtrs 1/1 Running 57 55d
...
$ oc exec linux-system-roles-12-ddtrs -- cat /var/log/test-harness_linux-system-roles-12-ddtrs.log
The run-tests script can be controlled via command line arguments,
environment variables, and the config.json config file. When running in a pod
in a Kubernetes/OpenShift environment, the easiest way to customize the pod is
via environment variables. The environment variables correspond to the
command line arguments to run-tests. In general, the name of the environment
variable is the name of the command line argument, in all upper case, with _
instead of -, and with the prefix TEST_HARNESS_. For example, the command
line option --use-images can be set via the env. var.
TEST_HARNESS_USE_IMAGES. See run-tests usage for the complete list.
The precedence is:
- command line args are highest precedence
- environment variables are next highest
- settings in the config.json file are next highest
- defaults are lowest precedence
For example, in a running OpenShift cluster, if you wanted to use a custom variant in the string used for the PR status:
oc set env dc/linux-system-roles-staging TEST_HARNESS_VARIANT=my_custom_name
This will cause the staging pods to be redeployed and use my_custom_name for
the PR status reporting string e.g. centos-7/ansible-29 (my_custom_name).
To reset:
oc set env dc/linux-system-roles-staging TEST_HARNESS_VARIANT-