Creating Addons
- 1: Monitoring Addons
- 1.1: SLO Dashboards
- 1.2: Dead Man's Snitch Operator Integration
- 1.3: PagerDuty Integration
- 1.4: OCM SendGrid Service Integration
- 2: Testing Addons
- 2.1: Installing a specific version of an Addon in a staging environment
- 2.2: Testing With OCP (Without OCM)
- 2.3: Testing With OSD-E2E
- 3: Top Level Operator
- 3.1: Customer Notifications
- 3.2: Dependencies
- 3.3: Environments
- 3.4: Plug and Play Addon
- 4: managed-tenants Repository
- 5: SKU
1 - Monitoring Addons
1.1 - SLO Dashboards
Development teams are required to co-maintain, in conjunction with the MT-SRE Team, SLO Dashboards for the Addons they develop. This document explains how to bootstrap the dashboard creation and deployment.
First Dashboard
- Fork/clone the managed-tenants-slos repository.
- Create the following directory structure:
.
├── <addon-name>
│   ├── dashboards
│   │   └── <addon-name>-slo-dashboard.configmap.yaml
│   └── OWNERS
Example OWNERS:
approvers:
- akonarde
- asegundo
<addon-name>-slo-dashboard.configmap.yaml contents (replace all occurrences of <addon-name>):
apiVersion: v1
kind: ConfigMap
metadata:
name: <addon-name>-slo-dashboard
labels:
grafana_dashboard: "true"
annotations:
grafana-folder: /grafana-dashboard-definitions/Addons
data:
mtsre-rhods-slos.json: |
{
"annotations": {
"list": [
{
"builtIn": 1,
"datasource": {
"type": "grafana",
"uid": "-- Grafana --"
},
"enable": true,
"hide": true,
"iconColor": "rgba(0, 211, 255, 1)",
"name": "Annotations & Alerts",
"target": {
"limit": 100,
"matchAny": false,
"tags": [],
"type": "dashboard"
},
"type": "dashboard"
}
]
},
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 0,
"links": [],
"liveNow": false,
"panels": [
{
"datasource": {
"type": "prometheus",
"uid": "4rNsqZfnz"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"custom": {
"align": "auto",
"displayMode": "auto",
"inspect": false
},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "red",
"value": 80
}
]
}
},
"overrides": []
},
"gridPos": {
"h": 16,
"w": 3,
"x": 0,
"y": 0
},
"id": 2,
"options": {
"footer": {
"fields": "",
"reducer": [
"sum"
],
"show": false
},
"showHeader": true
},
"pluginVersion": "9.0.1",
"targets": [
{
"datasource": {
"type": "prometheus",
"uid": "4rNsqZfnz"
},
"editorMode": "code",
"expr": "group by (_id) (subscription_sync_total{name=\"${addon_name}\"})",
"format": "table",
"range": true,
"refId": "A"
}
],
"title": "Clusters",
"transformations": [
{
"id": "groupBy",
"options": {
"fields": {
"_id": {
"aggregations": [],
"operation": "groupby"
}
}
}
}
],
"type": "table"
}
],
"schemaVersion": 36,
"style": "dark",
"tags": [],
"templating": {
"list": [
{
"hide": 2,
"name": "addon_name",
"query": "addon-<addon-name>",
"skipUrlSync": false,
"type": "constant"
}
]
},
"time": {
"from": "now-6h",
"to": "now"
},
"timepicker": {},
"timezone": "",
"title": "<addon-name> - SLO Dashboard",
"version": 0,
"weekStart": ""
}
- Create a Merge Request adding the files to the managed-tenants-slos git repository.
- Ping @mt-sre-ic in the #forum-managed-tenants Slack channel for review.
Dashboard Deployment
Merging of the above merge request is a prerequisite for this step.
The dashboard deployment happens through app-interface, using saas-files.
- For each new Addon, we need to create a new saas-file in app-interface.
- Give ownership of the saas-file to your team using an app-interface role file.
Example Merge Request content to app-interface:
- Ping @mt-sre-ic in the #forum-managed-tenants Slack channel for approval.
- Merge Requests to app-interface are constantly reviewed/merged by AppSRE. After the MT-SRE approval, wait until the Merge Request is merged.
Accessing the Dashboards
Once the app-interface merge request is merged, you will see your ConfigMaps
being deployed in the #sd-mt-sre-info
Slack channel. For example:
[app-sre-stage-01] ConfigMap odf-ms-cluster-status applied
...
[app-sre-prod-01] ConfigMap odf-ms-cluster-status applied
Once the dashboards are deployed, you can see them here:
- STAGE: https://grafana.stage.devshift.net/dashboards/f/aGqy3WB7k/addons
- PRODUCTION: https://grafana.app-sre.devshift.net/dashboards/f/sDiLLtgVz/addons
Development Flow
After all the configuration is in place:
STAGE:
- Dashboards on the STAGE Grafana instance should only be used by the people developing the dashboards, not by other audiences.
- Changes in the managed-tenants-slos repository can be merged by the development team with “/lgtm” comments from those in the OWNERS file.
- Once merged, changes are automatically delivered to the STAGE Grafana instance.
PRODUCTION:
- The dashboards on the PRODUCTION Grafana are pinned to a specific git commit of the managed-tenants-slos repository in the corresponding saas-file in app-interface (see the example snippet below).
- After patching the git commit in the saas-file, owners of the saas-file can merge the promotion with a “/lgtm” comment in the app-interface Merge Request.
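For orientation, here is a minimal sketch of the relevant part of such a saas-file; the template name, repository path, namespace $ref values, and commit SHA are placeholders rather than the actual app-interface layout:
resourceTemplates:
- name: <addon-name>-slo-dashboards                       # placeholder template name
  url: https://gitlab.cee.redhat.com/service/managed-tenants-slos
  path: /<addon-name>/dashboards
  targets:
  - namespace:
      $ref: /path/to/app-sre-stage-01-namespace.yml       # STAGE target (placeholder path)
    ref: main                                             # STAGE follows the main branch
  - namespace:
      $ref: /path/to/app-sre-prod-01-namespace.yml        # PRODUCTION target (placeholder path)
    ref: <pinned-commit-sha>                              # bump this SHA to promote to production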
1.2 - Dead Man's Snitch Operator Integration
Overview
Dead Man’s Snitch (DMS) is essentially a constantly firing prometheus alert and an external receiver (called a snitch) that will alert should the monitoring stack go down and stop sending alerts. The generation of the snitch URLs is done dynamically via the DMS operator, which runs on hive and is owned by SREP. The snitch URL shows up in a secret.
Usage
The Add-On metadata file (addon.yaml) allows you to provide a deadmanssnitch field (see the deadmanssnitch field in the Add-On metadata file schema documentation for more information).
This field allows you to provide the required Dead Man’s Snitch integration configuration.
A DeadmansSnitchIntegration resource is then created and applied to Hive alongside the Add-On SelectorSyncSet (SSS).
DeadmansSnitchIntegration Resource
The following default DMS configuration is created if you specify only the bare-minimum fields under the deadmanssnitch field in the addon metadata:
- apiVersion: deadmanssnitch.managed.openshift.io/v1alpha1
  kind: DeadmansSnitchIntegration
  metadata:
    name: addon-{{ADDON.metadata['id']}}
    namespace: deadmanssnitch-operator
  spec:
    clusterDeploymentSelector: ## can be overridden by .deadmanssnitch.clusterDeploymentSelector field in addon metadata
      matchExpressions:
      - key: {{ADDON.metadata['label']}}
        operator: In
        values:
        - "true"
    dmsAPIKeySecretRef: ## fixed
      name: deadmanssnitch-api-key
      namespace: deadmanssnitch-operator
    snitchNamePostFix: {{ADDON.metadata['id']}} ## can be overridden by .deadmanssnitch.snitchNamePostFix field in addon metadata
    tags: {{ADDON.metadata['deadmanssnitch']['tags']}} ## Required
    targetSecretRef:
      ## can be overridden by .deadmanssnitch.targetSecretRef.name field in addon metadata
      name: {{ADDON.metadata['id']}}-deadmanssnitch
      ## can be overridden by .deadmanssnitch.targetSecretRef.namespace field in addon metadata
      namespace: {{ADDON.metadata['targetNamespace']}}
Examples of deadmanssnitch
field in addon.yaml
id: ocs-converged
....
....
deadmanssnitch:
  tags: ["ocs-converged-stage"]
....
id: managed-odh
....
....
deadmanssnitch:
  snitchNamePostFix: rhods
  tags: ["rhods-integration"]
  targetSecretRef:
    name: redhat-rhods-deadmanssnitch
    namespace: redhat-ods-monitoring
....
id: managed-api-service-internal
....
....
deadmanssnitch:
  clusterDeploymentSelector:
    matchExpressions:
    - key: "api.openshift.com/addon-managed-api-service-internal"
      operator: In
      values:
      - "true"
    - key: "api.openshift.com/addon-managed-api-service-internal-delete"
      operator: NotIn
      values:
      - 'true'
  snitchNamePostFix: rhoam
  tags: ["rhoam-production"]
  targetSecretRef:
    name: redhat-rhoami-deadmanssnitch
    namespace: redhat-rhoami-operator
Generated Secret
A secrete will be generated (by default in the same namespace as your addon) with the SNITCH_URL
.
Your add-on will need to pick up the generated secret in cluster and inject it into your
alertmanager config. Example of in-cluster created secret:
kind: Secret
apiVersion: v1
metadata:
  namespace: redhat-myaddon-operator
  labels:
    hive.openshift.io/managed: 'true'
data:
  SNITCH_URL: #url like https://nosnch.in/123123123
type: Opaque
Alert
Your alertmanager will need a constantly firing alert that is routed to DMS. Example of an alert that always fires:
- name: DeadManSnitch
  interval: 1m
  rules:
  - alert: DeadManSnitch
    expr: vector(1)
    labels:
      severity: critical
    annotations:
      description: This is a DeadManSnitch to ensure RHODS monitoring and alerting pipeline is online.
      summary: Alerting DeadManSnitch
Route
Example of a route that forwards the firing-alert to DMS:
- match:
    alertname: DeadManSnitch
  receiver: deadman-snitch
  repeat_interval: 5m
Receiver
Example receiver for DMS:
- name: 'deadman-snitch'
  webhook_configs:
  - url: '<snitch_url>?m=just+checking+in'
    send_resolved: false
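Putting the pieces together, here is a minimal sketch of an Alertmanager configuration that wires up the route and receiver above; the default receiver and the hard-coded URL are placeholders, and in practice the URL comes from the SNITCH_URL value in the generated secret:
route:
  receiver: 'default'                # placeholder catch-all receiver
  routes:
  - match:
      alertname: DeadManSnitch
    receiver: 'deadman-snitch'
    repeat_interval: 5m
receivers:
- name: 'default'
- name: 'deadman-snitch'
  webhook_configs:
  - url: 'https://nosnch.in/123123123?m=just+checking+in'   # SNITCH_URL from the generated secret
    send_resolved: false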
tags: ["my-addon-production"]
in the Service Delivery DMS account to their pagerduty service.Please log a JIRA with your assigned SRE team to have this completed at least one week before going live with the SRE team.
Current Example
1.3 - PagerDuty Integration
The PagerDuty integration is configured in the pagerduty
field in the
addon.yaml metadata file.
Given this configuration, a secret with the specified name is created in the
specified namespace by the PagerDuty Operator,
which runs on Hive. The secret contains the PAGERDUTY_KEY.
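For illustration, here is a sketch of what the pagerduty section in addon.yaml could look like; the values are placeholders, and the authoritative list of fields is the metadata schema documentation linked above:
pagerduty:
  escalationPolicy: ABC123                       # PagerDuty escalation policy ID (placeholder)
  acknowledgeTimeout: 0
  resolveTimeout: 0
  secretName: pagerduty                          # name of the secret created by the PagerDuty Operator
  secretNamespace: redhat-<addon-name>-operator  # namespace where the secret is created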
1.4 - OCM SendGrid Service Integration
OCM SendGrid Service is an event-driven service that manages SendGrid subuser accounts and credential bundles based on addon cluster logs.
The secret name and namespace are configured in app-interface; see this section in the documentation.
2 - Testing Addons
2.1 - Installing a specific version of an Addon in a staging environment
Add-on services are typically installed using the OpenShift Cluster Manager web console, by selecting the specific addon from the Add-ons tab and clicking Install. However, only the latest version of an addon service can be installed using the OpenShift Cluster Manager console.
In some cases, you might need to install an older version of an addon, for example, to test the upgrade of an addon from one version to the next. Follow this procedure to install a specific version of an addon service in a staging environment.
IMPORTANT: Installing an addon service using this procedure is only recommended for testing upgrades in a staging environment and is not supported for customer-facing production workloads.
Prerequisites
You have the version_select capability added to your organization by creating a merge request to the ocm-resources repository. For more information about assigning capabilities to an organization, see Customer Capabilities Management. For more information about enabling the version_select capability, see the organization YAML example and merge request example.
Procedure
Create a JSON file with the addon service and addon version that you want to install. In this example, the JSON file is install-payload.json, the addon id is reference-addon, and the version we want to install is 0.6.7.
Example
{ "addon": { "id": "reference-addon" }, "addon_version": { "id": "0.6.7" } }
NOTE: If the addon that you are installing has a required parameter, ensure that you add it to the JSON file. For instance, the managed-odh addon, which is shown in the example below, requires the parameter notification-email to be included.
Example
{ "addon": { "id": "managed-odh" }, "addon_version": { "id": "1.23.0" }, "parameters": { "items": [ { "id": "notification-email", "value": "me@somewhere.com" } ] } }
Set the CLUSTER_ID environment variable:
export CLUSTER_ID=<your_cluster_internal_id>
Run the following API request to install the addon:
ocm post /api/clusters_mgmt/v1/clusters/$CLUSTER_ID/addons --body install-payload.json
Verify the addon installation:
Log into your cluster:
oc login
Run the oc get addons command to view the addon installation status:
$ oc get addons
NAME              STATUS    AGE
reference-addon   Pending   10m
Optionally, run the watch command to watch the addon installation status:
$ watch oc get addons
NAME              STATUS    AGE
reference-addon   Ready     32m
If you do not want the addon to automatically upgrade to the latest version after installation, delete the addon upgrade policy before the addon installation completes.
List the upgrade policies:
Example
$ ocm get /api/clusters_mgmt/v1/clusters/$CLUSTER_ID/addon_upgrade_policies
{
  "kind": "AddonUpgradePolicyList",
  "page": 1,
  "size": 1,
  "total": 1,
  "items": [
    {
      "kind": "AddonUpgradePolicy",
      "id": "991a69a5-ce33-11ed-9dda-0a580a8308f5",
      "href": "/api/clusters_mgmt/v1/clusters/22ogsfo8kd36bk280b6bqbi7l03micmm/addon_upgrade_policies/991a69a5-ce33-11ed-9dda-0a580a8308f5",
      "schedule": "0,15,30,45 * * * *",
      "schedule_type": "automatic",
      "upgrade_type": "ADDON",
      "version": "",
      "next_run": "2023-03-29T19:30:00Z",
      "cluster_id": "22ogsfo8kd36bk280b6bqbi7l03micmm",
      "addon_id": "reference-addon"
    }
  ]
}
Delete the addon upgrade policy:
Syntax
ocm delete /api/clusters_mgmt/v1/clusters/$CLUSTER_ID/addon_upgrade_policies/<addon_upgrade_policy_id>
Example
ocm delete /api/clusters_mgmt/v1/clusters/$CLUSTER_ID/addon_upgrade_policies/991a69a5-ce33-11ed-9dda-0a580a8308f5
Verify the upgrade policy no longer exists:
Syntax
ocm get /api/clusters_mgmt/v1/clusters/$CLUSTER_ID/addon_upgrade_policies | grep <addon_upgrade_policy_id>
Example
ocm get /api/clusters_mgmt/v1/clusters/$CLUSTER_ID/addon_upgrade_policies | grep 991a69a5-ce33-11ed-9dda-0a580a8308f5
Review the addon installation status and version:
Example
$ oc get addons reference-addon -o yaml
apiVersion: addons.managed.openshift.io/v1alpha1
kind: Addon
metadata:
  annotations:
    ...
  creationTimestamp: "2023-03-20T19:07:08Z"
  finalizers:
  - addons.managed.openshift.io/cache
  ...
spec:
  displayName: Reference Addon
  ...
  pause: false
  version: 0.6.7
status:
  conditions:
  - lastTransitionTime: "2023-03-20T19:08:10Z"
    message: ""
    observedGeneration: 2
    reason: FullyReconciled
    status: "True"
    type: Available
  - lastTransitionTime: "2023-03-20T19:08:10Z"
    message: Addon has been successfully installed.
    observedGeneration: 2
    reason: AddonInstalled
    status: "True"
    type: Installed
  lastObservedAvailableCSV: redhat-reference-addon/reference-addon.v0.6.7
  observedGeneration: 2
  observedVersion: 0.6.7
  phase: Ready
In this example, you can see the addon version is set to 0.6.7 and the AddonInstalled status is True.
(Optional) If needed, recreate the addon upgrade policy manually.
Create a JSON file with the addon upgrade policy information.
Example of automatic upgrade
{ "kind": "AddonUpgradePolicy", "addon_id": "reference-addon", "cluster_id": "$CLUSTER_ID", "schedule_type": "automatic", "upgrade_type": "ADDON" }
Example of manual upgrade
{ "kind": "AddonUpgradePolicy", "addon_id": "reference-addon", "cluster_id": "$CLUSTER_ID", "schedule_type": "manual", "upgrade_type": "ADDON", "version": "0.7.0" }
In the example above, the schedule_type for the reference-addon is set to manual and the version to upgrade to is set to 0.7.0. The upgrade policy will execute once and the addon will upgrade to version 0.7.0.
Run the following API request to create the addon upgrade policy:
Syntax
ocm post /api/clusters_mgmt/v1/clusters/$CLUSTER_ID/addon_upgrade_policies --body <your_json_filename>
Example
ocm post /api/clusters_mgmt/v1/clusters/$CLUSTER_ID/addon_upgrade_policies --body reference-upgrade-policy.json
Verify the upgrade policy exists:
Syntax
ocm get /api/clusters_mgmt/v1/clusters/$CLUSTER_ID/addon_upgrade_policies | jq '.items[] | select(.addon_id=="<addon_id>")'
Example
ocm get /api/clusters_mgmt/v1/clusters/$CLUSTER_ID/addon_upgrade_policies | jq '.items[] | select(.addon_id=="reference-addon")'
Useful commands
Get a list of available addons:
ocm get /api/clusters_mgmt/v1/addons | jq '.items[].id'
Get a list of available versions to install for a given addon id:
Syntax
ocm get /api/clusters_mgmt/v1/addons/<addon-id>/versions | jq '.items[].id'
Example
$ ocm get /api/clusters_mgmt/v1/addons/reference-addon/versions | jq '.items[].id'
"0.0.0"
"0.1.5"
"0.1.6"
"0.2.2"
"0.3.0"
"0.3.1"
"0.3.2"
"0.4.0"
"0.4.1"
"0.5.0"
"0.5.1"
"0.6.0"
"0.6.1"
"0.6.2"
"0.6.3"
"0.6.4"
"0.6.5"
"0.6.6"
"0.6.7"
"0.7.0"
2.2 - Testing With OCP (Without OCM)
Testing Without OCM
During the development process, it might be useful (and cheaper) to run your addon on an OCP cluster.
You can spin up an OCP cluster on your local machine using CRC.
OCP and OSD differ in one important aspect: OCP gives you full administrative access, while OSD restricts administrative actions. However, the managed-tenants pipeline applies resources to OSD as an unrestricted admin, just as you can with your OCP cluster, so OCP is a good OSD mockup for our use case.
By doing this, you’re skipping:
- OCM and SKU management
- Hive
First, you have to build your catalog. Let’s take managed-odh as an example:
$ managedtenants --environment=stage --addons-dir addons --dry-run run --debug tasks/deploy/10_build_push_catalog.py:managed-odh
Loading stage...
Loading stage OK
== TASKS =======================================================================
tasks/deploy/10_build_push_catalog.py:BuildCatalog:managed-odh:stage...
-> creating the temporary directory
-> /tmp/managed-odh-stage-1bkjtsea
-> generating the bundle directory
-> generating the bundle package.yaml
-> building the docker image
-> ['docker', 'build', '-f', PosixPath('/home/apahim/git/managed-tenants/Dockerfile.catalog'), '-t', 'quay.io/osd-addons/opendatahub-operator:stage-91918fe', PosixPath('/tmp/managed-odh-stage-1bkjtsea')]
tasks/deploy/10_build_push_catalog.py:BuildCatalog:managed-odh:stage OK
tasks/deploy/10_build_push_catalog.py:PushCatalog:managed-odh:stage...
-> pushing the docker image
-> ['docker', '--config', '/home/apahim/.docker', 'push', 'quay.io/osd-addons/opendatahub-operator:stage-91918fe']
tasks/deploy/10_build_push_catalog.py:PushCatalog:managed-odh:stage OK
That command has built the image
quay.io/osd-addons/opendatahub-operator:stage-91918fe
on your local machine.
You can inspect the image with:
$ docker run --rm -it --entrypoint "bash" quay.io/osd-addons/opendatahub-operator:stage-91918fe -c "ls manifests/"
0.8.0 1.0.0-experiment managed-odh.package.yml
$ docker run --rm -it --entrypoint "bash" quay.io/osd-addons/opendatahub-operator:stage-91918fe -c "cat manifests/managed-odh.package.yml"
channels:
- currentCSV: opendatahub-operator.1.0.0-experiment
  name: beta
defaultChannel: beta
packageName: managed-odh
Next, you have to tag/push that image to some public registry repository of yours:
$ docker tag quay.io/osd-addons/opendatahub-operator:stage-91918fe quay.io/<my-repository>/opendatahub-operator:stage-91918fe
$ docker push quay.io/<my-repository>/opendatahub-operator:stage-91918fe
Getting image source signatures
Copying blob 9fbc4a1ed0b0 done
Copying blob c4d8f7894b7d skipped: already exists
Copying blob 61598d8d1b24 skipped: already exists
Copying blob 38ada4bcd26f skipped: already exists
Copying blob d5fdf1f627c8 skipped: already exists
Copying blob 2bf094d88b12 skipped: already exists
Copying blob 8a6c7bacb5db done
Copying config 3088e48540 done
Writing manifest to image destination
Copying config 3088e48540 [--------------------------------------] 0.0b / 3.6KiB
Writing manifest to image destination
Writing manifest to image destination
Storing signatures
Now we have to apply the OpenShift resources that will install the operator
in the OCP cluster. You can use the managedtenants
command to generate
the stage SelectorSyncSet
and look at it for reference:
$ managedtenants --environment=stage --addons-dir addons --dry-run run --debug tasks/generate/99_generate_SelectorSyncSet.py
Loading stage...
Loading stage OK
== POSTTASKS ===================================================================
tasks/generate/99_generate_SelectorSyncSet.py:GenerateSSS:stage...
-> Generating SSS template /home/apahim/git/managed-tenants/openshift/stage.yaml
tasks/generate/99_generate_SelectorSyncSet.py:GenerateSSS:stage OK
Here’s the SelectorSyncSet snippet we are interested in:
---
- apiVersion: hive.openshift.io/v1
  kind: SelectorSyncSet
  metadata:
    name: addon-managed-odh
  spec:
    clusterDeploymentSelector:
      matchLabels:
        api.openshift.com/addon-managed-odh: "true"
    resourceApplyMode: Sync
    resources:
    - apiVersion: v1
      kind: Namespace
      metadata:
        annotations:
          openshift.io/node-selector: ""
        labels: null
        name: redhat-opendatahub
    - apiVersion: operators.coreos.com/v1alpha1
      kind: CatalogSource
      metadata:
        name: addon-managed-odh-catalog
        namespace: openshift-marketplace
      spec:
        displayName: Managed Open Data Hub Operator
        image: quay.io/osd-addons/opendatahub-operator:stage-${IMAGE_TAG}
        publisher: OSD Red Hat Addons
        sourceType: grpc
    - apiVersion: operators.coreos.com/v1alpha2
      kind: OperatorGroup
      metadata:
        name: redhat-layered-product-og
        namespace: redhat-opendatahub
    - apiVersion: operators.coreos.com/v1alpha1
      kind: Subscription
      metadata:
        name: addon-managed-odh
        namespace: redhat-opendatahub
      spec:
        channel: beta
        name: managed-odh
        source: addon-managed-odh-catalog
        sourceNamespace: openshift-marketplace
Our OpenShift manifest to be applied to the OCP cluster looks as follows:
kind: List
metadata: {}
apiVersion: v1
items:
- apiVersion: v1
  kind: Namespace
  metadata:
    name: redhat-opendatahub
- apiVersion: operators.coreos.com/v1alpha1
  kind: CatalogSource
  metadata:
    name: addon-managed-odh-catalog
  spec:
    displayName: Managed Open Data Hub Operator
    image: quay.io/<my-repository>/opendatahub-operator:stage-91918fe
    publisher: OSD Red Hat Addons
    sourceType: grpc
- apiVersion: operators.coreos.com/v1alpha2
  kind: OperatorGroup
  metadata:
    name: redhat-layered-product-og
- apiVersion: operators.coreos.com/v1alpha1
  kind: Subscription
  metadata:
    name: addon-managed-odh
  spec:
    channel: beta
    name: managed-odh
    source: addon-managed-odh-catalog
    sourceNamespace: openshift-marketplace
Finally, apply it to the OCP cluster:
$ oc apply -f manifest.yaml
Namespace/redhat-opendatahub created
CatalogSource/addon-managed-odh-catalog created
Subscription/addon-managed-odh created
OperatorGroup/redhat-layered-product-og created
Your operator should be installed in the cluster.
2.3 - Testing With OSD-E2E
Testing With OSD-E2E
All Add-Ons must have a reference to a test harness container in a publicly available repository. The Add-On development team is responsible for creating and maintaining the test harness image. That image is run by the OSD e2e process.
The test harness will be tested against OCP nightly and OSD next.
Please refer to the OSD-E2E Add-On Documentation for more details on how this test harness will be run and how it is expected to report results.
Primer into OSD E2E tests and prow jobs
To ensure certain things, such as validating that the addon can be easily and successfully installed on a customer’s cluster, we have prow jobs set up which run e2e tests (one test suite per addon) every 12 hours. If the e2e tests corresponding to any addon fail, then automated alerts/notifications are sent to the addon team. Every addon’s e2e tests are packaged in an image called “testHarness”, which is built and pushed to quay.io by the team maintaining the addon. Once the “testHarness” image is built and pushed, the team must register their addon to the testHarness image’s e2e tests by making a PR against this file.
You can access the portal for prow jobs here. The prow jobs follow the below steps to run the e2e tests. For every e2e test defined inside this file:
- An OSD cluster is created and the addon being tested is installed. The OpenShift API is used to perform these operations via the API definition provided at https://api.openshift.com
- The e2e prow job definition for the addon is parsed from this file, along with the parameters required to run its e2e tests.
- The “testHarness” image for the addon is then executed with the parameters fetched in the above step.
- If an MT-SRE team member notices those tests failing, they should notify the respective team to take a look at them and fix them.
3 - Top Level Operator
3.1 - Customer Notifications
Status Page
https://gitlab.cee.redhat.com/service/app-interface/-/blob/master/docs/app-sre/statuspage.md
https://service.pages.redhat.com/dev-guidelines/docs/appsre/advanced/statuspage/
Service Logs
Internal Email
There are multiple ways a user or group can get notified of service events (e.g. planned maintenance, outages). There are two fields in the addon metadata file (see Add-On metadata file schema documentation for more information) where email addresses can be provided, as shown in the sketch after this list:
- addonOwner: REQUIRED. Point of contact for communications from Service Delivery to addon owners. Where possible, this should be a development team mailing list (rather than an individual developer).
- addonNotifications: This is a list of additional email addresses of employees who would like to receive notifications about a service.
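As an illustrative sketch, these fields might look like this in addon.yaml; the addresses are placeholders and the exact value format should follow the schema documentation:
addonOwner: addon-team@redhat.com        # placeholder; prefer a team mailing list
addonNotifications:
- dev-one@redhat.com                     # placeholder additional recipients
- dev-two@redhat.com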
There is also a mailing list that receives notifications for all services managed by Service Delivery. Subscribe to the sd-notifications mailing list here.
3.2 - Dependencies
This document describes the supported implementation for Addon dependencies, as signed-off by the Managed Tenants SRE Team.
Dependencies Specification
- Addons must specify dependencies using the OLM dependencies feature, documented here
- The dependencies must have the version pinned. Ranges are not allowed (see the sketch after this list).
- The dependencies must come from a Trusted Catalog. See the Trusted Catalogs section for details.
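For reference, here is a minimal sketch of the OLM dependencies declaration with a pinned version (a bundle's metadata/dependencies.yaml); the package name and version are placeholders:
dependencies:
- type: olm.package
  value:
    packageName: example-dependency-operator   # placeholder dependency package
    version: "1.2.3"                            # exact, pinned version (ranges are not allowed)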
Trusted Catalogs
The Addon and its dependencies must come from Trusted Catalogs. Trusted Catalogs are those with content published by the Managed Services Pipelines, implemented by CPaaS, or by the Managed Tenants SRE Team.
Trusted Catalogs List
- Addon catalog: the catalog created by the Managed Tenants SRE Team for the purpose of releasing the Addon. Dependency bundles can be shipped in the same catalog as the Addon. The Addon catalog is considered “trusted” for the dependencies it carries.
- Red Hat Operators catalog: the catalog content goes through the Managed Services Pipelines, the same process used to build some Addons themselves, just with a different release process. This catalog is considered “trusted” and can be used for dependencies.
Including a Catalog in the Trusted List
- Make sure that the catalog is available on OSD and its content is released through the Managed Services Pipelines, implemented by CPaaS.
- Create a Jira ticket in the MT-SRE Team backlog, requesting the assessment of the OSD catalog you want to consider as “trusted”.
Issues
There’s a feature request to the OLM Team to allow specifying the CatalogSource used for the dependencies.
3.3 - Environments
Mandatory environments
Add-ons are normally deployed to two environments:
- ocm stage: development/testing - All add-ons must deploy to this environment before being released to production.
- ocm production: once the deployment in stage has been reviewed, accepted, and approved, it can be promoted to production via /lgtm by your SRE team.
We recommend the ocm stage
and ocm production
add-on metadata be as
similar as possible.
SLOs
- ocm stage has no SLO and operates with best-effort support from Add-on SRE, SREP, and App-SRE.
- osd stage cluster has no SLO and operates with best-effort support from Add-on SRE, SREP, and App-SRE.
- ocm production environments are subject to App-SRE SLOs.
- osd production cluster environments are subject to OSD SLOs.
Additional Environments (via duplicate add-ons)
Some add-on providers have had use cases which require additional add-on envs.
While we only have ocm stage and ocm prod, managed-tenants may be leveraged to deploy an additional add-on (for example, edge or internal). Today we don’t recommend this practice because of the need to clone all add-on metadata, which increases the risk of incorrect metadata going to production/customer clusters.
If you need to do the above, please reach out to your assigned SRE team for guidance first.
3.4 - Plug and Play Addon
Package Operator
Package Operator is a Kubernetes Operator for packaging and managing a collection of arbitrary Kubernetes objects.
Each addon with a packageOperator
defined in its spec
will have a corresponding
ClusterObjectTemplate.
The ClusterObjectTemplate is an API defined in Package
Operator, enabling users to create an object by templating a
manifest and injecting values retrieved from other arbitrary source objects.
However, regular users typically do not need to interact with the ClusterObjectTemplate.
Instead, they can interact with the generated ClusterPackage manifest.
Example of a ClusterPackage
manifest:
apiVersion: package-operator.run/v1alpha1
kind: ClusterPackage
metadata:
  name: <addon_name>
spec:
  image: <addon.spec.packageOperator>
  config:
    addonsv1:
      clusterID: a440b136-b2d6-406b-a884-fca2d62cd170
      deadMansSnitchUrl: https://example.com/test-snitch-url
      ocmClusterID: abc123
      ocmClusterName: asdf
      pagerDutyKey: 1234567890ABCDEF
      parameters:
        foo1: bar
        foo2: baz
      targetNamespace: pko-test-ns-00-req-apy-dsy-pdy
The deadMansSnitchUrl and pagerDutyKey are obtained from the ConfigMaps using their default names and locations. IMPORTANT: To successfully inject the deadMansSnitchUrl and pagerDutyKey values into the ClusterPackage manifest, you must keep the default naming scheme and location of the corresponding ConfigMaps. See the addons deadMansSnitch and addons pagerDuty documentation for more information.
Additionally, all the values present in .spec.config.addonsv1 can be injected into the objects within your packageImage. See the package operator documentation for more information.
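As a rough sketch of consuming these injected values inside a package, assuming Package Operator's Go-template support and that the configuration is exposed under .config.addonsv1 (the file name and the object below are illustrative only, not prescriptive):
# secret.yaml.gotmpl - hypothetical templated manifest inside the packageImage
apiVersion: v1
kind: Secret
metadata:
  name: my-addon-dms                                     # placeholder name
  namespace: {{ .config.addonsv1.targetNamespace }}      # injected target namespace
stringData:
  SNITCH_URL: {{ .config.addonsv1.deadMansSnitchUrl }}   # injected Dead Man's Snitch URL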
Tenants Onboarding Steps
Although you can generate the packageImage yourself using the package operator documentation, we recommend you use the Managed Tenants Bundles (MTB) facilities.
The following steps are an example of generating the packageImage for the reference-addon package using the MTB flow:
In the MTB repository, create a package directory and add the manifests.yaml inside the package directory. See the following merge request for an example.
The MTB CI creates the packageImage and the Operator Lifecycle Manager (OLM) Index Image as part of the team’s addon folder.
The MTB CI creates a merge request to the managed-tenants repository and adds a new AddonImageSet with the PackageImage and OLM Index images.
4 - managed-tenants Repository
Addons are deployed through GitOps pipelines. Most of the configuration for Addons can be found in the managed-tenants repository. See the create an addon documentation page for a good starting point.
5 - SKU
NOTE: MT-SRE does not influence SKU creation/priorities. You must work with OCM directly for this.
Requesting a SKU
To request a SKU, please complete the following steps:
- Determine a unique quota ID for the addon. This should be lowercase with dashes and of the format addon-<addon-name>. For example: addon-prow-operator
- Create a JIRA Request at Openshift Cluster Manager with the subject Request for new Add-On SKU in OCM and the following information:
  - Add-On name.
  - Add-On owner.
  - Requested Add-On unique quota ID.
  - Additional information that would help qualify the ask, including goals, timelines, etc., that you might have in mind.
- You will need at least your PM and the OCM PMs to sign off before the SKU is created. We expect to resolve these requests within 7 working days.
Requesting SKU Attributes Changes
From time to time you may want to update some SKU fields like supported cloud providers, quota cost, product support, etc. To do this:
- Create a JIRA Request at Openshift Cluster Manager
- Ping the ticket in the #service-development-b Slack channel (@sd-b-team is the handle)
- This requires an update to be committed in-code in AMS, then deployed to stage and eventually prod (allow up to 7 working days).
Current Status
To check current SKUs and attributes, see OCM Resource Cost Mappings.