Onboarding
1 - Introduction
The Repeatable Onboarding for Managed Services (ROMS) article on the Source has a great introduction to managed services.
There are multiple steps to onboard a managed service:
- First the ROMS process needs to be completed.
- Then the onboarding acceptance criteria checklist has to be completed, along with other requirements, to onboard the addon to production.
- The addon must then pass a soak test.
- The addon then has to run in production for a given amount of time to prove its stability, and then the pager can be handed over. See the pager handover section for more information.
2 - ROMS
Please start at the ROMS documentation on the Source here to get an overview of managed services and ROMS.
Kickoff Overview
The Service Owners start the ROMS Checklist (including the SAG review). The SRES Architects will evaluate the service and determine whether it is suitable and viable. If accepted, it will receive a prioritization rating from the SRES Architects and will then be assigned to an SRE team by the Addon SRE Manager.
IMPORTANT: As an addon provider, you can self-service onto OCM stage via the addon
flow and start prototyping before completing ROMS. You are free to experiment and
kick the tires! The only restriction is on the support channels available for
reaching the Addon SRE team. These are limited to
#forum-managed-tenants
and the weekly SD: Layered Products Sync
call (you can request an invitation via #forum-managed-tenants).
Kickoff Steps
- Service Owner creates an Epic in the SDE board.
- Service Owner requests sign-off from the SRES Architects: Paul Bergene, Jaime Melis and Karanbir Singh. Note that the preferred communication channels are JIRA or the #sd-org channel.
- Service Owner includes a reference to the SRES Onboarding Questionnaire (part of ROMS) in the epic.
- The epic should also reference a Service Definition Document. Service Definition examples: OSD, OCS, RHODS, RHOAM
- SRES Architects and the Addon SRE team lead will review the epic, discuss it with the Service Owners, and accept or reject it. SRES Architects will also assign a priority.
- Addon SRE lead will scope out the work needed to onboard the service into Addon SRE.
3 - Acceptance Criteria Checklist
Addon SRE Acceptance Criteria checklist
The Service Owners, with the assistance of the Addon SRE team, will deploy the service to production while the SLOs are being implemented and fine-tuned. The Addon SRE onboarding team will work through the Addon SRE Acceptance Criteria Checklist process.
Additional Requirements
There are also requirements from other teams, which can be found here.
Next Steps
If all requirements are met, the Service Owners will be allowed to deploy to production. Once the service is running in production, it can begin its transition period. During this period, the addon has to prove its stability for a given amount of time, after which the pager is handed over to the SRE team. Read more about the transition period here.
4 - Pager Handover
Transition Overview
This stage focuses on the viability of the service. If, after an agreed amount of time in production (by default a 4-week rolling window), the service meets its SLOs, it will be considered viable by SRE and the pager will be transferred to the Addon SRE team.
Viability Steps
- The service runs in production and SLO data is collected.
- An Addon SRE reviews the SLOs and SOPs.
- If the SLOs are met, the service will be transitioned to Addon SRE.
- Any critical alerts will be routed to the Addon SRE 24x7 PagerDuty escalation policy.
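As a rough illustration of the viability check described above, the following Python sketch evaluates an availability SLO over a 4-week rolling window. It is only a sketch under stated assumptions: the `DailySLO` record, the function names, the good/total event model and the example target are all hypothetical, since this document does not specify how SLO data is stored or which targets apply to a given addon.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Default rolling window mentioned in the text above (4 weeks).
WINDOW = timedelta(weeks=4)


@dataclass(frozen=True)
class DailySLO:
    """Hypothetical per-day SLO sample: good events out of total events."""
    day: date
    good_events: int
    total_events: int


def window_availability(samples, end, window=WINDOW):
    """Return good/total over the window ending on `end`, or None if no events."""
    start = end - window
    in_window = [s for s in samples if start < s.day <= end]
    total = sum(s.total_events for s in in_window)
    if total == 0:
        return None
    return sum(s.good_events for s in in_window) / total


def is_viable(samples, end, target, window=WINDOW):
    """Assumed rule: every day of the window has data and the SLO target is met."""
    start = end - window
    days_present = {s.day for s in samples if start < s.day <= end}
    if len(days_present) < window.days:
        return False  # not enough production time yet
    availability = window_availability(samples, end, window)
    return availability is not None and availability >= target
```

For example, calling `is_viable(samples, date.today(), target=0.995)` with 28 consecutive daily samples would indicate whether the assumed 99.5% target was met over the window. Requiring data for every day of the window is a design choice of this sketch, not a documented Addon SRE rule.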
Further Links
For information on how to set up the PagerDuty integration for your addon, see the PagerDuty Integration documentation.
5 - Soak Test
WIP