Disaster Recovery
Cloud-IAM Keycloak deployments are designed for high availability and resilience.
By default, each deployment is distributed across all availability zones (multi-AZ) within the selected cloud provider region. This architecture minimizes the risk of downtime due to infrastructure or network failures. To learn more about Keycloak deployment architecture, please refer to the Cloud-IAM architecture.
However, major incidents can still occur, for example, a complete outage affecting the entire cloud provider region. In such cases, Cloud-IAM's on-call team is prepared to recreate any deployment from scratch using its cold backup.
Recovery Scenarios
Cloud-IAM's disaster recovery process follows a strict, predefined sequence to ensure actions are taken securely, consistently, and in full alignment with our compliance framework. If the preferred recovery option is not possible, the process automatically moves to the next scenario, always prioritizing the fastest and safest restoration path.
In every scenario, the customer is informed and must validate the upcoming process before it begins. No critical action will be taken without the explicit consent of the customer, in accordance with our RACI framework.
Scenario 0: Restore the impacted Availability Zone (AZ)
- First action if the failure is limited to a single Availability Zone and recovery can be achieved locally within the same region.
Scenario 1: Restore in the same region
- Chosen when the affected region becomes available again within a short timeframe.
Scenario 2: Restore in a different region of the same cloud provider
- Activated when a regional outage is prolonged or recovery in the original location is not immediately possible.
Scenario 3: Restore in a different cloud provider
- Executed only with explicit customer approval to ensure security, compliance, and compatibility.
This structured approach ensures clarity, security, and predictability during critical recovery operations, eliminating guesswork and enabling the on-call team to act quickly and decisively.
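The ordered fallback described above can be illustrated with a minimal Python sketch. It is not Cloud-IAM's actual tooling: the `IncidentState` fields, the function names, and the way each condition is reduced to a boolean are assumptions made for illustration only. In practice the on-call team assesses these criteria.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class Scenario(IntEnum):
    """Recovery scenarios, numbered as in the list above."""
    RESTORE_IMPACTED_AZ = 0
    RESTORE_SAME_REGION = 1
    RESTORE_OTHER_REGION = 2
    RESTORE_OTHER_PROVIDER = 3


@dataclass
class IncidentState:
    """Hypothetical inputs; the real criteria are assessed by the on-call team."""
    limited_to_single_az: bool      # failure confined to one Availability Zone
    region_back_shortly: bool       # affected region expected back within a short timeframe
    other_region_usable: bool       # another region of the same provider can host the restore
    other_provider_approved: bool   # customer explicitly approved a provider change


def select_scenario(state: IncidentState) -> Optional[Scenario]:
    """Return the first applicable scenario, trying them in order 0 to 3."""
    checks = [
        (Scenario.RESTORE_IMPACTED_AZ, state.limited_to_single_az),
        (Scenario.RESTORE_SAME_REGION, state.region_back_shortly),
        (Scenario.RESTORE_OTHER_REGION, state.other_region_usable),
        (Scenario.RESTORE_OTHER_PROVIDER, state.other_provider_approved),
    ]
    for scenario, applicable in checks:
        if applicable:
            return scenario
    return None  # nothing applicable yet; reassess as the incident evolves


def may_start(scenario: Optional[Scenario], customer_validated: bool) -> bool:
    """No recovery action starts without the customer's explicit validation."""
    return scenario is not None and customer_validated


# Example: prolonged regional outage, another region is usable.
state = IncidentState(False, False, True, False)
chosen = select_scenario(state)
print(chosen.name, may_start(chosen, customer_validated=True))
```

Checking the scenarios in a fixed order mirrors the rule that a later scenario is considered only when the earlier ones are not possible, and keeping `may_start` separate mirrors the requirement that the customer validates the process before it begins.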
Disaster Recovery Testing
As part of our ISO 27001 certification, Cloud-IAM maintains a formal Disaster Recovery Plan.
- Weekly: Our automated recovery process is tested as part of delivery pipelines.
- Annually: A full disaster recovery drill is executed to validate end-to-end restoration capability.
Recovery Objectives
Single-Region deployment
For single-region deployments, the recovery objectives depend on the backup frequency, which the customer can configure depending on their support level.
- RTO (Recovery Time Objective), the maximum time required to fully restore your deployment: up to 2 hours.
- RPO (Recovery Point Objective), the maximum potential data loss, measured from the time of the last backup: less than 24 hours (varies by backup frequency and support level).
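To make the link between backup frequency and RPO concrete, here is a small Python sketch of the arithmetic. The timestamps and the 6-hour interval are hypothetical examples, not Cloud-IAM settings; only the 24-hour bound comes from the text above.

```python
from datetime import datetime, timedelta

# Upper bound stated above for single-region deployments.
RPO_SINGLE_REGION = timedelta(hours=24)


def data_loss_window(last_backup_at: datetime, incident_at: datetime) -> timedelta:
    """Data written after the last backup is what a restore cannot recover."""
    if incident_at < last_backup_at:
        raise ValueError("incident cannot precede the last backup")
    return incident_at - last_backup_at


def worst_case_loss(backup_interval: timedelta) -> timedelta:
    """An incident just before the next backup loses almost one full interval."""
    return backup_interval


# Hypothetical example: backup at 02:00, incident at 14:30 the same day.
last_backup = datetime(2024, 1, 10, 2, 0)
incident = datetime(2024, 1, 10, 14, 30)
loss = data_loss_window(last_backup, incident)
print(loss)                      # 12:30:00
print(loss < RPO_SINGLE_REGION)  # True

# A hypothetical 6-hour backup interval tightens the worst case accordingly.
print(worst_case_loss(timedelta(hours=6)))  # 6:00:00
```

The worst-case loss is bounded by the interval between two backups, which is why a higher backup frequency directly lowers the effective RPO.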
Multi-Region deployment
For multi-region deployments, the recovery objectives depend on the database technology used.
- RTO (Recovery Time Objective), the maximum time required to fully restore your deployment: up to 1 hour.
- RPO (Recovery Point Objective), the maximum potential data loss: up to 15 minutes.
For detailed information on RPO and available customization, please refer to the Cloud-IAM Service Level Agreement (SLA).
Communication during a disaster recovery operation
During the recovery process, the Cloud-IAM on-call team will keep all affected customers informed through timely email updates. No critical action will be taken without the explicit consent of the customer, in accordance with our RACI framework.
Each update will provide:
- Current incident status
- Actions completed and those in progress
- Estimated time to recovery
- Schedule for the next update
Ensure You Receive Important Email Notifications
To ensure you receive all important email communications without interruption, please:
- Add and regularly update your organization's contact information (e.g., how to add additional contacts to your organization, how to add a new user to your organization).
- Allowlist emails from support[at]cloud-iam.com to prevent them from being marked as spam.