Introducing – Aurora Global Database Failover
In August 2023, the Aurora team
Note: We are retiring the old Managed Planned Failover terminology and will use Global Database Switchover from now on to refer to the same functionality.
Overview of Aurora Global Database
A Global Database cluster comprises multiple Regional DB clusters. A Global Database cluster may have up to six Regional DB clusters in supported Regions, such as DB clusters in both us-east-1 and us-west-2.
In the Global Database topology, only one Region is primary and all other Regions are secondary. The primary Region contains the only writer DB instance and also contains the only active writer endpoint. The writer endpoint always points to the active writer node. The secondary Regions also have writer endpoints, but they’re inactive.
Each Regional DB cluster also has a reader endpoint. The reader endpoint balances read traffic to read replicas in a Regional DB cluster, if there are any. The reader endpoint isn’t affected after a Global Database Failover.
There are two scenarios in which customers change their primary Region – as a response to a planned event (for example, a regional rotation) or a response to an unplanned event (for example, a regional outage). A
Challenges of an unplanned failover
An unplanned failover may be required when the current primary Region DB cluster experiences a service-level outage or there is a complete primary Region outage. Such a scenario is rare because of the high resilience of the Amazon Web Services regional architecture; however, it cannot be completely ruled out. Traditionally, an Aurora Global Database Failover was achieved by manually detaching one of the surviving secondary Region DB clusters and promoting it. This approach worked well, but it had some challenges.
First, the existing approach of
Introducing Global Database Failover
With the launch of the new Global Database Failover feature, you can now perform a managed failover of an Aurora Global Database in response to an unplanned event such as a regional service outage. This feature is available for new Global Database deployments and is also retroactively available to existing Aurora Global Database deployments.
To use the Global Database Failover feature, you are not required to make configuration changes. You can initiate a
us-east-1, us-west-2 and eu-east-1 Regions, and us-west-2 is experiencing a service outage. In this scenario, Global Database Failover can be initiated from either the us-east-1 or the eu-east-1 Region.
Once the Global Database Failover is initiated, the following steps are taken:
- The user-chosen secondary DB cluster promotes one of its read replicas to writer and assumes the role of primary DB cluster in the Global Database topology.
- If the Global Database topology had other secondary DB clusters, they are rebuilt.
- The Aurora Global Database service continues to monitor the availability of the old primary Region. When it becomes available and healthy, Aurora Global Database adds this Region back to the Global Database by restoring a snapshot of the current primary Region DB cluster.
- When the old primary Region is added back to the Global Database cluster, an attempt is made to take a snapshot of the old storage volume. If the attempt succeeds, the snapshot is made available with the naming convention rds:unplanned-global-failover-<cluster name>-timestamp
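The steps above can be pictured with a small toy model. This is only an illustrative sketch of the role changes, not how the service is implemented: the class, its field names, and its methods are hypothetical, and the Region names are the ones used in the example earlier in this section.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class GlobalDatabase:
    """Toy model of a Global Database topology (hypothetical, for illustration)."""
    primary: str                       # Region that holds the writer
    secondaries: List[str]             # Regions that serve reads only
    rebuilding: List[str] = field(default_factory=list)
    pending_rejoin: List[str] = field(default_factory=list)

    def fail_over(self, target: str) -> None:
        if target not in self.secondaries:
            raise ValueError(f"{target} is not a secondary Region")
        old_primary = self.primary
        # Step 1: the chosen secondary takes over the primary role.
        self.primary = target
        # Step 2: every other secondary is rebuilt and cannot serve reads yet.
        self.rebuilding = [r for r in self.secondaries if r != target]
        self.secondaries = []
        # Step 3: the old primary is re-added only once its Region is healthy.
        self.pending_rejoin = [old_primary]

    def rebuild_complete(self, region: str) -> None:
        self.rebuilding.remove(region)
        self.secondaries.append(region)

    def region_recovered(self, region: str) -> None:
        # The old primary rejoins as a secondary of the new primary.
        self.pending_rejoin.remove(region)
        self.secondaries.append(region)


gdb = GlobalDatabase(primary="us-west-2", secondaries=["us-east-1", "eu-east-1"])
gdb.fail_over("us-east-1")
print(gdb)
```

The model makes one point explicit: immediately after the failover only the new primary is usable, while the remaining Regions come back as secondaries at different times.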
Global Database Failover reduces the operational overhead of manually promoting Regional DB clusters, while preserving the Global Database topology. After a failover is completed, you can point applications to the writer endpoint of the new primary DB cluster, and start reading and writing from the new DB cluster. If you have applications and clients in other surviving secondary Regions, they must wait until the secondary DB clusters are rebuilt before they can start reading from the reader endpoints of those Regional DB clusters.
Best Practices
During Global Database Failover, the user-chosen secondary Region DB cluster is promoted to primary. However, it does not automatically inherit the configuration options of the old primary. To mitigate configuration mismatch, we recommend you create an Aurora DB cluster parameter group and configure the options in advance. We also recommend creating and configuring monitoring tools such as
For a more complete list of recommendations, refer to the
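One way to catch a configuration mismatch ahead of time is to compare the parameter settings of the primary and secondary clusters. The sketch below assumes the settings have already been exported into plain dictionaries by whatever tooling you use; the function name and the sample values are hypothetical.

```python
def parameter_drift(primary_params, secondary_params):
    """Return {name: (primary_value, secondary_value)} for settings that differ.

    A value of None means the parameter is not set on that side.
    """
    names = set(primary_params) | set(secondary_params)
    return {
        name: (primary_params.get(name), secondary_params.get(name))
        for name in sorted(names)
        if primary_params.get(name) != secondary_params.get(name)
    }


# Hypothetical settings, for illustration only.
primary = {"max_connections": "2000", "binlog_format": "ROW"}
secondary = {"max_connections": "1000"}

drift = parameter_drift(primary, secondary)
for name, (on_primary, on_secondary) in drift.items():
    print(f"{name}: primary={on_primary} secondary={on_secondary}")
```

An empty result suggests the secondary would behave like the old primary after promotion, at least for the settings that were compared.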
Performing Global Database Failover
When you create an Aurora Global Database with at least one secondary Region DB cluster comprising at least one instance, you have the option to perform Global Database Failover and Global Database Switchover functions. To set up an Aurora Global Database, see
In the following example (Figure.1), we have created an Aurora Global Database with two DB clusters. The primary Region is ap-southeast-1 and the secondary Region is ap-south-1.

Figure.1 A Global Database Cluster
To perform the Global Database Failover using the console, complete the following steps:
- Select the Aurora Global Database cluster you want to fail over.
- On the Actions menu, choose Switch over or fail over global database.
Figure.2 Failover Global Database cluster
- For the Target DB cluster, choose the active secondary Aurora DB cluster that you want to promote to primary.
- Select Failover (allow data loss) as the failover type. To confirm the failover, type confirm and choose Confirm.
After performing the failover, you can see the database status change. You can follow the status column of the database list to monitor the failover process. Because Global Database replication is asynchronous, some data may be lost on the new primary DB cluster after the failover.
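The same status information can also be read programmatically. The sketch below summarizes a response shaped like the output of the RDS DescribeGlobalClusters operation. The field names (GlobalClusterMembers, IsWriter, FailoverState) follow that API as best I understand it and should be checked against the API reference; the sample response, its status value, and its ARNs are hypothetical.

```python
def summarize_global_cluster(response):
    """Extract the current writer and failover progress from a describe response."""
    cluster = response["GlobalClusters"][0]
    members = cluster.get("GlobalClusterMembers", [])
    # Exactly one member is expected to be flagged as the writer.
    writer = next((m["DBClusterArn"] for m in members if m.get("IsWriter")), None)
    # FailoverState is treated as absent or empty when no operation is running.
    failover = cluster.get("FailoverState") or {}
    return {
        "status": cluster.get("Status"),
        "writer_arn": writer,
        "failover_in_progress": bool(failover),
        "failover_target": failover.get("ToDbClusterArn"),
    }


# Hypothetical response, for illustration only.
sample = {
    "GlobalClusters": [{
        "GlobalClusterIdentifier": "my-global-db",
        "Status": "failing-over",
        "GlobalClusterMembers": [
            {"DBClusterArn": "arn:aws:rds:ap-southeast-1:123456789012:cluster:primary",
             "IsWriter": True},
            {"DBClusterArn": "arn:aws:rds:ap-south-1:123456789012:cluster:secondary",
             "IsWriter": False},
        ],
        "FailoverState": {
            "Status": "failing-over",
            "ToDbClusterArn": "arn:aws:rds:ap-south-1:123456789012:cluster:secondary",
        },
    }]
}

summary = summarize_global_cluster(sample)
print(summary)
```

Calling such a function in a loop, with a pause between calls, gives a simple way to wait until the failover is no longer in progress before redirecting application traffic.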

Figure.4 Failover in progress
The new primary Region DB cluster becomes available first, and it takes a few minutes to set up replication from the new primary to the new secondary DB cluster. Depending on the size of the database, new secondary Region clusters may take anywhere from a few minutes to a few hours to set up. If there were other secondary Region DB clusters, they are also recreated, and replication is re-established. After the failover completes, you can see both primary and secondary Region DB clusters become available.

Figure.5 Failover completed
To perform a Global Database Failover using Amazon Web Services CLI, you can use a command similar to the following:
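A minimal sketch of such a call, assembled in Python so the arguments can be inspected before anything is executed. It assumes the `aws rds failover-global-cluster` subcommand with the `--global-cluster-identifier`, `--target-db-cluster-identifier` and `--allow-data-loss` options; the helper name, the cluster identifier, and the ARN are hypothetical placeholders.

```python
import shlex


def failover_command(global_cluster_id, target_cluster_arn):
    """Build the argument list for an unplanned Global Database failover."""
    return [
        "aws", "rds", "failover-global-cluster",
        "--global-cluster-identifier", global_cluster_id,
        "--target-db-cluster-identifier", target_cluster_arn,
        # Acknowledges that asynchronous replication may lose recent writes.
        "--allow-data-loss",
    ]


cmd = failover_command(
    "my-global-db",  # hypothetical global cluster name
    "arn:aws:rds:ap-south-1:123456789012:cluster:my-secondary-cluster",  # hypothetical ARN
)
print(shlex.join(cmd))
```

The printed line can be pasted into a shell, or the list can be passed to `subprocess.run(cmd, check=True)` on a machine where the CLI is installed and configured.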
Conclusion
In this post, we explored the newly launched Aurora Global Database Failover feature and its benefits. You can use this feature to rapidly recover an Aurora Global Database cluster during an unplanned outage with reduced operational burden. To learn more about Aurora Global Database, look at the
About the authors
Aditya Samant is a relational database industry veteran with over two decades of experience working with commercial and open-source databases. Over the years, he has held many roles, including database consultant, professional support, DBA and database architect. He currently works at Amazon Web Services as a Sr. Database Specialist Solutions Architect. In his current role, he spends his time working with customers designing scalable, secure and robust cloud native architectures. Aditya also works closely with the service teams and collaborates on the design and delivery of new features for Amazon’s flagship relational database, Amazon Aurora.
Surendar Munimohan is a Sr. Database Solutions Architect with Amazon Web Services. He has more than a decade of experience working with relational databases and architecting highly scalable applications on Amazon Web Services. In his current role, he works with customers to design scalable, highly available, secure and cost-effective solutions in the Amazon Web Services Cloud.