Cost Effective! AWS Database DR Strategy that makes sense!
Power of cloud when you actually need it
A disaster recovery (DR) strategy might not be the most talked-about topic in the cloud right now. Still, it's essential for meeting the compliance and high-availability needs of your applications and company.
Many blogs explain what a DR strategy involves and how to lower your recovery time objective (RTO) and recovery point objective (RPO) for your databases without overspending. This post, however, is about something I recently learned firsthand that could help reduce your DR database costs.
On AWS, we use Multi-AZ RDS Aurora, a popular database choice. It offers effective switchover and failover methods and reliable, eventually consistent data replication across regions. So, for most of our production workloads, we have the following setup:
Global RDS Cluster
—> Primary region cluster
——> Primary db instance
—> Secondary region cluster
——> Secondary db instance
Here, the secondary region cluster is the backup cluster for a region or service failover. But if you look closer, you will see that the secondary database instance, which is the actual compute behind RDS, is not doing anything until the failover happens. Only the storage is being used to replicate transactions across regions.
You may have guessed that I am talking about using a headless cluster to save costs on the secondary region database.
So, what’s a headless cluster?
As the name suggests, it's a body without a head or, in other words, a cluster without compute. Aurora decouples storage from compute, so the secondary cluster can keep its storage without running a DB instance, and we stop paying for compute we are not using.
After implementing this, you are left with:
Global RDS Cluster
—> Primary region cluster
——> Primary db instance
—> Secondary region cluster
You might be wondering what happens when you actually face a failover. As there is no instance in the DR region, the database there cannot serve traffic. But since we all have alerts in place for events like these, we can add an instance in the DR region, either through automation or manually, by running an AWS CLI command or using the console.
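As a rough illustration of that recovery step, here is a minimal Python sketch that builds the AWS CLI call for attaching a new instance to the headless secondary cluster. The cluster name, instance name, region, instance class, and engine are all placeholders I made up for the example, and the helper only prints the command unless you explicitly turn off the dry run.

```python
import shlex
import subprocess


def add_instance_command(cluster_id, instance_id, region,
                         instance_class="db.r6g.large", engine="aurora-mysql"):
    """Build the AWS CLI call that adds a DB instance to an existing Aurora cluster."""
    return [
        "aws", "rds", "create-db-instance",
        "--db-instance-identifier", instance_id,
        "--db-cluster-identifier", cluster_id,
        "--db-instance-class", instance_class,
        "--engine", engine,
        "--region", region,
    ]


def run(command, dry_run=True):
    """Print the command; execute it only when dry_run is False."""
    print(shlex.join(command))
    if not dry_run:
        subprocess.run(command, check=True)


# Hypothetical identifiers: replace them with your own DR cluster details.
run(add_instance_command("dr-secondary-cluster", "dr-secondary-instance-1", "us-west-2"))
```

The same command could be triggered from an alert-driven automation; the instance class and engine have to match what your cluster actually supports.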
In this case, the RTO will be no more than the time it takes you to respond and bring up an instance in the DR region. This way, we save the compute cost for the 99% of the time we are not using the secondary DB, while the data keeps replicating to the DR region to support availability.
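To put a number on the saving, here is a back-of-the-envelope sketch. The hourly price is a made-up placeholder, not an AWS quote, and the calculation covers only the instance (compute) charge; storage and cross-region replication are still billed for a headless cluster.

```python
HOURS_PER_MONTH = 730  # common approximation for an average month


def monthly_compute_saving(hourly_instance_price, instance_count=1, dr_hours_used=0.0):
    """Estimate the instance cost avoided by keeping the secondary cluster headless.

    dr_hours_used is the time an instance actually ran in the DR region,
    for example during a failover or a DR drill.
    """
    idle_hours = HOURS_PER_MONTH - dr_hours_used
    return hourly_instance_price * instance_count * idle_hours


# Hypothetical price of 0.50 per instance-hour, one instance, no DR events.
print(f"Estimated monthly compute saving: {monthly_compute_saving(0.50):.2f}")
```

Plug in the on-demand price of your own instance class and region to get a realistic figure.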
The headless cluster option is not supported in the AWS console; you have to run CLI commands to make use of it.
Some transactions might get lost when you do a failover, because cross-region replication in Aurora is asynchronous and eventually consistent. AWS has solutions around this, which you can read about in detail in the AWS documentation.
You can create an Aurora global database with a headless configuration in one of the following ways:
Convert an Aurora cluster to a global database with a headless configuration – You can create an Aurora global database with a headless configuration by adding a secondary cluster without an Aurora database instance.
Modify an existing Aurora global cluster to create headless configuration – To convert a secondary region cluster into a headless configuration, you can delete the Aurora database instances from the secondary region Aurora cluster. This secondary cluster is now considered headless.
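The two approaches above can be sketched as AWS CLI calls. The Python snippet below only builds and prints the commands; every identifier, the ARN, the regions, and the engine version are hypothetical placeholders. It is a sketch under those assumptions rather than a complete procedure: an encrypted cluster, for instance, would likely need additional options such as a KMS key for the secondary region.

```python
import shlex


def create_global_cluster(global_id, source_cluster_arn, primary_region):
    """Promote an existing Aurora cluster to the primary of a new global database."""
    return ["aws", "rds", "create-global-cluster",
            "--global-cluster-identifier", global_id,
            "--source-db-cluster-identifier", source_cluster_arn,
            "--region", primary_region]


def add_headless_secondary(global_id, secondary_cluster_id, secondary_region,
                           engine, engine_version):
    """Add a secondary cluster to the global database without creating any instance."""
    return ["aws", "rds", "create-db-cluster",
            "--db-cluster-identifier", secondary_cluster_id,
            "--global-cluster-identifier", global_id,
            "--engine", engine,
            "--engine-version", engine_version,
            "--region", secondary_region]


def remove_secondary_instance(instance_id, secondary_region):
    """Delete an instance from an existing secondary cluster, leaving it headless."""
    return ["aws", "rds", "delete-db-instance",
            "--db-instance-identifier", instance_id,
            "--region", secondary_region]


# Way 1: convert a cluster to a global database, then add a headless secondary.
way_one = [
    create_global_cluster(
        "my-global-db",
        "arn:aws:rds:us-east-1:123456789012:cluster:primary-cluster",  # placeholder ARN
        "us-east-1"),
    add_headless_secondary(
        "my-global-db", "dr-secondary-cluster", "us-west-2",
        "aurora-mysql", "8.0.mysql_aurora.3.04.0"),  # version must match the primary
]

# Way 2: strip the instances from an existing secondary cluster.
way_two = [remove_secondary_instance("dr-secondary-instance-1", "us-west-2")]

for command in way_one + way_two:
    print(shlex.join(command))
```

In the second approach, the delete call would be repeated for each instance in the secondary cluster.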
The Aurora global database uses dedicated infrastructure in the purpose-built storage layer to handle replication across regions. The storage volume of the now-headless secondary cluster is kept in sync with the Aurora cluster in the primary region.
The steps for implementing this are here
Conclusion
RDS Aurora is a killer product in itself. Options like this from AWS encourage us to rely on their services without worrying about the large check we have to sign for things we don’t even use. Shoutout 🥳 to AWS for this customer-centric approach to their services.
References
Using switchover or failover in Amazon Aurora Global Database - Amazon Aurora