Feb 25, 2025
DNS: A Deep Dive in AWS Resources & Best Practices to Adopt
Endpoints on the internet and in your private data centers have an address of the form 17.18.19.20. This is an IP address or, more specifically, an IPv4 address. Nowadays, you might also see IPv6 addresses for your endpoints.
For modern cloud infrastructures, using IP addresses directly is error-prone and unreliable, because IP addresses are often ephemeral. Even if you have static and long-lived endpoints with the same IP address year in and year out, it can still be a good idea to use a friendly name instead of a string containing a lot of numbers.
This is where the Domain Name System (DNS) enters the story. DNS assigns a friendly name to your IP addresses. Instead of connecting to 17.18.19.20 to reach your Linux machine, you might connect to my-linux-machine.example.com instead.
DNS is used everywhere, from simple websites to multi-tier cloud architectures spanning different cloud providers and multiple regions. This puts DNS in a unique position to be one of the top candidates for breaking cloud infrastructure.
DNS on AWS
The DNS service on AWS is named Amazon Route 53.
In Route 53 you create one or more DNS zones, which Route 53 calls hosted zones. Each DNS zone corresponds to one domain name, e.g. example.com. Each DNS zone contains one or more DNS records. Each record corresponds to a subdomain of the zone's domain name (e.g. www.example.com or db.example.com).
There are multiple types of records. The two most common types are A-records and CNAME-records. An A-record assigns a friendly name (e.g. www.example.com) to an IP address (e.g. 17.18.19.20). A CNAME-record assigns a friendly name to another friendly name (e.g. www.example.com that points at app.s3-website-eu-west-1.amazonaws.com).
A different record type that is important is the NS-record type. NS is short for nameserver. Each hosted zone on AWS has a number of nameservers associated with it. A nameserver is the server address that is authoritative for the domain (e.g. example.com). This is how the global DNS system knows which nameserver is to be trusted when it comes to queries for the domain.
DNS zones can be public or private. Public DNS zones contain records that anyone can query the DNS system for and get a response. Private DNS zones can only be queried from within a private environment on AWS, e.g. an AWS Virtual Private Cloud (VPC).
Both private and public DNS zones are heavily used in modern cloud infrastructures.
The Route 53 service offers more features than the basic DNS features discussed above, but they are outside the scope of this blog post.
Manage AWS DNS with Terraform
The AWS provider for Terraform has full support to manage all aspects of DNS with Route 53 on AWS.
To create a public DNS zone you use the aws_route53_zone resource type:
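A minimal sketch of such a zone could look like the following. The resource name `example` is an illustrative choice, and the configuration assumes the AWS provider is already configured with credentials and a region.

```hcl
# Public hosted zone for the domain.
# "example" is an illustrative resource name, not a required value.
resource "aws_route53_zone" "example" {
  name = "example.com"
}
```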
Note that example.com is a special case. You will not be able to create this hosted zone on AWS because it is reserved. It is used as an illustrative example in this case.
Each zone you create has several DNS nameservers. There is automatically an NS-record for the nameservers created in your hosted zone. You need to provide these nameservers to your DNS name registrar, unless you are using AWS as the domain name registrar.
You can also find the nameservers in the AWS UI or as an attribute on the DNS zone resource:
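As a sketch, the `name_servers` attribute of the zone can be exposed as a Terraform output. The zone is declared again so that the snippet stands on its own, and the output name is an assumption.

```hcl
# Hosted zone, declared here so the snippet is self-contained.
resource "aws_route53_zone" "example" {
  name = "example.com"
}

# List of nameservers that Route 53 assigned to the hosted zone.
# Hand these to your registrar if the domain is registered outside AWS.
output "name_servers" {
  description = "Nameservers for the example.com hosted zone"
  value       = aws_route53_zone.example.name_servers
}
```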
It is common that you set up a hosted zone for a subdomain (e.g. dev.example.com). This is a good way to organize and manage your domains at scale. This also allows you to delegate administration of a subdomain to a different team. You could assign permissions for users to a subdomain, while not giving them any access to the parent domain.
To make sure the chain of authority for DNS queries works as intended, you must connect the parent domain with the subdomain. You can do this by connecting the two corresponding hosted zones in Route 53.
In Terraform, you do this by setting up a new DNS zone resource for the subdomain, and adding an NS-record to the parent DNS zone referencing the new child DNS zone:
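The delegation could be sketched as follows. The resource names (`parent`, `dev`, `dev_ns`) and the TTL of 300 seconds are assumptions chosen for illustration.

```hcl
# Parent hosted zone.
resource "aws_route53_zone" "parent" {
  name = "example.com"
}

# Child hosted zone for the subdomain.
resource "aws_route53_zone" "dev" {
  name = "dev.example.com"
}

# NS-record in the parent zone that delegates dev.example.com
# to the nameservers of the child zone.
resource "aws_route53_record" "dev_ns" {
  zone_id = aws_route53_zone.parent.zone_id
  name    = "dev.example.com"
  type    = "NS"
  ttl     = 300 # illustrative value
  records = aws_route53_zone.dev.name_servers
}
```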
The NS-record is added using the aws_route53_record resource type. Adding A-records or CNAME-records works the same way.
An example of an A-record for server.example.com pointing at the IP address 17.18.19.20 is shown below:
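A sketch of this record is shown here. The `zone_id` variable is a hypothetical input holding the ID of the example.com hosted zone, and the TTL is an illustrative value.

```hcl
# Hypothetical input: ID of the hosted zone for example.com.
variable "zone_id" {
  type        = string
  description = "ID of the hosted zone that will contain the record"
}

# A-record mapping server.example.com to an IPv4 address.
resource "aws_route53_record" "server" {
  zone_id = var.zone_id
  name    = "server.example.com"
  type    = "A"
  ttl     = 300 # illustrative value
  records = ["17.18.19.20"]
}
```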
An example of a CNAME-record pointing at the S3 bucket website endpoint my-website-bucket.s3-website-eu-west-1.amazonaws.com is shown below:
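A sketch of this record follows. The record name www.example.com, the `zone_id` variable and the TTL are assumptions made for illustration.

```hcl
# Hypothetical input: ID of the hosted zone for example.com.
variable "zone_id" {
  type        = string
  description = "ID of the hosted zone that will contain the record"
}

# CNAME-record mapping a friendly name to the S3 website endpoint.
resource "aws_route53_record" "www" {
  zone_id = var.zone_id
  name    = "www.example.com" # assumed record name
  type    = "CNAME"
  ttl     = 300 # illustrative value
  records = ["my-website-bucket.s3-website-eu-west-1.amazonaws.com"]
}
```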
This covers the basics of managing DNS resources on AWS Route 53 using Terraform.
How to break your infrastructure with DNS
Managing DNS at scale can be complex. The most devastating accidental change you can make is deleting a whole hosted zone for a domain along with all of the DNS records that it contains.
Imagine working as a cloud engineer in the platform team at an organization named Umbrella Security. Your organization is a cloud security solutions provider covering every aspect of the security space.
You are in charge of the public DNS hosted zone for the organization. This zone is used for most public facing websites and applications that your customers have access to. The domain associated with this hosted zone is umbrellasecurity.cloud.
The hosted zone for umbrellasecurity.cloud is part of the platform team infrastructure and it has been set up like this:
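A minimal sketch of what this configuration might contain is shown here. The resource name `public` is an assumption.

```hcl
# Public hosted zone owned by the platform team.
# The resource name "public" is an illustrative choice.
resource "aws_route53_zone" "public" {
  name = "umbrellasecurity.cloud"
}
```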
You have also set up hosted zones for a number of subdomains. You currently have three subdomains, one for each environment that your developers are working with (dev, stage and prod). The prod environment child hosted zone is configured like this:
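A sketch of the prod child zone and its delegation might look as follows. The parent zone is declared again so that the snippet stands on its own. Resource names and the TTL are assumptions.

```hcl
# Parent hosted zone, declared here so the snippet is self-contained.
resource "aws_route53_zone" "public" {
  name = "umbrellasecurity.cloud"
}

# Child hosted zone for the prod environment.
resource "aws_route53_zone" "prod" {
  name = "prod.umbrellasecurity.cloud"
}

# Delegation from the parent zone to the prod child zone.
resource "aws_route53_record" "prod_ns" {
  zone_id = aws_route53_zone.public.zone_id
  name    = "prod.umbrellasecurity.cloud"
  type    = "NS"
  ttl     = 300 # illustrative value
  records = aws_route53_zone.prod.name_servers
}
```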
The development team in charge of setting up the application reachable on the different subdomains is currently using a Terraform data source to reference the child hosted zone for each subdomain.
For the prod environment this looks like:
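A sketch of such a configuration in the development team's repository could be the following. The record name app.prod.umbrellasecurity.cloud and the target IP address are hypothetical.

```hcl
# Look up the hosted zone that the platform team manages elsewhere.
# Nothing in this configuration owns the zone, which is what makes
# the dependency hidden.
data "aws_route53_zone" "prod" {
  name = "prod.umbrellasecurity.cloud"
}

# Hypothetical application record created inside the looked-up zone.
resource "aws_route53_record" "app" {
  zone_id = data.aws_route53_zone.prod.zone_id
  name    = "app.prod.umbrellasecurity.cloud"
  type    = "A"
  ttl     = 300 # illustrative value
  records = ["17.18.19.20"]
}
```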
The development team knows the data source reference to the hosted zone for the subdomain is a hidden dependency on the platform team's configuration, but assumes that any changes made will be communicated properly between the teams. So far the collaboration has been working fine.
The platform team has decided to move away from using a subdomain for the prod environment and wants the umbrellasecurity.cloud domain to be used instead. This would make public domain names look more professional.
To prepare for the change the platform team has communicated the intended update to the development team and set a deadline when the old hosted zone for the prod subdomain will be deleted.
By default, it is impossible to delete a hosted zone with Terraform if it contains DNS records. The platform team knows about this safety mechanism but since the hosted zone is planned to be destroyed they make a preparatory update to the hosted zone:
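The preparatory update most likely involves the `force_destroy` argument of `aws_route53_zone`, which tells Terraform to delete all records in the zone when the zone itself is destroyed. A sketch, assuming the resource name `prod`:

```hcl
# Prod child hosted zone with the deletion safety mechanism disabled.
resource "aws_route53_zone" "prod" {
  name = "prod.umbrellasecurity.cloud"

  # Allow Terraform to destroy the zone even if it still contains
  # records, including records created by other teams.
  force_destroy = true
}
```

With this setting applied, removing the resource from the configuration deletes the zone together with every record in it, regardless of which Terraform configuration created those records.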
The deadline for the destruction of the prod subdomain is approaching and the platform team has prepared a pull-request with the change:
[Figure: the pull-request with the change]
Before the pull-request is approved, the platform team sets up a quick sync call with the development team. Once it is approved, the automation runs Terraform to destroy the old prod hosted zone.
The workday is over and it is time to go home.
Meanwhile, a different development team is running a demo session with a new potential customer to showcase their applications. All of a sudden their application is giving them an unpleasant response:
[Figure: the error response returned by the application]
It is rare that you accidentally delete a hosted zone, but the possibility is there and accidents happen. How big of an impact an accident like this has depends on your DNS environment setup and how other teams are consuming this environment. However, in general the impact will be large.
One solution to the scenario outlined above is knowledge and proper communication. But relying on knowledge and communication alone is error-prone. Even a well-planned change can miss hidden dependencies in your infrastructure environment. You might be aware of many hidden dependencies from your experience of the environment, but chances are there are gaps in your awareness.
Anyshift is an AI-powered tool that can assist you in a situation like this. Anyshift provides a comprehensive map of your cloud infrastructure and can correlate existing resources with their definitions in your source code repositories. This allows Anyshift to detect hidden dependencies that are difficult to discover using other means.
The impact analysis feature of Anyshift can inform you of hidden dependencies and the blast radius of a change to your Terraform infrastructure. As an example, for a pull-request introducing a change similar to the one outlined in the scenario above, Anyshift finds the dependency between the two Terraform configurations located in different GitHub repositories:
[Figure: Anyshift impact analysis of the pull-request]
This allows you to investigate the impacted resources to understand if your intended change is safe to perform or not. You get a natural language analysis of what the change means for your infrastructure:
[Figure: Anyshift natural language analysis of the change]
Learn more on Anyshift.io.
Best practices for DNS on AWS
There are a number of best practices around DNS with Amazon Route 53 to keep in mind.
Use hosted zones wisely
Using hosted zones for each subdomain provides the following benefits:
You can delegate the administration of a hosted zone to a team. For instance, if a team is responsible for a service available at team-1.example.com, then this team can have administrative privileges over team-1.example.com but no direct privileges for the example.com hosted zone.
It minimizes the blast radius of any misconfigurations to a given hosted zone. If you accidentally delete a hosted zone for a subdomain, then this does not affect any other hosted zone.
Records spread across several child hosted zones are easier to manage than a single hosted zone containing all the records.
You must take care to manage the hosted zones wisely using infrastructure as code and take regular backups (see below).
Back up your hosted zones
As with other cloud infrastructure, you should take regular backups of your hosted zones. This means exporting all the details of the hosted zone and all the records that are part of the hosted zone.
There are no native solutions for taking regular backups of hosted zones on Route 53, but you can use the AWS CLI and build a simple script that performs this job. You could also use the boto3 package in Python to create a script to do this.
You should also test how you can perform a restore of these backups to discover any issues with the process before you need to use the backups in a real situation.
Use routing policies for complex infrastructures
Route 53 offers routing policies for complex infrastructures. One example is if you have infrastructure distributed across multiple AWS regions. You can set up routing policies for a DNS record to route users to the closest region.
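One way to route users to a nearby region is latency-based routing. The sketch below assumes two endpoints in two regions. The `zone_id` variable, the record name, the regions and the IP addresses (taken from the documentation address ranges) are all hypothetical.

```hcl
# Hypothetical input: ID of the hosted zone for example.com.
variable "zone_id" {
  type        = string
  description = "ID of the hosted zone that will contain the records"
}

# Two records share the same name and type. Route 53 answers with
# the one whose region gives the lowest latency for the user.
resource "aws_route53_record" "app_eu" {
  zone_id        = var.zone_id
  name           = "app.example.com"
  type           = "A"
  ttl            = 60
  set_identifier = "eu-west-1" # must be unique per record in the set
  records        = ["192.0.2.10"]

  latency_routing_policy {
    region = "eu-west-1"
  }
}

resource "aws_route53_record" "app_us" {
  zone_id        = var.zone_id
  name           = "app.example.com"
  type           = "A"
  ttl            = 60
  set_identifier = "us-east-1"
  records        = ["198.51.100.10"]

  latency_routing_policy {
    region = "us-east-1"
  }
}
```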
Another type of routing policy is weighted routing. This allows you to send your traffic to different endpoints based on weight. One use-case for this is canary testing, where you send a small percentage of your traffic to a new application version while you monitor the behavior of the new version before you shift all of your traffic to the new version.
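A canary setup with weighted routing could be sketched as follows. The 90/10 split, the `zone_id` variable, the record name and the IP addresses are illustrative assumptions.

```hcl
# Hypothetical input: ID of the hosted zone for example.com.
variable "zone_id" {
  type        = string
  description = "ID of the hosted zone that will contain the records"
}

# Current application version receives roughly 90% of the queries.
resource "aws_route53_record" "app_stable" {
  zone_id        = var.zone_id
  name           = "app.example.com"
  type           = "A"
  ttl            = 60
  set_identifier = "stable"
  records        = ["192.0.2.10"]

  weighted_routing_policy {
    weight = 90
  }
}

# New application version receives roughly 10% of the queries.
resource "aws_route53_record" "app_canary" {
  zone_id        = var.zone_id
  name           = "app.example.com"
  type           = "A"
  ttl            = 60
  set_identifier = "canary"
  records        = ["198.51.100.10"]

  weighted_routing_policy {
    weight = 10
  }
}
```

A short TTL keeps clients from caching one answer for long, so the observed traffic split stays close to the configured weights.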
Use DNS Security Extensions (DNSSEC)
DNS is susceptible to DNS spoofing or man-in-the-middle attacks. This is when an attacker intercepts DNS queries intended for your hosted zone, and injects their own endpoints as the response. To protect your environment from these types of attacks you can configure the DNS Security Extensions (DNSSEC).
Amazon Route 53 supports DNSSEC signing and DNSSEC for domain registration. The details of how DNSSEC works are outside the scope of this blog post.
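Enabling DNSSEC signing for a hosted zone with Terraform could be sketched as follows. This is a rough outline under several assumptions: the `zone_id` variable and the resource names are hypothetical, and the KMS key policy is one plausible policy that you should review against your own security requirements before use.

```hcl
# Route 53 requires the KMS key used for DNSSEC signing to be in us-east-1.
provider "aws" {
  region = "us-east-1"
}

data "aws_caller_identity" "current" {}

# Hypothetical input: ID of the public hosted zone to sign.
variable "zone_id" {
  type        = string
  description = "ID of the public hosted zone to enable DNSSEC for"
}

# Asymmetric KMS key that backs the key-signing key (KSK).
resource "aws_kms_key" "dnssec" {
  customer_master_key_spec = "ECC_NIST_P256"
  key_usage                = "SIGN_VERIFY"
  deletion_window_in_days  = 7

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Sid       = "EnableIAMUserPermissions"
        Effect    = "Allow"
        Principal = { AWS = "arn:aws:iam::${data.aws_caller_identity.current.account_id}:root" }
        Action    = "kms:*"
        Resource  = "*"
      },
      {
        Sid       = "AllowRoute53DNSSECService"
        Effect    = "Allow"
        Principal = { Service = "dnssec-route53.amazonaws.com" }
        Action    = ["kms:DescribeKey", "kms:GetPublicKey", "kms:Sign"]
        Resource  = "*"
      },
      {
        Sid       = "AllowRoute53DNSSECToCreateGrant"
        Effect    = "Allow"
        Principal = { Service = "dnssec-route53.amazonaws.com" }
        Action    = "kms:CreateGrant"
        Resource  = "*"
        Condition = { Bool = { "kms:GrantIsForAWSResource" = "true" } }
      },
    ]
  })
}

# Key-signing key for the hosted zone.
resource "aws_route53_key_signing_key" "example" {
  hosted_zone_id             = var.zone_id
  key_management_service_arn = aws_kms_key.dnssec.arn
  name                       = "example"
}

# Turn on DNSSEC signing once the key-signing key exists.
resource "aws_route53_hosted_zone_dnssec" "example" {
  depends_on     = [aws_route53_key_signing_key.example]
  hosted_zone_id = aws_route53_key_signing_key.example.hosted_zone_id
}
```

Signing alone does not complete the setup. To establish the chain of trust, a DS record for the key-signing key must also be added at the parent zone or at the domain registrar.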
Use alias records where possible
An alias record is a special type of record supported in Amazon Route 53. Alias records can only point at specific AWS resources (e.g. API Gateways, CloudFront distributions, S3 buckets and more) or at another record in the same hosted zone.
When you create an alias record you point at a given resource, not at a static value (IP address or hostname). The benefit is that if the underlying resource changes, for example if its IP addresses change, Route 53 automatically reflects the change in its responses for the alias record.
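An alias record could be sketched as follows. The three variables are hypothetical inputs, and the target is assumed to be a load balancer. Note that an alias record has no `ttl` or `records` arguments.

```hcl
# Hypothetical input: ID of the hosted zone for example.com.
variable "zone_id" {
  type        = string
  description = "ID of the hosted zone that will contain the record"
}

# Hypothetical inputs describing the alias target, e.g. a load balancer.
variable "target_dns_name" {
  type        = string
  description = "DNS name of the AWS resource to point at"
}

variable "target_zone_id" {
  type        = string
  description = "Hosted zone ID of the AWS resource to point at"
}

# Alias A-record. Route 53 resolves the target on every query,
# so the record follows changes to the underlying resource.
resource "aws_route53_record" "www" {
  zone_id = var.zone_id
  name    = "www.example.com"
  type    = "A"

  alias {
    name                   = var.target_dns_name
    zone_id                = var.target_zone_id
    evaluate_target_health = true
  }
}
```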
Use infrastructure as code
You should administer your DNS environment using infrastructure as code. As mentioned, you can manage all aspects of AWS Route 53 using Terraform.
This practice will partly remove the need to take regular backups of your hosted zones, because the Terraform configuration will work as a backup and can be used to bring your environment back in case of issues. However, in some situations restoring a backup could be much quicker than provisioning everything with Terraform. You should test both approaches to determine what is the best approach in an emergency. You can always import resources to Terraform if needed after a backup has been restored.
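Importing a restored hosted zone could be sketched with an `import` block, which is available in Terraform 1.5 and later. The hosted zone ID below is a made-up placeholder.

```hcl
# Bring an existing (e.g. restored) hosted zone under Terraform management.
import {
  to = aws_route53_zone.example
  id = "Z0123456789EXAMPLE" # hypothetical hosted zone ID
}

# Resource definition that the imported zone is bound to.
resource "aws_route53_zone" "example" {
  name = "example.com"
}
```

Running `terraform plan` after adding the block shows what Terraform will import and whether the configuration matches the existing zone.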
Use health checks
Health checks are a simple but effective feature of Route 53. You can use this feature standalone, even if you do not use Route 53 for your DNS management.
A health check is a periodic test where you query an endpoint and record the response. The available settings for a health check are shown in the following figure:
[Figure: health check settings in Route 53]
A health check is a simple test that mimics the behavior of your end users.
You can use health checks for your own applications and endpoints, but you should also set up health checks for your external dependencies.
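A basic HTTPS health check could be sketched as follows. The endpoint, the `/health` path, the interval and the failure threshold are illustrative assumptions.

```hcl
# Periodic HTTPS check against a hypothetical endpoint.
resource "aws_route53_health_check" "app" {
  fqdn              = "www.example.com"
  port              = 443
  type              = "HTTPS"
  resource_path     = "/health" # assumed health endpoint
  request_interval  = 30        # seconds between checks
  failure_threshold = 3         # consecutive failures before unhealthy

  tags = {
    Name = "www-example-com"
  }
}
```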
Monitor your DNS environment
As with any other cloud infrastructure, you should set up monitoring of your DNS environment. The easiest way to start is with Amazon CloudWatch, the native service for monitoring on AWS.
One important metric to keep track of for your hosted zone is DNSQueries, the number of DNS queries that are processed in your hosted zone during a specific time period. This metric can quickly indicate if there are any issues for the hosted zone, or if you are experiencing a DDoS attack.
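An alarm on this metric could be sketched as follows. The `hosted_zone_id` variable, the alarm name and the threshold are assumptions, and a sensible threshold depends on your normal query volume. Route 53 publishes its CloudWatch metrics in the us-east-1 region, so the alarm is created there.

```hcl
# Route 53 metrics are available in us-east-1.
provider "aws" {
  region = "us-east-1"
}

# Hypothetical input: ID of the public hosted zone to monitor.
variable "hosted_zone_id" {
  type        = string
  description = "ID of the public hosted zone to monitor"
}

# Alarm when the number of DNS queries in a 5-minute window is unusually high.
resource "aws_cloudwatch_metric_alarm" "dns_queries_high" {
  alarm_name          = "route53-dns-queries-high"
  namespace           = "AWS/Route53"
  metric_name         = "DNSQueries"
  statistic           = "Sum"
  period              = 300
  evaluation_periods  = 1
  comparison_operator = "GreaterThanThreshold"
  threshold           = 100000 # illustrative, tune to your baseline
  treat_missing_data  = "notBreaching"

  dimensions = {
    HostedZoneId = var.hosted_zone_id
  }
}
```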
Conclusions
Since DNS is in heavy use in most cloud infrastructures, it is a great candidate for breaking your cloud infrastructure.
The DNS service on AWS is Amazon Route 53. This service covers most modern DNS requirements, and you can manage it completely using infrastructure as code with Terraform.
We saw an example of how working with DNS in an organization can cause issues if care is not taken. Most applications, systems and cloud resources you work with require DNS records, so DNS appears throughout your Terraform configurations, and there are likely many hidden DNS-related dependencies among them.
You can use Anyshift to highlight dependencies between your Terraform configurations. Anyshift uses your cloud environment, source code repositories, and Terraform state files to provide you with intelligent insights.