The tale of a cheap DNS provider
Introduction
Let’s suppose you’ve just had the most brilliant idea for a product, and you even got through the second most difficult challenge: finding the perfect name for it. Now comes the easy part: register your domain, hoping nobody has claimed it yet, and create your fancy, visually appealing website that may attract lots of customers and investors worldwide.
In the earliest stage of Colossyan’s startup journey (August 2021), the founders found themselves in this very situation, and they bought the colossyan.com domain at dreamhost.com. It was the most budget-friendly way to start off with a domain and a website, at least for non-frontend engineers. This article is about that decision and our journey with Dreamhost, which never ceased to amaze us. We don’t want to judge Dreamhost: without it, Colossyan wouldn’t even exist. Still, the $5 we saved by not buying the domain from another, more established DNS provider has its effects to this day.
The API
When you look at a product’s attractive main page, checking its API documentation is usually not your first thought. Even if we had done that, we would probably have gone with Dreamhost anyway, since we weren’t going to use the API. Or were we?
Obviously, the Dreamhost API is used for managing DNS records in Dreamhost. It was written long ago and has never been updated to modern standards. As a result, it has the following “features” that should be kept in mind when interacting with it:
- It does not support even the most primitive (Basic Auth) authentication. You must provide your API key in a query parameter. Hey, at least we’re not communicating through HTTP, right?
- There are only “Add”, “List” and “Delete” endpoints. There’s no support for fetching the DNS records of a specific name or for filtering them on the server side.
- Every operation, including “Add” and “Delete”, uses the GET HTTP method. Intuitive. E.g. an add DNS record operation:
https://api.dreamhost.com/?key=1A2B3C4D5E6F7G8H&cmd=dns-add_record&record=example.com&type=TXT&value=test123
- It has pretty interesting rate limiting. We could not find out the exact limits, but once you reach your hourly limit, it denies access for at least an hour. None of this is documented.
response: slow_down_bucko
- At the time, there was also no API wrapper or library available.
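To make these quirks concrete, here is a minimal sketch of how a client might cope with them. It only builds the request URL and inspects responses; it sends nothing over the network. The function names are hypothetical, and the shape of a listed record (a dictionary with `record`, `type` and `value` keys) is an assumption that simply mirrors the parameters of the add operation above.

```python
from urllib.parse import urlencode

API_BASE = "https://api.dreamhost.com/"
RATE_LIMIT_MARKER = "slow_down_bucko"  # the response quoted above


def build_url(api_key: str, cmd: str, **params: str) -> str:
    """Build the GET URL for a command; the API key travels in the query string."""
    query = {"key": api_key, "cmd": cmd, **params}
    return API_BASE + "?" + urlencode(query)


def is_rate_limited(body: str) -> bool:
    """Return True when the response body carries the rate-limit marker."""
    return RATE_LIMIT_MARKER in body


def filter_records(records, record=None, rtype=None):
    """Filter on the client side, since the API cannot filter on the server.

    Assumption: each listed record is a dict with "record", "type" and "value".
    """
    return [
        r for r in records
        if (record is None or r["record"] == record)
        and (rtype is None or r["type"] == rtype)
    ]


# Rebuilds the example request shown in the list above.
url = build_url(
    "1A2B3C4D5E6F7G8H", "dns-add_record",
    record="example.com", type="TXT", value="test123",
)
```

Since the limits are unknown, a caller would probably want to stop sending requests for a good while once `is_rate_limited` returns True, rather than retrying immediately.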
Application and workaround
As Colossyan matured, so did our technology stack. The first version of the Creator was written in Next.js, a React framework for building full-stack web applications that ships with fast, Rust-based JavaScript build tooling. Next.js is developed by Vercel, whose hosting platform is tailored to Next.js and offers fast builds and continuous deployment for your application.
We didn’t start with Vercel in the first place because, when the first Colossyan application prototype was created, we didn’t have any customers and wanted to avoid locking ourselves into an expensive vendor on day 0. We used AWS Amplify for a brief period, but realized quickly (though probably not quickly enough) that it did not make development faster, at least for us. By the time we migrated away, we had already opted out of many Next.js features, and it became difficult to go back. A few months later we had drifted even further from this goal. Being an AI-based video-making platform, we support huge uploaded videos as well, and certain pieces of information, like the duration and the codecs of uploaded assets, are better calculated server-side before being sent to our AI models. This caused memory spikes as big as 8 GB (compared to an average memory footprint of roughly 100 MB), which made it infeasible to run on Vercel, as that platform is focused more on edge use cases and less on resource-heavy processing.
The ability to spin up temporary deployments is crucial for keeping the feedback loops of development teams short, so the need was clear. We decided to implement our own solution by automating our DNS provider, Dreamhost.
Implementing it took about two days in June 2022. The automation uses github-pr-controller to synchronize the pull requests of a repository and go-dreamhost to create URLs (domain records) for each pull request. The rest of the automation is a closed-source Kubernetes controller which syncs the Kubernetes pods (running the built Next.js image) and the ingress rules.
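The DNS part of such an automation boils down to keeping one hostname per open pull request. The sketch below illustrates the idea only and is not the actual controller: the `pr-<number>.<domain>` naming scheme and all function names are assumptions made for the example.

```python
import re

# Hypothetical naming scheme: pr-<number>.<base domain>
PREVIEW_PREFIX = re.compile(r"^pr-\d+\.")


def preview_hostname(pr_number: int, base_domain: str) -> str:
    """Derive the preview hostname for a pull request."""
    return f"pr-{pr_number}.{base_domain}"


def plan_preview_hosts(open_prs, existing_hosts, base_domain):
    """Return (hosts to add, hosts to delete) for the current set of open PRs.

    Only hostnames that match the preview scheme are considered for deletion,
    so unrelated records (e.g. a billing subdomain) are never touched.
    """
    desired = {preview_hostname(n, base_domain) for n in open_prs}
    managed = {
        h for h in existing_hosts
        if PREVIEW_PREFIX.match(h) and h.endswith("." + base_domain)
    }
    return sorted(desired - managed), sorted(managed - desired)


to_add, to_delete = plan_preview_hosts(
    open_prs=[12, 15],
    existing_hosts=["pr-12.example.com", "pr-9.example.com", "billing.example.com"],
    base_domain="example.com",
)
```

Here the pull request 15 gets a new hostname, the stale one for the closed pull request 9 is scheduled for removal, and the unrelated subdomain is left alone.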
The root domain
WordPress might be familiar to some folks, since it’s been around since 2003. It’s a versatile and user-friendly content management system (CMS) that lets individuals and businesses create and manage websites with little effort. It’s easy to get started with, but you will hit its limitations pretty quickly, especially if you want to create something extraordinary.
Before the first investment it’s crucial to deliver as fast as you can, and the landing page just serves as a quickly updated pitch deck. However, once the first paycheck arrives, it is wise to revisit your choices so that you can keep your momentum in the future. As Colossyan turned out to be an idea worth spending investors’ money on, the main page was handed over to a dedicated content management team and migrated from WordPress to Webflow. It even got a dedicated designer!
The problem
Everything seemed nice and steady until the company matured enough to adopt a status page in November 2022. After some internal discussions we adopted Betteruptime by Betterstack (the alternative was Statuspage by Atlassian) - this topic in itself may also deserve a blog post. Within days it became clear that the main page was quite unstable. colossyan.com could become unreachable for minutes at a time, mostly due to DNS errors.
We found out later that the WordPress integration had never been disabled, and the managed WordPress provided by Dreamhost automatically tried to override the root domain DNS records that we had configured to point at our Webflow deployment. Disabling the managed WordPress resolved the problem. Look at our status page since then - incredible.
Subdomains
When Colossyan reached critical mass and started thriving, more and more domain-related requests started coming in from the organization. Requests like:
- I have integrated the managed billing solution, could you please add the 1.2.3.4 A DNS record to the billing.colossyan.com subdomain?
- Marketing: can you please fix the DMARC records so that the mails sent to the customer won’t end up in the Spam folder?
- Can you please add the DNS record ASAP? Only you have the credentials to the domain portal.
Solution
Keeping DevOps best practices in mind, we started brainstorming ways to support these configuration changes in a more secure, easier and more controlled way. Obviously we didn’t need to come up with anything brand new - we needed GitOps to control these changes.
- Changing something in the UI is inherently error-prone. Copy-pasting the content of the record from Slack is not life insurance. -> Declarative
- Even in a team of two we sometimes got confused about whose responsibility it was to make the changes and what the current state of the DNS records was. -> Versioned and immutable
- Only one person in the team had the credentials, which were passed around when needed. -> Continuously reconciled
For a second we obviously thought about creating a DNS manager component, but since there’s already a buzzword for what we wanted to implement, we went with a well-established tool that fits GitOps: Terraform. (We had already been using Terraform for a while; Pulumi would also have been a great choice.)
We implemented the Dreamhost Terraform provider within two days in May 2023. It now manages the DNS records of all our subdomains.
Implementing your own Terraform provider is also great fun; we would definitely recommend it.
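The core idea behind managing records declaratively can be sketched in a few lines. This is an illustration under assumptions, not the provider’s actual code: the `DnsRecord` type and the `plan` function are hypothetical names. Because the API only offers add, list and delete, changing a value shows up as one deletion plus one addition.

```python
from typing import NamedTuple


class DnsRecord(NamedTuple):
    record: str  # e.g. "billing.example.com"
    rtype: str   # e.g. "A" or "TXT"
    value: str


def plan(desired, actual):
    """Compare the versioned desired state with what the provider reports.

    Returns (records to add, records to delete). An empty plan means the
    two states already match, so re-running it changes nothing.
    """
    desired_set, actual_set = set(desired), set(actual)
    return sorted(desired_set - actual_set), sorted(actual_set - desired_set)


desired = [
    DnsRecord("billing.example.com", "A", "1.2.3.4"),
    DnsRecord("example.com", "TXT", "v2"),
]
actual = [
    DnsRecord("billing.example.com", "A", "1.2.3.4"),
    DnsRecord("example.com", "TXT", "v1"),
]
to_add, to_delete = plan(desired, actual)
```

In this example only the TXT record differs, so the plan removes the old value and adds the new one while leaving the A record untouched.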
Summary
As you can see, Dreamhost came with a price that we’re still paying today. However, we owe it for being a flexible provider to start with, one that enabled the company to iterate rapidly from day 0.
As for the future, you might ask: why not just migrate to a hosting solution that already has a Terraform plugin or extensive documentation? The answer is that we will. It’s on our roadmap and we definitely want to take that step, but we’re not in a hurry now.