One of my favorite stories to tell is still the one about taking a company from a $5,000 monthly cloud spend down to $50 per month. To me, this is mostly a story about right-sizing and making sure the appropriate tools were being used. It also involved a small amount of re-architecting so that the application could properly support the new infrastructure.
Compute
The first item to tackle was compute. When I began working on the system, the initial build simply ran on GCP Compute Engine machines. These machines had to be sized fairly large in order to handle some random spikes in the application, which meant paying for the larger compute nodes the entire time.
Moving this to Cloud Functions made it possible to handle the spikes while also taking advantage of the quiet periods, without being charged for unused resources.
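To make the trade-off concrete, here is a rough back-of-the-envelope sketch comparing an always-on machine sized for peak load with pay-per-use billing. Every number below (the hourly rate, the per-second rate, the invocation count) is a made-up placeholder chosen for illustration. None of them are real GCP prices or figures from this project, and the function names are my own.

```python
HOURS_PER_MONTH = 730  # average number of hours in a month


def always_on_cost(hourly_rate: float, instances: int = 1) -> float:
    """Monthly cost of machines that run all month, sized for peak load."""
    return hourly_rate * HOURS_PER_MONTH * instances


def per_use_cost(invocations: int, avg_seconds: float, rate_per_second: float) -> float:
    """Monthly cost when billed only for time spent handling requests."""
    return invocations * avg_seconds * rate_per_second


# Placeholder inputs, NOT real prices or real traffic numbers.
vm_cost = always_on_cost(hourly_rate=0.50, instances=2)
fn_cost = per_use_cost(invocations=200_000, avg_seconds=0.5, rate_per_second=0.0001)

print(f"always-on: ${vm_cost:.2f}/month, per-use: ${fn_cost:.2f}/month")
```

The point is the shape of the two formulas rather than the specific totals: the first one charges for every hour regardless of traffic, while the second one only grows with actual work done.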
Pub/Sub
Initially, the system had no way to queue up a retry. As a result, it would hang for long periods, slow everything else down, and force additional nodes to be stood up. By moving this work onto Pub/Sub, we could easily queue up a retry without stopping the rest of the system. The result was a much smoother setup overall, and one that we could better monitor for errors.
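The retry idea can be sketched without any cloud dependency at all. The example below uses Python's in-process queue.Queue as a stand-in for a Pub/Sub topic, so it is only an illustration of the pattern and not the code that ran in production. The Message class, the drain function, the attempt limit of 3, and the dead-letter list are all assumptions made for this sketch.

```python
import queue
from dataclasses import dataclass


@dataclass
class Message:
    payload: str
    attempts: int = 0


def drain(q: queue.Queue, handler, max_attempts: int = 3):
    """Process every message; requeue failures instead of blocking on them."""
    done, dead = [], []
    while not q.empty():
        msg = q.get()
        try:
            handler(msg.payload)
            done.append(msg.payload)
        except Exception:
            msg.attempts += 1
            if msg.attempts < max_attempts:
                q.put(msg)  # retry later, behind the messages already waiting
            else:
                dead.append(msg.payload)  # give up and surface it for monitoring
    return done, dead


calls = {}


def flaky(payload: str) -> None:
    """Hypothetical handler: "b" fails once, "c" always fails."""
    calls[payload] = calls.get(payload, 0) + 1
    if payload == "b" and calls[payload] < 2:
        raise RuntimeError("transient failure")
    if payload == "c":
        raise RuntimeError("permanent failure")


q = queue.Queue()
for p in ["a", "b", "c", "d"]:
    q.put(Message(p))

done, dead = drain(q, flaky)
print("processed:", done)
print("dead-lettered:", dead)
```

A failing message goes to the back of the line, so the healthy messages behind it keep flowing, and anything that exhausts its attempts ends up in one place where it can be monitored.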
Database
The database was the largest issue that had to be resolved. Initially, a massive amount of data was being saved in the database. After further consideration, it was determined that 75% of that data didn't need to be stored at all. Instead, it could be retrieved easily and effectively on demand. Reducing the amount of stored data allowed for a much smaller instance overall, and also meant not paying for as much storage on the database itself.
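As a rough illustration of "store less, fetch on demand", the sketch below keeps only a couple of core fields in the stored row and pulls the rest from the original source when a record is actually read. The names stored_rows, fetch_from_source, and load_record are hypothetical, the upstream lookup is faked, and the small in-memory cache is an optional extra I added for the example and not something described above.

```python
from functools import lru_cache

# Only the core fields are stored (stand-in for the database table).
stored_rows = {
    "42": {"id": "42", "status": "active"},
}

fetch_count = 0  # counts upstream lookups, for demonstration only


@lru_cache(maxsize=1024)
def fetch_from_source(record_id: str) -> tuple:
    """Hypothetical stand-in for retrieving the bulky data from its origin."""
    global fetch_count
    fetch_count += 1
    return (("detail", f"detail for {record_id}"),)


def load_record(record_id: str) -> dict:
    """Merge the small stored row with data retrieved on demand."""
    record = dict(stored_rows[record_id])  # copy so the stored row is untouched
    record.update(dict(fetch_from_source(record_id)))
    return record


first = load_record("42")
second = load_record("42")
print(first)
print("upstream lookups:", fetch_count)
```

The cached value is an immutable tuple so that callers cannot accidentally modify what the cache hands back on later reads.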
Vendor Lock-In
Although this was built on GCP for its entire lifecycle, it has very little vendor lock-in overall. The work to lift and shift to AWS would be minimal, and even moving to a single VM would be possible. The goal of this project was less about creating a cloud-agnostic system and more about ensuring that the system worked well and was cost-effective. GCP was chosen in this instance purely because of the team's familiarity with it.
Conclusion
Overall, this project shows me that the cloud can be very cost-effective. It does require some knowledge of how to properly use the available services and how to make sure that everything is properly sized.