By Bhanu Jamwal, Head of Solutions Engineering, APAC, TiDB
According to a report by IDC, worldwide spending on public cloud services is expected to reach $805 billion in 2024 and double by 2028. From my interactions with IT decision-makers across the APAC region, I can see a clear shift in preference: more are opting to deploy IT services such as databases in the cloud rather than on traditional on-premises setups. And it’s not just a regional trend; similar shifts are happening globally.
Take the recent collaboration between Oracle, AWS, and Microsoft. After two decades of intense rivalry, Oracle embraced cloud partnerships, enabling its customers to run Oracle databases seamlessly on AWS and Azure. Shifts like this are pushing enterprises across industries to adopt the cloud at a faster pace than ever. The challenge I have seen, however, is that while cloud adoption has accelerated over the past decade, cloud cost control mechanisms have not evolved at the same pace. As organisations allocate substantial budgets to cloud solutions, optimising cloud costs has emerged as a critical component of a successful cloud strategy, yet IT leaders still face significant challenges in managing cloud-related costs while demonstrating a clear ROI from cloud-based services.
This challenge extends to cloud databases as well. The elastic nature of cloud databases makes it easy to store large amounts of data without worrying about physical infrastructure, but as enterprises migrate to cloud databases, controlling costs while maintaining database performance becomes crucial. Let’s look at six ways to minimise cloud database spending while continuing to scale and innovate.
Right-sizing
Right-sizing cloud databases involves making informed decisions about resource allocation, such as RAM and CPU. Here are two ways businesses typically approach sizing their databases:
- Provisioning for peak workloads: Cloud providers often require users to specify the resources needed for their databases based on expected usage patterns. By provisioning resources based on the 99th percentile, businesses can ensure that their databases will accommodate unexpected spikes in traffic. The downside is that it can result in over-provisioning during regular operational periods.
For instance, during festive sales, an e-commerce platform may experience a predictable spike in activity that can overwhelm a database provisioned for average daily traffic. However, the same database remains over-provisioned outside the sale period, wasting resources.
- Right-sizing to improve cost efficiency: Autoscaling is one strategy that helps eliminate the mismatch between resource allocation and actual workload demand. A serverless database handles scaling seamlessly, ensuring users pay only for what they use and achieving cost efficiency at every growth stage. The sketch after this list shows one simple way to check whether a database is over-provisioned relative to its observed usage.
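As a rough illustration of the idea, the sketch below compares a provisioned capacity figure against the 99th percentile of observed usage. The metric values, instance size, and 20% headroom buffer are hypothetical assumptions, not recommendations.

```python
# A minimal right-sizing check: compare provisioned capacity against the
# 99th percentile of observed usage. All figures here are hypothetical.
import statistics

provisioned_vcpus = 16  # what the instance is currently sized for (assumed)

# Hourly average CPU-core usage samples over roughly a month (made-up data).
usage_samples = [2.1, 2.4, 3.0, 2.8, 3.5, 9.7, 12.3, 3.1, 2.9, 2.6] * 72

# p99: the level the database must sustain during rare traffic spikes.
p99 = statistics.quantiles(usage_samples, n=100)[98]
typical = statistics.median(usage_samples)

print(f"p99 usage: {p99:.1f} vCPUs, median usage: {typical:.1f} vCPUs")
if provisioned_vcpus > p99 * 1.2:  # 20% headroom is an assumed buffer
    print("Likely over-provisioned outside peak periods; consider "
          "autoscaling or a smaller baseline instance.")
```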
Data lifecycle management
Automate the archiving or deletion of unused or outdated records. Use lifecycle policies to move logs older than a certain number of days to cheaper storage, or to delete them outright. TTL (Time to Live) is a simple way to implement this kind of lifecycle management: a TTL setting defines the lifespan of a piece of data (e.g., a record or document) in the database, and once the TTL expires, the data is automatically deleted or marked for deletion. This reduces manual cleanup tasks for expired data and helps optimise storage and maintain database performance by removing stale data.
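As a minimal sketch of the TTL idea, the example below uses SQLite purely for illustration, with an assumed app_logs table and a 90-day window. Managed cloud databases typically expose TTL as a native table setting; here the equivalent cleanup is run by hand to show the effect.

```python
# TTL-style expiry illustrated with SQLite; schema and 90-day window are
# assumptions for the example, not a specific product's syntax.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE app_logs (
        id INTEGER PRIMARY KEY,
        message TEXT,
        created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
    )
""")
conn.execute("INSERT INTO app_logs (message, created_at) VALUES "
             "('stale entry', DATETIME('now', '-120 days')), "
             "('fresh entry', DATETIME('now'))")

# The TTL sweep: with native TTL the database removes expired rows itself;
# here we run the equivalent DELETE ourselves.
conn.execute(
    "DELETE FROM app_logs WHERE created_at < DATETIME('now', '-90 days')"
)
print(conn.execute("SELECT message FROM app_logs").fetchall())
# Only the fresh entry survives.
```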
Implement multi-tenancy
Multi-tenancy allows multiple applications, users, or clients to share a single database instance while maintaining logical isolation. Consolidating multiple applications onto a single database means fewer instances, which reduces compute and storage costs and enables efficient resource utilisation when workloads have similar usage patterns. Implementation can follow schema-based isolation, where each tenant gets its own schema, or row-level isolation, where a tenant ID column segments data within shared tables. One example is hosting a SaaS platform for multiple customers on a single database instance with logical partitions. The key benefits of this approach are lower licensing and maintenance costs under pay-as-you-go pricing models and better resource utilisation from running multiple applications on a single instance.
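A minimal sketch of row-level isolation follows, again using SQLite so it stays self-contained; the invoices table and tenant names are assumptions for the example.

```python
# Row-level multi-tenancy: tenants share one instance and one table, and
# every query is scoped by a tenant_id column. Names are hypothetical.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE invoices (
        id INTEGER PRIMARY KEY,
        tenant_id TEXT NOT NULL,
        amount REAL
    )
""")
conn.execute("CREATE INDEX idx_invoices_tenant ON invoices (tenant_id)")
conn.executemany(
    "INSERT INTO invoices (tenant_id, amount) VALUES (?, ?)",
    [("acme", 120.0), ("acme", 80.0), ("globex", 45.0)],
)

def invoices_for(tenant_id: str):
    # Scoping every query by tenant_id keeps tenants logically isolated
    # while they share the same instance and storage.
    return conn.execute(
        "SELECT id, amount FROM invoices WHERE tenant_id = ?", (tenant_id,)
    ).fetchall()

print(invoices_for("acme"))    # only acme's rows
print(invoices_for("globex"))  # only globex's rows
```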
A common challenge with multi-tenant databases is the “noisy neighbour” problem, where the resource usage of one tenant can affect others. To address this, implementing resource control in the database is essential. This ensures that resource quotas are allocated to each tenant, preventing any single tenant from overusing system resources.
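The exact mechanism is database-specific, so the sketch below only illustrates the quota idea at the application level with a simple per-tenant query budget. The tenant names and limits are hypothetical, and a real deployment would lean on the database's own resource-control features rather than code like this.

```python
# An application-level illustration of per-tenant resource quotas: each
# tenant gets a query budget per one-minute window. This is a sketch of the
# concept, not a substitute for database-native resource control.
from collections import defaultdict
import time

QUERIES_PER_MINUTE = {"acme": 600, "globex": 100}  # assumed per-tenant quotas
_window_start = defaultdict(float)
_used = defaultdict(int)

def admit_query(tenant_id: str) -> bool:
    """Return True if the tenant is still within its per-minute budget."""
    now = time.monotonic()
    if now - _window_start[tenant_id] >= 60:
        _window_start[tenant_id] = now  # start a new one-minute window
        _used[tenant_id] = 0
    if _used[tenant_id] >= QUERIES_PER_MINUTE.get(tenant_id, 60):
        return False                    # the noisy neighbour gets throttled
    _used[tenant_id] += 1
    return True

print(admit_query("acme"))  # True until acme exhausts its budget
```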
Denormalisation
While right-sizing and data tiering focus on optimising cloud databases for cost-efficiency without sacrificing performance, denormalisation takes a different approach, explicitly targeting read performance for workloads that heavily rely on fast data retrieval.
Creating copies of specific data items can enhance read performance by reducing costly join operations. In an e-commerce store, for example, you’d typically have separate tables for customers, products, and orders, and retrieving one customer’s order history would involve a query that joins the orders table with the customers and products tables.
While this method maintains data integrity, it can be inefficient at the scale of millions of queries. Instead, when a customer places an order, the system can write the product and customer details directly onto the order record, allowing all the necessary information to be retrieved with a single query.
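A minimal sketch of the denormalised layout, using SQLite for illustration; the schema and sample rows are assumptions chosen to show how each order row carries the customer and product details it needs.

```python
# Denormalisation: customer and product details are copied onto the order
# at write time, so the read path never needs a join. Schema is illustrative.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE products  (id INTEGER PRIMARY KEY, title TEXT, price REAL);
    -- Denormalised order table: holds copies of the fields reads need.
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER,
        customer_name TEXT,
        product_title TEXT,
        product_price REAL
    );
    INSERT INTO customers VALUES (1, 'Priya');
    INSERT INTO products  VALUES (10, 'Headphones', 59.0);
    INSERT INTO orders    VALUES (100, 1, 'Priya', 'Headphones', 59.0);
""")

# One single-table hit returns the full order history for a customer,
# instead of a three-way join across customers, products, and orders.
rows = conn.execute(
    "SELECT customer_name, product_title, product_price "
    "FROM orders WHERE customer_id = ?", (1,)
).fetchall()
print(rows)  # [('Priya', 'Headphones', 59.0)]
```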
Simplify database structure
Over time, databases can become complex and even unwieldy. Based on my experience, databases have evolved significantly, from traditional systems to sophisticated modern platforms. Organisations can improve cost efficiency by rethinking the database’s structure in ways that include:
- Reducing redundancy: We just looked at how denormalisation can improve read query performance at the expense of duplicating data. There is also data where redundancy simply harms efficiency; identifying and eliminating data elements duplicated across tables can shrink the database footprint and reduce storage costs.
- Archiving historical data: Not all data must stay in an active database. Archiving historical data to a separate, lower-cost storage tier can free up space in the primary database and reduce compute resource costs (see the sketch after this list).
- Decomposing underutilised tables: If the database team mixes frequently and infrequently accessed data in the same table, they should consider decomposing these tables into smaller, more focused tables, which allows them to optimise the storage tier for each based on its access frequency. This approach can streamline queries and potentially reduce storage costs for less frequently accessed data.
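As a rough sketch of the archiving step, the example below moves rows older than a cutoff out of the active table and into an archive table. SQLite stands in for a lower-cost storage tier here, and the table names and roughly one-year cutoff are assumptions for the example.

```python
# Archiving historical rows out of the hot table: copy old rows to an
# archive table, then delete them from the active one. Names are illustrative.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, placed_at TIMESTAMP, total REAL);
    CREATE TABLE orders_archive (id INTEGER PRIMARY KEY, placed_at TIMESTAMP, total REAL);
    INSERT INTO orders VALUES
        (1, DATETIME('now', '-730 days'), 90.0),
        (2, DATETIME('now', '-10 days'), 45.0);
""")

# Rows older than ~1 year leave the primary database's active storage.
cutoff = "DATETIME('now', '-365 days')"
conn.execute(f"INSERT INTO orders_archive SELECT * FROM orders WHERE placed_at < {cutoff}")
conn.execute(f"DELETE FROM orders WHERE placed_at < {cutoff}")

print(conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0])          # 1 active row
print(conn.execute("SELECT COUNT(*) FROM orders_archive").fetchone()[0])  # 1 archived row
```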
Use a serverless database
Serverless databases are the ideal choice for eliminating over-provisioning. IT teams no longer have to worry about how data is stored, optimised, and queried: the serverless database dynamically adjusts resources based on actual demand, so enterprises and startups only pay for what they use. This frees teams from database management headaches, allowing them to focus on core application development and strategic data initiatives.
As I see it, the real challenge isn’t how businesses navigate the complexities of the cloud but how they manage the complex web of associated costs. In the long run, optimising cloud spending will determine whether an organisation maximises its cloud investments or is weighed down by unnecessary expenses.