Surprisingly enjoying both low latency and low costs with DynamoDB
As big-data architects using managed AWS solutions, we tend to map storage technologies in our heads into rules of thumb: S3 falls into the “cheap and slow” category, while DynamoDB, relatively speaking, sits in “fast and expensive.” Well, in certain use cases, as I discovered, you can get both cheap and fast with DynamoDB.
In our event stream processing system, we process 300 million events per day that are ingested from a source queue.
Our service runs some business logic that includes transformation and enrichment on the source events and in some cases outputs a message to a destination queue.
Each event has a unique GUID, and the first requirement is to store each event, after the transformation, in a way that lets us later retrieve it by its GUID.
The second requirement is that those retrievals could only occur on events that were ingested in the last 30 days.
Sounds easy, right? We basically need a key-value store that maps the GUID to the transformed event with a 30-day retention.
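To make that concrete, here is a minimal sketch of the access pattern against DynamoDB, assuming a table named "events" keyed by the GUID, with TTL enabled on an "expires_at" attribute (all names here are illustrative, not the production schema):

```python
import time
import boto3

table = boto3.resource("dynamodb").Table("events")   # hypothetical table name
THIRTY_DAYS_IN_SECONDS = 30 * 24 * 3600

def put_event(guid, payload):
    """Store a transformed (compressed) event; DynamoDB evicts it after 30 days."""
    table.put_item(Item={
        "guid": guid,
        "payload": payload,                           # gzipped bytes
        "expires_at": int(time.time()) + THIRTY_DAYS_IN_SECONDS,
    })

def get_event(guid):
    """Fetch an event by its GUID with a strongly consistent read."""
    item = table.get_item(Key={"guid": guid}, ConsistentRead=True).get("Item")
    return item["payload"].value if item else None    # Binary -> bytes
```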
Let’s break down the costs of doing this with S3 vs. the costs of using DynamoDB.
I’ll start with the pure managed costs, i.e., how much we would pay AWS, and then take a broader look at the more hidden costs (development, maintenance, complexity, etc.).
All costs will be calculated based on the us-east region.
First, we need to make an assumption about the average event size we process.
In my use case it was 5 KB, but after GZIP compression I got it down to 1.6 KB. Compressing the events and calculating costs based on the compressed size is something many engineers tend to miss, and it can easily shift the pendulum from one technology to the other.
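A quick way to validate that assumption is to measure the average gzipped size over a sample of real events before doing any cost math. In this sketch, the sample of events passed in is a placeholder for whatever representative data you have:

```python
import gzip
import json
import statistics

def gzipped_size(event: dict) -> int:
    """Bytes used by one event after JSON encoding and GZIP compression."""
    return len(gzip.compress(json.dumps(event).encode("utf-8")))

def avg_gzipped_kb(sample_events) -> float:
    """Average compressed size in KB over a representative sample of events."""
    return statistics.mean(gzipped_size(e) for e in sample_events) / 1024
```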
300 million events written per day * 30 days = 9B events per month.
Total data stored and transferred: 1.6 KB * 9B events = 14.4 TB.
We will assume 500 million event retrievals per month for our use case.
Storage — S3 Standard is priced at $0.023 per GB for the first 50 TB in a month, which totals $331. We can configure the S3 bucket with a 30-day retention policy, and since we don’t pay for deletes in S3, there is no extra charge for evicting the events.
Write requests — 1,000 PUT requests in S3 Standard are priced at $0.005, which totals $45,000(!). Usually, when persisting to S3, you would batch multiple events into zipped or columnar formats like ORC or Parquet. But when the requirement is to retrieve an event by its GUID, you must store each event separately as one object keyed by the GUID, which translates to one PUT request per event.
Read requests — 1,000 GET requests in S3 Standard are priced at $0.0004, which totals $200.
Total = $331 Storage + $45,000 Write + $200 Read = $45,531 per month
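For reference, here is the same S3 arithmetic spelled out in a few lines of Python, with the us-east prices hard-coded as quoted above:

```python
# S3 Standard, 30-day month, one object per event
events_per_month = 300_000_000 * 30                        # 9 billion events
avg_event_kb = 1.6                                         # compressed size
reads_per_month = 500_000_000

storage_gb = events_per_month * avg_event_kb / 1_000_000   # ~14,400 GB (14.4 TB)
storage = storage_gb * 0.023                               # ~$331
writes = events_per_month / 1_000 * 0.005                  # $45,000 (one PUT per event)
reads = reads_per_month / 1_000 * 0.0004                   # $200
print(storage + writes + reads)                            # ~$45,531 per month
```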
Storage — a DynamoDB Standard table is priced at $0.25 per GB, which totals $3,600. We can configure a 30-day TTL per item, with no extra charge for the evictions.
Write requests — I’ll use the provisioned capacity pricing model for this calculation (provisioned capacity can also work with auto-scaling if needed). In the provisioned model you don’t pay per write request; you pay for “write capacity units” (WCUs). For items up to 1 KB in size, one WCU can perform one standard write request per second. Because our items average 1.6 KB, each write rounds up to 2 WCUs. In an ideal world where the ingestion of events spreads evenly throughout the month, 9 billion events translate to 3,472 events written per second, which means 6,944 provisioned WCUs. A WCU is priced at $0.00065 per hour, which totals $3,249. If the events aren’t spread evenly and you need to handle bursts, you can configure auto-scaling and would probably pay roughly the same price, depending on how quickly you need to work through the lag of the bursts.
Read requests — I’ll use the same provisioned capacity model. Here AWS uses the term “read capacity unit” (RCU). For items up to 4 KB in size, one RCU can perform one strongly consistent read request per second, or two eventually consistent read requests. For a fair comparison with S3, which has been strongly consistent since December 2020, we will calculate using the strongly consistent read price. 500 million events per month translate to 193 events read per second, which means 193 provisioned RCUs. An RCU is priced at $0.00013 per hour, which totals $18.
Total = $3,600 Storage + $3,249 Write + $18 Read = $6,867 per month
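And the DynamoDB side of the arithmetic, assuming a 30-day month (720 hours) of provisioned capacity:

```python
# DynamoDB, provisioned capacity, 30-day month
events_per_month = 9_000_000_000
reads_per_month = 500_000_000
seconds_per_month = 30 * 24 * 3600
hours_per_month = 30 * 24                                  # 720

storage = 14_400 * 0.25                                    # $3,600 for 14.4 TB
wcus = events_per_month / seconds_per_month * 2            # ~6,944 (1.6 KB item -> 2 WCUs)
writes = wcus * 0.00065 * hours_per_month                  # ~$3,249
rcus = reads_per_month / seconds_per_month                 # ~193 (strongly consistent, <= 4 KB)
reads = rcus * 0.00013 * hours_per_month                   # ~$18
print(storage + writes + reads)                            # ~$6,867 per month
```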
So far we have reached $45,531 for S3 vs. $6,867 for DynamoDB, which is getting close to a 10x difference but not quite there yet.
Let’s look at the bigger picture now. A standard event processing pipeline that writes to DynamoDB as part of its processing persists each event synchronously, inline with the rest of the business logic.
Because of the extremely low latency in writing to DynamoDB, I want to argue that if you needed 10 pods for your service to handle the throughput of the events while persisting to DynamoDB, you would probably need 50–100 pods to handle the same throughput when writing to S3. This will increase your compute costs significantly.
Moreover, you probably calibrated the CPU and memory of the pods that run your service to the service’s business-logic needs, and scaling out with more pods just for the sake of persisting to S3 would be a huge waste of money.
What you would probably prefer in this case is to write to S3 asynchronously from a different service, calibrated with less CPU and RAM, or even use a solution like the Kafka Connect S3 sink if your messaging infrastructure is Kafka-based.
In this solution, we can keep the original 10 pods of the service with the business logic, and we can use cheaper pods for the persistence service or use Kafka Connect if we don’t want to write the code ourselves. We saved cost relative to the naive S3 persistence flow but still paid much more in compute than the naive Dynamo persistence flow.
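For illustration, here is a minimal sketch of such a decoupled persistence consumer, assuming Kafka (via kafka-python), the event GUID carried as the message key, and hypothetical topic and bucket names; a Kafka Connect S3 sink could replace this code entirely:

```python
import boto3
from kafka import KafkaConsumer

s3 = boto3.client("s3")
consumer = KafkaConsumer(
    "transformed-events",                      # hypothetical topic name
    bootstrap_servers="localhost:9092",
    group_id="s3-persister",
)

for record in consumer:
    guid = record.key.decode("utf-8")          # assumes the GUID is the message key
    # one object per event, keyed by the GUID, so it can still be fetched later
    s3.put_object(Bucket="transformed-events", Key=guid, Body=record.value)
```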
And what about complexity? Keeping flows synchronous greatly simplifies our apps: it enables rapid development and speeds up troubleshooting and maintenance. The solution that avoids the latency inflicted by writing to S3 results in a more complex asynchronous flow.
We either need to spin up another service that we will have to implement and monitor, or spin up a Kafka Connect cluster that needs its own monitoring. We also expose ourselves to more race conditions in future features, as we have broken our flow into more non-transactional moving parts. Taking all of this into account and adding up our original AWS bill, we can easily say that the DynamoDB solution is more than 10x cheaper than the S3 solution.
Never assume that S3 will necessarily be cheaper than a NoSQL solution; run the numbers for your specific use case and you might reach surprising results.
Remember to compress data before calculating the size. Consider the simplicity of your solution and potential future use cases as part of your “hidden” cost calculation.
Good luck in your next architecture brainstorming session.
Want to Connect? I'm the Founder and CTO @ Poker Fighter