22 Apr 2023

An Introduction to DynamoDB

In this post series, I’ll be sharing everything I’ve learned about DynamoDB.

However, it is important to first grasp some fundamental concepts before diving deeper into DynamoDB.

So let’s get started!

What is DynamoDB?

DynamoDB is a fully managed NoSQL database service provided by Amazon Web Services (AWS).

It allows developers to store and retrieve data in a flexible, highly scalable, and low-latency database that can handle millions of requests per second.

DynamoDB is designed to provide fast and predictable performance with seamless scalability and high availability, making it a popular choice for modern web and mobile applications. It also supports features such as automatic scaling, encryption, backup and restore, and global tables for multi-region deployment.

OLTP vs OLAP

TL;DR: Although DynamoDB is primarily designed for OLTP (Online Transaction Processing) workloads, it can be used for certain OLAP (Online Analytical Processing) use cases as well. However, its capabilities in this regard are limited compared to other databases that are specifically designed for OLAP, such as Amazon Redshift or Google BigQuery.

To avoid selecting the wrong database for your application, it’s very important to have a clear understanding of the distinctions between OLTP and OLAP.

  • OLTP: Real-time transaction processing.
  • OLAP: Complex data analysis and reporting.

I understand that the concepts of OLAP and OLTP may still be unclear, so perhaps some concrete examples will help.

OLTP Example

An e-commerce website that allows customers to place orders online and pay for them in real-time. The website’s database needs to be optimized for fast reads and writes to handle the high volume of transactions that occur each day.

In SQL, placing an order is done with an INSERT statement:

INSERT INTO orders (customer_id, order_date, total_amount)
		VALUES(123, '2023-04-07', 500.00);

Another OLTP query is updating the customer data:

UPDATE customers SET name = 'Ahmad Mayahi' WHERE customer_id = 123;

OLAP Example

A retail chain that needs to analyze sales data to determine which products are selling well and which ones are not. The company’s data analysts need to be able to quickly retrieve and analyze large volumes of data to make informed business decisions.

SELECT
	product_id,
	SUM(sales_amount) AS total_sales
FROM
	sales_data
WHERE
	date BETWEEN '2023-01-01'
	AND '2023-03-31'
GROUP BY
	product_id;

Another example is calculating the number of unique customers and the average order amount for each country where orders were placed:

SELECT
	country,
	COUNT(DISTINCT orders.customer_id) AS unique_customers,
	AVG(order_amount) AS average_order_amount
FROM
	orders
	JOIN customers ON orders.customer_id = customers.customer_id
GROUP BY
	country;

If you’re simply inserting, updating, deleting, or looking up individual records, it’s generally referred to as OLTP. However, if you’re running complex analytical queries over large volumes of data, it falls under the category of OLAP. That’s the main difference between the two.
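To make the contrast concrete, here is a minimal sketch that runs both kinds of statements against an in-memory SQLite database using Python's built-in sqlite3 module. The table layout, the extra customers, and the country codes are assumptions made up for illustration; only the shape of the statements matters.

```python
import sqlite3

# In-memory database; schema and sample rows are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        name TEXT,
        country TEXT
    );
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(customer_id),
        order_date TEXT,
        order_amount REAL
    );
""")

# OLTP: short statements that each touch one or a few rows.
conn.executemany(
    "INSERT INTO customers (customer_id, name, country) VALUES (?, ?, ?)",
    [(123, "Ahmad", "DE"), (456, "Sara", "DE"), (789, "Omar", "JO")],
)
conn.executemany(
    "INSERT INTO orders (customer_id, order_date, order_amount) VALUES (?, ?, ?)",
    [
        (123, "2023-04-07", 500.0),
        (123, "2023-04-08", 100.0),
        (456, "2023-04-08", 300.0),
        (789, "2023-04-09", 50.0),
    ],
)
conn.execute("UPDATE customers SET name = 'Ahmad Mayahi' WHERE customer_id = 123")

# OLAP: a single query that reads every order to build a per-country report.
report = conn.execute("""
    SELECT country,
           COUNT(DISTINCT orders.customer_id) AS unique_customers,
           AVG(order_amount) AS average_order_amount
    FROM orders
    JOIN customers ON orders.customer_id = customers.customer_id
    GROUP BY country
    ORDER BY country
""").fetchall()

print(report)  # [('DE', 2, 300.0), ('JO', 1, 50.0)]
```

The OLTP statements stay cheap no matter how large the tables grow, because each one addresses rows by key. The report query has to read the whole orders table, so its cost grows with the data.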

Data Modeling and Access Patterns

Another crucial aspect to keep in mind about DynamoDB is that you must approach database design differently than you would with traditional RDBMS systems, and let go of any preconceptions you may have about database design.

Sounds confusing? Yes, it is.

One key difference between SQL databases and DynamoDB is that SQL databases let you run almost any ad hoc query, whereas DynamoDB restricts how you can read data. For example, in DynamoDB you can only query efficiently by the primary key (or by the key of a secondary index). There is no join operation, and filtering on any other attribute means scanning the whole table.
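The effect of key-based access can be sketched with a toy in-memory table. This is not the DynamoDB API: the ToyTable class, its method names, and the key formats are all hypothetical, and the cost counter is a simplification. It only illustrates why a lookup by key stays cheap while a filter on any other attribute has to look at every item.

```python
from collections import defaultdict


class ToyTable:
    """Toy model of a table addressed by partition key and sort key.

    Hypothetical illustration only; it does not mirror the DynamoDB API.
    """

    def __init__(self):
        # partition key -> {sort key -> item}
        self._partitions = defaultdict(dict)
        self.items_examined = 0  # crude cost counter

    def put(self, pk, sk, item):
        self._partitions[pk][sk] = {"pk": pk, "sk": sk, **item}

    def query(self, pk, sk_prefix=""):
        """Key-based access: only the one matching partition is read."""
        partition = self._partitions.get(pk, {})
        self.items_examined += len(partition)
        return [
            item
            for sk, item in sorted(partition.items())
            if sk.startswith(sk_prefix)
        ]

    def scan(self, predicate):
        """Non-key access: every item in the table is examined."""
        results = []
        for partition in self._partitions.values():
            for item in partition.values():
                self.items_examined += 1
                if predicate(item):
                    results.append(item)
        return results


# One order per customer for 1000 made-up customers.
table = ToyTable()
for customer_id in range(1000):
    table.put(
        f"CUSTOMER#{customer_id}",
        "ORDER#2023-04-07",
        {"total_amount": customer_id % 7 * 100},
    )

table.items_examined = 0
orders = table.query("CUSTOMER#123")
key_cost = table.items_examined  # items read for the key lookup

table.items_examined = 0
big_orders = table.scan(lambda item: item["total_amount"] >= 500)
scan_cost = table.items_examined  # items read for the non-key filter

print(key_cost, scan_cost)
```

In this toy model the key lookup reads a single item, while the filter on total_amount reads all 1000. That gap is why access patterns have to be known before the table is designed.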

While you can still model relationships in DynamoDB, the approach is quite different from SQL databases. In DynamoDB, you use a technique called “denormalization” to embed related data within a single item or partition, rather than splitting it across multiple tables as you would in a relational database. This can make it more challenging to model your data and query it efficiently, but it also allows for faster and more scalable data access.
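As a rough sketch of what denormalization means in practice, the snippet below starts from three relational-style collections and builds one self-contained order item out of them. Everything here is an assumption for illustration: the sample data, the pk/sk attribute names, and the "CUSTOMER#…" / "ORDER#…" key formats are a common convention, not something DynamoDB requires.

```python
# Normalized (relational) shape: three collections that a query would join.
customers = {123: {"name": "Ahmad Mayahi", "country": "DE"}}
orders = {
    9001: {"customer_id": 123, "order_date": "2023-04-07", "total_amount": 500.0}
}
order_lines = [
    {"order_id": 9001, "product": "Keyboard", "price": 200.0},
    {"order_id": 9001, "product": "Monitor", "price": 300.0},
]


def denormalize_order(order_id):
    """Build one item that embeds the data a join would otherwise fetch."""
    order = orders[order_id]
    customer = customers[order["customer_id"]]
    return {
        # Hypothetical key design: all items of one customer share a partition.
        "pk": f"CUSTOMER#{order['customer_id']}",
        "sk": f"ORDER#{order['order_date']}#{order_id}",
        "customer_name": customer["name"],  # duplicated from customers
        "total_amount": order["total_amount"],
        "lines": [
            {"product": line["product"], "price": line["price"]}
            for line in order_lines
            if line["order_id"] == order_id
        ],
    }


item = denormalize_order(9001)
print(item)
```

Reading this item back needs no join, because the customer name and the order lines travel with the order. The price is duplication: if the customer's name changes, every order item that copied it has to be updated as well.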

A word about MySQL

You might be wondering why MySQL is being discussed in a DynamoDB post. It’s because most developers choose MySQL as the default database for their Laravel projects.

Prepare to be shocked by what I’m about to tell you.

MySQL is not specifically designed for OLAP workloads and may have some limitations when it comes to handling complex analytical queries on large data sets.

Let me ask you a question: have you ever wondered why your aggregation queries still run slowly even after you’ve optimized them? The answer is simple: MySQL is not meant for (yes, you guessed it) OLAP.

This is a complex topic that I’d rather not delve into here. However, Oracle has recently launched MySQL HeatWave, a new analytics engine that can be used with MySQL. If you prefer to stick with MySQL and need analytics, you might want to consider MySQL HeatWave rather than DynamoDB, since it can offer better performance and more advanced analytics capabilities for that kind of workload.

Read more about MySQL HeatWave.

Real-world use cases of DynamoDB applications

So, despite its limitations compared to MySQL and other RDBMSs, where does DynamoDB shine? I have put together a list of real-world scenarios where DynamoDB is commonly used.

This list is based on the DynamoDB customers page.

Social Media (Snapchat)

DynamoDB can be used to store user profiles, social graph data, and activity streams for social media platforms. For example, Snapchat, where hundreds of millions of messages are exchanged every day, uses DynamoDB.

I don’t have specific knowledge about how Snapchat utilizes DynamoDB, but retrieving messages from the database can be done with a simple query on the primary key, and that is all they need (I guess).
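To show what “a simple query on the primary key” could look like, here is a hedged sketch that builds the parameters of a DynamoDB Query request as a plain Python dictionary. The parameter names (TableName, KeyConditionExpression, ExpressionAttributeValues, ScanIndexForward, Limit) come from the DynamoDB Query API. The Messages table, the conversation_id partition key, and the idea of a timestamp sort key are pure assumptions on my part and say nothing about Snapchat's real schema.

```python
def build_messages_query(conversation_id, limit=50):
    """Return keyword arguments shaped like a DynamoDB Query request.

    The table name and key attribute are hypothetical.
    """
    return {
        "TableName": "Messages",  # hypothetical table name
        # Match every item in one partition (one conversation).
        "KeyConditionExpression": "conversation_id = :cid",
        "ExpressionAttributeValues": {":cid": {"S": conversation_id}},
        # Assuming the sort key is a timestamp, False returns newest first.
        "ScanIndexForward": False,
        "Limit": limit,
    }


params = build_messages_query("conv-42")
print(params)

# With boto3 installed and AWS credentials configured, the request could be
# sent like this (not executed here):
#   import boto3
#   response = boto3.client("dynamodb").query(**params)
```

The request only names a partition key value, so DynamoDB can go straight to the items of that one conversation instead of searching the whole table.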

Cloud Storage (Samsung Cloud)

Samsung Cloud uses Amazon DynamoDB as a metadata index for voice recordings, notes, and contacts stored in Amazon S3. According to JeongHun Kim, Database Engineer at Samsung Cloud, DynamoDB was chosen for its fast performance, durability, and scalability. Switching some workloads to DynamoDB Standard-Infrequent Access tables resulted in over 30% cost reduction with no impact on performance, durability, or scalability, and no new code was required.

Conclusion

In this article, I’ve provided a brief introduction to DynamoDB. I’ve also discussed some of the benefits and drawbacks of using DynamoDB, and highlighted some common real-world use cases for this powerful database service.

Overall, DynamoDB is a valuable tool for developers and organizations looking to build scalable and high-performance applications. While there are some challenges associated with using DynamoDB, such as its limited query options, the benefits of this service are significant.

In my next post, I’ll be talking about the fundamental principles of DynamoDB. Stay tuned!