Cosmos DB is not a relational database, but we can still use it for relational data, for example, eCommerce data.
A few key patterns for successful data modeling in Cosmos DB are:
- Select the partition key wisely: Consider the maximum document size, the maximum logical partition size, and the most common query pattern. Choose a partition key that lets the most common query run as a point read or as a query within a single partition.
- Use denormalization: Store the id and the name together in the same document. In a relational DB, names are usually stored separately in reference tables. In a document DB, it is better to store them together even though the value is repeated, because that reduces the need for joins, which are expensive in Cosmos DB.
- Sync denormalized values, e.g. name: Use the Cosmos DB change feed, which plays a role similar to a trigger in a relational DB, to update every document that holds a copy when the name changes. The flow looks like this: Cosmos DB change feed -> Azure Function -> update relevant data.
- Share a partition key among data objects: For example, put all reference tables in the same container and use a shared partition key called ‘type’, whose value differentiates the reference tables.
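The sync step in the list above can be sketched in plain Python. This is only an illustration of the logic an Azure Function might run when the change feed delivers a renamed tag. It uses in-memory lists instead of a real container, and the function name `apply_tag_rename` and the `tags` field are hypothetical, not part of any SDK.

```python
from typing import Any


def apply_tag_rename(changed_tag: dict[str, Any], products: list[dict[str, Any]]) -> int:
    """Copy a renamed tag's name into every product that embeds that tag.

    Returns the number of product documents that were modified.
    """
    updated = 0
    for product in products:
        touched = False
        for tag in product.get("tags", []):
            # Match on id: the id is stable, the name is the denormalized copy.
            if tag["id"] == changed_tag["id"] and tag["name"] != changed_tag["name"]:
                tag["name"] = changed_tag["name"]
                touched = True
        if touched:
            updated += 1
    return updated


# Hypothetical product documents with embedded tags (id and name stored together).
products = [
    {"id": "p1", "tags": [{"id": "t1", "name": "Sale"}, {"id": "t2", "name": "New"}]},
    {"id": "p2", "tags": [{"id": "t2", "name": "New"}]},
]

# The document a change feed would deliver after tag t2 is renamed.
changed = {"id": "t2", "name": "New arrival", "type": "tag"}

count = apply_tag_rename(changed, products)
print(count)  # number of products that were rewritten
```

In a real deployment each modified product would be written back to its container. The sketch only shows which documents need to change.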
Key information
Partition Size
– maximum single document size: 2 MB
– maximum single logical partition size: 20 GB
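As a rough guard against the document size limit, the serialized size of a document can be estimated before writing it. This is a minimal sketch that assumes the limit applies to the UTF-8 length of the JSON and that 2 MB means 2 * 1024 * 1024 bytes. The helper names are hypothetical, and the size the service actually stores may differ slightly from this estimate.

```python
import json

# Assumption: 2 MB interpreted as binary megabytes.
MAX_ITEM_BYTES = 2 * 1024 * 1024


def item_size_bytes(item: dict) -> int:
    """Estimate the size of a document as the UTF-8 length of its JSON form."""
    return len(json.dumps(item, ensure_ascii=False).encode("utf-8"))


def fits_in_one_item(item: dict) -> bool:
    """Return True if the estimated size is within the assumed 2 MB limit."""
    return item_size_bytes(item) <= MAX_ITEM_BYTES


small = {"id": "1", "name": "x"}
big = {"id": "2", "blob": "x" * MAX_ITEM_BYTES}  # payload alone already hits the limit

print(fits_in_one_item(small), fits_in_one_item(big))
```

A document that keeps growing, for example a product with an unbounded list of embedded reviews, is a sign that the embedded data should move to its own documents.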
Data Model for eCommerce
Customer
– id
– customerId // partition key, same value as Customer.id
– type: customer
SalesOrder
– id
– customerId // partition key, same value as Customer.customerId. Customer and SalesOrder share the partition key in the same container.
– type: salesorder // use type field to differentiate Customer and SalesOrder model.
Product
– id
– ProductCategory
– Tag[] // embedded for the many-to-many relation. A change to a tag name can be propagated through the Cosmos DB change feed and an Azure Function app.
ProductCategory
– id
– name
– type: category // partition key. Reference/lookup data can share the same partition key so that it can be read at a low cost.
ProductTag
– id
– name
– type: tag // partition key. ProductCategory and ProductTag share the partition key and are placed in the same container.
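To make the model above concrete, here is a small sketch with sample documents as Python dictionaries. The sample values, the container variable names, and the `group_by_partition` helper are made up for illustration. Grouping by the partition key shows which documents would land in the same logical partition.

```python
from collections import defaultdict

# Hypothetical contents of the container shared by Customer and SalesOrder.
customer_container = [
    {"id": "c1", "customerId": "c1", "type": "customer"},
    {"id": "o1", "customerId": "c1", "type": "salesorder"},
    {"id": "o2", "customerId": "c1", "type": "salesorder"},
    {"id": "c2", "customerId": "c2", "type": "customer"},
]

# Hypothetical contents of the container shared by the reference data.
reference_container = [
    {"id": "cat1", "name": "Bikes", "type": "category"},
    {"id": "t1", "name": "Sale", "type": "tag"},
    {"id": "t2", "name": "New", "type": "tag"},
]


def group_by_partition(docs: list[dict], partition_key: str) -> dict[str, list[str]]:
    """Group document ids by the value of their partition key property."""
    partitions: dict[str, list[str]] = defaultdict(list)
    for doc in docs:
        partitions[doc[partition_key]].append(doc["id"])
    return dict(partitions)


by_customer = group_by_partition(customer_container, "customerId")
by_type = group_by_partition(reference_container, "type")

print(by_customer)  # a customer and its orders share one logical partition
print(by_type)      # all tags sit together, all categories sit together
```

With this layout, loading a customer together with all of its orders touches a single logical partition, and so does listing every tag.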
Full details of the Cosmos DB data modeling lecture at Pluralsight can be found at the link.
If your database is small, why not put all items/documents in the same partition? A partition can be 20 GB! If all documents fit in one partition, cross-partition queries won’t be a problem, and transactional batches can be used to get ACID-compliant writes over all documents.
Agreed. Depending on the need, 20 GB may or may not be enough. It’s up to the use case.