How to Avoid Composite IDs in GraphQL with DynamoDB (feat. AppSync)

Learn how not to expose composite DynamoDB keys to the GraphQL client.

In this article, I will discuss a few tricks on how to optimize your GraphQL API for items that use composite keys in DynamoDB. It will work no matter the GraphQL server, but if you're using AppSync, you're in luck because I'll share a few (VTL) code snippets too 🙂

In DynamoDB, it is very common to use composite keys (i.e., a partition key and a sort key). This allows us to group related items together. Moreover, the combination of the partition key (PK) and sort key (SK) is what uniquely identifies the Item.

To illustrate this, let's take the following simple example. Imagine we have a DynamoDB states table that contains states from different countries. We might structure our data like so:

[Image: the states table, keyed by countryCode (PK) and stateCode (SK)]

Here, the PK identifies the country code and the SK the state code. Together, they uniquely identify one Item in the database (i.e., a state in a given country). Additionally, this gives us some free access patterns (e.g., find all states for a given country).
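To make the "free access pattern" concrete, here is a minimal sketch of what a low-level DynamoDB Query request for all states of one country could look like. The helper name `buildStatesByCountryQuery` is hypothetical, and the table name `states` is an assumption based on the example above.

```typescript
// Subset of the low-level DynamoDB Query request shape used in this sketch.
interface QueryInput {
  TableName: string;
  KeyConditionExpression: string;
  ExpressionAttributeValues: Record<string, { S: string }>;
}

// Hypothetical helper: builds the request for "all states of one country".
// The table name "states" is an assumption, not something fixed by DynamoDB.
function buildStatesByCountryQuery(countryCode: string): QueryInput {
  return {
    TableName: "states",
    // Condition on the partition key only, so every sort key (state) matches.
    KeyConditionExpression: "countryCode = :countryCode",
    ExpressionAttributeValues: {
      ":countryCode": { S: countryCode },
    },
  };
}

const query = buildStatesByCountryQuery("US");
```

Because the condition only constrains the partition key, the query returns every item stored under that country code.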

Now, imagine that we want to serve the Items from a GraphQL endpoint. The query might look like this:

getState(countryCode: "US", stateCode: "TX") {
  name
}

This works well, but has several drawbacks:

This is not practical

The client has to pass two arguments in order to identify which item it wants to query. Understanding which fields must be used (e.g., from other queries) might not be as straightforward as it seems. Also, the frontend often needs a unique key to distinguish items/components from each other (think "key" attribute in React), forcing it to compute one every time.

The client should not have to worry about the underlying data structure

In an ideal world, the client should not have to worry about how the data is being stored. By having a composite id in our API, we are exposing how the data is organized in the data layer and making the client depend on it.

In frontend applications, the client cache might not work out of the box

Most GraphQL clients, like Apollo, offer a solid and powerful cache functionality. However, by default, the id field (with an ID type) is what they usually use to uniquely identify the Item in the (cache) datastore. In the above example, there isn't any (neither in the request nor in the response). The client does not know that the countryCode/stateCode combination is what uniquely identifies a State. As a result, the client cannot store the item as its own normalized cache entry, so it is not shared or updated across queries.

Sure, we can always customize the cache ids, but we would have to do it for every Item type and in every client (i.e., web, mobile, etc.).
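For reference, this is roughly what that per-type customization looks like with Apollo Client. It is only a sketch of the configuration object: it assumes Apollo Client 3, where the object is passed as `new InMemoryCache({ typePolicies })`, and the type and field names come from the example above.

```typescript
// Sketch of an Apollo Client cache configuration for the State type.
// Assumption: Apollo Client 3, used as `new InMemoryCache({ typePolicies })`.
const typePolicies = {
  State: {
    // Tell the cache that these two fields together identify a State.
    keyFields: ["countryCode", "stateCode"],
  },
};
```

Every query would then have to select both countryCode and stateCode, and the same configuration has to be repeated for each type and in each client.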

The solution: Denormalizing a unique id

Wouldn't it be nice if we could have a unique id field for our State items? As mentioned earlier, every State is a unique combination of the country code and the state code. In this case, we could even use the ISO 3166-2 code of each state for that. For example, Texas' id can be US-TX.

Let's add an id attribute to our data model.

[Image: the states table with the added id attribute]

Now, all we have to do is denormalize the id by concatenating the country and state codes. Doing so at creation time saves us from generating it on the fly in every query (plus, it's always nice to receive a pre-computed id field everywhere, even in the backend, for future uses). We can easily do that when saving the item in DynamoDB.

Example using AppSync VTL

## Read the two key parts from the mutation input
#set($countryCode=$ctx.args.input.countryCode)
#set($stateCode=$ctx.args.input.stateCode)
#set($attributeValues={})
## Copy every input field as a DynamoDB-typed attribute
#foreach($item in $ctx.args.input.entrySet())
  $util.qr($attributeValues.put("${item.key}", $util.dynamodb.toDynamoDB($item.value)))
#end
## Denormalize the id last, so the computed value always wins (eg: US-TX)
$util.qr($attributeValues.put("id", $util.dynamodb.toDynamoDB("${countryCode}-${stateCode}")))
{
  "version": "2018-05-29",
  "operation": "PutItem",
  "key": {
    "countryCode": $util.dynamodb.toDynamoDBJson($countryCode),
    "stateCode": $util.dynamodb.toDynamoDBJson($stateCode)
  },
  "attributeValues": $util.toJson($attributeValues)
}
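If your resolver is not written in VTL, the same denormalization can be sketched in a few lines of TypeScript. The `StateInput` and `StateItem` types and the `toStateItem` helper are hypothetical names used for illustration only.

```typescript
// Hypothetical shapes for the mutation input and the stored item.
interface StateInput {
  countryCode: string;
  stateCode: string;
  name: string;
}

interface StateItem extends StateInput {
  id: string;
}

// Build the item to save, with the id denormalized from the two key parts.
function toStateItem(input: StateInput): StateItem {
  return {
    ...input,
    id: `${input.countryCode}-${input.stateCode}`,
  };
}

const texas = toStateItem({ countryCode: "US", stateCode: "TX", name: "Texas" });
// texas.id is "US-TX"
```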

Awesome! But now, how do we fetch data from GraphQL? Let's update the query and use a unique id parameter with an ID! type.

type Query {
  getState(id: ID!): State!
}

type State {
  id: ID!
  countryCode: String!
  stateCode: String!
  name: String!
}

Great! Now, the backend receives a unique argument. However, DynamoDB still requires us to pass a countryCode (PK) and stateCode (SK) composite key. This will require some additional gymnastics at the resolver level. This is pretty straightforward, though. All we have to do is to split the id argument by '-'. You can do that in your favourite language depending on your use case. If you are using AppSync, here is how you can easily do that in VTL.

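## Reject ids that cannot be split into two parts
#if(!$ctx.args.id.contains("-"))
  ## Invalid iso code
  $util.error("Invalid Id", "InputError")
#end
## Split the id back into its country (PK) and state (SK) parts
#set($parts=$ctx.args.id.split("-"))
#set($countryCode=$parts.get(0))
#set($stateCode=$parts.get(1))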
{
  "version": "2018-05-29",
  "operation": "GetItem",
  "key": {
    "countryCode": $util.dynamodb.toStringJson($countryCode),
    "stateCode": $util.dynamodb.toStringJson($stateCode)
  }
}
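For other GraphQL servers, the split can be sketched in TypeScript as follows. The `StateKey` type and the `parseStateId` helper are hypothetical names. Note that this sketch is a bit stricter than the VTL template above: it also rejects ids with an empty part or more than one hyphen.

```typescript
// Hypothetical shape of the composite key expected by DynamoDB.
interface StateKey {
  countryCode: string;
  stateCode: string;
}

// Split an id such as "US-TX" back into its partition and sort key parts.
function parseStateId(id: string): StateKey {
  const [countryCode, stateCode, ...rest] = id.split("-");
  // Reject a missing hyphen, an empty part, or extra segments.
  if (!countryCode || !stateCode || rest.length > 0) {
    throw new Error(`Invalid Id: ${id}`);
  }
  return { countryCode, stateCode };
}

const key = parseStateId("US-TX");
// key is { countryCode: "US", stateCode: "TX" }
```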

As you can see, this requires very little logic to implement and it solves all our issues. And it's completely transparent to the client. 🙌

Here is what the new query looks like:

getState(id: "US-TX") {
  id
  name
}

Conclusion

In this post, I showed you how to handle composite DynamoDB keys with GraphQL by hiding them from the client behind a unique attribute. By denormalizing this attribute in DynamoDB and implementing some simple logic in the resolvers, you can avoid the issues we identified earlier.
