July 5th 2021 • 5 minute read

Demonstrating Redis Cluster management with Azure Cache for Redis

"Azure Cache for Redis (Premium) offers Redis cluster as implemented in Redis."

Redis Cluster, which is Open Source (here's the first ever commit from 2011), is a data sharding solution with automatic management that handles failover and replication.

This raises the question: what innovation does Azure Cache for Redis bring to the table as a managed Database offering?

To appreciate what this entails, let's quickly look at:

How Redis Clustering works
How to set up a sharded Redis cluster from scratch on a group of Linux VMs running on a laptop (e.g. using multipass)
The "Azure Experience" of the above setup, to appreciate the management aspects Azure's Control Plane abstracts away for us

Redis Clustering

Redis cluster spec docs and the cluster setup tutorial do a phenomenal job of listing out in detail how Redis clusters work - but in short, here are some takeaways:

Why

In Distributed Computing (not specific to Redis), sharding data allows us to horizontally scale-out Compute. In simple words, by systematically distributing a big chunk of data into smaller chunks, we go beyond the computational limits of a single machine (physical or virtual). Sharding a dataset requires us to intelligently design how we want to Partition the data (almost always based on a Key).
Data Sharding = Divide and conquer.
In general, Partitioning data across standalone Redis instances is a non-trivial topic (just as with any other Database engine). The 2 approaches - Range and Hash (Consistent or otherwise) - each come with their own benefits and drawbacks.
- For Range, you have to manage a key → shard mapping table.
- For Hash, you have to implement hashing algorithms (such as crc32)
The different ways of implementing either approach (client side, proxy assisted, query routing) add further variance - and generally don't make the end user's life much easier.
Because Partitioning is hard, Redis Cluster was introduced into Redis (GA as of 2015) to offer end users more of a "de facto standard" for physically implementing Partitioning while leveraging Redis.
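To make the two Partitioning approaches above concrete, here is a minimal sketch of what a client would have to implement itself without Redis Cluster. The shard names and range boundaries are hypothetical, and crc32 is used only because it is the example hash named above.

```python
import bisect
import zlib

# Hypothetical shard names, used only for illustration.
SHARDS = ["shard-a", "shard-b", "shard-c"]

# Range partitioning: a sorted table of boundaries that we maintain ourselves.
# Keys sorting below "h" go to shard-a, keys from "h" up to (not including)
# "q" go to shard-b, and everything else goes to shard-c.
RANGE_BOUNDARIES = ["h", "q"]


def range_shard(key: str) -> str:
    """Look the key up in the hand-maintained range table."""
    return SHARDS[bisect.bisect_right(RANGE_BOUNDARIES, key)]


def hash_shard(key: str) -> str:
    """Hash the key with crc32 and take it modulo the number of shards."""
    return SHARDS[zlib.crc32(key.encode("utf-8")) % len(SHARDS)]


for key in ("apple", "mango", "zebra"):
    print(key, range_shard(key), hash_shard(key))
```

Note that with the simple modulo scheme, changing the number of shards remaps most keys, which is one reason a fixed number of hash slots (as Redis Cluster uses) is attractive.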

How

Redis Cluster terminology introduces Masters (Primaries) and Slaves (Replicas). At any given point in time, a given shard of data has 1 Master and N Slaves (N is configurable via --cluster-replicas N at creation time, or later via resharding).
A given node in the Cluster talks to every other node over the Redis Cluster Bus, using a Gossip Protocol (Cassandra, for example, does the same thing).
Master → Slave(s) Shard replication occurs via async replication.
A given key, based on its hash, is mapped to 1 of 16384 hash slots (reasoning behind the number 16384). These hash slots are then distributed amongst the Shards - i.e. Master → Slave(s).
A key (not its value), say "my_key", is mapped to a given hash slot using the formula:
```
HASH_SLOT = CRC16("my_key") % 16384
```
End users can force certain keys to be grouped together via hash tags using curly brackets {...} (Redis Enterprise extends this to include RegEx, great article here)
💡 The main highlight here is that Redis Cluster handles the implementation heavy lifting for us, so all we have to do as an end user is to design our keys intelligently.
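The slot formula and hash tag behaviour described above can be sketched in a few lines of Python. This is an illustration rather than Redis's own code: it assumes the CRC16 variant is XMODEM (which Python's binascii.crc_hqx computes with an initial value of 0), and the three slot ranges are an assumed typical split for a 3 Master cluster, so check CLUSTER NODES on a real cluster.

```python
import binascii

HASH_SLOTS = 16384

# Assumed slot ranges for a cluster with 3 Masters (illustrative only).
SLOT_RANGES = [(0, 5460), (5461, 10922), (10923, 16383)]


def hashed_part(key: bytes) -> bytes:
    """Return the part of the key that gets hashed, honouring {hash tags}."""
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        # Only a non-empty tag between the first "{" and the next "}" counts.
        if end != -1 and end != start + 1:
            return key[start + 1:end]
    return key


def key_slot(key: str) -> int:
    """CRC16 (XMODEM) of the hashed part of the key, modulo 16384."""
    return binascii.crc_hqx(hashed_part(key.encode("utf-8")), 0) % HASH_SLOTS


def owning_master(slot: int) -> int:
    """Index of the Master whose assumed slot range contains the slot."""
    for index, (first, last) in enumerate(SLOT_RANGES):
        if first <= slot <= last:
            return index
    raise ValueError(f"slot {slot} is outside 0..{HASH_SLOTS - 1}")


for key in ("foo", "{user1000}.following", "{user1000}.followers"):
    slot = key_slot(key)
    print(key, slot, owning_master(slot))
```

The two {user1000} keys land in the same slot because only the text inside the curly brackets is hashed, which is what allows multi-key operations on them.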

Cluster setup from scratch

Following this excellent video (and associated GitHub repo) - you can use multipass on any laptop to end up with a 6 Node Redis Cluster (3 shards) in a few minutes:

At the end of the setup, I ended up with something like this:

Manual Redis Cluster onboarding of 6 Instances

And here's a logical architecture of the manual setup using OSS tools, and end user responsibilities:

Redis Cluster Architecture and end user responsibilities

End user management responsibilities

End user responsibilities observed from this simple demonstration:

VM Provisioning:
- In my case, manually setting up 6 VMs via multipass launch - in reality this would involve a datacenter, hypervisor, infrastructure etc.
- Installing several pre-reqs via apt-get, downloading redis via wget, and building it from source with make after unzipping.
- Starting redis from node-specific node.conf files with the correct Cluster Parameters.
- Proper management of the aof and rdb files per node per disk.
- Day 2 operations (OS upgrades etc.)
Cluster Onboarding:
- In this case, via redis-cli --cluster create --cluster-replicas 1 IP1:7001 IP2:7002... - where we have to specify all the participating nodes at creation time.
- Using redis-cli --cluster add-node IPN:700N IP1:7001 to add a new node (the new node's address, followed by the address of any node already in the cluster).
- Using redis-server node.conf to re-onboard a node if it crashes.
Shard Rebalancing:
- Using redis-cli --cluster reshard IPN:700N to reshard the data, as well as implementing the sharding logic - i.e. answering Redis "How many slots do you want to move?"
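As a rough illustration of the onboarding and rebalancing steps above, the sketch below assembles the create command and works out a reasonable answer to "How many slots do you want to move?" when one empty Master joins. The node addresses are hypothetical, the helper names are mine, and the even-spread arithmetic is an assumption about what you would want rather than something redis-cli decides for you.

```python
import shlex

HASH_SLOTS = 16384


def create_command(nodes, replicas=1):
    """Build the argv for redis-cli --cluster create (all nodes listed upfront)."""
    return ["redis-cli", "--cluster", "create", *nodes,
            "--cluster-replicas", str(replicas)]


def master_count(node_count, replicas=1):
    """Number of Masters (shards) when each Master gets `replicas` Slaves."""
    return node_count // (replicas + 1)


def slots_for_new_master(current_masters):
    """Slots to move to one new, empty Master for a roughly even spread."""
    return HASH_SLOTS // (current_masters + 1)


# Hypothetical addresses for the 6 VMs.
nodes = [f"10.0.0.{i}:700{i}" for i in range(1, 7)]

print(shlex.join(create_command(nodes)))
print("masters:", master_count(len(nodes)))
print("slots to move to a 4th master:", slots_for_new_master(3))
```

With 6 nodes and 1 replica each, this gives 3 Masters; adding a 4th Master would mean moving 4096 slots to it, taken from the existing Masters.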

We can conclude that although Redis Cluster helps us with tools (such as redis-cli/trib), there's management and Day 2 Operations overhead on the end user when it comes to environment management.

Azure Cache for Redis

Offerings Summary

The chart below summarizes the different Azure Cache for Redis offerings available at the time of writing:

Further documentation of this feature comparison chart can be found here, and per SKU pricing details can be found here.

The lowest SKU that meets our original purposes - a managed Redis cluster (1) that is Network accessible (2) - is the Premium Tier. Lower Tiers (Basic, Standard) do not offer Clustering.

The Enterprise Tier takes things to the next level from Premium thanks to the continued innovations (e.g. Redis on Flash) by our friends over at Redis Labs - making Redis "Enterprise Ready" for very large scale (e.g. 13 TB) deployments.

Cluster setup via Azure

We can achieve the end goal of access to a managed, network integrated Redis cluster via a handful of clicks from the Azure Portal.

The diagram below attempts to visually contrast this with our management pain points from above - highlighting a subset of the key capabilities Azure Cache for Redis brings to the table as a Managed Redis Cluster offering:

Azure Cache for Redis Premium Tier provisioning

The idea here is that the Redis cluster management experience, previously demanding on the end user (despite Redis OSS tooling), is abstracted away by the Azure Control Plane and surfaced through the UI (or Azure CLI/API) as the managed experience we see above.

Detailed information on the above line items can be found here:

Automated deployment: Deploy Azure Cache for Redis via PowerShell, CLI, ARM Template (Portal shown in screenshot).
Sharding (In/Out): During first time setup and post-deployment.
Vertical Scaling: Scaling within tier and across tiers, as well as automation.
Turnkey Replica: During first time setup.
HA deployment: Zone redundancy and other HA patterns, and DR patterns such as Geo-Replication.
VNET Integration: VNET support for Premium and Private Link for all tiers.
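For the automated deployment route listed above, a clustered Premium cache can also be requested from the Azure CLI instead of the Portal. The sketch below only assembles and prints such a command; the cache name, resource group, location and shard count are hypothetical placeholders, and the exact flags should be checked against the az redis create reference for your CLI version.

```python
import shlex


def az_redis_create(name, resource_group, location, shard_count, vm_size="p1"):
    """Build the argv for creating a clustered Premium Tier cache."""
    return [
        "az", "redis", "create",
        "--name", name,
        "--resource-group", resource_group,
        "--location", location,
        "--sku", "Premium",           # Clustering needs Premium (see above)
        "--vm-size", vm_size,
        "--shard-count", str(shard_count),
    ]


# Hypothetical names, for illustration only.
command = az_redis_create("my-cache", "my-resource-group", "westus2", 3)
print(shlex.join(command))
```

Compared with the manual setup, the shard count here is a single parameter rather than a set of VMs, config files and redis-cli invocations.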

Wrap Up

We explored the tooling available when setting up Redis Clusters from scratch and highlighted the management pain points, before contrasting how Azure Cache for Redis alleviates management overhead thanks to Azure's Control Plane operating on the Redis Cluster.

For a condensed summary of Azure Cache for Redis's capabilities, check out this great overview article.