Demonstrating Redis Cluster management with Azure Cache for Redis
Browsing Microsoft docs, we note:
"Azure Cache for Redis (Premium) offers Redis cluster as implemented in Redis."
Redis Cluster, which is Open Source (here's the first ever commit from 2011), is a data sharding solution with automatic management, handling failover and replication.
This raises the question: what innovation does Azure Cache for Redis bring to the table as a managed Database offering?
To appreciate what this entails, let's quickly look at:
- How Redis Clustering works
- How to set up a sharded Redis cluster from scratch on a group of Linux VMs running on a laptop (e.g. using multipass)
- The "Azure Experience" of the above setup, to appreciate the management aspects Azure's Control Plane abstracts away for us
Redis Clustering
Redis cluster spec docs and the cluster setup tutorial do a phenomenal job of listing out in detail how Redis clusters work - but in short, here are some takeaways:
Why
In Distributed Computing (not specific to Redis), sharding data allows us to horizontally scale-out Compute. In simple words, by systematically distributing a big chunk of data into smaller chunks, we go beyond the computational limits of a single machine (physical or virtual). Sharding a dataset requires us to intelligently design how we want to Partition the data (almost always based on a Key).
Data Sharding = Divide and conquer.
In general, Partitioning data across standalone Redis instances is a non-trivial topic (just like with any other Database engine). The 2 approaches - Range and Hash (Consistent or otherwise) - each come with their own benefits and drawbacks.
- For Range, you have to manage a key → shard mapping table.
- For Hash, you have to implement hashing algorithms (such as crc32).
The different permutations in implementing this (client side, proxy assisted, query routing) add further implementation variance - and generally don't make the end user's life much easier.
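To make the two approaches concrete, here is a minimal client-side sketch of both. The shard names and the range boundaries are made up for illustration, and a real deployment would also have to handle adding or removing shards, which this deliberately ignores.

```python
import bisect
import zlib

# Hypothetical shard names, for illustration only.
SHARDS = ["shard-a", "shard-b", "shard-c"]

# Range partitioning: a sorted table of boundaries that somebody has to maintain.
# Keys sorting before "h" go to shard-a, keys before "q" to shard-b, the rest to shard-c.
RANGE_BOUNDARIES = ["h", "q"]


def range_shard(key: str) -> str:
    """Look the key up in the (manually managed) range table."""
    return SHARDS[bisect.bisect_right(RANGE_BOUNDARIES, key)]


def hash_shard(key: str) -> str:
    """Hash the key with crc32 and take it modulo the number of shards."""
    return SHARDS[zlib.crc32(key.encode("utf-8")) % len(SHARDS)]


if __name__ == "__main__":
    for key in ("apple", "mango", "zebra"):
        print(key, range_shard(key), hash_shard(key))
```

Note that with the plain modulo approach, changing the number of shards remaps most keys, which is one of the drawbacks that pushes people towards consistent hashing or fixed hash slots.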
Because Partitioning is hard, Redis Cluster was introduced into Redis (GA as of 2015) to offer end users of Redis more of a "de facto standard" for physically implementing Partitioning while leveraging Redis.
How
Redis Cluster implementation terminology introduces Masters (Primary) and Slaves (Replica). At any given point in time, a given shard of data has 1 Master and N Slaves (N is configurable via --replicas N or resharding).
A given node in the Cluster talks to every other node over the Redis Cluster Bus, using the Gossip Protocol (e.g. Cassandra does the same thing).
Master → Slave(s) Shard replication occurs via async replication.
A given key, based on its hash, is mapped to 1 of 16384 hash slots (reasoning behind the number 16384). These hash slots are then distributed amongst the Shards - i.e. Master → Slave(s).
A key (the key name, not the value stored under it), say "my_key", is mapped to a given hash slot using the formula:
HASH_SLOT = CRC16("my_key") % 16384
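The formula above can be sketched in a few lines of Python. This assumes the CRC16 variant described in the Redis Cluster spec (polynomial 0x1021, initial value 0, often called XMODEM). The function names are my own and the bit-by-bit loop is written for readability rather than speed.

```python
TOTAL_SLOTS = 16384


def crc16_xmodem(data: bytes) -> int:
    """CRC16 with polynomial 0x1021 and initial value 0, one bit at a time."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def hash_slot(key: str) -> int:
    """Map a key name to one of the 16384 hash slots (hash tags are ignored here)."""
    return crc16_xmodem(key.encode("utf-8")) % TOTAL_SLOTS


if __name__ == "__main__":
    # "user:1000" is just an example key name.
    print(hash_slot("user:1000"))
```

A handy sanity check: the Redis Cluster spec lists 0x31C3 as the CRC16 of "123456789", so crc16_xmodem(b"123456789") should return that value.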
End users can force certain keys to be grouped together via hash tags using curly brackets {...} (Redis Enterprise extends this to include RegEx, great article here).
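As a rough sketch of how hash tags affect slot selection (for open source Redis Cluster, not the Redis Enterprise RegEx extension): if a key contains a "{" followed later by a "}" with at least one character in between, only the text between the first such pair is hashed. The function names below are my own, and I'm leaning on Python's binascii.crc_hqx, which with an initial value of 0 computes the same CRC16 variant the Redis Cluster spec describes.

```python
import binascii

TOTAL_SLOTS = 16384


def hashed_part(key: bytes) -> bytes:
    """Return the part of the key that is actually hashed, honouring hash tags."""
    start = key.find(b"{")
    if start == -1:
        return key  # no opening bracket: hash the whole key
    end = key.find(b"}", start + 1)
    if end == -1 or end == start + 1:
        return key  # no closing bracket, or empty tag "{}": hash the whole key
    return key[start + 1:end]


def slot_for_key(key: bytes) -> int:
    """Hash slot for a key, taking hash tags into account."""
    return binascii.crc_hqx(hashed_part(key), 0) % TOTAL_SLOTS


if __name__ == "__main__":
    # Both keys share the tag "user1000", so they land in the same slot.
    print(slot_for_key(b"{user1000}.following") == slot_for_key(b"{user1000}.followers"))
```

Keeping related keys in one slot is what makes multi-key operations on them possible in a cluster, at the cost of potentially uneven slot usage if one tag becomes very popular.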
Cluster setup from scratch
Following this excellent video (and associated GitHub repo) - you can use multipass on any laptop to end up with a 6 Node Redis Cluster (3 shards) in a few minutes:
At the end of the setup, I ended up with something like this:
And here's a logical architecture of the manual setup using OSS tools, and end user responsibilities:
End user management responsibilities
End user responsibilities observed from this simple demonstration:
VM Provisioning:
- In my case, manually setting up 6 VMs via multipass launch - in reality this would involve a datacenter/hypervisor/infrastructure etc.
- Installing several apt-get pre-reqs, downloading redis via wget, and running make to build from source after unzipping.
- Setting up each redis instance from its own node.conf file with the correct Cluster Parameters.
- Proper management of the aof and rdb files per node per disk.
- Day 2 operations (OS upgrades etc.)
Cluster Onboarding:
- In this case, via redis-cli --cluster create --cluster-replicas 1 IP1:7001 IP2:7002... - where we have to specify all the participating nodes at creation time.
- Using redis-cli --cluster add-node IPN:700N to add a new node.
- Using redis-server node.conf to re-onboard a node if it crashes.
Shard Rebalancing:
- Using redis-cli --cluster reshard IPN:700N to reshard the data, as well as implementing the sharding logic - i.e. answering Redis "How many slots do you want to move?"
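To give a feel for the "How many slots do you want to move?" question, here is a small sketch of the arithmetic the end user ends up doing by hand. It assumes we simply want an even split of the 16384 slots across masters. The function names are hypothetical, and the exact range boundaries redis-cli picks at cluster creation may differ slightly from the ones computed here.

```python
from typing import List

TOTAL_SLOTS = 16384


def even_slot_ranges(masters: int) -> List[range]:
    """Split slots 0..16383 into contiguous, near-equal ranges, one per master."""
    base, extra = divmod(TOTAL_SLOTS, masters)
    ranges = []
    start = 0
    for i in range(masters):
        size = base + (1 if i < extra else 0)  # spread the remainder over the first masters
        ranges.append(range(start, start + size))
        start += size
    return ranges


def slots_to_move(current_masters: int) -> int:
    """Slots a newly added, empty master needs to receive for an even split."""
    return TOTAL_SLOTS // (current_masters + 1)


if __name__ == "__main__":
    for r in even_slot_ranges(3):
        print(f"slots {r.start}-{r.stop - 1} ({len(r)} slots)")
    # Going from 3 to 4 masters: 16384 // 4 = 4096 slots for the new node.
    print(slots_to_move(3))
```

So for our 3 shard cluster, adding a 4th master means telling the reshard prompt to move 4096 slots, and then deciding which existing masters give them up.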
We can conclude that although Redis Cluster helps us with tools (such as redis-cli/trib), there's management and Day 2 Operations overhead on the end user when it comes to environment management.
Azure Cache for Redis
Offerings Summary
The chart below summarizes the different Azure Cache for Redis offerings available at the time of writing:
Further documentation of this feature comparison chart can be found here, and per SKU pricing details can be found here.
The lowest SKU that meets our original requirements - a managed Redis cluster (1) that is Network accessible (2) - is the Premium Tier. Lower Tiers (Basic, Standard) do not offer Clustering.
The Enterprise Tier takes things to the next level from Premium thanks to the continued innovations (e.g. Redis on Flash) by our friends over at Redis Labs - making Redis "Enterprise Ready" for very large scale (e.g. 13 TB) deployments.
Cluster setup via Azure
We can achieve the end goal of access to a managed, network integrated Redis cluster via a handful of clicks from the Azure Portal.
The diagram below attempts to visually contrast our management pain points from above, highlighting a subset of the key capabilities Azure Cache for Redis brings to the table as a Managed Redis Cluster offering:
The idea here is that the Redis cluster management experience, previously demanding on the end user (despite Redis OSS tooling), is abstracted away by the Azure Control Plane and surfaced through the UI (or Azure CLI/API) as the visual/managed experience we see above.
Detailed information on the line items above can be found here:
- Automated deployment: Deploy Azure Cache for Redis via PowerShell, CLI, ARM Template (Portal shown in screenshot).
- Sharding (In/Out): During first time setup and post-deployment.
- Vertical Scaling: Scaling within tier and across tiers, as well as automation.
- Turnkey Replica: During first time setup.
- HA deployment: Zone redundancy and other HA patterns, and DR patterns such as Geo-Replication.
- VNET Integration: VNET support for Premium and Private Link for all tiers.
Wrap Up
We explored the tooling available when setting up Redis Clusters from scratch and highlighted the management pain points, before contrasting how Azure Cache for Redis alleviates management overhead thanks to Azure's Control Plane operating on the Redis Cluster.
For a condensed summary of Azure Cache for Redis's capabilities, check out this great overview article.