Hi, I’m Raki.
I build highly scalable Data & AI software on the Microsoft stack.
LATEST ARTICLES
All opinions are my own and not those of my employer.

dbt with Fabric Spark in Production
Declarative is so hot right now
February 16th 2026
dbt, Microsoft Fabric, Apache Spark, Delta Lake, CICD, Devcontainer, Copilot

Column level lineage in Fabric Spark with OpenLineage and stashing the lineage in Delta Lake
You really don't need any other infra
January 10th 2026
OpenLineage, Microsoft Fabric, Apache Spark, Delta Lake

Alert on thousands of Fabric Pipelines with Monitoring Eventhouse
Say no to inline Data Factory conditionals, say no to spaghetti
December 14th 2025
Microsoft Fabric

Fabric Open Mirroring is really fast, and free (really!)
If ingestion is free, your data is the product
October 20th 2025
DuckDB, Microsoft Fabric

How to securely set up CICD automation on Fabric when Service principal usage is disabled on the tenant
Another giant networking magic trick
September 13th 2025
Azure Relay, Microsoft Fabric

How to set up a secure tunnel from your local machine to Fabric, Databricks, Synapse or anywhere else
A giant networking magic trick
June 15th 2025
Azure Relay

OpenTelemetry to Delta Lake with OTel Arrow Schema
Yes, it CAN be delightful to query OpenTelemetry data with SQL Server
May 19th 2025
Apache Spark, Apache Arrow, OpenTelemetry, Delta Lake, Microsoft Fabric, SQL Server

How to deeply instrument a Spark Cluster with OpenTelemetry (feat. real time Power BI report)
Everything you ever wanted to know about the JVM
May 12th 2025
Apache Spark, OpenTelemetry, Delta Lake, Power BI

Flushing 27+ GB/min from Event Hub to Delta Lake with delta-dotnet
Stream Processing Engine Vendors hate this "One Weird Trick" 🤬
November 10th 2024
Delta Lake, .NET, OneLake, Azure Event Hub

How I host this website on Azure for $1.50 per month
That's Canadian dollars
June 18th 2023
Azure CDN, Gatsby, Azure Blob Storage, JavaScript

Conquering Eviction Manager in Kubernetes
A reusable script to pretty print the signals Eviction Manager processes to evict Pods
January 29th 2023
Kubernetes

Chaos Testing Azure Arc-enabled SQL Managed Instance's High Availability
Using Chaos Mesh on Kubernetes to give SQL Server Availability Groups a hard time
December 5th 2022
Azure Arc, Kubernetes, SQL Server, Chaos Mesh

Arc Data Controller - Bring-Your-Own-SSL Certs
Steps to create SSL certificates for Kibana and Grafana
March 14th 2022
Azure Arc, Kubernetes

Arc SQL MI - increase storage size of PVC
Scripted steps for increasing PVC size for Arc SQL MI
January 22nd 2022
Azure Arc, Kubernetes, SQL Server, Active Directory

Arc SQL MI - Scripted Active Directory Setup
End-to-end scripted setup of an Active Directory environment for Arc SQL MI
December 31st 2021
Azure Arc, Kubernetes, SQL Server, Active Directory

Detecting SQL Column Decryption using Purview, Kafka, Kafdrop and Spark
Demonstrating a reusable method to leverage Purview's Atlas Hook to build an event-based, decryption detection mechanism
October 3rd 2021
Azure Purview, Azure SQL, Kafka, Apache Spark

Reverse Engineering Dockerfiles for Azure Arc-Enabled Data Services
Demonstrating a reusable method to reverse engineer Dockerfiles for Azure Arc-enabled SQL Managed Instance
July 25th 2021
Azure Arc, Kubernetes

Automating Purview Integration Runtime with the Proxy API
Demonstrating a wrapper script around the Proxy API to automate Integration Runtime VM Management
July 10th 2021
Azure Purview

Demonstrating Redis Cluster management with Azure Cache for Redis
Connecting the dots between "Redis Clusters" and how Azure manages all the underlying complexity
July 5th 2021
Redis

Querying Event Hub Capture files with Azure Data Explorer
Creating and querying external tables on Event Hub Capture Avro files with Azure Data Explorer
January 31st 2021
Azure Data Explorer, Azure Event Hub

Replicating data from SQL Server to Azure SQL MI & DB
An illustrative summary of the different SQL Data Synchronization and Replication Options
January 13th 2021
SQL Server

Stream Processing Event Hub Capture files with Autoloader
Processing Avro files and payloads from Event Hub Capture with Databricks Autoloader
January 4th 2021
Azure Event Hub, Databricks, Apache Spark

Exploring Purview’s REST API with Synapse
Programmatically accessing Data Lake Asset Classifications in Synapse Spark Pools with Purview's REST API
December 28th 2020
Azure Synapse, Azure Purview, Azure Blob Storage, Apache Spark

How to Recurse Data Lake Folders with Synapse Spark Pools
Demonstrating a handy recursion technique to populate all files in a Data Lake (and how to pretty print)
December 24th 2020
Azure Synapse, Apache Spark, Azure Blob Storage

Exploring Azure Schema Registry with Spark
Demonstrating Spark integration with Azure Schema Registry with native Event Hub endpoint and Kafka Surface
December 2nd 2020
Azure Event Hub, Databricks, Apache Spark

Databricks Autoloader Pipeline - an illustrated view
End-to-end illustrative walkthrough of an Autoloader Pipeline
November 26th 2020
Databricks, Azure Synapse

Automating Braze Data Ingestion to Synapse with Autoloader
End-to-end walkthrough of Autoloader setup for ingesting mock data from Braze
November 16th 2020
Databricks, Azure Synapse

Building an Intelligent Harry Potter Search Engine
A Full-Stack Web App for hosting BERT on Azure Containers
February 12th 2020
JavaScript, Databricks, Kubernetes, AI/ML

Spark Certification Study Guide - Part 2 (Application)
Part 2 of the Study Guide I created to pass the Spark Certification Exam
January 2nd 2020
Apache Spark, Databricks, Python

Spark Certification Study Guide - Part 1 (Core)
Part 1 of the Study Guide I created to pass the Spark Certification Exam
January 1st 2020
Apache Spark, Databricks, Python

Get in touch 👋
If you have any questions or suggestions, feel free to open an issue on GitHub!