The pattern
The architecture deck looks perfect. Then the project hits production and suddenly nothing can talk to anything securely or at the required latency.
Where it usually breaks
- Public endpoints used in production “just to get it working.”
- Missing Private Link or private endpoint configurations for storage, Key Vault, and external systems.
- Firewall rules that block Delta Live Tables or model serving traffic.
- No consideration for cross-subscription or cross-tenant connectivity patterns.
- Network policies that were never tested at scale.
What production-grade connectivity actually requires
- Private endpoints for all Azure services touching Databricks (storage accounts, Key Vault, Event Hubs, etc.).
- Private Link Service where Databricks needs to reach on-prem or third-party systems.
- Network Security Groups and route tables that are locked down but still allow necessary data movement.
- Proper DNS configuration so private endpoints resolve correctly inside the workspace VNet.
- Early end-to-end testing of the full connectivity path (DNS resolution, TCP/TLS reachability, and a real read or write), not just “ping works.”
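One way to make the DNS and connectivity checks above concrete is a small script run from a cluster inside the workspace VNet. The sketch below is illustrative only: the function names (`is_private_ip`, `check_endpoint`, `report`) are made up for this example, and any hostnames passed in are your own. It resolves each hostname, checks that every resolved address is in a private range (which is what you would expect when a private endpoint and its DNS zone are wired up correctly), and times a TCP connect. It does not validate TLS, authentication, or data-plane permissions, so treat it as a first gate rather than a full test.

```python
import ipaddress
import socket
import time


def is_private_ip(address: str) -> bool:
    """Return True if the address is in a private or otherwise non-public range."""
    # Drop an IPv6 scope id such as "%eth0" before parsing, if one is present.
    return ipaddress.ip_address(address.split("%")[0]).is_private


def check_endpoint(host: str, port: int = 443, timeout: float = 5.0) -> dict:
    """Resolve host, check that all addresses are private, and time a TCP connect."""
    result = {
        "host": host,
        "port": port,
        "addresses": [],
        "all_private": False,
        "connect_ms": None,
        "error": None,
    }
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
        addresses = sorted({info[4][0] for info in infos})
        result["addresses"] = addresses
        result["all_private"] = bool(addresses) and all(
            is_private_ip(a) for a in addresses
        )
        start = time.perf_counter()
        with socket.create_connection((host, port), timeout=timeout):
            elapsed = time.perf_counter() - start
        result["connect_ms"] = round(elapsed * 1000, 1)
    except OSError as exc:
        # Covers DNS failures, refused connections, and timeouts.
        result["error"] = str(exc)
    return result


def report(hosts: list) -> None:
    """Print one line per host so failures are easy to spot in a notebook."""
    for host in hosts:
        r = check_endpoint(host)
        if r["error"]:
            status = "FAIL"
        elif r["all_private"]:
            status = "OK"
        else:
            status = "PUBLIC"
        print(status, r["host"], r["addresses"], r["connect_ms"], r["error"])
```

In a notebook you might call `report([...])` with the fully qualified names of your own storage account, Key Vault, and Event Hubs namespace. A `PUBLIC` status suggests that the name still resolves to a public address from inside the VNet, which usually points to a missing or unlinked private DNS zone rather than a missing endpoint.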
The takeaway
Networking and private connectivity are not “someone else’s problem.” They are core delivery scope for any production Databricks platform. Treat them as first-class scope from day one, and you will sidestep some of the most painful (and most avoidable) delays in enterprise AI delivery.