NGINX vs AKS Ingress Controller: Choosing the Right Gateway for Kubernetes on Azure - Phase 1

Introduction

Kubernetes Ingress controllers are the front door to your cluster, managing how external traffic routes to your services. In Azure Kubernetes Service (AKS), teams often weigh the popular NGINX Ingress Controller against Azure’s native options like the Application Gateway Ingress Controller (AGIC). Each approach has distinct advantages and drawbacks. In this post, I’ll provide a broad comparison of NGINX vs. AKS’s ingress controllers (especially Azure Application Gateway), a detailed pros and cons list for each, quantitative performance data, insights on adoption trends, real community feedback, and a look at documentation & support. Our goal is to help CTOs and DevOps teams make an informed, opinionated decision on which ingress fits their needs.

(Note: “AKS Ingress Controller” in this context refers primarily to Azure’s Application Gateway Ingress Controller, the default Azure-managed L7 ingress option. Other third-party controllers exist, but we focus on NGINX and Azure’s own ingress.)

Ingress Controllers in AKS: An Overview

Ingress controllers act as smart reverse proxies inside (or in front of) your cluster, typically operating at Layer 7 (HTTP/HTTPS). They consolidate access so you can expose multiple services under one IP address with flexible routing rules (hosts, paths, etc.), rather than using separate LoadBalancers for each service. In AKS, you have two general approaches:

  • In-Cluster Ingress (NGINX) – Deploy the open-source NGINX ingress controller as pods in your cluster. This is one of the most widely used ingress controllers in Kubernetes. It watches Ingress resources and configures NGINX to route traffic accordingly. By default, you’d use an Azure L4 Load Balancer to funnel external traffic to the NGINX pods.

  • Azure Application Gateway Ingress (AGIC) – Use Azure’s managed Application Gateway (a layer-7 load balancer with optional WAF) as the ingress. A controller pod in AKS keeps the App Gateway’s configuration in sync with Kubernetes Ingress resources. The data path is handled by the Azure Application Gateway service instead of in-cluster pods. This is often considered the “native” ingress for AKS due to its tight integration with Azure’s ecosystem.

There are other options (Traefik, HAProxy, Istio, etc.), but NGINX and AGIC are among the most common in Azure environments. Now let’s dive deeper into each.
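To make the consolidation concrete, here is a minimal sketch of an Ingress manifest (hostnames and service names are hypothetical) that routes two paths on a single IP to different backend services; the same resource shape works with either controller, selected via the ingress class:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-ingress                # hypothetical name
spec:
  ingressClassName: nginx          # swap for the App Gateway class when using AGIC
  rules:
    - host: shop.example.com       # hypothetical host
      http:
        paths:
          - path: /api             # /api/* goes to the API service
            pathType: Prefix
            backend:
              service:
                name: api-svc      # hypothetical backend services
                port:
                  number: 80
          - path: /                # everything else goes to the web frontend
            pathType: Prefix
            backend:
              service:
                name: web-svc
                port:
                  number: 80
```

One Ingress like this replaces two separate `LoadBalancer` Services (and two public IPs), which is exactly the consolidation benefit described above.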

NGINX Ingress Controller in AKS

NGINX Ingress Controller (usually referring to the open-source Ingress-NGINX maintained by the Kubernetes community) is renowned for its versatility and robustness. Running NGINX ingress on AKS means you deploy it like any other app in your cluster (often via a Helm chart). Key points about NGINX ingress:

  • Proven Performance: NGINX is built for high-performance web serving and reverse proxying. It’s known to handle high volumes of concurrent connections efficiently with a low memory/CPU footprint. Many users report that NGINX can drive high throughput and low latency for their workloads. In fact, NGINX consistently ranks as the most popular Kubernetes ingress globally (one CNCF survey found ~64% of respondents use NGINX ingress).

  • Flexibility & Features: It offers a rich set of configuration options via Kubernetes annotations and ConfigMap settings. You can implement complex routing rules, URL rewrites, custom SSL/TLS configurations, session persistence, rate limiting, and more. This customizability is a big selling point – teams can fine-tune NGINX behavior to fit almost any scenario. There’s also an ecosystem of enhancements (e.g. Lua scripting, custom NGINX modules, or using NGINX Plus for advanced features).

  • Runs In-Cluster: NGINX ingress runs as one or more pods (typically deployed as a DaemonSet or Deployment). This means ingress is part of your cluster’s workload. The ingress pods will consume CPU/memory from your nodes, and you’re responsible for managing/monitoring them like any app. You’ll usually expose the ingress via an Azure Service of type LoadBalancer, which allocates an Azure LB and IP for external access. All external traffic goes through the Azure LB to the NGINX pods, which then route internally to services.

  • Wide Community Support: Because Ingress-NGINX is used on many cloud platforms (not just Azure), there’s a huge community. You’ll find countless guides, GitHub issues, and StackOverflow Q&As for troubleshooting. Documentation is extensive (Kubernetes docs, NGINX official docs, and community articles). If you run into a problem or need an example config, chances are someone has posted about it. This makes it easier to debug and get help compared to more niche controllers.
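As a sketch of the annotation-driven tuning described above, the following hypothetical Ingress snippet uses a few well-known ingress-nginx annotations for URL rewriting, rate limiting, and cookie-based session affinity (the name, path, and service are illustrative):

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: tuned-ingress                               # hypothetical name
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /   # rewrite matched paths to / before proxying
    nginx.ingress.kubernetes.io/limit-rps: "10"     # per-client rate limit (requests per second)
    nginx.ingress.kubernetes.io/affinity: "cookie"  # sticky sessions via an NGINX-issued cookie
spec:
  ingressClassName: nginx
  rules:
    - http:
        paths:
          - path: /app
            pathType: Prefix
            backend:
              service:
                name: app-svc                       # hypothetical backend service
                port:
                  number: 80
```

This is the kind of per-Ingress tuning that has no direct AGIC equivalent; AGIC has its own annotation namespace with a different (and smaller) feature set.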

In short, NGINX provides a tried-and-true, cloud-agnostic ingress solution. But it also means you manage that layer yourself within the cluster. Next, we’ll look at Azure’s alternative which offloads that management to a cloud service.

Azure Application Gateway Ingress Controller (AGIC) in AKS

Application Gateway Ingress Controller (AGIC) leverages Azure’s managed Application Gateway service to do the heavy lifting of ingress. Instead of running a proxy in-cluster, you run a small controller pod that translates Kubernetes Ingress resources into Azure Application Gateway config via Azure APIs. The actual traffic routing and load balancing is done by the Azure Application Gateway, which lives outside the cluster (in your Azure VNet). Key characteristics of AGIC:

  • Fully Managed Data Plane: The data path (HTTP/S processing) is handled by Azure’s Application Gateway, a fully managed L7 load balancer. This means no NGINX pods chewing up your cluster resources – all that work is offloaded. It also means Azure handles the reliability and scaling of that component. Application Gateway instances run on Azure VM scale sets managed by Microsoft, and can automatically scale out to handle load (if using the v2 SKU with autoscaling). From an Ops perspective, you’re outsourcing ingress routing to Azure’s infrastructure.

  • Native Azure Features: Because it’s an Azure service, you get out-of-the-box integration with Azure features. For example, Application Gateway supports zone redundancy, static IP addresses, Azure-managed SSL certificates, and Web Application Firewall (WAF) capabilities. Importantly, enabling the WAF on Application Gateway gives you enterprise-grade threat protection (OWASP rules, bot protection, etc.) with just a setting change – something not natively available in the open-source NGINX (you’d have to plug in something like ModSecurity yourself). If security and compliance are top concerns, this native WAF integration is a big advantage for AGIC.

  • Minimal Kubernetes Overhead: With AGIC, your AKS cluster’s nodes aren’t doing the heavy packet processing for ingress. The AGIC pod is relatively lightweight – its job is just to call Azure ARM APIs to update the App Gateway config when you create/modify Ingresses. So the footprint on the cluster is small. There’s also no need to run a dedicated ingress controller deployment with high availability, since Azure’s gateway is inherently HA. This simplifies your Kubernetes architecture – one less component whose lifecycle (NGINX upgrades, etc.) you have to manage.

  • Operations and Support: Because it’s an official Azure service, you can get Azure Support involved if something goes wrong at the ingress layer. Microsoft maintains AGIC (open-source on GitHub, but with official backing). Many enterprises value this “single throat to choke” – if ingress is misbehaving, you have a vendor to call. AGIC is also evolving rapidly under Azure’s roadmap (e.g., support for new TLS features, gRPC, etc. has been added over time). Microsoft has even introduced a next-gen variant called Application Gateway for Containers (AGC) aimed at addressing some limitations (in preview as of 2023).
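For comparison with the NGINX snippet earlier, here is a hedged sketch of what an AGIC-managed Ingress can look like (the name and service are hypothetical; the exact ingress class name can vary by AGIC version and setup). AGIC is selected via its ingress class and configured through `appgw.ingress.kubernetes.io/*` annotations:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: agic-ingress                                  # hypothetical name
  annotations:
    appgw.ingress.kubernetes.io/ssl-redirect: "true"  # HTTP-to-HTTPS redirect done at the gateway
spec:
  ingressClassName: azure-application-gateway         # class installed by the AGIC add-on
  rules:
    - http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: web-svc                         # hypothetical backend service
                port:
                  number: 80
```

The resource looks almost identical to the NGINX case – the difference is that applying it triggers ARM API calls that reconfigure the external Application Gateway, rather than a reload of an in-cluster proxy.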

In summary, AGIC offers a managed, Azure-integrated ingress with strong features like WAF and reduced cluster overhead. However, these benefits come with some trade-offs, which we’ll explore next.

Performance Benchmarks and Quantitative Analysis

Performance is often a deciding factor for ingress choice. Let’s examine latency, throughput, and resource usage for NGINX vs AGIC with available data:

  • Raw Throughput: Both NGINX and Azure App Gateway can handle substantial traffic. NGINX is event-driven and can handle thousands of requests per second per pod (assuming sufficient CPU). Application Gateway v2 can also scale out to multiple instances and has documented limits in the millions of requests per day range. In practice, throughput will depend on your instance size (for AppGW) or pod resources (for NGINX). It’s hard to crown a winner without specific scenarios – both are high-performance L7 proxies. If anything, NGINX might have an edge in absolute single-instance throughput since it’s highly optimized C code, whereas App Gateway might impose some overhead for cloud flexibility. But with autoscaling, App Gateway can match capacity by adding instances.

  • Latency (External Clients): For end-user-facing traffic coming from the internet, tests show minimal difference between NGINX and AGIC. In one head-to-head latency test, the average response times were 88 ms for Azure Application Gateway vs 90 ms for NGINX – essentially a tie. The min and max latencies were also very close (min ~85 ms each, max ~104 ms vs 109 ms). These tests were under light load (single concurrent user). The negligible gap suggests that for north-south traffic, the network transit time dominates and the choice of ingress controller adds only a few milliseconds either way. Azure’s claim that direct pod routing is faster didn’t show a meaningful impact in this scenario – likely because both still involve an Azure Load Balancer in front (for NGINX’s service) or DNS resolution, etc., and the processing time in each proxy is low (sub-millisecond).

  • Latency (Within Cluster): The story changes for east-west or internal calls via the ingress. If services inside the cluster call other services via the external ingress VIP (not a typical design, but sometimes done or in mesh-like architectures), NGINX can provide lower latency, especially if the NGINX pod is on the same node as the target service. The AKS team’s performance test created an “affinity” scenario (ingress and service pod on the same node) vs an “anti-affinity” scenario (all on different nodes). In the affinity case, NGINX’s average latency was ~2.95 ms vs 5.53 ms for AGIC, making NGINX ~47% faster. In the worst case (anti-affinity), NGINX averaged ~4.30 ms vs 5.09 ms, still ~15% faster. Across both scenarios, NGINX was ~38% faster on average. The takeaway: NGINX introduces less intra-cluster latency because it can often handle traffic on the same node or at least avoids an extra network hop. AGIC, on the other hand, always sends traffic out to the App Gateway and back in via the Azure fabric, which can add a few milliseconds. For most web apps, a few milliseconds is inconsequential, but for high-performance microservice call chains it could add up.

  • Failover Behavior: Quantitative data from failure tests highlights reliability differences. In a scenario where backend pods were rapidly killed and restarted, NGINX served 100% of requests without errors (no hiccups) while AGIC saw 50% of requests experience delays or issues, including some extreme outliers (30-second delays). This is because Kubernetes local networking updated NGINX almost instantly when pods changed, whereas AGIC’s updates to the external gateway lagged enough to impact traffic. So in terms of reliability under change, NGINX demonstrated superior handling of pod churn.

  • Resource Consumption: NGINX, as a container, will consume CPU and memory on your nodes. A single NGINX ingress controller can typically run in a few hundred MB of RAM and moderate CPU under load, but it scales with traffic (TLS handshakes in particular are CPU-intensive). You might allocate, say, 0.5 vCPU and 500 MiB memory reservation for an ingress controller pod as a baseline, but high throughput could use more. If you need dozens of NGINX pods, you are dedicating a portion of your cluster to ingress. By contrast, AGIC’s pod in AKS is tiny (negligible resource use), and the heavy lifting is on Azure’s side – which you don’t see in kubectl top, but you certainly pay for it in Azure bills. One benefit of AGIC here is your cluster autoscaler won’t add nodes just because ingress needs more capacity; it’s decoupled.
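As a starting point for the baseline reservation mentioned above, a resource block for the NGINX controller container might look like the following. The numbers are illustrative assumptions to tune against observed traffic, not recommendations from any benchmark:

```yaml
# Fragment of the ingress-nginx controller pod spec (values are illustrative)
resources:
  requests:
    cpu: 500m         # ~0.5 vCPU baseline; TLS handshakes are the main CPU driver
    memory: 512Mi     # a few hundred MiB is typical at moderate load
  limits:
    memory: 1Gi       # cap memory so a config blowup can't starve the node
```

With AGIC, there is no equivalent block to manage – capacity is set on the Application Gateway itself (instance count or autoscaling limits), which is the decoupling the bullet above describes.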

Community Feedback and Real-World Experiences

It’s enlightening to see what practitioners say on forums and issue trackers about these options. Here’s a summary of common sentiments from GitHub, Stack Overflow, Reddit, and other sources:

  • Simplicity vs Integration: Many users appreciate the simplicity of NGINX ingress. “It just works” is a common refrain – deploy it and you get a functional ingress with little fuss. Conversely, some have found AGIC setup and gotchas frustrating. For instance, the limit of 100 backend pools in App Gateway tripped up one team (they hit 102 services and got errors) – something they wouldn’t have to think about with NGINX. A Reddit poster dealing with that issue was leaning towards dumping AGIC and said “you’re probably better off just using NGINX ingress” after hitting such Azure-specific limits. This highlights how AGIC can introduce Azure quotas/limits that cluster operators aren’t used to encountering.

  • Azure Networking Critiques: It’s not uncommon to hear strong opinions like “Azure networking products are bad” from some in the Kubernetes community. This likely stems from experiences with the complexity or performance of things like App Gateway. While that’s a broad-brush statement, it’s true that debugging layer 7 issues in a black-box service can be harder than with an in-cluster proxy you control. People comfortable with Linux tools may find it tedious to have to go to the Azure Portal or CLI to get insights, versus just kubectl logs on an NGINX pod.

  • WAF and Security Needs: On the flip side, users who need a robust WAF often lean towards AGIC. One user mentioned that because AGIC initially had issues (like slow updates) and the new AGC didn’t yet have WAF, they went with Traefik + Coraza (open-source WAF) as an alternative. They also noted NGINX Plus with App Protect (commercial WAF) would have been a viable fit. This shows there’s demand for WAF in ingress, and if the cloud-native ingress doesn’t deliver, users will seek another combo. Microsoft is actively working on adding WAF to AGC, as noted by an Azure PM in that discussion. So Microsoft listens to feedback and is trying to close gaps.

  • Hybrid Approaches: Some advanced users take a hybrid approach – using both NGINX and Azure App Gateway together. For example, a pattern is to run NGINX ingress internally, but put an Azure Application Gateway or Front Door in front of it solely to get WAF and global routing, etc. In this setup, the external DNS points to App Gateway (with WAF), which then has a backend pool pointed at the NGINX service’s IP. Essentially NGINX does what it’s good at (K8s routing) and App Gateway adds an extra security layer on top. One Microsoft engineer actually recommended this approach for those needing WAF: “leverage NGINX within the cluster and put AppGW WAF in front, but don’t use AGIC – configure AppGW backend to point at the NGINX”. This speaks volumes – even Azure experts see the merit in NGINX’s integration, and suggest using App Gateway more as an external protector rather than the direct ingress controller via AGIC, in certain cases. The trade-off is you manage two layers and lose some dynamic syncing (you must configure AppGW manually or via Infrastructure-as-Code to point to NGINX), but it can be a pragmatic compromise.
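In the hybrid pattern above, NGINX is typically exposed on a private IP so that only the App Gateway can reach it. One way to sketch that is Azure’s internal load balancer annotation on the NGINX controller’s Service (the name and selector below are hypothetical and depend on how the controller was installed):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: ingress-nginx-controller                                     # hypothetical name
  annotations:
    service.beta.kubernetes.io/azure-load-balancer-internal: "true"  # private IP in the VNet, not public
spec:
  type: LoadBalancer
  selector:
    app.kubernetes.io/name: ingress-nginx    # matches the controller pods
  ports:
    - name: https
      port: 443
      targetPort: 443
```

The resulting private IP is then configured (manually or via Infrastructure-as-Code) as the App Gateway’s backend pool target – the “lost dynamic syncing” the bullet describes, since no controller keeps that pointer up to date for you.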

  • Issues and Bug Reports: Scanning GitHub issues in the AGIC repository, common complaints include: delays updating routes, support for new features (people asking for gRPC, mutual TLS, etc.), and confusion around annotations. AGIC uses Kubernetes Ingress resources with specific Azure annotations; if those aren’t set perfectly, you might get weird behaviors. The documentation covers them, but it’s another learning curve (different from NGINX’s annotations). With NGINX, many of those features exist too but are documented in the ingress-nginx repo and are more uniform across environments.

  • Stack Overflow Q&A: On StackOverflow, questions like “How to choose between NGINX and Application Gateway?” often get answers focusing on the above points: NGINX for simplicity and if cost is a concern; Application Gateway if you need WAF or Azure integration. One answer to a similar question highlighted that with ingress (NGINX) you can consolidate to one load balancer for many services to cut costs – implying that if cost-saving is the goal, NGINX ingress is the way (since one LB vs many saves $$).

  • Community Votes: In Kubernetes forums and Reddit polls, NGINX typically gets the nod as the default ingress. Traefik is another community favorite (especially in open-source circles), but Azure’s AGIC is usually only brought up by those in an Azure-specific context. That said, as AGIC matures, I see more Azure architects recommending it in cloud-specific forums, especially with the promise of new features like AGC (which apparently raises limits and improves performance). For example, an Azure PM (Jack Stromberg) engaged on Reddit informing users about “Application Gateway for Containers” and its upcoming WAF, trying to address concerns – showing Microsoft’s effort in community outreach for this product.

Conclusion: Which Ingress Controller Should You Choose?

Choosing between NGINX Ingress and Azure’s AGIC for your AKS cluster comes down to your priorities and constraints:

  • Choose NGINX Ingress Controller if… you value a proven, high-performance, and flexible solution that keeps you cloud-agnostic. It’s a great fit when you need fine-grained control over routing rules, or if you’re trying to minimize costs. Teams with strong Linux/K8s skills can leverage NGINX’s vast community and tune it to perfection. NGINX shines in scenarios where low latency between services is critical and you want to avoid any external dependencies that might slow down updates. It’s also the safer choice if you might move to another Kubernetes platform in the future – you won’t have to re-tool your ingress. In short, use NGINX when you need a “Kubernetes-native” ingress that you control, and when Azure-specific features like WAF are not deal-breakers (or can be handled via other means). As one LinkedIn tech lead summarized: go for NGINX if you need high customizability, efficiency, and strong community support.

  • Choose Application Gateway Ingress (AGIC) if… you are all-in on Azure and need the value-adds it provides. This is ideal when security is paramount – if having a managed WAF with Azure’s security updates gives you peace of mind, AGIC is a strong contender. It’s also a fit for teams that want to reduce the operational burden on their Kubernetes cluster – if running another moving piece (NGINX pods) is something you’d rather avoid, using a PaaS service can simplify things. Large enterprises already using Azure networking products will appreciate the uniformity (using the same App Gateway for AKS and other apps). If your org has Azure support and prefers Microsoft to manage critical infrastructure components, AGIC aligns with that model. In short, use AGIC when integration with Azure’s ecosystem and offloading complexity is more important than saving every last millisecond or dollar. The tight coupling with Azure can pay off if you leverage those features fully. Or as the advice often goes: opt for AGIC if you’re heavily invested in Azure’s stack and need native features like WAF and an autoscaling, managed ingress.

It’s worth noting you aren’t necessarily locked in forever – some organizations start with NGINX (for speed/ease), and later migrate to AGIC as they scale and require WAF; others do the opposite, starting with AGIC and switching to NGINX to cut costs or improve performance for specific cases. Both controllers can even coexist in one cluster (each handling different ingresses) if you want to do a gradual evaluation or migration.
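Coexistence works because each Ingress selects its controller explicitly via the ingress class. A minimal sketch, with hypothetical names (and the AGIC class name depending on your setup):

```yaml
# Two Ingresses in one cluster, each claimed by a different controller
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: legacy-app              # hypothetical
spec:
  ingressClassName: nginx       # served by the in-cluster NGINX controller
  rules:
    - http:
        paths:
          - path: /legacy
            pathType: Prefix
            backend:
              service:
                name: legacy-svc
                port:
                  number: 80
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: secure-app                               # hypothetical
spec:
  ingressClassName: azure-application-gateway    # served by AGIC, behind the WAF
  rules:
    - http:
        paths:
          - path: /secure
            pathType: Prefix
            backend:
              service:
                name: secure-svc
                port:
                  number: 80
```

Each controller ignores Ingresses carrying the other’s class, so you can migrate workloads one Ingress at a time.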
