🚀 Exploring Node.js Application Scaling through Clustering!

Node.js normally runs your code on a single thread, using just one CPU core no matter how powerful the server is.

👉 What clustering does:
- Spawns multiple worker processes (one per CPU core).
- The primary (master) process distributes incoming requests among the workers.
- Improves throughput, scalability, and resilience.

⚡ Why it's worth it: on a 4-core machine, 4 workers can handle roughly four times the load, and if a worker crashes, the primary process can fork a replacement so service keeps running. A minimal sketch of the pattern is included below.
How to Scale Node.js Applications with Clustering
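A minimal sketch of the clustering pattern described above, using Node's built-in cluster, os, and http modules. The port, the restart-on-exit policy, and the Node 16+ `isPrimary` naming are illustrative choices for this sketch, not requirements:

```typescript
// Minimal Node.js clustering sketch: one worker per CPU core,
// with the primary process re-forking any worker that exits.
import cluster from "node:cluster";
import { cpus } from "node:os";
import http from "node:http";

if (cluster.isPrimary) {
  // Fork one worker per CPU core.
  const coreCount = cpus().length;
  for (let i = 0; i < coreCount; i++) {
    cluster.fork();
  }

  // If a worker dies, fork a replacement so capacity is restored.
  cluster.on("exit", (worker) => {
    console.log(`worker ${worker.process.pid} exited, starting a new one`);
    cluster.fork();
  });
} else {
  // Each worker runs its own HTTP server; incoming connections are
  // distributed across workers (round-robin on most platforms).
  http
    .createServer((req, res) => {
      res.end(`handled by worker ${process.pid}\n`);
    })
    .listen(3000);
}
```

All workers serve the same port; the primary accepts incoming connections and spreads them across workers, so a crashed worker only costs one core's worth of capacity until its replacement is forked.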
-
𝗞𝘂𝗯𝗲𝗿𝗻𝗲𝘁𝗲𝘀 𝗜𝗻𝘁𝗲𝗿𝘃𝗶𝗲𝘄 𝗤𝘂𝗲𝘀𝘁𝗶𝗼𝗻: 𝗪𝗵𝗮𝘁 𝗵𝗮𝗽𝗽𝗲𝗻𝘀 𝘄𝗵𝗲𝗻 𝗮 𝗽𝗼𝗱 𝘂𝘀𝗲𝘀 𝘂𝗽 𝗺𝗲𝗺𝗼𝗿𝘆 𝗮𝗻𝗱 𝘄𝗵𝗮𝘁 𝗵𝗮𝗽𝗽𝗲𝗻𝘀 𝘄𝗵𝗲𝗻 𝗶𝘁 𝘂𝘀𝗲𝘀 𝘂𝗽 𝗖𝗣𝗨?

This is an interview question I was asked some time ago, and I thought I'd share the difference between the two scenarios.

𝗠𝗲𝗺𝗼𝗿𝘆 𝗮𝘁 𝟭𝟬𝟬% 𝗨𝘀𝗮𝗴𝗲
What happens technically:
🔹Each pod runs inside a cgroup with a memory limit.
🔹When the container's memory consumption reaches the limit, the kernel tries to allocate more RAM but fails because the cgroup enforces a hard cap.
🔹Memory usage cannot be throttled, so the kernel triggers an Out of Memory (OOM) kill inside the cgroup. Kubernetes then restarts the container depending on its restart policy.
🔹This is a hard failure.

𝗖𝗣𝗨 𝗮𝘁 𝟭𝟬𝟬% 𝗨𝘀𝗮𝗴𝗲
What happens technically:
🔹Unlike memory, CPU is a compressible resource: the kernel scheduler uses CFS (Completely Fair Scheduler) + cgroup quotas.
🔹The kernel throttles the container's processes (delays execution) so they don't exceed the quota, and work simply executes slower.
🔹This is a soft degradation, not a hard failure. It only causes performance degradation.

𝗦𝗶𝗺𝗽𝗹𝗲 𝗔𝗻𝗮𝗹𝗼𝗴𝘆 using a restaurant:
✅Memory is like tables. No free table, and customers get kicked out (OOMKill).
✅CPU is like waiters. Not enough waiters, and service is slower, but no one is kicked out.

#𝘈𝘒𝘚 #𝘒𝘶𝘣𝘦𝘳𝘯𝘦𝘵𝘦𝘴 #𝘋𝘦𝘷𝘖𝘱𝘴 #𝘚𝘙𝘌
-
Your pod is stuck in Pending and you're about to start randomly scaling nodes hoping something works.

Stop. Take a breath. Here's the systematic approach that actually fixes things.

When pods won't schedule, there are only 5 real culprits:

👉Resource starvation - Your nodes don't have enough CPU/memory. Check kubectl top nodes before you panic-scale the cluster.
👉Taint drama - Your workload is trying to land on a node that doesn't want it. Nodes with taints need pods with matching tolerations. It's like a club with a dress code.
👉Selector shenanigans - Your pod is being picky about which nodes it wants to run on. Check nodeSelector and affinity rules that might be too restrictive.
👉Storage problems - PersistentVolumeClaims can't find available storage or the storage class doesn't exist. Your pod is homeless until this gets fixed.
👉Scheduler confusion - Sometimes the scheduler just gives up. Check the scheduler logs for the real error message.

✅The debugging order matters: start with resources (most common), then work through taints, selectors, storage, and finally scheduler logs.

Pro tip: Use kubectl describe pod <name> first. The Events section usually tells you exactly what's wrong instead of making you guess.

Most "mysterious" Kubernetes issues have boring explanations. Skip the guesswork and follow the flowchart.

What's your most frustrating Pending pod story?
-
HPA Demystified (Part 1/3)

Most people think Kubernetes HPA is just about "scale Pods when CPU goes high." But the real picture is much deeper 👇

⚙️ How HPA actually works
🧩 It's one of the controllers running inside the kube-controller-manager.
⏱️ By default it checks metrics every 15 seconds and adjusts replicas (you can change --horizontal-pod-autoscaler-sync-period, e.g. to 5s, which makes HPA react faster but puts extra load on your metrics pipeline, i.e. metrics-server).
📜 It acts on an API resource --> HorizontalPodAutoscaler in autoscaling/v2 (latest & recommended).
⚡ Note: If you use the older autoscaling/v1, you only get CPU-based scaling. With autoscaling/v2, you unlock scaling on memory, custom metrics, and external metrics.

Where does it get metrics from?
💻 Resource Metrics API → CPU, memory (from metrics-server) 👉 `/apis/metrics.k8s.io/v1beta1`
⚡ Note: metrics-server runs as a cluster add-on and collects resource usage from each node's kubelet/cAdvisor.
🛠️ Custom Metrics API → app-specific metrics (via adapters) 👉 https://lnkd.in/ef23eBEy
🌐 Object Metrics API → metrics tied to objects like Ingress/Service (e.g., requests per second).
🚀 External Metrics API → metrics from outside the cluster 👉 https://lnkd.in/eqXpakd5
⚡ Example: For Datadog, you must run the Datadog Cluster Agent to expose Datadog metrics to Kubernetes.

Formula HPA uses:
desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric)

Example (worked through in the sketch below):
Current replicas = 4
Current CPU = 200m
Target CPU = 100m
➡️ desiredReplicas = ceil(4 × 200 / 100) = ceil(8) = 8
HPA scales from 4 --> 8 Pods.

⚠️ Important note: Many teams only configure HPA with CPU/memory metrics because it's easy. But CPU ≠ user experience. For web apps, latency or queue length might be better scaling signals.
👉 Always choose metrics that reflect your real workload needs, not just CPU usage.

📖 And of course, the ultimate bible for all things Kubernetes is here:
👉 https://lnkd.in/e5zjJfd7
(Everything I share is always referenced from here 😉)

That's the core loop of HPA (controller + API resource). But that's just the start; I've got 2 more posts coming where I'll dive into:
🚦 Special handling & conservative math 🛡️
🎛️ Advanced tuning (policies, stabilization, tolerance, container metrics) ⚙️

#Kubernetes #HPA #DevOps #CloudNative #Autoscaling #SRE
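To make the formula concrete, here is a tiny illustrative helper (my own sketch, not Kubernetes source code; the function name is made up) that reproduces the calculation above:

```typescript
// Illustrative sketch of the HPA scaling formula from the post:
// desiredReplicas = ceil(currentReplicas * currentMetric / targetMetric)
function desiredReplicas(
  currentReplicas: number,
  currentMetric: number, // e.g. observed average CPU in millicores
  targetMetric: number   // e.g. target average CPU in millicores
): number {
  return Math.ceil(currentReplicas * (currentMetric / targetMetric));
}

// The example from the post: 4 replicas running at 200m CPU against a 100m target.
console.log(desiredReplicas(4, 200, 100)); // 8 -> HPA scales 4 --> 8 Pods
```

The real controller also applies a tolerance band and stabilization logic around this math, which is exactly the "conservative math" and "advanced tuning" material teased for the next two posts.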
-
💡 Every .NET dev has seen it… 👉 w3wp.exe maxing out the CPU in Task Manager.

But here's the truth:
>w3wp.exe = the IIS Worker Process
>It simply runs your application and handles requests
>If CPU spikes, it's usually caused by your code: memory leaks, infinite loops, heavy queries, or poorly optimized logic

✅ Next time you see w3wp.exe eating resources, don't panic. Fix the code behind it.

#DotNet #IIS #w3wp #Debugging #DeveloperLife
-
The most expensive code isn't the buggy code. It's the code you refuse to delete.

Our first logging system looked "clever":
1/ Agents opened websockets to a hub
2/ Hub fetched logs from agents on request
3/ Agents streamed logs back to users

It worked… until more than one user requested the same logs. CPU spiked. Bandwidth doubled. We almost wasted weeks building caching, deduplication, batching.

But here's the truth: we weren't optimizing code. We were optimizing the wrong design.

So we scrapped it. Rebuilt with centralized log collection. One stream, many consumers. Done.

Lesson: don't ask "How can I optimize this code?" Ask "Is this code worth optimizing at all?"

Sometimes the cheapest optimization is rm -rf
-
The Payoff: Accelerator-Class Efficiency for Everything

Because all parts of your application run on the Fabric (not just selected kernels), it shifts the energy-performance curve on real workloads from end to end.

The results show up where it matters: longer battery life on devices without room for a farm of accelerators, lower cloud costs thanks to fewer servers (and less cooling), and a simpler software stack: one toolchain, one binary, one happy developer.

Ready to try it?
Developers can reach out to us to get SDK access and run existing C/C++ code through the effcc Compiler with no code changes.
Researchers can explore the architecture in our white paper.
Hardware partners can reach out for IP licensing or evaluation kits.

Learn more: https://lnkd.in/gvFxMHSb

#EfficientComputer #FabricArchitecture #effcc
-
How many asterisk disclaimers should be added to "We build it, we run it"?

Yes, we run it, but we don't own all the code of our dependencies.
Yes, we run it, but we don't own the Kubernetes codebase.
Yes, we run it, but we don't own the codebase of the hypervisors under our virtual machines.
Yes, we run it, but we don't own the operating system's core source or the network drivers on every device along the connection.
Yes, we run it, but we don't make our own CPUs or their microcode.

We checked the box saying we understand it all works, most of the time, and we rely on that as a base-level guarantee.

I'm still loving this motto, I'm just asking: what do we really mean by it?
-
How to reduce your #Spring #Boot #Docker container #RAM #memory usage by over 50% for FREE 🤯

Just switch your #JVM.

In the top #container, I'm using the #Eclipse #Temurin #Maven image as the base in my #Dockerfile, which uses the #HotSpot JVM, the default most people use.

In the bottom container, over 50% lighter on RAM, I'm using the #IBM #Semeru image, which uses the famously lighter Eclipse #OpenJ9 JVM, optimized for lower RAM usage.

Have fun. There, I saved you a few megabytes of RAM. You're welcome.

For those cautious about licensing: IBM only distributes the binaries; the runtime itself is open-source 😉
-
🚀 **Excited for PSI Metrics Coming to Kubernetes 1.34!** 🚀

I'm thrilled to see Pressure Stall Information (PSI) metrics becoming a beta feature in Kubernetes 1.34: https://lnkd.in/dMHdftry

PSI shines a spotlight on per-container CPU, memory, and I/O pressure, delivering a much more accurate picture of resource contention than traditional usage metrics. This is a huge leap forward for debugging and right-sizing workloads, especially in complex environments where node-level metrics just aren't enough.

I've already run into situations where node-level pressure metrics couldn't tell the whole story, and being able to see PSI at the container level will be a game-changer for reliability and troubleshooting.

Big thanks to this excellent blog post https://lnkd.in/d3PKjqGh by Zain M. for breaking down why PSI matters so much and why it's a better approach than plain utilization metrics.

FYI: the metric has already been collected by some compatible CRIs, such as containerd, for a while now. This means you could write your own exporter or manually SSH into the node to view the metrics for each container.

#Kubernetes #Monitoring #CloudNative #DevOps #k8s
-
Scaling challenge on EKS: hitting outbound limits

I am running a production Kubernetes cluster on EKS. Things scale fine on CPU and memory, but when traffic spikes and workloads make lots of outbound calls to a few external APIs, we start hitting bottlenecks:
1. Connection errors and timeouts under load
2. NAT gateway limits on outbound connections
3. Pods stuck in Pending even when nodes have capacity

Tried so far:
1. Reusing connections and pooling -> helps, but bursts still fail (a rough sketch of this approach is below)
2. Adding more NAT gateways -> expensive and fragile
3. Bigger nodes and network tweaks -> did not solve the core issue
4. Looked into Cilium egress gateways -> promising but adds more moving parts

The real problem: too many outbound connections to the same few IPs, and NAT runs out of capacity.

I would love to hear from anyone who has faced this in production:
1. Did using a proxy layer (Envoy/HAProxy) actually solve it?
2. Is Cilium egress gateway stable at scale?
3. Better to shard NAT gateways or assign fixed IPs per service?
4. Any retry/backoff strategies that actually worked?

Not much help from LLMs. Curious what's worked for you.
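For readers who want a concrete picture of item 1 under "Tried so far", here is a rough, hypothetical sketch. It assumes Node.js callers and the built-in https module; the socket limits, retry counts, and URL handling are placeholder choices, not a recommendation:

```typescript
// Hypothetical sketch: a shared keep-alive agent so outbound calls reuse
// TCP connections to the same external API, plus exponential backoff with
// jitter so retry bursts don't all hit the NAT gateway at once.
import https from "node:https";

const agent = new https.Agent({
  keepAlive: true,     // reuse sockets instead of opening a new one per request
  maxSockets: 50,      // cap concurrent sockets per host (tune to your workload)
  maxFreeSockets: 10,  // idle sockets kept open for reuse
});

function getOnce(url: string): Promise<string> {
  return new Promise<string>((resolve, reject) => {
    https
      .get(url, { agent }, (res) => {
        let body = "";
        res.on("data", (chunk) => (body += chunk));
        res.on("end", () => resolve(body)); // status-code handling omitted for brevity
      })
      .on("error", reject);
  });
}

async function getWithRetry(url: string, attempts = 5): Promise<string> {
  for (let i = 0; i < attempts; i++) {
    try {
      return await getOnce(url);
    } catch (err) {
      if (i === attempts - 1) throw err;
      // Exponential backoff (capped) with jitter to spread retries out.
      const delay = Math.min(1000 * 2 ** i, 10_000) * (0.5 + Math.random() / 2);
      await new Promise((r) => setTimeout(r, delay));
    }
  }
  throw new Error("unreachable");
}
```

The idea is mainly to cut how many new connections (and NAT source ports) each pod burns per burst, and to spread retries out so they don't all slam the NAT gateway at the same moment; whether that is enough depends on how spiky the traffic really is.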