Jan now runs faster on CUDA. We updated llama.cpp with the latest improvements and fixed a CUDA backend bug upstream. Jan stays pinned to v6324 due to flash-attention changes. If you have auto-update on, you'll get this automatically - if not, we recommend turning it on.
"Jan now runs faster on CUDA: Llama.cpp update and bug fix"
More Relevant Posts
-
Ever had a microservice dependency fail and bring down your entire application with it? It's a common problem in distributed systems, but one that can be solved with the Circuit Breaker pattern. I'm excited to share my new Dart package, dart_circuit_breaker, now available on pub.dev. It provides a simple, state-driven solution to prevent cascading failures and keep your services healthy. Learn how to build more robust applications and add it to your project today! 🔗 https://lnkd.in/dFKargyV #Dart #Flutter #SystemDesign #DistributedSystems #ResilientSystems #CircuitBreaker #SoftwareArchitecture
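The package itself is Dart, but the pattern is language-agnostic. As a rough illustration (not the dart_circuit_breaker API; every name here is hypothetical), a minimal TypeScript sketch of the closed/open/half-open state machine:

```typescript
// Minimal circuit breaker sketch: closed -> open -> half-open.
// Hypothetical illustration; not the dart_circuit_breaker API.
type State = "closed" | "open" | "halfOpen";

class CircuitBreaker {
  private state: State = "closed";
  private failures = 0;
  private openedAt = 0;

  constructor(
    private failureThreshold = 5,   // failures before tripping open
    private resetTimeoutMs = 30_000 // how long to stay open
  ) {}

  async call<T>(fn: () => Promise<T>): Promise<T> {
    if (this.state === "open") {
      // After the cooldown, let one trial request through.
      if (Date.now() - this.openedAt >= this.resetTimeoutMs) {
        this.state = "halfOpen";
      } else {
        throw new Error("circuit open: failing fast");
      }
    }
    try {
      const result = await fn();
      this.state = "closed"; // trial (or normal) call succeeded
      this.failures = 0;
      return result;
    } catch (err) {
      this.failures++;
      if (this.state === "halfOpen" || this.failures >= this.failureThreshold) {
        this.state = "open";
        this.openedAt = Date.now();
      }
      throw err;
    }
  }
}
```

Usage would look like `await breaker.call(() => fetchOrders())`: while the breaker is open, calls fail immediately instead of piling up behind a dead dependency.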
-
Virtual threads are fantastic for simplifying blocking I/O scenarios, but reactive streams still have several key advantages:

**Composition and Flow Control**: Reactive streams excel at complex data transformations and pipeline composition. You can elegantly chain operations like map, filter, flatMap, and handle backpressure declaratively. With virtual threads, you'd need more imperative coordination code for equivalent complex workflows.

**Backpressure Management**: This is huge for I/O-bound systems. Reactive streams have built-in backpressure handling - when downstream consumers can't keep up, the system can buffer, drop, or slow down producers automatically. Virtual threads don't inherently solve this; you still need explicit queue management and coordination.

**Resource Efficiency at Extreme Scale**: While virtual threads are lightweight, reactive streams can be even more efficient for scenarios with millions of concurrent operations, since they're event-driven rather than thread-based, even if those threads are virtual.

**Event-Driven Architectures**: For systems built around event streams, message brokers, or real-time data processing, reactive streams are more naturally aligned with the problem domain.

**Integration with Reactive Ecosystems**: If you're already using reactive databases, message systems, or frameworks, staying reactive end-to-end often makes more sense than mixing paradigms.

That said, virtual threads are game-changing for traditional request-response patterns and make blocking I/O code much simpler to write and debug. The choice often comes down to whether your problem is naturally stream-oriented versus request-oriented, and how much complexity you're willing to trade for the reactive benefits.

What's your take on the debugging and maintainability aspects? That's often where the rubber meets the road in real projects.
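To make the composition point concrete: the discussion above is JVM-centric, but the idea translates to RxJS in TypeScript. In this hedged sketch, `enrich` is a hypothetical stand-in for an I/O call, and mergeMap's second argument is the declarative flow-control knob that would otherwise need a hand-rolled semaphore:

```typescript
import { from, map, filter, mergeMap, toArray, lastValueFrom } from "rxjs";

// Hypothetical I/O step standing in for a network or database call.
async function enrich(id: number): Promise<string> {
  return `order-${id}`;
}

// Declarative pipeline: transformation, filtering, and bounded
// concurrency (a simple form of flow control) composed in one place.
const pipeline = from([1, 2, 3, 4, 5]).pipe(
  filter((id) => id % 2 === 1),     // drop uninteresting events
  mergeMap((id) => enrich(id), 2),  // at most 2 calls in flight
  map((label) => label.toUpperCase()),
  toArray()
);

lastValueFrom(pipeline).then(console.log);
// With plain threads (virtual or not), the "at most 2 in-flight"
// rule would need an explicit semaphore or work queue.
```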
Simplifying Code: migrating from Reactive to Virtual Threads

This is exactly what virtual threads were made for - making developers' lives simpler by making code easier to maintain. And yes, the reactive code might have been faster and less resource-intensive, but it probably was still less economical overall. Looking forward to the 'yes, but...' comments here 😂
-
🚀 After eight months of hard work, we removed the Rust engine binaries from Prisma ORM! Here's why:
😍 Reduced bundle size by ~90%
⚡️ Faster queries (~3.48x faster on average in our benchmark)
🐾 Lower CPU footprint
💡 Less deployment complexity
🤝 Easier open-source contributions

💡 The Rust-free Prisma ORM is ready for production as of v6.16.0. You can enable it by:
✅ setting the `engineType` option on the `generator` block
✅ installing the driver adapter for your database

👉 Learn more in the docs: https://lnkd.in/dM95zuGp
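As a rough sketch of what the setup can look like with the Postgres driver adapter (the exact adapter options and `engineType` value may vary by version; treat the specifics as assumptions and confirm against the linked docs):

```typescript
// Sketch: Prisma ORM with a driver adapter, so no Rust engine binary
// is shipped. Assumes @prisma/adapter-pg; options may differ by version.
import { PrismaClient } from "@prisma/client";
import { PrismaPg } from "@prisma/adapter-pg";

// The schema's generator block also needs engineType set, e.g.:
//   generator client {
//     provider   = "prisma-client-js"
//     engineType = "client"   // see the linked docs for the exact value
//   }

const adapter = new PrismaPg({ connectionString: process.env.DATABASE_URL! });
const prisma = new PrismaClient({ adapter });

async function main() {
  const users = await prisma.user.findMany(); // assumes a User model
  console.log(users);
}
main();
```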
-
Concurrency bugs don’t crash your code in dev. They wait… and strike in production. ⚡

Last week, I was scaling a Go service that looked flawless on paper. Goroutines everywhere, fan-out patterns in place, zero obvious bottlenecks.

🚨 But when real load hit, throughput nosedived even though CPU and memory were barely touched. The issue wasn’t the goroutines themselves — it was how they were orchestrated. A single blocked channel caused backpressure that rippled through the system. Some goroutines never released, others piled up, and suddenly the whole service stalled.

🛠️ The fix wasn’t “add more goroutines.” It was:
• Redesigning batching
• Adding buffered channels for controlled backpressure
• Implementing cancellation to stop zombie goroutines

💡 Go makes concurrency easy to start, but hard to master at scale. Goroutines are cheap — orchestration is where the real engineering happens.
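The post is about Go, but the two fixes that matter here (bounded in-flight work and cancellation) are portable. A minimal TypeScript sketch of the same orchestration ideas, with all names hypothetical:

```typescript
// Bounded-concurrency worker pool with cancellation: controlled
// backpressure plus no zombie workers. Hypothetical sketch, not the
// original Go code.
async function runPool<T>(
  jobs: T[],
  worker: (job: T, signal: AbortSignal) => Promise<void>,
  limit: number,       // bounded "buffer": at most `limit` jobs in flight
  signal: AbortSignal  // cancellation stops stragglers
): Promise<void> {
  let next = 0;
  const lane = async () => {
    while (next < jobs.length && !signal.aborted) {
      const job = jobs[next++]; // no await between check and take: safe
      await worker(job, signal);
    }
  };
  // Start `limit` lanes; new work is only picked up as a lane frees up.
  await Promise.all(Array.from({ length: limit }, lane));
}

// Usage: abort after 5s so no worker outlives the deadline.
const ctrl = new AbortController();
setTimeout(() => ctrl.abort(), 5000);
runPool([1, 2, 3, 4], async (n) => { console.log("job", n); }, 2, ctrl.signal);
```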
-
Low-level Swift: Linking

The LLVM backend transforms LLVM IR into machine code and produces object .o files. These files contain optimised, CPU-architecture-specific assembly instructions alongside metadata, constants, and debug information. The post includes an example of what ARM assembly object files can look like.

See the full blog here: https://lnkd.in/epWH2JYm
-
Playing around with Rust on ESP32 microcontrollers again. On the ESP-RS board it's easy: that board is made for Rust and uses the ESP32-C3 chip, which is a RISC-V core, so it just works with the latest stable Rust compiler. But some other boards I have are ESP32-S3 (dual core), and it turns out those aren't RISC-V but Xtensa cores, which need a special Espressif build of the Rust compiler.

I thought this was going to be a pain, but it turns out to be easy to install:
cargo install espup --locked
And then:
espup install
Followed by:
esp-generate --chip esp32s3 projectname

I'm hoping to write some drivers for various I2C-based sensors I have using this framework: https://lnkd.in/gHCyAFhy Hopefully it won't be too much of a challenge getting it working on both ESP32-C3 and S3 boards.

edit: I also had to install the Xtensa linker xtensa-esp32s3-elf-gcc https://lnkd.in/gaPu5h5z
-
Ingress controller routing mechanism

An Ingress resource (K8s object) defines rules: host, path → service backend (name + port). The ingress controller (NGINX, Traefik, Istio Gateway, etc.) watches Ingress objects and converts the rules into runtime config (reverse-proxy rules, routes).

Request flow:
1. External client → DNS → load balancer / NodePort / LB service IP.
2. LB → ingress controller pod(s).
3. Controller matches host + path to an Ingress rule.
4. Controller forwards to the corresponding Service (ClusterIP), which load-balances across Pod endpoints (Endpoints / EndpointSlices).
5. Controller may apply TLS termination, auth, rate limiting, rewrites, header mods, and upstream retries.

Notes: the controller decides based on rule order/priority and exact match vs. prefix. Some controllers support advanced routing (canary, weight-based splits), and a service mesh can intercept traffic and apply its own L7 logic.

#Kubernetes #Ingress #IngressController #K8s
-
🚀 𝗦𝗻𝗲𝗮𝗸 𝗽𝗲𝗲𝗸: 𝗻𝟴𝗻 𝗼𝗯𝘀𝗲𝗿𝘃𝗮𝗯𝗶𝗹𝗶𝘁𝘆 𝘄𝗶𝘁𝗵 𝗣𝗮𝗿𝘀𝗲𝗮𝗯𝗹𝗲 🔍

Setting up n8n workflows is fun; debugging them should feel the same. Question for you: what's the single most important thing you need when instrumenting observability for an n8n flow? Drop your #1 must-have in the comments below.

Why does this matter? n8n ships its logs through the popular Winston logger. Handy, but teams keep tripping over a few bumps:
• Performance tax – Winston's heavier JSON serialization can add noticeable CPU usage and latency versus lightweight loggers.
• Trace context gaps – getting Winston logs to play nicely with OpenTelemetry often needs custom code.

Stay tuned for the full write-up on auto-instrumenting Winston, normalizing logs, and linking everything to metrics + traces in one place. Until then, try Parseable at demo.parseable.com

#n8n #observability #logging #NodeJS #Winston #Parseable
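For context on the Winston point, here is a minimal TypeScript sketch of a JSON-format Winston logger of the kind n8n relies on (the field names are hypothetical, and this is not n8n's actual logger config):

```typescript
import winston from "winston";

// Minimal JSON-format Winston logger; illustrative only.
const logger = winston.createLogger({
  level: "info",
  format: winston.format.combine(
    winston.format.timestamp(),
    winston.format.json() // this serialization step is the "performance tax"
  ),
  transports: [new winston.transports.Console()],
});

// Hypothetical workflow event; structured fields ride along as JSON.
logger.info("workflow finished", { workflowId: "wf-42", durationMs: 118 });
// => {"durationMs":118,"level":"info","message":"workflow finished",...}
```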
-
In serverless, suspended functions can leak idle DB connections until the DB times them out, exhausting the pool. With Vercel's Fluid compute, you can prevent this by pairing Prisma ORM’s driver adapters with attachDatabasePool. Learn more 👇 https://lnkd.in/eUcd5MaS
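A minimal sketch of that pairing, assuming the `attachDatabasePool` helper from `@vercel/functions` and a pg Pool handed to Prisma's Postgres driver adapter; the adapter's constructor has changed across Prisma versions, so confirm against the linked guide:

```typescript
// Sketch: keep Fluid compute from leaking idle connections when a
// function instance is suspended. attachDatabasePool asks Vercel to
// release the pool's idle connections before suspension.
// Assumption: this version of @prisma/adapter-pg accepts an existing
// pg Pool (newer versions take a config object; see the linked guide).
import { Pool } from "pg";
import { attachDatabasePool } from "@vercel/functions";
import { PrismaPg } from "@prisma/adapter-pg";
import { PrismaClient } from "@prisma/client";

const pool = new Pool({ connectionString: process.env.DATABASE_URL });
attachDatabasePool(pool); // idle connections released on suspend

const adapter = new PrismaPg(pool);
const prisma = new PrismaClient({ adapter });
```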
-
🚀 Exploring Node.js Application Scaling through Clustering!

Node.js typically operates on a single thread, utilizing just one CPU core no matter how powerful the server is.

👉 Delving into Clustering:
- Spawns multiple worker processes (one per CPU core).
- The master process distributes incoming requests among the workers.
- Improves throughput, scalability, and resilience.

⚡ Why it's great: on a 4-core system, 4 workers can handle roughly four times the load, and if a worker fails, the master restarts it seamlessly.
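A minimal sketch of that setup with Node's built-in cluster module (the port and log messages here are arbitrary):

```typescript
// Classic Node.js clustering: one worker per CPU core, with the
// primary restarting any worker that dies.
import cluster from "node:cluster";
import os from "node:os";
import http from "node:http";

if (cluster.isPrimary) {
  const cores = os.cpus().length;
  for (let i = 0; i < cores; i++) cluster.fork(); // one worker per core
  cluster.on("exit", (worker) => {
    console.log(`worker ${worker.process.pid} died; forking a new one`);
    cluster.fork(); // seamless restart, as described above
  });
} else {
  // Workers share the listening port; incoming connections are
  // distributed among them by the primary.
  http.createServer((req, res) => {
    res.end(`handled by pid ${process.pid}\n`);
  }).listen(3000);
}
```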