The PyTorch community solved torch.compile's complex number limitation through tensor subclassing. Instead of rewriting the entire compiler backend, they created an elegant wrapper that:
✓ Maintains performance
✓ Preserves existing APIs
✓ Works with hardware optimizations
Innovation through extensibility. Get the technical breakdown in Andrew M. James' PyData Boston talk: https://lnkd.in/dtPU3txe
#PyTorch #OpenSource #Innovation
How the PyTorch community solved the complex number limitation
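For a feel of the general idea before watching the talk, here is a toy sketch (not the actual implementation described in the talk) of how a tensor subclass can store complex values as paired real values and intercept operations via __torch_function__. The class name and the single handled op are illustrative assumptions.

```python
import torch

class ComplexAsReal(torch.Tensor):
    """Toy subclass: stores a complex tensor as its real (..., 2) view and
    reimplements addition on that view. Purely illustrative."""

    @staticmethod
    def __new__(cls, complex_tensor):
        # view_as_real exposes real/imag parts as a trailing dimension of size 2
        data = torch.view_as_real(complex_tensor)
        return torch.Tensor._make_subclass(cls, data)

    def to_complex(self):
        return torch.view_as_complex(self.as_subclass(torch.Tensor))

    @classmethod
    def __torch_function__(cls, func, types, args=(), kwargs=None):
        kwargs = kwargs or {}
        if func in (torch.add, torch.Tensor.add):
            a = args[0].as_subclass(torch.Tensor)
            b = args[1].as_subclass(torch.Tensor)
            # Complex addition is elementwise on the real/imag pairs
            return torch.Tensor._make_subclass(cls, a + b)
        # Everything else falls back to the default Tensor behavior
        return super().__torch_function__(func, types, args, kwargs)

z1 = torch.tensor([1 + 2j, 3 + 4j])
z2 = torch.tensor([5 + 6j, 7 + 8j])
result = torch.add(ComplexAsReal(z1), ComplexAsReal(z2))
print(result.to_complex())  # tensor([ 6.+8.j, 10.+12.j])
```

The point of the pattern is that existing real-valued kernels and hardware paths keep doing the work; the subclass only reinterprets the data layout and dispatch.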
More Relevant Posts
PyTorch + vLLM (blog link in the comment below): PyTorch's vLLM disaggregated inference improves efficiency, latency, and throughput compared with their internal stack, and they are now making it available upstream. Yay!
Any recommendations for open source code to build foundation models? I am considering options like the PyTorch Foundation Model Stack, LangChain, mosaicml/llm-foundry, etc. I prefer PyTorch (over JAX), and I am primarily interested in the pipeline, tokenization, dataloaders, evaluation, etc. Embeddings, transformers, or any other neural architecture or component thereof are irrelevant, as long as they are not hardwired into the system and are easy to remove entirely.
🤔 What is a Transformer? 🕑 2-minute intro
👉 A Transformer is a neural network architecture that learns context and relationships in sequential data using a mechanism called attention.
⭐️ It is an encoder followed by a decoder, each with multiple heads.
🤓 Why does it matter? Examples? How to use? 👇 Click through this quick, fun carousel to find out in 2 mins.
📺 Video tutorial in the first comment.
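As a quick illustration of the attention mechanism mentioned above, here is a minimal single-head scaled dot-product attention sketch in PyTorch (toy shapes, no masking or multi-head logic; function and variable names are my own):

```python
import torch
import torch.nn.functional as F

def attention(q, k, v):
    # q, k, v: (batch, seq_len, d_model); single head, no masking
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d_k ** 0.5  # how relevant each position is to each other
    weights = F.softmax(scores, dim=-1)            # normalize relevance into attention weights
    return weights @ v                             # each output is a weighted mix of the values

x = torch.randn(1, 5, 16)   # a toy "sequence" of 5 positions
out = attention(x, x, x)    # self-attention: queries, keys, values from the same input
print(out.shape)            # torch.Size([1, 5, 16])
```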
Spent some time evaluating Cerebras' free inference with Qwen-3 models. It is very fast, with reasonable responses: https://chat.cerebras.ai/ Considering the potential of having four nodes of WSE-3 hardware on-prem, with Claude 4.1, Claude 4.0, and GPT-5 as the core models, it sparked thoughts about the productivity boost for architecture exploration, code exploration, maintenance, and the foundation to accelerate code ports.
TIL how masked self-attention is actually implemented in PyTorch. We all know that masked self-attention prevents the model from looking at future tokens. But how is it actually implemented under the hood? We create a lower triangular matrix using torch.tril and apply it to the attention scores. Refer: https://lnkd.in/gRYETTmx
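A minimal sketch of that idea (toy shapes; the linked post may differ in details): build the lower triangular mask with torch.tril and use it to block attention to future positions before the softmax.

```python
import torch
import torch.nn.functional as F

seq_len, d_k = 4, 8
q = torch.randn(1, seq_len, d_k)
k = torch.randn(1, seq_len, d_k)

scores = q @ k.transpose(-2, -1) / d_k ** 0.5       # raw attention scores

# Lower-triangular mask: position i may only attend to positions <= i
mask = torch.tril(torch.ones(seq_len, seq_len, dtype=torch.bool))
scores = scores.masked_fill(~mask, float("-inf"))   # future positions get -inf

weights = F.softmax(scores, dim=-1)                 # -inf scores become exactly 0 weight
print(weights[0])                                   # upper triangle is all zeros
```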
Since PyTorch 2.0 (PT2) introduced compilation, model execution speed and runtime efficiency have improved. For very large, complex models at Meta, however, initial compilation became a significant bottleneck. This blog details how parallel Triton compilation, dynamic shape marking, autotuning config pruning, and cache improvements (including MegaCache) reduced PT2 compile time for one of Meta's largest foundation models by more than 80%. These optimizations are integrated into the PT2 compiler stack.
🔗 https://hubs.la/Q03JZ-sq0
By Mingming Ding, James Wu, Oguz Ulgen, Sam Larsen, Bob Ren, Laith Sakka, Pian Pawakapan, Animesh Jain, Edward Yang, Yuzhen Huang, Ruilin Chen, Daohang Shi, Shuai Yang, Menglu Yu, Chunzhi Yang, Jade Nie
#PyTorchFoundation #PyTorch #OpenSourceAI #AIInfrastructure
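As one hedged illustration of the dynamic-shape technique mentioned above (not code from the blog), marking a dimension as dynamic tells torch.compile not to specialize on a concrete size, which can avoid recompiling when that size changes; the toy model and sizes are assumptions:

```python
import torch
import torch._dynamo

model = torch.nn.Linear(128, 64)
compiled = torch.compile(model)

x = torch.randn(32, 128)
# Mark the batch dimension as dynamic so the compiler does not specialize on size 32
torch._dynamo.mark_dynamic(x, 0)
compiled(x)

y = torch.randn(64, 128)  # different batch size; ideally reuses the same compiled graph
compiled(y)
```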
I found an easy and fun way to remember the PyTorch training loop.
1) Forward pass
2) Loss calculation
3) Zero gradients
4) Backpropagate
5) Step the optimizer, and repeat
Source: https://lnkd.in/ghktsQQd
#research #computation #neuralnetwork #deeplearning
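A minimal sketch of those five steps on a toy model and dataset (the model, data, and hyperparameters are illustrative):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Toy data and model purely for illustration
dataset = TensorDataset(torch.randn(64, 10), torch.randn(64, 1))
loader = DataLoader(dataset, batch_size=8)

model = torch.nn.Linear(10, 1)
loss_fn = torch.nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

for x, y in loader:
    y_pred = model(x)           # 1) forward pass
    loss = loss_fn(y_pred, y)   # 2) loss calculation
    optimizer.zero_grad()       # 3) zero gradients
    loss.backward()             # 4) backpropagate
    optimizer.step()            # 5) step the optimizer, then repeat on the next batch
```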
This is why it's important to understand the data you are fine-tuning on, and not to treat the open source tutorial code you copied from a Colab notebook on the internet as a black box. I have unfortunately witnessed this mistake recently as well.
I have seen many people doing SFT the wrong way. The model attends to the prompt, but you should never base your loss function on it. How do you do this? Add `-100` labels corresponding to the prompt tokens, so that PyTorch's CrossEntropyLoss knows to ignore them.
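A minimal sketch of the prompt-masking idea (the token IDs, prompt length, and vocab size are made up for illustration; in real SFT the logits come from the model):

```python
import torch
import torch.nn.functional as F

# Made-up token IDs: the first prompt_len tokens are the prompt, the rest the response
input_ids = torch.tensor([[101, 2054, 2003, 1996, 3437, 102, 2009, 2003, 4248, 102]])
prompt_len = 6
vocab_size = 30522

labels = input_ids.clone()
labels[:, :prompt_len] = -100          # -100 is CrossEntropyLoss's default ignore_index

# In real SFT these logits come from the model: shape (batch, seq_len, vocab_size)
logits = torch.randn(1, input_ids.size(1), vocab_size)

# Shift for next-token prediction, as in causal LM training
shift_logits = logits[:, :-1, :].contiguous()
shift_labels = labels[:, 1:].contiguous()

loss = F.cross_entropy(
    shift_logits.view(-1, vocab_size),
    shift_labels.view(-1),
    ignore_index=-100,                 # prompt positions contribute nothing to the loss
)
print(loss)
```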
Quantum needs a bigger stage. The potential is world-changing, but too often quantum programming feels locked behind complexity. With Intrico, we set out to change that.
Built in Rust and powered by our custom math engine, it makes creating entangled states as simple as writing a loop. In just 3 lines, you can spin up a Bell pair, the "Hello World" of quantum tech, and see entanglement in action. And yes, you can even simulate it and visualize the sampling outcomes in a histogram, right from your terminal. Check out this blog to see how: https://lnkd.in/dSPFN825
But this is only the beginning. Our mission is to bring quantum into the developer mainstream by making it:
- Fast (Rust + optimized math at its core)
- Physically rigorous (states normalize, rules respected)
- Intuitive (a clean, modern API that feels natural to code)
We'd love for you to try it, break it, and help us shape where it goes next. Repo: https://lnkd.in/d-CEygHx
This isn't just another release; it's a step toward making quantum part of every builder's toolkit.
#QuantumComputing #QuantumTechnology #OpenSource #RustProgramming