The PyTorch community solved torch.compile's complex number limitation through tensor subclassing. Instead of rewriting the entire compiler backend, they created an elegant wrapper that:
✓ Maintains performance
✓ Preserves existing APIs
✓ Works with hardware optimizations
Innovation through extensibility. Get the technical breakdown in Andrew M. James' PyData Boston talk: https://lnkd.in/dtPU3txe
#PyTorch #OpenSource #Innovation
How the PyTorch community solved the complex number limitation
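For a feel of the general idea before watching the talk, here is a toy sketch (not the actual implementation described in the talk) of how a tensor subclass can store complex values as paired real values and intercept operations via __torch_function__. The class name and the single handled op are illustrative assumptions.

```python
import torch

class ComplexAsReal(torch.Tensor):
    """Toy subclass: stores a complex tensor as its real (..., 2) view and
    reimplements addition on that view. Purely illustrative."""

    @staticmethod
    def __new__(cls, complex_tensor):
        # view_as_real exposes real/imag parts as a trailing dimension of size 2
        data = torch.view_as_real(complex_tensor)
        return torch.Tensor._make_subclass(cls, data)

    def to_complex(self):
        return torch.view_as_complex(self.as_subclass(torch.Tensor))

    @classmethod
    def __torch_function__(cls, func, types, args=(), kwargs=None):
        kwargs = kwargs or {}
        if func in (torch.add, torch.Tensor.add):
            a = args[0].as_subclass(torch.Tensor)
            b = args[1].as_subclass(torch.Tensor)
            # Complex addition is elementwise on the real/imag pairs
            return torch.Tensor._make_subclass(cls, a + b)
        # Everything else falls back to the default Tensor behavior
        return super().__torch_function__(func, types, args, kwargs)

z1 = torch.tensor([1 + 2j, 3 + 4j])
z2 = torch.tensor([5 + 6j, 7 + 8j])
result = torch.add(ComplexAsReal(z1), ComplexAsReal(z2))
print(result.to_complex())  # tensor([ 6.+8.j, 10.+12.j])
```

The point of the pattern is that existing real-valued kernels and hardware paths keep doing the work; the subclass only reinterprets the data layout and dispatch.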
More Relevant Posts
PyTorch + vLLM (blog link in the comment below): PyTorch's vLLM disaggregated inference improves efficiency, latency, and throughput compared with their internal stack, and they are now making it available upstream. Yay!
Any recommendations for open source code to build foundation models? I am considering options like the PyTorch Foundation Model Stack, LangChain, mosaicml/llm-foundry, etc. I prefer PyTorch (over JAX), and I am primarily interested in the pipeline, tokenization, dataloaders, evaluation, etc. Embeddings, transformers, or any other neural architecture or component thereof are irrelevant, as long as they are not hardwired into the system and are easy to remove entirely.
🤔 What is a Transformer? 🕑 2-minute intro
👉 A Transformer is a neural network architecture that learns context and relationships in sequential data using a mechanism called attention.
⭐️ It is an encoder followed by a decoder, each with multiple heads.
🤓 Why does it matter? Examples? How to use? 👇 Click through this quick, fun carousel to find out in 2 mins.
📺 Video tutorial in the first comment.
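As a quick illustration of the attention mechanism mentioned above, here is a minimal single-head scaled dot-product attention sketch in PyTorch (toy shapes, no masking or multi-head logic; function and variable names are my own):

```python
import torch
import torch.nn.functional as F

def attention(q, k, v):
    # q, k, v: (batch, seq_len, d_model); single head, no masking
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d_k ** 0.5  # how relevant each position is to each other
    weights = F.softmax(scores, dim=-1)            # normalize relevance into attention weights
    return weights @ v                             # each output is a weighted mix of the values

x = torch.randn(1, 5, 16)   # a toy "sequence" of 5 positions
out = attention(x, x, x)    # self-attention: queries, keys, values from the same input
print(out.shape)            # torch.Size([1, 5, 16])
```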
Spent some time evaluating Cerebras' free inference with Qwen-3 models. It is very fast, with reasonable responses: https://chat.cerebras.ai/ Considering the potential of having four nodes of WSE-3 hardware on-prem, with Claude 4.1, Claude 4.0, and GPT-5 as the core models, it sparked thoughts about the productivity boost for architecture exploration, code exploration, maintenance, and the foundation to accelerate code ports.
TIL how masked self-attention is actually implemented in PyTorch. We all know that masked self-attention prevents the model from looking at future tokens. But how is it actually implemented under the hood? We create a lower triangular matrix using torch.tril and apply it to the attention scores. Refer: https://lnkd.in/gRYETTmx
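A minimal sketch of that idea (toy shapes; the linked post may differ in details): build the lower triangular mask with torch.tril and use it to block attention to future positions before the softmax.

```python
import torch
import torch.nn.functional as F

seq_len, d_k = 4, 8
q = torch.randn(1, seq_len, d_k)
k = torch.randn(1, seq_len, d_k)

scores = q @ k.transpose(-2, -1) / d_k ** 0.5       # raw attention scores

# Lower-triangular mask: position i may only attend to positions <= i
mask = torch.tril(torch.ones(seq_len, seq_len, dtype=torch.bool))
scores = scores.masked_fill(~mask, float("-inf"))   # future positions get -inf

weights = F.softmax(scores, dim=-1)                 # -inf scores become exactly 0 weight
print(weights[0])                                   # upper triangle is all zeros
```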
Since PyTorch 2.0 (PT2) introduced compilation, model execution speed and runtime efficiency have improved. For very large, complex models at Meta, however, initial compilation became a significant bottleneck. This blog details how parallel Triton compilation, dynamic shape marking, autotuning config pruning, and cache improvements (including MegaCache) reduced PT2 compile time for one of Meta's largest foundation models by more than 80%. These optimizations are integrated into the PT2 compiler stack.
🔗 https://hubs.la/Q03JZ-sq0
By Mingming Ding, James Wu, Oguz Ulgen, Sam Larsen, Bob Ren, Laith Sakka, Pian Pawakapan, Animesh Jain, Edward Yang, Yuzhen Huang, Ruilin Chen, Daohang Shi, Shuai Yang, Menglu Yu, Chunzhi Yang, Jade Nie
#PyTorchFoundation #PyTorch #OpenSourceAI #AIInfrastructure
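As one hedged illustration of the dynamic-shape technique mentioned above (not code from the blog), marking a dimension as dynamic tells torch.compile not to specialize on a concrete size, which can avoid recompiling when that size changes; the toy model and sizes are assumptions:

```python
import torch
import torch._dynamo

model = torch.nn.Linear(128, 64)
compiled = torch.compile(model)

x = torch.randn(32, 128)
# Mark the batch dimension as dynamic so the compiler does not specialize on size 32
torch._dynamo.mark_dynamic(x, 0)
compiled(x)

y = torch.randn(64, 128)  # different batch size; ideally reuses the same compiled graph
compiled(y)
```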
I found an easy and fun way to remember the PyTorch training loop.
1) Forward pass
2) Loss calculation
3) Zero gradients
4) Backpropagate
5) Step the optimizer, and repeat
Source: https://lnkd.in/ghktsQQd
#research #computation #neuralnetwork #deeplearning
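A minimal sketch of those five steps on a toy model and dataset (the model, data, and hyperparameters are illustrative):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Toy data and model purely for illustration
dataset = TensorDataset(torch.randn(64, 10), torch.randn(64, 1))
loader = DataLoader(dataset, batch_size=8)

model = torch.nn.Linear(10, 1)
loss_fn = torch.nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

for x, y in loader:
    y_pred = model(x)           # 1) forward pass
    loss = loss_fn(y_pred, y)   # 2) loss calculation
    optimizer.zero_grad()       # 3) zero gradients
    loss.backward()             # 4) backpropagate
    optimizer.step()            # 5) step the optimizer, then repeat on the next batch
```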
This is why it's important to understand the data you are fine-tuning on, and not to treat the open source tutorial code you copied from a Colab notebook on the internet as a black box. I have unfortunately witnessed this mistake recently as well.
I have seen many people doing SFT the wrong way. The model attends to the prompt, but you should never base your loss function on it. How do you do this? Add `-100` labels corresponding to the prompt tokens, so that PyTorch's CrossEntropyLoss knows to ignore them.
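A minimal sketch of the prompt-masking idea (the token IDs, prompt length, and vocab size are made up for illustration; in real SFT the logits come from the model):

```python
import torch
import torch.nn.functional as F

# Made-up token IDs: the first prompt_len tokens are the prompt, the rest the response
input_ids = torch.tensor([[101, 2054, 2003, 1996, 3437, 102, 2009, 2003, 4248, 102]])
prompt_len = 6
vocab_size = 30522

labels = input_ids.clone()
labels[:, :prompt_len] = -100          # -100 is CrossEntropyLoss's default ignore_index

# In real SFT these logits come from the model: shape (batch, seq_len, vocab_size)
logits = torch.randn(1, input_ids.size(1), vocab_size)

# Shift for next-token prediction, as in causal LM training
shift_logits = logits[:, :-1, :].contiguous()
shift_labels = labels[:, 1:].contiguous()

loss = F.cross_entropy(
    shift_logits.view(-1, vocab_size),
    shift_labels.view(-1),
    ignore_index=-100,                 # prompt positions contribute nothing to the loss
)
print(loss)
```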
Quantum needs a bigger stage. The potential is world-changing, but too often quantum programming feels locked behind complexity. With Intrico, we set out to change that.
Built in Rust and powered by our custom math engine, it makes creating entangled states as simple as writing a loop. In just 3 lines, you can spin up a Bell pair, the "Hello World" of quantum tech, and see entanglement in action. And yes, you can even simulate it and visualize the sampling outcomes in a histogram, right from your terminal. Check out this blog to see how: https://lnkd.in/dSPFN825
But this is only the beginning. Our mission is to bring quantum into the developer mainstream by making it:
- Fast (Rust + optimized math at its core)
- Physically rigorous (states normalize, rules respected)
- Intuitive (a clean, modern API that feels natural to code)
We'd love for you to try it, break it, and help us shape where it goes next. Repo: https://lnkd.in/d-CEygHx
This isn't just another release; it's a step toward making quantum part of every builder's toolkit.
#QuantumComputing #QuantumTechnology #OpenSource #RustProgramming