Streams
for the web
by @domenic
sources of streaming data
AJAX
WebSockets
files
web workers
IndexedDB
WebRTC

webcam
microphone
geolocation.watchPosition
the human finger
setInterval
Web Audio sources
readable streams

var flappyBird = readable.read();
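As a rough sketch of the readable side, here is how a stream wrapping a simple in-memory source is consumed with the API that eventually shipped; note that the shipped read() is promise-based, unlike the synchronous read() on this slide.

```javascript
// A readable stream over an in-memory source; ReadableStream is a
// global in modern browsers and in Node.js 18+.
const readable = new ReadableStream({
  start(controller) {
    controller.enqueue("flappyBird");
    controller.close();
  },
});

// Drain a readable stream into an array of chunks.
async function readAll(stream) {
  const reader = stream.getReader();
  const chunks = [];
  for (;;) {
    const { value, done } = await reader.read();
    if (done) return chunks;
    chunks.push(value);
  }
}
```

readAll(readable) resolves to ["flappyBird"] once the source closes.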
sinks for streaming data
AJAX
WebSockets
files
web workers
IndexedDB
WebRTC

<audio>
<video>
<canvas>
<template>
writable streams

writable.write(flappyBird);
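A matching sketch of the writable side, using the WritableStream API that eventually shipped; its write() returns a promise that resolves once the underlying sink has accepted the chunk.

```javascript
// A writable stream over an in-memory sink; WritableStream is a
// global in modern browsers and in Node.js 18+.
const sink = [];
const writable = new WritableStream({
  write(chunk) {
    sink.push(chunk); // deliver each chunk to the underlying sink
  },
});

async function writeOne(stream, chunk) {
  const writer = stream.getWriter();
  await writer.write(chunk); // resolves when the sink accepts the chunk
  await writer.close();
}
```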
transformations of streaming data
CSV parser
the HTML parser
JSON to HTML via template
string encoding/decoding
audio/video codecs
encryption/decryption
gzip
web workers
transform streams

transform.input.write(flappyBird);
var evilBird = transform.output.read();
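The shipped API renamed the two sides: transform.writable is this slide's "input" and transform.readable is its "output". A minimal sketch of the one-for-one synchronous transform the slide describes:

```javascript
// A transform that turns each flappy bird evil; TransformStream is a
// global in modern browsers and in Node.js 18+.
const makeEvil = new TransformStream({
  transform(chunk, controller) {
    controller.enqueue("evil " + chunk);
  },
});

async function transformOne(ts, chunk) {
  const writer = ts.writable.getWriter(); // the slide's "input" side
  const reader = ts.readable.getReader(); // the slide's "output" side
  writer.write(chunk); // not awaited: the read below provides the pull
  const { value } = await reader.read();
  return value;
}
```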
simple pipe chains

readable.pipeTo(writable);
complex pipe chains

readable
.pipeThrough(t1).pipeThrough(t2)
.pipeTo(writable);
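Putting the pieces together, a self-contained sketch of this complex pipe chain with the promise-returning pipeTo that eventually shipped:

```javascript
// readable -> t1 -> t2 -> writable, end to end.
async function runChain() {
  const readable = new ReadableStream({
    start(c) { c.enqueue("flappy bird"); c.close(); },
  });
  const t1 = new TransformStream({
    transform(chunk, c) { c.enqueue(chunk.toUpperCase()); },
  });
  const t2 = new TransformStream({
    transform(chunk, c) { c.enqueue(chunk + "!"); },
  });
  const results = [];
  const writable = new WritableStream({
    write(chunk) { results.push(chunk); },
  });

  await readable.pipeThrough(t1).pipeThrough(t2).pipeTo(writable);
  return results;
}
```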
the future
fetch("http://example.com/video.mp4")
.pipeThrough(new MP4DecoderStream())
.pipeThrough(specialEffectsWebWorker)
.pipeTo(document.query("video"));
the future
fetch("http://example.com/images.zip")
.pipeThrough(new Unzipper())
.pipeTo(new InfinitePhotoGallery(document.query("#gallery")));
the future
navigator.webcam
.pipeThrough(new AdaptiveVideoResizer())
.pipeTo(rtcPeerConnection);
the future
var ws = new WebSocket("ws://example.com/analytics");
var events = new EventStream(document, "click");
events
.pipeThrough(new EventThrottler())
.pipeThrough(new EventsAsJSON())
.pipeTo(ws.input);
the present
FileReader
readAsBinaryString
onprogress
loaded
total
target
target.result
XMLHttpRequest
open
responseText
send
onreadystatechange
MessageChannel
port1
port2
start
close

Worker
getUserMedia

src
srcObject
URL.createObjectURL
onloadedmetadata
createMediaStreamSource
canvas

getContext('2d').drawImage
toDataURL
createElement
appendChild
querySelector

postMessage
onmessage
onerror
terminate
event.data
MediaStreamTrack
onstarted
onended
readyState
stop
URL.createObjectURL
why streams win
• They’re a unifying abstraction
• They separate concerns
• They encourage reusable code
why streams win
• They efficiently map to low-level OS primitives
• They avoid buffering in memory
• They encapsulate backpressure
https://github.com/whatwg/streams
understanding streams
push vs. pull sources
readable stream buffering
backpressure
high-water marks
demo time!
http://thlorenz.github.io/stream-viz/
writable stream buffering
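Both source flavors from the outline above fit the one readable-stream abstraction. A sketch with the API that eventually shipped: a pull source implements pull(), called only when the consumer needs data, while a push source enqueues from its own callbacks inside start(). The emitter argument here is any hypothetical EventEmitter-style push source.

```javascript
// Pull source: data is fetched on demand, one pull() per needed chunk.
function fromPullSource(readChunk) {
  return new ReadableStream({
    pull(controller) {
      controller.enqueue(readChunk()); // e.g. one disk read per pull
    },
  });
}

// Push source: data arrives whenever the source feels like it; the
// stream's internal buffer holds it until somebody reads.
function fromPushSource(emitter) {
  return new ReadableStream({
    start(controller) {
      emitter.on("data", (chunk) => controller.enqueue(chunk));
      emitter.on("end", () => controller.close());
    },
  });
}
```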
congrats. that was the hard stuff.
can streams be generic?
sync vs. async
error handling
abort/cancel
[diagram slides: error (E) and close (C) signals propagating through a pipe chain; an erroring readable aborts the downstream writable, and an erroring writable cancels the upstream readable]
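A sketch of the two teardown signals with the API that eventually shipped: cancel() on a readable stream tells the source to stop producing, and abort() on a writable stream tells the sink to discard everything. Both shipped as promise-returning methods.

```javascript
// cancel(): "I don't need you anymore" -> delivered to the source.
let sourceTeardown = null;
const readable = new ReadableStream({
  cancel(reason) { sourceTeardown = reason; },
});

// abort(): "this whole thing was a failure" -> delivered to the sink.
let sinkTeardown = null;
const writable = new WritableStream({
  abort(reason) { sinkTeardown = reason; },
});

async function tearDown() {
  await readable.cancel("done early");
  await writable.abort("upstream error");
  return [sourceTeardown, sinkTeardown];
}
```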
phew.
i guess … i guess that was the easy stuff?
why?
Streams for the Web
https://github.com/whatwg/streams

Editor's Notes

  • #3: Let's talk for a minute about a very special book. This is the Principia Mathematica, by Isaac Newton. It's one of the most important works in the history of science. Among the many brilliant concepts introduced in the Principia, from calculus to orbital mechanics, underlying it all was one central abstraction. Newton, more so than any before him, was able to see to the core of the universe and understand one of the fundamental ideas that allow us to describe it. This fundamental idea of Newton's was the idea of a force.
  • #4: The concept of force is so powerful because it explains and unifies so much of the world around us. From gravity to electromagnetism to subatomic processes, everything we experience fits within the framework of forces acting on objects. Before Newton so clearly articulated the force concept, and formalized it with his rules of motion, the many phenomena of the natural world seemed separate: special cases, each with their own rules and laws. How do the planets orbit the sun? What is lightning? How do chemical reactions transpire? But once the underlying primitive of force was made clear, we were able to understand the world around us in a much more comprehensive manner. We think about the actors of the system, and the forces between them. And at this level of abstraction, science progressed rapidly, bringing us to the world we have today.
  • #5: It's because of this kind of abstract thinking, on the level of fundamental primitives like forces, that we can understand and manipulate the universe at the level we do today. Once you know the core concepts of a system, you can build higher-level concepts on top of them, or bend them to your will. You can accomplish things that would have seemed superhuman beforehand, but are now obvious, or even easy.
  • #6: Which brings us to streams. Because of course, in programming, just as in real life, it's the underlying primitives, the fundamental abstractions, that give us the real power. We can unify large portions of the programming landscape under streams, and in doing so accomplish things easily and naturally which before might not even have occurred to us. So let's talk about how exactly streams do this for us…
  • #8: Data comes out of a readable stream; you can read data from it. Whether it be any of the sources we mentioned before, or even just a bunch of flappy birds: the idea is to capture a streaming source of data as a concept.
  • #10: Data goes in to a writable stream; you write data to it. Any of the sinks we mentioned can be encapsulated in the notion of a writable stream.
  • #12: Transform streams are just both together: a writable stream that data goes in to, and a readable stream that data comes out of. In this case the transformation is very simple: it's synchronous, so we can read immediately after writing, and it's one-for-one, meaning each thing that goes in results in something coming out. In reality things are usually more complicated; e.g. a compressor will have much less data coming out than going in, and most transform streams take time to process their input before producing output.
  • #13: The most fundamental thing you can do with streams is pipe them to each other. In essence, this is the operation of reading from one and then writing to the other. But inside this seemingly-simple operation, we encapsulate a lot of complexity: matching the flow rates, propagating errors or other signals, and so on. But once we have that in hand…
  • #14: Here we see a more complex pipe chain, where data flows in one form from the original readable stream, being transformed twice before ultimately ending up in a writable stream. pipeThrough is a two-line sugar method that builds on pipeTo and the { input, output } structure of transform streams. So what does this look like, in practice? Well…
  • #22: What I and others have been working on for the last few months is streams for the web. We're putting together a spec, and a reference implementation, for streams that could go in your browser, and solve all those problems I mentioned earlier. Most importantly, we're trying very hard to draw upon the experience of Node. We get the benefit of a clean slate, so we can produce nicer APIs, but we want to make sure to incorporate all the important features of Node's streams, and do better where possible. Isaac and others have been advising us on what can be fixed, simplified, or added, and we've been heavily drawing on their experience. Note that we're doing our spec development in the open, on GitHub! Now if you go there, you'll notice there's still a lot of open issues: this is very much a work in progress. But it's shaping up really well, with a few implementers on board already.
  • #23: So. That's the high-level overview of what's going on. But I want to spend what time we have left on some more detailed stuff. Because it turns out that, when you go to write a spec for something, you end up having to really understand it---to dig deep into areas that before you just glossed over, and made assumptions about. I've found this process really fascinating, and I want to share some of what I've learned along the way with you.
  • #24: Consider a readable stream, wrapping some underlying source. There are two types: push sources, and pull sources. A push source, like a TCP socket, will be constantly generating data, like an EE (EventEmitter). A pull source, like a file, requires you to read from it: seek, read a specific length, etc. We want to unify both of these into a single abstraction, the readable stream. Readable streams can then present either a push or a pull interface: on('data') for push, or read() for pull. The problem with on('data') is that you lose data if you aren't listening! That was Node's original streams1 mistake. So we want a pull model; it is much more user-friendly.
  • #25: Now, let's think about that losing-data problem, and how we've solved it. When a push source underlies our readable stream, we're going to keep getting data. We need to keep it ready, and not throw it away, for when somebody calls read() on the stream. So every readable stream carries around a buffer with it, containing all the data that's come in so far but hasn't been read. We can even use that buffer in the pull source case. Instead of using it to store data that's being pushed at us, we can pull data into the buffer ahead of time, so that it's ready to be read quickly when someone calls read(). This is a nice performance improvement over waiting until we are read from to go out and do our expensive disk access, for example.
  • #26: Of course, this naturally leads you to a problem: what if your buffer is getting "too full"? That is, what if nobody is reading from it, for a long period of time? Or maybe they're reading from it pretty slowly. Like, what if you're piping a fast filesystem stream to a slow server? Or a webcam stream to a peer on a slow mobile connection? The answer for this is called backpressure. It means, when your buffer is too full, you send some signal to the underlying push source, saying "stop sending me so much data." Or, for a pull source, you just stop pulling so much data. It might not comply immediately, in which case you have to keep the data anyway---throwing away data is bad! But communicating this "pause" signal is crucial.
  • #27: There's one more interesting piece in this whole puzzle. Which is, "how full is too full?" The way we usually think of this is in terms of something called a high-water mark. We let the stream's buffer fill up until it reaches a certain point, at which we send the pause signal to the underlying source. Then we wait for the buffer to be drained, e.g. for someone to read all of the data we have. Once it's all drained, we send a "resume" signal, and the buffer starts getting full up again. We're still figuring out if this is exactly the best approach. Other approaches can be more complicated, involving e.g. low-water marks that let you resume before fully draining, or they can have no water marks at all, and just constantly send pause/resume signals. It's a bit tricky.
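The buffer-and-high-water-mark dance described in this note can be sketched as a deliberately simplified toy (a hypothetical class, not any real streams API): the buffer fills until it hits the high-water mark, a pause signal goes to the source, and fully draining the buffer sends resume.

```javascript
// Toy readable-side buffer with a high-water mark; paused is the
// signal a real stream would send to its underlying push source.
class ToyBuffer {
  constructor(highWaterMark) {
    this.highWaterMark = highWaterMark;
    this.chunks = [];
    this.paused = false;
  }
  push(chunk) {
    // The underlying source delivered data; buffer it for later reads.
    this.chunks.push(chunk);
    if (this.chunks.length >= this.highWaterMark) {
      this.paused = true; // backpressure: "stop sending me so much data"
    }
  }
  read() {
    const chunk = this.chunks.shift();
    if (this.chunks.length === 0) {
      this.paused = false; // fully drained: send the "resume" signal
    }
    return chunk;
  }
}
```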
  • #29: Writable streams also have buffers, but of course for a different reason than readable streams. The problem we're solving for writable streams is that most sinks expect a single write at a time, and expect those writes in order. So if someone writes two chunks of data to us, we need to be able to wait until the first chunk finishes before sending in the second chunk. While we're waiting, we store those chunks in the buffer. What's useful about that is that we can use the fullness of this buffer to communicate backward in a pipe chain. That is: if the writable stream being piped to is full, then we should stop reading from the readable stream and wait for the writable stream to drain its buffer first. It all fits together!
  • #30: OK. So. That is the most intricate part of understanding streams, and why they're important. It's all about backpressure, buffering, and the pipe chains. Take a deep breath. Now for some easier stuff.
  • #31: One of the other interesting questions that came up was, can streams be entirely generic? Or do we have to have byte streams, or string streams, or "object mode" like Node.js has, or other such switches? Well, the answer is "yes, but be careful." In particular, you need to be careful with your high-water marks. It's easy to say that your high-water mark is a megabyte, but … is it really meaningful to say that it's 16 objects? How big are these objects? What do they hold? If it's 16 <img>s, that might be too high. If it's 16 prices streaming from the server, then that might be too low. You need to think carefully about this. This is one of the reasons we're not sure high-water marks are the best idea after all. One thing you don't want to do is mix up strings and bytes in the same stream. Node streams have this kind of confusion built-in, where you can set the encoding for a byte stream and suddenly it becomes a string stream, and it's horrible. Think about Unicode characters getting cut off in the middle, and so on. Bad news. Lesson learned.
  • #32: If you think hard about all the systems we're dealing with here, it turns out that, although in the general case they're async, in reality much of the time the data is available synchronously. For example, when you read from a stream representing a file on disk, in general that will involve going out to the disk---an async operation. But much of the time, that operation could actually complete synchronously---your stream might be holding the data in its buffer already, or the OS might have cached that file into memory because it is accessed a lot, or any other such thing. Similarly, if you do a write, often you're actually writing to an in-memory representation of the file, which the OS will flush at some later, scheduled time. So it's very important that your basic reading and writing APIs present the ability to read and write data synchronously, when possible. It's tempting to make everything simpler, and just be async always. But this introduces artificial delays into the system, which are especially bad if you introduce them at every step in the pipe chain, as they transform your best cases from a smooth flow of data into a kind of stutter-stop.
  • #33: What happens if your stream encounters an unrecoverable error reading or writing? Well, generally this means that the stream is no good and should be thrown away. All buffered data is thrown away; any further attempts to read or write fail; etc. But what about if it's in a pipe chain?
  • #34: This is where a relatively new concept comes in: the abort and cancel signals. These are not present in Node streams, but forms of them are present in many of the experimental user-space streams I mentioned. And we've refined them already in our work, so much so that I haven't even updated the spec yet---I just have an open issue with my thoughts. The idea is that you can abort a writable stream, saying, "stop writing, throw everything away, this whole thing was a failure." And you can cancel a readable stream, saying "for whatever reason, I don't need you anymore; stop reading, clean up, and go home." These are slightly different. Let's see how these two signals play out in a pipe chain, in error situations:
  • #38: The web is under attack. We're too close to really see this, but consider how many startups these days build native apps before web apps---how much attention and user interest is captured in those walled gardens, those app stores. Our platform may have the most momentum right now, but it also has real problems---problems that could start our slow backslide into the same slow oblivion that's greeted Java-on-the-desktop, Flash, and .NET. But I believe an open ecosystem can win, by leveraging its strengths. This includes obvious things---like shareable URLs; or auto-updating sites that don't require centralized approval; or excellent search engines much better than those of any app store. But it also includes our community, and how they contribute to our platform. This is why I'm a co-signer of the Extensible Web Manifesto. It's a new approach to building our web platform, and it makes two points very dear to my heart. First, that we should focus on adding low-level primitives---the unifying concepts, like force, or streams, that I opened this talk with. Second, that we should use iteration by the community to inform our higher-level APIs. The streams work is an example of this process as well, given its heritage. It's a perfect fit for our new extensible web. So this is why I am so excited about streams: it's an important piece of the larger puzzle, of extending the web forward as best we can.
  • #39: Thanks!