LPC: The past, present, and future of Linux audio
The history, status, and future of audio on Linux systems were the topics of two talks—coming at the theme from two different directions—at the Linux Plumbers Conference (LPC). Ardour and JACK developer Paul Davis looked at audio mostly from the professional-audio perspective, while PulseAudio developer Lennart Poettering, unsurprisingly, discussed desktop audio. Davis's talk ranged over the full history of Linux audio and gave a look at where he'd like to see things go, while Poettering focused on the changes since last year's conference and "action items" for the coming year.
Davis: origins and futures
Davis started using Linux as the second employee at Amazon in 1994, and started working on audio and MIDI software for Linux in 1998. So, he has been working in Linux audio for more than ten years. His presentation was meant to provide a historical overview of why "audio on linux still sucks, even though I had my fingers in all the pies that make it suck". In addition, Davis believes there are lessons to be learned from the other two major desktop operating systems, Windows and Mac OS X, which may help in getting to better Linux audio.
He outlined what kind of audio support is needed for Linux, or, really, any operating system. Audio data should be able to be brought in or sent out of the system via any available audio interface as well as via the network. Audio data, as well as audio routing information, should be able to be shared between applications, and that routing should be able to be changed on the fly based on user requests or hardware reconfiguration. There needs to be a "unified approach" to mixer controls as well. Most important, perhaps, is that the system needs to be "easy to understand and to reason about".
Some history
Linux audio support began in the early 1990s with the Creative SoundBlaster driver, which became the foundation for the Open Sound System (OSS). By 1998, Davis said, there was growing dissatisfaction with the design of OSS, which led Jaroslav Kysela and others to begin work on the Advanced Linux Sound Architecture (ALSA).
Between 1999 and 2001, ALSA was redesigned several times, each time requiring audio applications to change because they would no longer compile. The ALSA sequencer, a kernel-space MIDI router, was also added during this time frame. By the end of 2001, ALSA was adopted as the official Linux audio system instead of OSS. But OSS didn't disappear and is still developed and used both on Linux and other UNIXes.
In the early parts of this decade, the Linux audio developer community started discussing techniques for connecting audio applications together, something that is not supported directly by ALSA. At roughly the same time, Davis started working on the Ardour digital audio workstation, which led to JACK. The audio handling engine from Ardour was turned into JACK, which is an "audio connection kit" that works on most operating systems. JACK is mostly concerned with the low-latency requirements of professional audio and music creation, rather than the needs of desktop users.
Since that time, the kernel has made strides in supporting realtime scheduling that can be used by JACK and others to provide skip-free audio performance, but much of that work is not available to users. Access to realtime scheduling is tightly controlled, so there is a significant amount of per-system configuration that must be done to access this functionality. Most distributions do not provide a means for regular users to enable realtime scheduling for audio applications, so most users are not benefiting from those changes.
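As a rough illustration of the per-system setup involved, most distributions gate realtime scheduling through the PAM limits mechanism; a minimal sketch (the file location and the "audio" group are assumptions, and details vary by distribution) looks something like this:

    # /etc/security/limits.conf (or a file under /etc/security/limits.d/)
    @audio   -   rtprio    95          # allow realtime priority up to 95
    @audio   -   memlock   unlimited   # allow locking audio buffers into RAM

The user then has to be a member of that group before JACK or other audio software can actually request realtime (SCHED_FIFO) scheduling, which is the kind of manual configuration Davis says keeps these kernel improvements away from most users.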
In the mid-2000s, Poettering started work on the PulseAudio server, KDE stopped using the aRts sound server, GStreamer emerged as a means for intra-application audio streaming, and so on. Desktops wanted "simple" audio access APIs and created things like Phonon and libsydney, but meanwhile JACK was the only way to access Firewire audio. All of that led to great confusion for Linux audio users, Davis said.
Audio application models
At the bottom, audio hardware works in a very simple manner. For record (or capture), there is a circular buffer in memory to which the hardware writes, and from which the software reads. Playback is just the reverse. In both cases, user space can add buffering on top of the circular buffer used by the hardware, which is useful for some purposes, and not for others.
There are two separate models that can be used between the software and the hardware. In a "push" model, the application decides when to read or write data and how much, while the "pull" model reverses that, requiring the hardware to determine when and how much I/O needs to be done. Supporting a push model requires buffering in the system to smooth over arbitrary application behavior. The pull model requires an application that can meet deadlines imposed by the hardware.
Davis maintains that supporting push functionality on top of pull is easy, just by adding buffering and an API. But supporting pull on top of push is difficult and tends to perform poorly. So, audio support needs to be based on the pull model at the low levels, with a push-based API added in on top, he said.
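As a sketch of what the pull model looks like in practice, here is the shape of a minimal JACK client: the server decides when audio is needed and calls the application's process callback with a deadline. This is illustrative only (the client and port names are made up), but the calls shown are the ordinary JACK client API.

    #include <math.h>
    #include <unistd.h>
    #include <jack/jack.h>

    static jack_port_t *out_port;
    static float phase;

    /* Called by the JACK server whenever it wants 'nframes' more frames of
     * audio: the application is "pulled" and must meet the server's deadline. */
    static int process(jack_nframes_t nframes, void *arg)
    {
        jack_default_audio_sample_t *buf = jack_port_get_buffer(out_port, nframes);
        jack_nframes_t i;

        for (i = 0; i < nframes; i++) {
            buf[i] = 0.2f * sinf(phase);    /* quiet 440 Hz test tone */
            phase += 2.0f * (float)M_PI * 440.0f / 48000.0f;  /* a real client
                                               would query jack_get_sample_rate() */
        }
        return 0;
    }

    int main(void)
    {
        jack_client_t *client = jack_client_open("pull-example", JackNullOption, NULL);

        if (client == NULL)
            return 1;
        jack_set_process_callback(client, process, NULL);
        out_port = jack_port_register(client, "out", JACK_DEFAULT_AUDIO_TYPE,
                                      JackPortIsOutput, 0);
        jack_activate(client);              /* the server starts calling process() */
        sleep(10);                          /* let it run for a while */
        jack_client_close(client);
        return 0;
    }

A push-style API can be layered on top of this by having a library keep a ring buffer that the callback drains, which is essentially the layering Davis is describing.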
Audio and video have much in common
OSS is based around the standard POSIX system calls, such as open(), read(), write(), mmap(), etc., while ALSA (which supports those same calls) is generally accessed through libasound, which has a "huge set of functions". Those functions provide ways to control hardware and software configuration along with a large number of commands to support various application styles.
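For contrast, the OSS style of access really is just those POSIX calls; a minimal push-model player is little more than the following sketch (device path and parameters are chosen arbitrarily for illustration, and error checking is omitted):

    #include <fcntl.h>
    #include <unistd.h>
    #include <sys/ioctl.h>
    #include <sys/soundcard.h>

    int main(void)
    {
        static short buf[2 * 44100];        /* one second of stereo silence */
        int fd, fmt = AFMT_S16_LE, channels = 2, rate = 44100;

        fd = open("/dev/dsp", O_WRONLY);    /* grab "the" audio device */
        ioctl(fd, SNDCTL_DSP_SETFMT, &fmt); /* 16-bit little-endian samples */
        ioctl(fd, SNDCTL_DSP_CHANNELS, &channels);
        ioctl(fd, SNDCTL_DSP_SPEED, &rate);
        write(fd, buf, sizeof(buf));        /* push data whenever we like */
        close(fd);
        return 0;
    }

The application opens the device, configures it with ioctl(), and writes whenever it wants; nothing in that model leaves room for another application, which is exactly the "believes it controls the hardware" behavior Davis objects to.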
In many ways, audio is like video, Davis said. Both generate a "human sensory experience" by rescanning a data buffer and "rendering" it to the output device. There are differences as well, mostly in refresh rates and the effect of missing refresh deadlines. Unlike audio, video data doesn't change that frequently when someone is just running a GUI—unless they are playing back a video. Missed video deadlines are often imperceptible, which is generally not true for audio. So, Davis asked, does anyone seriously propose that video/graphics applications should talk to the hardware directly via open/read/write/etc.? For graphics, that has been mediated by a server or server-like API for many years. Audio should be the same way, even though some disagree, "but they are wrong", he said.
The problem with UNIX
The standard UNIX methods of device handling, using open/read/write/etc., are not necessarily suitable interfaces for interacting with realtime hardware. Davis noted that he has been using UNIX for 25 years and loves it, but that the driver API lacks some important pieces for handling audio (and video). Neither temporal nor data-format semantics are part of that API, but both are necessary for handling audio/video data. The standard interfaces can be used, but they don't promote a pull-based application design.
What is needed is a "server-esque architecture" and API that can explicitly handle data format, routing, latency inquiries, and synchronization. That server would mediate all device interaction, and would live in user space. The API would not require that various services be put into the kernel. Applications would have to stop believing that they can and should directly control the hardware.
The OSS API must die
The OSS API requires that any services (like data format conversion, routing, etc.) be implemented in the kernel. It also encourages applications to do things that do not work well with other applications that are also trying to do some kind of audio task. OSS applications are written such that they believe they completely control the hardware.
Because of that, Davis was quite clear that the "OSS API must die". He noted that Fedora no longer supports OSS and was hopeful that other distributions would follow that lead.
When ALSA was adopted, there might have been an opportunity to get rid of OSS, but, at the time, there were a number of reasons not to do that, Davis said. Backward compatibility with OSS was felt to be important, and there was concern that doing realtime processing in user space was not going to be possible—which turned out to be wrong. He noted that even today there is nothing stopping users or distributors from installing OSS, nor anything stopping developers from writing OSS applications.
Looking at OS X and Windows audio
Apple took a completely different approach when they redesigned the audio API for Mac OS X. Mac OS 9 had a "crude audio architecture" that was completely replaced in OS X. No backward compatibility was supported and developers were just told to rewrite their applications. So, the CoreAudio component provides a single API that can support users on the desktop as well as professional audio applications.
On the other side of the coin, Windows has had three separate audio interfaces along the way. Each maintained backward compatibility at the API level, so that application developers did not need to change their code, though driver writers were required to. Windows has taken much longer to get low latency audio than either Linux or Mac OS X.
The clear implication is that backward compatibility tends to slow things down, which may not be a big surprise.
JACK and PulseAudio: are both needed?
JACK and PulseAudio currently serve different needs, but, according to Davis, there is hope that there could be convergence between them down the road. JACK is primarily concerned with low latency, while PulseAudio is targeted at the desktop, where application compatibility and power consumption are two of the highest priorities.
Both are certainly needed right now, as JACK conflicts with the application design of many desktop applications, while PulseAudio is not able to support professional audio applications. Even if an interface were designed to handle all of the requirements that are currently filled by JACK and PulseAudio, Davis wondered if there were a way to force the adoption of a new API. Distributions dropping support for OSS may provide the "stick" to move application developers away from that interface, but could something similar be done for a new API in the future?
If not, there are some real questions about how to improve the Linux audio infrastructure, Davis said. The continued existence of both JACK and PulseAudio, along with supporting older APIs, just leads to "continued confusion" about what the right way to do audio on Linux really is. He believes a unified API is possible from a technical perspective—Apple's CoreAudio is a good example—but it can only happen with "political and social manipulation".
Poettering: The state of Linux audio
The focus of Poettering's talk was desktop audio, rather than embedded or professional audio applications. He started by looking at what had changed since last year's LPC, noting that EsounD and OSS were officially gone ("RIP"), at least in Fedora. OSS can still be enabled in Fedora, but it was a "great achievement" to have it removed, he said. Bugs were reported against only three applications because of the OSS removal, VMware and quake2 among them. He said that there "weren't many complaints", but an audience member noted the "12,000 screaming users" of VMware as a significant problem. Poettering shrugged that off, saying that he encouraged other distributions to follow suit.
Confusion at last year's LPC led him to create the "Linux Audio API Guide", which has helped clarify the situation, though there were complaints about what he said about KDE and OSS.
Coming in Fedora 12, and in other distributions at "roughly the same time", is the use of realtime scheduling by default on the desktop for audio applications. There is a new mechanism to hand out realtime priority (RealtimeKit) that will prevent buggy or malicious applications from monopolizing the CPU—essentially causing a denial of service. The desktop now makes use of high-resolution timers, because they "really needed to get better than 1/HZ resolution" for audio applications.
Support for buffers of up to two seconds has been added. ALSA used to restrict the buffer size to 64K, which equates to roughly 370ms of CD-quality audio. Allowing bigger buffers is "the best thing you can do for power consumption", as well as for dropouts, he said.
Several things were moved into the audio server, including timer-based audio scheduling, which allows the server to "make decisions with respect to latency and interrupt rates". A new mixer abstraction was added, even though four already exist in ALSA. Those were very hardware-specific, Poettering said, while the new one is a very basic abstraction.
Audio hardware has acquired udev integration over the last year, and there is now "Bluetooth audio that actually works". Poettering also noted that audio often didn't work "out of the box" because there was no mixer information available for the hardware. Since last year, an ALSA mixer initialization database has been created and populated: "It's pretty complete", he said.
Challenges for the next year
There were a number of issues with the current sound drivers that Poettering listed as needing attention in the coming year. Currently, for power saving purposes, PulseAudio shuts down devices two seconds after they become idle. That can lead to problems with drivers that make noise when they are opened or closed.
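In PulseAudio terms this is the suspend-on-idle module; a sketch of the relevant line in /etc/pulse/default.pa might look like the following (the timeout argument and its default differ between releases, so treat this as illustrative rather than a recommended setting):

    load-module module-suspend-on-idle timeout=2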
In addition, there are areas where the drivers do not report correct information to the system. The decibel range of the device is one of those, along with the device strings that are either broken or missing in many drivers, which makes it difficult to automatically discover the hardware. The various mixer element names are often wrong as well; in the past it "usually didn't matter much", but it is becoming increasingly important for those elements to be consistently named by drivers. Some drivers are missing from the mixer initialization database, which should be fixed as well.
The negotiation logic for sample rates, data formats, and so on is not standardized. The order in which those parameters are changed can be interpreted differently by each driver, which leads to problems at the higher levels, he said. There are also problems with timing for synchronization between audio and video that need to be addressed at the driver level.
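To make the negotiation problem concrete, this is the usual libasound parameter-negotiation sequence; the order of the calls is chosen by the application, and Poettering's complaint is that drivers are free to interpret the interaction between them differently. (A generic illustration with an invented function name, not code from PulseAudio; error handling omitted.)

    #include <alsa/asoundlib.h>

    /* Open the default playback device and negotiate hardware parameters. */
    int open_playback(snd_pcm_t **pcm)
    {
        snd_pcm_hw_params_t *hw;
        unsigned int rate = 44100;
        int dir = 0;

        snd_pcm_open(pcm, "default", SND_PCM_STREAM_PLAYBACK, 0);
        snd_pcm_hw_params_alloca(&hw);
        snd_pcm_hw_params_any(*pcm, hw);            /* start from "anything goes" */

        /* Each call below narrows the configuration space; how they interact
         * is what can differ from driver to driver. */
        snd_pcm_hw_params_set_access(*pcm, hw, SND_PCM_ACCESS_RW_INTERLEAVED);
        snd_pcm_hw_params_set_format(*pcm, hw, SND_PCM_FORMAT_S16_LE);
        snd_pcm_hw_params_set_channels(*pcm, hw, 2);
        snd_pcm_hw_params_set_rate_near(*pcm, hw, &rate, &dir);

        return snd_pcm_hw_params(*pcm, hw);         /* commit the choices */
    }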
Poettering also had a whole slew of changes that need to be made to the ALSA API so that PulseAudio (and others) can get more information about the hardware: things like the routing and mixer-element mappings, jack status (and any re-routing that is done on jack insertion), and data-transfer parameters such as the timing and granularity of transfers. Many of the current assumptions are based on consumer-grade hardware, which doesn't work for professional or embedded hardware, he said. It would be "great if ALSA could give us a hint how stuff is connected".
There is also a need to synchronize multiple PCM clocks within a device, along with adding atomic mixer updates that sync to the PCM clock. Latency control, better channel mapping, atomic status updates, and HDMI negotiation are all on his list as well.
Further out, there are a number of additional problems to be solved. Codec pass-through—sending unaltered codec data, such as SPDIF, HDMI, or A2DP, to the device—is "very messy" and no one has figured out how to handle synchronization issues with that. There is a need for a simpler, higher-level PCM API, Poettering said, so that applications can use the pull model, rather than being forced into the push model.
Another area that needs work is handling 20-second buffering, which brings a whole new set of problems with it. As an example, Poettering pointed out the problems that can occur if the user changes some setting after that much audio data has been buffered. There need to be ways to revoke the data that has been buffered, or there will be up to 20-second lags between user action and changes to the audio.
Conclusion
Both presentations gave a clear sense that things are getting better
in the Linux audio space, though perhaps not with the speed that users
would like to see. Progress has clearly been made and there is a roadmap
for the near future. Whether Davis's vision of a unified API for Linux
audio can be realized remains to be seen, but there are lots of smart
hackers working on Linux audio. Sooner or later, the "one true Linux audio
API" may come to pass.
Index entries for this article
Conference: Linux Plumbers Conference/2009
Posted Oct 7, 2009 18:02 UTC (Wed)
by cventers (guest, #31465)
[Link] (5 responses)
One user's opinion: I do a lot of mixing with the excellent xwax, and I'm also getting into music production. This follows years of using Linux as my primary desktop.

ALSA used to frustrate me, just because it was another thing that didn't always work "just right" out of the box. I've been less than impressed by reliability problems I've had with Intel HDA audio in the past, especially the fact you often have to tell the driver how the card is wired and experiment with different module loading options to get it to work. There may be a good hardware reason for this, but since most hardware works out of the box with Linux these days, the little bit that doesn't really stands out.

I gave Linux a shot for music production but left it behind and began using Windows for that purpose alone. It's actually the first use for Windows I've found in years. I don't blame this on sound in Linux; rather, the stability and quality of some of the open-source tools isn't quite what I might wish it to be, and although there are ways to use VSTs under Wine, there is nothing dependable and functional enough for serious, heavy and everyday use. So for the first time in years, I dual-boot so that I can use a big heap of proprietary software to compose music.

All of that said, when it comes to Vinyl Emulation, I couldn't imagine using anything but xwax + ALSA + Linux. I've read the xwax source code, and while it's not the prettiest I've seen, the author clearly understood how to write simple, reliable, real-time programs. ALSA is great because it supported my USB preamp out of the box, and provided a simple mechanism (asoundrc) where I could apply software gain to assist xwax in better tracking the timecode. udev lets me plug in the USB preamp wherever I want and makes sure it will pick the same device nodes, which is *not possible* with Windows and ASIO. This solution also lets me run with 100% reliability at 2 ms latency, which is great for live mixing.

Frankly, I wouldn't trust Windows with live mixing. I tried the Torq software that came with my preamp, and it seemed like a big, bloated mess... but didn't even work on more than one occasion, even when the hardware was configured precisely as it is when I use it in my Linux environment. Moreover, you can't achieve the same low-latencies with Windows, and I have in fact seen a BSOD in one of my production sessions. With a handy mlockall() added to the xwax source code, and the fact that it buffers tracks into RAM that are decoded by external command-line utilities, my system *should* keep playing, even if the hard drive (with swap!) crashes, at least until the current tracks are over. Anyone trust Windows to do the same?

As an amateur musician, I should have an opinion about the state of affairs on OS X, but I don't because I'm not fond of microkernel performance, solid-gold prices and extremely basic user interfaces. :p
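For readers wondering what the asoundrc mechanism mentioned above looks like, software gain can be added with ALSA's softvol plugin; this is a generic sketch (the PCM name, card number, and dB range are invented for illustration, not the commenter's actual configuration):

    # ~/.asoundrc
    pcm.timecode_in {
        type softvol
        slave.pcm "hw:1"              # the USB preamp's card
        control {
            name "Timecode Gain"      # shows up as an ordinary mixer control
            card 1
        }
        min_dB -5.0
        max_dB 20.0
    }

An application then opens the "timecode_in" PCM instead of "hw:1" and the extra gain is applied in software.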
Posted Oct 7, 2009 18:32 UTC (Wed)
by drag (guest, #31333)
[Link] (3 responses)
Which all that it means is that Apple copied the NT kernel design approach by incorporating some features of microkernels into what ends up being fundamentally a monolithic design.
But as far as audio stuff goes I am told that Apple's CoreAudio is actually a compelling feature over what is available in other operating systems. It's designed with music production in mind.
--------------------------
To get the best out of Linux it is still very tedious and highly technical.
It involves:
* Installing and configuring Jack
* Configuring your applications to use Jack
* Purchasing a audio card with good performance characteristics. (Intel-HDA, while it is fine for music playback, is not designed for low-latency performance regardless of what drivers you use on it)
* Installing a custom OS kernel with *-rt patches.
And a great deal of learning the ins and outs of how to manage all the above.
Generally the biggest difference between the actual workflows of Linux vs Windows is that instead of using big music production apps with plugins you use a lot of smaller applications chained together through Jack.
Now keep in mind that it has been a _long_ time since I mucked around with this stuff.
But I have a simple piano-style M-audio midi controller. It connects to the PC using a USB connection.
So the workflow went like this:
USB Controller -(jack midi routing)-> Software Synth (I forget which) -(pcm audio routing)-> Alsa Modular Synth (for effects processing) -(pcm audio routing)-> volume controls -(pcm audio routing)-> digital out on my sound card --> digital receiver --> speakers.
All in all I got the system to reliably operate to the point were I could not notice a delay from when I press a key to when I heard the sound.
Of course this required a couple hours of mucking around and setup. Debian by default could barely do software synth on its own before I started customising it.
The situation has improved somewhat with the introduction of custom Linux variants in the form of 64Studio and Ubuntu Studio and things of that nature. So at least the software setup is mostly taken care of.
Posted Oct 7, 2009 18:57 UTC (Wed)
by cventers (guest, #31465)
[Link]
And you're absolutely right about the tedious nature of setup on Linux. I too have a MIDI USB controller from M-Audio, and I too got it working under Linux (actually even patched into reFX Vanguard courtesy of dssi-vst). But what I found is that when musical inspiration hits, I want to spend the least amount of time possible getting into working music software, because it generally doesn't survive having to debug some arcane software issue.
Posted Oct 7, 2009 18:58 UTC (Wed)
by jebba (guest, #4439)
[Link] (1 responses)
Posted Oct 7, 2009 19:20 UTC (Wed)
by drag (guest, #31333)
[Link]
But for best performance you still need to patch and recompile your kernel as well as learn the ins and outs of dealing with the multiple Linux user interfaces.
With my setup I was getting pretty reliable sub-10msec latencies with Jack's settings with no xruns, although I usually let things slide to 60-70 just so I could have a more responsive system.
The other thing that sucks about Intel HDA (besides the low quality of digital-analog conversion chips and relatively high buffer requirements) as far as audio creation stuff is concerned is just the lack of I/O options. This is the biggest real difference between 'professional' and 'consumer' audio hardware. My old M-Audio Audiophile 24/96 has Analog stereo in, stereo out, digital in, digital out, and midi in and midi out. It also has nice-quality D-A/A-D conversion and the difference is enough that with a quiet room and nice headphones pretty much anybody can tell the difference.
But, of course, that's PCI.
Otherwise I have no problems with using Intel-HDA for anything. It's the sound card I use the most since that is what is on my laptops. For music playback and doing some recording stuff it's perfectly fine, and unless you are in a quiet area with high quality headphones the chances of anybody being able to tell the difference are very unlikely.
Posted Oct 29, 2009 18:08 UTC (Thu)
by jrigg (guest, #30848)
[Link]

> I gave Linux a shot for music production but left it behind and began using Windows for that purpose alone. It's actually the first use for Windows I've found in years. I don't blame this on sound in Linux; rather, the stability and quality of some of the open-source tools isn't quite what I might wish it to be, and although there are ways to use VSTs under Wine, there is nothing dependable and functional enough for serious, heavy and everyday use. So for the first time in years, I dual-boot so that I can use a big heap of proprietary software to compose music.

Another user's perspective:

I've used Ardour on Linux for music recording for a few years now. I suspect I'm one of a relatively small number who use it for paid work, but I've found it to be very solid and reliable for multi track recording and editing. The current lack of "pretty" plugin GUIs is a positive advantage to me (few things are more disruptive to work flow than having to turn a picture of a knob with a mouse). Those who need good MIDI support might still be better off with Mac or Windows, but I don't require this. I would say stability of my system is noticeably better (for straightforward recording and editing) than that experienced by many of my Mac- and Windows-using colleagues.

One area that still needs improving is using multiple sound cards to boost channel count. In my mobile system I use an RME MADI card (up to 64 channels of simultaneous in/out at 48kHz) with external AD/DA converters, but that is probably too expensive an option for most semi-pro and hobby users. Combining a few eight channel cards for a cheaper setup still requires jumping through difficult configuration hoops (so difficult that AFAIK none of the dedicated media distros come with configurations for doing this).
Posted Oct 7, 2009 18:24 UTC (Wed)
by mezcalero (subscriber, #45103)
[Link]
http://linuxplumbersconf.org/2009/program/
One correction:
"ALSA used to restrict the buffer size to 64K, which equates to 70ms of CD quality audio."
Did I really say that? 64k is actually 370ms @ 44khz/16bit/2ch. Which usually means an interrupt rate of at least 1/180ms or so.
Posted Oct 7, 2009 18:40 UTC (Wed)
by ncm (guest, #165)
[Link] (15 responses)
Posted Oct 7, 2009 19:33 UTC (Wed)
by drag (guest, #31333)
[Link] (11 responses)
The Maemo 5 folks and Palm WebOS folks seem to prefer to just write directly against PulseAudio for their audio needs.
Although for cross-platform compatibility your best bet would be to target Gstreamer since that runs on Alsa/PA/OSS/Windows/OSX/etc.
Posted Oct 7, 2009 20:02 UTC (Wed)
by ncm (guest, #165)
[Link] (10 responses)

Do I understand correctly, that if I write directly to PA, then a PA server needs to be running alongside my program, but if I write to Gstreamer, then I might actually be talking to PA, or to ALSA, or to Apple's or MS's services, and most of my code needn't know which? I expect discovering microphones and headsets, and volume controls for them, will be a nuisance no matter what.
Posted Oct 7, 2009 21:12 UTC (Wed)
by drag (guest, #31333)
[Link] (8 responses)
PulseAudio is portable also. It runs on Alsa/OSS/Windows/OSX/ blah blah blah. But I think for what you're asking you'd have better luck with Gstreamer.
If you target Alsa then you can use the 'safe' subset that is supported by PulseAudio then you might be fine. It's possible to port the 'safe' parts of the libasound to other platforms, but what a pain.
You could target SDL and get cross platform compatibility, but that is mostly for game makers.
If you target full Alsa then that is Linux-only. If you target OSS then that means it's only useful on some of the BSDs and possibly Solaris.
Posted Oct 7, 2009 21:31 UTC (Wed)
by ncm (guest, #165)
[Link] (1 responses)

Thanks, I had not got that PA itself had been ported to all these platforms. That seems to change everything.
Posted Oct 8, 2009 12:25 UTC (Thu)
by nye (subscriber, #51576)
[Link]
Not so much - because it's only technically true (in other words, it's lies). A couple of years ago they managed to build it on a selection of platforms so they could claim 'portability' as a ticklist item.
Try building PulseAudio for Windows. When you've given up, try installing the binary package that is usually mentioned whenever this discussion comes up - it's two years old and I couldn't get it to work *at all* with a few hours' hair-pulling.
I don't know where the OS X idea came from - PA doesn't even *claim* to work there, and according to the PA website, the last time it was tested on anything other than Linux was 2007.
Posted Oct 7, 2009 22:10 UTC (Wed)
by ncm (guest, #165)
[Link] (1 responses)
Posted Oct 7, 2009 22:43 UTC (Wed)
by drag (guest, #31333)
[Link]

sound and expected it to work in windows (myself being limited to relatively low-complexity python programs), but I expect that with Gstreamer you'll have to depend on the platform to setup everything like that for you to use. So it seems most appropriate for a standalone application you want to integrate into the OS its running in.
Posted Oct 8, 2009 12:09 UTC (Thu)
by nix (subscriber, #2304)
[Link]
Posted Oct 14, 2009 12:15 UTC (Wed)
by pharm (guest, #22305)
[Link] (2 responses)
> If you target Alsa then you can use the 'safe' subset that is supported by PulseAudio then you might be fine.

The (slightly abrasive, but ultimately useful) discussion on the Braid blog about audio support under Linux eventually revealed that the safe Alsa subset isn't really a great deal of use, because you can't guarantee to get your hands on the audio ring buffer & rewrite the parts that haven't been played yet on the fly: The alsa mmap functions that let you do this aren't part of the safe core :(

The *biggest* issue that arose from that discussion was that it's well nigh on impossible for a developer to work out what they're expected to use if they need more than the basic SDL sound API (which can't do a great deal more than 'play this sound now please'). The safe ALSA subset plus the mmap alsa functions (since most hardware can expose those in reality) is probably it, but that isn't exactly well-advertised.
Posted Oct 20, 2009 9:32 UTC (Tue)
by njs (subscriber, #40338)
[Link] (1 responses)
A better API is clearly needed, but I don't think it involves mmap.
Posted Oct 20, 2009 10:57 UTC (Tue)
by cladisch (✭ supporter ✭, #50193)
[Link]
Classic mmap() can't. However, the ALSA API requires that the application tell it when and where it wants to access the buffer, and when it is done, so it is possible to emulate mmap on top of devices without a memory buffer. (In that case, the extra buffer adds latency, of course.)
> A better API is clearly needed
ALSA has snd_pcm_forward/rewind functions to move around in the buffer. However, these functions are optional, and the PulseAudio plugin does not implement them.
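For reference, rewriting already-queued audio with these calls looks roughly like the sketch below (illustrative only; as noted above, it only works where the device or plugin actually implements rewinding, and the function name here is invented):

    #include <alsa/asoundlib.h>

    /* Throw away most of what is already queued so that newly generated
     * audio (after, say, a volume change or a seek) is heard quickly. */
    static void requeue(snd_pcm_t *pcm, snd_pcm_uframes_t keep_frames)
    {
        snd_pcm_sframes_t delay = 0;

        snd_pcm_delay(pcm, &delay);             /* frames still waiting to play */
        if (delay > (snd_pcm_sframes_t)keep_frames) {
            /* May rewind fewer frames than requested, or fail entirely
             * (e.g. through the PulseAudio plugin); callers must cope. */
            snd_pcm_rewind(pcm, delay - keep_frames);
        }
        /* ...now write the regenerated audio with snd_pcm_writei()... */
    }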
Posted Oct 7, 2009 21:19 UTC (Wed)
by dmarti (subscriber, #11625)
[Link]

Gstreamer gives you much more than just a basic audio API. It also lets you easily use codecs that the user might have decided to install but that you don't want to distribute for whatever reason. The user can play MP3s or SID files, even if you just had FLAC and Ogg on your machine when you built the application.
Posted Oct 9, 2009 5:44 UTC (Fri)
by magnus (subscriber, #34778)
[Link] (2 responses)
SDL has a nice and friendly audio API, is cross-platform and has worked very well in my experience but it doesn't do recording.
The PulseAudio API is OK to work with as well. It probably would be my choice for a VoIP app if I didn't have to care about portability.
GStreamer is extremely focused on media-player like applications, and all API documentation is built around the assumption that data comes from somewhere else and you're just building a pipeline. Using an application as the data source seems to be rare and it's not obvious how to do it.
Posted Oct 15, 2009 10:12 UTC (Thu)
by Uraeus (guest, #33755)
[Link] (1 responses)
As for using an application for the data, that has been addressed quite some time ago and there are now two GStreamer elements called appsrc and appsink which specifically target getting or sending data to an application.
Posted Oct 17, 2009 21:21 UTC (Sat)
by magnus (subscriber, #34778)
[Link]
Still, I don't think that it is obvious how one should port an audio app using another audio API (ALSA for example) to use GStreamer for output. GStreamer seems to be designed more like a toolkit which you have to design your app around (like GTK for graphics) rather than just the audio bottom layer that most other APIs provide.
Posted Oct 7, 2009 21:08 UTC (Wed)
by mjthayer (guest, #39183)
[Link] (1 responses)
BeOS anyone?
Posted Oct 8, 2009 10:57 UTC (Thu)
by tialaramex (subscriber, #21167)
[Link]
For example, where is the audio data going? In _theory_ BeOS lets applications connect up any kind of graph. In practice, nearly all software asks for the system's default (software) mixer and feeds it 16-bit PCM.
Someone wrote a piece of software "Cortex" which exposes the graph, but if you actually install it and play around, first of all you'll crash a lot (Cortex and sometimes BeOS too) and secondly you'll start to find all the weird little bugs no-one encountered because they always hooked things up to the default mixer. So rather than exposing the graph in a way that users can play with it, like the various JACK graph tools, it behaves more as a debug tool for developers who know how to tread carefully.
Posted Oct 8, 2009 12:52 UTC (Thu)
by epa (subscriber, #39769)
[Link]

Unclear sentence: "By the end of 2001, ALSA was adopted as the official Linux audio system in favor of OSS." I think you meant to say 'ALSA was adopted instead of OSS', or 'OSS was dropped in favor of ALSA'.

And BTW, why is the <q> element not allowed in comments?
Posted Oct 8, 2009 14:29 UTC (Thu)
by mezcalero (subscriber, #45103)
[Link]
We actually already do the revoking for the 2s buffers. It works quite well these days. The big difference between 2s and 20s when doing this however is that when you fill up the full 20s with new audio, this might be a quite expensive operation, for example because you decode it from MP3. Now if the user is seeking around and we have to revoke what we already wrote to the hw playback buffer, dropping 20s and decoding that again comes at a very steep price, while dropping 2s and decoding that again might still have been acceptable.

So, the idea I was ventilating at LPC (or at least wanted to explain, I might not have been clear on this) was that we use some kind of stochastic model so that for some time after the last seeking we don't fill up the full 20s but only 5s or so, which is much cheaper. And then when during the next iteration we noticed that we never had to revoke it, we generate 10s for the next iteration. And when after it we noticed that we didn't have to revoke it we go for the full 20s for the following iterations. However, if the user seeks around during that time we go back to filling up only 5s again.

We'd do that mechanism under the assumption that if the user seeks around he does that in "bursts", i.e. seldom, but when he does it he does it a couple of times shortly following on each other. And we don't want to pay the price for having to decode the full 20s again each time.
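The ramp-up scheme described above is simple to state in code; this is only a sketch of the idea as explained in the comment (the step sizes and the function name are illustrative, not PulseAudio source):

    #include <stdbool.h>

    /* Decide how much audio (in seconds) to decode and queue ahead.
     * Start small after a seek and grow only while no revoke was needed. */
    static int next_prebuffer_seconds(int current, bool had_to_revoke)
    {
        if (had_to_revoke)
            return 5;        /* user is seeking around: stay cheap        */
        if (current < 10)
            return 10;       /* one quiet iteration: buffer a bit more    */
        return 20;           /* stable playback: go for the full 20s      */
    }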
Posted Oct 13, 2009 14:53 UTC (Tue)
by christian.convey (guest, #39159)
[Link]
Posted Oct 18, 2009 11:52 UTC (Sun)
by hannu (guest, #61409)
[Link] (3 responses)
"The OSS API requires any services (like data format conversion,
Operations like data format conversions are no rocket science. They do
"It also encourages applications to do things that do not work well with
Like what?
History of the anti-OSS campaign is based on this kind of silly
"OSS applications are written such that they believe they completely
This is complete BS. Yes, there are many OSS applications that open
"Because of that, Davis was quite clear that the 'OSS API must die'. He
This is intentional misinformation. There is very loud group of audio
The big reason to kill OSS is not OSS itself. If OSS is really that bad
As a workaround it's mandatory that linux distributions like Fedora drop
Posted Oct 19, 2009 12:34 UTC (Mon)
by cladisch (✭ supporter ✭, #50193)
[Link]
> if there were already in the original presentation.

In the original LPC article, there was a link to the slides: http://linuxplumbersconf.org/2009/program/

> "The OSS API requires any services (like data format conversion,
> routing, etc.) be implemented in the kernel."

This is perfectly true.

> Operations like data format conversions are no rocket science. They do
> require few extra CPU cycles but so what. The same number of CPU cycles
> will be spent even if the conversions are done in user space. The same
> is true with "routing". Routing/mixing is always based on some kernel
> level IPC mechanisms. In case of OSS these mechanisms are just hidden
> behind the kernel level audio API.

These are not arguments for putting the services in the kernel. Davis' point was that the OSS API requires that _all_ services _must_ be in the kernel; it's not possible to add user-defined plugins without going through some sort of loopback device.

> "It also encourages applications to do things that do not work well with
> other applications that are also trying to do some kind of audio task."
>
> Like what?

E.g., mmap(), which cannot be reasonably emulated; or mixer devices. (The OSS v4 API is better in this regard, but developers cannot rely on it as long as most implementations only offer v3.)

> "OSS applications are written such that they believe they completely
> control the hardware."
>
> This is complete BS.

Please watch your language. And your next sentence proves you false:

> Yes, there are many OSS applications that open /dev/mixer and
> peek/poke the global hardware level volume controls directly.
> However this is not the way how OSS is designed to be used.

Opening /dev/mixer and using it was the _only_ way how OSS was designed to be used. The /dev/something interface implies hardware control; even when implementing 'virtual' devices, OSS has to use the same interface and to pretend that it's a 'real' hardware device.

> OSS is perfectly adequate for needs of 99% of the audio programs.

It's the OSS implementation that's lacking.

> And I doubt ALSA is any better than OSS for the remaining 1% of
> applications (the laws of the nature are the same for both of them).

The percentage of programs that, for example,
* want to make use of the full capabilities of any modern device (like USB, HD-Audio, Bluetooth, Xonar), or
* want to use MIDI, or
* want to work correctly with suspend/resume, or
* must work on embedded architectures,
is way more than 1%. As long as these deficiencies exist, OSS will not even be considered a theoretical replacement for ALSA.

> They are also required to move to RT kernels because
> ALSA and its followers depend on real time response times.

Please don't publish misinformation. Real-time response times are only needed for real-time tasks like low-latency signal processing or synthesizers, and this applies to any implementation, i.e., to both OSS and ALSA.
Posted Oct 20, 2009 9:38 UTC (Tue)
by njs (subscriber, #40338)
[Link] (1 responses)
Posted Oct 20, 2009 17:43 UTC (Tue)
by bronson (subscriber, #4806)
[Link]
He can safely be ignored.
That said, I do wish ALSA wasn't such a mess. I'm hoping GStreamer and Phonon become the de facto application APIs so app writers don't have to care about the slippery kernel sound APIs anymore.