LWN.net Weekly Edition for September 18, 2008
LPC: Fitting into the kernel ecosystem
The first Linux Plumbers Conference started on September 17, 2008; the opening talk was a keynote by Greg Kroah-Hartman. He got the conference going with a provocative sermon on how the development ecosystem works and the niche we all occupy within it. It was a fun talk - unless you happen to work for Canonical.
He started with an apology to Canonical, though. In earlier talks, he had said that only eight kernel patches had ever come from Canonical. In fact, he has been corrected; the proper number is 100.
So, Greg asked, why is he picking on Canonical? His answer came in the form of a table of contributors to the kernel. It looked like this:
Distributor    Changesets
Red Hat        11,846
Novell         7,222
MontaVista     1,074
Debian         288
Gentoo         229
Mandriva       237
Wind River     207
rPath          186
Canonical      100
Then Greg asked: does anybody from Canonical want to say anything? Nobody did.
Greg then moved on to the Linux ecosystem. He put up a slide showing the larger components of this ecosystem - the low-level stuff that makes Linux what it is. Some of the largest components, beyond the kernel, were GCC, binutils, X.org, and the man pages distribution. Looking at lines of code, the kernel amounts to about 40% of the total. Other large components are all significantly smaller.
It turns out that Greg has been doing repository data mining in a number of projects beyond the kernel. So, for projects like GCC, X.org, and binutils, he was able to put up tables listing the top contributors. The results varied somewhat, but there were a number of recurring themes. Red Hat tends to be toward the top of the list on all of these projects; companies like IBM and Novell also appear regularly. CodeSourcery is a significant contributor to GCC and binutils. The U.S. National Security Agency contributes 2.1% of the patches into X.org; why is not clear. In all of these projects there are significant contributions from unpaid developers, but those contributions are overshadowed by those from paid developers.
And Canonical is always at the bottom of the chart - if it is there at all.
At this point Greg moved to a whiteboard to present his view of how the community works. At the development level, you have developers contributing to projects, which then release the code. There may be a few users at that level who feed back information (and maybe patches), but, in general, the biggest consumers of the project's releases are the distributors.
Distributors package everything and provide it to their users. At this point, another feedback loop comes into play: users feed their experiences and problems back to the distributor. Those distributors will respond to the user feedback, improving their products. The amount of feedback from the distributors to the upstream projects varies, but it tends to be small. For enterprise distributions, it is quite small; they are running ancient versions of everything and have little to do with current upstream. The community-oriented distributions, such as Fedora or openSUSE, tend to feed more changes back to their upstream sources.
Then, there is the matter of redistributors who base their products on another distributor's work; these are distributors like Ubuntu or CentOS. There are no contributions back to the community from that kind of distributor at all. They are not functioning as a part of the Linux ecosystem.
Greg finished up with what appears to be the message he came to the Linux Plumbers Conference to deliver: if you are a developer, if you want to be a part of the ecosystem, and if you work for a non-contributing company: quit. There are plenty of companies that understand the ecosystem and which need good people; at least one company, it seems, had wanted to set up a recruiting table at the conference. It is a very good time for people with community participation skills; there is no reason for anybody who wants to work in the community to stay on the outside.
[As a postscript, it is amusing to note that, while the conference did not allow companies to set up recruiting tables, nobody has prevented prospective employers from filling a prominently-placed whiteboard with information about available positions.]
Firefox 3 EULA raises a ruckus
End User License Agreements—or EULAs—are a mainstay of the proprietary software world that tend to rub free software advocates the wrong way. When a EULA is presented in a click-through window as part of the initial execution of a program, it can really raise some ire as Mozilla is finding out. Its plan to present a click-through license for Firefox 3 on Linux has not met with widespread approval; quite the reverse in fact.
The issue has been kicking around since at least last May, when Fedora folks noticed that Firefox 3 builds moved the EULA popup window from the installer—which Linux folks rarely see—to the first time Firefox is run. More recently the issue erupted in the Ubuntu community when a user filed a bug that reads, in part:
The predictable outcry followed, mostly because people who are used to free software have a visceral reaction to seeing a click-through EULA. For that reason alone it is a poor choice by Mozilla, at least on Linux. Windows users, who make up a substantial portion of the Firefox userbase, are generally unfazed by EULAs as they are confronted by them regularly—generally blithely clicking through with little or no hesitation.
There are a number of objections to the Mozilla EULA, starting with the current text of the license. Mozilla Corporation chairperson Mitchell Baker agreed with the critics of the license text, saying "the most important thing here is to acknowledge that yes, the content of the license agreement is wrong." New license text is now available in draft form, but it still doesn't address an underlying issue: do we need to consult a lawyer when we install or run free software?
One of the guiding principles of free software is that it doesn't limit what "end users" can do with the software, it only limits those who wish to distribute it. When a page or two of legalese—undoubtedly toned down from what the lawyers would really like—is presented to a new user, what exactly are they supposed to do with it? Users have rights under free software licenses, and it is important that they can find out about them, but it is fairly rare for a program, or even a distribution, to require a user to click through a copy of the license.
Mozilla's position is that they need to protect their trademarks as well as inform users about the web services used to try to detect phishing and malware sites. In answer to those who think a click-through EULA is unnecessary—often using Linux distributions as a counterexample—Baker points out:
So far, Mozilla does not seem willing to budge from its requirement to show the EULA as a click-through agreement. Fedora was able to get a waiver of sorts for Fedora 9 which allowed shipping Firefox 3 without the EULA while the projects worked out language they both could live with. In Fedora 9, Firefox opens to a page that describes the web services when it is run for the first time. Some kind of compromise along these lines for Linux distributions would seem to satisfy most of the concerns for both sides, but other than for Fedora 9, that solution has not been blessed by Mozilla.
Fedora Engineering Manager Tom "spot" Callaway has an excellent overview of the history as well as a nice analysis of the EULA. He notes that almost all of the terms in the EULA are either covered by applicable laws or by the Mozilla Public License (MPL). None of that really matters, though, as distributions really only have two choices as outlined by Ubuntu leader Mark Shuttleworth:
That is the risk that Mozilla takes; if it is too heavy-handed in what it requires to call a browser "Firefox", distributions will take the code without the trademarks and call it "Iceweasel" as Debian has or "abrowser" which is the Ubuntu equivalent. The Iceweasel "fork" was made because Mozilla objected to Debian backporting security fixes into older browsers without its consent, while abrowser has come about because of the EULA issue. Given that Linux users were some of the earliest and most enthusiastic adopters of Firefox, it is truly unfortunate that many may have to run it under other names.
There is an issue that may be getting lost in the shuffle here as well. Fedora board member Jef Spaleta has expressed concerns about how to notify users about web services:
Web services clearly bring along a number of additional concerns. There are privacy issues to consider. In many places, particularly Europe, there are fairly stringent requirements regarding data collection and retention that are required to be communicated to users. How that will be done for free software that uses these services is an open question. As Spaleta points out, Mozilla may be the only free software organization that is even looking at the problem.
The EULA mess is a situation that certainly could have been handled better by Mozilla. One hopes that some kind of compromise can be worked out so that users aren't poked in the eye with legal documents—that aren't even valid in many jurisdictions—and distributions don't feel like they need to fork to preserve their freedoms. Mozilla definitely has some legitimate interests to protect, but it needs to find a saner way to do that.
There is hope that this is happening, as Baker has described in an update on her blog:
More details are imminent, but it looks like this could all resolve amicably.
Review: Intellectual Property and Open Source
Free software inevitably runs into the body of law known collectively as "intellectual property." Many developers do their best to avoid the legal side of things whenever possible; others seem to like nothing better than extended debates on the topic. Regardless of one's own feelings in the matter, the fact remains that the legal system exists, it affects our lives, and that we can only be better off if we understand it. To that end, O'Reilly has published Intellectual Property and Open Source by Van Lindberg.
The book starts off with a Lessig-like comparison between code intended for computers and legal code. The legal code base is not as clean as one might like:
Mr. Lindberg is clearly trying to write for programmers, so code-based analogies abound. Patents are like regular expressions - quite powerful in the technologies they can match, but you never really know what they will catch until you try them. Patent documents are structured like ELF program headers, and the patent system as a whole is a sort of memoization scheme (we get a Python Fibonacci number generator as an example here). Contracts are like a distributed version control system - they let anybody create their own, localized law. And so on.
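For readers who have not met the technique behind the Fibonacci analogy: memoization means storing the result of a computation the first time it is made, so that nobody has to repeat the work later. The following is a minimal sketch of that idea, not the code from the book; the function name `fib` and the use of `functools.lru_cache` are this illustration's own choices.

```python
from functools import lru_cache


@lru_cache(maxsize=None)  # remember every result that has been computed
def fib(n: int) -> int:
    """Return the n-th Fibonacci number, with fib(0) == 0 and fib(1) == 1."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n < 2:
        return n
    # Each sub-result is computed once; later calls are served from the cache.
    return fib(n - 1) + fib(n - 2)


if __name__ == "__main__":
    print(fib(30))  # prints 832040
```

Without the cache, the naive recursion recomputes the same values an exponential number of times; with it, each value is worked out exactly once and then simply looked up.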
Roughly the first half of the core part of the book is dedicated to explaining how the four main branches of intellectual property (patents, copyright, trademarks, and trade secrets) work. The chapter on the patent system notes some of the problems with software patents (in particular, the industry's use of oral tradition and the late recognition of software patents make most prior art invisible to investigators), but, to a great extent, it seems to be written for people who want to obtain patents, rather than those who feel the need to defend themselves against software patents. It might have been nice to get a treatment of the often-quoted idea that software developers are better off not knowing about patents because that way they cannot be accused of willful infringement, but that topic was not touched. There is also no talk of the Open Invention Network or any other efforts to protect the community as a whole.
The copyright chapter is a reasonably thorough treatment of the subject which notes how the scope of copyright has expanded over the years. The current situation is compared to an "allow by default" security policy where anything which can be said to have an expressive aspect gets copyright protection by default. Derivative works are discussed at length, leading to this interesting observation:
Just a few pages earlier, it is stated that joint ownership means that each author has full rights over the entire work and can do just about anything with it - like license it to others. A finding that the kernel was a joint work could lead to some unpleasant consequences; one hopes that Mr. Lindberg is not really saying that could happen.
The book mentions the abstraction-filtration-comparison test used by some courts to determine if one body of code is derived from another, but says nothing about how that test works. It would have been nice to learn a bit more, since that is an important part of how copyright cases are resolved in the US. Also nice would have been some discussion of the value of registration of copyrights.
The chapter finishes with this discouraging note:
The discussion of trademarks (compared to desktop shortcut icons) is pretty much as one would expect. The chapter is more concerned with obtaining and defending trademarks than balancing trademarks against the ideals of free software. There is not much to say about trade secrets, though the chapter does touch on what happens if unreleased code is incorporated into a free application. The author concludes that the open development process makes this kind of contamination less likely than with proprietary projects.
Next we move into a chapter on contracts and licenses which talks mostly about how contracts are formed and enforced. The book takes a strong position that all licenses are contracts; they are just a special form of contract which grants permission to use some sort of intellectual property. The other point of view (that licenses are distinct from contracts) is touched upon, but dismissed this way:
Later on, the author refers to the GPL in particular as a "Schrödinger's license" with a currently undetermined nature; it might be "just a license" after all. Clearly there is some confusion on this point. It is worth noting that the book predates the appeals court decision in the JMRI case, which makes the "it's a license" interpretation far more likely.
There is a chapter on the "economic and legal foundations of open source," talking about how the community works and, in particular, how free licenses work. There is little here which would be new to most LWN readers, but it might be good to hand to the corporate legal office. Speaking of that office, the next chapter talks about how to contribute to a project without getting into trouble with your employer. There is talk about proprietary information agreements, some important cases (including the Medsphere case, which your editor wishes had been more prominent on his radar), works for hire, and so on. The key advice from the author is to disclose your work and your ideas to your employer as soon as possible - preferably before beginning employment. This is a chapter that many free software developers should read.
Chapter 10 is about choosing a license for a free software project. The importance of the topic is stressed - as is the importance of not trying to write one's own license. The author recommends that most projects should limit themselves to considering the 2-clause BSD license, the Apache license (v2), the Mozilla Public License, the GPL or LGPL (versions 2 or 3, though GPLv3 is said to be "a better and surer foundation for future development"), or the Open Software License (v3).
Chapter 11 is about the issues involved in accepting patches from others. The author strongly recommends using some sort of signed contributor agreement or even copyright assignments. Getting assignments, he says, allows for "unified legal control," ease of relicensing, and the ability to do commercial licensing. It's probably good advice for a strongly corporate-controlled project, but may not fit with more community-oriented projects. Unfortunately, the book perpetuates this particular fiction:
And, to make it worse:
There are a few problems here. No single entity owns the entire Linux kernel, but that code has been quite vigorously defended against some strong legal challenges. (It is interesting, actually, that the author managed to write this entire book without mentioning SCO once.) Kernel developers have also been able to enforce the kernel's copyright numerous times. Meanwhile, a quick look at the BusyBox code is sufficient to turn up copyright assertions from far more than two developers. Unified ownership of a code base may be the right thing for some projects, but the reasons cited here are clearly not applicable.
That complaint notwithstanding, this chapter does contain useful information that should be kept in mind when accepting patches from others.
Chapter 12 is about the GPL in particular. There is a lot of talk about just what is a derived work under the GPL - does it apply to kernel modules, for example? Unfortunately, the answer is "we just don't know." So, while the chapter is a reasonable summary of how the GPL works, once again there will be little there for most LWN readers.
Chapter 13 gets into reverse engineering, providing a quick overview of how it can be done without getting into trouble. According to the book, reverse engineering is generally allowed in the US, even to the point of disassembling proprietary code to learn its secrets. There are a lot of pitfalls, though, and the DMCA changes the game significantly. This chapter is a good starting point, but anybody wanting to do reverse engineering in the US will probably want to learn rather more than what is on offer here.
The final chapter talks about the creation of a non-profit corporation to own and/or manage a code base. It's mostly about what's required to create a corporation and keep it in good standing. This information may be useful to some, but it seems a little out of place here. After that, there are 80 pages of license lists and the full texts of a number of free software licenses. Perhaps it's useful reference material, but it's all easily available online; it's not clear that dedicating nearly 25% of the book to this material was necessary.
The subtitle of this book is "a practical guide to protecting code," which makes one omission especially striking: there is not a word on how a project should deal with license violations. There is, by now, a fair amount of collective wisdom on how such problems should be approached, but it has not been collected here. There's also little talk on protecting projects against software patent problems, no talk of patent pools, and no talk of related issues like the Microsoft/Novell deal. Software patents have cast a big shadow over free software in the US, but the issue is not really touched upon in this book.
It is also worth noting that the book is very heavily based on US law, and the author never attempts to look beyond the border. Certainly it would never have been possible to cover intellectual property law worldwide, but this narrow focus is still a little puzzling. Much intellectual property law in the US is based on international agreements, so an understanding of those agreements would help with the larger picture. A mention of the Berne Convention would not have been out of place, for example. The other problem is that free software tends to have little respect for borders; there are few projects which are limited to a single country. Even if a project is based in the US, the existence of contributors elsewhere in the world is almost certain. Free software is a global phenomenon; it is not sufficient to think about US law alone.
Despite these complaints, your editor has to say that this is a valuable book. It covers many of the basics of the law in a much clearer way than has been done before. Anybody who manages or contributes to a free software project (in the US, at least) should be familiar with the concepts discussed here. And certainly all of the people peppering the net with IANAL posts would be better informed after reading Intellectual Property and Open Source. This book should bring some light to a complex but crucially important part of the legal code which governs our actions, and that is a good thing.
Page editor: Jonathan Corbet
Inside this week's LWN.net Weekly Edition
- Security: OpenSSH and keystroke timings; New vulnerabilities in apache2, kernel, libxml2, pam_mount,...
- Kernel: The 2008 Linux Kernel Summit
- Distributions: The openSUSE Project's first board elections; new releases from ALT Linux, CentOS, Foresight, Lunar and Syllable
- Development: Audacity gets new functionality via Google Summer of Code, Adding a signing key to RPM, DRI2 Protocol Draft, Building Debian with GCC 4.4, new versions of phpMyAdmin, Replybot, lutz, TurboGears, jack_capture, Chandler Server, opentaps, GNOME rc, GARNOME, CQRLOG, boox, Ember Elisa, Patchage, Dirac, OpenSwing, GIT, OpenOpt.
- Press: Lindependence movement, End Runs Around Vista, Canonical funds desktop improvements, Lenovo stops Linux PC sales, management of open-source projects, KDE at CERN, Java sound and music apps, Linux Scalability in NUMA, Easystroke review, Linux course development grant.
- Announcements: EFF on cell phone records, Openmoko used for education, Microsoft/Novell virtualization system, Sun's Project Kenai, Wyse and Novell's thin client, RailsConf EU report, SCALE cfp, SCSS cfp, OpenSAF Dev Days, Texas Unconf, O'Reilly's StartWithXML, Django videos.