
Wednesday, January 19, 2011

The Read Eval Print Lie

It's a common misconception that a language must be "dynamic" or "interpreted" to have a REPL, a Read Evaluate Print Loop (also sometimes confusingly called an interpreter), or to have an "eval" function. That's not true at all.

Haskell, OCaml, F#, and Scala are all statically typed languages with very sophisticated static type systems, yet all have REPLs. All are usually compiled although interpreters exist for Haskell. Typed Racket (a variant of Scheme) is statically typed but has a REPL and I think it's always compiled.
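For instance, here's roughly what a session in the Scala REPL looks like (prompts and message details vary by version). Note that every line is compiled and statically type checked before it runs:

scala> def twice(x: Int) = x * 2
twice: (x: Int)Int

scala> twice("oops")
<console>:9: error: type mismatch;
 found   : java.lang.String("oops")
 required: Int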

Nor does a language have to be interpreted to have a REPL. Groovy is a dynamic language in both the dynamically typed sense and the "mutate my program structure at runtime" sense. It's always compiled, yet it has a REPL. JRuby (Ruby for the JVM) is similarly always compiled as far as I know, but has a REPL. Clojure and Erlang are always compiled, but both have REPLs.

The misconception about the nature of REPLs is caused by the C family of statically typed languages: C, C++, Java, C#, Objective-C and D. They do not tend to lend themselves to REPLs, but it's not because of the static typing or the compiling. It's because of the statement orientation and verbose boilerplate around every construct. A REPL is possible in those languages; it just wouldn't be very useful.

At heart, a REPL is always possible because "eval" is possible in any Turing complete language. That's CS 101 - see universal Turing machine and the Church-Turing thesis.

Remember that compiling vs interpreting is just an implementation choice as far as semantics are concerned. And stop looking to the C-ish languages as examples of the limits of static typing.

Wednesday, May 5, 2010

Types à la Chart

I recently saw a debate on whether a certain language had a "weak" or "strong" type system. Naturally the debate ended in flames. As far as I could tell neither side knew what the other was talking about, which isn't surprising as I've seen almost as many definitions of "weak" and "strong" typing as there are developers who use the words. Frankly the words have too many meanings to be useful.

Ambiguity is only the start of the problems, though. Even if you get something crisp like "static" vs "dynamic" you've still managed to condense out a lot of useful information. Agda and Java live in incredibly different spaces even though they are both statically typed. Or how 'bout the world of difference between E and Ruby, two dynamically typed languages?

In this article I'm going to try to convince you that the space is far more complicated than the typical debates would allow. The design space is very highly dimensioned, perhaps infinitely so, but I'll boil it down to two important dimensions which nonetheless already show considerably more sophistication. Here's the setup: if all you know is static types, what can you know about an expression like f(x) without dynamically executing/evaluating it? The two dimensions I use are 1) how "safe" is it - are there back channels that might violate abstractions? and 2) how rich and sophisticated can the static type be?

One big, fat caveat: this is almost certainly wrong in many details. I've listed a bunch of languages. There just aren't enough "21 days" in a lifetime to understand in detail exactly what each one can promise and how one type logic does or does not contain another. Naturally if you are an expert on one of these languages and I got something a bit wrong you will conclude I'm an idiot even if everything else is 100% right. Such is the price of writing for a technical audience. In spite of any such errors I contend that even if some details are wrong the overall picture should be reasonably accurate and should convey my central thesis that this space is too complicated for binary distinctions.

Chart of Static Type Properties

Power

The horizontal axis is "power" or "sophistication" of the static type system's underlying logic. For this analysis I focused only on the ability of the static type system to describe actual data structures and computations. While it's interesting that C++, GHC Haskell, and Scala all have Turing complete type systems, none of their type systems seem to have the same ability to usefully describe computation as Agda's. Of course, I could be wrong. Feel free to bump the asterisked languages to the right accordingly.

The horizontal axis can also be read backwards in terms of type system simplicity.

The far left requires some explanation. The languages in this column tend to be dynamically typed which at first seems like a challenge for thinking about them in static terms. But there's a nice theoretical framework where you think of them as having one static type. A valid Scheme expression has "the Scheme type" and an invalid one does not. Different people who mean to emphasize different things will call these languages "untyped," "unityped", or "dynamically typed." For my purpose "unityped" gives the right connotation. This column corresponds to the untyped lambda calculus.

Moving to the right there are simple first order type systems. Such type systems are a bit richer than the unityped ones, but don't have any form of type abstraction: they don't allow a user to define types that will be parameterized by other types. This column corresponds to the simply typed lambda calculus.

Further right are second order type systems. These allow type abstraction qualified with "forall", and correspond to System F. In between first and second order is HM (Hindley-Milner) which allows full type inference but at the price of not allowing some second order things like higher rank polymorphism. Moving to the right are higher kinded languages, corresponding to System Fn up through System Fω.
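For a taste of what each step buys you, here's a hedged Scala sketch (Scala 2 syntax; the names are mine). The first definition abstracts over a type parameter A, which needs a second order system. The second abstracts over a type constructor F, which needs higher kinds:

// second order: works "forall A"
def identity[A](x: A): A = x

// higher kinded: F abstracts over a type constructor like List or Option
trait Functor[F[_]] {
  def map[A, B](fa: F[A])(f: A => B): F[B]
}

A first order system can express neither: it has concrete types and functions over them, but gives users no way to parameterize one type by another.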

At the far right are dependently typed languages: languages that allow types to be parameterized by values.[1][2]

Safety

The vertical axis is various kinds of "safety." For safety analysis I'm completely ignoring foreign function interfaces and the like. FFIs are useful and pragmatic but also make otherwise pretty safe languages no more safe than Assembly.[3]

You can also read this axis going the opposite direction in terms of a kind of operational power - Assembly isn't safe, but it lets you make the machine dance.

At the bottom end are memory unsafe languages. An expression like f(x) might very well totally hose memory.

A bit up from that are memory safe languages. Such languages have a few built in abstractions (like you can't write past the end of an array) that f(x) can't possibly violate. A couple of steps up is abstraction safety. That means that one developer's abstractions can be protected from another developer in some fashion. For instance, private variables can only be accessed through functions defined in a scope with access to the variable. Many popular "dynamic languages" are not abstraction safe because their powerful dynamic meta-programming facilities allow abstractions to be smashed. In between I've listed a few JVM languages as being configurably safe: the Java standard includes some pretty powerful reflection capabilities that allow many developer written abstractions to be violated, but the JVM standard includes a security mechanism that allows that to be turned off. Move these points up and down accordingly.
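As a concrete illustration of the "configurably safe" point, here's a hedged Scala sketch (class and field names are mine) that uses the JVM's reflection machinery to smash an abstraction. With the JVM security mechanism enabled, the setAccessible call can be made to fail instead:

class Counter { private val count = 42 }

val c = new Counter
val f = classOf[Counter].getDeclaredField("count")
f.setAccessible(true) // defeats "private"; a security manager can forbid this
println(f.getInt(c))  // prints 42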

At the next level are capability safe languages. In most languages, any function call can have all manner of harmful side effects including formatting your hard drive or launching missiles. But at the capability safe level the expression cannot possibly do such crazy things unless granted that power in the form of an "object" that represents the capability to do so. While you cannot statically know whether f(x) has received a missile launching capability, you can statically know that if f(x) lives in a program that has not been granted missile launching capability then it will not launch missiles.
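A hedged Scala sketch of the idea (the MissileLauncher type is hypothetical): authority is just an ordinary value passed around like any other, so a function that was never handed the value cannot exercise the authority.

trait MissileLauncher { def launch(): Unit }

// can launch, but only because it was handed the capability
def f(x: Int, launcher: MissileLauncher): Int = { launcher.launch(); x }

// statically known to never launch missiles
def g(x: Int): Int = x + 1

Of course this is only a safety property if there are no back channels - no globals, ambient IO, or reflection - through which a MissileLauncher could be conjured up, which is exactly what capability safe languages enforce.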

The pure level is for languages with a strict, statically enforced wall between side effecting expressions and non-side effecting ones. Examining the static type is sufficient to know the difference. Finally, the "pure & total" level is for languages where the expression f(x) is either guaranteed to terminate normally (no infinite loops, no exceptions)[4] or where static types indicate possible non-totality.

Random Notes

In many languages safety properties are enforced dynamically so a knee jerk reaction is "why is this stuff in a discussion about static properties?" But remember the point was what I can statically know about an arbitrary expression f(x) without evaluating it. In Ruby I know with absolute conviction that, barring implementation bugs and foreign function interfacing, any arbitrary f(x) absolutely does not write past the end of an array. In C I don't know that.

It might seem odd to put Java further to the right than Haskell '98 but it's true. Haskell '98 has a slightly extended Hindley-Milner type system while Java has a kind of warped System F<: (use nesting to create rank-n polymorphism). Of course, a chart that ranked type systems on "bang for the buck", versatility vs ease of use, would rank things quite differently.

C# gets stuffed into the unsafe row because of the "unsafe" keyword and associated pointer machinery. But if you want to think of these as a kind of inline foreign function mechanism feel free to bump C# skyward.

I've marked Haskell '98 as pure but GHC Haskell as not pure due to the unsafe* family of functions. If you disable unsafe* then jump GHC Haskell up accordingly.

Assembly is problematic. In a certain weird sense some Assemblies are "memory safe" because they do protect their language abstractions by virtue of having almost none.

Conclusion

"Weak" and "strong" mean so many things that they mean nothing. "Static" and "dynamic" mean something but don't really convey many differences. This chart is not the end-all-be-all of language categorization, but it is the beginning of the start of an inkling of a glimpse at how rich the space really is. Hopefully this will make static typing advocates understand that dynamic typing doesn't necessarily mean total insanity while dynamic typing advocates can begin to see the range of sophisticated expression afforded by many modern statically typed languages. I highly recommend "What To Know Before Debating Type Systems" for more on the subject.

Postscript

Inspired in part by this post and a follow up discussion, David MacIver is investigating the dimensions of language design that are actually useful in picking the right tool for the job.

Footnotes

  1. Type theory folk will note that I've pretty casually collapsed two legs of the "lambda cube" into one dimension of power. That's not very valid but it seems to work out in practice. It's hard to find a sophisticated dependently typed language (types indexed by values) that isn't also very rich when it comes to types indexed by types.
  2. C++'s ability to parameterize templates by a few forms of literal shouldn't be confused with dependent typing.
  3. Technically there's no such thing as "Assembly;" it's a family of vaguely related languages with semantics tied extremely closely to underlying machines.
  4. A total language can of course crash due to resource exhaustion or hardware failure. There's no way to prevent that.

Friday, August 21, 2009

Getting to the Bottom of Nothing At All

Except when I'm being a total smart ass or self indulgent emo telling the Ruby community to f' off, most of what I write talks about practical stuff, often with a small dose of theory. But this time it's about theory with a small dose of practice.

Last time I talked about the way programmers in popular C derived languages C++, Java, and C# think of a void function as being one that returns nothing. I contrasted this with the functional view that such functions return something, they just return the unit value (), something that doesn't carry any information. To expand on this I want to talk about actually returning nothing.

In any Turing complete language it is possible to write a function that really truly never returns anything, not even a made-up singleton value like (). In C style languages the simplest such function might contain an infinite loop like

while(true);
For my purposes a better way to explore the idea is with infinite recursion. Pretend your favorite C style language does tail call optimization[1] so we can ignore stack overflow issues and examine the following
X foo(){ return foo(); }

The question is what should the X be? And the answer is that if you plug that line of code into C, C++, C#, or Java then X can be just about any type you want[2].

If that doesn't give you pause then consider this: your type system appears to be lying to you. The function is promising an X but has no hope of delivering. And it's quite happy to promise you any X. If you write "String foo(){ return foo(); }" then your type system considers the expression "foo();" to have type String so that you can write "String myFoo = foo();". But guess what, foo doesn't return a String and myFoo will never get a String. Is the type system lying?

The answer is no, of course not. It's just proving something a bit weaker than you might think at first. Instead of proving that "foo() will compute something that is a String" it's proving that "foo() won't compute something that isn't a String." If your "Law of the Excluded Middle" sense just tweaked, then you're on the right page. You'd think there are only two cases to consider: either the function definitely computes a String or it definitely doesn't. But there's this third case where it doesn't compute anything, and since the type checker can't reliably detect non-termination (cf. the Turing Halting Problem) it has to do something a bit odd. First, it assumes that the foo() declaration is correct and does return a String. Then it goes looking for a contradiction such as returning an int. Inside the function the compiler sees that the return is the result of the expression foo(). But the compiler has already assumed that foo() returns a String. There's no contradiction between declared and result so all is happy. The word "tautology" should come to mind right about now. By assuming X is true the type checker has proved that X is true. This weakening is a practical consequence of the Turing Halting Problem and there are at least two good ways to think about it. But first some more examples of the phenomenon.

Exceptions and Exits and ...

I very casually suggested that we ignore stack overflow issues in the infinite recursion above. But exceptions are an important part of this story because they are, in some interesting way, related to non-termination. Consider this function (or its equivalent in your favorite language)
X foo(){ throw new RuntimeException(); }
Once again, X can be any type. And once again foo() does not in fact compute an X.

Clearly these two definitions of foo are different things. Non-termination means we're stuck whereas an exception means we might be able to recover and try something else, or at least notify that there's a problem and shutdown cleanly. Nonetheless they both behave similarly. First, they hijack the normal flow of control in such a way that it never returns back to the caller. And second they are both examples of a kind of weakening in the type system since any type can be substituted for X. Formally they are both examples of functions that diverge (functions that return normally are said to converge).

The Bottom Type

Here's a third example of a diverging function in Java. Translate as necessary for your language (in Java System.exit(n) stops the program immediately and returns n to the calling process).
X foo(){ System.exit(1); return null; }

Yet another case where foo() won't compute anything, it diverges. In fact, the return statement after the exit call is dead code. However, this example is slightly different from the other examples because we had to create the dead code for the compiler to be happy[3] and, in fact, for a different "X" like int the return value might have to be changed. That bit of dead code is closely related to the heart of this article.

Java can detect some kinds of dead code. If you write "X foo() { throw new RuntimeException(); return null; };" then Java recognizes that the return is unreachable and complains. In my System.exit(1) example Java didn't recognize that the call to exit() will never return so it required a following return statement. Obviously "throw" is a keyword and can get special attention that a mere library function can't, but it would be useful to be able to let Java know that, like throw, exit() diverges.

One of the best ways to tell a compiler how a function behaves is by using types and, in type theory, there's a type that expresses just what we need. The type is called bottom (often written ⊥), and while there are different ways to look at the bottom type I'm going to go with a subtyping based view that should be accessible to C++, C#, and Java programmers.

If a language has subtyping and a function says it will return a type "X" then the function is allowed to return a "Y" instead as long as Y is a subtype of X. In my example of a function that just throws an exception the return type could be anything at all. So if we wanted System.exit(1) to indicate that it diverges the same way throw does, then its return type should be a subtype of all types. And indeed, that's exactly what bottom is.[4] bottom is a subtype of String, and int, and File, and List<Foo>, and every other type. Usefully, conventional type hierarchies are written with supertypes above subtypes, which makes a convenient mnemonic for "bottom" which needs to go below everything on such a hierarchy.

Now, if you're used to OO thinking then you expect a value with a certain subtype to in some sense be substitutable everywhere that a supertype is expected. But how can any one object behave like a String, an int, a File, etc? Remember that bottom indicates divergence: an expression with type bottom can never compute a value. So if exit()'s return type was bottom it would be totally safe to write "String foo() { return System.exit(1); }" while another bit of code could have "int bar() {return System.exit(1); }."

Making it Useful, A Divergence to Scala Divergence

Occasionally it might be useful to indicate that a function diverges. Examples are functions like System.exit(1), or functions that always throw an exception, perhaps after logging something or doing some useful calculation to create the exception. But interestingly out of all the statically typed languages that have any following outside of pure research only Scala has an explicit bottom type, which it calls Nothing. The reason Scala has a bottom type is tied to its ability to express variance in type parameters.

For some reason a lot of programmers run screaming into the night when you say "covariance" or "contravariance." It's silly. I won't get into all the details of variance, but I will say that in Scala the declaration "class List[+T] {...}" means that List[Subtype] is a subtype of List[Supertype]. No screaming necessary. And List[+T] brings me to one extremely practical use of bottom - what type should an empty List have?

Well, an empty List can have type List[String] or List[Foo] or List[int]. T can be whatever. And what's a subtype of whatever for all values of whatever? You guessed it: bottom (Nothing). Indeed, Scala has one constant called Nil whose type is List[Nothing]: a subtype of List[String], List[int], and List[whatever]. It all ties up in a bow when you consider that a List[T] has a method called head which returns the first element of the list as type T. But an empty list has no first value, it must throw an exception. And sure enough, head in List[Nothing] has type Nothing.
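Here's a hedged Scala sketch of those pieces in action:

val xs: List[String] = Nil // fine: List[Nothing] is a subtype of List[String]

def boom(): Nothing = throw new RuntimeException("diverging")
val s: String = boom() // type checks, since Nothing is a subtype of String

The last line compiles happily and then, of course, diverges at runtime rather than ever binding s.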

C# 4.0 is supposed to be getting definition site variance similar to Scala's but using the clever mnemonic keywords "in" and "out". I haven't heard anything yet on whether it will also add a bottom type, but it would make a lot of sense.

Java has usage site variance using wildcards. You can say "List<? extends Supertype> x" to indicate that x can have a List<Supertype> or a List<Subtype>. The bottom type would be useful in Java, too, although not as compelling since wildcards are so verbose people rarely use them even when they would make sense. Plus, Java folk tend to mutate everything and List[Nothing] in part makes sense in Scala because Scala Lists are immutable.

C++ does not have any simple way to express this kind of variance so the bottom type is even less compelling in C++ than it is in Java.

Back On Track

Haskell and languages in the ML family don't have an explicit bottom type. Their type systems don't have subtyping so adding bottom as a subtype would confuse things. Nonetheless, they do have a nice way to express bottom that can be teleported back to Java, C#, and C++ (but not C). Recall that bottom is a subtype of all types. Another way of saying that is that if a function returns type bottom then for all types A, the function returns something compatible with A, so why not express that directly? In Haskell the "error" function takes a string and throws an exception.
Prelude> :type error
error :: [Char] -> a
In Haskell, a lower case identifier in type position is always a type parameter, and [Char] means "list of Char", aka String. So for all types a, if "error" doesn't diverge then it will take a String and return an a. That can pretty much be expressed directly in Java
public static <A> A error(String message) { throw new RuntimeException(message); }

or C++

#include <exception>
#include <string>

class message_exception : public std::exception {
  public:
    explicit message_exception(const std::string& message) : message_(message) {}
    virtual ~message_exception() throw() {}

    virtual const char * what() const throw() { return message_.c_str(); }

  private:
    const std::string message_;
};

template <typename A>
A error(const std::string& message) { throw message_exception(message); }

And for either language, usage would be something like

int divide(int x, int y) {
  if (y == 0) {
    return error<int>("divide by zero"); // drop the "<int>" in Java
  } else {
    return x / y;
  }
}

Haskell also has a function called "undefined" that simply throws an exception with the message "undefined." It's useful when you want to get started writing some code without fully specifying it.

Prelude> :type undefined
undefined :: a

The function isn't as interesting as the type, which promises that for any type a "undefined" can compute an a or diverge. Since "undefined" can't possibly just produce a value of any arbitrary type, it has no choice but to diverge. The same idea can be added to Java

public static <A> A undefined() {return error("undefined"); }
or C++
template <typename A>
A undefined() { return error<A>("undefined"); }

In either language it might be used as

std::string IDontKnowHowToComputeThis(int input) { 
  return undefined<std::string>(); // again, make appropriate changes for Java
} 

Given the requirement to write the "return" keyword in C#, Java, and C++, I'm not sure how practical something like a generified error function really is compared to having it return an exception and making the user write 'throw error("blah")'. Nor am I sure whether "undefined" is that much more useful than just throwing an UndefinedException. But this does illustrate the relationship between the bottom type and functions that, in Haskell terms, compute "forall a.a" or, in C#/Java/C++ terms, return the type parameter A without taking an A as an argument.

Also, as always care should be taken when transliterating from one language to another. Java would allow a function like "undefined" to return a null instead of diverging. C++ would allow it to return anything at all and would only fail to compile if it were used in an incompatible way. That contrasts with languages like Haskell and ML in which the only way to implement "undefined :: a" is to make it diverge in some form or fashion[5].

The Bottom Value

I've spent some time talking about the bottom type as having no values. But it does have expressions like "undefined()" and that leads to a rather philosophical notion of how to think of bottom as having a value. Sorta. Skip this section if you don't like total gray beard philosophizing. If you're brave, then stick with me and imagine a subset of your favorite C derived language that does not allow side effects. No mutation, no IO, and no exceptions. In this strange world functions either just compute values or they diverge by not halting. In such a world the order of evaluation mostly isn't important as long as data dependencies are met - in f(a(), b()), you can compute a() before b() or b() before a(), whatever, they can't interfere with each other. Doesn't matter. The only time order of evaluation is forced on us is when there are data dependencies, so "a() + b()" must compute a() and b() (or b() and a()) before their sum can be computed. Nice.

Well, almost. Order of evaluation can matter for expressions that diverge. Let me give an example.

int infiniteLoop() {return infiniteLoop();}
int f(int x) {return 42;}
int result = f(infiniteLoop());

Because f ignores its argument, if we call "f(infiniteLoop());" there are two possible outcomes. If "infiniteLoop()" is eagerly evaluated before f is called then the program will diverge. On the other hand, if the expression "infiniteLoop()" is lazily remembered as being potentially needed later then f can successfully return 42 and the diverging expression can be forgotten just like it never happened (because it didn't).

We've gone to the pain of eliminating side effects from our language so it's a little irritating to have to keep thinking about evaluation order just to deal with divergence, so we perform a little mental trick; a little white lie. Imagine that functions like infiniteLoop above don't get stuck, they compute a value called ⊥, which is the only value of type bottom.

Now, since the bottom type is a subtype of all types and ⊥ is the bottom value, then it follows that every type must be extended with ⊥. Boolean = {⊥, true, false}, int = {⊥, 0, 1, -1, 2, -2, ...}, and unit = {⊥, ()}. That means we need some rules for ⊥ and how it interacts with everything else. In the vast majority of languages - including Java, C#, and C++, but also impure functional languages like F# and OCaml - doing just about anything with ⊥ can only compute ⊥. In other words, for all functions f, f(⊥) = ⊥. If you write f(infiniteLoop()) in those languages then the result is ⊥. This kind of rule is called "strictness".

In contrast, Haskell is often called a "lazy language," meaning that expressions aren't evaluated until they're needed. That's not quite technically correct. The Haskell spec just says that it is "non-strict." The spec doesn't care when expressions are evaluated so long as programs allow ⊥ to slide as far as possible. An expression like f(infiniteLoop()) must evaluate to 42. Haskell basically forces an expression involving ⊥ to evaluate to ⊥ only when the argument must be used[6]. The distinction between "lazy" and "non-strict" is subtle, but by being "non-strict" rather than "lazy" a Haskell compiler can use eager evaluation anytime it can prove that doing so doesn't change behavior in the face of ⊥. If a function always uses its first argument in a comparison, then Haskell is free to use eager evaluation on that argument. Since Haskell truly does forbid side effects (unlike our imagined neutered language above), the choice of evaluation strategy is up to the compiler and invisible except for performance consequences[7].
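Scala, which is strict by default, can emulate the non-strict behavior with by-name parameters, which makes for a tidy hedged sketch of the difference:

def infiniteLoop(): Int = infiniteLoop()

def fStrict(x: Int) = 42 // fStrict(infiniteLoop()) diverges: the argument is evaluated first
def fLazy(x: => Int) = 42 // fLazy(infiniteLoop()) returns 42: the unused argument is never evaluated

In the terms used above, fStrict(⊥) = ⊥ while fLazy(⊥) = 42.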

C++, Java, and C# have just a tiny bit of non-strictness. In these languages "true || ⊥" is true and "false && ⊥" is false. If these languages were totally strict then "true || ⊥" would be ⊥. Users of these languages call this behavior "short circuiting" and it's done for performance reasons rather than being a philosophical goal, but it's still a curious departure from their normal rules.

There you have it. The bottom value ⊥ is a clever mental hack to allow purely declarative functional code to be reasoned about without injecting sequencing into the logic. It allows people to talk about the difference between a purely declarative strict language and a purely declarative non-strict language without getting into details of evaluation order. But since we're talking about languages that aren't so purely declarative, we can take off our philosopher hats and return back to the world where side effects are unrestricted, bottom is a type with no values and divergence means that flow of control goes sideways.

Tuples and Aesthetics

Last time I talked about the unit type and how, if you interpret types as logical propositions, the unit type behaves as a "true" and tuple types act as logical and "∧." I also talked about an algebraic interpretation where unit type acts like 1 and tuple types act like multiplication "×". So the type (A,B) can also be written as A∧B or A×B. The type (A,unit) is isomorphic to A, A×1 = A, and A∧True <=> A.
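That isomorphism is easy to witness directly. Here's a hedged Scala sketch of the two halves; composing them in either order gets you back exactly where you started:

def to[A](p: (A, Unit)): A = p._1
def from[A](a: A): (A, Unit) = (a, ())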

Bottom has similar interpretations as 0 or False. The type (A,bottom) is isomorphic to the bottom type because you can't compute any values of type (A,bottom). A×0 = 0, and A∧False <=> False. Nice how it all hangs together, eh?

Bottom behaves like False in another way. In logic if you assume that False is true you can prove anything. Similarly in type systems the bottom type allows the programmer to promise to do even the impossible. For instance, here's a Java function signature that promises to convert anything to anything.

public static <A,B> B convert(A arg) {...}
If you ignore dynamic casting and null (they're both weird in different ways) there's no meaningful way to implement that function except by diverging. More on this in an upcoming episode.

Conclusion

I somehow don't think anybody will be running into the office saying "I just read an article on the bottom type, so now I know how to solve our biggest engineering challenge." But the bottom type is still interesting. It's a kind of hole in your static type system that follows inevitably from the Turing Halting Problem[8]. It says that a function can't promise to compute a string; it can only promise to not compute something that isn't a string. It might compute nothing at all. And that in turn leads to the conclusion that in a Turing complete language static types don't classify values (as much as we pretend they do), they classify expressions.

Footnotes

  1. gcc does do tail call optimization under some circumstances.
  2. Oddly, in C, C#, and Java X can't be "void" because it's an error to return an expression of type void. C++ does allow void to make writing templates a bit easier.
  3. Yes, I can make it a void function and not have a return. But again that would illustrate how exit() is different from a throw: throw doesn't require the return type to be void.
  4. Note I'm saying "subtype" here and not "subclass." int, List<String>, and List<File> are 3 different types. In C++, Java, and C# int doesn't have a class. In Java and C# List<String> and List<File> both come from the same class. Yet bottom must be a subtype of all 3 so it can't be a subclass.
  5. A plea to my readers: I'm too lazy to see if "forall a.a" is bottom in F# or C#, or if, like Java, null would be allowed. My bet is that null won't be allowed because only Java has the peculiar notion that type parameters must be bound to reference types.
  6. Very loosely speaking. Please see the Haskell Report for all the gory details about when Haskell implementations are allowed to diverge.
  7. Haskell has another clever hack where throwing an exception isn't an effect, but catching one is. Since order of evaluation is unspecified in Haskell, the same program with the same inputs could conceivably cause different exceptions. Exceptions are theoretically non-deterministic in Haskell.
  8. A language that isn't Turing complete can prove termination and does not need a bottom type, but such a language isn't powerful enough to have a program that can interpret the language.

Thursday, July 16, 2009

"Fixing" Groovy?

In response to my last article on Groovy's lack of static typing James Strachan, the original Groovy guy, tweeted "groovy should fix this really IMHO." Well, perhaps. But "fixing" it would substantially change Groovy's semantics. Most importantly it would prevent Groovy from executing some programs that work just fine right now.

An Example

Here's a simple example of a program that works now but wouldn't under a static typing regime. It's a simple variation on the program from the previous article.

interface Foo {}
interface Bar {}
class Baz implements Foo, Bar {}

Foo test(Bar x) {return x}

test(new Baz())
println("Everything's Groovy")

Compile it, run it, and feel groovy.

~/test$ groovyc test.groovy
~/test$ groovy test
Everything's Groovy

If I transliterate into Scala

trait Foo
trait Bar
class Baz extends Foo with Bar

def test(x : Bar) : Foo = x

test(new Baz)
println("Everything's Scala?")

And try to compile

~/test$ scalac -Xscript test test.scala
!!!
discarding <script preamble>
(fragment of test.scala):5: error: type mismatch;
 found   : this.Bar
 required: this.Foo
def test(x : Bar) : Foo = x
                           ^
one error found

Scala rejects a program that Groovy is happy with.

Conclusion

That's the nature of the beast. Static typing rejects some good programs that might otherwise work. And as my last article showed, dynamic typing allows some bad programs that can't possibly work. That's a pretty fundamental difference and there's no way around it.

If the Groovy community wants to make its type annotations work in a more statically typed manner that's ok. This article just shows that such a change wouldn't be 100% compatible with today's Groovy.

Wednesday, July 15, 2009

Groovy Does Not Have Optional Static Typing

Every now and then somebody writes that Groovy supports optional static typing. But it doesn't. They've confused the use of type annotations on variables with a static type system.

This article is emphatically not an entry in the long, sad, and mostly insipid static vs. dynamic debate. It is a clarification of a misunderstanding.

Definitions

The word "static" in "static type system" is related to phrases like "static analysis." Basically, without executing the code in question, a static type system analyzes code to prove that certain kinds of misbehaviors won't ever happen.

The word "dynamic" in "dynamic type system" is related to the "dynamic execution" of a program. In contrast to a static type system, a "dynamic type system" automatically tests properties when expressions are being (dynamically) evaluated.[1]

The Proof

With those definitions it's easy to show that adding type annotations to a Groovy program does not make it statically typed. Create a file named test.groovy. In it define a couple of unrelated classes

class Foo {}
class Bar {}

Now write a function that promises to take a Bar and return a Foo. Add a println for fun.

Foo test(Bar x) { return x }
println("Everything's groovy")

Compile it and run it (explicitly compiling isn't necessary, it just reinforces my point)

~/test$ groovyc test.groovy
~/test$ groovy test
Everything's groovy

There were no complaints at all. A static type system would have rejected the program.

Open up test.groovy and make a slight adjustment so that the function is actually called

Foo test(Bar x) {return x}
println("Everything's groovy")
test(new Bar())

Compile and run this

~/test$ groovyc test.groovy
~/test$ groovy test
Everything's groovy
Caught: org.codehaus.groovy.runtime.typehandling.GroovyCastException: 
Cannot cast object 'Bar@861f24' with class 'Bar' to class 'Foo'
 at test.test(test.groovy:4)
 at test.run(test.groovy:6)
 at test.main(test.groovy)

A runtime exception is finally thrown at the (dynamic) moment when the test function is executed. That's dynamic checking rather than any form of static proof.

Actual Static Typing

Now open a file named test.scala.

class Foo
class Bar
def test(x : Bar) : Foo = x

Compile as a script

~/test$ scala -Xscript test test.scala
(virtual file):1: error: type mismatch;
 found   : this.Bar
 required: this.Foo
object test {
^
one error found

Ta da, static typing. With no attempt to execute my test function the type checker says "I have reason to believe this might go sideways."

The Misunderstanding

There's an oft-repeated phrase that "in a statically typed language variables are typed." There's also a common misunderstanding that static typing and type annotations are the same thing. I can show a simple counterexample to both misconceptions with one language: Haskell.

Create a file named Test.hs. Here's a function called flatten. It flattens a list of lists one "level" to just be a list.[2]

flatten = foldl (++) []

No variables were harmed during the creation of that function. It's written in what's called "point free" style. Yet with neither variables nor type annotations I can show it's statically typed by defining a bogus main function

main = print $ flatten [1,2,3,4,5,6]

And trying to compile

~/test$ ghc Test.hs -o test
Test.hs:3:24:
    No instance for (Num [a])
      arising from the literal `1' at Test.hs:3:24
    Possible fix: add an instance declaration for (Num [a])
    In the expression: 1
    In the first argument of `flatten', namely `[1, 2, 3, 4, ....]'
    In the second argument of `($)', namely
        `flatten [1, 2, 3, 4, ....]'

Thus you need neither variables nor annotations to have static typing. A small change fixes things right up

main = print $ flatten [[1,2,3],[4,5,6]]

~/test$ ghc Test.hs -o test
~/test$ ./test
[1,2,3,4,5,6]

It's easy enough to see what type GHC has assigned the function. Just add a module declaration at the very top of the file

module Main (flatten, main) where

And fire up ghci

~/test$ ghci
Loading package base ... linking ... done.
Prelude> :load Test
Ok, modules loaded: Main.
Prelude Main> :type flatten
flatten :: [[a]] -> [a]

Exactly as described, it takes a list of lists and returns a list.

Groovy Does Have Static Typing

Having shown that Groovy does not have static typing let me show that it does have static typing. :-) At least, it has static typing for some kinds of things. Open up test.groovy again and add this

class MyRunnable implements Runnable {}

Compile that and you get an error

~/test$ groovyc test.groovy
org.codehaus.groovy.control.MultipleCompilationErrorsException: startup failed, test.groovy: 8: 
Can't have an abstract method in a non-abstract class. 
The class 'MyRunnable' must be declared abstract or the method 'void run()' must be implemented.
 @ line 8, column 1.
   class MyRunnable implements Runnable {}
   ^

1 error

The error happens without executing anything so it's a form of static analysis, and it's caused by failure to adhere to the Runnable type's requirements hence it's a static type error. But it's not an optional static type check - you can't turn it off.

Interesting, no?

Conclusion

Whatever you feel positively or negatively about Groovy please don't perpetuate the myth that it has optional static types. It doesn't. Groovy's optional type annotations are used for dynamic testing not static proofs. It's just that with type annotations Groovy's dynamic tests are for "dynamic type" tags attached to objects rather than its more usual mechanism of dynamically testing for the presence of certain methods. What little static typing Groovy does have is not optional.

Update: Some time after this article was written, a team announced Groovy++, an extension to Groovy that really does have optional static typing.

Footnotes

[1] In a certain formal sense the phrase "dynamic type system" is an oxymoron and "static type system" is redundant. However, the phrases are commonly used and there doesn't seem to be a commonly recognized concise phrase to describe what a "dynamic type system" does. Pedants will argue for "untyped" but that doesn't give me a concept on which to hang the different ways that say Groovy, Ruby, and Scheme automatically deal with runtime properties.

[2] A more general function called join already exists for all monads in the Control.Monad module. It can be defined as "join m = m >>= id".

Thursday, May 7, 2009

A Brief, Incomplete, and Mostly Wrong History of Programming Languages

1801 - Joseph Marie Jacquard uses punch cards to instruct a loom to weave "hello, world" into a tapestry. Redditers of the time are not impressed due to the lack of tail call recursion, concurrency, or proper capitalization.

1842 - Ada Lovelace writes the first program. She is hampered in her efforts by the minor inconvenience that she doesn't have any actual computers to run her code. Enterprise architects will later relearn her techniques in order to program in UML.

1936 - Alan Turing invents every programming language that will ever be but is shanghaied by British Intelligence to be 007 before he can patent them.

1936 - Alonzo Church also invents every language that will ever be but does it better. His lambda calculus is ignored because it is insufficiently C-like. This criticism occurs in spite of the fact that C has not yet been invented.

1940s - Various "computers" are "programmed" using direct wiring and switches. Engineers do this in order to avoid the tabs vs spaces debate.

1957 - John Backus and IBM create FORTRAN. There's nothing funny about IBM or FORTRAN. It is a syntax error to write FORTRAN while not wearing a blue tie.

1958 - John McCarthy and Paul Graham invent LISP. Due to high costs caused by a post-war depletion of the strategic parentheses reserve LISP never becomes popular[1]. In spite of its lack of popularity, LISP (now "Lisp" or sometimes "Arc") remains an influential language in "key algorithmic techniques such as recursion and condescension"[2].

1959 - After losing a bet with L. Ron Hubbard, Grace Hopper and several other sadists invent the Capitalization Of Boilerplate Oriented Language (COBOL). Years later, in a misguided and sexist retaliation against Adm. Hopper's COBOL work, Ruby conferences frequently feature misogynistic material.

1964 - John Kemeny and Thomas Kurtz create BASIC, an unstructured programming language for non-computer scientists.

1965 - Kemeny and Kurtz go to 1964.

1970 - Guy Steele and Gerald Sussman create Scheme. Their work leads to a series of "Lambda the Ultimate" papers culminating in "Lambda the Ultimate Kitchen Utensil." This paper becomes the basis for a long running, but ultimately unsuccessful run of late night infomercials. Lambdas are relegated to relative obscurity until Java makes them popular by not having them.

1970 - Niklaus Wirth creates Pascal, a procedural language. Critics immediately denounce Pascal because it uses "x := x + y" syntax instead of the more familiar C-like "x = x + y". This criticism happens in spite of the fact that C has not yet been invented.

1972 - Dennis Ritchie invents a powerful gun that shoots both forward and backward simultaneously. Not satisfied with the number of deaths and permanent maimings from that invention he invents C and Unix.

1972 - Alain Colmerauer designs the logic language Prolog. His goal is to create a language with the intelligence of a two year old. He proves he has reached his goal by showing a Prolog session that says "No." to every query.

1973 - Robin Milner creates ML, a language based on the M&M type theory. ML begets SML which has a formally specified semantics. When asked for a formal semantics of the formal semantics Milner's head explodes. Other well known languages in the ML family include OCaml, F#, and Visual Basic.

1980 - Alan Kay creates Smalltalk and invents the term "object oriented." When asked what that means he replies, "Smalltalk programs are just objects." When asked what objects are made of he replies, "objects." When asked again he says "look, it's all objects all the way down. Until you reach turtles."

1983 - In honor of Ada Lovelace's ability to create programs that never ran, Jean Ichbiah and the US Department of Defense create the Ada programming language. In spite of the lack of evidence that any significant Ada program is ever completed historians believe Ada to be a successful public works project that keeps several thousand roving defense contractors out of gangs.

1983 - Bjarne Stroustrup bolts everything he's ever heard of onto C to create C++. The resulting language is so complex that programs must be sent to the future to be compiled by the Skynet artificial intelligence. Build times suffer. Skynet's motives for performing the service remain unclear but spokespeople from the future say "there is nothing to be concerned about, baby," in an Austrian accented monotone. There is some speculation that Skynet is nothing more than a pretentious buffer overrun.

1986 - Brad Cox and Tom Love create Objective-C, announcing "this language has all the memory safety of C combined with all the blazing speed of Smalltalk." Modern historians suspect the two were dyslexic.

1987 - Larry Wall falls asleep and hits Larry Wall's forehead on the keyboard. Upon waking Larry Wall decides that the string of characters on Larry Wall's monitor isn't random but an example program in a programming language that God wants His prophet, Larry Wall, to design. Perl is born.

1990 - A committee formed by Simon Peyton-Jones, Paul Hudak, Philip Wadler, Ashton Kutcher, and People for the Ethical Treatment of Animals creates Haskell, a pure, non-strict, functional language. Haskell gets some resistance due to the complexity of using monads to control side effects. Wadler tries to appease critics by explaining that "a monad is a monoid in the category of endofunctors, what's the problem?"

1991 - Dutch programmer Guido van Rossum travels to Argentina for a mysterious operation. He returns with a large cranial scar, invents Python, is declared Dictator for Life by legions of followers, and announces to the world that "There Is Only One Way to Do It." Poland becomes nervous.

1995 - At a neighborhood Italian restaurant Rasmus Lerdorf realizes that his plate of spaghetti is an excellent model for understanding the World Wide Web and that web applications should mimic their medium. On the back of his napkin he designs Programmable Hyperlinked Pasta (PHP). PHP documentation remains on that napkin to this day.

1995 - Yukihiro "Mad Matz" Matsumoto creates Ruby to avert some vaguely unspecified apocalypse that will leave Australia a desert run by mohawked warriors and Tina Turner. The language is later renamed Ruby on Rails by its real inventor, David Heinemeier Hansson. [The bit about Matsumoto inventing a language called Ruby never happened and better be removed in the next revision of this article - DHH].

1995 - Brendan Eich reads up on every mistake ever made in designing a programming language, invents a few more, and creates LiveScript. Later, in an effort to cash in on the popularity of Java the language is renamed JavaScript. Later still, in an effort to cash in on the popularity of skin diseases the language is renamed ECMAScript.

1996 - James Gosling invents Java. Java is a relatively verbose, garbage collected, class based, statically typed, single dispatch, object oriented language with single implementation inheritance and multiple interface inheritance. Sun loudly heralds Java's novelty.

2001 - Anders Hejlsberg invents C#. C# is a relatively verbose, garbage collected, class based, statically typed, single dispatch, object oriented language with single implementation inheritance and multiple interface inheritance. Microsoft loudly heralds C#'s novelty.

2003 - A drunken Martin Odersky sees a Reese's Peanut Butter Cup ad featuring somebody's peanut butter getting on somebody else's chocolate and has an idea. He creates Scala, a language that unifies constructs from both object oriented and functional languages. This pisses off both groups and each promptly declares jihad.

Footnotes

  1. Fortunately for computer science the supply of curly braces and angle brackets remains high.
  2. Catch as catch can - Verity Stob

Edits

  • 5/8/09 added BASIC, 1964
  • 5/8/09 Moved curly brace and angle bracket comment to footnotes
  • 5/8/09 corrected several punctuation and typographical errors
  • 5/8/09 removed bit about Odersky in hiding
  • 5/8/09 added Objective-C, 1986
  • 5/8/09 added Church and Turing
  • 4/9/10 added Ada (1983) and PHP(1995)

Monday, April 6, 2009

Erlang Style Actors Are All About Shared State

Actors have become quite the popular topic. Besides Erlang, there's a famous library implementation in Scala and at least 3 for Java. But the "no shared state" propaganda is setting people up for failure. In last week's exciting episode I defined both what state is and what it isn't. It is the fact that something responds differently to the same inputs over time. It is not the details of how that is represented. Based on that I can now show why saying that Erlang style actors don't share state is totally wrong headed - that, in fact, if you don't want to share state the actor model is a very clunky way of doing things.

Before I go on, let me emphasize that I like Erlang. I like actors, too. It's a very nifty model. But it's not a stateless model, and the claim that it is - the "no shared state" mantra - is exactly what I'm ranting against.

The problem is that Erlang style actors can have shared, mutable (changeable) state. The model is all about shared state. If you didn't need to share mutable state then there are better alternatives than actors. And, importantly, even when you are using actors well you must think about all of the problems of shared mutable state. All of them. People who believe the propaganda are in for rude shocks.

The Situation

Last time I described a simple sock tube dispensing machine. Insert two coins, press the button, get a tube of socks. Insert too many coins and you get the extra coins returned. Push the button before you've inserted enough coins and you get nothing. The diagram from that article boils down to three states: zero coins, one coin, and two coins.

Imagine that Alice and Bob work at Crap Corp. ("Making High Quality Crap Since 1913"). Once a day they each like to saunter down to the break room and purchase a nice warm tube of socks. But, this is Crap Corp. and the machines don't take regular coins but Crap Corp. Coins instead. Each CCC weighs 40 pounds (very roughly 18.1436948 kilograms).

Naturally, Alice and Bob don't want to carry 80 pounds of crappy tokens around with them so they each laboriously drag a token down to a machine, insert it, walk back to their cubicle, grab another and repeat. Now, if Alice and Bob take their sock breaks at very different times that's probably going to work fine. But if they tend to overlap bad things happen. It's possible for Alice to insert her first coin, Bob to insert his first coin, Alice to insert her second coin and get a coin back (BONUS! she cries happily, experiencing the greatest joy she's ever experienced at Crap Corp.) So Alice pushes the button, gets her tube of socks, and merrily skips back to her cube. Well, maybe not skip exactly, but whatever you do when you're ecstatically happy while carrying 40 pounds of crap.

Then Bob shows up, inserts his second coin, eagerly smashes the button to get his well deserved tube of socks and ... gets nothing. Feeling cheated he pounds on the machine, kicks it, shakes it, and rocks it back and forth. Inevitably, the machine tips over, falls on Bob, and crushes him with all the tons of Crap Corp. Coins that have been inserted over the weeks. A tragic ending for somebody who just wanted some socks.

Now, that outcome isn't guaranteed even if Bob and Alice start about the same time. On the way to inserting her first coin Alice could be waylaid by the boss looking for his TPS report. As Alice patiently explains that a) TPS reports were never her job and b) they were discontinued three years ago and c) her eyes are on her face not her chest, Bob could have merrily taken the two coin trips and safely received a tube of socks without ever knowing the mortal injury he narrowly avoided.

Finally, Some Damn Code

Any time something unwanted can happen as the result of unpredictable delays, scheduler priorities, workload, etc., you have a race condition. What could be more unwanted than being crushed by a vending machine? And what could be more unpredictable than a pointy haired boss? We can write this up exactly in Erlang.

Put this in a file called sockmachine.erl. First, a little standard module definition and export business.

-module(sockmachine).
-export([start/0, insertcoin/1, pushbutton/1, test/2]).

Here are the guts of the machine. zerocoins(), onecoin(), and twocoins() are the states of the machine. When one is called it blocks, waiting for a message in its inbox. Based on the message it gets it responds with {nothing} if nothing happens, {coin} if it needs to return a coin, or {tubeofsocks} for the win. It also then calls the appropriate function for the next state - which might be the same state. These are all private functions not exported by the module. Note, there are more clever ways to write this - but for explanatory purposes I like this.

zerocoins() ->
   receive
       {coin, From} ->
           From ! {nothing},
           onecoin();
       {button, From} ->
           From ! {nothing},
           zerocoins()
   end.

onecoin() ->
   receive
       {coin, From} ->
           From ! {nothing},
           twocoins();
       {button, From} ->
           From ! {nothing},
           onecoin()
   end.

twocoins() ->
   receive
       {coin, From} ->
           From ! {coin},
           twocoins();
       {button, From} ->
           From ! {tubeofsocks},
           zerocoins()
   end.

start() spawns a new sock machine actor in the zerocoins state.

start() -> spawn(fun() -> zerocoins() end).

insertcoin and pushbutton are RPC style convenience functions that insert a coin or push the button. Or did I get that backwards? Well, whichever, they each return whatever they receive as a message back from the machine.

insertcoin(Machine) -> 
   Machine ! {coin, self()},
   receive X -> X
end.

pushbutton(Machine) -> 
   Machine ! {button, self()},
   receive X -> X
end.

Test spawns as many concurrent test loops as requested to simultaneously pound one machine.

test(Machine, Processes) -> 
  if 
    Processes > 0 ->
       spawn(fun() -> testloop(Processes, Machine, 100) end),
       test(Machine, Processes - 1);
    true ->
       io:format("all test processes launched~n")
  end.       

Testloop repeatedly walks through the cycle of inserting 2 coins and pushing the button. It calls helper functions that mirror the state of the sock machine to show what it expects to happen at each step, complaining when things don't go well.

testloop(Process, Machine, Count) ->
   if 
     Count > 0 -> testzerocoins(Process, Machine,Count);
     true -> io:format("[~w] testing completed~n", [Process])
   end.

testzerocoins(Process, Machine, Count) ->
  case insertcoin(Machine) of
    {nothing} -> testonecoin(Process, Machine,Count);
    {coin} -> 
       io:format("[~w] BONUS COIN!~n", [Process]),
       testtwocoins(Process, Machine,Count)
   end.

testonecoin(Process, Machine, Count) ->
  case insertcoin(Machine) of
    {nothing} -> testtwocoins(Process, Machine,Count);
    {coin} ->
       io:format("[~w] BONUS COIN!~n", [Process]),
       testtwocoins(Process, Machine,Count)
  end.

testtwocoins(Process, Machine, Count) ->
  case pushbutton(Machine) of
    {tubeofsocks} -> io:format("[~w] Got my socks.~n", [Process]);
    {nothing} -> io:format("[~w] Blasted machine ate my money! Give it to me!  Rattle, rattle, ARRRGGHGHHRHRHGH~n", [Process])
  end,
  testloop(Process, Machine, Count - 1).

Now fire up erl, compile, start a machine, and test it with only 1 running test loop

1> c(sockmachine).
{ok,sockmachine}
2> Machine = sockmachine:start().
<0.38.0>
3> sockmachine:test(Machine,1).
all test processes launched
ok
[1] Got my socks.
[1] Got my socks.
[1] Got my socks.
[1] Got my socks.
[1] Got my socks.
[1] Got my socks.
[1] Got my socks.
[1] Got my socks.
[1] Got my socks.
[1] Got my socks.
[1] testing completed

Ah, sweet, sweet success! But now run another test with 2 concurrent test loops. 1 = Bob, 2 = Alice... or was that the other way around?

4> sockmachine:test(Machine,2).
all test processes launched
[2] BONUS COIN!
[1] BONUS COIN!
ok
[2] Got my socks.
[1] Blasted machine ate my money! Give it to me!  Rattle, rattle, ARRRGGHGHHRHRHGH
[2] BONUS COIN!
[1] BONUS COIN!
[2] Got my socks.
[1] Blasted machine ate my money! Give it to me!  Rattle, rattle, ARRRGGHGHHRHRHGH
[2] BONUS COIN!
[1] BONUS COIN!
[2] Got my socks.
[1] Blasted machine ate my money! Give it to me!  Rattle, rattle, ARRRGGHGHHRHRHGH
[2] BONUS COIN!
[1] BONUS COIN!
[2] Got my socks.
[1] Blasted machine ate my money! Give it to me!  Rattle, rattle, ARRRGGHGHHRHRHGH
[2] BONUS COIN!
[1] BONUS COIN!
[2] Got my socks.
[1] Blasted machine ate my money! Give it to me!  Rattle, rattle, ARRRGGHGHHRHRHGH
[2] BONUS COIN!
[1] BONUS COIN!
[2] Got my socks.
[1] Blasted machine ate my money! Give it to me!  Rattle, rattle, ARRRGGHGHHRHRHGH
[2] BONUS COIN!
[1] BONUS COIN!
[2] Got my socks.
[1] Blasted machine ate my money! Give it to me!  Rattle, rattle, ARRRGGHGHHRHRHGH
[2] BONUS COIN!
[1] BONUS COIN!
[2] Got my socks.
[1] Blasted machine ate my money! Give it to me!  Rattle, rattle, ARRRGGHGHHRHRHGH
[2] BONUS COIN!
[1] BONUS COIN!
[2] Got my socks.
[1] Blasted machine ate my money! Give it to me!  Rattle, rattle, ARRRGGHGHHRHRHGH
[2] BONUS COIN!
[1] BONUS COIN!
[2] Got my socks.
[1] Blasted machine ate my money! Give it to me!  Rattle, rattle, ARRRGGHGHHRHRHGH
[2] testing completed
[1] testing completed

It's a litany of socks, bonus coins, and crushed intestines. On my machine it's an oddly predictable litany, but in a real, distributed Erlang app it would be a much more interesting and random litany, as network delays would emulate pointy-haired bosses even better than Erlang's scheduler does.

Some Last Notes

With Erlang style programming, actors are the central unit of statefulness. Multiple actors can share access to one stateful actor. Hence shared state, race conditions, and ruptured spleens. Am I saying that Erlang or actors are bad? No, in fact I quite like them. What the Erlang model does very nicely is separate that which must be stateful because it is concurrent from that which is more pure computation. By making state so much more painful to write than "foo.x = foo.x + 1" the actor model encourages you to think about the consequences of sharing it. It also cleanly mirrors the mechanics of distributed computing and asynchronous IO. It's nice, but it's not stateless.

One last note. I started with "actors are all about shared state." Naturally one might ask "well, what about stateless actors - actors that don't change state or depend on state via IO?" Certainly those are viable uses of actors. But that's no longer concurrency, that's parallelism, and IMHO futures, data flow variables, and Haskell's data parallelism are all cleaner ways to deal with parallelism. Someday soon I hope to write about them. In the meantime, the whole point of having the complexity of message passing instead of those simpler mechanisms is precisely to deal with the complexity of concurrent state.
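
To make the distinction concrete, here's a minimal sketch of a future built out of nothing but spawn and a reply message. The name future and the use of make_ref here are my own illustration, not a standard library API.

%% A minimal future: run F in parallel, claim the answer later.
%% The spawned process exposes no state to anyone - it computes
%% one value, sends it, and dies. That's parallelism, not shared
%% concurrent state. The returned fun must be called by the same
%% process that created the future, since that's whose mailbox
%% gets the reply.
future(F) ->
   Caller = self(),
   Ref = make_ref(),   % unique tag so replies can't be confused
   spawn(fun() -> Caller ! {Ref, F()} end),
   fun() -> receive {Ref, Value} -> Value end end.

Using it looks like Get = future(fun() -> lists:sum(lists:seq(1, 1000000)) end), followed later by Sum = Get() when you actually need the answer.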

One really last note. Sadly, simple straight-up actors don't automatically compose very well. You can design a set of actors that interact correctly for one use case but that don't interact at all well when plugged into a different system. This is another aspect that actors share with traditional manual locks and shared mutable memory. To date, the best known way to deal with composable state is transactions (optimistic, pessimistic, distributed, compensating, software, hardware, database, whatever). There are very nice transactional capabilities available for Erlang, but this is yet another area where the "no shared state" mantra can lead people to think that actors are the entire answer without needing anything else.
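
For a taste of what that buys you, here's a sketch using mnesia, Erlang's built-in transactional database. The account table and the transfer function are made up for illustration; assume a table of {account, Id, Balance} tuples has already been created.

%% Sketch: move Amount between two accounts atomically.
%% mnesia runs the whole fun as a transaction - it retries or
%% aborts it as a unit, so no other process ever observes the
%% half-done state between the reads and the writes.
transfer(From, To, Amount) ->
   mnesia:transaction(fun() ->
       [{account, From, FromBalance}] = mnesia:read({account, From}),
       [{account, To, ToBalance}] = mnesia:read({account, To}),
       mnesia:write({account, From, FromBalance - Amount}),
       mnesia:write({account, To, ToBalance + Amount})
   end).

Note the contrast with the sock machine: two actors cooperating on one multi-step change is exactly where bare message passing starts to creak, and the transaction is what makes the composite step atomic.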

Try not to get crushed and good luck with your socks!

Postscript

It has been suggested in the comments that when people say that Erlang style actors don't share state, what they really mean is that actors don't share memory. First, I clearly defined state in the previous article as being different from its representation. But, just as importantly, I think that saying "we don't share memory" is a distinction without much relevance. It's mostly an implementor's point of view that doesn't reflect how a user must think about actors.

Here's some Erlang code for simple shared mutable variables. This isn't recommended Erlang usage, but it doesn't break the model in any way.

-module(variable).
-export([malloc/0, fetch/1, store/2]).

malloc() -> spawn(fun() -> loop(0) end).

loop(Value) ->
   receive
       {fetch, From} ->
           From ! Value,
           loop(Value);
       {store, NewValue} ->
           loop(NewValue)
   end.

fetch(Variable) -> 
   Variable ! {fetch, self()},
   receive X -> X end.

store(Variable, Value) ->
   Variable ! {store, Value}.

And here's me using it.

1> c(variable).
{ok,variable}
2> X = variable:malloc().
<0.38.0>
3> variable:store(X,42).
{store,42}
4> variable:fetch(X).
42
5> variable:store(X, variable:fetch(X) + 1).
{store,43}
6> variable:fetch(X).
43

And, since these variables are actors, they are just as easy to share as my sock machine example.
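
To drive the point home, here's a sketch of the classic lost-update race built on that module. The functions increment and race are my names for illustration; nothing below is special Erlang magic.

%% Sketch: a read-modify-write race on the shared variable.
%% Between one process's fetch and its store another process
%% can sneak in, so increments get lost - exactly the same
%% hazard as "foo.x = foo.x + 1" with threads.
increment(Variable) ->
   Value = variable:fetch(Variable),       % read
   variable:store(Variable, Value + 1).    % write, not atomic with the read

race() ->
   X = variable:malloc(),
   [spawn(fun() -> increment(X) end) || _ <- lists:seq(1, 100)],
   timer:sleep(100),        % crude: give the spawned processes time to finish
   variable:fetch(X).       % may well be less than 100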

Monday, March 16, 2009

Operator Overloading Ad Absurdum

Bruce Eckel's recent blog post on The Positive Legacy of C++ and Java has opened a small can of worms on the Internet. The argument is over operator overloading[1]: tool of Satan or road to bliss? The arguing that's resulted is the usual well-thought-out, reasoned debate I've come to expect from software engineers. Which is to say, not at all.

The Argument Against

Eckel mentions issues with resource management as the primary reason that operator overloading was seen as a bad thing in C++, and therefore shouldn't be seen as bad in managed environments. But the consensus in the debates is that that's not the real issue, and Gosling's own writing indicates it wasn't why he dropped the feature from Java.

The real arguments against operator overloading boil down to abuse of a feature. Even the most ardent anti-operator folks recognize that it is occasionally a good thing to name something +. Their argument against it tends to look something like this.

The problem is abuse. Somebody will name something '+' when it has nothing to do with the common notion of '+'. The resulting confusion is a bigger downside than the benefits of allowing programmers to be flexible in naming.

Phrased that way it sounds like a reasoned trade-off. Yeah, sometimes + is good, but it can lead to more confusion than it helps resolve, so let's scrap the whole idea. But watch as I do a little search and replace.

Reductio

The problem is abuse. Somebody will name something 'plus' when it has nothing to do with the common notion of 'plus'.

I've sneakily dropped the last clause but I'll get back to it. Without it, and with the textual replacement, my new statement is obviously true. Naming a function or method "plus" when it doesn't do anything like "plus" is a recipe for throwing people right off the track. I can do it one more time and get another true statement.

The problem is abuse. Somebody will name something 'getName' when it has nothing to do with the common notion of 'getName'.

The statement is still true. The convention in Java is that getName should just return the value of the name property with minimal side effects. But I've seen Java methods named getName used to make permanent changes to a shared database. Certainly you can imagine other ways that getName (or get_name or whatever) is entirely the wrong name for something in any other language - like if it adds numbers.

What's that leave us with? If I can seemingly search and replace with anything and still have a true statement then I must have a bit of universal wisdom here.

The problem is abuse. For all valid names X that evoke a common conception, somebody will name something 'X' when it has nothing to do with the common notion of 'X'.

Ad Absurdum

Earlier I removed one phrase from the argument and now I want to stick it back in to the universal wisdom.

The problem is abuse. For all valid names X that evoke a common conception, somebody will name something 'X' when it has nothing to do with the common notion of 'X'. The resulting confusion is a bigger downside than the benefits of allowing programmers to be flexible in naming.

In other words, programmers simply shouldn't be allowed to name things - or at least shouldn't be allowed to use words found in a dictionary.

Where's the Flaw in the Argument?

By following the structure of the original argument, and with the common observation that misnamed things cause confusion, I came to the conclusion that we can't let programmers name things at all - or at least can't let them use common words. Good luck trying to figure out a useful process where programmers don't name things or always use gibberish. In the meantime, I'm just going to call the result absurd, which leads to the conclusion that the original argument is flawed. To see the problem with the original argument, let me make a few observations.

  • Programmers with good taste try to name things reasonably.
  • Most teams have a coding guideline.
  • Smart teams have at least a partial code review process to ensure general code quality.

There is a downside to inappropriate naming, but smart teams and individuals already work to mitigate that problem. So the downsides of abuse don't outweigh the benefits of flexible naming, because we keep an eye out for abuse in our own code and in others'.

This isn't an argument for always trusting programmers with all power. For instance, garbage collection really does take away a huge source of error created by manual memory management. What this is is an argument that operator overloading doesn't change in any fundamental way the core problem of misnaming things, nor does it add additional burden to the solutions we already have. If you concede that occasionally + is a useful name (e.g. for complex numbers, matrices, etc.) then the only reasonable conclusion is that programmers should be allowed to use it as a name when it makes sense and strongly discouraged from using it when it doesn't - exactly the same guidelines as for any other name.

On the other hand, teams without guidelines, reviews, or good programmers are screwed with or without operator overloading. They've got problems that cannot be fixed at the language level.

Footnotes

[1] Throughout this post I've used the common phrase "operator overloading." In a strict pedantic sense that's not really what happens in Scala and Haskell. Those languages simply don't have many operators in the way that most C derived languages do. Instead, they have a very flexible grammar for naming things. In Scala, 2 + 3 is the same as sending a + message to a 2 object with 3 as the argument, 2.+(3), and in Haskell, 2 + 3 is the same as calling the + function with 2 and 3 as arguments, (+) 2 3. Either approach is actually conceptually much simpler than the C++ approach of having a few special operators plus rules for overloading them. None-the-less, by "operator overloading" I mean Scala and Haskell style flexibility as well as C++'s rigid rules or any other mechanism that lets users define the meaning of symbols like + for their own types.

Monday, August 13, 2007

You Might Be a Blub Neck

In Beating the Averages, Paul Graham formulated the Blub paradox. In short, a programmer who only knows a language called Blub looks down on all languages that don't support features that Blub has. At the same time, he[1] is incapable of understanding why he would want to use some weird language that has features that Blub doesn't have. The Blub programmer is so used to thinking in Blub that he can't get his head around anything non-Blub. The question is, how do you know if you or somebody else is a Blub programmer? Of course it goes without saying that anybody who reads this blog is kind, highly intelligent, open minded, and motivated to learn.

The comedian Jeff Foxworthy had about five minutes of fame with a schtick of starting each joke with "you might be a redneck if..." Five minutes is a lot more fame than the average blog entry gets, so I thought I'd steal his formula for success.

Here, with a slight modification of the formula, I present

That Jerk on the Internet Might be a Blub Programmer If...

He dismisses other language features as "mere syntactic sugar"

Of course it's syntactic sugar. Anything above raw machine code is syntactic sugar. Syntactic sugar is what helps programmers think and work at higher levels. Look at it this way: C and C++ have "goto" and "if". But they also have "while", which is really just syntactic sugar for the right combination of "goto" and "if". C, C++, C# and Java have "for", but it's just syntactic sugar for a certain kind of "while" loop. C++ has object oriented programming, but C++ style OO can be created in C using the right combination of structs and pointers.

In other language families I could go on and on about list comprehensions, pattern matching, type inference, regular expressions, function composition, currying, etc. But one in particular deserves mention: Lisp's most powerful feature is all about syntactic sugar. Lisp macro facilities (including quasiquotes, @ splices, etc.) are syntactic sugar that make it easy for a programmer to create new syntactic sugar. If you understand macros then you'll think they're damned cool or damned scary (or both), but you won't dismiss them as "mere" anything. In On Lisp, Paul Graham creates an entire embedded Prolog in Common Lisp. Or take a look at Qi. It creates an incredibly powerful, statically typed, Haskell like language on top of the dynamically typed Common Lisp using a whole lot of macro black magic.[2]

He dismisses most examples as "toys" that don't show a feature is useful

Face it, most teaching examples are going to be toys. Real world code is frequently complicated by error conditions, domain specific logic, and adherence to proprietary code standards. None of those things helps demonstrate the feature at hand, so examples tend to be small and focused on something that most people are familiar with or can learn in a few minutes. Refusing to try to grok a feature because the example given isn't immediately pertinent to you is a failure of imagination.

He questions why "anybody would want to do that in the real world"

The phrase "real world" is highly loaded. It depends entirely on the domains you have experience in. You might not see why C's unsafe casts and unions are useful but let me assure you that for extremely low level performance critical code they can be critical. Haskell's concept of monads might seem arbitrary and weird, but monads have been used to statically prove significant aspects of real programs. And Ruby's dynamic metaprogramming facilities really do help create an environment that is extremely productive in non-trivial systems.

He says his favorite language "can do that" and then creates a giant ball of crufty boilerplate to prove it

Most programming languages are Turing complete. All such languages can perform the same computations somehow. But if you have to create a ball of cruft every time you want to do some particular action, how often are you going to use that feature? If you had to recreate all of number theory every time you wanted to multiply two numbers, wouldn't you stop tipping waiters?

Again, Lisp is special here. Any feature in another language can be implemented in Lisp using a bit of macro cruft; the difference is that in Lisp the cruft only has to be created in one place. Besides that, macros can be surprisingly uncrufty - most look very much like the syntax they're trying to generate.[3]

He denies that another language is more powerful than his because "they are all Turing complete"

Hmmm, is he writing raw machine code? If not, he's implicitly admitting that some languages are more "powerful" in some sense of the word "powerful". Powerful is a surprisingly tricky concept to pin down, but because our Blub programmer has chosen Blub and not machine code he must have some definition beyond the computational theoretic meaning. Either that or he's splitting hairs about "powerful" vs "expressive" or some other term.

He dismisses language feature because the syntax isn't "intuitive"

If you think blocks must be denoted using curly braces, then you're going to miss out on a very rich set of languages that use some other arbitrary choice. All programming languages are artificial creations and nothing about them is "intuitive" except in the narrow sense of "similar to something you already know." For example, most imperative languages use the single equal sign (=) to mean (re)assignment. It might seem "intuitive," but what's intuitive about using a symmetrical symbol for a fundamentally asymmetrical operation? Before programming languages stole it, that symbol already had a deep history and meaning in mathematics that is completely incompatible. Frankly, Scheme's "set!" and Smalltalk/Pascal's ":=" seem like more reasonable choices, and what could be more intuitive than "<-"? But try getting any of them past the Blub lawyers.

This isn't to deny the utility of working on syntax when adding a feature to an existing language. The sense of "intuitive" given above definitely applies here. It's just to say that dismissing a feature in another language because it has a different syntax is pure parochialism.

Help Stamp Out Blub Programmers

This post has been unusually strident for this blog. I mostly intend to present my thoughts with more gentleness and good humor. In a sense, though, this entry is all about preventative medicine. In my future posts I intend to talk about several languages and features. In the process I expect to get many poorly thought-out comments from Blub programmers. It's my hope that this post will at least slow some of them down to think, "if I write this, will everybody think I'm that jerk on the Internet?"

It probably won't work, but that's never stopped me before.

Before I close this out, there are a few things that do not make you a Blub programmer:

You probably aren't a Blub programmer if...

You can admit ignorance

While ignorance is a key ingredient for Blubness, it's not the only ingredient. The Blub programmer is willfully ignorant, actively pushing back against anything new. But ignorance itself is just a temporary condition if it's joined with an open mind.

You've tried something and just don't like it

Hey, not every feature is all the hype promises it to be. Not liking something that you have tried isn't the same as not liking something you never want to try. Just be open to the possibility that the feature might work well in some other environment, in another language, or with a better implementation.

Footnotes

1. The usual apologies about using "he" instead of "he or she." One Div Zero officially recognizes that women have full and equal rights to be jerks and Blub programmers.

2. Liskell does the opposite. It puts a Lisp like syntax, including macros, on top of Haskell.

3. One complaint that even thoughtful people have leveled against both Common Lisp and Scheme is that they seem more like Language Development Kits than languages. In part this impression is created by the fact that the standard definitions for both are getting a bit crusty with age. Sure, you can build a nice list comprehension macro, but wouldn't it be nice if a standard one came in the box? The counterargument is that the language development kit allows you to move your environment into domains that are too specialized for a standard.