September 16th, 2005
12:32 pm - bits and pieces
There’s a lot to say about the trip to Turkey, but the fact that it’s a lot is getting in my way of saying anything. I’ll write of other things then: mostly Perl.
I gave a talk the other day to the local Perl group about Perl 6, the language itself. It was intended for regular people who actually use Perl 5 today and may like to use something better, if it isn’t too damn painful to switch. It was a conscious decision to aim the talk at the user, not necessarily someone who’s into language design or compiler implementation. (I know a few people who’d just be impatient with a talk that doesn’t present something new in itself—almost all of what I said can be found on the net—but you gotta choose what you want to say and take into account who you want to say it to.)
Here’s a nice language feature I didn’t mention for lack of time. You know how some languages have exceptions to signal error conditions: the code that detects the problem raises an exception and someone up the call stack who can actually handle the problem catches it. The alternative (in mainstream languages) is to have the function that detected the error return a special value which the caller must interpret as a failure, and explicitly take care of immediately. In C, there are no exceptions, so you always have to write this way.2
if (do_something() == FALSE)
    handle_error();
if (do_something_else() == FALSE)
    handle_error();
/* ... */
This is tedious, because you have to write a lot of explicit code to handle what presumably isn’t the common case of program flow: it isn’t what your code is about. Exceptions are of course meant to solve that. They also solve the problem of expressing several different kinds of errors: in the code above, handle_error doesn’t know what went wrong; but if this were rewritten with exceptions you could have the utility functions throw different kinds of exceptions and write different catch blocks for each of them.
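To be fair to the return-value style, it can say what went wrong if the return value is a status code rather than a bare TRUE/FALSE. Here is a minimal sketch of that, assuming a made-up Status enum and toy versions of do_something and do_something_else; the int parameter, the failure conditions and the describe helper are all invented for illustration.

```c
#include <stddef.h>

/* Hypothetical status codes; a real project would define its own set. */
typedef enum { STATUS_OK, STATUS_NOT_FOUND, STATUS_NO_PERMISSION } Status;

/* Toy stand-ins for the utility functions: each reports *which* error occurred. */
static Status do_something(int input)
{
    return input < 0 ? STATUS_NOT_FOUND : STATUS_OK;
}

static Status do_something_else(int input)
{
    return input == 0 ? STATUS_NO_PERMISSION : STATUS_OK;
}

/* Plays the role of handle_error, except that it is told what went wrong. */
static const char *describe(Status s)
{
    switch (s) {
    case STATUS_OK:            return "ok";
    case STATUS_NOT_FOUND:     return "file not found";
    case STATUS_NO_PERMISSION: return "permission denied";
    }
    return "unknown error";
}

/* The caller still has to check after every single call. */
static Status run(int input)
{
    Status s;
    if ((s = do_something(input)) != STATUS_OK)
        return s;
    if ((s = do_something_else(input)) != STATUS_OK)
        return s;
    return STATUS_OK;
}
```

The kind of error now survives the trip up the call chain, but the tedium is untouched: every call site grows its own check, which is exactly the part exceptions remove.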
The thing is that sometimes code written with exceptions has its own unwieldiness. One principled objection is that when you look at a piece of code, you should be able to see the flow: with exceptions you do not, because by looking at a function call, you can’t tell if it might throw something. Exceptions, the argument goes, are action at a distance, something which is usually considered a bad thing in programming.1 Less systematically, and despite the example code above, sometimes the specific circumstances are such that non-exceptional code is actually clearer on the page.
So here’s yet another holy war going on about how to design your code: you must, as the author of a module, either raise exceptions on errors or use the function return value. Your user, the client code, is forced to adopt your way. If there’s a project that uses several libraries, there’s the sad possibility that the code will have the ridiculous mixture of exceptional and non-exceptional code in the same block, not out of a literary decision (of what makes sense in context) but out of having to use two disagreeing modules. That sucks, and there’s nothing you can do about it.
In Perl 6, there is something you can do. And it’s remarkably simple to use, too. All you need as a module author is to use a new keyword where you would previously have thrown an exception or returned a failure value:
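die "file not found";     # before: raise an exception
return ENOENT;            # before: return a failure value ("file not found")
fail "file not found";    # Perl 6: one keyword, and the caller picks the behavior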
Now the wonderful part. By default, fail is synonymous with die, that is, it raises an exception. But at the caller's discretion, it can instead be made to return a false value, and populate a special error variable with the unthrown exception object. The caller can control this behavior lexically, that is, declare no fatal; at one point and have that in effect until the end of the current scope, and have the rest of the program work as before. Poof! The holy war just went away. People will still argue about what’s better, but less often, because a programmer’s decisions won’t be infectious.
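To make the idea concrete for people who think in C, here is a rough emulation. It is only a sketch under loud assumptions: none of these names come from Perl 6 or from any real library, a global flag stands in for the lexical no fatal; declaration (C has nothing lexical to offer here), and setjmp/longjmp stands in for real exceptions.

```c
#include <setjmp.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* All names below are hypothetical. */
static bool fatal_mode = true;              /* true: fail behaves like die */
static jmp_buf *current_handler = NULL;     /* where a "raised" failure lands */
static const char *last_error = NULL;       /* the special error variable */

/* The module author's single failure primitive. */
static bool fail_with(const char *message)
{
    last_error = message;
    if (fatal_mode) {
        if (current_handler == NULL)
            abort();                        /* uncaught: the program dies */
        longjmp(*current_handler, 1);       /* "raise" to the nearest handler */
    }
    return false;                           /* non-fatal: just return false */
}

/* A module function, written once against fail_with. */
static bool open_config(const char *path)
{
    if (path == NULL)
        return fail_with("file not found");
    last_error = NULL;
    return true;
}

/* Caller A prefers return values: switch fatal off for this stretch of code. */
static const char *caller_checks_return(void)
{
    bool saved = fatal_mode;
    fatal_mode = false;
    bool ok = open_config(NULL);
    fatal_mode = saved;                     /* restore, like leaving the scope */
    return ok ? "ok" : last_error;
}

/* Caller B prefers exceptions: install a handler and let failures jump to it. */
static const char *caller_catches(void)
{
    jmp_buf env;
    jmp_buf *saved = current_handler;
    current_handler = &env;
    if (setjmp(env) != 0) {                 /* the "catch" block */
        current_handler = saved;
        return last_error;
    }
    open_config(NULL);                      /* never returns normally here */
    current_handler = saved;
    return "ok";
}
```

Both callers use the same open_config without the module author having picked a side. The save-and-restore dance in each caller is what the lexical scoping of no fatal; does for you in Perl 6.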
1 I want to ask people who argue that whether they think virtual methods are also a bad idea.
2 GLib, which introduces many concepts to C, including inheritance and closures—though nobody uses the latter outside of GLib and GTK itself, they're just too weird in C—also has a way to emulate exceptions. The idea is to pass a "GError" pointer that the called code might populate with an error and immediately return; the caller code is expected to treat the GError according to a certain protocol that gives you most of the interesting semantics of exceptions. Except that the language tax is high; you have to spend a lot of time doing something essentially bureaucratic. This is exactly the kind of thing Perl 6 the language set out to minimize.
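For the curious, the shape of that protocol can be sketched in plain C. This deliberately does not use GLib: the Error struct and the set_error, parse_digit and sum_digits functions are hypothetical stand-ins that only imitate the convention of passing a pointer to an error pointer, so treat it as an illustration of the pattern rather than GLib's actual API.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for GError: a code plus a human-readable message. */
typedef struct {
    int code;
    char message[64];
} Error;

/* Populate *err if the caller asked for details (err may be NULL). */
static void set_error(Error **err, int code, const char *message)
{
    if (err == NULL)
        return;
    Error *e = malloc(sizeof *e);
    if (e == NULL)
        return;
    e->code = code;
    snprintf(e->message, sizeof e->message, "%s", message);
    *err = e;
}

/* Lowest layer: detects the problem and "throws" by filling in the error. */
static bool parse_digit(char c, int *out, Error **err)
{
    if (c < '0' || c > '9') {
        set_error(err, 1, "not a digit");
        return false;
    }
    *out = c - '0';
    return true;
}

/* Middle layer: cannot handle the error, so it passes the same pointer
 * down and returns immediately on failure. This is the bureaucratic part. */
static bool sum_digits(const char *s, int *out, Error **err)
{
    int total = 0;
    for (; *s != '\0'; s++) {
        int digit;
        if (!parse_digit(*s, &digit, err))
            return false;
        total += digit;
    }
    *out = total;
    return true;
}
```

The top-level caller starts with an Error pointer set to NULL, passes its address in, and on failure inspects and frees it. You get propagation up the stack and a typed error object, but you pay for it at every call site in between.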
Current Music: Renaissance - Turn of the Cards
Date: September 16th, 2005 02:20 pm (UTC)
What about functions that already return true or false? Like isDebugFlagSet() or something?
Just 'cause I rant about Haskell as much as you rant about Perl6, here's how they do this in Haskell.
A function effectively says in its type whether it can fail. Consider a function divide: it'll normally have type Float -> Float -> Float. I can make it a possibly-failing function by writing it like this:
Float -> Float -> Either String Float
The "Either" says its return is either a String or a Float. You create an "Either" with the functions "Left" (which creates an Either of the left sort) or "Right".
div :: Float -> Float -> Either String Float
div x 0 = Left "division by zero"
div x y = Right (x / y)
But the "Left" and "Right" clutter up the code, so it turns out that Either is a monad (of course!).
div :: Float -> Float -> Either String Float
div x 0 = throwError "division by zero"
div x y = return (x / y)
As you can see, all that means here is that "throwError" is the same thing as "Left" and "return" gets mapped to "Right". Making Either into a monad just means you get to reuse its well-known function names, like "return". (I love that return is a well-defined function in Haskell!)
In either case, the caller needs to be explicit about handling the return value and checking whether it failed. (I love this aspect of ML-style languages, too: this is the same idea as fixing the NULL pointers problem by having the types disallow you from dereferencing a NULL pointer.)
In summary, it's a pretty standard Haskell approach: make it explicit in the type, define it in terms of something already in the language, and use a monad to make the syntax pretty.
(There's more to the Either monad, too. If you try to run multiple possibly-failing operations in a row and any of them fail, it stops at the first one. It makes me think very much of GError in that it's expressed in terms of existing data structures, but Haskell is expressive enough that it's totally transparent and painless.)
(PS: They also have exceptions, but they're mostly only used for IO and seem sorta hacky -- you can't create your own exceptions, for example. I think people generally avoid them.)
Date: September 17th, 2005 12:38 am (UTC)
What about functions that already return true or false? Like isDebugFlagSet() or something?
Specifically in Perl, you could check the error object for definedness. Ugly, yes. And specifically regarding this example of a boolean test function,
returns undef, so you can test for that explicitly.
But functions like these have a semipredicate problem built into them if they can fail. Out-of-band signalling is one of the reasons exceptions were invented, I guess.
I think I remember a discussion about exceptions and type checking you had with graydon once, but I can't find it. I actually understand some of this stuff now, so it's nice to read it :)
Date: September 24th, 2005 04:03 pm (UTC)
There's more to the Either monad, too. If you try to run multiple possibly-failing operations in a row and any of them fail, it stops at the first one.
The other day, I needed effectively the opposite: I wanted to compose a monadic guard, which terminates on the first successful computation. The original code was:
findHelper (x:xs)
  | fileExists fileA = Just fileA
  | fileExists fileB = Just fileB
  | otherwise = findHelper xs
But fileExists returned
, not a pure boolean. What I ended up doing was:
findHelper (x:xs) = do
  exA <- fileExists fileA
  exB <- fileExists fileB
  case () of
    _ | exA -> return $ Just $ fileA
      | exB -> return $ Just $ fileB
      | otherwise -> findHelper xs
The syntax here is quite funky—reminds me of a Duff's Device how the guard and case are interleaved—and it suffers from the incorrectness that exA and exB are computed even when only one is really sufficient. The nice folks on #haskell
, but that scares me :-)
Another idea was to use a monadic if:
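mif :: (Monad m) => m Bool -> m b -> m b -> m b
mif test tru fls = do
  vtest <- test
  -- branch on the value bound above, not on the action itself
  if vtest then tru else fls

findHelper (x:xs) =
  mif (fileExists fileA) (return (Just fileA)) $
  mif (fileExists fileB) (return (Just fileB)) $
  findHelper xs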
But then the language syntax doesn't express what I want any more. That's hardly surprising, because the starting point was that I wanted something a little uncommon!
Well, except it should really have its return type be Monad m => Float -> Float -> m Float so I can choose whether to receive an error (Either or Maybe) or throw an exception (IO). :) Except that if you use an exception, you need to parse the string to understand it; as you said, you can't (currently) define your own exceptions. (But on the gripping hand, in reality that's also true of using Either String for errors. This is an area that could use some work.)
Date: December 24th, 2006 04:58 pm (UTC)
How about something like
{-# LANGUAGE DeriveDataTypeable #-}
import Data.Typeable (Typeable)

class Typeable a => CException a where
  asStr :: a -> String

-- A class constraint inside a type synonym needs a GHC extension, so it is left off here.
type Exceptional e v = Either e v
type ExceptionalFloat = Exceptional DivEx Float

data DivEx = DivEx deriving Typeable

instance CException DivEx where
  asStr _ = "division by zero"

-- and finally, the point:
theGreatDivide :: Float -> Float -> ExceptionalFloat
theGreatDivide _ 0 = Left DivEx
theGreatDivide a b = return $ a / b
(This almost certainly contains bogosity, I just whipped it up in a minute...)
Actually, turns out there's Control.Monad.Error which already provides for non-String error types — and the facilities we've discussed are special cases of it. But it seems to have limitations when it comes to defining one's own errors.