Sentinels in JavaScript

Can you find the substring?

In older procedural languages, the return values that came back from a function were restricted. If you said you were going to return a number, you returned a number. If you wanted to sometimes return a number, but other times return an indication of failure, you resorted to what is known as a “sentinel value” to return the failure.

A sentinel value is a number that doesn’t represent the answer to the problem that the function was asked to solve, but instead flags the caller to the fact that something rather out-of-the-ordinary has happened. For example, a value of -1 can indicate the end of a file being read in.

The problem with a sentinel is that it means nothing special to the language. The programmer has to keep in mind that the sentinel exists and that it has to be handled on every call that could possibly generate the sentinel value. Programmers are notorious for paying attention to and handling sentinels only when they crash a program.

Also, the programmer has to be careful in choosing the value. Is it 0? -1? 99? 255? 999? -999? MAXINT? The choice can bite you if you’ve misunderstood the possible values that can be generated in your function. I regret to inform you that I once wrote a BASIC program that had three different sentinel values returned from three different subroutines!

Sentinel values sound old and busted, don’t they? Their usage fell a bit once we could pass around pointers to data structures. With a little more elbow room, we could put a “success” boolean right up front in the data structure and do away with a lot of sentinels.

They Live!

But sentinels are still around. In JavaScript, the string method “indexOf” returns a -1 if the needle (substring) can’t be found in the haystack (the string to be searched).
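Here is a minimal sketch of that sentinel in action, and of what happens when the caller forgets about it. The two helpers, userPartUnchecked and userPartChecked, are hypothetical names made up for this illustration; only indexOf and slice are real string methods.

```javascript
// Hypothetical helper: returns everything before the first "@".
// It forgets that indexOf returns -1 when there is no "@".
function userPartUnchecked(address) {
    var at = address.indexOf("@");
    return address.slice(0, at); // slice(0, -1) quietly drops the last character
}

// The same helper with the sentinel handled explicitly.
function userPartChecked(address) {
    var at = address.indexOf("@");
    if (at === -1) {
        return null; // no "@" in the address
    }
    return address.slice(0, at);
}

console.log("haystack".indexOf("st"));               // 3
console.log("haystack".indexOf("zz"));               // -1
console.log(userPartUnchecked("alice@example.com")); // "alice"
console.log(userPartUnchecked("alice"));             // "alic" (wrong, and no error)
console.log(userPartChecked("alice"));               // null
```

Notice that the unchecked version doesn't crash. It hands back a plausible-looking wrong answer, which is exactly why an unhandled sentinel can go unnoticed for a long time.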

Let’s look at a few different ways we can deal with that awkward -1.

function wrappedIndexOf(needle,haystack) {
    var res=haystack.indexOf(needle);
    if (res===-1) {
        res=false;
    }
    return res;
}

And you call it like this:

var position;
var res=wrappedIndexOf(needle,haystack);
if (res!==false) {
    position=res;
}

We haven’t done much here but replace the need to check explicitly for -1 with a need to check explicitly for false. Too weird to sometimes return a number and sometimes return a boolean? Yeah, probably. I like it a bit better than the numeric sentinel because the calling code makes the exception check a bit more obvious. Because JavaScript functions can return just about anything (including wild things like anonymous functions), you always need to think about what can come out of a function, so what’s important is a consistent convention.

Similarly, you could return null or undefined.
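A sketch of the null flavor might look like the following. The name indexOfOrNull is an assumption for this example, not something JavaScript provides.

```javascript
// Hypothetical variant of the wrapper that returns null when the needle is absent.
function indexOfOrNull(needle, haystack) {
    var res = haystack.indexOf(needle);
    return res === -1 ? null : res;
}

var found = indexOfOrNull("s", "sentinel");   // 0
var missing = indexOfOrNull("z", "sentinel"); // null

// Compare against null explicitly. A match at index 0 is falsy,
// so a bare `if (found)` would wrongly treat it as a miss.
console.log(found !== null);   // true
console.log(missing !== null); // false
console.log(Boolean(found));   // false, even though the needle was found
```

Whichever stand-in you pick (false, null, or undefined), the explicit comparison matters, because 0 is a perfectly good index.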

But let’s move on to another solution: returning an object.

function wrappedIndexOf(needle,haystack) {
    var res=haystack.indexOf(needle);
    if (res===-1) {
        res={"success":false,"value":-1};
    } else {
        res={"success":true,"value":res};
    }
    return res;
}

And you call it like this:

var position;
var obj=wrappedIndexOf(needle,haystack);
if (obj.success) {
    position=obj.value;
}

Note that I’ve left the -1 in obj.value, so you can still use that as a sentinel if you like.

What makes JavaScript really good for this task? Its object literals. You don’t need to have a structure or class around to hold the extra info. You just build the object on the fly and return it.

~, !, +, & –

Does that headline look like a bunch of cussing to you? I was hoping it would be offensive.

I’m also looking forward to seeing how it’s covered by the tech aggregators like Digg and Reddit. And I want to see if perhaps people land here as the result of searching for a bunch of punctuation.

Mostly I wanted to talk about JavaScript types, and how you can convert from boolean to numeric and vice versa.

This is going to be a pretty freeform discussion. If you get bored, jump right to the end: there’s a trick there that’s pretty useful.

Big Mistake. Big, Big Mistake

Douglas Crockford has often mentioned that the dual usage for the plus symbol is a mistake in JavaScript. He’s right. It’s used both for numeric addition and for string concatenation. This trips programmers up all the time and there’s just no excuse for it. No upside. Another symbol should have been used for string concatenation.
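To make the trap concrete, here is a small sketch. The variable fromForm is a made-up stand-in for any value that arrives as a string, such as the contents of a text field.

```javascript
var fromForm = "2"; // assumed to come from a text field, so it is a string

var sum = 1 + 2;               // 3: both operands are numbers
var glued = 1 + fromForm;      // "12": one operand is a string, so + concatenates
var gluedToo = fromForm + 1;   // "21"
var difference = 3 - fromForm; // 1: minus has no string meaning, so it converts

console.log(sum, glued, gluedToo, difference);
```

The minus line is what makes the plus behavior feel so arbitrary: the same string quietly becomes a number for one operator and not for the other.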

The common solution to this is to wait until you get tripped up, find which term is being mistaken for a string, and then use one of the functions that JavaScript provides that converts strings to numbers. I prefer to use yet another usage of the plus symbol to do that work. I use the unary version of plus.

For example:

num = 1 + +"2"; //num is assigned the value 3

See the space between the plus signs? Yeah, that’s critical. You don’t want to mistake those two pluses for the symbol ++, the increment operator.

A unary minus works too, if you happen to want the negated version.

num = 1 - -"2"; //num is assigned the value 3, again
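Unary plus isn't forgiving about what it converts, so here is a quick sketch of how it treats a few awkward strings, next to Number() and parseInt() for comparison. The sample strings are my own picks for illustration.

```javascript
var plain = +"2";                  // 2
var padded = +" 42 ";              // 42: surrounding whitespace is ignored
var empty = +"";                   // 0: the empty string converts to zero
var withUnit = +"12px";            // NaN: the whole string must be numeric
var viaNumber = Number("12px");    // NaN: Number() follows the same rule
var viaParse = parseInt("12px", 10); // 12: parseInt stops at the first non-digit

console.log(plain, padded, empty, withUnit, viaNumber, viaParse);
```

So unary plus is a compact spelling of Number(), not of parseInt(). If your strings might carry trailing junk, the two will disagree.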

There’s another cool use for the plus sign. It’ll turn true and false into numbers.

num=+true; // num is 1
num=+false; //num is 0
num=-true; // num is -1
num=-false; //num is -0 (which === 0)

So we turn true and false into 1 and 0 with unary plus. How do we turn 1 and 0 into true and false? One way is with the bang (exclamation point). In this case, we actually use a double-bang or “bang bang.” (Google “bang bang cher” and find out that long before William Hung banged, Cher banged.)

bool = !!1; //bool is true
bool = !!0; //bool is false

The first bang casts from the number to a Boolean, and the second undoes the logical not that was performed by the first bang.
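The double bang works on more than 1 and 0. As a rough sketch, here it is applied to a handful of values I picked for illustration, alongside a check that it agrees with the built-in Boolean() function.

```javascript
var samples = [1, 0, -1, "", "0", NaN, null, undefined, [], {}];

// Convert each sample with the double bang.
var results = samples.map(function (v) { return !!v; });
console.log(results);
// [true, false, true, false, true, false, false, false, true, true]

// !!v and Boolean(v) give the same answer for every sample.
var matchesBoolean = samples.every(function (v) { return !!v === Boolean(v); });
console.log(matchesBoolean); // true
```

The surprises are "0" (a non-empty string, so true) and the empty array and object (also true).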

You still following? How about a curve ball?

num = 1 + ~~"2"; //what the hell is that?

Welcome to the tilde. The result is 3 again, but what’s with the squiggles? The squiggle, of course, is the tilde operator. JavaScript programmers learn it and then promptly forget about it. But it plays a part in one of my favorite JavaScript tricks. We’ll see that right at the end. For now, let’s look at what it does to some values.

console.log(~-2); //1
console.log(~-1); //0
console.log(~0); //-1
console.log(~1); //-2
console.log(~2); //-3
console.log(~true); //-2
console.log(~false); //-1

As you can see, ~ is doing -(N+1). For fun, let’s throw a unary minus into the mix.

console.log(-~-2); //-1
console.log(-~-1); //-0 (which === 0)
console.log(-~0); //1
console.log(-~1); //2
console.log(-~2); //3
console.log(-~true); //2
console.log(-~false); //1

So -~N is the same as N+1.
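Here is a sketch that checks both identities over a small range, and also pokes at the double tilde from the curve ball. One caveat worth labeling: the tilde converts its operand to a 32-bit integer first, so these identities only hold for whole numbers in that range. The range of -5 to 5 and the sample values are arbitrary choices of mine.

```javascript
// Check ~n === -(n + 1) and -~n === n + 1 for a few small integers.
var identitiesHold = true;
for (var n = -5; n <= 5; n++) {
    if (~n !== -(n + 1) || -~n !== n + 1) {
        identitiesHold = false;
    }
}
console.log(identitiesHold); // true

// Two tildes cancel out, leaving only the conversion to a 32-bit integer.
console.log(~~"2");   // 2
console.log(~~3.7);   // 3: the fraction is dropped
console.log(~~-3.7);  // -3: truncates toward zero, unlike Math.floor
console.log(~~"abc"); // 0: NaN becomes 0

// Outside 32-bit integers the arithmetic reading breaks down.
console.log(~3.7);         // -4, not -4.7
console.log(~~2147483648); // -2147483648: the value wraps around
```

That wrap-around doesn't matter for index values, but it is a good reason not to lean on ~~ as a general-purpose number converter.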

Now let’s look at the bang symbol on its own.

console.log(!-2); //false
console.log(!-1); //false
console.log(!0); //true
console.log(!1); //false
console.log(!2); //false
console.log(!true); //false
console.log(!false); //true

And for fun, let’s put the bang and the tilde together…

console.log(!~-2); //false
console.log(!~-1); //true
console.log(!~0); //false
console.log(!~1); //false
console.log(!~2); //false
console.log(!~true); //false
console.log(!~false); //false

Now that’s interesting. Only the value -1 makes it out alive with a true value. And come to think of it, there’s something special about -1. It’s a common sentinel value returned by functions that return indexes (JavaScript’s indexOf() is a great example, but you may have your own functions that return -1 as a sentinel value).

WTF?

What am I talking about? Suppose you want to know where the letter “s” is in a string. Well, you either get a number back that tells you where it is, or you get a -1 back telling you, hey buddy, no “s” in this string. So you end up with code like…

if (index>=0) {
 //found it
} else {
 //didn't
}

or

if (index>-1) {
 //found it
} else {
 //didn't
}

or

if (index!=-1) {
 //found it
} else {
 //didn't
}

But with what we now know about tilde, we can just do this… (remember, !~index is true only when index is -1, which is the not-found case)

if (!~index) {
 //didn't find it
} else {
 //found it
}

Or, finally, we can turn the logic around and do this…

if (~index) {
 //found it
} else {
 //didn't
}
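Wrapped up as a function, the trick might look like this sketch. The name contains is hypothetical; the only real API used is indexOf, which exists on both strings and arrays.

```javascript
// Hypothetical helper: true when needle occurs anywhere in haystack.
// ~index is 0 only for index === -1, so !!~index means "index is not -1".
function contains(haystack, needle) {
    return !!~haystack.indexOf(needle);
}

console.log(contains("sentinel", "s"));   // true: found at index 0
console.log(contains("sentinel", "t"));   // true
console.log(contains("sentinel", "z"));   // false
console.log(contains([10, 20, 30], 20));  // true: arrays have indexOf too
console.log(contains([10, 20, 30], 99));  // false
```

The index 0 case is the one to watch: a plain truthiness test on the index would call that a miss, while the tilde version gets it right. Newer engines also offer an includes method on strings and arrays that answers the same question without the punctuation.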

Credit

I’ve seen this trick a few times before. Add a comment to this post if you can help me find other places this has been mentioned.

Web Reflections