On The Market

ZCE (PHP5)

Whoohooo, I’m now officially a Zend Certified Engineer. Besides looking nice on my application it was actually a pretty good exercise and preparation for my upcoming job interviews. I wouldn’t say it was particularly hard but you definitely need to prepare yourself, if only to know what kind of questions to expect. And I did learn a thing or two, some more useful than others (there is a function called strspn? Really?).
One could argue that quizzing people about the behaviour of particular functions isn’t really all that meaningful. After all, that’s the kind of stuff you can look up in the docs should the need arise. But then again, this is a PHP-specific certification and not one about your general ability to program. With that in mind it actually makes sense to ask some rather pedantic questions about PHP.

I’ve thought rather long and hard about what kind of jobs I should be applying for. I’ve worked with PHP, Java, C# and C++ and although PHP definitely has its quirks and shortcomings I do have a somewhat masochistic love/hate-relationship with it. It’s just a fun language. So for that reason I’ve been looking for a job as a PHP developer and it seems that there are plenty of openings and the offers are slowly trickling in. This is a very exciting time and I believe you can make or break quite a lot with your first job. Let’s see how things turn out, wish me luck.

Bitter coffee With a little sugar.

Coming back to Java after working with PHP almost exclusively for several years was, not surprisingly, a very pleasant experience. I am very excited about the upcoming features in PHP 5.3 (closures, lambda functions, and… namespaces, finally!) and I do like PHP, but it’s most certainly no golden hammer.

One of Java’s biggest annoyances from back when I had to deal with it in college was its lack of type variables. When using ADTs you had to do tons of potentially unsafe typecasts. And it got really fucking clumsy when you wanted to use primitive types in such a container because they always expected (and returned) Objects. Given some List l and int i, you had to weed through syntactic abscesses like this:

l.add(new Integer(i));
i = ((Integer) l.get(0)).intValue();

That was about as much fun as having your intestines torn out with a hedge trimmer (and almost as messy). Then came J2SE 5.0 with autoboxing and generics, and lo and behold, once l is declared as a List<Integer> we can finally simply write

l.add(i);
i = l.get(0);

Thanks to Java’s generics, gone are all those parentheses that painfully reminded you of the unholy days of LISP. Hoo-fucking-ray.
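For the record, here is a minimal side-by-side sketch of both styles that actually compiles. The class and method names (BoxingDemo, rawRoundTrip, genericRoundTrip) are made up for illustration, and I use Integer.valueOf instead of new Integer because the constructor has since been deprecated.

```java
import java.util.ArrayList;
import java.util.List;

class BoxingDemo {
    // Pre-generics style: a raw List holds Objects, so every read needs a cast.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static int rawRoundTrip(int i) {
        List l = new ArrayList();
        l.add(Integer.valueOf(i));               // box by hand
        return ((Integer) l.get(0)).intValue();  // cast and unbox by hand
    }

    // Java 5+ style: List<Integer> plus autoboxing does both steps for us.
    static int genericRoundTrip(int i) {
        List<Integer> l = new ArrayList<Integer>();
        l.add(i);         // autoboxed to Integer
        return l.get(0);  // auto-unboxed back to int
    }

    public static void main(String[] args) {
        System.out.println(rawRoundTrip(42));
        System.out.println(genericRoundTrip(42));
    }
}
```

Note that the short form only works because the list is declared as List<Integer>. With a raw List, l.get(0) still hands you an Object.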

So far, so good.

Pairing up

The next thing everybody and their brother tries is to write a generic Pair class. By now we probably have more of those than MySpace pages *shudders*:

class Pair<A, B> {
    public A first;
    public B second;
 
    public Pair(A first, B second) {
        this.first = first;
        this.second = second;
    }
}

And suddenly you can do fancy stuff like

Pair<Integer, String> pair = new Pair<Integer, String>(42, "Foo");

so you start dancing around the Playstation in your basement because while you still don’t have any friends, at least you have a generic Pair class.


Getting equal

Alas, this will not keep you happy for long. As soon as you try to implement the infamous equals you will inevitably open a can of worms big enough to fish for the next decade.

Version 1

public boolean equals(Object other) {
    if (other == this) return true;   // from here on implicitly assumed
    if (other == null) return false;  // ditto
 
    if (!(other instanceof Pair))       // this too
        return false;
 
    Pair<?, ?> o = (Pair<?, ?>) other;  // and this as well
 
    return first.equals(o.first) && second.equals(o.second);
}

Ah, blast! Pesky null values. Let’s try this again.

Version 2

public boolean equals(Object other) {
    // the usual stuff (like above)
    boolean f_eq = (first  == null ? o.first  == null : first.equals(o.first));
    boolean s_eq = (second == null ? o.second == null : second.equals(o.second));
    return f_eq && s_eq;
}
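As an aside, newer Java versions (7 and up) ship java.util.Objects, whose equals and hash helpers do the same null juggling for you. Below is a sketch of Version 2 written that way. NullSafePair is a hypothetical stand-in name so it doesn't clash with the Pair class from this post, and the hashCode is my addition because overriding equals without it is asking for trouble.

```java
import java.util.Objects;

// Hypothetical stand-in for the Pair class discussed in the post.
class NullSafePair<A, B> {
    final A first;
    final B second;

    NullSafePair(A first, B second) {
        this.first = first;
        this.second = second;
    }

    @Override
    public boolean equals(Object other) {
        if (other == this) return true;
        if (!(other instanceof NullSafePair)) return false;  // also rejects null
        NullSafePair<?, ?> o = (NullSafePair<?, ?>) other;
        // Objects.equals(a, b) is true when both are null, false when only one is.
        return Objects.equals(first, o.first) && Objects.equals(second, o.second);
    }

    @Override
    public int hashCode() {
        return Objects.hash(first, second);  // tolerates null members as well
    }
}
```

The behaviour should match the hand-written ternaries above; it just reads a little nicer.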

And this is when you think you’re done and start hunting down some stupid, nonsensical bug and waste an entire afternoon trying to figure out what the fuck is going on. Since you are a Computer Science major you don’t even bother googling, because what fun would that be, and you’re smarter than that anyway.

Eventually you track it down to something like this:

Pair<X, Y> p1 = new Pair<X, Y>(null, null);
Pair<A, B> p2 = new Pair<A, B>(null, null);
System.out.println(p1.equals(p2)); // prints true

Crap. If everything is null, we don’t have jack shit to compare. But semantically it makes sense for two pairs to always be distinct if they have different type parameters, even if their members are equal. Back to the drawing board:

Version 3

public boolean equals(Object o) {
    // compare against null, etc.
    if (getClass() != o.getClass())
        return false;
    // compare individual elements
}

There, that should do it.

But it doesn’t.

Turns out, no matter with what type parameters our generic pair is instantiated, its class is always Pair. There is no way to distinguish the two, no matter what you try. And the reason for this is a nasty little thing called type erasure. After compilation the type information is simply gone, and nothing can bring it back. In other words:

If the individual members are equal and of the same dynamic type, then the Pair is equal because Pair<A, B>.class == Pair<X, Y>.class – always.

The reason for this is that Java doesn’t use a real template mechanism. As I understand it, the type safety of generics is solely achieved during compile time. At runtime a List is a List is a List.
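You can see erasure at work with nothing but the standard collections. The following is a small sketch (ErasureDemo and sameRuntimeClass are illustrative names of mine) that compares the runtime classes of two differently parameterized lists.

```java
import java.util.ArrayList;
import java.util.List;

class ErasureDemo {
    // Compares the runtime classes of two lists with different type arguments.
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<String>();
        List<Integer> numbers = new ArrayList<Integer>();
        // Both objects are plain java.util.ArrayList instances at runtime;
        // the <String> and <Integer> parts exist only for the compiler.
        return strings.getClass() == numbers.getClass();
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass());  // prints "true"
    }
}
```

The same thing happens to Pair: whatever you put between the angle brackets, getClass() hands back the one and only Pair class.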

Consequently type erasure rears its ugly head even when there are no null values involved:

Integer i = 42;
String s = "foo";
Pair<Integer, String> p1 = new Pair<Integer, String>(i, s);
Pair<Object, String> p2 = new Pair<Object, String>(i, s);
System.out.println(p1.equals(p2)); // prints true

In a case like this you simply can’t determine that p1 and p2 shouldn’t be equal because they have identical types and their members have equal values.

So now what?

Turns out, we’ve one more trick up our sleeve. If the type information gets lost during compilation, how about we just keep track of it ourselves?

Version 4

Let’s try this:

class Pair<A, B> {
    public A first;
    public B second;
    private Class<?> a_class;
    private Class<?> b_class;
 
    public Pair(Class<?> a_c, Class<?> b_c, A first, B second) {
        // assign accordingly
    }
 
    public boolean equals(Object other) {
        // ... (usual checks, then cast: Pair<?, ?> o = (Pair<?, ?>) other;)
        if ((a_class != o.a_class) || (b_class != o.b_class))
            return false;
        // ...
    }
}
 
Pair<Integer, String> p;
p = new Pair<Integer, String>(Integer.class, String.class, 42, "foo");
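Spelled out in full, Version 4 might look something like the sketch below. TaggedPair and the demo methods are hypothetical names (I renamed the class so it doesn't collide with the snippets above), the fields use camelCase instead of a_class, and the hashCode is an addition of mine to keep it consistent with equals.

```java
class TaggedPair<A, B> {
    final A first;
    final B second;
    private final Class<?> aClass;  // hand-carried stand-in for the erased A
    private final Class<?> bClass;  // hand-carried stand-in for the erased B

    TaggedPair(Class<?> aClass, Class<?> bClass, A first, B second) {
        this.aClass = aClass;
        this.bClass = bClass;
        this.first = first;
        this.second = second;
    }

    @Override
    public boolean equals(Object other) {
        if (other == this) return true;
        if (!(other instanceof TaggedPair)) return false;
        TaggedPair<?, ?> o = (TaggedPair<?, ?>) other;
        // Different class tokens mean "different type parameters": never equal.
        if (aClass != o.aClass || bClass != o.bClass) return false;
        boolean fEq = (first == null) ? o.first == null : first.equals(o.first);
        boolean sEq = (second == null) ? o.second == null : second.equals(o.second);
        return fEq && sEq;
    }

    @Override
    public int hashCode() {
        int h = (first == null) ? 0 : first.hashCode();
        return 31 * h + ((second == null) ? 0 : second.hashCode());
    }
}

class TaggedPairDemo {
    static boolean sameTokens() {
        TaggedPair<Integer, String> p1 = new TaggedPair<Integer, String>(Integer.class, String.class, 42, "foo");
        TaggedPair<Integer, String> p2 = new TaggedPair<Integer, String>(Integer.class, String.class, 42, "foo");
        return p1.equals(p2);
    }

    static boolean differentTokens() {
        Integer i = 42;
        TaggedPair<Integer, String> p1 = new TaggedPair<Integer, String>(Integer.class, String.class, i, "foo");
        TaggedPair<Object, String> p2 = new TaggedPair<Object, String>(Object.class, String.class, i, "foo");
        return p1.equals(p2);
    }

    static boolean nullMembersDifferentTokens() {
        TaggedPair<Integer, String> p1 = new TaggedPair<Integer, String>(Integer.class, String.class, null, null);
        TaggedPair<Object, String> p2 = new TaggedPair<Object, String>(Object.class, String.class, null, null);
        return p1.equals(p2);
    }
}
```

With the tokens in place, sameTokens() comes out true while the other two come out false, which is exactly the distinction erasure took away from us.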

It’s clumsy, but it works. Problem solved! Well, almost. You can still screw yourself over if you do something like this:

Pair<String, Integer> p;
p = new Pair<String, Integer>(Map.class, Boolean.class, null, null);

And due to limitations in the language itself, you simply can’t prevent this from happening. You can, however, make it a tad bit more foolproof by putting type constraints in the constructor:

Version 5

public Pair(Class<? super A> a_c, Class<? super B> b_c, A first, B second) {
    // ...
}

Again, this still isn’t the holy grail. But it does enforce some kind of inheritance between A and a_c. So you can’t accidentally write something like new Pair<Boolean, String>(Integer.class, String.class, false, "foo"). You could just as well use extends instead of super, but then you lose the minor benefit of the compiler keeping you from putting the parent class on the left hand side of the assignment, which is what you do all the time (think List l = new LinkedList() and such).
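To make the effect of the bounded constructor a bit more tangible, here is a sketch of what the compiler accepts and rejects. CheckedPair and CheckedPairDemo are hypothetical names, and the class is stripped down to the constructor so the bounds are the only thing going on.

```java
class CheckedPair<A, B> {
    final Class<? super A> aClass;
    final Class<? super B> bClass;
    final A first;
    final B second;

    CheckedPair(Class<? super A> aClass, Class<? super B> bClass, A first, B second) {
        this.aClass = aClass;
        this.bClass = bClass;
        this.first = first;
        this.second = second;
    }
}

class CheckedPairDemo {
    static CheckedPair<Integer, String> exactTokens() {
        return new CheckedPair<Integer, String>(Integer.class, String.class, 42, "foo");
    }

    static CheckedPair<Integer, String> parentToken() {
        // Number is a superclass of Integer, so Class<Number> fits Class<? super Integer>.
        return new CheckedPair<Integer, String>(Number.class, String.class, 42, "foo");
    }

    // Rejected at compile time: Class<Integer> is not a Class<? super Boolean>.
    // new CheckedPair<Boolean, String>(Integer.class, String.class, false, "foo");

    static CheckedPair<String, Integer> loophole() {
        // Still compiles, because Object is a supertype of everything.
        return new CheckedPair<String, Integer>(Object.class, Object.class, null, null);
    }
}
```

One caveat: the check only bites when you spell out the type arguments. With a raw new CheckedPair(...) the bounds are erased along with everything else and the compiler merely grumbles about unchecked operations.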

Conclusion

Java Generics were a huge step forward, and they are undoubtedly “good enough” for most practical purposes. You will, however, occasionally encounter situations where you long for the raw template power of languages like C++.
Granted, the approach I outlined seems a bit unnatural and is, in a way, error prone since it forces you to enter redundant data that will blow up in your face if it is inconsistent. But if you want the added semantics described above, this is about the best you can do in Java.

P.S.: I am not sure how far you would get using Reflection. In any case it isn’t something you should be doing in a low-profile all-purpose ADT. If you really wanna read how deep the rabbit hole goes, take a look at this.
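For the curious, one reflection trick I am aware of is the so-called super type token: type arguments written into a class declaration survive erasure, so an anonymous subclass can carry them along. The sketch below is only an illustration of that idea, not something the Pair class in this post uses; TypeToken and TypeTokenDemo are hypothetical names, while getGenericSuperclass and ParameterizedType are part of the standard java.lang.reflect API.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.Map;

abstract class TypeToken<T> {
    final Type type;

    protected TypeToken() {
        // For an anonymous subclass "new TypeToken<Foo>() {}" the generic
        // superclass is TypeToken<Foo>, and Foo is recorded in the class file.
        // (A raw subclass would make this cast fail with a ClassCastException.)
        ParameterizedType superType = (ParameterizedType) getClass().getGenericSuperclass();
        this.type = superType.getActualTypeArguments()[0];
    }
}

class TypeTokenDemo {
    static Type capturedType() {
        // The trailing {} creates the anonymous subclass that pins down T.
        return new TypeToken<Map<String, Integer>>() {}.type;
    }

    public static void main(String[] args) {
        System.out.println(capturedType());
    }
}
```

It works, but it costs you an extra class per call site and a lot of head-scratching, which is why I wouldn't put it into a humble Pair.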


My Pair Skeleton

Here’s a slimmed-down version of my Pair class that you can use for your own implementation, should you be interested in it:

public class Pair<X, Y> {
    private final Class<?> cx;
    private final Class<?> cy;
    public X x;
    public Y y;
 
    public Pair(Class<? super X> cx, Class<? super Y> cy, X x, Y y) {
        this.x = x;
        this.y = y;
        this.cx = cx;
        this.cy = cy;
 
        if ((cx == null) || (cy == null))
            throw new IllegalArgumentException();
        // isInstance accepts subclasses, matching the Class<? super X> bound above
        if ((x != null) && !cx.isInstance(x))
            throw new IllegalArgumentException();
        if ((y != null) && !cy.isInstance(y))
            throw new IllegalArgumentException();
    }
 
    @Override
    public int hashCode() {
        // map two hashcodes to one
        return (((x == null) ? 0 : x.hashCode()) << 16) ^ ((y == null) ? 0 : y.hashCode());
    }
 
    @Override
    public boolean equals(Object o) {
        if (this == o)
            return true;
 
        if ((o == null) || !(o instanceof Pair))
            return false;
 
        Pair<?, ?> other = (Pair<?, ?>) o;
 
        if (!cx.equals(other.cx) || !cy.equals(other.cy))
            return false;
 
        boolean eqx = x == null ? other.x == null : x.equals(other.x);
        boolean eqy = y == null ? other.y == null : y.equals(other.y);
 
        return eqx && eqy;
    }
 
    @Override
    public String toString() {
        return String.format("(%s,%s)", x, y);
    }
}

I saw assign…

..and it fucked up my code, I saw assign. Ten points for anyone who gets the pop reference, and an extra five for those who get the pun.

Take a peek at this snippet of JavaScript code and tell me why it bails out in line 7 with a nasty “foo.isBar is not a function” type error despite the explicit check against null:

var foo = {
	isBar: function() { ... }
}
 
// do stuff with foo
 
if ((foo =! null) && foo.isBar()) {
	...
}

Of course, stripped down to fewer than ten lines and with all the confusing stuff like callbacks, inheritance or asynchronous function calls removed, the error is blatantly obvious: it’s a simple typo. Instead of “foo != null” I wrote “foo =! null”, which is equivalent to “foo = !null”. That is an assignment, and it effectively throws foo away and replaces it with the value of !null (which happens to be true). And, sure enough, true.isBar() doesn’t really work all that well. Classic case of PEBCAK.

The irony in this is that just a couple of days ago I was talking to my roomie about how some people consider it good style to do boolean expressions with variables “backwards” like such:

if (500 == $x) { ... }

The argument being that it prevents errors like the one shown above since “500 = $x” blows up in your face right away with a syntax error. We basically agreed that it looks stupid and is, in essence, retarded.

Fifteen minutes of tedious debugging and a considerable amount of profanity later I am starting to think maybe those guys were onto something after all.