Sunday, August 9, 2009

Dot Notation Redux: Google's Style Guide

Before I get into this post, let me make a few things absolutely clear. I do not want my intentions misunderstood.

  • When coding for yourself, do what feels right to you. If you don't like dot notation, don't use it, and don't feel like you should apologize for not using it.
  • When coding for a client or employer who has a coding style guide or other published coding conventions, use those, even if they disagree with your personal opinion of the "right" way to code. In a group programming environment, consistency is extremely valuable.
My goal here is not to tell you that you must or should use dot notation; it is only to refute the idea that dot notation shouldn't have been added to the language and that it inherently makes code harder to read.

My illustrious writing partner, Dave Mark, tweeted today about the Google Objective-C Style Guide's argument against using dot notation in Objective-C, which reads as follows:
  1. Dot notation is purely syntactic sugar for standard method calls, whose readability gains are debatable. It just gives you another way to make method calls.

  2. It obscures the type that you are dereferencing. When one sees:
    [foo setBar:1]
    it is immediately clear that you are working with an Objective-C object. When one sees
    foo.bar = 1
    it is not clear if foo is an object, or a struct/union/C++ class.
  3. It allows you to do method calls that look like getters.
    NSString *upperCase = @"foo".uppercaseString;
    which is not only confusing, but difficult to spot in a code review.
  4. It hides method calls.
    bar.value += 10;
    is actually two separate method calls (one to set and one to get) and if your properties are not simple you may find a lot of work being done in a hidden manner.
As you read through these, they sound rather logical, and possibly even compelling. But in reality, they are not logical at all. In fact, the whole argument is basically one series of logical fallacies. Let's look at the specific arguments in order and then put the pieces together at the end.

The First Argument: the Non-Argument

Dot notation is purely syntactic sugar for standard method calls, whose readability gains are debatable. It just gives you another way to make method calls.
This first "argument" contains no actual argument against the use of dot notation. The first part of the first sentence is taken almost verbatim from The Objective-C 2.0 Programming Language on Apple's site and is just a restatement (out of context) of how dot notation is implemented.

The second half of the first sentence is an attempt to discount one of the benefits of dot notation by simply dismissing it offhand without evidence or support.

The second sentence is simply an attempt to bolster the arguments that follow by trivializing dot notation as "just" something we can already do. It's sort of like saying that jet engines do not add value over propellers because they're "just" another way to create thrust. Every construct in any programming language that's higher-level than assembly is "just" another way to do something we can already do. This sentence has no semantic or logical value; it's simply here to set a negative tone toward the use of dot notation without actually offering any facts or reasons not to use it. This first "argument" is rhetoric, nothing more.

The Second Argument: the Invalid Assumption

It obscures the type that you are dereferencing.
This argument brings to mind the arguments for Hungarian Notation. The argument for Hungarian Notation is that when you look at a variable, you know right away what it is. By prefixing every variable with a jumble of individual letters, each with its own meaning, you know (in theory) all that there is to know about that variable just by glancing at it.

In reality, you don't see much Hungarian Notation these days. Variables with semantic meaning - those that use words that are recognizable by and have meaning to the brain - work much better. We may not know the variable's specific type, but we know what purpose it serves, which is really more important.

Dot notation doesn't "obscure" the type you are dereferencing unless there's some reason why you would know the type from looking at just that line of code. This argument assumes that we already know, and should know, what type foo is. Sure, with bracket notation, we know we're dealing with an object, but we don't know what kind of object it is from looking at this one line of code in a vacuum.

But, when do you ever look at a line of code in a vacuum? You don't. Code has no meaning taken out of context. If it were vital that we know everything about a variable from looking at it out of context, then we'd all be using Hungarian Notation. Yet we're not.

Somewhere, probably not more than a few lines above
        foo.bar = 1
is either a declaration of foo or the start of the method. If you're confused about the type, generally scrolling up a few lines can resolve that confusion. If that doesn't work (because it's an instance or global variable, for example), command-double-clicking on it will take you to its declaration and then you'll know its type.
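
For example (Foo is a hypothetical class here), the declaration is rarely far from the access, and the same goes for struct variables:
        Foo *foo = [[Foo alloc] init];       // foo is clearly an object...
        foo.bar = 1;                         // ...so this is clearly a property access

        NSRange range = NSMakeRange(0, 10);  // range is clearly a struct...
        range.length = 5;                    // ...so this is clearly a struct member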

You can't obscure something that you don't have a reason to know. The amount of information that bracket notation gives us over dot notation is trivial and not enough to make an informed decision about what the code is doing anyway, so you have to consider its context. If it's not your code, you have to look at the preceding code to understand it anyway.

The Third Argument: the Red Herring

It allows you to do method calls that look like getters.
Allows? This argument is that it's bad because it "allows" you to do something? And what it allows you to do is create method calls that look like getters? What are getters? They are a kind of method, right? Am I missing something here?

Any programming language, to be useful, has to allow some kinds of bad code. I doubt it's possible to create a programming language that doesn't "allow" an inexperienced programmer to do all sorts of completely horrible things. I could come up with dozens of examples of ways that Objective-C 1.0 "allows" you to do bad things. This isn't an argument, it's a one-line example of bad code that's being passed off as an argument. It's disingenuous because nothing prevents you from creating methods that look like getters but aren't, even without dot notation. There's no language-level constraint on that in Objective-C, and no compile-time check for it regardless of whether dot notation is used. Dot notation changes this in no way whatsoever.
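
As a purely hypothetical illustration (the class, method, and file path are made up), here's a method that reads like a getter but does real work, and it does so whether it's called with brackets or with a dot:
        @interface ReportGenerator : NSObject
        - (NSString *)summary;   // reads like a simple getter...
        @end

        @implementation ReportGenerator
        - (NSString *)summary {
            // ...but hits the disk every time it's called.
            return [NSString stringWithContentsOfFile:@"/tmp/report.txt"
                                             encoding:NSUTF8StringEncoding
                                                error:NULL];
        }
        @end

        ReportGenerator *generator = [[ReportGenerator alloc] init];
        NSString *a = [generator summary];   // bracket notation: no hint about the work
        NSString *b = generator.summary;     // dot notation: same method, same work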

I actually find it hard to believe that an experienced Objective-C programmer would even attempt this argument because, frankly, it sounds like an argument you'd get from a C++ programmer. Objective-C is a permissive language. It's in Objective-C's DNA to let you do things. It's weakly typed and handles at runtime many things that are handled at compile-time in C++ (and all other OO languages based on the Simula object model). These are intentional design decisions. This language is designed to give you a lot of flexibility and puts trust in the developer that you'll use its features appropriately. Objective-C's dot notation doesn't run contrary to that in the slightest. In fact, it's a logical extension of that underlying philosophy. They're faulting dot notation for something that's inherent in Objective-C.

The Fourth Argument: Missing the Point

It hides method calls.
Why yes, yes it does. The sample line of code supporting this "argument"
        bar.value += 10;
will result in exactly the expected behavior if you're using dot notation to represent object state. If the value and setValue: methods are something other than an accessor/mutator pair, then it is true that this line of code could cause unexpected results, but the fault for that lies not with dot notation, but rather with a programmer who made extremely poor method naming choices, essentially lying to the world about their methods by not respecting the naming convention for accessors and mutators. Under this scenario, you'd have exactly the same problem with this line of code that doesn't use dot notation:
        [bar setValue:[bar value] + 10];
In other words, this argument is only a problem when somebody does bad things with their code, and it's just as much of a problem when not using dot notation.
Whoops! It was pointed out in the comments that I sorta missed the point on this one, and that the "problem" is that there are two method calls when someone who didn't understand dot notation might reasonably think there was only one. My response to that is: so what? How is it a problem if the result is correct? The code used to illustrate the problem will achieve the results that you should reasonably expect. After the line of code, value will be correct. The fact that there are two messages sent and not one will not matter in the vast, vast majority of situations. What counts is that the result is correct, and in the example, it would be, assuming the accessor and mutator are correctly implemented. If you're having performance problems and determine through profiling that it's caused by the extra message, then you optimize by implementing a method that increments the value in just one call. It's still a non-issue.
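
If profiling ever did show that the extra message send mattered, the fix is a one-off method, not abandoning dot notation. A hypothetical sketch (Counter and incrementValueBy: aren't from any framework):
        @interface Counter : NSObject {
            NSInteger value;
        }
        @property (nonatomic, assign) NSInteger value;
        - (void)incrementValueBy:(NSInteger)amount;
        @end

        @implementation Counter
        @synthesize value;

        - (void)incrementValueBy:(NSInteger)amount {
            value += amount;   // touches the ivar directly
        }
        @end

        Counter *counter = [[Counter alloc] init];
        counter.value += 10;             // two message sends: -value, then -setValue:
        [counter incrementValueBy:10];   // one message send, same resulting state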

Illusory Arguments

The law has an interesting concept called an illusory promise (or condition), which is a promise that really isn't a promise at all. It's something that looks like a promise, and is couched in the language of a promise, but which simply isn't a promise.

These arguments against dot notation in Google's Objective-C Style Guide are illusory arguments. The first one isn't an argument at all. The second rests on assumptions that are provably untrue (that you know what type a variable is from looking at just its line of code). The remaining two are predicated on a programmer doing something wrong and can both be demonstrated just as easily without using dot notation.

Google makes the case that dot notation is bad because it can result in confusing code when a developer pays no attention to established naming conventions or makes really poor design choices. But these problems have nothing to do with dot notation. Poorly written code is poorly written code. The simple fact of the matter is, if you're trying to read code like that, nothing is going to help. With or without dot notation, the code will be hard to read because it's bad. The correct solution in that situation is to fire or train the developer who wrote the offending code.

How I Use Dot Notation


But there are ways in which dot notation can be used to make code more readable. The way I use it (picked up from Apple's sample code) is to use properties and dot notation to represent object state, and bracket notation when calling methods that represent behavior or trigger actions. In fact, it could be argued that using bracket notation for both state and behavior has at least as much potential for causing confusion as using dot notation does. Take this line of code, for example:
        [foo setIsOn:YES];
Am I setting state or triggering behavior? It could be either. It could be both. To know for sure, I have to check the documentation for the method being called. If, on the other hand, I've used dot notation and properties to separate out state from behavior, it's instantly understood that
        foo.isOn = YES;
is setting state, but
        [foo turnOn];
is triggering behavior (which may, of course, affect state). Yes, this requires consistent design choices, but all good coding does. If you throw that out as a requirement, you can argue that anything is bad.
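
Here's a sketch of that convention with a hypothetical Lamp class: state lives in properties and is touched with dot notation, behavior lives in plain methods and is invoked with brackets:
        @interface Lamp : NSObject {
            BOOL isOn;
            float brightness;
        }
        // State: exposed as properties, accessed with dot notation.
        @property (nonatomic, assign) BOOL isOn;
        @property (nonatomic, assign) float brightness;

        // Behavior: plain methods, called with bracket notation.
        - (void)turnOn;
        - (void)turnOff;
        @end

        Lamp *lamp = [[Lamp alloc] init];
        lamp.brightness = 0.5f;   // reading this, I know I'm just setting state
        [lamp turnOn];            // reading this, I know something is being done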

Arguing against letting people use dot notation because some people will make poor decisions is ignorant. It's basically saying "I don't like it, so you shouldn't use it", and when you state it like that, it sounds kinda silly, doesn't it?
