dagfinn | 04 January, 2009 03:24
There was a lot of disagreement on the value of code comments after my earlier post Comments considered harmful. Perhaps the most important objection that was raised was the idea that it's OK to improve the code, but it's even better to keep the comments in addition to the improved code.
As one of the critics expressed it:
No, comments are there to comment. Period.This is one of those ideas that seem obviously true in theory, but fail to work in practice. The reason is that comments get out of sync with the code. The comments rot, or rather, their meaning does. They become more and more misleading as the code gets changed and the comments are not adequately updated.
That has nothing to do with how good is your code.
You can write perfectly clean code and add good comments to it. Nothing wrong with that.
Let me give you an example. But first, I want to make it clear that I don't think code comments are always a bad thing. Sometimes, they are necessary. But much less often than people think.
API documentation is often indispensible, but that's really a different matter. When API documentation is generated from comments in front of each method, the primary purpose is to explain how the code can be used, not how it works. In fact, you might say they are code comments only in a syntactical sense.
On to the example. One of the comments to my blog post presented a code snippet that illustrates my point well. It's supposed to be a example of a useful inline comment in code.
// first handle the case where no records were found (if $records == 0) { return false; }
Is this really an case of good commenting? I don't think so. If it's hard to understand that $records == 0 means that no records were found, we can change the code to make it easier. The simplest way to do it is to rename the variable to something like $numberOfRecordsFound or $numRecordsFound. Or extract a method. But typically, it's possible and preferable to avoid this kind of check altogether.
Anyway, the comment is unnecessary. But as I only realized a while later, it's also misleading. Does the code "handle the case where no records were found"? No, it leaves it to the calling function to handle the case. The one line that returns false is not handling anything, it's just passing a message.
So the comment is already misleading, never mind what will happen when someone changes the code and neglects to update the comment. For example, let's say the next programmer to work on this code is in hurry to fix a bug. Now maybe the code will look like this:
// first handle the case where no records were found (if $records == 0 && $state == ACTIVE) { return false; }
The programmer wonders briefly whether the comment is still fully appropriate, is unsure, and decides to do nothing about it (or postpones the decision and forgets about it). Now, perhaps (we can't quite tell without the full context), the comment is even more misleading.
In the next round, yet another programmer comes along, makes some more code changes and wonders: "Should I update the comment? It looks all wrong to me, but it was probably written by someone with deeper insignt into the code. Better leave it alone."
The code would have been better off without the comment, but no one wants to delete it, especially since they have been told that comments are so important. It's a downward spiral: the code changes make the comments misleading less chance that they will
Comments that lie are worse than no comments, and in practice they tend to big liars.
dagfinn | 26 December, 2008 05:14
Conflict is not a bad thing. That is, if it leads eventually towards clarity and understanding rather than confusion and misunderstanding. And it's almost always better to have a conflict openly than to sweep it under the rug. Fear of conflict is the second of the five dysfunctions of a team.
And even though anger tends to cloud people's judgment, it need not be terribly destructive. If the angry person can manage to cool down and eventually keep an open mind in spite of it, it's not so bad. The problem is that people often get stuck trying to prove that they're right, placing blame and defending whatever position they started with at all costs.
One factor that exacerbates this is when different discussions get mixed up. When discussions get heated, you will often you get a discussion about the discussion mixed up with the original discussion. A debate about a technical matter may be conducted in parallel with an altercation that addresses the relationship between the participants. This is confusing in itself. Keeping these discussion apart will help maintain clarity.
In this vein, let me address the criticism I got in my latest blog post ("Comments considered harmful") regarding the appropriateness of the "considered harmful" idiom. This is not a technical issue, so let me give it a separate treatment.
A response I got was:
If ever there was a posting to which the "Considered Harmful" Essays Considered Harmful posting applied, it's this one.
To quote my own reply:
I have read that before. I fully understand the argument and agree with it in general terms. In particular, I agree with the idea that 'the writing of a "considered harmful" essay often serves to inflame whatever debate is in progress..'
In this case though, there seems to be no debate in progress in the first place. I am trying to make my point strongly enough that people will at least notice it.
If the rest of the blog post had been over the top, then I would probably have been shooting myself in the foot. I don't think it is, though.
I find the idea of "considered harmful considered harmful" valid, but over-generalized. Sometimes, a little conflict is needed to wake people up. Sometimes, you need to have two opposed camps for a while to make the difference of opinion clear before you can come to agreement.
When it comes down to it, it's an empirical question. I choose to measure the success, or lack of it, of my provocative title by the actual responses I get. Few of the commentators seem to have been derailed by the title. Some had specific technical objections, and that's fine even when I don't agree with the objections. Some asked me how specifically comments might be harmful. That also is totally relevant and, in fact, helpful.
Let me try to be even more precise. My aim in using this kind of title is not primarily to provoke, but to (over)simplify the message to make it easy to get some idea of the content by reading the title. If I had said "comments are not always useful", it would be so vague that anyone could embrace it. I would rather make it clear that I'm pushing a point of view that many are going to disagree with. Maybe I'll choose differently the next time, but I'm not promising anything.
dagfinn | 23 December, 2008 11:11
There is too much old advice in PHP. A recent case comes from the PHP Advent calendar. Eli White is a strong believer in commenting code, including inline comments inside functions.
Unfortunately, he's at least 10 years too late. This used to be good advice, but not any more.
Up to a point, he's right. Making code as easy to understand as possible is essential and can save a huge amount of time later. And adding comments to unreadable code is better than leaving it the way it is.
There is a better way, though. Refactor your code so it's easy to understand even without comments. I'm just reading Robert C. Martin's recent bookClean Code. He is very clear on this point.
Clear and expressive code with few comments is far superior to cluttered and complex code with lots of comments. Rather than spend your time writing the comments that explain the mess you made, spend it cleaning that mess.
The principle is simple: if you have an inline comment in a method (or function), take the chunk of code that the comment refers to and extract it into a separate method. Give the method an intention-revealing name. Typically, you won't need the comment afterwards.
I'm sure we need an example. Here's a small excerpt from a method taken from Zend Framework.
protected function _doUpdate() { ... /** * Execute cascading updates against dependent tables. * Do this only if primary key value(s) were changed. */ if (count($pkDiffData) > 0) { $depTables = $this->_getTable()->getDependentTables(); if (!empty($depTables)) { $db = $this->_getTable()->getAdapter(); $pkNew = $this->_getPrimaryKey(true); $pkOld = $this->_getPrimaryKey(false); foreach ($depTables as $tableClass) { try { @Zend_Loader::loadClass($tableClass); } catch (Zend_Exception $e) { require_once 'Zend/Db/Table/Row/Exception.php'; throw new Zend_Db_Table_Row_Exception($e->getMessage()); } $t = new $tableClass(array('db' => $db)); $t->_cascadeUpdate($this->getTableClass(), $pkOld, $pkNew); } } }
By changing some temporary varibles into instance variables and extracting a couple of methods, we can get code like this:
if ($this->primaryKeyValuesHaveChanged() { $this->executeCascadingUpdatesOnDependentTables(); }
Or even, by extracting a couple of classes, making objects out of two of the concepts involved, we might end up with something like this.
if ($primaryKeys->valuesHaveChanged() { $dependentTables->executeCascadingUpdates(); }
This uses most of the words from the original comment, but it's superior for several reasons. First and foremost, the section of code as a whole is clearer. Second, comments have a tendency to get out of sync with the code. By expressing the message in the code itself instead, we can avoid that. Third, the extraction itself has the positive side-effect of making it possible to test the extracted method in isolation.
Improving the code itself is harder than writing comments, but it's worth it. .
dagfinn | 14 December, 2008 05:53
The core of your average web framework is a Front Controller. Front Controllers are commonly considered complex and esoteric. That's a myth. I sometimes brag that I can construct a Front Controller in 15 minutes. Actually, it's doesn't take quite that long. In PHP, a Front Controller can be simplified to just one line of code:
call_user_func($_REQUEST['action']);
Warning: Do not use this in a real application.
There is one and only one good reason why you shouldn't: it's totally insecure. Except for that, it's simple but perfectly viable. Your actions will be just plain functions, which is pretty simplistic and will be hard to handle if you build many of them. But it's a huge improvement over the average PHP script with no systematic handling of the PHP request.
If you want to do something more like the frameworks such as Zend Framework, you want your actions as methods grouped into classes (called Controllers in ZF).
Now you need two lines. The following script demonstrates this:
// Pretend we have an HTTP request $_REQUEST = array('controller' => 'Contacts', 'action' => 'edit'); // Front controller $controller = new $_REQUEST['controller']; call_user_func(array($controller,$_REQUEST['action'])); // The action controller class class Contacts { function edit() { echo "ok\n"; } }
You can start with something like it (assuming that security has been taken care of) and add features as you need them.
In case you're wondering how to what exactly the security issues are, the most obvious problem is the fact that any user can run any method in any class that is available to this script. The way to avoid that append or prepend some fixed string to the class and the method name. For instance, in Zend Framework, the action name edit will activate the method editAction().
Beyond that, the security problems are the usual ones. You need to filter input, And in general you should beware of $_REQUEST, since it can also contain cookie data.
dagfinn | 27 November, 2008 14:19
They say there's no free lunch, but at least there's free breakfast. Last week I attended a "breakfast seminar" with Robert C. Martin (Uncle Bob). There really was free food.
Anyway, Uncle Bob held an extremely entertaining and useful introduction to the FitNesse testing tool. He got me hooked on it, but I'm even more fascinated by something else. I asked him, "what is the nature of the test API [you've been talking about]?" He answered that question, and also another, more general one: What is a good strategy for complete automated integration and acceptance testing of an application? That was the question I really wanted to ask, only I wasn't quite aware that I wanted to ask it. And, impressively, he answered it anyway.
As I understood it, the strategy he outlined was as follows. The diagram is probably only a mild perversion of the one he drew on the blackboard.
(More)
dagfinn | 18 November, 2008 14:45
I admit I don't follow Zend Framework very closely, since I haven't been using it for any serious work. But I did write a piece about testing a Zend Framework action controller with View Helpers.
This might need updating, since the testing capabilities of the Zend Framework have grown substantially since then. In particular, there is now a component called Zend_Test. I haven't had time to study it closely yet, but I hope to do so soon.
dagfinn | 07 November, 2008 20:34
I got some interesting comments to my previous post on "beautiful code". Some were pretty strong disagreements.
So am I wrong? Did I get carried away? Did my critical faculty go on vacation somewhere nice and sunny? I admit that sometimes I deliberately look at the positive and ignore the negative. (And sometimes I do the opposite; It's a good exercise if you're careful.)
I wasn't drunk, anyway. But let me take a closer look at the particular line of code I was praising:
$this->assertThat($form->hasSelect(withName('statusConfirm'))->hasValues(), array('Yes','No'));
My main point is that it's close to plain English. Not everyone agrees that that's a good thing, but I argue that we're built (genetically wired, in fact) to understand natural languages, not program code. Therefore code should be easier to understand when it approximates natural language and expression. And we're trying to create or approximate a Domain Specific Language (DSL), which should express exactly what's required for the domain and not the demands of the technical implementation.
So for this experiment, let's translate this one into a (plain English sentence:
Assert that Form (this particular one) has a select menu with the name "statusConfirm" and values "yes" and "no"
Translating back into code, it might look more like this:
$this->assertThat($form)->hasSelect()->withName('statusConfirm')->andValues('yes','no');
To me, this is even more natural than the other one. I think we've gotten rid of some syntax that has to do with implementation details rather than making the API simple to use.
It also seems clear to me how this could be implemented. All of the method calls could be to an assertion object that would take all these various inputs and always return itself at the end of the method so you can chain the calls in what's known as a fluent interface.
dagfinn | 03 November, 2008 04:36
Max Horwath has published his slides on Making Selenium Test Writing easier using a DSL onlinefrom IPC 2008. Let me quote the whole short description:
Implementing automated tests by using Seleniums API methods has several drawbacks. Selenium is great for what it does, providing a generic framework for testing a generic application. Using the Testing_SeleniumDSL framework, I will show you how to create your own Domain Specific Language (DSL), which would allow you to write tests in the language of your business rather than in Seleniums language.
I'm quite impressed by the examples he presents, such as:
$this->assertThat($form->hasSelect(withName('statusConfirm'))->hasValues(), array('Yes','No'));
This is truly expressive, readable code. I've refactored web test code in this direction many times, but I admit I never got quite this far.
The DSL is planned as an open source release. It would be interesting to try something similar for the SimpleTest web tester, which is my favorite for testing web interfaces without too much JavaScript. (Based on the paparrazzi principle.)
dagfinn | 11 October, 2008 14:41
Refactoring is by definition a design actitivity, since the definition of refactoring is "improving the design of existing code". But is this generally and fully recognized? After attending my friendly local agile conference (Smidig2008—sorry, it's in Norwegian), I'm getting more of a feel for how different people think about it. And I'm wondering whether the use of metaphors such as "cleaning" makes refactoring seem too much like unskilled labor. After all, physical cleaning jobs are seen that way.
The analogy between cleaning and refactoring is useful for making the non-developers understand that refactoring is absolutely necessary. But beyond this pragmatic similarity, are the two really similar in deep and meaningful ways? I don't think so. Refactoring is not unskilled labor. It's a task that both requires and builds design skill and experience. While anyone can see that a floor is dirty, identifying code smells is non-obvious, tricky and demanding. This is true even of the simplest code smell, duplicated code. Although spotting code duplication is sometimes easy, at other times, the duplication is too subtle to be easily identifable. When you clean a floor, the goal is well-defined and easy to visualize. When refactoring, you may know what you're aiming for at each small step, but just a few moves further ahead you may end up with a structure you hadn't imagined.
dagfinn | 06 October, 2008 12:53
There's a tutorial that appeared recently called Get Links With DOM. Planet PHP lists the author as Kevin Waterson, although his name is not mentioned on the page itself. Anyway, he claims:
Perhaps the biggest mistake people make when trying to get URLs or link text from a web page is trying to do it using regular expressions. The job can be done with regular expressions, however, there is a high overhead in having preg loop over the entire document many times. The correct way, and the faster, and infinitely cooler ways is to use DOM.
Yes, of course it's cooler. But I'm a little bit surprised at the claim that it's the "correct" (only) way, since there's at least one more that I find even cooler: XPath. Admittedly, it's slower, yet it's a more powerful language.
In his example, we just need to add a line to create an XPath object after we've created the DOM object:
$xpath = new DOMXpath($dom);
Then, instead of the DOM call:
/*** get the links from the HTML ***/ $links = $dom->getElementsByTagName('a');
we can use an XPath query:
/*** get the links from the HTML ***/ $links = $xpath->query('//a');
That's all. So why is that cooler? Because you can do more powerful searches easily. The DOM just happens to have a simple call to find all elements with a certain tag name, so there's not much difference in this case. But more complex stuff is something else. For instance, we can get just the URLs with a single expression:
$links = $xpath->query('//a/@href');
Or we can get just the URLs of just the links whose CSS class is "bookmark":
$links = $xpath->query("//a[@class='bookmark']/@href");
I've been using this for ages when testing web pages. Then there's the not quite official SimpleTest DOM tester, which uses CSS selectors to specify paths. But I won't go into that right now.
dagfinn | 16 September, 2008 13:27
This is something I posted to the Sitepoint PHP Application Design Forum with a little bit of added background.
The background is the idea that unit test methods, for the sake of readability, should test only one single behavior. This may mean several tests for one method under test, since one method may have several behaviors. One version of this is the notion of one assertion per test.
Web tests are not quite the same thing as unit test. On the Sitepoint forum, in a discussion on web testing, I said:
It's also very slow to have too many separate test methods. So instead, to make it readable, I tend to use custom assertions.
Marcus Baker asked me for examples. I tried to illustrate the principle with an example that may be a something of a caricature.
If performance and speed were not an issue, I might do something like this, keeping each test method short and sweet: PHP Code:
class ArticleListTest extends WebTestCase { function setUp() { // Some code to log in and go to the article list page //.... } function testHttpStatusOk() { //... } function testHasCorrectTitle() { //... } function testHasNoDebugInfoAccidentallyLeftBehind() { //... } }
Unfortunately, it would be nauseatingly slow, since you have to traverse some web pages for every single test method. Instead, I can try to get similar small chunking by using custom assertions: PHP Code:
class VariousPageTests extends WebTestCase { function testLoginAndArticleList() { // Some code to log in //.... $this->assertStartPage(); // Some code to go to article list page //.... $this->assertArticleListPage(); } function assertArticleListPage() { $this->assertHttpStatusOk(); $this->assertHasCorrectTitle(); $this->assertHasNoDebugInfoAccidentallyLeftBehind(); } function assertStartPage() { // } }
This way, the code has names for approximately the same details as in the first example.
dagfinn | 06 September, 2008 23:46
Max Horvath has implemented a library for type hinting scalars. That interests me, since I find that type hinting for objects has limited usefulness.
I tried using type hints extensively from an early beta version of PHP 5. I mostly gave up on them for three reasons:
I admit that these jugdments are hard to make. I could be wrong, more or less. Type hints are probably useful when code becomes stable enough and at the boundaries between modules. But I still tend to avoid using them until I get an actual bug that might have been prevented by a type hint. Their usefulness is and has to be an empirical question. The purpose of using them has to be catching errors earlier, so if they don't have that effect, there's no point.
For the purposes of this blog post, reason 2 is the important one. The idea is that type hints are more useful for arrays and scalars than for objects. Why? Because they have the potential to catch errors that would otherwise be hard to find or escape unnoticesd. As I said, if you pass an object of the wrong class, you will usually get a "non-existent method" error quickly. The same thing happens if you try to pass an array or scalar instead of an object. But if you try to pass an object instead of an array, an array instead of a scalar or vice versa, you can keep using the passed array or scalar for some time before it blows up. The problem becomes harder to track down. I've used type hints for arrays, and found them more meaningful and less troublesome than object type hints.
One case I've seen in which scalar type hinting would be useful is a simple homegrown date and time object that is initialized by passing a Unix timestamp as an argument to the constructor. A couple of times I've made the mistake of passing an object instead. The ensuing error can be hard to figure out. If I could type hint to make sure it's an integer, that would be helpful.
dagfinn | 01 September, 2008 14:02
I like weird ideas. There should be more of them. And the idea that we need to make programs efficient to save energy is weird enough to interest me.
David Peterson tells his version of Rasmus Lerdorfs views on it:
He continues on by stating that PHP developers really need to think about performance for not only scalability reasons but for green reasons. If programs were more efficient it would cut the number of data centres and would reduce energy needs as a result. In our newly emerging age of energy awareness this does become an important aspect and I am glad that he is raising awareness.
Well, theoretically, using less of anything makes good environmental sense. But you have to consider which actions have a big impact and which are marginal. For example, Norwegian politicians have considered banning incandescent light bulbs. But the idea makes a lot less sense here than in some other countries, since it's hardly ever dark and warm at the same time this far north. Therefore, the heat generated by indoor lighting contributes to heating that would be needed anyway and is usually electric already.
"Premature optimization is the root of all evil." It occurs to me that energy saving is a kind of optimization and subject to similar considerations. It needs to be focused so it's applied where it matters most and where you can get the most out of the least effort.
How much traffic does a web app need to have to make optimization worthwhile? Does the green angle add anything to that assessment?
Theoretically, it might if the cost of energy does not cover the cost of its environmental consequences. But it's a question that requires analysis. If cutting data center energy consumption is important, it makes sense to know and apply the strategies that are most cost-effective. Putting the waste heat to use? Using more of the hardware and techniques that are used to make laptops energy-efficient? I don't know, I'm not an expert on these things. What I do know is the idea of green optimization needs that kind of thinking to put it in context.
dagfinn | 10 June, 2008 13:40
I came across a Zend Framework (ZF) example I wanted to refactor. You really have to have unit test coverage to refactor effectively, and since there were no tests, I started trying to find out how to test it. There didn't seem to be a wealth of information available on the web, so I've tried to figure it out by myself.
Several challenges presented themselves. It might seem as if the ZF controller system is not optimally designed for testing action controllers, but there are always ways to get around obstacles.
A ZF action controller (which is really just a group of actions collected in a class) always extends Zend_Controller_Action. (For an introduction, see the official QuickStart).
The one I was playing with was an authentication controller:
class AuthController extends Zend_Controller_Action...
That's all we need to know about the specific class under test for now.
I always like to set up the basic objects for the test as instance variables in the setUp() method, so that I can write many small test methods exercising them in various ways. To even get started testing an action controller, we need to instantiate it outside its usual environment, the Zend Front Controller. It requires a request object and a response object:
class AuthControllerTest extends UnitTestCase { function setUp() { $this->request = new Zend_Controller_Request_Http(); $this->response = new Zend_Controller_Response_Cli(); $this->controller = new AuthController($this->request,$this->response); }
(I'm using SimpleTest here, but the same techniques should apply to PHPUnit with different syntax.)
The challenge increases increase when we want to use View Helpers, such as the flash messenger. When programming the action controller, it looks like this:
$flashMessenger = $this->_helper->FlashMessenger;
To test this meaningfully, we need to replace the flash messenger with a mock object. That means also having to replace $this->_helper, which is a "helper broker" object.
First, we generate the mock classes:
Mock::generate('Zend_Controller_Action_HelperBroker','MockHelperBroker'); Mock::generate('Zend_Controller_Action_Helper_FlashMessenger','MockFlashMessenger');
Unfortunately, the helper broker is a protected instance variable, with no apparent way to change it. But it's easy to fix by adding a setter method to the action controller class:
class AuthController extends Zend_Controller_Action... public function setHelperBroker($helperBroker) { $this->_helper = $helperBroker; }
This allows us to set up the mock helper broker and let it return the flash messenger:
$this->helperBroker = new MockHelperBroker; $this->controller->setHelperBroker($this->helperBroker); $this->flashMessenger = new MockFlashMessenger; $this->helperBroker->setReturnValue( '__get',$this->flashMessenger,array('FlashMessenger'));
Now we can use the mock objects to test an action which is supposed to send a flash message:
function testIdentifyActionSendsFlashMessage() { $this->flashMessenger->expectOnce( 'addMessage', array('Please provide a username and password.')); $this->controller->identifyAction(); }
That's as far as I'm going with this now. There may be more later.
dagfinn | 02 June, 2008 21:12
Redirects are useful in web programming, especially when implementing the Post-Redrect-Get pattern. But there is a problem with redirects: there is no simple way to send a message to the user across the redirect. When processing a GET request, you can display whatever messages you want. The most simplistic way is to echo them directly; or if just slightly more sophisticated, set it in the template that's about to become the web page. When processing a POST request that is to be followed by a redirect, you can't do that. The response (redirect) sent back to the browser does not have any text or HTML content. In practice, it just contains the URL of the page you're redirecting to. If you try echoing the message, it will cause the redirect to fail because you sent content before the header that signals redirect.
So how to get the message displayed? The simplest way to do it would be to include it in the URL:
header('Location: http://www.example.com/index.php?message='.urlencode($message));
It's possible, but your URLs can get very long and you might find it silly:
http://www.example.com/index.php?message=This+is+a+message+to+demonstrate+how+long+a+URL+can+get+when+you+put+long+strings+into+it
The more common choice is to keep the message in session and make sure it's removed from session after it's been displayed. That's what flash messages do. In military terminology, a flash message is defined as:
A category of precedence reserved for initial enemy contact messages or operational combat messages of extreme urgency.
I don't like a name that could be confused with Adobe Flash, so when I implemented my version of this, I looked up "flash" in the thesaurus and decided to call them flares instead.
Here are a couple of examples I've picked up.
Zend Framework:
$flashMessenger = $this->_helper->FlashMessenger; $flashMessenger->setNamespace('actionErrors'); $flashMessenger->addMessage($message);
CakePHP (from http://labs.iamkoa.net/2008/01/13/session-based-flash-messages-look-better-cakephp/):
$this->Session->setFlash('Your post has been saved.'); $this->redirect('/news/index');
In my own work, I've done it in a slightly different way by letting the redirect object hande the flash/flare and returning the redirect object from the action method, somewhat like this:
return Redirect::toAction('settings')->addFlare('Settings changed');
I don't necessarily claim that it's better than the others, but it makes sense to have alternatives.
This article was originally posted with an incorrect timestamp, and has been updated to the current time as of June 2.
| « | January 2009 | » | ||||
|---|---|---|---|---|---|---|
| Su | Mo | Tu | We | Th | Fr | Sa |
| 1 | 2 | 3 | ||||
| 4 | 5 | 6 | 7 | 8 | 9 | 10 |
| 11 | 12 | 13 | 14 | 15 | 16 | 17 |
| 18 | 19 | 20 | 21 | 22 | 23 | 24 |
| 25 | 26 | 27 | 28 | 29 | 30 | 31 |