haggholm: (Default)

When comparing an expression with a bool, PHP does not widen the bool to the other operand's type; it converts the other operand to bool and compares the two Boolean values.

// Prints "true": 8 is converted to bool (true) before the comparison.
print_r((true == 8) ? 'true' : 'false');
// Prints "false": (int)true is 1, and 1 != 8.
print_r(((int)true == 8) ? 'true' : 'false');
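
The same conversion applies to other operand types. A minimal sketch; the values are picked only for illustration:

```php
<?php
// Loose comparison with a bool converts the other operand to bool first.
var_dump(true == 'foo');    // bool(true): a non-empty string is truthy
var_dump(true == '0');      // bool(false): the string "0" is falsy
var_dump(false == array()); // bool(true): an empty array is falsy
var_dump(false == null);    // bool(true): null is falsy
var_dump(true === 8);       // bool(false): strict comparison also compares types
```

Using === avoids the conversion entirely, at the cost of having to cast explicitly where a conversion is actually wanted.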
haggholm: (Default)

I'm feeling very bloggy today.

Anyway, I'm getting more comfortable with PHPUnit, and although I've spoken of it in near-monosyllables here before, I haven't really written a ground-up post talking about what it is, what it does, and why you'd be a fool to write PHP code without it (or something like it; there are other PHP testing frameworks).


Unit testing theory, 101
PHPUnit
Code coverage reports (with PHPUnit)
Wrapping it up (in an XML configuration file)
haggholm: (Default)

Once again, the weak typing and checks of PHP drive me to the brink of insanity (or still farther past it, depending on your definitions). Consider the following: I am picking a random element from an array and returning it, optionally removing it, to tweak test parameters. The array is associative, so I need to return a pair of $key, $value. I whip out iteration #1 of the utility function, which just returns array($key => $value). Makes sense, right? Key => value. Now I go on with my main testing function:

$arr = array(...);
list($key, $value) = $this->extractRandomArrayElement($arr);
// $key is null???
// $value is null???

If you've been paying attention, you should be laughing at me (in my lame defence, I haven't had any coffee yet today). Of course the list construct assumes that I'm returning array($key, $value) while I'm returning array($key => $value)—a significant difference, and I'm trying to use list with two elements to extract a single-element array. (list is meant for numeric arrays, anyway.) Of course this code should fail. But it fails silently. The standard modus operandi for PHP when you do something completely nonsensical appears to be not to throw an exception and die (as Python would) or a fatal error and die (as PHP at least does on static syntax errors), but to assign null to everyone concerned and go on as though nothing had happened.

This is not helpful in the least.
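
For what it's worth, here is a minimal sketch of a helper that returns a numerically indexed pair, which is the shape list() expects. The function body, the $remove parameter, and the free-function form (the post calls it as a method) are assumptions for illustration, not the original code:

```php
<?php
// Hypothetical helper: picks a random element and returns array($key, $value),
// a numerically indexed pair that list() can unpack.
function extractRandomArrayElement(array &$arr, $remove = false)
{
    if (count($arr) === 0) {
        // Fail loudly instead of handing back nulls.
        throw new InvalidArgumentException('Cannot pick from an empty array');
    }
    $key = array_rand($arr);
    $value = $arr[$key];
    if ($remove) {
        unset($arr[$key]);
    }
    return array($key, $value); // not array($key => $value)
}

$arr = array('a' => 1, 'b' => 2, 'c' => 3);
list($key, $value) = extractRandomArrayElement($arr, true);
```

With array($key => $value) instead, list() looks for offsets 0 and 1, finds neither, and assigns null to both variables.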

haggholm: (Default)

PHP can be kind of tortuous, and the interpreter did crash a lot when I installed the penultimate version of the xdebug module, but now I do have a setup with PHP, PHPUnit, and xdebug. Generating code coverage reports is excruciatingly slow (I need to improve my filters, but hopefully-exhaustive testing that has to go through the nice but sluggish PEAR::MDB2 will be slow whatever I do)—but damn, those reports are nice.

haggholm: (Default)

The company I'm working for is in a state of transition. Actually, it's in multiple states of transition (or a transition point in more than one dimension), but there's only one I'm interested in writing about here and now. The company started as a small startup by a couple of guys at a pace of "Oh, crap, we've got to push this new functionality out by Tuesday." This, needless to say, is not the ideal environment for well-planned over-arching project designs, in-depth testing, or detailed documentation, and accordingly, these things are lacking from the older parts of the system. I make this very explicit because I do not want to come off as disparaging the intelligence or skill of the people who wrote these things. (This is not mere self-preservation: They worked under a particular set of constraints, and did what they could with it. I look at things very differently and with a critical eye, but I also recognise that my constraints are radically different.) Some old code bits are crap, but they are crap for respectable reasons.

That being said, the codebase of the company is in long-term and ongoing transition. The old bits of the system are patched together on the basis of "Get it working right now; nothing else matters 'cause we'll go out of business if we can't meet our deadlines." The newer parts of the system do increasingly have all the good things I want to see in a software product: Modular, object-oriented design, unit tests, and documentation. It's very far from perfect, but transition, like I said: That's the direction in which it is moving, and this is as it should be. It would of course be nice if it could have been thus from the beginning, but this is the real world and we work with the constraints we have.

Somewhat to my surprise, I find myself being proactive and pushing for improvements in some areas that I've never really paid much attention to before. Unit testing is one: I've known rationally how important unit tests are, but I'd never really written any before I started doing it at work (with PHPUnit). Documentation is another: part of me feels the hacker's stereotypical frustration at having to write it ("Just read the damned code!"), but here I am, writing beautifully formatted docblocks à la Javadoc for extraction and HTML presentation with PhpDocumentor, and urging people to do the same.

Reflecting upon these changes, I realise that three things pushed me to this place, in the form of good and bad experiences, or perhaps more poignantly, good and bad examples. First, during one of my graduate courses, I (along with another guy) did some hacking of one of the open source .NET frameworks—actually we looked at two, both Mono and (if I recall correctly) Portable.NET. One of these—I believe it was Mono, but I won't swear to it—taught me how catastrophic it can be to have poor documentation. Existing developers are easily blind to such things because the oft-used parts of the system become ingrained in working memory; things that are desperately non-obvious to newcomers have become obvious by the time they get to a point where they are in a position to make the changes necessary to make it obvious… In short, at any rate, we had to abandon one of the frameworks because no documentation made it clear how the system was organised, so we couldn't even locate the places where we needed to make modifications.

The second transforming experience was the coding I did while I worked on my master's thesis—modifying the Python interpreter to support distributed computation, remote calls, and so forth. This involved making some rather sweeping changes, such as modifying the PyObject_HEAD macro which is present in every Python object in the entire system. When modifying some of the lowest-level systems bits, and when modifying every single object the interpreter creates, there's an awful lot of room for things to go wrong, and of course, many things did. Some of them did so obviously, and some did so very obscurely. It very rapidly became clear to me that without CPython's comprehensive suite of unit tests, there is no way in Hell (or elsewhere) that I would have caught some of those errors, and ones that I did catch would nevertheless have been orders of magnitude more difficult to track down.

In other words, these experiences showed me that without documentation, it is extremely difficult to know where or how to tackle a problem, and learning curves become much steeper than they really need to be; also that a large application needs unit tests to be robust in the face of changes. It is not, after all, chiefly static correctness we test for with unit tests; rather, we want to establish that, given a correct application, when we change it, we haven't broken it. We developers want to find the bugs; we don't want users to point them out to us (especially not when they've paid for the software).

Of course, I already knew these things, in the sense that I'd read them, considered them, and agreed with the same conclusions I just presented, but they say you never truly learn to fear fire until you've been burned; so I never truly appreciated the value of these things until I experienced them for myself. Don't let this be an excuse, though; if you're working on an application, write unit tests and comprehensive documentation. There is time enough for you to learn the hard way later, when for one reason or another you don't have access to these commodities.

The third experience, of course, is work, where I have the opportunity to see both sides of these issues. Ours is a system in transition: Parts of our code base are properly organised, and parts are a mess. Some bits are unit tested, and some will break in silent and obscure manners. Some parts have excellent documentation, and other parts are so utterly undocumented that I don't even know they exist until I've reimplemented them and somebody asks me why I didn't use the existing code. This not only drives old lessons home more deeply, it also gives me an opportunity to aid in the transition, and it has taught me refactoring (not least of my own code, which I rough out in a sort of draft form and rework in iterations until it has a design I am happy with; I have the luxury of working on a separate module, so no one but I cares if I change my APIs).

This is very interesting, and a good learning and formative experience, and I am glad to be where I am, working on what I am working on. All the same, I hope that the next project I work on (at some point I do want to give some attention to PhpDocumentor and its memory issues with large codebases…) has documentation and unit tests. I've learned the hard way; I don't need to learn the hard way again.

haggholm: (Default)

This is why PHP fails to make me happy. —Well, that's not quite accurate: Say rather that this is one of the many ways in which it so fails:

// This gives me one result...
$should_match = ( $desired_value and ($db_value == PREF_TYPE_YES)) or
                (!$desired_value and ($db_value == PREF_TYPE_NO));

// ...This gives me another result. Note the identical logic.
if (( $desired_value and ($db_value == PREF_TYPE_YES)) or
    (!$desired_value and ($db_value == PREF_TYPE_NO)))
{
	$should_match = true;
}
else
{
	$should_match = false;
}

What am I missing?

Update:

David poked at this after I threw my hands up in disgust. It turns out to be a good old precedence issue, since, sensibly enough, or does not have the same precedence as ||, and similarly, and is different from &&. Crucially, and and or bind more loosely than the assignment operator =, whereas && and || bind more tightly. Therefore,

$x = true or false;
// is not equivalent to
$x = true || false;
// but instead to
($x = true) or false;
// rather than the
$x = (true or false);
// that I expected, and that || provides.

I see my mistake, but whoever made this design decision: I hate you.
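
With true on the left the two forms happen to yield the same value, so here is a small sketch with the operands swapped, where the difference becomes visible (the variable names are arbitrary):

```php
<?php
$a = false or true;   // parsed as ($a = false) or true
$b = false || true;   // parsed as $b = (false || true)
$c = (false or true); // explicit parentheses restore the intended grouping

var_dump($a); // bool(false)
var_dump($b); // bool(true)
var_dump($c); // bool(true)
```

Either parenthesising the whole right-hand side or sticking to && and || in assignments avoids the surprise.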

haggholm: (Default)
  • Unit testing is a very, very good thing.
  • PHPUnit is a decent unit testing framework (for those of us stuck with PHP).
  • Code coverage metrics comprise a very useful (critical?) part of unit testing: They can't promise that you've covered all possibilities, but they can tell you whether or not you've at least given all your executable LOC a run-through.
  • PHPUnit's code coverage collection requires xdebug.
  • xdebug appears to make my PHP interpreter (and Apache server) crash with no obvious pattern.
  • I've read some cautions to the effect that xdebug may crash when other debug extensions are loaded.
  • This is substantially irritating, but not mission critical for me: I don't need code coverage every time I run my tests; I just need the test results. Code coverage is for when I sit down and decide what lacks proper test coverage (and right now the answer is, in any case, way too much, and I don't need metrics to find many places to work on). I won't bother to track down the specific error when I can turn off code coverage for the moment, work on the more important issue of writing some damned tests, and worry about the metrics once the test quality is such that I need tools to find the holes.
  • It's still bloody annoying.
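
One way to keep coverage optional, sketched under assumptions: buildPhpunitArgs and its parameters are hypothetical names, and the only PHPUnit option relied on is --coverage-html. Coverage is requested only when a report directory is given and xdebug is actually loaded:

```php
<?php
// Hypothetical helper: builds the argument list for a PHPUnit run.
// Coverage is added only on request and only if xdebug is available.
function buildPhpunitArgs($testDir, $coverageDir = null)
{
    $args = array('phpunit');
    if ($coverageDir !== null && extension_loaded('xdebug')) {
        $args[] = '--coverage-html';
        $args[] = $coverageDir;
    }
    $args[] = $testDir;
    return $args;
}

// Everyday run: no coverage, so xdebug never gets involved.
echo implode(' ', array_map('escapeshellarg', buildPhpunitArgs('tests'))), "\n";
```

Passing a directory as the second argument produces the slower run with an HTML coverage report.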

Web stuff

Jan. 23rd, 2008 11:02 am
haggholm: (Default)

Microsoft has committed another instance of unutterable idiocy in declaring that the default behaviour of a standards-compliant document is to render in quirks mode, unless you explicitly tell it not to (and even then they recommend that you specify not "no quirks mode, please", but a specific version of quirks mode).

My immediate reaction was that their idea could have been great if they'd just turned it on its head (default to assuming that if I specify a standard DOCTYPE, I really do want a standard DOCTYPE; allow me to override). As I read a few articles and posts on the matter, I found one that expresses precisely my sentiment, but more articulately than I can be bothered to do right now. Go read this if you want to know what I think. If you don't—why are you reading my blog, by the way?—read it anyway to find out what that guy thinks.

Fortunately, I mostly care in a distant and academic sense. At work, I'm not the guy who has to worry about DOCTYPEs, and our tag rendering factories are maintained by the other team; I haven't had to write one yet, and if I ever do, it'll still be a minority of the labour. At home, I write pages that assume people use standards-compliant browsers. I write for W3C standards first, Firefox second, Opera third, and IE a distant fourth: If I discover a problem (it will likely be in Firefox, since that is where I look first), I may choose a different, but still W3C standards-compliant, way of accomplishing what I want. I will not write code that contradicts standards to pander to a broken user agent, trusting rather that the UA developers will fix the bugs (or I shall use and recommend another UA); doing otherwise is tacitly writing known flaws into my documents.

haggholm: (Default)

When you use a database to store a representation of your objects (or if your objects are in-memory representations of database tables, however you prefer to view it), you may be prone to writing things like this, if you use PHP and PEAR::DB or PEAR::MDB2:

$q = "SELECT EXISTS(SELECT * FROM foo WHERE bar)";

// A matching row already exists: update it. Otherwise, insert a new one.
if ($db->getOne($q))
{
    $db->exec("UPDATE foo SET ... WHERE bar");
}
else
{
    $db->exec("INSERT INTO foo ...");
}

Well, we just moved to MDB2 and I discovered the glory, so I thought, of the replace method. To quote the documentation's description of what it does, [sic]s and all,

Execute a SQL REPLACE query. A REPLACE query is identical to a INSERT query, except that if there is already a row in the table with the same key field values, the REPLACE query just updates its values instead of inserting a new row.

The REPLACE type of query does not make part of the SQL standards. Since practically only MySQL and SQLite implement it natively, this type of query isemulated through this method for other DBMS using standard types of queries inside a transaction to assure the atomicity of the operation.

Well, this sounds nice. In fact, it sounds really nice—no more conditional code, and a single method call instead of the three different SQL statements (EXISTS, INSERT, and UPDATE). However, there is one teensy little problem. According to the MySQL documentation,

REPLACE works exactly like INSERT, except that if an old row in the table has the same value as a new row for a PRIMARY KEY or a UNIQUE index, the old row is deleted before the new row is inserted.

Do you see? In both versions, it's effectively a replace operation, but they are semantically different, and the difference is not academic. The MDB2 docs would have you believe it performs an INSERT or UPDATE. MySQL actually does an INSERT, or a DELETE followed by an INSERT. This is significant, because if you have foreign keys on the table, and if you CASCADE on those foreign keys, then all the dependent rows will get wiped on the REPLACE of a row that already exists. Of course the row will be restored, but the dependent rows are gone forevermore. This would not be the case according to the MDB2 description.

The solution is, of course, quite obvious: Since a function that does what I thought replace does is so very nice, I wrote one. However, I am rather irritated that I lost time and wasted effort on tracking this down due to a blatant error like this. If they'd just said "This emulates the MySQL REPLACE, go read about it here", I wouldn't have had this problem.
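
The function itself is not shown in the post, so what follows is only a sketch of the idea, not the original code. It assumes PDO with an in-memory SQLite database as a stand-in for MDB2 and MySQL, and insertOrUpdate, foo, and child are hypothetical names:

```php
<?php
/**
 * Hypothetical helper: UPDATE the row identified by $key if it exists,
 * otherwise INSERT it. It never issues a DELETE, so cascading foreign keys
 * are not triggered. $table and the column names are interpolated into the
 * SQL, so they must come from trusted code, never from user input.
 */
function insertOrUpdate(PDO $db, $table, array $key, array $fields)
{
    $where = array();
    $keyParams = array();
    foreach ($key as $col => $val) {
        $where[] = "$col = :k_$col";
        $keyParams[":k_$col"] = $val;
    }
    $whereSql = implode(' AND ', $where);

    $db->beginTransaction();
    try {
        $stmt = $db->prepare("SELECT COUNT(*) FROM $table WHERE $whereSql");
        $stmt->execute($keyParams);
        $exists = ((int) $stmt->fetchColumn()) > 0;

        if (!$exists) {
            // No such row yet: insert key columns and data columns together.
            $all = $key + $fields;
            $params = array();
            foreach ($all as $col => $val) {
                $params[":v_$col"] = $val;
            }
            $sql = "INSERT INTO $table (" . implode(', ', array_keys($all))
                 . ') VALUES (' . implode(', ', array_keys($params)) . ')';
            $db->prepare($sql)->execute($params);
        } elseif ($fields) {
            // Row exists: update it in place, leaving dependent rows alone.
            $set = array();
            $params = $keyParams;
            foreach ($fields as $col => $val) {
                $set[] = "$col = :f_$col";
                $params[":f_$col"] = $val;
            }
            $sql = "UPDATE $table SET " . implode(', ', $set) . " WHERE $whereSql";
            $db->prepare($sql)->execute($params);
        }
        $db->commit();
    } catch (Exception $e) {
        $db->rollBack();
        throw $e;
    }
}

// Demonstration: the dependent row survives the second call.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('PRAGMA foreign_keys = ON');
$db->exec('CREATE TABLE foo (id INTEGER PRIMARY KEY, name TEXT)');
$db->exec('CREATE TABLE child (id INTEGER PRIMARY KEY,
           foo_id INTEGER REFERENCES foo(id) ON DELETE CASCADE)');

insertOrUpdate($db, 'foo', array('id' => 1), array('name' => 'first'));
$db->exec('INSERT INTO child (foo_id) VALUES (1)');
insertOrUpdate($db, 'foo', array('id' => 1), array('name' => 'second'));

$children = (int) $db->query('SELECT COUNT(*) FROM child')->fetchColumn();
$name = $db->query('SELECT name FROM foo WHERE id = 1')->fetchColumn();
```

A SELECT followed by an INSERT can still race with a concurrent writer unless the database locks appropriately, so treat this as an illustration of the semantics rather than a hardened implementation. On MySQL specifically, INSERT ... ON DUPLICATE KEY UPDATE is a native statement with update-in-place behaviour.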
