Essays

Another Ubuntu release, another core regression

Sunday, May 3rd, 2009  

It’s business as usual over at Ubuntu headquarters. This time the “Root Terminal” menu item, installed in the system menu by default for at least the last few years, is suddenly broken. Irate users commenting on bug reports in Launchpad are dangerously close to starting a full-blown flame war:

Sebastien, your comment seems to imply that Launchpad bug reports are a waste of time. Is this really what you meant? I had been under the impression that Launchpad was intended to be a gateway/portal for bug reporting. If Launchpad reports do not get forwarded upstream automatically once triaged then what purpose does it have? -Russel Winder

and:

With all due respect Sebastien — I can hardly believe that
I’m reading this: “ubuntu only distribute it”.

(why even have a bug reporting system in the first place,
one wonders, btw.). -bjd

(That’s right, those are in response to the same Sebastien Bacher I took to task for unhelpful comments on other bugs last year.)

The bug itself isn’t Ubuntu’s fault, but the fact that the menu item survived intact in the default Ubuntu configuration despite being non-functional for (at least) the last four months speaks volumes about what passed for testing on Jaunty Jackalope1.

Temporary workaround, until Gnome fixes this regression and Ubuntu inherits it: change the menu item to gnome-terminal -e 'sudo -i'. It took me longer to write this paragraph than to change that.

  1. I’m not even going to get into how the “upgrade” process left my system unable to find the root filesystem and therefore unbootable. My memory, and a judicious application of grub-fu, saved the day, and since I’m unwilling to downgrade to Intrepid and then re-upgrade to Jaunty, this bug must remain un-duplicable and un-reported. 

Ignore your users’ needs. Call them stupid instead.

Tuesday, April 21st, 2009  

Bert Bos’s Why “variables” in CSS are harmful illustrates some all-too-common mistakes technologists make when considering feature requests from their users. It also indicates how deeply out of touch Bos (and possibly the entire W3C) is from people who actually have to read, write, debug, and use CSS on a regular basis.

It begins:

Constants have been regularly proposed and rejected over the long history of CSS…

Proposals for a feature indicate that a technology (whether it be a specification or an application) has pain points that are going un-addressed. When those requests are frequent, they indicate that the person (or organization, in this case, the W3C) in charge of the technology is out of touch with its users.

…so there is no reason why constants should be useful now when they weren’t before.

This claim that constants are not useful underlies the entire essay, but Bos fails to ever really justify it. Here he wanders around in pseudo-mathematical jargon instead:

[An implementation of constants in CSS written in PHP] proves that it is not necessary to add constants to CSS…. But the PHP implementation has the benefit of letting authors determine the usefulness for themselves, without modifying CSS on the Web.

It sounds like Bos refuses to consider an implementation of variables1 in CSS unless someone provides him with a mathematical proof of their utility. But utility is an opinion, not something that can be proven, like Turing-completeness or the irrationality of √2.

The existence of the PHP implementation Bos mentions, and of other implementations like the wonderful CleverCSS or Reddit’s vaporous C55, argues strongly that variables are useful — so useful that many people have implemented them on top of CSS. Of course, this does not prove usefulness any more than any other opinion can be proven.

Implementation effort

Next Bos considers implementation effort:

…extending CSS makes implementing more difficult and programs bigger, which leads to fewer implementations and more bugs.

Difficulty of implementation should never be a deciding factor in whether or not to address the needs of the users. This point is important enough that it bears repeating: implementation effort is not relevant when deciding what your users need.

Why not?

Technology exists to make users’ lives easier. As a technology evolves and matures, users express needs and the authors of the technology develop features to address those needs. It is the user’s needs, not easily implemented features, that drive development of a technology.

If two features serve the same need, then picking the easier-to-implement one is perfectly reasonable. And if the only way to address the users’ needs is with a feature that’s extremely difficult or impossible to implement, then a project might find itself considering whether some or all of it is still viable. But a difficult implementation is never a justification in itself for not addressing the users’ needs.

In the case of CSS, the users ask for variables because they need some way to stop repeating themselves when they encode colors, lengths, and other values in CSS.
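To make the "stop repeating themselves" point concrete, here is a minimal sketch of the kind of preprocessor people keep building on top of CSS. The `@define name: value;` and `$name` syntax is invented for this illustration; it is not taken from Bos's article, the PHP implementation he mentions, CleverCSS, or any CSS specification.

```python
import re


def expand_constants(css: str) -> str:
    """Expand '@define name: value;' declarations and '$name' references.

    Both pieces of syntax are hypothetical and exist only for this sketch.
    """
    constants = {}

    def record(match):
        # Remember the constant and drop its definition from the output.
        constants[match.group(1)] = match.group(2).strip()
        return ""

    body = re.sub(r"@define\s+([\w-]+)\s*:\s*([^;]+);\s*", record, css)

    def substitute(match):
        name = match.group(1)
        if name not in constants:
            raise KeyError(f"undefined constant: {name}")
        return constants[name]

    # Replace every '$name' reference with the recorded value.
    return re.sub(r"\$([\w-]+)", substitute, body)


source = """
@define brand: #336699;
a { color: $brand; }
.button { border: 1px solid $brand; }
"""
result = expand_constants(source)
print(result)
```

The color is written once and used twice; changing the brand color means editing one line instead of hunting for every occurrence.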

Refusing to serve the users’ needs because the requested feature may be difficult to implement shows a lack of understanding of those users’ needs as well as poor judgment about how to handle feature requests in general.

There’s another subtle fallacy here too. When Bos worries about ease of implementation, it sounds like he’s trying to make browser authors’ lives easier, as if they were the users that the W3C are working for. But browser authors aren’t the real users of CSS any more than, for example, the authors of a C compiler are the ultimate users of C. Web designers are the real users of CSS. They are the target audience whose needs should be considered.

Arguing from implementation effort, and talking about browser implementors instead of CSS authors, illustrates how far out of touch Bos is with real web designers doing real work.

It’s also questionable how truly difficult implementing global, un-scoped variables (or un-changing constants) in CSS would be, especially compared to other complex aspects of CSS like the cascade. But that’s a discussion for browser authors.

Maintenance of stylesheets

Next Bos argues that variables would make CSS less maintainable, not more. Bos presents two reasons that code is encapsulated behind a function in programming languages:

Dividing up a problem into smaller ones is only one reason for defining functions. Just as important is the fact that a function that fits on one screen is easier to write than one that needs scrolling.

Because CSS variables wouldn’t help divide up a problem into smaller ones or help CSS stanzas fit on the screen more easily, Bos argues, they aren’t helpful:

[Constants] would add a cost (remembering user-defined names) without a benefit (avoiding problems that are longer than one screenful).

Experienced programmers know that there’s a third benefit to encapsulating code or data behind a function, variable, or constant: not repeating yourself. This is why users keep asking for variables in CSS.  Bos goes on to say that variables would be detrimental to CSS because they would increase the length of stylesheets. However, not repeating yourself is much more important than just keeping your code short2, so this point too is moot.

This section concludes:

What remains is the cost of remembering and understanding user-defined names.

Of course, stylesheets are full of user-defined class names, and CSS authors seem to have no problem using and remembering those, so it’s hard to see how user-defined variable names are going to be any more intellectually challenging for CSS authors than class names.

Reusing style sheets and Learning CSS

The next two sections, “Reusing style sheets,” and “Learning CSS” continue to conjecture that user-defined variable names would be a great hindrance to using and learning CSS. But the frequency of proposals to add variables to CSS suggests they are not difficult to understand, and including them would not significantly hinder learning CSS.

But there’s not much point in arguing over such conjectures. Arguing from the point of view of a theoretical group of users who have, and lack, certain skills, is a dangerous distraction. If you have data about your users, use it. If not, collect some before making your decisions, or base your decisions on what you know your users can already do.

For the sake of argument, assume Bos’ hypothetical group of users exists. Assume there is a subset of the CSS authoring population that can comprehend the CSS cascade, relative sizes defined in ems, hexadecimal RGB color codes, and user-defined class names, but are unable to grasp the concept of a user-defined variable in CSS. (It sounds bizarre, but that’s what he’s claiming.)

These hypothetical users could just refrain from using variables in their stylesheets whatsoever. Unlike hexadecimal colors, em units, and many other aspects of CSS, nothing about variables would force CSS authors to use them. Variables could be added to the CSS standard without increasing its complexity or the effort required to learn it.

Bos also claims that user-defined variables would break easy reusability of CSS:

CSS is fairly easy to learn to read, even if some of its effects can be quite subtle. When there is a page you like, you can look at its style sheet and see how it’s done.

Anyone who has tried to copy a CSS effect from one site to another knows how difficult it truly is.  To copy the visual appearance of a single element, you must understand not only the computed style of that element, but the computed style of all of its parent elements. You need at least a rudimentary understanding of both the CSS cascade and the structure of the HTML of the page.  To copy the look of an entire page, you have to copy all of the CSS files for that page and mimic the structure of the HTML exactly, or reverse engineer the entire thing from the ground up.

Beyond the issue of whether copying CSS effects is easy or not, however, the question is whether CSS variables would make the job more difficult.

Bos points out that in-browser debugging tools help you to copy CSS by showing you the computed style. Presumably if CSS contained variables, those debugging tools would show you the values computed using those variables, not just the variable names.

And if you were just copying a site’s HTML structure and CSS wholesale, then there’s no reason why you would even need to read the CSS or figure out what the variables mean.

It is too difficult to look in two places at once, the place where a value is used and the place where it is defined, if you don’t know why the rule is split in this way

Of course, when reverse-engineering the CSS for a site, a designer already needs to look in multiple “places at once” — they look in multiple CSS files, match class names in the CSS to names in the HTML, and consider the effects of the cascade. Is Bos really suggesting that a person capable of doing that will be incapable of finding a variable definition in the same file where that variable is used?

Rather than showing us that CSS variables would make re-using CSS more difficult, Bos asserts that a difficult, complex process is simple, and that CSS authors already performing this task are too stupid to handle a much simpler one.

Bos also claims that figuring out what variable names mean will be difficult for CSS authors — even when debugging their own code. Most CSS authors use generally descriptive class names like green-button, huge, and floatleft3. High traffic sites run their CSS (and their HTML and JavaScript) through compressors/obfuscators, but most of those sites use descriptive names internally too. It’s hard to see how CSS variables would be named any differently than CSS classes, so it’s hard to see how CSS variables would be any more difficult for CSS authors to reverse engineer or remember.

Summary

None of Bos’s arguments against variables in CSS hold up. He claims CSS doesn’t need variables, but fails to recognize  CSS authors’ true need to avoid repeating themselves. He argues CSS variables would be too difficult to implement, but implementation difficulties are invalid grounds to justify leaving users’ needs unaddressed. He argues that CSS variables would add too much complexity to CSS and no benefit whatsoever, but overlooks a key benefit that CSS variables would provide. All his arguments about the complexity variables would allegedly add to CSS are difficult to accept given the current complexity of CSS.

A feature request is a need in disguise, and multiple, persistent feature requests indicate a serious need behind a very thin disguise. Rather than arguing against a feature, you should endeavor to understand the underlying need. Rather than arguing from implementation complexity, you should decide whether that need must be addressed. Rather than arguing from hypothetical, invented users, and speculating about complexity, you should collect real user data or look at the kinds of tasks your users already handle.

This entire article4 calls into question Bos’ ability (and, by association, the W3C’s) to identify and address the needs of real CSS users and choose features to solve real shortcomings of CSS. I hope my analysis of this article helps other technologists learn to understand and address their users’ real needs better, and avoid poor reasoning when arguing against, or for, a specific feature.

For more on problems with CSS, see CSS Considered Unstylish.

  1. For brevity I’ve chosen to use just the term variable throughout this article, even though all the points I make apply equally well to constants.
  2. Bos’ point about the “computer screen becoming an extension of the programmer’s memory” is bizarre in the extreme. Even the best programmers or web designers will quickly end up with a program or stylesheet that’s bigger than will fit on a screen, when working on anything but the most simple projects, if for no other reason than the project being split up into multiple files. 
  3. Perhaps Bos does not always use descriptive class names; the stylesheet for his article uses the class names yves and coralie.
  4. The very end of the article has a clever suggestion: that constants be implemented as an external module. I’m not sure how this would work, but if it meant that a single set of constants for a site would be accessible in the HTML, and all of the site’s stylesheets, and maybe in the JavaScript too, well, that would be pretty cool. 

The next big thing, part 3: Taking the relational out of relational databases

Monday, April 6th, 2009  

Part of an ongoing series.

The relational database is an extremely powerful tool. But sometimes data isn’t very relational, and sometimes transactional, relational integrity is not as important as it is for, say, a bank. This is one reason why so many sites can get away with MySQL backed by MyISAM tables: they’re fine if you’re read-heavy and data integrity is not mission-critical.

Some new projects have sprung up which provide key-value stores or simpler kinds of databases without all the overhead and inflexibility of a relational database.

On the other hand, sometimes data is way more interrelated than a traditional relational database is prepared to handle. Sometimes different kinds of items (i.e. rows) in a database can be related to many other kinds of items in that database, and sometimes end users can create not just new items or new relationships, but new kinds of relationships between items. This type of database is called a graph database, and there are also projects pushing the boundaries of relational in this completely opposite direction.
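As a rough illustration of what "new kinds of relationships" means in practice, here is a toy in-memory graph where the relationship kind is just data supplied at runtime. The class name, method names, and sample data are all invented for this sketch; it does not model any particular graph database.

```python
from collections import defaultdict


class TinyGraph:
    """A toy graph store: any item can relate to any other item,
    and new kinds of relationships need no schema change."""

    def __init__(self):
        # _edges[kind][source] is the set of targets for that relationship.
        self._edges = defaultdict(lambda: defaultdict(set))

    def relate(self, source, kind, target):
        self._edges[kind][source].add(target)

    def related(self, source, kind):
        # Use .get so that reading never creates empty entries.
        return sorted(self._edges.get(kind, {}).get(source, set()))

    def kinds(self):
        return sorted(self._edges)


g = TinyGraph()
g.relate("alice", "friend_of", "bob")
g.relate("alice", "attended", "launch party")
# A relationship kind nobody planned for, created on the fly by a user:
g.relate("bob", "mentored_by", "alice")

print(g.kinds())
print(g.related("alice", "friend_of"))
```

In a relational schema, each new kind of relationship would typically mean a new join table or a schema migration; here it is only another key.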

Pretty much everywhere I interviewed back in February 2008 was either building their own graph database, working on an existing one, or repurposing a relational database (or, in one case, a search backend), to kinda, sorta behave like one. The W3C, not one to be left behind when there’s a specification to be written, is even working on a SQL-inspired query language intended to search them1.

Most applications have some combination of totally un-relational data that can go in a key-value store, some strictly relational data that belongs in a SQL database, and some flexible, highly relational data that belongs in a graph database.

What will happen when these alternative databases start giving traditional relational databases a run for their money? Well, sharding, caching, and normalization all start to sound a lot more complex when the data is in a few different kinds of databases — but then again, maybe optimization won’t be as necessary if a single SQL database isn’t doing all the heavy lifting. Object-relational mappers (and the web frameworks that use them) might need to talk to, and abstract away from, different kinds of databases2.

And the different types of data won’t always be easily separated along table boundaries. Maybe these different types of databases will talk to each other, or maybe they will mature into über-databases that understand lots of different types of data relationships.

But the monolithic, strictly relational, master SQL database is eventually going to go the way of Cobol3.

  1. Of course, if it’s anything like other technologies designed by the W3C, it’s a steaming pile. 
  2. Some can already handle talking to multiple SQL databases, and of course there’s two-phase commit
  3. Or Kobol

The next big thing, part 2: Taking the web out of web applications

Monday, March 30th, 2009  

Part of an ongoing series.

A web application is just a stateless1 application that responds to various requests by performing actions and providing resources. There’s no fundamental reason an application must only communicate over HTTP. Web applications are going to start adding alternative methods of interaction, and I think the first common one will be email.

Perhaps an example will best illustrate this:

Like many web forums, posts to Mosuki’s discussion forums get mailed out in email. But, unlike any other web forums I know of, they also behave like mailing lists. All the emails have a reply-to header with an email address that identifies the message, the recipient, and the action to be taken if that email address is used. In this case, the contents of a reply email are posted to the forum exactly as if a reply had been posted via the website.

In other words, the action “post a message” can be accessed via a web page and a browser or via a reply-to header and your mail client.
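One way such a reply-to address could be built is sketched below. This is not Mosuki's actual scheme; the address layout, the `SECRET` key, the `DOMAIN`, and both function names are assumptions made up for illustration. The address carries the action, the message, and the recipient, plus a truncated HMAC so the application can tell it issued the address itself.

```python
import hashlib
import hmac

SECRET = b"replace-with-a-real-secret"  # hypothetical signing key
DOMAIN = "forum.example.com"            # hypothetical mail domain


def _sign(payload: str) -> str:
    digest = hmac.new(SECRET, payload.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def make_reply_to(action: str, message_id: int, user_id: int) -> str:
    """Build an address like reply.42.7.<signature>@forum.example.com."""
    payload = f"{action}.{message_id}.{user_id}"
    return f"{payload}.{_sign(payload)}@{DOMAIN}"


def parse_reply_to(address: str):
    """Return (action, message_id, user_id), or None if the address is invalid."""
    local, _, domain = address.partition("@")
    if domain != DOMAIN:
        return None
    payload, _, signature = local.rpartition(".")
    expected = _sign(payload)
    if not hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8")):
        return None
    try:
        action, message_id, user_id = payload.split(".")
        return action, int(message_id), int(user_id)
    except ValueError:
        return None


address = make_reply_to("reply", 42, 7)
print(address)
print(parse_reply_to(address))
```

The signature only shows that the application generated the address. It says nothing about who actually sent the reply, so it does not solve the forgery problem on its own.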

There are other examples of this separation between input/output channels and the application logic. The most obvious is Twitter, which of course can be interacted with via HTTP or SMS2. And the Son of Sam project intends to let you “use modern concepts like handlers, requests, responses, state machines” to interact with email.

Confirm a Facebook friend request, RSVP to an Evite, revert a Wikipedia edit, or reassign a bug report, just by replying to an email or sending an SMS.

There are a number of technical issues inherent in a system like this.  An application’s framework has to handle multiple input channels, and massage email bodies, HTTP requests, and other input into a least common denominator “request.” Authenticating a user via email, an intrinsically forgeable medium, and protecting against spam, are non-trivial challenges. And a suite of templates suddenly gets a lot more complex when it has to provide views for multiple types of interfaces3.
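A least common denominator "request" might look something like the sketch below. The `Request` fields, the adapter functions, and the convention that the action is the first dot-separated field of the recipient address are all assumptions for illustration, not a description of any existing framework. Authentication and spam filtering are deliberately left out.

```python
from dataclasses import dataclass
from email.message import EmailMessage


@dataclass
class Request:
    channel: str  # "email", "http", ...
    action: str   # what the user wants done
    sender: str   # who is asking (unauthenticated in this sketch)
    body: str     # the content of the request


def from_email(msg: EmailMessage) -> Request:
    # Assumes a hypothetical address layout: <action>.<more fields>@<domain>
    local_part = str(msg["To"]).split("@", 1)[0]
    action = local_part.split(".", 1)[0]
    return Request("email", action, str(msg["From"]), msg.get_content().strip())


def from_http(form: dict, user: str) -> Request:
    return Request("http", form["action"], user, form.get("body", "").strip())


def handle(request: Request) -> str:
    # Application logic sees only the Request, never the channel details.
    if request.action == "reply":
        return f"posted reply from {request.sender}: {request.body}"
    raise ValueError(f"unknown action: {request.action}")


msg = EmailMessage()
msg["To"] = "reply.42.7.abcd@forum.example.com"
msg["From"] = "alice@example.com"
msg.set_content("Sounds good!")

form = {"action": "reply", "body": "Sounds good!"}

print(handle(from_email(msg)))
print(handle(from_http(form, "alice@example.com")))
```

Each new channel (SMS, say) would then need only its own adapter function; `handle` stays the same.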

This blurring of the line between email, HTTP, SMS, and other communications is not new, strictly speaking. But I think it will become commonplace and even expected. Rather than writing a modern (MVC, stateless, REST-ful, &c.) web application, people will be writing modern (MVC, stateless, REST-ful, blah, blah, blah) applications that have web interfaces, email interfaces, and whatever other interfaces they need.

Stay tuned for the next installment of The next big thing: Taking the relational out of relational databases.

  1. More or less stateless, that is, authentication tokens like cookies notwithstanding. 
  2. As well as more standalone apps than you can shake a stick at. 
  3. Generating text and HTML responses for email that look good and work well in the top 75% of desktop and web email clients is a lot harder than testing a site’s HTML in Firefox, IE, Safari and Opera. 

Why your all-graphic website sucks

Friday, March 13th, 2009  

Using only graphics to build a website is 1996’s version of using Flash to build an entire site. Why?

  1. Your users can’t copy and paste the text. You know, if you were, for example, promoting an event and someone wanted to copy the event description into an email or onto an events website.
  2. They can’t scale the text up — even though Firefox’s page zoom will scale the text-images up, they won’t get easier to read, just uglier.
  3. Inline search doesn’t work.
  4. Screenreaders and webcrawlers are out of luck.
  5. And the page takes forever to load. What’s that? Load times don’t matter so much anymore, now that most people are on DSL? Try loading this page on your phone, over EDGE. Blazing fast.

The photo credits are text, not images. The author of this page can’t plead ignorance of how to put text into a web page.

An all-image website doesn’t get in the way of proper scrolling, UI widgets, and functioning URLs (although the URL to this one seems a bit redundant). So building an entire site out of Flash is dumber than using images for all your text. That’s really saying something.

P.S. At least their images are properly transparent PNGs.

P.P.S. At least they didn’t lay the page out using <table> tags. <div> and <span> FTW!

P.P.P.S. This post should not be misinterpreted as denigrating the venerable Cacophony Society or the Brides of March. All denigration is directed solely at their web design. Any failure on the part of the reader to not take this post seriously is not my responsibility.

The Fifth Bottleneck

Wednesday, March 11th, 2009  

CodingHorror points out that the game of “find the bottleneck” that is computer performance optimization is always looking for a bottleneck in CPU, disk, network, or memory.

But there’s a fifth bottleneck — a fifth resource most applications wait on. The user.

If an interface is too difficult to understand, or if an action takes too many clicks or keystrokes, the application will be stuck waiting on the user. If an interface is really bad, the application will sit idle while the user is searching for “how to do X in ProApp 8.0,” or reading the manual, or asking their friends for help, instead of working. And the ultimate interface failure, when a user decides to stop using an application, means, from the point of view of performance, that it will never complete — it’s blocked forever.

Sure, a bad interface won’t slow down a computer. But it does slow the user down. And that’s why programmers care about performance – because we humans want to complete our tasks faster, not because we want computers to complete their tasks faster.

What isn’t new in Ruby 1.9.1 (or in Python)

Saturday, January 31st, 2009  

Like Josh Haberman, I was excited to see the changelog for Ruby 1.9, but immediately disappointed by its vagueness and terseness.

This list is meaningless to anyone who isn’t already familiar with the changes that have been happening in Ruby 1.9.x.

For someone like me who tried an older version of Ruby, there’s nothing to read that will tell me whether it’s worth checking out again.

Take this example, from the changelog:

  • IO operations
    • Many methods used to act byte-wise but now some of those act character-wise. You can use alternate byte-wise methods.

That’s terrifying. If I’m switching to a new version, I need to know exactly which methods have changed and which ones haven’t. Saying that “some” have changed is almost less helpful than saying nothing at all.

Here’s hoping that “improved documentation” will make it into a future Ruby 1.9.x release.

In the same blog post, Haberman makes some inaccurate assertions about Python’s encoding support:

Python has taken an “everything is Unicode” approach to character encoding — it doesn’t support any other encodings. Ruby on the other hand supports arbitrary encodings, for both Ruby source files and for data that Ruby programs deal with.

Incorrect. For the last five and a half years, since Python 2.3, source code in any encoding has been supported, and Python 3.0 expects UTF-8 by default. And of course, Python supports exactly the same wide range of encodings for data. Python’s approach can best be described as “Unicode (via UTF-8) is default.”
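A short sketch of what that looks like for data, written in Python 3 syntax. The sample strings are arbitrary; the codec names are standard ones that ship with Python. (For source files, the equivalent mechanism is a `# -*- coding: latin-1 -*-` style declaration near the top of the file.)

```python
text = "na\u00efve caf\u00e9"  # a Unicode string: "naïve café"

# The same text can be written out in many encodings.
as_utf8 = text.encode("utf-8")
as_latin1 = text.encode("latin-1")
as_utf16 = text.encode("utf-16-le")

# Different bytes on the wire, identical text once decoded with the right codec.
assert as_utf8 != as_latin1
assert as_utf8.decode("utf-8") == as_latin1.decode("latin-1") == text

# Converting legacy data goes through Unicode in the middle.
legacy_bytes = "caf\u00e9".encode("cp1252")
transcoded = legacy_bytes.decode("cp1252").encode("utf-8")

print(len(as_utf8), len(as_latin1), len(as_utf16))
print(transcoded)
```

Strings are Unicode inside the program, and other encodings are handled at the edges, when bytes are read or written.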

Eight Python warts

Thursday, January 15th, 2009  

I love Python, but a few things still bug me about it. I’ve bashed on several other technologies; here’s some Python bashing. In no particular order:

Update: This has started a pretty good discussion on Reddit. Many people correctly guessed that I’m using singleton in the mathematical sense, not in the sense of the programming pattern. The comments from Cairnarvon and tghw are particularly worth reading.

How to avoid sounding like a total asshole when talking about the internet, even though you actually have no idea how it works

Monday, December 1st, 2008  

Alternate title: I know it’s time to get out of the house when I start exercising my license to practice linguistics.

The mainstream media and people who are not internet-savvy have radically different uses of the verbs log in, log on, click on, download and upload.

The common, internet-savvy definitions for log in/on and click on are:

log in, log on: to identify yourself and gain access to a resource on a computer by providing authenticating information.

click on: to press the mouse button down and then release it, while the mouse pointer is over a visible link, menu item, or icon or other resource on the computer screen.

The non-internet-savvy use appears to simply be shorthand for the verb go or visit. For example, an announcer on the evening news might say:1

For more details on this story, log on to our website at www….

No identifying information is needed to allow access to the news’ website; the intended meaning is simply to visit. The particle constructions click on and click on to are also occasionally used with this same intended meaning:

For more details on this story, click on to our website at www….

For more details on this story, click on our website at www….

The news announcer wouldn’t be reading out the website address unless the listener needed to type it in first. Since the address isn’t going to be visible on the listener’s screen until they type it in, there’s nothing to click on. And once the listener is finished typing it in, they’ll be automatically taken to the site when they push enter. And in no modern browser is the website address ever clickable. So this usage of click on has the meaning of go or visit as well.

The internet-savvy uses of download and upload are deictic, like the verbs send and receive, or come and go. That is, their meaning is dependent on the location of the agent performing the action:

download: to cause electronic data, files, or other information to move towards the agent, generally over a network.

upload: to cause electronic data, files, or other information to move away from the agent, generally over a network.

For example, consider two people, Alice and Bob. Alice is working from home and Bob is at the office.  If Alice is going to transfer a file from her computer at home to a computer at the office, where Bob is, they would both use upload, because the data is moving away from the agent, Alice:

Bob: Can you upload today’s TPS reports?

Alice: Sure. I’ll let you know when they’re done uploading.

Download in place of upload would mean that the TPS reports2 were on the computer at the office, and Alice was transferring them to her computer at home:

Bob: Can you download today’s TPS reports?

Alice: Sure. I’ll let you know when they’re done downloading.

Similarly, if Bob were the one initiating the action of obtaining the information from Alice’s computer, he would use download, and if he were sending the information to Alice’s computer, he would use upload.3

The non-internet-savvy usage of download simply means transfer. It has no deictic component and encompasses the net-savvy meaning of upload and download, as well as that of simple copying:

Once this file is finished downloading to my Hotmail I’ll download it to you in an email.

I downloaded the photos from the CD to my computer and now they won’t load.

Unsurprisingly, upload does not seem to be present in the non-internet-savvy dialect; possibly because download encompasses its meaning entirely.

Although there are probably more differently-used terms, these four usage differences are more than enough for internet-savvy speakers to identify non-internet-savvy speakers.  Generally the internet-savvy speakers then make the same kind of assumptions that adults make about children who mimic and misuse words they’ve just learned — specifically, that the speaker has no idea what the words he or she is using actually mean, and therefore no idea what he or she is talking about.

Sounding internet-savvy is easy, then, even if you’re not. Use visit instead of log in/log on when the user doesn’t actually have to identify themselves to use a website or other resource.  Don’t use click on unless there’s actually something to click on. And think about where the data is going and who’s making it go there before choosing between upload and download. Before you know it, people will start asking you for help debugging their IPv6 firewall rules and recompiling their embedded Linux kernels.4

  1. That’s right, Fox News: read up and find out how real live internet users talk about it. 
  2. The type of report is irrelevant to these examples. 
  3. Unless, of course, Bob or Alice were downloading something illegal with BitTorrent or other peer-to-peer software, in which case they would say something like: d00d eye m d0wnlo4d1n9 t|-|is t0rr3nt @ th3 k-rad sp33d of 7.3KB/s!!!1!!1!  
  4. Don’t actually help them do these things, though. You might break something. 

Worst website ever

Tuesday, November 25th, 2008  

Zzzphone.com. Excessive use of Flash. Unsolicited, over-compressed, auto-starting audio (on the specs page; there’s no surer way to get users to leave a site and never come back than automatically assaulting them with unexpected audio when the page loads). Random fonts and sizes, and graphics containing text. Best part: click on the shopping cart link at the bottom of the page (obviously without ordering anything first), and this comes up:

Step 3 of 6? Am I allowed to skip steps like that? Buy another nothing? Can I get a discount on that? (via Max.)

Oh, and in case anyone out there needed reminding that you can design complex, interactive, highly graphical sites in HTML without Flash, check out Shiftn’s Obesity System Influence Diagram.