Again, this is more for my benefit than for yours. If I don't write this down, I'll forget it.

Dive Into Python 3 was commissioned in January 2009 by Apress, who published the original Dive Into Python in 2004. Upon agreeing to contract terms, I registered a ten-year lease on and immediately published a draft table of contents.

The original DiP was written in DocBook XML. As I've mentioned before, I chose DocBook XML because I wanted to learn XML and XSL, and DocBook seemed to be Just The Thing for technical documentation. There was also a bit of self-grandeur involved. I was writing a book For The Ages, so it was important that it be in a Format Of Forever. And in the short term, I could transform The Format Of Forever into useful (but lowly) Output Formats, so I could do unimportant things like publish it online.

For The Ages turned out to be about 10 years. The Format Of Forever is still going strong, but Python itself changed so quickly that it didn't matter.

Oh, and there was one other little thing that happened between 2000 and 2009: search stopped sucking and took over the web. Kids today may not remember, but it used to be hard to find stuff on the web. Once you found it, you wanted to download it so you could read it offline.

Remember being "offline"?

Anyway, I now realize that there were some hidden assumptions behind my design decisions in 2000. Some of those assumptions turned out to be wrong, or at least not-completely-right. Sure, a lot of people downloaded DiP, but it still pales in comparison to the number of visitors I got from search traffic. In 2000, I fretted about my "home page" and my "navigation aids." Nobody cares about any of that anymore, and I have nine years of access logs to prove it.

So, I am writing DiP3 in pure HTML and, modulo some lossless minimizations, publishing exactly what I write. This makes the proofreading feedback cycle faster -- instead of "building" the HTML output, I just hit Ctrl-R. I expected it to make some things more complicated, but they turn out not to matter very much.

Some examples:

  • DocBook autonumbers callouts in code blocks. In HTML, if I insert a callout in the middle of a code block, I have to renumber the other callouts manually. This has happened a few times, but it's not something I do all the time.
  • When I used DocBook, I used custom XML entities for each code fragment in a code block. (That is, I show a complete function, then I have separate code blocks to show and talk about individual lines of code.) In HTML, I copy-and-paste. This hasn't bitten me much yet because I haven't had to refactor much code. Then again, the XML entities technique was also difficult after refactoring, and it required escaping to XML's weird quoting rules.
  • DocBook let me "chunk" the HTML output differently -- by section, by chapter, or everything in one big file. I published by section online and let people download as one big file. Nobody cared, and I have logs to prove it. Furthermore, the chapter was always the atomic unit -- I would take one concept or one piece of code and build lessons around it for an entire chapter. Splitting up a chapter into multiple pages was just annoying: people who came in from search engines and landed in the middle of a chapter wouldn't immediately understand the context. And everyone comes in from search engines.
  • Related to the previous point, HTML is not an output format. HTML is The Format. Not The Format Of Forever, but damn if it isn't The Format Of The Now. Looking back on it now, the HTML output from DocBook was atrociously inefficient. Tables-for-layout for callouts and tips, empty anchors everywhere, and lots of other markup cruft everywhere. Writing my own HTML, I can put in only the markup I need -- and it turns out I need very little. Callouts are now ordered lists, tips are now blockquotes, etc. Putting an entire chapter on one page sounds bloated, but consider this: my longest chapter so far would be 75 printed pages, and it loads in under 5 seconds. On dialup.
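The manual callout renumbering mentioned in the first bullet could, in principle, be scripted. Here is a minimal sketch, not what the book actually uses: it assumes callouts are marked up as `<span class="callout">N</span>`, which is a hypothetical format invented for this example, and it renumbers them in order of appearance.

```python
import re

# Hypothetical marker format, assumed for illustration only:
#   <span class="callout">N</span>
CALLOUT = re.compile(r'<span class="callout">\d+</span>')


def renumber_callouts(block):
    """Renumber callout markers 1, 2, 3, ... in order of appearance."""
    counter = 0

    def replace(match):
        # Ignore the old number entirely; position decides the new one.
        nonlocal counter
        counter += 1
        return f'<span class="callout">{counter}</span>'

    return CALLOUT.sub(replace, block)


if __name__ == "__main__":
    # A callout was inserted in the middle, so the numbers are out of order.
    code = (
        'import os <span class="callout">1</span>\n'
        'print(os.getcwd()) <span class="callout">1</span>\n'
        'os.chdir("/tmp") <span class="callout">2</span>\n'
    )
    print(renumber_callouts(code))
```

The matching ordered list of explanations would still have to be reordered by hand; this only fixes the numbers inside the code block.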

Furthermore, I am no longer under the illusion that this book will be useful forever. Python will either continue to evolve or it will die; either way, static documentation has a shelf life. Today's cutting edge code is tomorrow's mainstream code is next year's legacy code. DiP's shelf life was about 10 years. I am supremely confident that the HTML I'm writing today will still be readable 10 years from now, and after that it won't matter because I'll have to rewrite the whole damn book anyway.

See you in 2020 for Dive Into Python 4!



Welcome to the first annual "dive into mark" show! It's just like reading my blog, except it takes forever to download, requires an unwieldy array of third-party software, and it's not accessible to blind people, deaf people, or search engines.

Also, it involves shaving. I hate shaving.

In the "news I don't care about" department, a company I'll never work for has announced that it will not be shipping a new filesystem I'd never trust in an operating system I'll never use. The so-called "WinFS" filesystem was supposed to feature rich metadata and schemas to help you organize your ever-growing porn collection.

Joe Gregorio, seen here preaching the Gospel of Atom, predicted the non-shipping-ness of WinFS in 2003, saying "WinFS is the file system formerly called Cairo and has repeatedly not shipped since 1995. If it ever did ship it would be a complete failure because it does not solve a problem that anyone actually has."


And now, here's a handy phrase I've taught my two-year-old to say: "Mama, bite me."

Regular readers of my blog are painfully aware that I recently switched to Ubuntu after 22 years of servitude to Apple. Ubuntu is an ancient African word meaning "can't install Debian". One thing I neglected to mention about my switch is EasyUbuntu, an application designed to take all the pain out of violating patents, breaking laws, and compromising the very principles that led you to Linux in the first place, in exchange for being able to watch a recreation of the latest box office hit in 30 seconds with bunnies.


In personal news, my book Dive Into Python, which hit bookstores almost two years ago, has finally earned out its advance. "Earned out" is a technical term that means that enough suckers have paid money for a book they could have downloaded for free that I'm finally starting to get royalty checks. My first check was for $247. Here's a tip for aspiring young authors: don't quit your day job.


[Dive Into Python]

Please buy 4000 copies so I can pay back my advance. Thank you.


Dive Into Python is almost finished. I still need to write one more chapter, but I've incorporated all the revisions from the technical reviewer (the ultra-talented Anna Ravenscroft). Now the copy editor is wielding her virtual pen and striking through every word I've ever written. Incorporating her revisions is simultaneously humbling, enlightening, and mind-numbingly tedious.

Here are the main things I've learned so far:

  • I use "have to" when I mean "need to".
  • I misplace the word "only". Instead of "you can only walk through a stream once", the copy editor prefers "you can walk through a stream only once".
  • I use "lots" when I mean "a lot".
  • I use "which" when I mean "that".
  • I overuse footnotes to be cute. This is a bad habit I picked up from the interactive fiction version of Hitchhiker's Guide to the Galaxy and the infamous footnote 12.
  • I use "like" when I mean "such as".
  • I use "then" immediately after a comma, when I mean "and then".
  • I overuse semicolons for no particular reason except that I've always liked them.
  • I use "note" when I mean "notice", and vice-versa.
  • I use "we" when I mean "you". "As we saw in the previous chapter..." "We'll work through this example line by line." And so forth. Apparently we won't be working through this example. You will be working through this example; I will be in the Bahamas drinking my royalty check.

In related news, my copy of Eats, Shoots & Leaves arrived yesterday. It is hysterically funny, if you like that sort of thing.

In vaguely related news, here is a tip for people who do a lot of work in DocBook. In Firefox, create a new bookmark for, and then give it a keyword (I used the letter "d"). Then you can type things like "d xref" into your address bar to go to the xref reference page. Go go gadget hyperlink.
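For the curious, here is a rough sketch of what a keyword bookmark does with the text you type. It assumes the bookmarked URL contains a `%s` placeholder, which Firefox replaces with whatever follows the keyword. The URL template below is a made-up stand-in (`example.org`), not the real DocBook reference address, and `expand_keyword` is a hypothetical helper name.

```python
from urllib.parse import quote

# Keyword -> URL template. The template is hypothetical; only the "%s"
# substitution mirrors how keyword bookmarks behave.
BOOKMARKS = {
    "d": "https://example.org/docbook/%s.html",
}


def expand_keyword(address_bar_text):
    """Turn 'keyword rest-of-text' into a URL, or None if no keyword matches."""
    keyword, _, query = address_bar_text.partition(" ")
    template = BOOKMARKS.get(keyword)
    if template is None:
        return None
    # Percent-encode the typed text before dropping it into the template.
    return template.replace("%s", quote(query))


if __name__ == "__main__":
    print(expand_keyword("d xref"))
```

One keyword per reference site is enough; the part after the keyword does all the work.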


Last weekend someone told me that there was no male counterpart to female intuition. That is, there was no such thing as male intuition. Which is crap. Men may not be the brightest bulbs in the bunch, but we can sense one thing: when we are being introduced to our girlfriend's next lover. Trust me. I've been on both sides of this.

On today's hardware, I can build the HTML, PDF, Word, and plain text versions of Dive Into Python in 3 minutes. This took 12 minutes 2 years ago. Moore's Law. Mooooooore's Laaaaaaw.

Overheard: Oh, now I see why you're divorced.

I am strongly in favor of allowing heterosexuals to continue to marry. I don't believe, as some people do, that only homosexuals are capable of loving, committed relationships. I know, 50% divorce rate, blah blah. That doesn't mean people should be denied the opportunity solely based on their sexual orientation. It's not like being heterosexual is a choice.

I also feel strongly that sighted people should be empowered to use the Internet. I don't care how well-structured and semantic your markup is; if it doesn't look good on screen, you're discriminating against people who can't afford screen readers, and that ain't right.

BigCo wisdom of the day: the commute is hellish, and the bureaucracy is staggering, but sometimes there's free cake.

I'm having my eyebrows waxed on a regular basis now.

That's it, really.