Shortlog - a log of everyday things

2012-02-17

On plaintext markup formats.

I can't say I'm satisfied with any of the current plaintext markup formats. All of them have warts. The most popular on the web at this point is probably Markdown. GitHub uses it. So do Reddit and Stack Overflow. So do a variety of blog engines, including Posterous, Tumblr, and WordPress. And for good reason: it's probably the best one out there.

What do I think about it? I love some things, and I really hate some others. First, the good:

My dislikes tend to stem from a few things where I think the syntax either looks atrocious or strikes me as poorly designed:

Even with all these complaints, Markdown is the best we have. There are projects that have extended it to improve (among other things) the table situation. The alternatives fairly uniformly suck: reStructuredText suffers from about two-thirds of the same criticisms, lacks nearly any redeeming features, and requires way too many empty lines. AsciiDoc has the same atrocious header behavior, has horribly difficult-to-read plaintext, also wastes underscores on italics, and has horrid list nesting if you have list items longer than six characters. BBCode is practically HTML with < and > replaced by [ and ]. MediaWiki is intended to have a richness of features that goes far beyond what most text-to-HTML engines require, but it does some silly things, like using nothing but runs of apostrophes (') for italics and bold, and its lists are ugly. Textile has a somewhat redeeming table structure, but its lists have the same issues as most other markup formats, its code blocks and headings are so close to HTML proper as to be not worth using, and it uses underscores for italics again.

Now, I happen to like a lot of things about DokuWiki's syntax. I find its list syntax attractive. It's got its own heap of problems: lists can't contain many other potentially interesting elements, nested lists behave inconsistently depending on how you indent them, and list items can't span multiple lines. That last criticism I see as a feature; if you're writing such long bullet points that you need to split them into multiple paragraphs, you're using bullet points incorrectly. Alone among the text markup formats I've seen, DokuWiki requires you to indent the stand-in for the bullet point itself, even at the zeroth level.
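
To make that last point concrete, here is a minimal Python sketch of recognising DokuWiki-style list lines. It assumes the documented convention of at least two leading spaces followed by * (unordered) or - (ordered), and it further assumes two spaces per nesting level. The name parse_list_line is hypothetical; this is not how DokuWiki's own PHP parser is written.

```python
import re

# Hypothetical helper, not DokuWiki's actual parser.
# A list line needs at least two leading spaces before the marker,
# so "* item" at column zero is NOT a list item.
LIST_ITEM = re.compile(r"^( {2,})([*-]) ?(.*)$")


def parse_list_line(line):
    """Return (level, kind, text) for a list line, or None for any other line."""
    match = LIST_ITEM.match(line)
    if match is None:
        return None
    indent, marker, text = match.groups()
    # Assumption: two spaces per nesting level, with the first
    # (mandatory) indent counting as level zero.
    level = len(indent) // 2 - 1
    kind = "unordered" if marker == "*" else "ordered"
    return level, kind, text.strip()
```

Under these assumptions, a line with two leading spaces and a * parses as a level-zero unordered item, while the same line without the indentation is treated as ordinary text.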

DokuWiki's table format is also terse, yet effective. It doesn't appear to support super-fancy use, but it's simple and readable enough that I like it. Using the whitespace inside cells to determine alignment is clever and well done. DokuWiki is also rare in that it gets the use of underscores and asterisks right.
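
As a rough illustration of the whitespace-based alignment, the Python sketch below reads one table row and guesses each cell's alignment. It assumes the rule as I understand it from the DokuWiki syntax documentation: two or more spaces on the left align right, two or more on the right align left, and two or more on both sides center. The names cell_alignment and parse_row are hypothetical, and the splitting is deliberately naive (it would break on links that contain a | character).

```python
import re


def cell_alignment(cell):
    """Guess alignment from padding; assumes the two-or-more-spaces rule."""
    left_pad = len(cell) - len(cell.lstrip(" "))
    right_pad = len(cell) - len(cell.rstrip(" "))
    if left_pad >= 2 and right_pad >= 2:
        return "center"
    if left_pad >= 2:
        return "right"   # padding on the left pushes the text right
    if right_pad >= 2:
        return "left"    # padding on the right pushes the text left
    return "default"


def parse_row(line):
    """Split a row on | (cell) or ^ (header) and return (text, alignment) pairs."""
    # The row starts and ends with a delimiter, so drop the empty
    # first and last pieces produced by the split.
    cells = re.split(r"[|^]", line)[1:-1]
    return [(cell.strip(), cell_alignment(cell)) for cell in cells]
```

Calling parse_row on a row such as "|    right|  center  |left    |" would report the three cells as right-aligned, centered, and left-aligned respectively.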

DokuWiki's linking system is subpar: it's clearly meant for a wiki, which makes it slightly harder to repurpose, and it's not clear how embedding assets works if you're not using them in the context of a wiki. I'm not a fan of using \\ as a line break either.

The biggest problem with DokuWiki's syntax, though, is that it appears to have exactly one implementation, and that's the (extensible!) codependent lexer and parser that sit in the DokuWiki PHP source. I cannot find any renderers for its markup in any other language. And that is a shame.

I'm in the process of making my own derivative plaintext markup language that takes the behaviors I like from Markdown and DokuWiki. When it's ready, I will use it internally on my blog to write entries. I hate that it'll be Yet Another Plaintext Markup Format, but at least I'll be happy with it.

And that's just one of the ~six technical things I've been working on this week. If you're curious about the others, drop me a line, and I'll try to write up my progress on each of them a little more.


Comments:

Jono | 2012-03-19T14:35:28.727696

I'd be interested to hear about the other technical things!

Drew | 2012-03-19T15:54:15.789222

Will do!

By the way, if you're wondering why all these stories appeared suddenly in your feedreader: turns out Google-Feedfetcher stopped polling my feed in January when I migrated DNS to another machine but briefly failed to drop the old machine's authoritative claim for the zone. Since I've fixed that now and explicitly told Google-Feedfetcher to start polling again, the feed should be back and updating when I write things. :)

Jono | 2012-03-19T15:58:07.270735

Ah! I must admit, I simply assumed you had gotten behind on blogging and back-dated a series of posts :-)