apocryph.org Notes to my future self

16Mar/082

Requirements for Drupal migration tool

After my flash of insight I’ve decided to build a tool to help me migrate apocryph.org away from Drupal.

Requirements are:

  • Work with my Drupal configuration
  • Not so tightly coupled to my Drupal configuration that no one else can use it
  • Output posts in a neutral format that users can post-process and import into other tools
  • Preserve all the important elements of each post, including:
    • Formatting. Most posts are in Markdown, and a few are in SmartyPants. Those must be converted to XHTML using the same rules which generate the markup in Drupal
    • Files. A few of my posts have files attached, usually images but sometimes other stuff. Those files must be preserved themselves, and any references to the files from within a post (like IMG or A elements) must be preserved as well
    • Metadata. The tags, author, timestamp, published/unpublished flags must be preserved
    • Links. I often link between posts, and the migrated output must preserve those links. They can’t go away or be 404
    • URLs (ideal but not required). Notwithstanding the above requirement to preserve intrasite links, it would be vastly preferable if the actual URL of each migrated post could be preserved, so any existing links to pages on Apocryph keep working.
    • Comments. Existing comments attached to posts must be preserved in their entirety.

Right now I’m looking at the WordPress eXtended RSS (WXR) format. It builds on the standard RSS format with additional WP-specific tags, which I think will accommodate my requirements. As an added bonus, WordPress has built-in import support for WXR files, so I can easily suck the resulting file into WordPress, which I’ve selected as the successor to Drupal.
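As a rough illustration of what that output could look like, here is a minimal Python sketch that turns a post into a WXR-style RSS item. The post dictionary layout and the function names are hypothetical, and the namespace URIs are the ones I believe WXR 1.0 exports use; check them against a real WordPress export before relying on this.

```python
import xml.etree.ElementTree as ET

# Namespace URIs as used by WXR 1.0 exports (an assumption to verify against
# an export from the target WordPress version).
NS = {
    "content": "http://purl.org/rss/1.0/modules/content/",
    "dc": "http://purl.org/dc/elements/1.1/",
    "wp": "http://wordpress.org/export/1.0/",
}
for prefix, uri in NS.items():
    ET.register_namespace(prefix, uri)


def q(prefix, tag):
    """Build a namespace-qualified tag name for ElementTree."""
    return "{%s}%s" % (NS[prefix], tag)


def post_to_item(post):
    """Convert one post (a plain dict with a hypothetical schema) to an <item>."""
    item = ET.Element("item")
    ET.SubElement(item, "title").text = post["title"]
    ET.SubElement(item, "link").text = post["url"]
    ET.SubElement(item, q("dc", "creator")).text = post["author"]
    ET.SubElement(item, q("content", "encoded")).text = post["xhtml"]
    ET.SubElement(item, q("wp", "post_date")).text = post["date"]
    ET.SubElement(item, q("wp", "status")).text = (
        "publish" if post["published"] else "draft"
    )
    ET.SubElement(item, q("wp", "post_type")).text = "post"
    for tag in post["tags"]:
        ET.SubElement(item, "category").text = tag
    for c in post["comments"]:
        comment = ET.SubElement(item, q("wp", "comment"))
        ET.SubElement(comment, q("wp", "comment_author")).text = c["author"]
        ET.SubElement(comment, q("wp", "comment_date")).text = c["date"]
        ET.SubElement(comment, q("wp", "comment_content")).text = c["body"]
    return item


def build_wxr(posts):
    """Wrap the items in an RSS 2.0 channel and return the XML as a string."""
    rss = ET.Element("rss", {"version": "2.0"})
    channel = ET.SubElement(rss, "channel")
    for post in posts:
        channel.append(post_to_item(post))
    return ET.tostring(rss, encoding="unicode")
```

This covers the metadata, tag, and comment requirements. Attached files and intrasite link rewriting would need a separate pass over the converted XHTML.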

17Feb/081

Lesson Learned: Face Detection is Hard

A while back I thought it would be a good idea to implement face detection in Gallery2, based on a similar feature in Facebook. I downloaded the OpenCV computer vision toolkit and ran the facedetect sample app against a collection of 60 photos from my gallery, some with no faces, others with a single face, still others with multiple faces, faces in profile, etc. No matter which training file I used, the face detection was horribly unreliable. I would estimate 10% of actual faces were detected, and of all the faces reported by the tool, maybe 5% were actual faces.
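Those two estimates are recall (the share of real faces that were found) and precision (the share of reported faces that were real). A small Python sketch makes the arithmetic explicit. The counts in the usage note are made up purely to match the rough percentages above; they are not measurements from my test run.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Return (precision, recall) for a detector given raw counts."""
    reported = true_positives + false_positives   # everything the tool flagged
    actual = true_positives + false_negatives     # every real face in the photos
    precision = true_positives / reported if reported else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall
```

With hypothetical counts of 100 real faces, 10 of them found, and 200 detections reported in total, `precision_recall(10, 190, 90)` gives a precision of 5% and a recall of 10%.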

This was pretty disappointing, so I uploaded the same corpus of photos to Facebook to see how well it did with them. Imagine my surprise when it didn’t detect any! The face detection I was seeing on my sister’s profile was due to manual tagging of the photos from within the upload applet, and was not driven by computer vision algorithms. That explains that.

I still think a similar feature in Gallery would be nice, but with the CV aspect removed it doesn’t seem quite so fun.

28Jan/082

Project Idea: Face detection in Gallery2

This past weekend my little sister and I were going through the Facebook profiles of various cousins, and I noticed something about Facebook’s photo support that I somehow missed before: it automatically detects the presence of faces in each photo, and allows users to tag each face with the identity of its owner. Already-tagged faces have the owner’s name superimposed over the image.

That’s an awesome feature, and reminds me of the stuff Riya was working on a few years back (although FB doesn’t do facial recognition (yet, anyway), so you still have to tag everyone yourself). I was immediately jealous that my photo hosting software of choice, Gallery, didn’t have this feature.

I investigated this a bit, and I found that Intel’s OpenCV library includes open-source face detection code. Using Intel’s sample face detect app, I found it to be both quick (~150ms per photo) and accurate. I wonder how much work would be required to create a Gallery module that used OpenCV to detect faces in photos, and provided an AJAX UI for tagging the photos. It would certainly be cool.

1Jan/086

Proof of Concept: Bluetooth SMS Chat

As soon as I got my Nokia N810 Internet Tablet, I set up the WLAN and configured the built-in chat application to use my Google Apps for Domains GTalk accounts. Chatting on the slide-out keyboard from the comfort of a TV chair or couch was immediately sweet, and the automatic conversation archiving still allows me to go back and look at transcripts.

The problem is this doesn’t work when I’m out of range of an access point. Sure, I spend the bulk of my life within range of my home and office access points, but sometimes I’m traveling or out and about. I could of course use my Motorola K1M as a tethered EVDO device, but I’m not willing to pay VZW $60/mo for the privilege, nor am I willing to fraudulently use the EVDO without paying for it by manually modifying the Mobile IP profile. So, where does that leave me?

There’s only one phone service I can use without limit, and that’s SMS messaging. As it happens, the main app I want to use on the move is…chatting! So why can’t I use the N810 chat functionality with my K1M phone to chat via SMS? Indeed.

With the idea in mind, I set about looking for software to do what I want. I found gnokii, which supposedly supported all the necessary SMS functions (list, send, receive, register for receive notifications), but I couldn’t get it to run on the N810’s OS2008 to save my life. A few other SMS toolkits like SMSlib and Gammu weren’t any better.

Next I figured I could write something myself, at least as a proof of concept. So I started inquiring as to how one does SMS over Bluetooth. After much googling and gnashing of teeth, I determined that, in most cases, one connects to the phone over a serial link (USB, Bluetooth, or RS-232; the physical details don’t matter), and sends–get this–AT commands.

Unless you’re at least my age, you don’t know what AT commands are. Back in the dark ages, before broadband, we used devices called ‘modems’ to connect to other computers over regular phone lines. Modems were God-forsaken devices trying in vain to link the 1950s phone technology to the coming 21st century. One controlled a modem by sending ‘AT’ commands, so-called because each command began with ‘AT’. For example, ‘ATDT’ means ‘dial with DTMF’, ‘ATH’ means ‘hangup’, etc. Apparently, some European cell phone back in the 80’s used AT commands for its automation interface, and everyone ever since has gone the same way.

Once I realized I was dealing with the AT commandset, I still needed to figure out what AT commands I needed. To start with, I took a guess that the COM port that appears on my laptop when I plug my K1M into the USB port (only because I installed the Motorola USB drivers; don’t expect shit like this to be plug-and-play) is the serial link suitable for using AT commands. To test the theory I plugged in my phone, fired up HyperTerminal, configured it to connect to the raw serial port used by my phone, and typed ‘ATZ’, which resets the state of the AT command processor. Lo! and behold, I got the ‘OK’ response back, confirming I was talking to an AT-aware device.

Once I had that working, it was time to figure out what exact AT commands to use. By chance I ran across gammu, a bit of Czech open source that supposedly exposes most of the data functions of different types of phones, in the form of a command line. After some fucking about I came up with a gammurc file that worked with my phone, gave it the COM port I found earlier, and had it do some commands like listing folders and such. The cool thing about Gammu is it can generate a logfile of all the stuff it sends to and receives from the COM port, which is a great way for me to figure out what commands do what.

I was able to do a gammu text getsmsfolders successfully, but then I did gammu text getsms 1 1 and was nailed with “Function not implemented. Help required.” Doh!

Getting desperate, I went to the Motorola developer’s site. I poked around their tech docs, and stumbled across a PDF document called “G24 AT Commands Developer Guide”. The G24 is a cellular card for use embedding cellular technology into electronic devices, but my hope was Motorola developers don’t reinvent the wheel (especially the AT commandset wheel) with every cellular device, and thus that the AT commands for the G24 resemble those for the K1M.

I skimmed the document, testing out various commands in HyperTerm. Some of them worked on my K1M, and others didn’t. Then I got to the ‘SMS’ section of the document and hit paydirt. The document walked through the whole process of listing messages, choosing folders, writing messages, sending them, and deleting them. Yes!

You can get the details for yourself by downloading the G24 AT guide (motodev accounts are free), but here’s a sample (note this is a transcript from my test session; not all the commands are relevant to sending SMS):


AT+MODE=2
OK

+MBAN: Copyright 2000-2004 Motorola, Inc.
AT+CGMI
+CGMI: "Motorola CE, Copyright 2000"

OK
AT+CGMM
+CGMM: "CDMA800","CDMA1900","MODEL=K1mm"

OK
AT+CGMR
+CGMR: "24.0_00.26.0F"

AT+CPBS=?
+CPBS: ("ME","MT","ON","DC","MC","RC","AD","QD")

OK
AT+CPMS=?
+CPMS: ("MT","IM","OM","DM"),("OM","DM"),("IM")

OK
AT+CPMS="IM"
+CPMS: 0,179

OK
AT+CMGF=1
OK
AT+CMGL="ALL"
OK
AT+CPMS="MT"
+CPMS: 4,179

AT+CMGW="xxxxxxxxxx" <----- This is the destination number for the SMS

Fear is the mind killer <----- This is the text of the message, terminated by a Control-Z character
+CMGW: 121

OK

Pretty sweet! There’s also docs for call control, changing the character set, and GPRS stuff.
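To make the shape of that exchange concrete, here is a minimal Python sketch that builds the same byte sequences as the transcript (text mode, then AT+CMGW with the body terminated by Control-Z) and pulls the storage index out of the +CMGW response. The function names are my own invention for illustration. In the standard SMS AT command set a stored message is then sent with AT+CMSS=<index>; the transcript above doesn’t show that step, so treat it as an assumption for any particular phone.

```python
CTRL_Z = b"\x1a"  # terminates the message body in text mode


def build_store_sms(number, text):
    """Return the byte chunks to write, in order, to store one SMS.

    Mirrors the transcript: AT+CMGF=1 selects text mode, AT+CMGW writes the
    message to the phone's storage, and Ctrl-Z ends the body.
    """
    if '"' in number or "\r" in text or "\x1a" in text:
        raise ValueError("number or text contains a quote or control character")
    return [
        b"AT+CMGF=1\r",
        ('AT+CMGW="%s"\r' % number).encode("ascii"),
        text.encode("ascii") + CTRL_Z,
    ]


def parse_cmgw_index(response):
    """Extract the storage index from a '+CMGW: <index>' response, or None."""
    for line in response.splitlines():
        line = line.strip()
        if line.startswith("+CMGW:"):
            return int(line.split(":", 1)[1])
    return None
```

A real client would wait for the phone’s reply between chunks rather than writing all three back to back.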

So, I had the basic AT commands needed to list, read, and send SMS messages. That’s great, but I was only able to do it with a USB connection to the phone, on a Windows machine with special Motorola drivers. I want it over Bluetooth, on my N810.

As a first step, I thought I’d try Bluetooth under Linux, which is basically what the N810 environment is. So I fired up thorby, my Ubuntu Feisty VM, and plugged in my Linksys USBBT100 USB Bluetooth adapter. Linux detected it right away, and I paired it with my phone in the usual way. Using these instructions, I set up a /etc/bluetooth/rfcomm.conf file mapping the DUN profile of my phone to /dev/rfcomm0. Since my phone doesn’t support the Bluetooth Serial Port Profile (SPP), I figured the only other profile that might allow me to issue AT commands is Dial-up networking, since after all it is semantically a dial-up modem device.

Once I had /dev/rfcomm0 mapped, I installed gtkterm, which is a GTK+ terminal program along the lines of HyperTerm. I pointed it at /dev/rfcomm0, 115200 bps, 8-N-1, and issued ATZ. Once again, I got back OK, and I was in!

I then issued more or less the same commands as above, and was able to send another SMS message, this time over Bluetooth!
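What I did by hand in gtkterm boils down to a small loop: write a command, then read lines until the phone says OK or reports an error. A minimal Python sketch of that loop follows. It takes any binary file-like object, so it makes no claim about how the port is opened; something like open("/dev/rfcomm0", "r+b", buffering=0) is an untested assumption, and a serial library may be the more practical choice.

```python
def send_command(port, command, max_lines=50):
    """Write one AT command and collect response lines until OK or an error.

    `port` is any binary file-like object with write() and readline().
    Blank lines and the phone's echo of the command are skipped.
    """
    port.write(command.encode("ascii") + b"\r")
    lines = []
    for _ in range(max_lines):
        raw = port.readline()
        if not raw:  # EOF, or a read timeout that returned nothing
            break
        line = raw.decode("ascii", errors="replace").strip()
        if not line or line == command:
            continue
        lines.append(line)
        if line == "OK" or "ERROR" in line:
            break
    return lines
```

Issuing ATZ through this function should return a list ending in OK when the link is healthy, which is the same smoke test I ran in the terminal.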

So, at this point I now know that I can, in theory, use my N810 as a sort of SMS client via a Bluetooth connection to my K1M. I only today got the Maemo 4.0 development environment working, so I’m a bit behind the curve on building apps for the N810, but my first instinct is to try to get gnokii working, since it already has a bunch of shit in it to support a wide range of phones, and there’s really no reason to reinvent the wheel. However, if gnokii gives me any pushback, I’ll just roll my own and worry about supporting other phones if my prototype works to my satisfaction.

I must say, this little N810 has been just the thing to get me back to hacking on embedded devices; I forgot how rewarding it can be to slog through a jungle of underdocumented proprietary voodoo, and come out the other side with a neat solution.

26Aug/073

Rendering disabled ('grayed out') image buttons in WPF

Earlier today I was ranting about lack of built-in support for grayed-out image buttons in WPF. I’ve come up with two workarounds; one correct, and one workable.

The first one I figured out by looking at some C# code here. The idea is to use a FormatConvertedBitmap to convert the original image to grayscale. This works, but then kills the alpha channel so your transparent PNGs aren’t transparent anymore. Thus, I also use the OpacityMask property of Image set to an ImageBrush based on the original image. It’s nasty, complicated, and (probably, in a real app) slow. Here’s the markup:

<TextBlock>
    This is a grayed Image loaded from a BitmapImage, with an opacity mask :
    <Button IsEnabled="false">
    <Image Height="16">
        <Image.Source>
            <FormatConvertedBitmap DestinationFormat="Gray32Float">
                <FormatConvertedBitmap.Source>
                    <BitmapImage UriSource="Images\OutdentHS.png" />
                </FormatConvertedBitmap.Source>
            </FormatConvertedBitmap>
        </Image.Source>
        <Image.OpacityMask>
            <ImageBrush>
                <ImageBrush.ImageSource>
                    <BitmapImage UriSource="Images\OutdentHS.png" />
                </ImageBrush.ImageSource>
            </ImageBrush>
        </Image.OpacityMask>
    </Image>
    </Button>
</TextBlock>

As you can see, the original image is read into FormatConvertedBitmap, which is used to convert to Gray32Float (Gray8 is probably better), and used as the Source of an Image. The Image’s OpacityMask is set to the original image via an ImageBrush. As you test this out, don’t use Expression Blend; the designer view renders the grayscale Image elements as blank, even though in IE or a desktop XAML app it looks fine.

I also have NFI how this solution could be generalized into a style so you don’t have to repeat this for every button. Obviously the original image and grayscale version could be expressed as resources (I did it my way for clarity), but it’s a long way from there to a trigger based on IsEnabled that replaces an image with a grayscale version of itself.

That leads me to the other solution, which blows but is workable. I found that here. The idea is that you don’t grayscale the image at all; you lower its opacity so it looks more faded, and in the case of a gray control background, grayed out. This happens to be easily expressed as a style, like so:

<Window.Resources>
    <!-- El Cheapo hack to make images within disabled toolbar buttons appear 'grayed out'.  This doesn't
    gray them at all, but lowers their opacity so the (usually gray) background of the button shows through.
    If WPF had a built-in facility for grayscaling images in disabled buttons, this kind of icky kludgery wouldn't
    be necessary -->
    <Style TargetType="{x:Type Image}" x:Key="toolbarImageStyle">
        <Style.Triggers>
            <DataTrigger Binding="{Binding RelativeSource={RelativeSource AncestorType={x:Type Button}, AncestorLevel=1}, Path=IsEnabled}" Value="False">
                <Setter Property="Opacity" Value="0.50"></Setter>
            </DataTrigger>
        </Style.Triggers>
    </Style>
</Window.Resources>

Any Images contained within a button and with Style set to toolbarImageStyle will have their opacity adjusted to 50% when the button is disabled. The original post called for an opacity of 25%, but that was too faded in my opinion. Adjust the value to taste, but don’t expect to exactly duplicate the true grayscale effect; colors are still visible in the buttons, they just look a bit faded.

So, there you have it. I’m going with the latter solution since it’s easily workable and almost right, but I’m still fuming over WPF’s inability to handle a feature that MFC has had for the duration of my programming career. Pretty soon I’ll be like those old mainframe curmudgeons who bitch about C++ and insist C and 64K of RAM is all anyone ever needed. Then I’ll know it’s time to quit the software industry and go into yoga instruction full time.

26Aug/077

What the HELL is wrong with WPF?

This weekend I’ve been working on one of my many self-edification projects, and against my better judgment I was implementing it with .NET 3.0/Windows Presentation Foundation, mainly because of the markup-like UI model which fit nicely with a few of my project’s idioms.

Anyway, I wanted to get a quick hit this weekend to keep momentum going, so I thought I’d put together the rich text interface of the project first. This should be stupid-easy, since WPF contains a RichTextBox class that does everything you could ask for with rich text, and the underlying content model is itself a XAML document, a FlowDocument. Too easy.

Unfortunately, RichTextBox doesn’t ship with the standard rich text toolbar, for doing things like bold, italics, etc. No problem; WPF has a toolbar element too. I wired up a bunch of buttons, each using the standard icon for its corresponding function, and set the buttons to invoke WPF commands like EditingCommands.ToggleBold, which the rich text box knows how to handle. “Wow”, I thought to myself, “this WPF thing isn’t half bad!”.

Then I noticed something. When commands are disabled (for example, the ‘Cut’ command is disabled when there’s nothing to cut), the toolbar button doesn’t respond to mouseovers or clicks just like a disabled button wouldn’t, but the icon on the button isn’t grayed out like one would expect. Then I tried something simpler; I created a Button with an Image in it, and explicitly set the Button’s IsEnabled property to false. Sure enough, the image was still not grayed out.

Then I used Microsoft Expression Blend (a tool about which I could write a whole stream of angry rants) to show me the built-in XAML template for the Button control. Imagine my surprise when I discovered the only logic in the template that deals with rendering the disabled button is setting the foreground color to gray. That’s _it_. No wonder, then, that my images weren’t rendering differently; the foreground color has no effect on images.

You can see for yourself. Type this XAML code into XamlPad and see what happens:

<Page xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
  xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml">
    <StackPanel HorizontalAlignment="Left">
        <Button>
            <TextBlock>This button is enabled
                <Image Source="http://apocryph.org/themes/apocryph/logo.png" Width="32" />
            </TextBlock>
        </Button>
        <Button IsEnabled="false">
            <TextBlock>This button is disabled
                <Image Source="http://apocryph.org/themes/apocryph/logo.png" Width="32" />
            </TextBlock>
        </Button>
    </StackPanel>
</Page>

You’ll see two buttons, one with grayed-out text, but both with the same fucking image.

So let’s ponder this a moment. Back in 1995, I got my first copy of Visual C++ for free from a guy named Mike Strock (this was back when companies sold compilers for an outrageous sum, and thus I couldn’t afford them). One of the kickass features that blew me away was the MFC document/view model, where toolbar buttons mapped to commands, and would automatically disable themselves if the commands were disabled. Somehow, MFC automatically grayed out the toolbar button icons as well. It was incredible. “Surely”, I thought, “programming can’t get any easier than this”.

Apparently, I was right, coz here we are, 12 long years later, and we’ve taken a huge fucking step backwards. Now, don’t get me wrong. Using WPF I can easily make a button which contains a spinning 3D cube playing videos on each face that turns green when the mouse goes over it, all without any C# code. Unfortunately, no one would ever want to do that, while it’s easy to imagine a situation wherein you’d want toolbar icons to gray themselves (like, I dunno, every Windows application ever made).

This gives me a great startup idea. I’ll found a company in some tort-free jurisdiction like Costa Rica, called “FuckIt, Inc”. We’d offer an exclusive “Fuck This Shit(tm)” package, wherein programmers who can no longer bear the absurdity and kludge which is modern software engineering would be cryogenically frozen in a bunker on a secret island off the coast, to be revived 100 years hence or when the software industry gets its head out of its ass, whichever comes first. One of the test cases used to determine if the industry has its head out of its ass yet would be this one:

Using the latest development tools from the vendor of the most popular desktop operating system on the planet, is it easier to create toolbar buttons with icons that automatically disable themselves when the action they correspond to is disabled, or to create a huge button containing a spinning 3D cube with video playing on each face? If the answer is the latter, as it is today, then the programmers stay frozen.

UPDATE: I’ve got a couple of solutions, here. Neither of them is optimal, but then, neither is life itself.

19Aug/070

Project Idea: metacortex, cognitive prosthetic for information storage/retrieval

Right on schedule, about once every quarter I feel the need to build an ubertool to capture, organize, present, and share all the information I keep around me, structured and unstructured, textual, audio, visual, etc. Each time it takes a different form, and each time I end up going nowhere with it, but I write it down nonetheless.

Now I’m thinking something (very) vaguely like the metacortex in Charlie Stross’ book Accelerando. In the book lots of SF liberties are taken, of course, and the metacortex includes all sorts of semi-intelligent autonomous agents tightly integrated with the wetware to expand its consciousness and abilities.

I’m thinking something a bit less dramatic. The idea came to me when I looked up the same Ruby syntax for the hundredth time, and wondered why I don’t keep notes on things like this that I keep needing. A cache, I suppose you could say. A tool I could keep running in the background, hit a key sequence to activate, and use a fast keyboard-based interface to query the information I want, then dismiss again. If the info isn’t found, I could go to Google or the Pickaxe Book or whatever, and upon finding it, easily in a few seconds add the information to my metacortex.
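The cache part of the idea is easy to sketch. The Python below is a deliberately naive in-memory version with made-up function names: add a note under a topic, then look it up by typing a few words. It says nothing about the hotkey activation or the aggregation problem, which are the hard parts.

```python
def add_note(store, topic, text):
    """File a note under a topic; `store` is a plain dict of topic -> notes."""
    store.setdefault(topic.lower(), []).append(text)


def lookup(store, query):
    """Return (topic, note) pairs where every query word appears somewhere."""
    words = query.lower().split()
    hits = []
    for topic, notes in store.items():
        for note in notes:
            haystack = (topic + " " + note).lower()
            if all(word in haystack for word in words):
                hits.append((topic, note))
    return hits
```

For example, after filing a note under the topic "Ruby" that mentions each_with_index, looking up "ruby index" would return that note.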

I know what you’re thinking. It sounds like a text file and grep. But it’s more than that. Yes, I could keep a ruby_cheats.txt file around and grep it when I forget how to write a migration or whatever, but that’s just one more silo of information I care about, on top of gmail, my bug list, my library of ebooks, my blog posts, my furl bookmarks, and my research notes. What I keep craving is a system that can aggregate my information, and allow me to capture, store, organize, and retrieve it in whatever form and format makes sense. A system that makes my information portable, easy to access and share whenever I want, and yet without giving up the specialized tools I already use for special-case types of information (eg Word, Excel, Google Docs & Spreadsheets, Bugzilla, gmail, and on and on).

I suspect the reason I keep feeling the compulsion to build this tool is that I keep needing it. I can feel that I’m doing things The Hard Way, and intuitively I realize there must be a Better Way, if only it can be developed.

Some of the things I like:

  • The way Quicksilver gives Mac users easy, keyboard-based access to bits of information and actions on that information
  • The way Bugzilla organizes my and my team’s bugs by milestone
  • The way I can tag my blog posts by arbitrary named tags
  • The way Google Desktop Search gives me one place to search my email and local files

I have literally gigabytes of information I care about, taking the form of emails, chat transcripts, blog posts, bookmarks, feeds, bugs, account information, contacts, PDFs, images, MP3s, videos, etc. I don’t think the CS community yet has the sophistication to represent all this disparate information in such a way that it can on the one hand be readily accessible as a unified set of data, and on the other hand capture enough of the format-specific metadata as to make full use of each type of information individually.

For example, how can I build a system that lets me keep my emails, PDFs, ebooks, and photos in one logical collection of information, but still makes it easy to manage my email by itself, or look up something in an ebook, or browse through my vacation photos? Obviously it’s not realistic to build a monolithic app that incorporates a feed reader, PDF viewer, image editor, and email client. Even if it were, there will always be superior one-function tools (like gmail or Picasa) that it would be preferable to use.

Of course, technically, the system I just described already exists, and is running right now on my machine. It’s called a filesystem, and it was invented decades ago. It’s a very simple interface to information, using a hierarchical organization structure mapping names to blobs of data (files) or other collections of names (directories). Its simplicity is probably why we still use it more or less unchanged from its original invention.

WinFS was supposedly going to turn this idea on its head. Files wouldn’t be located at a specific path, but would be accessible through a number of paths, like the collection of all MP3 files, or files tagged with ‘my music’, or whatever. This always seems to get pushed back, and I think this is in large part due to the fact that the hierarchical filesystem model of information organization works fine for most uses, and consumers barely grok it as it is, with no hope of grokking a multidimensional attribute-based filesystem.

I’ve had all sorts of ideas to solve this problem. I had envisioned a tool that would operate as the hub for all my information, with adapters into and out of info silos like email and feeds and bug lists. This way, there’d be one tool to use for all your information needs. Of course, this breaks down pretty quickly, since I still want to use Outlook to read and write email, Bugzilla to manage my bugs, and Google Reader to read my feeds. It’s not that I dislike those tools. It’s that I wish they had the ability to integrate more tightly, and that I could insert my own systems in the pipes between tools to munge things however I wanted.

And that’s the problem. There’s not a doubt in my mind that the answer to this problem is not a monolithic tool. It will inevitably be a decentralized, loosely coupled system with autonomous tools interacting with each other. I just don’t know how.


I just read Dreaming in Code, which chronicles (in the incredibly annoying, breathless newspaper style) the first few years of the development of Chandler, aka Kapor’s Folly. By the sound of it, Mitch Kapor has had the same recurring need to build a metacortex tool as I have, only given his $100 million fortune, he founded a non-profit and launched a quixotic quest to build the software. Amazingly, it seems he and the crack former Apple, Microsoft, Netscape, and AOL developers he hired hadn’t learned any of the painful software engineering lessons of the preceding forty years. This, combined with the absence of corporate dysfunction and PHBs as a helpful constraint, seems to have bred disaster. Now, they’re excited to release a sort-of-working calendar app.

Along the way, the team had the same epiphanies I had. The fixation with object persistence, brief experimentation with RDF followed by horror and revulsion, fascination with peer-to-peer topologies, grappling with the Web vs. desktop app conundrum, and on and on. At no point did they (or, as far as I can tell, have they yet) make the leap from abstract things that should be easier to a list of concrete functionality. I have little hope that they’ll ever get there from what I’ve read.

By way of backstory, the book looks at previous attempts to construct what I call cognitive prosthetics, which apparently date back to the 60’s! In spite of all this, it seems no one has come up with a solution. It’s hard to believe the world’s best computer scientists have struggled with this problem for decades, that I discovered and grappled with it independently, and that we’re no closer to solving it now.

It seems none of the approaches that have been tried fit the requirements quite right. The Berners-Lee Semantic Web crowd are probably the furthest away from the right answer in my opinion. How you try to solve this problem with top-down elaborate ontologies and RDF is beyond me. I expected better from Sir Tim.

The rigid, statically typed approach currently taken by information management tools is also wrong, of course, but has the benefit of providing tools that work today, albeit imperfectly.

I suspect the right answer is to be found in even less order, tools that start with no structure at all, and add in only enough to power some basic computational primitives. Something closer to a spiral notebook than a spreadsheet or database, sort of an infinite scroll of digital paper with information atoms strewn about.

Another challenge to my mind is the legacy problem. Designing a new system of computing from scratch, a grand unified architecture of information-oriented computing primitives forming a cognitive prosthetic kernel upon which information stores can be built would be one thing. Sadly, untold petabytes of information are already stored in legacy silos, requiring the construction of a system that offers some migration path of sorts.


I’ve re-read my old OneNote notes on this subject. I think I’ve explored this problem from every imaginable angle, and inevitably I end up with something explosively complex and unworkable. I know I’m missing some key atomic concepts which will make this solution possible, but try as I might they continue to elude me. More basically, I’m missing some simple, easy to state, straightforward problem that I personally have that I want to solve, which can catalyze this work. I’m like Mitch Kapor trying to build Chandler based on little more than a sense that things are too hard and information should be integrated regardless of type.

I’ve tried to design an information tool that used OO modeling to capture information. I’ve tried using text with tags that fed into a processing pipeline to fire off actions in response to markup. I’ve tried building a content database from the filesystem. It never works. I’m trying to solve the wrong problem. I can make a list a mile long of unrelated things that the system should do, unified only by the fact that each thing involves the representation and manipulation of information.

Some of the recurring themes:

  • Capture information with only enough structure to enable computer processing. As much structure as possible should be inferred by intelligent software rather than stated by humans
  • The spiral notebook is a useful metaphor for the functionality I’m trying to find, as it’s infinitely flexible, offers random access, high usability.
  • Information modeling used to create simple information systems like requirements lists, bug lists, contact databases, etc
  • Cognitive tools like set theory, concept mapping, mind mapping, math, etc built into the system for easy, effortless cognition
  • Activities as varied as scheduling, email, blogging, feed reading, free thinking, software modeling, code reading, all supported

7Apr/070

Project Idea: Dynamic External Storage

Now that I’m getting a laptop again, I’ll face a situation I dealt with a few years ago: one or more external hard drives at fixed locations, with a laptop on the move. I want to take advantage of this extra storage when I’m at home or at work, but on the road when it’s not there I still want access to my important files.

Three years ago I had a firewire hard drive that I just carried around, and manually copied files onto my laptop hard drive if I needed them. Now, time has marched on and there’s a better option: high-end eSATA drives running off a fast ExpressCard34 bus.

What this means, essentially, is that I can position high-performance logical storage at my home and work desks, which will provide superior I/O performance to the built-in notebook HDDs, potentially turning an hour-long build into a 20 minute build. The problem then is that on those rare occasions when I’m not at either of these places, I still want all the files I put on those fast disks, and would rather have them on my slow notebook drives than not at all.

I think there’s an easy solution here, at least on Windows systems. Using NTFS reparse points, one can redirect a folder (say c:\work) to another disk. When the laptop is mobile, c:\work is mapped to d:\, another partition on the notebook’s internal drive. When the laptop is at home or work, c:\work is mapped to the drive letter associated with the high-performance external storage, say e:\. That much is easily automated with the Windows APIs in any scripting environment.
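A minimal sketch of that switching logic in Python, under a few assumptions: the paths are the examples used above, the function names are made up for illustration, and the junction is created by shelling out to cmd’s built-in mklink /J instead of calling the reparse point APIs directly. The code only decides on a target and builds the command; actually running it (and removing the old junction first) is left out.

```python
import os
from typing import Callable, List


def pick_target(external: str, internal: str,
                exists: Callable[[str], bool] = os.path.isdir) -> str:
    """Prefer the external drive when it is attached, else the internal copy."""
    return external if exists(external) else internal


def junction_command(link: str, target: str) -> List[str]:
    """Build the cmd.exe invocation that creates a directory junction.

    mklink is a cmd built-in, so it has to be run through "cmd /c".
    """
    return ["cmd", "/c", "mklink", "/J", link, target]


# Hypothetical usage on Windows, once any old junction has been removed:
#   import subprocess
#   target = pick_target("e:\\", "d:\\")
#   subprocess.run(junction_command("c:\\work", target), check=True)
```

Passing the existence check in as a parameter keeps the decision testable on a machine that has neither drive.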

Of course, the complication is keeping the external and internal versions in sync. The internal drive would be the authoritative copy, while the external drives would be fast secondary copies. rsync could push changes on the internal drive to the external one when it’s connected, but maintaining the internal drive as the external drive was changing would be a bit trickier.

One could run a constant rsync job in the background, but that would hit both disks hard and negate any performance benefit. Alternatively, one could make the sync process totally manual, and require an rsync run to bring the internal disk back up to date before removing the external disk. That has the unfortunate effect of risking a loss of sync, and could also lead to Office Space-esque situations in which one is desperately waiting for a disk to sync before bailing early on Friday.

I recall from a Channel 9 video that Windows Vista has I/O prioritization, which could make the constant rsync solution viable if the rsync I/O priority was set to the minimum, keeping it from degrading other I/O performance on the drive. Alternatively, maybe an intelligent monitor could detect lulls in disk activity and trigger incremental syncing.
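The lull-detection idea could look something like the following Python sketch. Everything here is an assumption for illustration: the class name, the threshold, the window size, and the notion that something else samples the disk’s throughput at a regular interval and feeds it in. It only answers the question "has the disk been quiet for long enough to start an incremental sync?".

```python
from collections import deque


class LullDetector:
    """Report a lull once the last `window` samples are all below `threshold`."""

    def __init__(self, threshold_bytes_per_sec: float, window: int) -> None:
        self.threshold = threshold_bytes_per_sec
        # A bounded deque drops the oldest sample automatically.
        self.samples = deque(maxlen=window)

    def observe(self, bytes_per_sec: float) -> bool:
        """Record one throughput sample; return True if the disk is in a lull."""
        self.samples.append(bytes_per_sec)
        window_full = len(self.samples) == self.samples.maxlen
        return window_full and all(s < self.threshold for s in self.samples)
```

A single busy sample resets the lull, since it stays in the window until enough quiet samples have pushed it out.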

28Jan/070

MungeCap, a quick and dirty capture file merging/filtering tool

Lately I’ve been playing with wireless network monitoring, using kismet. Kismet produces dumps of all wireless traffic in libpcap-compatible packet captures, which is the same format used by Wireshark, tcpdump, and any other packet capture tool worth its salt.

The problem is that after a week of capturing, I have several gigabytes of capture files, though most of the captured packets are 802.11 beacons that have no information in them. Wireshark eats shit and dies on a 500MB capture file, so a 2GB one is out of the question. What to do?

Wireshark ships with some capture file munging tools, editcap and mergecap, but neither of them does exactly what I want, which is to take as input one or more capture files, apply an optional filter, and write the results to an output capture file, preserving the chronological order of the input packets.

mergecap merges files, but doesn’t let you apply a filter.

Thus, I give you mungecap, my lovingly hand-crafted solution to the problem. It uses libpcap (which is to say, winpcap if you’re on Windows), and is written in ANSI C++, so it should work fine on any platform for which there’s a sane C++ compiler and a libpcap port available.
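The merge-then-filter idea itself is small. Here is a sketch of it in Python, which is not the mungecap source and does not touch libpcap at all: packets are modeled as hypothetical (timestamp, bytes) tuples, each input capture is assumed to already be in chronological order, and the filter is an ordinary predicate function.

```python
import heapq
from typing import Callable, Iterable, Iterator, Tuple

# Hypothetical stand-in for a captured packet: (timestamp, raw bytes).
Packet = Tuple[float, bytes]


def munge(captures: Iterable[Iterable[Packet]],
          keep: Callable[[Packet], bool] = lambda p: True) -> Iterator[Packet]:
    """Merge time-sorted captures into one time-sorted stream, then filter it."""
    # heapq.merge is lazy: it holds one pending packet per input,
    # so multi-gigabyte captures never have to fit in memory at once.
    merged = heapq.merge(*captures, key=lambda p: p[0])
    return (p for p in merged if keep(p))
```

For example, dropping beacons from two interleaved captures:

```python
a = [(1.0, b"beacon"), (4.0, b"data-a")]
b = [(2.0, b"data-b"), (3.0, b"beacon")]
wanted = list(munge([a, b], keep=lambda p: p[1] != b"beacon"))
```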

Pull the latest from my SVN repository. There’s a Visual C++ 2005 project there, but you’ll have to tweak the include and lib paths to reflect the location of the WinPcap dev pack on your machine. Obviously UNIX users are on their own; have fun fucking with a makefile, trying to remember if you have to use spaces or tabs.

Remember, kids, intercepting the wireless traffic of others, though entertaining in a dysfunctional power-trip sort of way, is probably a violation of federal wiretap law, and thus you really shouldn’t be caught doing it.

UPDATE: I forgot to mention, due to the viral nature of the GPL, this software is released subject to the terms of the GPL, blah blah blah, no warranty including merchantability or suitability to a particular purpose, blah blah blah, will fuck up your computer and it’s not my fault, blah blah.

13Jul/060

Test harness for Win32 network and disk performance tests

My recent investigations into Win32 socket performance led me to a few performance measuring tools, like iperf and netperf. However, in my case I wanted some extra features:

  • Use of Win32 IO completion ports for disk and network IO
  • Use of Win32 TransmitFile/TransmitPackets high-performance socket routines
  • Benchmarking of disk read/write performance as a part of overall throughput

So, yesterday I threw together a quick-and-dirty test harness to exercise these features. The code isn’t written for maintainability or readability; the point was to get something out quick which I could use to explore the performance landscape.

The sources are in my svn repository, and I’ve attached a source and Win32 binary tarball based on a snapshot of the code today.

The code requires a client and a server at each end. It doesn’t use any particular wire protocol; just a stream of bytes followed by a connection close. In fact, you can reproduce its client functionality with a nc whateverhost 12345 < srcfile, and its server functionality with nc -l 12345 > destfile. This comes in handy if you want to test against a UNIX host on which AsyncIoTest won’t run.
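To make the "stream of bytes followed by a connection close" protocol concrete, here is a rough Python equivalent of the two nc commands. It is a sketch for illustration, not a port of AsyncIoTest: it uses plain blocking sockets rather than completion ports, the function names are made up, and the demo at the end runs both ends over loopback on an OS-assigned port.

```python
import socket
import threading


def receive_all(listener: socket.socket, chunk_size: int = 65536) -> int:
    """Server side: accept one connection and count bytes until the peer closes."""
    conn, _ = listener.accept()
    total = 0
    with conn:
        while True:
            data = conn.recv(chunk_size)
            if not data:  # empty read means the client closed the connection
                break
            total += len(data)
    return total


def send_all(host: str, port: int, payload: bytes) -> None:
    """Client side: connect, send everything, then close to signal the end."""
    with socket.create_connection((host, port)) as s:
        s.sendall(payload)


def loopback_demo(n_bytes: int) -> int:
    """Run both ends in one process and return the byte count the server saw."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    received = []
    worker = threading.Thread(target=lambda: received.append(receive_all(listener)))
    worker.start()
    send_all("127.0.0.1", port, b"\0" * n_bytes)
    worker.join()
    listener.close()
    return received[0]
```

There is no length header and no acknowledgement; the close is the only end-of-data marker, which is why nc can stand in for either side.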

The same binary, AsyncIoTest.exe, can run as a server or a client. In both server and client mode, you can opt to run only network, only disk, or network and disk (default) tests. In network mode, the client sends random bytes to the server, while the server reads bytes and ignores them. In disk mode, the client reads from a source file as fast as it can, then drops the resulting data, while the server writes to a target file as fast as it can. In combined mode, which is the default, the client reads from a source file and sends it over the network, while the server reads data from the client and writes it to a target file.

So, the basic commands are:

To run a server:

asynciotest -s

To run a client:

asynciotest -c serverhost -f sourcefile

Where serverhost is the hostname or IP address of the box running asynciotest -s, and sourcefile is the path and file name of the file you want to read from and send. The server writes to a hard-coded file name, which it creates in the working directory and truncates with each connection.

To put either a client or server in disk-only mode, add the -d switch. Note that, in -d mode, the client and server can be run independent of one another, with the client reading data and dropping it, and the server making up random data and writing it.

To use network-only mode, use the -n switch. Unlike -d, in -n mode the client and the server remain dependent upon one another; they just don’t do anything with files.

When the client is in -n mode, or the server in -d mode, you must also provide the -l length parameter, where length is the amount of data you want to send over the network (client) or write to the file (server). length can be followed by scale values k, m, g, K, M, G. k denotes a scale of 10^3, m 10^6, and g 10^9, while K, M, and G denote 2^10, 2^20, and 2^30, respectively.
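The scale rules are easy to get backwards, so here they are as a small Python sketch. This is not the tool’s actual parser; the function name is hypothetical, and it assumes a plain integer optionally followed by exactly one scale letter.

```python
# Lowercase letters are decimal scales, uppercase letters are binary scales.
SCALES = {
    "k": 10**3, "m": 10**6, "g": 10**9,
    "K": 2**10, "M": 2**20, "G": 2**30,
}


def parse_length(text: str) -> int:
    """Turn a value such as "500M" or "256k" into a byte count."""
    if text and text[-1] in SCALES:
        return int(text[:-1]) * SCALES[text[-1]]
    return int(text)
```

Under these rules 500M is 524,288,000 bytes, while 256k is 256,000 bytes rather than 262,144.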

You can force the client to use TransmitFile by passing it the -t switch when it’s in network-and-disk mode. If you pass -t when the client is in -n mode, TransmitPackets will be used instead. Passing -t in server mode or with the -d switch is an error.
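The -t rules boil down to a small decision table, sketched here in Python. The function and parameter names are made up for illustration and this is not how the tool itself is structured; it just restates which API -t selects for each combination of switches.

```python
def transmit_api(server: bool, disk_only: bool, network_only: bool) -> str:
    """Return the Win32 API that -t selects, or raise if -t is not allowed."""
    if server:
        raise ValueError("-t is an error in server mode")
    if disk_only:
        raise ValueError("-t is an error together with -d")
    if network_only:
        # No source file to hand to TransmitFile, so packets are sent instead.
        return "TransmitPackets"
    # Default network-and-disk mode: the file is sent straight from disk.
    return "TransmitFile"
```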

You can set the size of the chunks used for reading and writing with the -k chunksize parameter; like -l you can use scale values for readability.

The TCP send and receive buffers can be set with -b bufsize; again, scale values are recognized.

The number of outstanding async ops to maintain in the queue is set with -o opcount. The default value is 2. Depending on the speed of your I/O you may want to go higher, but keep it reasonable; 8 or 10 is probably the upper boundary.

If you must, you can override the port used with -p port.

The raison d’etre of this tool is to measure performance implications of various I/O subsystems with a few possible Win32 I/O API calls. Some good tests would be:

asynciotest -s -n and asynciotest -c whatever -n -l 500M

To measure raw network throughput. Try adding -k 256k and -b 64k to see if larger TCP buffers and chunk sizes impact performance. Then try adding -t to the client to compare the performance of WSASend with TransmitPackets.

Once you have a sense of the raw network throughput, add the filesystem into the equation. Start with the client only, removing -n -l 500M and replacing it with -f file where file is the path of a large, relatively unfragmented file. If you’re transferring over a LAN, expect the file reads to significantly reduce throughput compared to network-only mode. Also experiment with -t on the client side, which will use the high-performance TransmitFile API instead of repeated WSASend calls.

Also beware of multiple runs of the client test; the source file will be cached by the Win32 cache manager, so you can expect subsequent runs with the same file to perform a bit better as a result. To account for this disparity, reboot the client between runs (yes, I know, that sucks).

Next, remove the -f file and replace with -n -l 500M on the client, and remove -n on the server. This removes the client disk from the I/O equation, but adds the server disk corresponding to the working directory of the AsyncIoTest server process. Compare this with the results from the client I/O test, then add back in the client I/O as well and see what happens.

There are a number of permutations you might try, but these should expose the key corners of the performance space.
