Also Looking for a Mid-Level Web Developer
In addition to the senior developer position, my company is trying to fill an opening for a mid-level ASP.NET developer to take over and extend our public-facing web site. We’re looking for someone with two or more years of solid C#/ASP.NET experience developing standard web application components like content management, shopping cart, user management, etc. Certs like MCSD.NET or MCAD are a plus. Experience integrating ASP.NET sites with back-ends like SalesForce.com is strongly preferred. Page layout and graphic design skills are not required but will be utilized if present.
If this sounds like you, then we’d love for you to get in touch with us. We offer very competitive compensation; a casual work environment; the freedom and agility of a fast-growing young company; and the chance to work on a number of challenging tasks. To apply, email your resume, cover letter, and salary requirements to wdevjob (at) appassure (dot) com.
(Still) Looking for a Senior Developer
A while back I was looking for a senior-level developer to join our team at AppAssure Software. I’m still looking. I’ve rewritten the job description a bit in the hopes that will help me find the right candidates.
My company, AppAssure Software, is looking for another senior level developer to join our development team. AppAssure is a rapidly growing software company with consistent revenue growth building cutting-edge Windows software products used by organizations of all sizes to protect their most mission-critical IT assets. The Reston-based development team is lean and results-oriented. We’re looking to add another senior level, well-rounded developer with impeccable software engineering skills.
We’re trying to find an experienced software engineer with soup-to-nuts product lifecycle experience and strong C# and C++ skills. C++/CLI is a big plus, and the ability to assimilate new information and learn about new technologies quickly is most important of all.
So, are you:
- Hard-working?
- Smart?
- Able to work well in a team?
- Able to work well individually?
- Acutely attentive to detail?
- Interested in a variety of programming challenges?
- Authorized to work in the United States?
If so, then we want to hear from you. We offer very competitive compensation; a casual work environment; the freedom and agility of a fast-growing young company; and the chance to work on a number of challenging tasks individually and in teams.
To apply, email our development team lead at sdevjob (at) appassure (dot) com. We need to see your resume, but we’d also like to see that you’ve spent some time on our website, have a sense of what we do, and are interested in this job, not a job.
Please, no contractors, offshore firms, recruiters, visa sponsorship requests, or internship inquiries. We’re looking for a full-time developer able to commute to Reston, Virginia and authorized to work in the United States.
ATF Chicanery Brings Right and Left Together At Last
There’s a procurement order from the Bureau of Alcohol, Tobacco, Firearms, and Explosives (ATF) making the rounds on the gun boards which is sufficiently egregious as to piss off citizens across the political spectrum. I first learned of it from this arfcom post dated 19-March, but there are now angry threads on other Internet gun boards and at least a few left-liberal forums as well.
Apparently, the ATF have a unit dedicated to asset forfeiture, which is the tough-on-crime practice of seizing the assets of citizens if government agents suspect said assets were acquired with the proceeds of illegal activities. Some asset forfeiture seems reasonable enough (seizure of assets pending/after a felony conviction), but most asset forfeiture is so-called civil forfeiture, in which the owner of the asset must prove by the preponderance of evidence that the seized assets were NOT obtained with the proceeds of criminal activities. Google “civil asset forfeiture” if you still believe the War on Drugs has not brought with it dire civil liberties consequences.
Anyway, the ATF Asset Forfeiture unit apparently needed some Leatherman Micra multi-tools, which seems perfectly reasonable. I don’t seize property for a living, but I have found a multitude of uses for my Leatherman tool. But a stock Leatherman wasn’t good enough for the dedicated men and women of ATF Asset Forfeiture. After all, they put their lives on the line day in and day out to confiscate the property of evildoers; don’t they deserve a little extra? With that spirit in mind, DOJ (under which ATF has operated since 2003; before that they operated under Treasury as revenue agents, if you can believe the irony) issued a procurement order for 2000 very special Leatherman Micras.
It’s pretty dry procurement language; here’s the good bit:
Leatherman Micra Color: Blue – Part number 64340101K Engraved with: ATF-Asset Forfeiture AND “always think forfeiture” PLEASE REFER TO THE ATTACHMENT. NOTE: ATF MAY REQUEST A SAMPLE TO DETERMINE IF IT MEETS OUR REQUIREMENT. A picture of the item may be substituted in place of the actual sample.
That’s right. ATF needs special blue Leathermans engraved with “ATF-Asset Forfeiture” and “Always Think Forfeiture”. No, this isn’t a prank by some right-winger; it’s right there on the FBO site. The procurement was awarded to FREEDOM ENTERPRISES (oh, sweet irony!) of Spokane, Washington for the amount of $37,460.
So, here’s one of the multitude of federal, state, and local agencies involved in asset seizures, some of which are obviously legitimate and some of which are patently unjust, spending taxpayer dollars (and possibly seized assets; technically seizures are supposed to fund law enforcement operations) to customize a simple hand tool with language that on its face presents an alternative meaning of “ATF”, one which requires asset forfeiture agents to “always think forfeiture”.
Now I’m sure if asked ATF will insist this is harmless and all forfeiture agents are in full compliance with all laws and regulations when seizing the property of American citizens. That may even be true. It is definitely irrelevant. Using the threat of government-sanctioned violence to forcibly seize the assets of Americans, criminal defendants or not, is a very serious activity which carries with it the potential for grave abuses. At the very least, those responsible for executing these forfeitures on behalf of one of the most abusive and irresponsible Federal law enforcement agencies must conduct themselves with professionalism and sobriety consistent with their awesome power. Instead, they use taxpayer money to buy trinkets to remind one another to remain ever vigilant for forfeiture opportunities with a clever play on their own organization’s name.
Maybe you don’t see the problem. Maybe you really believe ATF goes around taking machine guns away from gang bangers and white supremacists. Maybe you don’t believe ATF agents really shoot pet dogs and stomp pet kittens to death. Maybe you don’t see a problem with federal agents confiscating guns from their civilian owners. Then check out this IG audit report from September 2006. In particular, note this table of seized assets by type. Out of $24 million of assets seized in 2005, only $9 million were seized firearms (which would include seizures like the recent raid on Cavalry Arms), while another $9 million was cash and real property, $3.5 million was tobacco, $1.2 million was explosives, and $487,000 was ammunition.
You should also ask yourself how you’d feel if this were DEA’s asset forfeiture unit, or your local police department. Do you really want the people empowered to seize your assets making clever jokes about their job?
Apparently, plenty of mostly-right gun owners and mostly-left activists don’t. In addition to the ARFCOM thread, there are active threads on GunBroker, FreeRepublic, NorthEastShooters, and TheHighRoad. On the left, DemocraticUnderground seems none too amused, and TalkLeft has a post on a related bit of ATF asset forfeiture asshattery.
As this is coming out, foot-tappin’ Senator Larry Craig (R-ID) is taking the lead holding up confirmation of Michael Sullivan, Acting Director of the ATF, to the permanent director position. Sullivan has presided over ongoing ATF abuses like the persecution of Red’s Trading Post, and will clearly do nothing to reform the ATF if confirmed. Hopefully the Senate denies Sullivan confirmation, if for no other reason than ATF needs to get its comeuppance.
Requirements for Drupal migration tool
After my flash of insight I’ve decided to build a tool to help me migrate apocryph.org away from Drupal.
Requirements are:
- Work with my Drupal configuration
- Not so tightly coupled to my Drupal configuration that no one else can use it
- Output posts in a neutral format that users can post-process and import into other tools
- Preserve all the important elements of each post, including:
  - Formatting. Most posts are in Markdown, and a few are in SmartyPants. Those must be converted to XHTML using the same rules which generate the markup in Drupal
  - Files. A few of my posts have files attached, usually images but sometimes other stuff. Those files must be preserved themselves, and any references to the files from within a post (like IMG or A elements) must be preserved as well
  - Metadata. The tags, author, timestamp, published/unpublished flags must be preserved
  - Links. I often link between posts, and the migrated output must preserve those links. They can’t go away or be 404
  - URLs (ideal but not required). Notwithstanding the above requirement to preserve intrasite links, it would be vastly preferable if the actual URL of each migrated posting could be preserved so any existing links to pages on Apocryph are preserved.
  - Comments. Existing comments attached to posts must be preserved in their entirety.
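To make the requirements above concrete, here is a minimal sketch of what a neutral intermediate format could look like. Everything in it is an assumption for illustration: the `Post`, `Comment`, and `Attachment` names, their fields, and the choice of JSON are hypothetical and are not taken from Drupal, WordPress, or any existing tool.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class Comment:
    author: str
    timestamp: str       # ISO 8601 string
    body_html: str


@dataclass
class Attachment:
    filename: str        # file name as stored alongside the export
    original_url: str    # URL the post body used to reference the file


@dataclass
class Post:
    title: str
    url_path: str        # original path on the site, kept so old links survive
    author: str
    timestamp: str       # ISO 8601 string
    published: bool
    body_xhtml: str      # already rendered from Markdown or SmartyPants
    tags: List[str] = field(default_factory=list)
    attachments: List[Attachment] = field(default_factory=list)
    comments: List[Comment] = field(default_factory=list)


def export_posts(posts: List[Post]) -> str:
    """Serialize posts to a JSON document that other tools can post-process."""
    return json.dumps([asdict(p) for p in posts], indent=2, sort_keys=True)
```

Keeping the rendered XHTML, the original URL path, and the comments together in one record means a later import step never has to go back to the Drupal database to satisfy any single requirement.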
Right now I’m looking at the WordPress eXtended RSS (WXR) format. It builds on the standard RSS format with additional WP-specific tags, which I think will accommodate my requirements. As an added bonus, WordPress has built-in import support for WXR files, so I can easily suck the resulting file into WordPress, which I’ve selected as the successor to Drupal.
Flash of insight: Why am I still using Drupal?
After outsourcing comment processing to Disqus due to the lack of the Akismet module, fighting a buggy alpha version of the Views module for Drupal 6.0 to implement a front page that doesn’t use those damn teasers, and installing yet another security update using an update process that can best be described as a pain in the ass, I’ve finally asked myself why I use Drupal for what is basically textbook blogging.
The answer? Well, once upon a time, back in 2005, I had an HTML site that I’d painstakingly built, and I shopped around content management systems trying to find one flexible enough to let me reconstruct that look and feel in a theme. I looked far and wide, at WordPress, DotNetNuke, PHPNuke, MovableType, and just about every other CMS and blogging platform one can imagine. The only one with the flexibility I thought I needed was Drupal. I spent days getting my head around its ridiculously expansive surface area, cooked up a theme, and that was that.
Problem is, Drupal 6.0 is out and it feels like a step back. None of the modules I need work, and the system is unimaginably complicated, which is probably why lots of high-profile sites with unique requirements use it. However, apocryph.org is neither high profile nor possessing of unique requirements, and I’m now willing to accept a site redesign if it means I can use a blogging platform that Just Works.
Which brings me to the next problem: how in the hell am I going to get years of posts out of Drupal and into something else? Obviously the answer will involve me writing data migration code, which sucks and makes it less likely I’ll get around to it.
Still, I’m getting really tired of slogging through the Drupal mud.
How much strength does PKCS #5 password-based key derivation (PBKDF2) add to a key?
Last week I posted my tests with Markov chains as memorable passphrase generators. This weekend I’m exploring some ideas for introducing more entropy into the resultant chains so they can be shorter and hopefully more memorable. In the midst of that I remembered something from my previous life doing crypto stuff: brute-force attacks on passwords can be made more difficult by using a key derivation function that performs some computationally expensive operation on the password to produce a cryptographic key. If this is done right, it means the attacker must perform this expensive function for each password he wants to try, which can make an otherwise easily cracked password better and an already strong password stronger still.
The standard algorithm for this is specified in PKCS #5 v2, and is called PBKDF2 (Password-Based Key Derivation Function, version 2). You can see the spec for details, but it basically consists of a number of hash iterations which transform a password and salt value into binary key material. Depending on how many iterations you specify, PBKDF2 can be as fast or as slow as you need.
The beauty of this solution is that it makes brute-forcing keys orders of magnitude harder, without imposing a significant burden on legitimate users. For example, if the iteration count is so high that the PBKDF2 function takes one second to run (which is a ridiculously large number of iterations), it takes your computer an additional one second of computation to authenticate you based on your password. But an attacker trying to guess your password from a dictionary has to spend that additional second of computation once for each password in his list, which could easily be tens of millions of words long.
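As a rough illustration of that asymmetry, here is a minimal sketch using `hashlib.pbkdf2_hmac` from the Python standard library. The helper names, the 16-byte key length, and the choice of HMAC-SHA256 are assumptions made for the example, not a recommendation of specific parameters.

```python
import hashlib
import os
import time


def derive_key(password: str, salt: bytes, iterations: int, key_len: int = 16) -> bytes:
    """Derive key_len bytes of key material with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations, dklen=key_len
    )


def seconds_per_derivation(iterations: int, trials: int = 3) -> float:
    """Average wall-clock cost of one derivation on this machine."""
    salt = os.urandom(16)
    start = time.perf_counter()
    for i in range(trials):
        derive_key(f"guess-{i}", salt, iterations)
    return (time.perf_counter() - start) / trials


def dictionary_attack_seconds(iterations: int, wordlist_size: int) -> float:
    """The attacker pays the per-derivation cost once per candidate password."""
    return seconds_per_derivation(iterations) * wordlist_size
```

The legitimate user calls `derive_key` once per login, while `dictionary_attack_seconds` multiplies the same cost by the size of the attacker's wordlist. The timing is only meaningful for the machine it runs on.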
PBKDF2 is used in a number of real-world systems, most notably the WPA/PSK security specification from the 802.11i amendment to the wireless networking (WiFi) standard. As a result, the best known attack on WPA/PSK-secured networks is to brute-force the key, which is a helluva lot stronger than WEP, which can be cracked in a few seconds.
That’s all fine, but when I generate my Markov text I know how much entropy is represented in the string, and in my experience it’s not much (16 to 80 bits is typical for my test data). If I’m using this text to derive a 128-bit AES key, that weakens the resulting key considerably. But what if I could use a PBKDF to make it harder for attackers to brute-force the Markov text, making the cracking effort equivalent to brute-forcing a 128-bit AES key? That would be helpful as it would allow users to use shorter and less sophisticated passphrases without losing security. However, how many bits of security does the PBKDF add?
That was the question I sought to answer. The only real guidance I can find is in the RFC for AES in Kerberos, in which the author benchmarks PBKDF2 iterations on his machine to determine how many can be done per second, then estimates how long it would take an attacker with the same machine to brute-force a list of 2^32 passwords (that’s over 4 billion for those of you on the decimal system). The number he came up with isn’t important, because it wasn’t particularly helpful. An attacker will have more than a single Pentium 4, and I don’t need an estimated time to brute-force the derived key, I need a number relative to the time to brute-force without the derived key so I can determine how strong the input key needs to be to satisfy my security goals.
Well, I couldn’t find anyone else asking that question, so I wrote a little benchmark tool to help me. It uses the superb LibTomCrypt cryptography library, and is in my SVN repository here. It figures out roughly how many AES ECB decrypt operations can be performed in one second, then how many PBKDF2 key derivation/AES ECB decrypt operations in one second as well. It compares these two numbers and translates the difference into the number of bits of additional AES key length the time difference is equivalent to.
If that all sounds a little speculative, that’s because it is. It assumes the ratio between an AES decrypt operation and the PBKDF2 function is the same on my machine as it would be for an attacker, but in reality a determined attacker would probably use an array of dedicated FPGA hardware crackers or some other implementation specifically tuned for brute-forcing which might have different AES/PBKDF2 performance characteristics. However, I think it provides a back-of-the-envelope estimate of the additional effort PBKDF2 requires of an attacker, and roughly how much more entropy in the input key would require the same amount of effort to crack.
Enough of the theory (such as it is); what were the results? Well:
| AES Key Size | PBKDF2 iteration count | PBKDF2 hash function | AES ECB decrypts/second | PBKDF2/AES ECB decrypts/second | Equivalent additional key bits |
|---|---|---|---|---|---|
| 128 bits | 2000 iterations | SHA256 | 1,139,930.82 | 72.62 | 13.94 bits |
| 128 bits | 4000 iterations | SHA256 | 1,144,402.52 | 36.05 | 14.94 bits |
| 256 bits | 2000 iterations | SHA512 | 838,315.36 | 35.81 | 14.51 bits |
| 256 bits | 2000 iterations | SHA256 | 833,425.22 | 34.96 | 14.54 bits |
| 256 bits | 4000 iterations | SHA256 | 845,963.13 | 17.60 | 15.55 bits |
| 256 bits | 4000 iterations | SHA512 | 831,825.83 | 17.49 | 15.54 bits |
So if my methods are to be believed, using a 2000 to 4000 iteration PBKDF2 to derive a key from user input makes the attacker do extra work equivalent to an additional 13 to 16 bits of entropy in the input key. That’s a huge increase in the attacker’s computational workload, and seems well worth the additional effort.
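The last column of the table can be approximately reproduced from the two rate columns: each doubling of the attacker's per-guess work counts as one extra bit, so the figure is the base-2 logarithm of the slowdown. A minimal sketch follows. The function names are hypothetical, and this is not the LibTomCrypt benchmark tool itself.

```python
import math


def equivalent_extra_bits(plain_ops_per_sec: float, derived_ops_per_sec: float) -> float:
    """Express the slowdown factor as bits: one bit per doubling of work."""
    return math.log2(plain_ops_per_sec / derived_ops_per_sec)


def effective_bits(passphrase_entropy_bits: float, extra_bits: float) -> float:
    """Rough strength of a passphrase once the key derivation cost is added."""
    return passphrase_entropy_bits + extra_bits


# First row of the table: 128-bit AES, 2000 iterations, SHA256.
first_row = equivalent_extra_bits(1_139_930.82, 72.62)
```

For the first row this works out to roughly 13.94 bits, and the other rows land within a couple of hundredths of a bit of the published column. Under the same assumptions, `effective_bits` says a 64-bit passphrase behaves more like a 78-bit one.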
Desperately seeking comments alternative
The current situation with comments on apocryph.org is untenable. Before I upgraded to Drupal 6 I used the Akismet module to catch comment spam, which meant I could safely allow anyone to post comments without authentication or approval, with reasonable confidence that Akismet would catch the spam before it got posted.
That worked great, but the Akismet module hasn’t been ported to Drupal 6, and I’m beginning to think it’s not maintained anymore since the dev snapshot hasn’t been updated since September. Right now I have comments enabled but they go to the approval queue, which I just spent half an hour cleaning out due to the torrent of spam comments various spambots keep posting.
I know there are hosted comment solutions with AJAX goodness and spam filtering, but I really prefer to host my own, plus I must be able to keep the existing comments already posted through Drupal. Oh, what to do!
Fog Creek's shockingly proactive support
A while back I posted my thoughts on FogBugz On Demand, which were mostly positive but did complain about some deficiencies. Not a day later, I received the following email from Mike Pryor, co-founder of Fog Creek Software:
“What if I want to view cases in the 6.1 beta and the 6.1 gold release? Nope. How about cases in both the “Logging” area and the “Reporting”area?”
Easy! Just search for
Fixfor:”6.1 beta” OR fixfor:”6.1. gold release”
Or
Area:”logging” OR area:reporting
And then just save your search as a named filter.
(There are tons of axis project: area: assignedto: openedby: editedby: alsoeditedby: etc)
Truth be told, I had forgotten about the search axes in FogBugz, although using them in any complicated way does limit your ability to tune the filter with the GUI. But no matter; that’s not the point of this post.
The point of this post is to relate how astounding it is to have such a support experience. This must mean that someone at Fog Creek actually looks around for rants about FogBugz, or at least takes note when they run across them, and that founder-level employees take the time to address them personally. If I knew nothing else about FogBugz and Fog Creek, that would be enough to lead me to patronize them whenever possible. I urge you to do the same.
(Trying to have) Fun with Markov
This past weekend I dusted off my prototypical Ruby implementation of Markov chains for the purposes of generating sentences that bear striking similarities to a corpus of sample text, but are in fact random nonsense text. My first exposure to this idea was the implementation in Kernighan and Pike’s The Practice of Programming, but I’ve run across it a number of times since.
Most Markov text generation schemes I’ve run across are just for fun, like mixing the text characteristics of the Bible and Dr. Seuss or whatever. My idea was to use Markov text generation to generate memorable, secure passphrases which resemble a familiar text. I figured Markov chains would generate structurally sound sentences which would enable users to remember the sentence in terms of its structure rather than a random sequence of words, which is a common cognitive trick to remember long strings. I’m not the first to have this idea; at least passkool implements a variation of the same idea.
Markov chains aren’t hard to implement, and after a few hours I had a working implementation and some unit tests. However, I wanted something a little different: I wanted to compute the information theoretic entropy of each generated string, so users could ensure the strength of their passphrase was commensurate with the key or data being protected.
Shannon’s theory of information established a formalized definition of information entropy, which allows us to determine exactly how many bits of information are encoded in a particular variable given the probabilities of each of the variable’s possible values. This adapts very nicely to Markov chains, which are themselves really just states linked by state transitions of various probabilities. Using this formalization, I can use Markov chains to generate some text, and determine how many bits of information are encoded in the text.
The reason this is cool is that it relates directly to cryptography. Since humans tend to be unable to reliably memorize long binary cryptographic keys, we’ve taken to using passwords (or, hopefully, passphrases) which can be cryptographically converted into long binary cryptographic keys and are usually easier for humans to remember. The problem with this approach is that, if you’re not careful, you’ll kneecap your encryption algorithm by using a weak passphrase.
For example, let’s say you need a 128-bit AES key, and rather than remember the key (or even worse, write it down!) you derive it from a passphrase which you can remember. Once derived, you use the key confident that even the US government probably can’t break your 128-bit encryption. However, it’s quite possible you’re actually using what is effectively 32-bit encryption, or possibly less, depending upon your passphrase. Is your passphrase something obvious like your name, the word “secret”, or a dictionary word? Then it’s not as secure as the 128-bit key it’s being used to derive. A random 128-bit key has 2^128 (a huge number, believe me!) possible values, but your shitty passphrase could be guessed within maybe a few million tries, which is easy to brute-force with modern computers.
Security professionals who understand this problem give us rules of thumb to relate the length and composition of a passphrase with a corresponding key strength (for example, the 1.2 bits per character rule), but this is only a rough approximation of security. If you pick a quote from Roget’s, the quote might be 100 characters long, and thus 120 bits strong, but that’s only true for an adversary who will try to guess your passphrase at random based on English letter order probabilities. If the adversary knows or has reason to guess you took the quote from a quote book, the security of the key is considerably lower. If the book has 10,000 quotes in it and you picked one at random, that’s -(1/10000 * LOG(1/10000, 2)) * 10000 bits of entropy, or about 13.3 bits. Any attacker worth his salt can try all quotes from the quote book in seconds.
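The quote-book arithmetic is an instance of Shannon's formula H = -Σ p·log2(p). For a uniform choice among N items it collapses to log2(N). A minimal sketch, with a hypothetical function name:

```python
import math
from typing import Iterable


def shannon_entropy_bits(probabilities: Iterable[float]) -> float:
    """H = -sum(p * log2(p)) over the outcomes with non-zero probability."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


# One quote picked uniformly at random from a book of 10,000 quotes.
quote_book_bits = shannon_entropy_bits([1 / 10_000] * 10_000)

# The 1.2 bits-per-character rule of thumb applied to a 100-character quote.
rule_of_thumb_bits = 1.2 * 100
```

Here `quote_book_bits` equals log2(10000), a little over 13 bits, against the 120 bits the rule of thumb would credit the same quote with.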
This is where Markov text generation comes in. Since the text is generated from a series of state transition probabilities, it’s easy to compute the exact entropy level (that is, key strength) of each generated string. That’s what the measure_entropy_for_tokens method of my Markov class does. With this measurement, you can be confident that an attacker who can precisely duplicate your training corpus and Markov chain parameters nonetheless must brute-force the phrase with a level of difficulty comparable to brute-forcing a cryptographic key with similar entropy.
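Here is a minimal Python sketch of the general idea, not the Ruby class described above. It assumes a first-order, word-level chain and scores a phrase by summing -log2(p) over the transitions actually taken (the self-information of that path). The names `train`, `path_bits`, and `generate` are hypothetical, and `measure_entropy_for_tokens` may well compute its figure differently.

```python
import math
import random
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

START, END = "<s>", "</s>"  # hypothetical sentence boundary markers


def train(sentences: List[List[str]]) -> Dict[str, Counter]:
    """Count word-to-next-word transitions for a first-order chain."""
    transitions: Dict[str, Counter] = defaultdict(Counter)
    for words in sentences:
        chain = [START] + words + [END]
        for current, following in zip(chain, chain[1:]):
            transitions[current][following] += 1
    return transitions


def path_bits(transitions: Dict[str, Counter], words: List[str]) -> float:
    """Sum -log2(p) over every transition used to produce `words`."""
    bits = 0.0
    chain = [START] + words + [END]
    for current, following in zip(chain, chain[1:]):
        counts = transitions[current]
        p = counts[following] / sum(counts.values())
        bits -= math.log2(p)
    return bits


def generate(transitions: Dict[str, Counter], rng: random.Random) -> Tuple[List[str], float]:
    """Random-walk the chain from START to END; return the words and their bits."""
    words: List[str] = []
    state = START
    while True:
        counts = transitions[state]
        choices = list(counts)
        state = rng.choices(choices, weights=[counts[c] for c in choices])[0]
        if state == END:
            return words, path_bits(transitions, words)
        words.append(state)


# Toy corpus: the only real choice is between "b" and "c" after "a".
model = train([["a", "b"], ["a", "c"]])
phrase, bits = generate(model, random.Random())
```

In the toy corpus every generated phrase scores exactly 1 bit, because only one step offers a choice. Transitions with a single possible successor contribute nothing, which is how a long run of words can add zero bits.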
Once I had this implemented, I started to generate sentences from all sorts of sample text, from the collected works of Rudyard Kipling to a wide assortment of sci-fi. Whether I generated vaguely-pronounceable nonsense words or whole sentences, the length of text required to reach 128 bits of entropy was a lot more than I expected. This was made even worse by the occasional word strings which had zero entropy (meaning there was no chance any other word would be selected) due to unique combinations of words in the source corpus.
Here are some examples I generated using a collection of children’s books from Project Gutenberg. I find kids’ books have simpler sentence structure and a smaller vocabulary, so they make for easier-to-remember passphrases:
Bill got hurt in their banishment (16.01220550168 bits)
But if you wish, distribute this etext electronically, or by disk, book or any little girls their lessons, and then ventured to move some time before morning the good Saint come to you,’ said Fergus, ‘with greetings from Concobar the King likewise to Fergus, and he wasn’t black (84.2709242256745 bits)
I should use nought save a half-dozen jealously guarded little precincts of good cheer (25.6622696871773 bits)
I did seek it (17.3989261460847 bits)
You see Lightfoot has no hair on him (24.9873805812463 bits)
Shadow is the child, most fair (20.460032160771 bits)
Christmas is going to Johnny, rubbed her head on one of your making such a hard white crust on the shore (51.4191467788532 bits)
The white snow fell softly, softly, and then he sometimes does great damage (36.4897175557749 bits)
WHO IS THERE? she said she’d marry me (11.7074031814505 bits)
Here’s Martha, mother! cried the two big caterpillars, a lizard, a small gold ring began to fly at intervals, like a drill, or as if you have already enjoyed them–without knowing or wondering why (61.6564060837357 bits)
Yes, Mammy, said Epaminondas (12.5248260288151 bits)
*These Etexts Prepared By Hundreds of Volunteers and financial support to provide volunteers with the Mouse family (20.1629955228745 bits)
Martha didn’t like to feel just as useful as you can, and very pretty song (36.138788709601 bits)
Note the entropy measurements next to each phrase. These entropy figures are specific to the exact corpus I used to train the model.
Imagine you need 128 bits of entropy; you’d need to combine at least two and possibly six or more of these phrases depending on the strength of each one. I’m not sure I could reliably remember such a passphrase, and I certainly couldn’t accommodate a great number of them for various accounts.
I think the lesson here is that Markov text generation is a good approach to passphrase generation, and that 128 bits of entropy is a lot of information for the human brain to contain. I suspect there are some optimizations to be had to pack more entropy into a more memorable package, but the real limitation here is human memory capacity.
The code for the Markov implementation, tests, and generator tool is on my SVN repository here. I didn’t upload my corpus to SVN since it includes some copyrighted works; I suggest Project Gutenberg as a great source of public domain text files.
Trying out Disqus comments
After posting my lament about the shitty blog comment options in Drupal, I’ve rummaged around and decided to give Disqus a try. The comment system is hosted by them completely separately from Drupal, but their system allows me to export comments so if/when they go tits up I can at least save the comments off somewhere.
I’m not too crazy about their integration scheme, which relies on using AJAX goodness to insert the comments into a div element I provide on each page. This means search engines won’t index any comments in situ, though I imagine engines crawl the Disqus site and will find them that way.
Anyhow, it’s a shitty implementation, but it was the best I could find. Hopefully the Akismet module for Drupal gets ported to 6.x or I find some other way to solve the comments problem.