06 December 2008

Regenerative Braking for Services

George Reese has a missive over on O’Reilly’s site about why auto-scaling your “cloud” application is a bad idea. He starts from the naïve case where scaling your computing without bounds leads to your expenses scaling without bounds as well. Okay, that makes sense. Then he goes on to explain that setting those bounds to do the right thing is too hard, and involves humans doing capacity planning, so you should just do better capacity planning with humans and leave the automation out.

Now I’m a big fan of robots and of having machines do tedious work for me, so this claim holds little sway with me. Frankly, the words “too hard” translated as I read them to “you could have a strategic advantage over competitors if you do this well.” Unsurprisingly, I’m not the only one who feels this way; in fact, several people chimed in with refutations and examples of how they’re already doing this today to great advantage.

A response by Sam Curren, Really Bad Reasons Not To Auto-scale, refuted most of the “it’s too hard to get it right” arguments. Adam Jacob had a good comment as well: if you’re monitoring the wrong things, it is in fact easy to get wrong. One can look to Don MacAskill’s post about SmugMug on EC2 for some examples of what measuring the right things can look like. Breaking things apart into pieces that are easier to measure is an implicit part of Don’s post that probably warrants its own discussion another time.

One thing that hasn’t been mentioned yet in this conversation is that if you don’t degrade gracefully under pressure, you’ve already lost in any of these models. If your service is starting to degrade (or you know it’s about to), the only hard part is knowing whether to grin and gracefully degrade under the temporary pressure, or bring in more capacity. Thing is, humans are quite capable of making the wrong call here, and even when they make the right call, they’ll make it much more slowly, and they won’t make it in the middle of the night when your service suddenly gets an unanticipated spike in popularity in Japan.

Back to the mental translation: if you can develop good algorithms (or even very simple ones) to better predict when to scale up and down, you save a lot of money that is traditionally blown on idle resources in slack times. Those idle resources can be turned off, pressed into use for non-time-critical batch work, or even sublet to someone else to do processing with. In fact, this last one is quite probably the business that EC2 and App Engine represent: “Here are some spare resources; let’s sell some usage on them rather than making $0 on resources that continue to cost money to run.” (That other large cluster players aren’t involved in this market yet indicates they either don’t have enough capacity as it is, or they aren’t in a position where they care about that idle cost yet, or they just don’t get it. That’s another interesting conversation in and of itself.)
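To make “even very simple ones” concrete, here’s a minimal sketch of a threshold-based scale-up/scale-down rule. The thresholds, the utilization metric, and the function itself are all hypothetical illustrations, not anyone’s actual tooling:

```python
# Hypothetical threshold-based scaler: double capacity under sustained load,
# shed one idle instance at a time in slack periods. All numbers are made up.

def desired_instances(current, utilization, lo=0.3, hi=0.7, floor=1, ceiling=20):
    """utilization is the fraction of capacity in use, averaged over a
    window long enough to ride out momentary blips."""
    if utilization > hi:
        return min(current * 2, ceiling)   # spike: bring in more capacity
    if utilization < lo:
        return max(current - 1, floor)     # slack: stop paying for idle boxes
    return current                         # steady state: leave it alone
```

Even a rule this crude reacts faster than a human paged at 3 a.m., and the ceiling is exactly the kind of bound that keeps expenses from scaling without limit.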

In any case, being more efficient about resource usage represents a competitive advantage that can make a big difference. It’s like the regenerative braking on hybrid cars. Many people simply eat the cost of wasting that energy as heat, perhaps not even knowing that there is a better way. With some initial investment and know-how, however, you can capture some of it and realize greater efficiency, and a cost savings to boot.

09 September 2008

Book Review: Life of Pi

An interesting story about trial through adversity. The story was reasonably good, but it didn’t particularly grab me and make me want to keep reading. I suspect this would’ve been a much more compelling tale told from the point of view of the tiger, which would’ve allowed the commentary on religion and the human condition to be somewhat less forced.

The story had a sort of disjointed episodic feel to it which came on rather quickly after the initial bit of character introduction. My mind is not yet made up on whether this style of storytelling helped relay the descent into madness from being trapped at sea. It would have worked better for me as a literary device if it had been used a bit more subtly. As it is, it seemed unintentionally scatterbrained.

For all the review commentary I read ahead of time about the religious message in this novel, it seemed tacked on in a very forced sort of way. The protagonist was confused about what to believe, tried to believe in everything at the same time, but largely just stood in awe of the nature before him and the lucky breaks he got every now and again. The awe and luck were attributed to whichever belief system seemed to best fit the moment. And when there wasn’t anything interesting going on, this aspect was completely forgotten.

The questions about predestination, and about what kind of benevolent god(s) kill your parents and dozens of innocent people and animals and then leave you alone on a life raft with a carnivore, were left largely unaddressed. In the end, the book left me wanting with its undirected episodic nature and its failure to ask hard questions that might scare off some readers.

06 September 2008

Boggle Solver

I've been working on a pet project with App Engine to try and get a better feel for it. It’s a Boggle puzzle solver that does some AJAXy tricks to multithread the solving work. The source code is online as well for anyone who’s curious. (There is a popular knockoff called “Scramble” on Facebook that is either different enough to keep Hasbro from filing a lawsuit, or Hasbro’s lawyers are waiting around for it to make some money before bothering.)

App Engine is particularly sensitive to requests that take “too long” to process. This was a particular hassle when I was importing the dictionary that the solver uses. To solve a Boggle puzzle fairly quickly you want to have all the possible words arranged in a trie. This way you can stop quickly if you’re following letters that will never spell anything as you traverse the puzzle board. It took splitting the dictionary into 5000 separate pieces to get it to load without pushing me over my quota for “long” requests. Luckily that only has to be done once.
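For the curious, the trie idea looks roughly like this. This is a from-scratch sketch rather than the solver’s actual code, and the class names are mine:

```python
# Minimal prefix trie: illustrates why the board traversal can stop early
# when no word begins with the letters walked so far. (Sketch, not the
# project's source.)

class TrieNode:
    def __init__(self):
        self.children = {}   # letter -> TrieNode
        self.terminal = False

class Trie:
    def __init__(self, words=()):
        self.root = TrieNode()
        for w in words:
            self.insert(w)

    def insert(self, word):
        node = self.root
        for letter in word:
            node = node.children.setdefault(letter, TrieNode())
        node.terminal = True

    def has_prefix(self, prefix):
        """True if any stored word starts with prefix: the early-exit test."""
        node = self.root
        for letter in prefix:
            node = node.children.get(letter)
            if node is None:
                return False
        return True

    def is_word(self, word):
        node = self.root
        for letter in word:
            node = node.children.get(letter)
            if node is None:
                return False
        return node.terminal
```

The `has_prefix` test is what lets the traversal bail out instead of exhaustively walking board paths that can never spell a word.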

Next came the challenge of the puzzle solving itself. Again, solving the whole board in one request takes a while to process, apparently more than is allowed without running into that “long” request quota. Even in an optimized form (see the links to Dan Vanderkam’s work at the end of this entry), the full dictionary trie is 3MB and that takes a non-trivial amount of time to load in when you’re trying to handle requests within a few hundred milliseconds.

The solution was to load the dictionary again, this time broken up by initial trigrams. For every initial three letters, I store the appropriate shard of the dictionary (all the words that start with that combination) as a trie. There is also a blacklist of trigrams that begin no words at all (for instance “frw”).
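A sketch of the sharding step, with made-up function names. The real import job stores each shard as a trie in the datastore; here the shards are plain lists to keep the illustration short, and words under three letters are simply ignored:

```python
from collections import defaultdict

def shard_by_trigram(words):
    """Group words by their first three letters. (Illustrative sketch;
    shorter words would need separate handling.)"""
    shards = defaultdict(list)
    for w in words:
        if len(w) >= 3:
            shards[w[:3]].append(w)
    return shards

def trigram_blacklist(shards, alphabet="abcdefghijklmnopqrstuvwxyz"):
    """Every three-letter combination that begins no word at all."""
    all_trigrams = {a + b + c
                    for a in alphabet for b in alphabet for c in alphabet}
    return all_trigrams - set(shards)
```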

The javascript calls in with a copy of the puzzle and a point on the board to solve from. The server code then finds all the trigrams starting from the specified point and loads the appropriate dictionary shards. Since we're only solving from one point on the board there won't be more than 72 shards to load for any javascript call. (9 directions from the point and then 8 directions from each of those points because we're not allowed to backtrack.) The server then traverses the board using the dictionary tries hunting for words. When it finds them it stores the word and the places on the board where the word was found.
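The traversal itself is a depth-first walk with trie-based pruning. A simplified sketch, not the deployed code: `has_prefix` and `is_word` stand in for lookups against the loaded shards, and the three-letter minimum is standard Boggle scoring.

```python
# Hypothetical depth-first walk from one starting cell, pruning whenever the
# letters so far are a prefix of no dictionary word.

def solve_from(board, row, col, has_prefix, is_word, path=None, found=None):
    """board is a list of rows of letters; returns {word: path of (row, col)
    coordinates where the word was first traced}."""
    if path is None:
        path, found = [], {}
    path.append((row, col))
    letters = "".join(board[r][c] for r, c in path)
    if has_prefix(letters):
        if len(letters) >= 3 and is_word(letters):
            found.setdefault(letters, list(path))
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                r, c = row + dr, col + dc
                if (0 <= r < len(board) and 0 <= c < len(board[0])
                        and (r, c) not in path):     # no backtracking
                    solve_from(board, r, c, has_prefix, is_word, path, found)
    path.pop()
    return found
```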

This information is all serialized into JSON and returned to the browser that made the call. The javascript in the browser is then responsible for taking all the found words and locations, sorting them in a sane way, and displaying them for the end user.
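The payload for one of those calls might look something like this; the field names are illustrative guesses, not necessarily what the deployed solver emits:

```python
import json

# Hypothetical response shape: each found word paired with the board
# coordinates it was traced through, ready for the browser-side javascript
# to merge, sort, and display.
response = {
    "words": [
        {"word": "cat", "path": [[0, 0], [0, 1], [1, 0]]},
        {"word": "car", "path": [[0, 0], [0, 1], [1, 1]]},
    ],
}
payload = json.dumps(response)
```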

Dan Vanderkam has written some interesting code and blog posts about optimizations in solving Boggle puzzles.

20 July 2008

Seven, Our New Rottweiler

Our newest addition is Seven, a 4-year-old Rottweiler that we adopted from King County Animal Control in Bellevue. She’s not exactly a petite dog at 105 pounds, and her snorts and grunts and growls belie her girlish charm and friendliness. Here she is with her backpack full of water bottles and dog supplies, ready to go for a tromp.



Our understanding of her history is that she was bred in order to sell the puppies, and that one day the breeder asked the neighbors to watch the dog for a week while they were out of town, and then never came back. Unfortunately the neighbors couldn’t keep Seven either, because they were moving to a city with a breed ban against Rottweilers. As an aside, that’s a ridiculous sort of law, since a recent study showed that the most aggressive breeds are actually Dachshunds and Chihuahuas, with Jack Russell Terriers rounding out the top three. Rottweilers and Pit Bulls were average or below average in aggression towards strangers among the breeds studied. Apparently some people in Seattle are clamoring for breed bans, not on aggressive breeds like Chihuahuas but on Pit Bulls. Luckily some sane owners have banded together, and Pasado’s Safe Haven has also gotten involved.

After recovering from her spay surgery Seven is doing well and seems to be enjoying her new home. She trained incredibly quickly to the invisible fence and gets along with one of our cats, Jasmine, who isn’t being an idiot around her. Odie still needs to learn that waiting until the dog gets close and then running away like prey is a bad idea.

Just yesterday Seven had her first class at Cascade K9. The lesson right now is getting her “walk” under better control so she doesn’t try to run out ahead. We also need to get a bench for her to hop up on to practice “climb” at home. As always, the most important element in dog training is the humans, so we’ll try hard to be up to the challenge and we’ll report the progress we make…