a glob of nerdishness

May 25, 2007

Apache-less SVN: Subversion between two Macs via SSH

written by natevw @ 6:55 pm

My first Mac, an original G4 mini, hasn’t seen much use since my new iMac arrived last summer. Hoping to start “spare-time” developing again, and now addicted to version control through work, I made it my goal one afternoon to get Subversion running between the old and the new. Eventually I’d like to expand this beyond source code to a home directory or even a .Mac backend, depending on what Leopard ends up offering.

Apple has an article on compiling a full SVN stack on OS X for use with Xcode. This looked a little bit more painful than I was aiming for, for some reason involving hand-patching Apache 2.0 and plenty of configuration. That route is spelled out nicely (1), but I wanted something even simpler.

Enter Martin Ott and his ready-made Subversion installer packages. Here’s how you can set up your own low-carb Subversion system, hopefully in less than 30 minutes.

Install SVN

Download the SVN package and install it on both your server and your client machines. It will ask you for your password, and then install several binary utilities in /usr/local/bin. This does not include any Apache modules but it does include a standalone daemon named ‘svnserve’. This means no WebDAV access, but you can easily use svnserve with ssh to access your data securely across a network.

Configure paths

Unfortunately, programs in /usr/local/bin aren’t easily used by default. The next step is to get the utilities into your Terminal path. On your client machine, just add export PATH=$PATH:/usr/local/bin to your ~/.bash_profile file. For access via SSH to work, the server needs that same line, but added to the machine-wide /etc/bashrc configuration instead of your user’s profile. For this you will need administrator privileges; something like sudo pico /etc/bashrc should do the trick. (Alternatively, you could cd /usr/bin/; sudo ln -s /usr/local/bin/svnserve to avoid editing bash’s global configuration. In this case, also configure your user account on the server just as you did the client, because in the next step we use another utility via that path.)

Make a repository

The repository is where all your past and present data will be stored on the server, accessed via the SVN tools. Pick a spot where you don’t mind an extra folder sitting around, and execute on the server machine: svnadmin create ~/repository. Feel free to change the path; as given, it will make a new “repository” directory in your home folder and fill it with files. These internal files should generally not be used directly; we’ll test the SVN commands in the next step.

Accessing the repository

Make sure you have “Remote Login” enabled in System Preferences > Sharing on the server machine. If not, enable it, and you should be ready to go back to the client machine. “Check out” your new (empty) repository with this command: svn checkout svn+ssh://your-machine-name.local/Users/yourname/repository new_local_folder. After giving your ssh password a few times, you should find “new_local_folder” copied into the current path.

Beyond…

If you’re not familiar with SVN commands themselves, the maintainers provide a top-notch Subversion Book that is both tutorial and reference. Also by now, you’re probably sick of typing your ssh password. You can improve this by using authorized keys and ssh-agent. The previously linked Apple article finishes with a section titled “Xcode and Subversion”, though there might be an incompatibility between Xcode and the latest SVN software. Let me know how it goes if you make it that far, or get stuck along the way!


  1. Unlike the error messages from Fink trying to update a previous Apache2/SVN install. I gave Fink the ‘rm -rf /sw’ treatment, and now I feel better.

May 23, 2007

Xcode Regular Expression replacement

written by natevw @ 7:55 pm

Thanks to “Search and Replace in Xcode” at Noodlesoft’s blog, I learned how to use RegEx grouping in Xcode’s Search and Replace. In short: \1 instead of $1. Now I can convert the wiki-formatted tables of Kyōiku kanji into a Quisition flash card pack.
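To show the \1 style of group reference outside Xcode, here is a minimal sketch using C++’s std::regex, which is a different engine from Xcode’s. The row layout ("| front || back") and the row_to_card name are assumptions made up for illustration, not the actual wiki markup of the Kyōiku kanji tables. By default std::regex_replace expects $1, and the format_sed flag switches it to \1.

```cpp
#include <regex>
#include <string>

// Hypothetical wiki table row: "| front || back".
// Returns "front<TAB>back", or the row unchanged if it does not match.
std::string row_to_card(const std::string& row) {
    static const std::regex pattern(R"(^\| *([^|]+?) *\|\| *([^|]+?) *$)");
    // format_sed selects sed-style group references: \1 and \2, not $1 and $2.
    // The tab between them is a literal tab character in the format string.
    return std::regex_replace(row, pattern, "\\1\t\\2",
                              std::regex_constants::format_sed);
}
```

Calling row_to_card on each line of a table would give tab-separated pairs, which is one plausible shape for a flash card import.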

May 19, 2007

Scratch

written by natevw @ 1:38 pm

Scratch, designed at MIT, has been making a splash in major news outlets lately. The Scratch project is basically Xcode meets YouTube for games. Sounds strange, but it’s a combination of a desktop-based integrated development environment and a community-based game sharing site. In order to pull this off, the designers developed a programming language that succeeds at being both simple and powerful.

When I saw the headline in the middle on BBC News, I was instantly suspicious:

Scratch front page BBC News

The picture was an instant reminder of Lego Programming hype. (Interesting aside: if you google lego block programming, the BBC article takes second place to that very post!) The article itself is less sensational, toning the headline down to “Free tool offers ‘easy’ coding” and revealing that it is actually a desktop application that allows creation of games.

Trying it out

Feeling that this had all been attempted before, I didn’t think much more of it until I got a note from Hannah, who had also seen the article: “Would this help me understand you?” I guess it’s worth a shot! She downloaded it, and together we figured out how to get the thing started. The cryptic “Open” dialog box was a bit rough, but after we figured out we needed to select the Scratch.image file, she was on her way.

As I watched her learn the “language”, I became impressed with its enlightened design. Scripts are indeed built from graphical pieces, but the process is much more powerful than a mindless succession of commands and much less tacky than drawing flowchart lines. The user can nest commands into control structures and insert test questions into conditionals. Like an open-ended puzzle, the shapes and colors of these “building blocks” give subtle clues as to what pieces may be missing.

Impressive simplifications

Programming in Scratch is a completely interactive affair. The user can attach a number of independent scripts to graphical sprites. Each script is started by one of many available triggers. These triggers are always active, despite the presence of a green flag (just another trigger option) and a stop sign (just an emergency kill switch) in the interface. The interactivity goes deep — there isn’t even an “initial stage”. What the user drags around with the mouse and what gets moved via a script are on equal footing. This is a daring move, forcing the explicit scene initialization in some cases, but greatly simplifying the user model. This is just one of many well-executed simplifications. Variables are also made accessible to beginners, being visible by default and having at most two scope options.

As mentioned before, scripts are attached directly to sprites, meaning that in lieu of a master “puppeteer” script, the sprites are essentially autonomous. All scripts run concurrently upon being triggered. This is the coolest simplification, and also means parallelization becomes an integral part of this beginner’s programming language! Concurrency is heavy wizardry when tacked onto many programming models, but is very intuitive in the context of sprite objects. The cat has ideas and the mouse has its own. While shared variables can be used to communicate between scripts (and are an easy concept for beginners), there is also a global message broadcast system. Scripts can broadcast messages, asynchronously or synchronously, which are received by all scripts that are triggered by that particular message.
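For readers who think in text-based languages, the broadcast idea can be sketched roughly as below. This is not how Scratch is implemented. The Stage class and its method names are hypothetical, and the sketch only models the synchronous flavor: the scripts listening for a message run one after another rather than concurrently.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the Scratch stage: scripts subscribe to messages.
class Stage {
public:
    using Script = std::function<void()>;

    // Registers a script to be triggered whenever `message` is broadcast.
    void when_i_receive(const std::string& message, Script script) {
        scripts_[message].push_back(std::move(script));
    }

    // Runs every script registered for `message`; returns how many ran.
    std::size_t broadcast(const std::string& message) {
        auto found = scripts_.find(message);
        if (found == scripts_.end()) {
            return 0;
        }
        // Copy first, so a script may register more scripts while we iterate.
        const std::vector<Script> to_run = found->second;
        for (const Script& script : to_run) {
            script();
        }
        return to_run.size();
    }

private:
    std::map<std::string, std::vector<Script>> scripts_;
};
```

The point of the sketch is that no “puppeteer” appears anywhere: the cat’s script and the mouse’s script each register for a message, and neither knows about the other.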

Wide audience

Scratch doesn’t just encourage algorithmic thinking with the usual control structure and variable concepts. It also teaches important parallelizing patterns that “real” programmers typically encounter only in still-esoteric languages. Scratch can help my wife understand my field, and I hope it can be effective for younger learners. (With hardly any clues from me, Hannah made a reasonably involved Island Rescue game after only two other experiments.) The publicity is justified. What came as further surprise is that it can inspire my programming patterns and design strategies too! Well done, MIT!

May 12, 2007

Subtle bugs stink

written by natevw @ 7:20 am

Consider the following piece of code, to process hex color strings:
sscanf(text, "#%2x%2x%2x%2x", &rgbcolor.r, &rgbcolor.g, &rgbcolor.b, &rgbcolor.a);

Looks pretty standard, eh? Well, that piece of code cost a day’s work. I was getting segmentation faults in std::list’s iterator post-increment. The iterator had no clear reason to be invalid, although the debugger did show bad values (0x0 or 0xbf000000 for the _M_node internal pointer it was trying to dereference). I copied my project, and hacked it down to a smaller main code path. Strangely enough, the presence of the constructor for an otherwise unused class was the shibboleth that determined whether my code crashed or not. Suspecting a compiler bug (how pride resists being wounded!), I tried all manner of settings (optimization, static vs. dynamic linking) before I finally went back to said constructor. Lo and behold, when the constructor called a code path that involved the sscanf above, std::list exploded. When not, it was fine. What on earth?

Well kids, this is why C-style formatters are considered harmful. Sure I had four %x’s and four variables, but each “%x” expects a pointer to an ‘int’ and my variables happened to be ‘char’s underneath a typedef. Every conversion wrote a full int through a pointer to a single byte, clobbering whatever sat next to it in memory. So the problem was not in std::list, not in the derived class, not in my main code and not even directly in the constructor that clued me in.
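One way to keep sscanf while avoiding the overrun is to scan into real unsigned ints and narrow afterwards. This is a minimal sketch, not the code from my project: the Color struct and the parse_hex_color name are assumptions standing in for the typedef’d struct, which isn’t shown here.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical stand-in for the typedef'd color struct.
struct Color {
    unsigned char r, g, b, a;
};

// Parses "#rrggbbaa". Returns true only if all four components were read.
bool parse_hex_color(const char* text, Color* out) {
    assert(text != nullptr && out != nullptr);
    // %x stores through an unsigned int*, so give it unsigned ints to fill.
    unsigned int r = 0, g = 0, b = 0, a = 0;
    if (std::sscanf(text, "#%2x%2x%2x%2x", &r, &g, &b, &a) != 4) {
        return false;
    }
    // Each field is at most two hex digits, so every value fits in one byte.
    out->r = static_cast<unsigned char>(r);
    out->g = static_cast<unsigned char>(g);
    out->b = static_cast<unsigned char>(b);
    out->a = static_cast<unsigned char>(a);
    return true;
}
```

Checking the return value also catches short strings such as "#ff80". With a C99 or C++11 library, the hh length modifier ("%2hhx" with unsigned char pointers) is another option.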

Using C++ stringstreams can be a pain, but so is debugging things like this. I wonder if that’s why C++ often gets a bum rap.
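For comparison, here is roughly what the stringstream route might look like. Again a sketch under assumptions: Color and parse_hex_color_stream are hypothetical names, and the input is assumed to be exactly "#rrggbbaa". It is wordier, but the extraction target’s type is checked by the compiler instead of trusted at run time.

```cpp
#include <cassert>
#include <cctype>
#include <sstream>
#include <string>

// Hypothetical stand-in for the typedef'd color struct.
struct Color {
    unsigned char r, g, b, a;
};

// Parses exactly "#rrggbbaa"; leaves `out` untouched on failure.
bool parse_hex_color_stream(const std::string& text, Color& out) {
    if (text.size() != 9 || text[0] != '#') {
        return false;
    }
    unsigned char bytes[4] = {0, 0, 0, 0};
    for (int i = 0; i < 4; ++i) {
        // A stream would happily read all eight digits at once, so feed it
        // one two-digit component at a time.
        const std::string digits = text.substr(1 + 2 * i, 2);
        if (!std::isxdigit(static_cast<unsigned char>(digits[0])) ||
            !std::isxdigit(static_cast<unsigned char>(digits[1]))) {
            return false;
        }
        std::istringstream in(digits);
        unsigned int value = 0;
        in >> std::hex >> value;  // operator>> is chosen by the target's type
        assert(!in.fail() && value <= 0xFFu);
        bytes[i] = static_cast<unsigned char>(value);
    }
    out = Color{bytes[0], bytes[1], bytes[2], bytes[3]};
    return true;
}
```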

May 8, 2007

Rotating cylinder lift

written by natevw @ 9:17 pm

Did you know a rotating cylinder can generate lift? It’s not very efficient, but it has been used to propel a ship across the Atlantic. I learned this strange bit of trivia through an electrical engineering friend who is now a broadcast radio technician. Spending time climbing to the tips of tall radio towers on a regular basis, I guess any source of lift is just cause for excitement.

Lift of a Rotating Cylinder on a NASA education page has the real skinny, but from what I gather (and correct me if I’m wrong) the cylinder needs to be moving “into the wind”, relatively speaking. The key is that the moving surface of the cylinder tends to drag some air along with it. If the bottom is moving against the oncoming stream, it will be slowing the air below while the top stream is helped along. Slower on the bottom, faster up top? Mix in a little Bernoulli and it’s up and away!
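For the curious, the textbook idealization of this effect is the Kutta-Joukowski relation for a spinning cylinder in a uniform stream. Treat it as a sketch: it assumes ideal (inviscid) flow, and the symbols are my own labels, so real cylinders will not match it exactly.

```latex
% L'     : lift per unit length of cylinder
% \rho   : air density
% V      : speed of the oncoming stream relative to the cylinder
% \Gamma : circulation set up by the spinning surface
% b      : cylinder radius
% v_r    : surface speed of the cylinder, with \omega the spin rate in rad/s
L' = \rho \, V \, \Gamma,
\qquad \Gamma = 2 \pi b \, v_r,
\qquad v_r = b \, \omega
```

Since the lift is proportional to V, a spinning cylinder in still air gets none, which fits the “into the wind” requirement.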

A better plan for spam? [academic version]

written by natevw @ 11:36 am

I was going to also publish an even lengthier version of the “A better plan for spam?” article I wrote up. It tried harder to defend all its points, and to make sense to a wider audience.

However, after reading the nerd version, my wife said it made a whole lot more sense than did the windier “academic version”. I guess I’d make a better Dijkstra than Dooyeweerd. Fair enough! Since publishing the ‘nerd’ version, I’ve gotten some insightful comments — thanks and keep them coming as long as you’d like! Meanwhile, this is not the blog of schoolishness and the end of this post is right here ->.

April 28, 2007

A better plan for spam? [nerd version]

written by natevw @ 12:45 pm

I think that small tariffs on “published” messages could stop spam, and also reduce the need for advertising online. Users don’t pay to receive e-mail, but they do pay a little each time they send. Users don’t pay to read web pages, but they do pay a little to comment on them. This wouldn’t change the model of the Internet or Web greatly, but it could significantly change the result.

The need for a better solution

Stopping spam is a difficult problem. It could be tackled at the source and/or the destination. Politicians could criminalize spam, or at least tax it — ideally, stop spam at the source. Why anyone would want more regulation and more taxes is unfathomable to most computer programmers. We prefer to invent clever algorithms instead — ideally, stop spam at its destination.

There’s a problem with our approach. Spam isn’t caused by buggy machines, it’s caused by greedy people. If humans have trouble deciding whether an e-mail is in their best interest, how much more a disinterested computer! Even if you believe that greedy people are actually just buggy machines, are you willing to bet the spam battle on your ability to create a superior being? If you have an algorithm that takes text as input and gives a motive as output, there are 33 sets of grieving Hokies still groping for answers. (I’m not sure whether Cringely’s recent “search engine for hate” post is meant as some kind of sick joke to that effect, or what.) There’s no algorithm that can stop sin, except for a sacrificial love which overcame death.

The solution applied to e-mail

Don’t keep up the AI arms race; spammers are clever too. Don’t let the government start skimming more profits; spammers are expert evaders. Instead, let the ISPs charge to accept email on your behalf. This might require modifications to the inter-net email relaying protocols, but with things like DomainKeys and server-based spamboxes we’re doing that anyway.

The solution on the Web

There’s more to gain on today’s Web. Imagine leveraging an OpenID-like (or even OpenID-based) system in a micropayments context. Please don’t check out just yet if you think you’ve heard this all before.

Five years and who knows how many market cycles later, I still agree with most of Jakob Nielsen’s User Payments synopsis. Basically, paid content stinks, but ads and spam stink worse. “Information wants to be free” may remind me of “Arbeit macht frei”, but I still don’t want to pay to read your blog. There’s a lot of pressure keeping the Web from going back to a royalty model, perhaps rightly so. But we can flip micropayments around.

A practical scenario

The Web is currently infatuated with user-generated content, so there are plenty of other extrapolations, but weblogs are an obvious use. A successful blog currently tries to exploit its fixed content for money and protect its user feedback from abuse. This is naïve and ugly: income via ads, protection via captchas/logins/AI plugins. Instead, why not install a micropayments plugin? Basically:

  1. The webmaster chooses a username/signature service from any number of providers
  2. These service providers act as brokers, deducting/depositing money into the user’s account
  3. The user’s info/balance is not stored with the service providers, but in an account with a larger entity

Basically, it’s PayPal for Web 2.0 — less centralized, more user friendly. The user doesn’t want to keep his credit card on file with every blog he uses, but neither can one company be expected to provide plugins for every blog/forum/link-swamp-service out there. Hence the middle-man service providers. They provide an API specific to a problem-domain, and use a standard protocol to link up with any main account provider the user chooses.
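To make the three roles concrete, here is a toy sketch of how the pieces might fit together. Everything in it is hypothetical: the AccountProvider and CommentBroker names, the whole-cent amounts, and the idea that the full tariff goes to the blog owner. It ignores authentication, signatures and privacy entirely.

```cpp
#include <map>
#include <string>
#include <utility>

// Hypothetical "larger entity": holds each user's balance, in cents.
class AccountProvider {
public:
    void deposit(const std::string& user, long cents) {
        balances_[user] += cents;
    }

    long balance(const std::string& user) const {
        auto it = balances_.find(user);
        return it == balances_.end() ? 0 : it->second;
    }

    // Moves money only if the payer can cover the full amount.
    bool transfer(const std::string& from, const std::string& to, long cents) {
        if (cents <= 0 || balance(from) < cents) {
            return false;
        }
        balances_[from] -= cents;
        balances_[to] += cents;
        return true;
    }

private:
    std::map<std::string, long> balances_;
};

// Hypothetical broker: the service a blog plugs in. It stores no balances
// itself; it only asks the account provider to move the tariff.
class CommentBroker {
public:
    CommentBroker(AccountProvider& bank, std::string blog_owner, long tariff_cents)
        : bank_(bank),
          blog_owner_(std::move(blog_owner)),
          tariff_cents_(tariff_cents) {}

    // The blog would publish the comment only when this returns true.
    bool charge_for_comment(const std::string& commenter) {
        return bank_.transfer(commenter, blog_owner_, tariff_cents_);
    }

private:
    AccountProvider& bank_;
    std::string blog_owner_;
    long tariff_cents_;
};
```

With a 2¢ tariff, a reader who has deposited 5¢ could comment twice and would be refused the third time. A spammer posting thousands of comments would be funding the blogs he targets.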

Result: Blogs with active discussion have a reasonable, and practical, alternative to advertisements. Users think just a little bit before they respond. Most importantly, spammers get sick of making blog owners rich.

Potential drawbacks

Users could refuse to accept such a plan, preferring to leave the financial and spam-filtering burden completely on the content providers’ tab. But trading chaotic captchas and commercials for a painless just-charge-it is not really a bad deal. I’m paying about a tenner a month for the privilege of writing on my own site, and I’d gladly support your site with my literal 2¢. This does mean speech for the end-user would no longer be free-as-in-beer, but if adequate attention was paid to privacy concerns there would be no harm done to free speech. If a user doesn’t have a credit card, they can sign up with an account provider that foots the tab in exchange for completing Mechanical Turk-style tasks.

Spammers may not give up so easily, though. If they can’t afford to pay microtariffs on everybody’s blog, the most likely alternative is to instead target only relevant blogs. How this differs from AdSense, I don’t know. The only serious concern left then is security. Spammers must not be able to phish or botnet their way into anyone else’s main account. But this is a concern for countless other scenarios as well, and is by no means an excuse not to try. And if time is money, the time we save by tackling the problem of spam as capitalists rather than technologists will more than make up for any spurious charges encountered as the system is gaining experience.

The plea

Spam is a people problem, as is making a microtariff system acceptable. I am a programmer. If spam was a big interest of mine, I might be immersing myself in Bayesian Filtering and Neural Networks in a technologic attempt to save the world. But this series has been immersing enough, and I have no interest in becoming the PayPal of the New and Amazing Web. Currently this site is low bandwidth and SpamKarma hasn’t swatted too many of my dad’s comments. However, when this site’s readership and “spammer”ship grow beyond what my budget can bear, I would much rather join a privatized microtariff economy than Google’s ginormous AdSense scheme. If you know someone who’s looking for a get-rich-and-save-the-world,like,yesterday scheme, I’d love to hear what they think of this plan. The comments are open and, for as long as practical, still free!

April 21, 2007

Refinance our survey for free smilies

written by natevw @ 6:24 pm

It is unfair to discuss the problem of spam without bringing up its archetype. Spam is the barbarian offspring of advertising, itself an uncouth tradition. Marketing, and often with it some advertising, is important to good business. That said, it’s disgusting how prevalent and tasteless advertising has become.

The relevance of marketing

The best advertisements still serve their victim: Hey, maybe you haven’t heard of us, but we do X which might help you Y. But even the “spend time with your kids, don’t start forest fires” advertisements are still distractions from the main course.(1) As far as living goes, is spam really much less relevant to me than the hundreds of other ads I see online every day?

In a sense, web advertising is highly relevant to my online life. Without it, countless online destinations would only be hurt by my tourism (the previous post explains). Google gets about 90% of its incredible income (note those figures are in thousands of dollars!) from advertising. It’s hard to imagine the “google.com” domain name becoming a home for squatters. Given the fortune Google has amassed, I don’t doubt they could keep paying their top-notch engineers long past finding alternate revenue sources, but many other domains would be in dire straits if it weren’t for advertisements.

The need for advertising

Web sites do need to pay their expenses. There’s not really a good reason for CLICK THE SCROLLING FLAMING MONKEY!!!, or the weird dancing business guy(2). In case you want your memory refreshed, Jakob Nielsen has listed some other sketchy advertising techniques that are still pretty common nowadays. As he points out, nothing good comes out of making users annoyed. There is, however, certainly a place for paying the bills through corporate sponsorships, intelligently targeted graphical links, and even computer-placed text ads with all their mechanised “Sense”itivity. I just don’t see an enduring economy for so many sites getting into the ad-supported model, to say nothing of the aesthetics in that situation.

Getting away from advertisements

To be honest, it was primarily my aggravation with advertising that led me to start this series. I think the Web, perhaps even the whole Internet, is ripe for an idea that can reduce the motivation for all forms of advertising: everything from the spam kings’ unfeeling e-mail blitzes to Google’s sentient AdWords revenue scheme. In my next post, I hope to finally elucidate such an idea. We’ll be back…


  1. Although Wikipedia’s eye-opening article on television ads notes that commercials have all but become an essential part of the television experience. I can hardly stand television, but I understand a great many people do. Hopefully this diatribe doesn’t come across as overly cantankerous to my more experienced readers.
  2. Do you get those? I keep seeing job site ads with this white-collar worker who was photographed rolling up his sleeve and in other quasi-professional poses, yet is animated to slowly move his legs as if he were proud that they’re put out of joint.

April 17, 2007

Who pays?

written by natevw @ 7:41 pm

You are paying to read this post. In some way, quite direct if you’re on a cell phone and rather indirect if you’re in a public library, you paid money so that you could view this Web page on your screen. This seems fair enough, especially if you are finding valuable info or entertainment [infotainment?] here. But did you know that I am paying for you to read this post, too? To make matters worse, as I continue to make this website more valuable to more people, the cost for me to give them this content will go up. On the Internet, we pay proportional to how much we’re being served, but also how much we are serving others. It wasn’t always this way.

Last fall, I was watching a Cringely interview with Brewster Kahle, the founder of the Internet Archive and other endeavors. He revealed that just before the Web really started taking off, “in the ’80’s and [early] ’90’s…, AOL would charge people about $6.00 an hour to be on their service, and 10% to 15% of that gross revenue went to the publishers that were making the experiences that the people wanted.”(1) He called this a “royalty model” — just like a book, the more popular a site was, the more money it earned its creators. But eventually, those in power realized that they controlled access for enough “eyeballs” that they could start charging the content providers for the attention they attracted. And so today’s Web model was born: guests pay for access, hosts pay for guests and we need to somehow raise enough money to host over 65 million websites.

In addition to the cost of bandwidth, spam has further raised the price of hosting a website. Why should, say, a game company have to take time from answering user questions to fix the tool they use to do so, just because spammers have moved into the boards? Multiply that by countless others, who each put up a website allowing user-feedback and a month or two later are frantically searching for a solution that will keep the crap at bay. The only reason it is affordable is because every single community-oriented site has a share in the snowballing(2) costs.

There are two problems that face community-oriented endeavors on the Web and on the Internet: how to make money, and how to keep the service from abuse. Typically the former is solved by advertising and the latter by advancing technology inspired by artificial intelligence research. I’ve discussed the trouble with the AI-based technology approach in my last two posts, and I intend to discuss the problems with advertising in my next. After that, we should be all set to look at a solution.


  1. A written transcript is available of the entire interview, which is a good view if you’ve got the time.
  2. Someone thinks of a new way to have the computer decide whether a chunk of data is spam or not, hundreds have to figure out how to implement the idea in their respective contexts, and thousands, sometimes millions, in each context have to figure out what’s the best blog comment plugin, what login/captcha system will best serve their guests, and what e-mail program or filtering service won’t junk too many of their friends’ random emails. Add some old-fashioned hand deletion of the ones that still get through, and we’re finally back where we wanted to be, only with an added annoyance or two. There aren’t many people whose time is best spent on this particular problem, but there are plenty who must deal with spam anyway.

April 14, 2007

Fighting evil with things, loving people with robots

written by natevw @ 6:25 pm

Though it’s really not an interest, I’ve been thinking about our response to spam a lot lately ever since I started seeing it as sin in the raw form of the word. It shouldn’t be a surprise, as even many who might laugh at the concept of “sin” describe spam as evil. (The author of SpamKarma, the tool I am currently using to shield you from endless pharmaceuticals, has laughed such laughs.) Thinking of spam as a sinful-human problem instead of a rogue-machine problem can change the way we approach the solution. I have a few more posts prepared related to this particular topic, which in the interest of not overwhelming you with my prolixity, I plan to spread out over the next week or so. Hopefully, talking about spam doesn’t bore you as much as it does me.

I spent the day writing up a solution that I think addresses some of these concerns. Why? I understand computer-based spam filtering has made definite progress. Yet I think AI-based technology will continue to be a battle we can only half win, analogous to many wars the United States has taken on in the last half of a century(1). Guns and computers are both highly inadequate tools when it comes to solving human problems. Yes, we have defended life with weaponry and we have facilitated community with machinery. But it’s not ideal. To think we can discourage spammers with computers is a man-versus-nature story to which most authors(2) would write a different ending.

To believe that technology can save us from spam is just as idolatrous as believing technology can save us from any other consequence of our sinfulness. We’ve relied on technology to do what we can’t for so long that now we are beginning to rely on it to do what we won’t. Our world would rather research and develop cute and cuddly “Mental Commitment Robot”s than spend time with the sick and the elderly. If the pictures at the bottom of that linked page don’t break our heart as Christians who are supposed to be the power of Christ and the love of God, I don’t know what it will take to wake us from our technicism.


  1. It is not my intent to make light of the current, or any previous, war. I am primarily a programmer. I am not a historian, not a strategist, not God. For the sake of those in the Middle East and for the sake of our troops, I hope and pray that Operation Iraqi Freedom will turn out as just that: political, economic and spiritual freedom. Sadly, I’ve been feeling more and more like one side has gotten us sorely off on the wrong foot, and the other side is intent on not letting any mistakes get fixed regardless of cost. All for what? [Very naughty word] politics. Please consider reminding a soldier of the joyful, wondrous and beautiful sides of the life they fight for, even if you must set aside some cynicism to do so.
  2. Yet it would be a disservice to you as a reader to pretend that everyone accepts the man/nature or the man/machine dichotomy. Much to the contrary. Perhaps this is why technologists get so excited about robots. Feeling they have breathed life into a machine gives them hope that computers can not only solve our homework problems, but also the problems we have on the playground.