Archive for April, 2008
History Blog Meme
After Leon posted tonight, I figured I would follow in kind.
ken@thinky:~$ history 1000 | awk '{a[$2]++}END{for(i in a){print a[i] " " i}}' | sort -rn | head
93 ls
82 vi
55 cd
48 git
47 grep
25 gedit
21 sudo
20 merb
19 exit
9 spec
MacDaddy:~ ken$ history 1000 | awk '{a[$2]++}END{for(i in a){print a[i] " " i}}' | sort -rn | head
50 merb
48 git
46 sudo
45 rake
42 cd
36 ls
29 spec
24 mate
23 vi
12 cp
Surprised sudo is on there so much, but likely from installing a lot of gems lately, and constantly running “sudo rake install” on DataMapper and Merb. Need to move my gems to home directory sometime.
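For anyone curious what the one-liner is doing, here is a rough Ruby equivalent. It is only a sketch: the method name top_commands and the sample lines are made up for illustration, and unlike the awk version it skips lines that have no command word.

```ruby
# Tally the second whitespace-separated field of each `history` line,
# which is what awk's a[$2]++ does (field 1 is the history number).
def top_commands(history_lines, limit = 10)
  counts = Hash.new(0)
  history_lines.each do |line|
    command = line.split[1]
    counts[command] += 1 if command
  end
  # Highest count first, like `sort -rn | head`.
  counts.sort_by { |_command, count| -count }.first(limit)
end

# Made-up sample input in the shape `history` prints.
sample = [
  "  101  ls -la",
  "  102  git status",
  "  103  ls",
  "  104  sudo rake install",
  "  105  git commit -m 'wip'",
  "  106  ls /tmp"
]

top_commands(sample, 2).each { |command, count| puts "#{count} #{command}" }
```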
Not overthinking a simple problem
One little project I had taken up recently was to get Thoughtbot’s Paperclip plugin working under Merb and DataMapper. Initially, what I wanted to do was implement it completely as a DataMapper custom type. Basically, I wanted this as the class definition:
class User
  include DataMapper::Resource

  property :id, Fixnum, :serial => true
  property :username, String
  property :avatar, Paperclip::Attachment, :styles => { :medium => "300x300>", :thumb => "100x100>" }
end
With this, it would all be contained in a single column in the database, and it would serialize the attributes to JSON (though initially I used YAML).
The problem? Initially, a custom type didn’t know anything about how it was being used, such as the model’s name or the property’s name. The object model basically went Model <-> Property -> Type. You could get to the model from the property, but the type couldn’t get back to the property. I changed that with my recent contribution and was well on my way to finishing. I got saving, postponing thumbnails until saving, validations, and all of that working. But the one thing I couldn’t do was locate the file when the record was retrieved later. You see, the custom type doesn’t have access to the instance at all, so it had no way of knowing the ID of the record. When saving, it dynamically added an after-save handler which let it gain access to the ID, but when retrieving later, it had no such hook.
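To make the shape of the problem concrete, here is a minimal sketch of that object model. SketchProperty and SketchType are simplified stand-ins invented for illustration; they are not DataMapper's classes, and the real load signature may well differ.

```ruby
# Simplified stand-ins for illustration; these are NOT DataMapper's classes.
class SketchProperty
  attr_reader :model_name, :name, :type

  def initialize(model_name, name, type)
    @model_name = model_name
    @name = name
    @type = type
    type.property = self   # hypothetical back-link: type -> property
  end
end

class SketchType
  attr_accessor :property

  # load is handed only the raw column value. No record instance is passed
  # in, so there is nothing here to ask for an ID.
  def load(raw_value)
    {
      :model => property.model_name,
      :property => property.name,
      :file => raw_value,
      :record_id => nil   # unknowable from inside the type
    }
  end
end

avatar_type = SketchType.new
SketchProperty.new("User", :avatar, avatar_type)
loaded = avatar_type.load("me.png")
puts "#{loaded[:model]}.#{loaded[:property]} -> #{loaded[:file]} (id: #{loaded[:record_id].inspect})"
```

With the back-link in place the type can learn the model and property names, but the record's ID is still out of reach.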
A hack was to have the custom type’s load method redefine the getter on the model to pass in the ID, but that only worked on the second call to the object. So you couldn’t do this:
user = User[123]
user.avatar.url
The redefinition only took effect on the second call, so it was rather hacky. Any change that lets the custom type get to the instance really muddies up the whole object model.
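A simplified reconstruction of why only the second call works might look like the following. Every name here (SketchUser, SketchAvatarType, SketchAttachment, the /avatars/ URL layout) is hypothetical, and this is only a guess at the mechanism, not the actual code.

```ruby
# Hypothetical stand-ins; none of this is DataMapper or Paperclip code.
SketchAttachment = Struct.new(:file_name, :record_id) do
  def url
    record_id ? "/avatars/#{record_id}/#{file_name}" : nil
  end
end

module SketchAvatarType
  def self.load(model, raw_value)
    # The hack: swap in a getter that runs on the instance and knows its id...
    model.send(:define_method, :avatar) do
      SketchAttachment.new(@raw_avatar, id)
    end
    # ...but this call is already in flight, so it still returns without one.
    SketchAttachment.new(raw_value, nil)
  end
end

class SketchUser
  attr_reader :id

  def initialize(id, raw_avatar)
    @id = id
    @raw_avatar = raw_avatar
  end

  # Original getter: hands only the raw value to the type's load.
  def avatar
    SketchAvatarType.load(self.class, @raw_avatar)
  end
end

user = SketchUser.new(123, "me.png")
first_url = user.avatar.url    # goes through the original getter
second_url = user.avatar.url   # goes through the redefined getter
puts first_url.inspect
puts second_url
```

The first call comes back with no URL because the attachment was built before the new getter existed; only the second call sees the ID.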
Additionally, as I was contemplating ways around it, I realized another negative of my method: it couldn’t import a database from the ActiveRecord usage of Paperclip. Sure, you can use ActiveRecord within Merb, but if someone wanted to move from ActiveRecord to DataMapper with an existing database, they wouldn’t be able to.
So, in keeping with the Occam’s razor / KISS ideology, I went back to simply updating the original implementation, and within 15 minutes I had it working, complete with retrieval and all. Granted, not all the work was wasted. I still needed the rewritten validations, and my change was good to add to DataMapper (as I know I will use that later on). Additionally, I now know how to develop custom types inside and out.
So now, the class definition looks more like:
class User
  include DataMapper::Resource
  include Paperclip::Resource

  property :id, Fixnum, :serial => true
  property :username, String
  has_attached_file :avatar, :styles => { :medium => "300x300>", :thumb => "100x100>" }
end
Even better, it should be compatible with existing databases that might want to migrate from ActiveRecord to DataMapper.
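The reason the mixin approach sidesteps the earlier problem is that a method defined on the model runs on the instance, so the ID is right there. Here is a minimal sketch of that idea; SketchAttachable is a made-up module, not the real Paperclip::Resource, and the avatar_file_name accessor and URL layout are assumptions chosen for illustration.

```ruby
# A hypothetical mixin in the spirit of has_attached_file; this is NOT the
# real Paperclip::Resource.
module SketchAttachable
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    def has_attached_file(name, options = {})
      styles = options.fetch(:styles, {})

      # A plain per-attachment attribute rather than one serialized column.
      attr_accessor :"#{name}_file_name"

      # Defined on the model, so it executes with the instance as self and
      # can simply call id.
      define_method(:"#{name}_url") do |style = :original|
        unless style == :original || styles.key?(style)
          raise ArgumentError, "unknown style: #{style}"
        end
        file = send(:"#{name}_file_name")
        "/#{self.class.name.downcase}s/#{name}/#{id}/#{style}/#{file}"
      end
    end
  end
end

class SketchAccount
  include SketchAttachable

  attr_reader :id

  has_attached_file :avatar, :styles => { :medium => "300x300>", :thumb => "100x100>" }

  def initialize(id)
    @id = id
  end
end

account = SketchAccount.new(123)
account.avatar_file_name = "me.png"
puts account.avatar_url(:thumb)
```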
First open source contributions
Probably nothing to really go screaming about, but since I want to become more publicly active as a developer, part of that involves getting code out there and contributing to other projects.
At Telligent, we do yearly performance reviews and goal setting as a way to grow as a developer in experience/knowledge and bring it back into the company. One of my goals for this year was to contribute to open source as a way of expanding outside of our own internal stuff. Part of it is that while I enjoy Community Server, I’ve been looking at that code for about 3.5 years. Open source is a great way to get involved with other initiatives and make contributions where you can, or to write some small piece of code and put it out there for others to consume, contribute to, and grow. These can be small, short departures from the norm that give you experience you can roll into future work.
As a comparison, the difference between the Ruby world and the .NET world when it comes to open source is interesting. There are tons of components/extensions in both worlds, but in the .NET world, the majority are commercial components, from companies like DevExpress, Telerik, or Xceed. In Ruby, almost all are open source. .NET does have excellent open source projects, SubSonic being the first to come to mind, but not nearly as many.
Some others like Scott Hanselman, with his Weekly Source Code postings, will use/work with other existing code to better their own knowledge. I much prefer the approach of writing code to put out there and contributing to other projects. That way, I am more actively writing code instead of just working with code.
At any rate, I recently made my first open source contribution. I’ve been doing some work with Merb and DataMapper, and was working on a little project to port the Paperclip extension by Thoughtbot to DataMapper. While working on it over some PTO time I had, I needed to add a feature to DataMapper so I could make a custom type more contextually aware. So if your attachment is in a user model and named avatar, you want to know that when saving it onto the filesystem. And just yesterday, my fork got merged back into the core.
As for the Paperclip port (which I’ve called DM-Paperclip), I will hopefully get it polished off this weekend.
Moving into the cloud
Not too long ago, some others like James and Rob had mentioned using Google Apps as a way of moving service into the “cloud”, so to speak. Google Mail is one that has intrigued me for some time. I have a Gmail account, though rarely use it. I knew Google had Apps For Your Domain, but never took the plunge to try it, even though it is free. With some of the more recent chatter about it, I figured it was about time I took the plunge.
The main reason I hadn’t before was because I run my own server, and did some hosting for others, so naturally, I had a mail server installed on it and it worked quite well. I had spam filtering on it, webmail, and IMAP, so it filled all my needs, but one thing that I hate about most email is the build up in the inbox, and then the collection of folders you inevitably get trying to organize it.
One key advantage I’ve noticed so far with Google Mail is how it threads conversations. You no longer need to go hunting back for previous responses. Secondly, the simple “archive” button allows your emails to disappear from the inbox, reducing clutter, without having to necessarily file them in some arbitrary folder. Want to find it later? No problem, got a great search right there. For the things you do want to organize more, can use labels and filters to make things work flawlessly.
The big advantage to Google Mail for me? Finally I have an easily tamable inbox. I get less spam than I did before, a clean organized inbox, and avoid the folder nightmare. Ohh, and I can ditch archive PSTs which love to get corrupted (thanks Outlook for losing 2 1/2 years worth of mail not too long ago).
So I’ve been using Google Mail for my personal email for about a week now and it is a total success so far. Now, if only Telligent would move its email infrastructure onto Google, I could avoid Outlook altogether.
Now with a tumblelog
I now have a tumblelog as well: tumblr.invalidlogic.com
Or linked in the title as well.
I subscribe to a couple of people’s tumblelogs since they’re a great way to get quick little tidbits like links or the latest YouTube crazes. I figured I’d start one for the stuff that is kind of in between Twitter and a blog post. It will likely be a mixture of quick technical links, pictures of my son (got to post those!), or just short random thoughts. In my effort to keep my blog more technical, the general stuff would likely be on my tumblelog.
Also, Tumblr is a great platform for that kind of thing. Sure, Graffiti could do it, as it is basically a blog with a very minimal theme, but Tumblr has great quick post methods to easily add a picture, video, quote, etc.
Site has moved!
After my post yesterday, I think I came up with a domain name that I like: invalidlogic.com
You got to look at it as a bit of a play on words. It could be invalid logic, or the way I have the title, In Valid Logic.
I think I’ll stick with it. Better than some of the other ones I was considering. At any rate, hop on over to the new site, and don’t forget to update your subscription!
As of now, I will no longer be blogging on qgyen.net. It has been a good run.
All my old content will remain live though.
Subversion+Apache virtual host mods
On my servers, I recently added some support for a few others to centrally store some Subversion repositories. Instead of each of them having to install the Subversion server on their own, I figured I’d set up a basic shared one, hosted through Apache.
One thing I was after was virtual host/host header support in the way it served repositories. I wanted to be able to do krobertson.svn.domain.com, or jdoe.svn.domain.com, etc. By default, Subversion will only look for repositories in the specified directory with SVNParentPath, and the directive for specifying an authz access file worked similarly.
I decided to dust off some of my C skills and take a look at making the modification myself. Admittedly, it wasn’t some master work, but someone else might like to do the same thing.
Code is available here: http://github.com/krobertson/tidbits/tree/master/svn_mods
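The actual modification is the C code linked above. Purely to illustrate the idea of picking a repository location from the host header, here is a small Ruby sketch. The helper name repo_parent_for, the /srv/svn base directory, and the validation rule are all assumptions for illustration and are not taken from the real mod.

```ruby
# Hypothetical helper: map a request's Host header to a per-user repository
# parent directory. The suffix and base directory are assumptions.
def repo_parent_for(host, suffix: ".svn.domain.com", base_dir: "/srv/svn")
  name = host.to_s.downcase.sub(/:\d+\z/, "")   # drop any :port
  return nil unless name.end_with?(suffix)

  user = name.delete_suffix(suffix)
  # Accept only a single plain label, so a crafted Host value cannot
  # point outside base_dir.
  return nil unless user.match?(/\A[a-z0-9][a-z0-9-]*\z/)

  File.join(base_dir, user)
end

puts repo_parent_for("krobertson.svn.domain.com")
puts repo_parent_for("jdoe.svn.domain.com:80")
```

The same lookup could in principle drive the per-host authz file as well, though how the real mod handles that is not shown here.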
A fresh start
For a while now, I’ve been wanting to find a new domain name for my personal use. Previously, I was at qgyen.net, and have had that domain for about 8 years. It has all of my Google juice. If you Google for "Ken Robertson", it is #1 (woot!).
The problem? Just try to pronounce it. It is just a word I made up, and made up on the computer. For a long time, I wasn’t even sure how one would pronounce it… finally figured it was basically q-gen. But seriously, imagine you are out and want to give someone your email address or your blog or something and you tell them "go to q-gen.net"…. the response is always "q-what?"
Sometimes I think I should just use my Telligent email address, since it is simpler, but I try to separate personal and work email as much as possible. Work stuff is usually more urgent stuff, and I don’t want it to get dulled down by mixing it with personal email. Additionally, it doesn’t solve the issue if I want to give out my blog address. I still get the "q-what".
So I am starting fresh, and I have a goal: more technical content. I don’t think I post enough technical content. Often, I think it needs to be very in-depth and insightful, but really, simple stuff and observations can be just as useful.
I also want to make a dedicated effort to actually release some public code. All of my older Community Server modules have kind of fallen behind, either from lack of necessity as stuff got added to the core or whatever else, but I’d like to do more general purpose stuff rather than product-specific stuff. Plus putting code out there to the public is a good exercise. It helps others who might need something similar, it toughens your skin if it is critiqued, and it builds confidence and comfort, as developers can often feel embarrassed of their code. For someone who works mostly closed-source, showing your code is kind of like flashing someone.
Finding a new domain is so hard
One thing I’ve been trying to do for a long time is to find a new domain name to move my blog to. One that is catchy, easy to remember, short, .com, et al. It is way harder than it sounds, really.
Qgyen is kind of old and outdated. It has all my Google juice, but that is mainly because I’ve had it for ~8-9 years. But it is just a word I made up. No one knows how to pronounce it, I don’t give it out to people via word-of-mouth since it sounds dumb when I say it, and no one can spell it. Looking at my Google Analytics, people often get to my blog by going to Google and searching for “ken robertson”. Even Rich Mercer has told me he gets to my site by Googling for my name.
If people you know (friends/coworkers) get to your site by Googling your name, you need an easier domain!
But finding a good domain these days isn’t easy. First, my name is too long, and even then, the .com is taken (I have the .name though, but who uses .name?). I had thought of a few others, but when thinking about the names later, they sounded too corny, or just didn’t seem fitting.
The two main contenders I came up with were linkedlabs.com and explosivethoughts.com. Linked Labs just sounded kinda catchy, but I am not a lab. I’m not linked to any labs. But they sound good together. Then I got explosivethoughts.com… it sounded catchy at the time. Kind of like “this idea is so hot, it’ll explode”, or the tagline I came up with and put on the stub site, “lighting the fuse on bright ideas”, or if I posted some program I wrote, I could make the credits say “an explosive thought”… this domain was basically the result of staying up too late one weekend watching the movie Accepted, which had the guy who wanted to learn how to blow stuff up with his mind (hence, explosive thoughts). I looked up the domain and got it, but then a day or so later, was thinking that maybe I wouldn’t want my blog/site correlated with the word explosive.
Still trying to come up with something.
Google App Engine and its impact
I’ve been doing some reading about Google’s announcement of their “App Engine” platform for hosting scalable applications. Some of the reactions so far are pretty interesting. Particularly of note, I see a number of places saying it will be the “Amazon killer”, or “hosting killer”, and even that it is “Geocities 2.0”. In my opinion, all of them are wrong.
Google isn’t going to kill anything. The hosting market is very vast, with diverse offerings, new consumers every day, and players always coming and going. They’ll add a new dynamic for sure, but they’ll be perfect for some, and won’t be a good fit for others.
Easy scalability is now an issue. Up until recently, sites that needed to scale usually followed this route (or a similar one): come up with an idea, start a site, begin to catch on, get VC (or money from somewhere), build (or pay someone for) a scalable infrastructure. Now, a super small team (even a single developer) can build an application that goes from zero to hero in mere weeks, especially with the advent of sites like Facebook.
Google vs. Hosting Companies
Google’s target is different than your average hosting company’s. Google has specifically called out that they’re targeting developers and scalable web applications. At your regular $3.95/mo hosting company like GoDaddy, Dreamhost, or other small companies, this isn’t something the customers are after. Sure, Google has some nice offerings for free, but many of those customers aren’t developers, or have more generic needs like wanting a small site, blog, forum, etc. Google could make it easier for them to deploy, but I don’t foresee Google targeting this. Common hosting is often high support and low profit. These customers don’t really need BigTable, cloud computing, or anything like that.
Then there are your larger scale hosts who do big deployments, managed support, etc. Places like Rackspace, OrcsWeb, Engine Yard, or BitPusher (the main ones I know of). Often the people who go to them are larger scale sites who might benefit from what Google offers, but they could also be turned off by the restrictions or need functionality outside of them. Some of the constraints like no direct disk access, no background processing, being locked into BigTable/GQL, etc., could be too much. Google understandably places restrictions, since that is how they can offer the kind of cloud they do, but any time you place a restriction, you eliminate some people who can use it. It is a game of give and take. High end managed hosting will still thrive. Often, their biggest asset is the level of personal support they can offer, especially when they do targeted expertise (OrcsWeb does Windows, Engine Yard primarily does Ruby). I don’t see Google being able to match that.
Google vs. Amazon
Some people have said Google App Engine could kill Amazon EC2/SimpleDB/etc, but the reality is they focus on different markets. There is some crossover, both have their own restrictions, but one fills needs the other can’t. Google is only web apps and specifically says they aren’t offering virtual machines or grid computing. Amazon offers virtual machines and can do web app hosting, but does have restrictions on guaranteed availability and storage. Google promises simple web apps, Amazon doesn’t, but you get a full system with Amazon. Companies that need background processing will definitely still use Amazon, and that likely represents a good portion of their usage. Companies like SmugMug offload a lot of their image processing to EC2, and something like that could not be solved by GAE. With Amazon’s recent availability zones and elastic IPs, it is certainly possible to have high availability web apps. Google is more technically restrictive, while Amazon has a higher technical barrier. Additionally, Amazon is more diverse, with EC2, S3, SimpleDB, and SQS. They can be used together, or separately.
Someone in a blog comment claimed uptime with Google would be better, but that is pure speculation. All systems fail, in one way or another. Making uptime claims for GAE when it has been out less than 24 hours is very optimistic. EC2 has its issues, but Amazon makes that clear, and well designed systems on EC2 should adapt just fine. GAE will almost certainly, at one point or another, have an issue, degraded performance, momentary outage, etc.
Ongoing impact
Probably the most powerful offering with GAE that hopefully develops further is their database. Scaling the database is often the biggest issue for large sites. The web tier is almost never the bottleneck; the database is. Scaling a database can be very costly, especially with traditional RDBMSs. In the future, we’ll likely see much more with BigTable and products targeting distributed scalable databases made easy. Microsoft recently announced SQL Server Data Services, but I expect much more to be coming.
There will likely be other “cloud computing” offerings coming from others. Some speculate Microsoft will offer one; others have suggested HP. I think there will still be more to come.