<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
 
 <title>In Valid Logic</title>
 <link href="http://invalidlogic.com/atom.xml" rel="self"/>
 <link href="http://invalidlogic.com"/>
 <link rel="license" type="application/rdf+xml" href="http://creativecommons.org/licenses/by-nc-sa/3.0/rdf" />
 <updated>2012-01-22T11:25:37-08:00</updated>
 <id>http://invalidlogic.com</id>
 <author>
     <name>Ken Robertson</name>
     <email>ken@invalidlogic.com</email>
 </author>

 
 <entry>
   <title>Jekyll on PaaS.io with Cloud Foundry</title>
   <link href="http://invalidlogic.com//2012/01/06/jekyll-on-paasio-with-cloud-foundry/"/>
   <link rel="license" type="text/html" href="http://creativecommons.org/licenses/by-nc-sa/3.0/" />
   <updated>2012-01-06T00:00:00-08:00</updated>
   <id>http://invalidlogic.com//2012/01/06/jekyll-on-paasio-with-cloud-foundry</id>
   <content type="html">&lt;p&gt;Recently I moved my blog over to the service I am currently working on building, &lt;a href=&quot;http://paas.io&quot;&gt;PaaS.io&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Running Jekyll on PaaS.io isn't all that different from running it on other services, though I had a few other goals in mind.  The way I wanted it set up was:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Stop using &lt;code&gt;rack-jekyll&lt;/code&gt;.  It's a nice gem; however, it is locked to an older version of Jekyll, and the currently released gem is on an even more outdated one.  On top of that, Cloud Foundry doesn't currently support pulling Bundler sources from git.&lt;/li&gt;
&lt;li&gt;Have a &lt;code&gt;public&lt;/code&gt; folder for static content like CSS and images.&lt;/li&gt;
&lt;li&gt;Have the &lt;code&gt;_site&lt;/code&gt; folder for generated content&lt;/li&gt;
&lt;li&gt;Don't have it copy the &lt;code&gt;Gemfile&lt;/code&gt; and the &lt;code&gt;config.ru&lt;/code&gt; into the &lt;code&gt;_site&lt;/code&gt; folder (annoys me)&lt;/li&gt;
&lt;li&gt;Redirect &lt;code&gt;www.invalidlogic.com&lt;/code&gt; to &lt;code&gt;invalidlogic.com&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Low footprint&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;First, the &lt;code&gt;Gemfile&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;ruby&quot;&gt;&lt;span class=&quot;n&quot;&gt;source&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:rubygems&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;gem&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;rack-contrib&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:require&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;rack/contrib/try_static&amp;#39;&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;gem&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;rack-redirect&amp;#39;&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;gem&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;thin&amp;#39;&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;group&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:development&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;gem&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;jekyll&amp;#39;&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;gem&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;RedCloth&amp;#39;&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;gem&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;rdiscount&amp;#39;&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;The main gems being used are &lt;code&gt;thin&lt;/code&gt;, &lt;code&gt;rack-contrib&lt;/code&gt; (for TryStatic, note on that later), and &lt;code&gt;rack-redirect&lt;/code&gt; (for www redirection).  I also include some of the gems I use for Jekyll in the development group.  That way they are available locally but not loaded when I deploy (lower footprint... and yes, it is minor).&lt;/p&gt;

&lt;p&gt;Now, the &lt;code&gt;config.ru&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;ruby&quot;&gt;&lt;span class=&quot;nb&quot;&gt;require&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;rubygems&amp;#39;&lt;/span&gt;
&lt;span class=&quot;nb&quot;&gt;require&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;bundler&amp;#39;&lt;/span&gt;
&lt;span class=&quot;no&quot;&gt;Bundler&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;require&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;use&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;Rack&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;EY&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Solo&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;DomainRedirect&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;use&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;Rack&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;TryStatic&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;ss&quot;&gt;:root&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&amp;quot;_site&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;ss&quot;&gt;:urls&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;sx&quot;&gt;%w[/]&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;ss&quot;&gt;:try&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&amp;#39;.html&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;index.html&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;/index.html&amp;#39;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;]&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;use&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;Rack&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Static&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;ss&quot;&gt;:root&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&amp;quot;public&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;ss&quot;&gt;:urls&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;sx&quot;&gt;%w[/]&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;run&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;lambda&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;env&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;404&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&amp;#39;Content-Type&amp;#39;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;text/html&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&amp;#39;Not Found&amp;#39;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;]]&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;There are four Rack components here.  First, &lt;code&gt;Rack::EY::Solo::DomainRedirect&lt;/code&gt; comes from the &lt;code&gt;rack-redirect&lt;/code&gt; gem and handles the www redirection.  Next is &lt;code&gt;Rack::TryStatic&lt;/code&gt;, which serves files from the &lt;code&gt;_site&lt;/code&gt; generated content directory.  It is given a few different &lt;code&gt;:try&lt;/code&gt; values, each a different way to find the intended file.  Then comes &lt;code&gt;Rack::Static&lt;/code&gt;, which serves static content from the &lt;code&gt;public&lt;/code&gt; directory; there is no need to try different combinations there.  Last is a generic lambda that returns a 404.&lt;/p&gt;
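&lt;p&gt;To make the &lt;code&gt;:try&lt;/code&gt; lookup more concrete, here is a rough sketch in plain Ruby of the order in which paths would be checked.  It is not the actual &lt;code&gt;Rack::TryStatic&lt;/code&gt; code: the &lt;code&gt;candidate_paths&lt;/code&gt; and &lt;code&gt;resolve&lt;/code&gt; helpers and the &lt;code&gt;SITE_FILES&lt;/code&gt; list are made up for illustration, and it assumes the path is tried exactly as requested before any suffix is appended.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;ruby&quot;&gt;
```ruby
# Suffixes taken from the :try option in the config.ru above.
TRY_SUFFIXES = ['.html', 'index.html', '/index.html']

# Build the list of paths that would be checked, in order, for one request.
# Assumption: the leading '' means the path is first tried as requested.
def candidate_paths(request_path, suffixes = TRY_SUFFIXES)
  ([''] + suffixes).map { |suffix| request_path + suffix }
end

# Return the first candidate found in the (hypothetical) list of generated
# files, or nil when the request should fall through to the next component.
def resolve(request_path, generated_files)
  candidate_paths(request_path).find { |path| generated_files.include?(path) }
end

# Hypothetical contents of the _site folder, written as URL paths.
SITE_FILES = [
  '/index.html',
  '/about.html',
  '/2012/01/06/jekyll-on-paasio-with-cloud-foundry/index.html'
]
```
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;With that list, &lt;code&gt;resolve('/about', SITE_FILES)&lt;/code&gt; finds &lt;code&gt;/about.html&lt;/code&gt;, while a path with no match gives &lt;code&gt;nil&lt;/code&gt;, which is the case where the request would move on to the next component.&lt;/p&gt;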

&lt;p&gt;Next, I want to avoid duplication.  With things as they are, when I run &lt;code&gt;jekyll&lt;/code&gt; it will copy &lt;code&gt;public&lt;/code&gt; and other items into the &lt;code&gt;_site&lt;/code&gt; folder, duplicating them.  To resolve that, we can add an exclude line to our &lt;code&gt;_config.yml&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;yaml&quot;&gt;&lt;span class=&quot;l-Scalar-Plain&quot;&gt;exclude&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;p-Indicator&quot;&gt;[&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;#39;public&amp;#39;&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;#39;Gemfile&amp;#39;&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;#39;Gemfile.lock&amp;#39;&lt;/span&gt;&lt;span class=&quot;p-Indicator&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;#39;config.ru&amp;#39;&lt;/span&gt; &lt;span class=&quot;p-Indicator&quot;&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;And with that, we are set!  All of our goals are met.  Commit and push to deploy!  Currently I have &lt;a href=&quot;http://cloudfoundry.org&quot;&gt;Cloud Foundry&lt;/a&gt; set up with a Rack framework defined (I will be sending a pull request with it soon) and have my blog set to use Ruby 1.9.3.&lt;/p&gt;

&lt;p&gt;Soon I'll be providing some more details on &lt;a href=&quot;http://paas.io&quot;&gt;PaaS.io&lt;/a&gt;, so stay tuned and click over to it and sign up to get access to the beta.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>iPhone vs Android is the new Mac vs PC</title>
   <link href="http://invalidlogic.com//2011/10/18/iphone-vs-android-is-the-new-mac-vs-pc/"/>
   <link rel="license" type="text/html" href="http://creativecommons.org/licenses/by-nc-sa/3.0/" />
   <updated>2011-10-18T00:00:00-07:00</updated>
   <id>http://invalidlogic.com//2011/10/18/iphone-vs-android-is-the-new-mac-vs-pc</id>
   <content type="html">&lt;p&gt;After spending &lt;a href=&quot;/2010/04/06/one-week-with-a-droid-and-a-few-hours-with-an-ipad/&quot;&gt;18 months with Android&lt;/a&gt;, I am now back to an iPhone and
likely here to stay.  While Android isn't necessarily bad, it mostly boils
down to the iPhone simply being better.  After a while, I started looking at
iPhone vs Android as largely a repeat of the Mac vs PC comparisons.&lt;/p&gt;

&lt;h3&gt;Mac vs PC&lt;/h3&gt;

&lt;p&gt;Just look at it.  Apple is doing what they've always done.  They control
the hardware and the software.  They have a solid, unified experience.
You can pick up any iPhone, old or new, and still be at home.&lt;/p&gt;

&lt;p&gt;Android is repeating the history of PCs.  Google makes the OS, as
Microsoft did with Windows.  Then OEMs offer a wide variety of different
hardware, with their own shitty customizations layered on top of the OS.
Even worse, you have the carriers layering on their customizations and
restrictions.&lt;/p&gt;

&lt;p&gt;You can go from one Android phone to another and have it
be completely foreign.  Even baseline apps can have different names and
a completely different look and feel, such as the &quot;Email&quot; vs &quot;Mail&quot; apps of vanilla
Android and HTC Android.&lt;/p&gt;

&lt;h3&gt;Shelf Life&lt;/h3&gt;

&lt;p&gt;With Android, the phones have a very short shelf life.  I bought a
Thunderbolt from Verizon back in April, just 6 months ago.  About a
month after I bought it, it was no longer the hot model they were
pimping.  In fact, there have been 2-3 phones to come out since then
that became &quot;king of the mountain.&quot;&lt;/p&gt;

&lt;p&gt;With iPhone, they release a new phone about once a year, and that one
stays the current phone.  Older phones are still pretty well supported.
The iPhone 3GS is over 2 years old, still got updated to iOS 5, and
will likely stay supported until iOS 6.  My original Motorola Droid that is
nearing 2 years old is pretty much forgotten already.&lt;/p&gt;

&lt;p&gt;Updates with iPhone?  Available right away to everyone.  AT&amp;amp;T or
Verizon, you get the update when Apple makes it generally available.&lt;/p&gt;

&lt;p&gt;Updates with Android?  Have fun.  Google has to release it, your OEM has
to customize it, then the carrier gets to tweak it, and decide when to
finally roll it out.  Google released Gingerbread back in December 2010,
nearly 10 months ago.  Verizon just started rolling out the update last
month, &lt;a href=&quot;http://www.gottabemobile.com/2011/10/17/htc-thunderbolt-gingerbread-update-said-to-be-returning-soon/&quot;&gt;then pulled it&lt;/a&gt;.  Even then, they decide when you can upgrade with a phased rollout.  And since it was pulled, it looks like they skipped adequate testing.&lt;/p&gt;

&lt;p&gt;Most Android users who want recent releases end up rooting their phones
and using unofficial ROMs put together by an informal group of people.
Have an issue with your phone?  Limited options.&lt;/p&gt;

&lt;h3&gt;Marketing&lt;/h3&gt;

&lt;p&gt;I definitely agreed with &lt;a href=&quot;http://www.slowcookedbacon.com/&quot;&gt;my friend Joe&lt;/a&gt; in his &lt;a href=&quot;http://www.slowcookedbacon.com/the-droid-bionic-shows-why-android-isnt-domin&quot;&gt;post about the Droid Bionic&lt;/a&gt;.  Android phones are being pitched like PCs.  They give a bunch of technical specs that are meaningless to consumers.  It echoes the &lt;a href=&quot;http://www.ted.com/talks/simon_sinek_how_great_leaders_inspire_action.html&quot;&gt;&quot;Golden Circle&quot;&lt;/a&gt; idea by Simon Sinek.  Apple is still doing what they do best.  Verizon sells like PC manufacturers.  Lacking inspiration.&lt;/p&gt;

&lt;h3&gt;False Sense of Market Share&lt;/h3&gt;

&lt;p&gt;You've probably seen the headlines of &lt;a href=&quot;http://www.dailytech.com/Android+Outsells+iPhone+5to2+Has+Nearly+50+Percent+of+the+Market/article22326.htm&quot;&gt;&quot;Android sales outpacing
iPhone&quot;&lt;/a&gt;
or &lt;a href=&quot;http://www.dailytech.com/Android+Market+Share+Reaches+56+Percent+RIMs+Microsofts+Cut+in+Half/article22852.htm&quot;&gt;&quot;Android market share to surpass iPhone&quot;&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Overall the numbers are comparing apples to oranges.  They are comparing
the sales figure of a &lt;em&gt;single phone&lt;/em&gt; to a whole &lt;em&gt;group of phones&lt;/em&gt;.
There are many different Android phones, and individually, iPhone is
spanking them in sales figures.  No single Android phone has a chance of
eclipsing the iPhone or gaining any meaningful market share, especially
with their overall short shelf life.  I'd be very interested to see
sales figures of individual phones and see how long they really last.
All the manufacturers are still struggling to get a sliver of what the iPhone is capable of.&lt;/p&gt;

&lt;p&gt;The numbers may hold some weight for developers because it represents
the size of your audience.  But at the same time, Android is many phones,
while the iPhone's entire existence is now just 5 phones.  That means less &quot;your app
doesn't work on my Verizon Droid Mumbojumbo&quot; when you don't have a Droid
Mumbojumbo to test on.&lt;/p&gt;

&lt;h3&gt;Falseness of Open&lt;/h3&gt;

&lt;p&gt;Many people tout Android as being open, however the actions of Google
with &lt;a href=&quot;http://www.businessweek.com/technology/content/mar2011/tc20110324_269784_page_2.htm&quot;&gt;Honeycomb's source code&lt;/a&gt; seem to be heading in the direction of more closed.&lt;/p&gt;

&lt;p&gt;While Google said part of their goal was to try and unify the platform
more, as Honeycomb was intended for tablets and not handsets, the
ability to control a platform is difficult while also keeping it &quot;open.&quot;
I think Google is right in closing it, since in order to further the
platform, they will need to have some control in order to maintain a
consistent direction.&lt;/p&gt;

&lt;p&gt;However, the talk of &quot;openness&quot; usually just comes from developers.  What do most of them
define the &quot;openness&quot; as?  Being able to write apps and put them on
their phone without paying $99.  They tout the source being open, but
the truth is that isn't what they really care about.  Very few Android
developers likely dive into the OS code; they just want to install their
own apps for free.&lt;/p&gt;

&lt;p&gt;Personally, if I prefer one platform over the other, a $99 fee to build
apps isn't going to be a stopper for me.  But I also
don't mind paying for the tools I use in my craft if they are worth it.&lt;/p&gt;

&lt;h3&gt;Working&lt;/h3&gt;

&lt;p&gt;Most average people care more about how well the phone works.  iPhone
simply works better.  Since software and hardware are more married, the
experience is more consistent.  In my own experience, apps crash less on
iPhone.  The phone lags less.  Scrolling and browsing are more graceful.
My wife is still on her original Droid for another month, and every day
I see her struggle with the phone.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;http://www.changewaveresearch.com/articles/2011/smart_phones_20110718.html&quot;&gt;iPhone has better customer satisfaction ratings than Android.&lt;/a&gt;  While the reasons aren't mentioned, I wouldn't be surprised if part of it stems from &quot;it just works&quot;.&lt;/p&gt;

&lt;p&gt;Apple has a strong emphasis on usability.  Google and the OEMs, not so
much.  This particularly struck me with a &lt;a href=&quot;http://dinnerwithandroid.tumblr.com/post/8838035574/dual-wielding&quot;&gt;post I read&lt;/a&gt; about a guy talking to a girl who had an Android phone and an iPod Touch.  In particular, &quot;nothing happened when I plugged it into my computer.&quot;  To the average user, the simple integration is important.  They don't know why the phone shows up as a &quot;Mass Storage Device&quot; when they plug it in.&lt;/p&gt;

&lt;h3&gt;Future&lt;/h3&gt;

&lt;p&gt;The future for Android will probably improve.  The OS is maturing and
hardware is getting better; however, the ecosystem of manufacturers and
carriers will likely stay the same.  The reality is that Apple is doing
what Apple has always done and they've gotten really good at it.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Cooking with Chef slides</title>
   <link href="http://invalidlogic.com//2011/07/25/cooking-with-chef-slides/"/>
   <link rel="license" type="text/html" href="http://creativecommons.org/licenses/by-nc-sa/3.0/" />
   <updated>2011-07-25T00:00:00-07:00</updated>
   <id>http://invalidlogic.com//2011/07/25/cooking-with-chef-slides</id>
   <content type="html">&lt;p&gt;A little late, but I have posted my slides from my talk at the &lt;a href=&quot;http://www.meetup.com/EBRuby/&quot;&gt;East Bay
Ruby Meetup&lt;/a&gt; in &lt;a href=&quot;http://www.meetup.com/EBRuby/events/16505489/&quot;&gt;June&lt;/a&gt;.&lt;/p&gt;

&lt;center&gt;&lt;div style=&quot;width:425px;margin: 10px&quot; id=&quot;__ss_8668127&quot;&gt;&lt;strong style=&quot;display:block;margin:12px 0 4px&quot;&gt;&lt;a href=&quot;http://www.slideshare.net/krobertson2/cooking-with-chef-8668127&quot; title=&quot;Cooking with Chef&quot;&gt;Cooking with Chef&lt;/a&gt;&lt;/strong&gt;&lt;object id=&quot;__sse8668127&quot; width=&quot;425&quot; height=&quot;355&quot;&gt;&lt;param name=&quot;movie&quot; value=&quot;http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=cheftalk-110722221912-phpapp02&amp;amp;stripped_title=cooking-with-chef-8668127&amp;amp;userName=krobertson2&quot; /&gt;&lt;param name=&quot;allowFullScreen&quot; value=&quot;true&quot;/&gt;&lt;param name=&quot;allowScriptAccess&quot; value=&quot;always&quot;/&gt;&lt;embed name=&quot;__sse8668127&quot; src=&quot;http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=cheftalk-110722221912-phpapp02&amp;amp;stripped_title=cooking-with-chef-8668127&amp;amp;userName=krobertson2&quot; type=&quot;application/x-shockwave-flash&quot; allowscriptaccess=&quot;always&quot; allowfullscreen=&quot;true&quot; width=&quot;425&quot; height=&quot;355&quot;&gt;&lt;/embed&gt;&lt;/object&gt;&lt;/div&gt;&lt;/center&gt;


&lt;p&gt;Check them out and feel free to ping me if there are any questions.  At
the meetup, I was also talking about doing a bit of a blog series on
getting started with Chef and posting some of the scripts and baseline
setup I have used before.  I hope to start putting together some simple getting-started resources.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>New Opportunities</title>
   <link href="http://invalidlogic.com//2011/07/19/new-opportunities/"/>
   <link rel="license" type="text/html" href="http://creativecommons.org/licenses/by-nc-sa/3.0/" />
   <updated>2011-07-19T00:00:00-07:00</updated>
   <id>http://invalidlogic.com//2011/07/19/new-opportunities</id>
   <content type="html">&lt;div style=&quot;float:right; margin-left: 10px; margin-bottom: 5px&quot;&gt;&lt;img src=&quot;http://invalidlogic-blog.s3.amazonaws.com/demandbase_logo.gif&quot; width=&quot;200&quot; height=&quot;32&quot;/&gt;&lt;/div&gt;


&lt;p&gt;Finally posting about it a little bit late, but last week was my first
week in a new position with a new company.  I truly enjoyed my stay at
&lt;a href=&quot;http://www.involver.com/&quot;&gt;Involver&lt;/a&gt;, but saw an opportunity pop up and decided to take it.&lt;/p&gt;

&lt;p&gt;This new opportunity is to be one of the first in-house Ruby developers
at &lt;a href=&quot;http://www.demandbase.com&quot;&gt;Demandbase&lt;/a&gt;.  Demandbase specializes in a
B2B analytics API for reporting and logging information about visitors
hitting your business's website.&lt;/p&gt;

&lt;p&gt;That all sounds fancy, but the gist is I'll be helping to grow and form
the team, scale up and expand the platform, and leverage my skills
both as a developer and in operations.  They've been working with
the awesome guys from &lt;a href=&quot;http://highgroove.com/&quot;&gt;Highgroove&lt;/a&gt;, utilizing
Chef in a continuous deployment setup, doing geographic load balancing, and have a pretty agile and
responsive process set up.&lt;/p&gt;

&lt;p&gt;I had an amazing time at Involver and grew so much personally and
professionally.  I was a part of some major projects and was instrumental in a
lot of change.  Our Operations team was top notch and put out a lot
of stuff considering it was only 3 people.&lt;/p&gt;

&lt;p&gt;Going forward, I have some goodies I'll hopefully be working on and open
sourcing.  There are a lot of interesting things in the works and I hope
to post about them over time.&lt;/p&gt;

&lt;p&gt;And by the way, Demandbase is hiring.  Hit me up
&lt;a href=&quot;http://twitter.com/krobertson&quot;&gt;@krobertson&lt;/a&gt; or shoot me &lt;a href=&quot;http://scr.im/2e1e&quot;&gt;an email&lt;/a&gt;.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Announcing Gemstache</title>
   <link href="http://invalidlogic.com//2011/06/30/announcing-gemstache/"/>
   <link rel="license" type="text/html" href="http://creativecommons.org/licenses/by-nc-sa/3.0/" />
   <updated>2011-06-30T00:00:00-07:00</updated>
   <id>http://invalidlogic.com//2011/06/30/announcing-gemstache</id>
   <content type="html">&lt;p&gt;Pleased to announce &lt;a href=&quot;http://gemstache.com&quot;&gt;Gemstache&lt;/a&gt;, a new service to enable teams and companies to build and distribute their own private gems with security and ease of use in mind.&lt;/p&gt;

&lt;p&gt;Gems are an incredibly simple way to package up code and distribute it, making it available to multiple code bases or just a great way to encapsulate functionality.  However, the way gems are traditionally distributed doesn't lend itself well to securing the gems and controlling access.  The default &quot;gem install&quot; command you run doesn't provide any support for authentication against the gem source, so most sources are open to the public.&lt;/p&gt;

&lt;div style=&quot;text-align: center&quot;&gt;&lt;img src=&quot;http://invalidlogic-blog.s3.amazonaws.com/2011-06-30-gemstache.png&quot; width=&quot;399&quot; height=&quot;323&quot; /&gt;&lt;/div&gt;


&lt;p&gt;Gemstache works by giving you your own private gem source you can use.  In order to download any gems from it, you must have the accompanying &quot;gemstache&quot; gem installed locally or on your servers.  When talking to your private gem source, it steps in to add authentication to the request and to ensure it is happening over HTTPS.  With this, your gems are only downloaded over SSL and any interaction requires authentication.&lt;/p&gt;

&lt;p&gt;To make managing your gems easier, the gem adds the ability to easily upload a gem to your private gem source and make it available.  From your command line, just run &quot;gem stache [your gem file]&quot; and it is uploaded and ready for use.&lt;/p&gt;

&lt;p&gt;Within the service, you can give users varying roles to define whether they can just download gems, publish new gems, or yank/delete gems.&lt;/p&gt;

&lt;p&gt;Gemstache was born out of a need we had been facing at &lt;a href=&quot;http://www.involver.com&quot;&gt;Involver&lt;/a&gt;, where some of our own internal code is packaged in gems and shared between a few of our codebases, and where we also have some gems we have forked to add either customizations or our own fixes.  Traditional methods left them exposed and made them more painstaking to release.&lt;/p&gt;

&lt;p&gt;We're just about ready to launch a public beta.  Visit &lt;a href=&quot;http://gemstache.com&quot;&gt;gemstache.com&lt;/a&gt; and sign up to be notified.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Beware of error handling with fibers in an evented context</title>
   <link href="http://invalidlogic.com//2011/05/26/beware-of-error-handling-with-fibers-in-an-evented-context/"/>
   <link rel="license" type="text/html" href="http://creativecommons.org/licenses/by-nc-sa/3.0/" />
   <updated>2011-05-26T00:00:00-07:00</updated>
   <id>http://invalidlogic.com//2011/05/26/beware-of-error-handling-with-fibers-in-an-evented-context</id>
   <content type="html">&lt;p&gt;One of the more interesting developments in Ruby lately has been the interesting things being done with Ruby 1.9.2, Fibers, and EventMachine.  Whether you look at it being applied to existing libraries with &lt;a href=&quot;https://github.com/mperham/rack-fiber_pool&quot;&gt;rack/fiber_pool&lt;/a&gt;, or &lt;a href=&quot;https://github.com/postrank-labs/goliath&quot;&gt;Goliath&lt;/a&gt;, it packs a punch and proving interesting for high performance evented applications.&lt;/p&gt;

&lt;p&gt;Recently when reworking one of our backend applications to use rack/fiber_pool with EventMachine, I ran into a gotcha when it came to exception handling.  Asynchronous applications make heavy use of callbacks, where the context in which an operation completes is not in line with where it began.  Fibers make it easier to write asynchronous code that looks synchronous, where you make a call and it returns a result.&lt;/p&gt;

&lt;p&gt;The problem is that this can play a trick on your eye.  You think you are executing something in line, but the context it is operating within jumps without you being necessarily aware of it.  Take the following example:&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;ruby&quot;&gt;&lt;span class=&quot;nb&quot;&gt;require&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;eventmachine&amp;#39;&lt;/span&gt;
&lt;span class=&quot;nb&quot;&gt;require&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;em-synchrony&amp;#39;&lt;/span&gt;
&lt;span class=&quot;nb&quot;&gt;require&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;fiber&amp;#39;&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;doit&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;f&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;Fiber&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;current&lt;/span&gt;
  &lt;span class=&quot;no&quot;&gt;EventMachine&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;add_timer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;raise&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;uhh ohh&amp;#39;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  &lt;span class=&quot;no&quot;&gt;Fiber&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;yield&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;

&lt;span class=&quot;no&quot;&gt;EM&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;synchrony&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;begin&lt;/span&gt;
    &lt;span class=&quot;nb&quot;&gt;puts&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&amp;quot;beginning&amp;quot;&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;doit&lt;/span&gt;
    &lt;span class=&quot;nb&quot;&gt;puts&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&amp;quot;end&amp;quot;&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;rescue&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;Exception&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;e&lt;/span&gt;
    &lt;span class=&quot;nb&quot;&gt;puts&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&amp;quot;ERROR: &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;#{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;e&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;message&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;ensure&lt;/span&gt;
    &lt;span class=&quot;nb&quot;&gt;puts&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&amp;quot;Have a nice day&amp;quot;&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
  &lt;span class=&quot;no&quot;&gt;EventMachine&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stop&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;From looking at the main block, you have the call to doit wrapped in a begin/rescue, so it seems logical that if an error was raised within it, the rescue block would capture it and handle the error.  Unfortunately, the error is being raised outside the context of the fiber, so it isn't caught.  You are outside the context of the fiber in the time between when &lt;code&gt;Fiber.yield&lt;/code&gt; is called and &lt;code&gt;f.resume&lt;/code&gt; is called.&lt;/p&gt;

&lt;p&gt;Why is this context important?  Because when we build applications, we want them to be resilient.  For instance, if a web request encounters an error, we want it caught, reported either to a log or a service, a response returned to the user, and the application ready to handle the next request.  However, outside the context of the fiber, the error handling we've built in won't get called and the application can exit.  Even if the application is monitored with god, existing connections will get closed out, new connections won't be served while it is restarting, and the valuable information within the error is lost.&lt;/p&gt;
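&lt;p&gt;One way to keep such an error inside the fiber's context is to rescue it in the callback and pass it back through &lt;code&gt;resume&lt;/code&gt;, so that it gets re-raised where the begin/rescue can see it.  The sketch below only illustrates that idea under some assumptions: it swaps EventMachine for a hypothetical &lt;code&gt;CALLBACKS&lt;/code&gt; queue standing in for the event loop, and the &lt;code&gt;risky_operation&lt;/code&gt; and &lt;code&gt;OUTPUT&lt;/code&gt; names are invented for the example.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;ruby&quot;&gt;
```ruby
require 'fiber'

# Hypothetical stand-in for an event loop: callbacks queued here run later,
# outside of any fiber, much like a timer callback would.
CALLBACKS = []

# Collects messages so the order of events is easy to inspect afterwards.
OUTPUT = []

def risky_operation
  fiber = Fiber.current
  callback = lambda do
    begin
      raise 'uhh ohh'
    rescue StandardError => e
      # Hand the error back to the suspended fiber instead of letting it
      # escape into the event loop.
      fiber.resume(e)
    end
  end
  CALLBACKS.push(callback)

  # Suspend until the callback resumes us, then re-raise anything it caught.
  result = Fiber.yield
  raise result if result.is_a?(Exception)
  result
end

Fiber.new do
  begin
    OUTPUT.push('beginning')
    risky_operation
    OUTPUT.push('end')
  rescue StandardError => e
    OUTPUT.push("ERROR: #{e.message}")
  ensure
    OUTPUT.push('Have a nice day')
  end
end.resume

# Drain the queue, as the event loop would once the timer fires.
CALLBACKS.shift.call until CALLBACKS.empty?
```
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Here the error is raised in the callback, but it is the fiber's own rescue block that records it, followed by the ensure block.&lt;/p&gt;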

&lt;p&gt;Once the fiber is resumed, the error handling functions as expected.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;ruby&quot;&gt;&lt;span class=&quot;nb&quot;&gt;require&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;eventmachine&amp;#39;&lt;/span&gt;
&lt;span class=&quot;nb&quot;&gt;require&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;em-synchrony&amp;#39;&lt;/span&gt;
&lt;span class=&quot;nb&quot;&gt;require&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;fiber&amp;#39;&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;doit&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;f&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;Fiber&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;current&lt;/span&gt;
  &lt;span class=&quot;no&quot;&gt;EventMachine&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;add_timer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;resume&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  &lt;span class=&quot;no&quot;&gt;Fiber&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;yield&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;

&lt;span class=&quot;no&quot;&gt;EM&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;synchrony&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;begin&lt;/span&gt;
    &lt;span class=&quot;nb&quot;&gt;puts&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&amp;quot;beginning&amp;quot;&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;doit&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;raise&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;uhh ohh&amp;#39;&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;rescue&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;Exception&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;e&lt;/span&gt;
    &lt;span class=&quot;nb&quot;&gt;puts&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&amp;quot;ERROR: &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;#{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;e&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;message&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;ensure&lt;/span&gt;
    &lt;span class=&quot;nb&quot;&gt;puts&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&amp;quot;Have a nice day&amp;quot;&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
  &lt;span class=&quot;no&quot;&gt;EventMachine&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stop&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;In this case, the exception will be caught and handled, and you will make it to the &quot;Have a nice day&quot; message.&lt;/p&gt;

&lt;p&gt;In order to handle the error, you can add some exception handling within the main context, or the root fiber.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;ruby&quot;&gt;&lt;span class=&quot;k&quot;&gt;begin&lt;/span&gt;
  &lt;span class=&quot;no&quot;&gt;EM&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;synchrony&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;begin&lt;/span&gt;
      &lt;span class=&quot;nb&quot;&gt;puts&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&amp;quot;beginning&amp;quot;&lt;/span&gt;
      &lt;span class=&quot;n&quot;&gt;doit&lt;/span&gt;
      &lt;span class=&quot;nb&quot;&gt;puts&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&amp;quot;end&amp;quot;&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;rescue&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;Exception&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;e&lt;/span&gt;
      &lt;span class=&quot;nb&quot;&gt;puts&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&amp;quot;ERROR: &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;#{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;e&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;message&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;ensure&lt;/span&gt;
      &lt;span class=&quot;nb&quot;&gt;puts&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&amp;quot;Have a nice day&amp;quot;&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
    &lt;span class=&quot;no&quot;&gt;EventMachine&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stop&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;rescue&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;Exception&lt;/span&gt;
  &lt;span class=&quot;nb&quot;&gt;puts&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&amp;quot;We caught something outside a fiber!&amp;quot;&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;However, I haven't yet figured out how to have the same thing done in the context of a web app.  So far, I've just ensured I have proper error handling in the callbacks before the fiber is resumed.  Trying to put a begin/rescue block in the config.ru doesn't seem to have any effect on rescuing an error.&lt;/p&gt;
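
&lt;p&gt;As a rough sketch of what &quot;error handling in the callbacks&quot; can look like, here is one way to do it with plain Ruby fibers, so it runs without EventMachine.  The helper names &lt;code&gt;resume_with_outcome&lt;/code&gt; and &lt;code&gt;wait_for_outcome&lt;/code&gt; are made up for illustration and are not part of EventMachine or em-synchrony.  The block passed to &lt;code&gt;resume_with_outcome&lt;/code&gt; stands in for whatever a timer or I/O callback would do.&lt;/p&gt;

```ruby
# Hypothetical helper (not an EventMachine or em-synchrony API): run the
# callback body outside the fiber, catch anything it raises, and resume the
# waiting fiber with either the result or the exception object.
def resume_with_outcome(fiber)
  outcome = begin
    yield
  rescue StandardError => e
    e
  end
  fiber.resume(outcome)
end

# Hypothetical helper: park the current fiber until it is resumed, then
# re-raise inside the fiber if it was handed an exception.
def wait_for_outcome
  outcome = Fiber.yield
  raise outcome if outcome.is_a?(Exception)
  outcome
end

log = []

worker = Fiber.new do
  begin
    log.push("beginning")
    wait_for_outcome
    log.push("end")
  rescue StandardError => e
    log.push("ERROR: #{e.message}")
  ensure
    log.push("Have a nice day")
  end
end

worker.resume                                     # runs until Fiber.yield
resume_with_outcome(worker) { raise 'uhh ohh' }   # stands in for the callback

puts log
```

&lt;p&gt;The idea is that the error never escapes in the root context.  It gets carried back into the fiber and raised there, where the existing begin/rescue can see it.&lt;/p&gt;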

&lt;p&gt;The moral of the story is when using fibers, be conscious of error handling outside the context of the fiber.  Unhandled exceptions there can cause the entire application to exit.  Make sure you have proper error handling when doing something that yields a fiber, and check whether the fiber-enabled libraries you are using have error handling outside the context as well.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>MySQL Read/Write Splitting in JRuby</title>
   <link href="http://invalidlogic.com//2011/05/16/mysql-read-write-splitting-with-jruby/"/>
   <link rel="license" type="text/html" href="http://creativecommons.org/licenses/by-nc-sa/3.0/" />
   <updated>2011-05-16T00:00:00-07:00</updated>
   <id>http://invalidlogic.com//2011/05/16/mysql-read-write-splitting-with-jruby</id>
   <content type="html">&lt;p&gt;I was rather surprised about the lack of information out there about how to do read/write splitting with MySQL and Rails on JRuby.  This is a pretty key part of our infrastructure, and has been a major point of our attention when performing large platform upgrades.&lt;/p&gt;

&lt;p&gt;Having recently gone through one such upgrade and subsequent breaking of our read/write splitting, I had to dive into the code that was doing it more and get a better understanding of how it works.&lt;/p&gt;

&lt;p&gt;With JRuby, there are two things to be aware of when trying to enable read/write splitting:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Does it route the queries to the right box?&lt;/li&gt;
&lt;li&gt;Does it do it without sucking all available connections up on MySQL?&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;The second one sounds weird, however we have actually seen it and had it cause major issues.  I am not totally sure what to attribute it to, but somewhere between ActiveRecord, ActiveRecord-JDBC, and the MySQL JDBC driver, there is some weird connection handling going on where a connection will still be established, however viewed as dead and a new connection will be opened up.  Eventually, it just sucks up all available connections to MySQL.&lt;/p&gt;

&lt;p&gt;This is a pretty big problem, but you can work around it.  Most of the existing information out there about how to do read/write splitting with ActiveRecord doesn't always apply cleanly to JRuby.  With one approach in particular, we ran into the connection-sucking issue during benchmarking.&lt;/p&gt;

&lt;p&gt;Fortunately, there isn't a whole lot that needs to be done to enable splitting.  The MySQL JDBC driver &lt;a href=&quot;http://dev.mysql.com/doc/refman/5.1/en/connector-j-reference-replication-connection.html&quot;&gt;already has support for it&lt;/a&gt; with its ReplicationDriver, you just need to enable it when using Rails.  Then you can leverage a plugin which sets a &quot;read_only&quot; property on the raw connection when it recognizes a 'SELECT' query.  When read_only is flagged, the driver knows it can send the query to servers other than the master.&lt;/p&gt;

&lt;p&gt;Steps to configure read/write splitting:&lt;/p&gt;

&lt;h3&gt;1) Get activerecord-jdbcmysql-adapter&lt;/h3&gt;

&lt;p&gt;If you're going to use MySQL on JRuby with ActiveRecord, this is rather a given.  We're currently using v1.1.1 and happy.  All the JDBC adapters are basically rolled up in &lt;a href=&quot;https://github.com/nicksieger/activerecord-jdbc-adapter&quot;&gt;one repo&lt;/a&gt; which produces multiple gems.&lt;/p&gt;

&lt;h3&gt;2) Get my version of the active-record-jdbc-mysql-master-slave plugin&lt;/h3&gt;

&lt;p&gt;There is a plugin that takes care of setting the read_only property on the connection for 'select' queries.  The &lt;a href=&quot;https://github.com/mccraigmccraig/active-record-jdbc-mysql-master-slave&quot;&gt;original version&lt;/a&gt; of the plugin was written for Rails 2.2 and hasn't been updated in 2 years.  We first used it on Rails 2.2.2 and had a lot of success with it.&lt;/p&gt;

&lt;p&gt;It doesn't work on Rails 2.3 though, primarily because of the load order changing.  It previously worked by alias_methoding the initializer on JdbcAdapter, and in its initialize, would alias_method the _execute method if using mysql.  However, load order changed in Rails 2.3 and the database connection is initialized before the plugin is loaded, causing its own version of initialize to not run.  I've &lt;a href=&quot;https://github.com/krobertson/active-record-jdbc-mysql-master-slave&quot;&gt;updated it&lt;/a&gt; to use a slightly different method to ensure it happens regardless of load order.&lt;/p&gt;

&lt;p&gt;To install the plugin, you can just use ./script/plugin:&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;text&quot;&gt;$ ./script/plugin install git://github.com/krobertson/active-record-jdbc-mysql-master-slave.git
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;h3&gt;3) Ensure active connections are reset in ActiveRecord&lt;/h3&gt;

&lt;p&gt;You can still experience some issues with stray connections.  The best way we have found to combat this is through an after filter on controllers that does some cleanup.  Add this portion to your ApplicationController:&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;ruby&quot;&gt;&lt;span class=&quot;n&quot;&gt;after_filter&lt;/span&gt;  &lt;span class=&quot;ss&quot;&gt;:clear_database_connections&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;clear_database_connections&lt;/span&gt;
  &lt;span class=&quot;no&quot;&gt;ActiveRecord&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Base&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;clear_active_connections!&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;By calling &lt;tt&gt;ActiveRecord::Base.clear_active_connections!&lt;/tt&gt;, it ensures connections are reset.  It is best to try and ensure it is the last after_filter defined.&lt;/p&gt;

&lt;h3&gt;4) Configure your database.yml&lt;/h3&gt;

&lt;p&gt;The key in the database.yml is to set up activerecord-jdbc to use MySQL's replication driver and to configure the &quot;url&quot; for it.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;erb&quot;&gt;&lt;span class=&quot;x&quot;&gt;production:&lt;/span&gt;
&lt;span class=&quot;x&quot;&gt;  adapter:  jdbcmysql&lt;/span&gt;
&lt;span class=&quot;x&quot;&gt;  username: myuser&lt;/span&gt;
&lt;span class=&quot;x&quot;&gt;  password: secret&lt;/span&gt;
&lt;span class=&quot;cp&quot;&gt;&amp;lt;%&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;vg&quot;&gt;$0&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=~&lt;/span&gt; &lt;span class=&quot;sr&quot;&gt;/(rake|irb)/&lt;/span&gt; &lt;span class=&quot;cp&quot;&gt;-%&amp;gt;&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;&lt;/span&gt;
&lt;span class=&quot;x&quot;&gt;  host:     masterdb&lt;/span&gt;
&lt;span class=&quot;x&quot;&gt;  port:     3306&lt;/span&gt;
&lt;span class=&quot;x&quot;&gt;  database: myapp&lt;/span&gt;
&lt;span class=&quot;cp&quot;&gt;&amp;lt;%&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt; &lt;span class=&quot;cp&quot;&gt;-%&amp;gt;&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;&lt;/span&gt;
&lt;span class=&quot;x&quot;&gt;  driver:   com.mysql.jdbc.ReplicationDriver&lt;/span&gt;
&lt;span class=&quot;x&quot;&gt;  url:      jdbc:mysql://masterdb:3306,slavedb:3306/myapp?roundRobinLoadBalance=false&amp;amp;autoReconnect=true&amp;amp;failOverReadOnly=true&lt;/span&gt;
&lt;span class=&quot;cp&quot;&gt;&amp;lt;%&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt; &lt;span class=&quot;cp&quot;&gt;-%&amp;gt;&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;You may be looking at it and thinking I've got erb in my yaml... and yes, I do.  Rails lets you mix the two, and the reason in this case is that we opt to disable the read/write splitting in the Rails console (irb) and when running rake.  There are two very important reasons to do so.  First, in console, we've had it suck up all connections just when doing simple debugging.  Not fun.  The normal adapter doesn't do it.  And second, we want it off in rake for when we deploy and it does db:migrate.  We don't want to allow replication delay in there, such as when querying the schema_migrations table.&lt;/p&gt;

&lt;p&gt;For the read/write splitting part, the bulk of the settings are in the &quot;url&quot;.  The first host specified is the master, and any others are slaves.  It is &lt;em&gt;supposed&lt;/em&gt; to support multiple slaves and have the option to load balance between them, though we found in practice it never really worked.  Instead we direct our web app and background processing to separate slaves.  After the hosts comes the database name, and the query string parameters are a number of the driver options.&lt;/p&gt;

&lt;h3&gt;How the stuff works&lt;/h3&gt;

&lt;p&gt;At first, looking at the code for the plugin seemed rather confusing.  Less than 100 lines of injecting into ActiveRecord did our read/write splitting?  But the truth is it is fairly simple, as most of the work is handled by the underlying driver (MySQL's JDBC driver).  All we really need to do is populate that read_only property on the driver's connection.  Shortened up, it does this:&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;ruby&quot;&gt;&lt;span class=&quot;n&quot;&gt;alias_method&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:_execute_without_master_slave&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:_execute&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;alias_method&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:_execute&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:_execute_with_master_slave&lt;/span&gt;

&lt;span class=&quot;c1&quot;&gt;# if we&amp;#39;re in auto-commit mode and about to execute a select statement, &lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;# then set the connection in read-only mode for the duration of&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;# the query... which will permit the query to be load-balanced&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;# amongst the slaves by the mysql connector/j ReplicationDriver&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;_execute_with_master_slave&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sql&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;kp&quot;&gt;nil&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  &lt;span class=&quot;c1&quot;&gt;# Need to set the read_only option on the raw connection&lt;/span&gt;
  &lt;span class=&quot;c1&quot;&gt;# to tell the underlying driver whether the request can&lt;/span&gt;
  &lt;span class=&quot;c1&quot;&gt;# go to slaves.&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;cro&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;raw_connection&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;connection&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;read_only&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;begin&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;raw_connection&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;connection&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;read_only&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; 
      &lt;span class=&quot;n&quot;&gt;raw_connection&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;connection&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;auto_commit&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&amp;amp;&lt;/span&gt;
      &lt;span class=&quot;no&quot;&gt;JdbcConnection&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;select&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;?(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sql&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

    &lt;span class=&quot;nb&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;_execute_without_master_slave&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sql&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
     
  &lt;span class=&quot;k&quot;&gt;ensure&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;raw_connection&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;connection&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;read_only&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;cro&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;To sum it up, the adapter has an _execute method which does the real execution and is passed in the raw SQL.  The plugin injects itself in the middle.  First, it captures the current read_only value, because after it executes the given query, it wants to ensure it is set back to that.  Next, it sets the read_only property based on whether the connection has auto_commit on and whether it is a 'select' query.  It then calls the original _execute method, and after it runs, sets read_only back.&lt;/p&gt;
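
&lt;p&gt;To see that flow without a database, here is a small stand-alone sketch.  &lt;code&gt;FakeConnection&lt;/code&gt; and &lt;code&gt;FakeAdapter&lt;/code&gt; are stand-ins invented for this example.  The real plugin patches the JDBC adapter and leaves the routing to the ReplicationDriver.  The 'select' check here is a simplified assumption too, not the plugin's actual detection logic.&lt;/p&gt;

```ruby
# Stand-in for the driver's raw connection; only the two flags matter here.
FakeConnection = Struct.new(:read_only, :auto_commit)

# Stand-in for the adapter. It records where each query would have gone
# instead of talking to a database.
class FakeAdapter
  attr_reader :connection, :routed

  def initialize
    @connection = FakeConnection.new(false, true)
    @routed = []
  end

  # Pretend execution: a read-only connection means the driver may use a slave.
  def _execute(sql, name = nil)
    @routed.push([sql, connection.read_only ? :slave : :master])
  end

  def _execute_with_master_slave(sql, name = nil)
    previous = connection.read_only
    begin
      # Simplified assumption: anything starting with "select" is a read.
      is_select = sql.lstrip.downcase.start_with?('select')
      connection.read_only = connection.auto_commit ? is_select : false
      _execute_without_master_slave(sql, name)
    ensure
      # Always put the flag back, even if the query raised.
      connection.read_only = previous
    end
  end

  alias_method :_execute_without_master_slave, :_execute
  alias_method :_execute, :_execute_with_master_slave
end

adapter = FakeAdapter.new
adapter._execute('SELECT * FROM users')           # auto-commit select
adapter._execute("UPDATE users SET name = 'x'")   # write
adapter.connection.auto_commit = false            # e.g. inside a transaction
adapter._execute('SELECT * FROM users')           # select, but not auto-commit

adapter.routed.each { |sql, target| puts "#{target}: #{sql}" }
```

&lt;p&gt;Only the first query is flagged as safe for a slave.  The write and the select inside a transaction both stay on the master, and read_only is back to its original value afterwards.&lt;/p&gt;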

&lt;p&gt;The actual plugin has several additional things, like allowing you to run some statements within a block to ensure it goes to the master, and to only work with the MySQL adapter.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Transparency and status</title>
   <link href="http://invalidlogic.com//2011/04/29/transparency-and-status/"/>
   <link rel="license" type="text/html" href="http://creativecommons.org/licenses/by-nc-sa/3.0/" />
   <updated>2011-04-29T00:00:00-07:00</updated>
   <id>http://invalidlogic.com//2011/04/29/transparency-and-status</id>
   <content type="html">&lt;p&gt;One thing that I think can really be highlighted from the recent AWS issues is the value of a status dashboard for your platform/service..  However, simply having one isn't quite enough.  Plenty of services already do.  What really matters is how effectively it is used.&lt;/p&gt;

&lt;p&gt;Just think why a customer would be coming to your status dashboard and what they'd be looking for.  If they're coming to it, it is likely because they're having a problem.  What they'd be looking for is confirmation that the company is aware of the issue and a rough idea of when it would be fixed.&lt;/p&gt;

&lt;p&gt;This sounds quite simple when plainly put, yet it is rarely done.  I don't know if it's some management/PR intervention of not wanting to sound unstable or unreliable, or if it is just that level of transparency isn't encouraged/practiced at these companies, but simple information often never makes it public, or only does so after the fact.&lt;/p&gt;

&lt;div style=&quot;float:right; margin-top: 5px; margin-left: 10px; margin-bottom: 5px&quot;&gt;&lt;a href=&quot;http://invalidlogic-blog.s3.amazonaws.com/2011-04-28-aws-large.png&quot;&gt;&lt;img src=&quot;http://invalidlogic-blog.s3.amazonaws.com/2011-04-28-aws-small.jpg&quot; width=&quot;68&quot; height=&quot;400&quot; title=&quot;holy cow that sucker's long&quot; border=&quot;0&quot; style=&quot;border: 0&quot; /&gt;&lt;/a&gt;&lt;/div&gt;


&lt;p&gt;One of our major gripes with Amazon when we were hosted with them was how we'd become aware of an issue, but in some cases it would be hours before an &lt;em&gt;acknowledgment&lt;/em&gt; of it went onto the status dashboard.  The 1am EBS outage of last week has repeated itself before, and interestingly around 1am several times.  It affected Reddit just a month or so ago, and it affected us back in October.  One of our other guys was up at 1am, aware immediately of the EBS issue since it was impacting our master database server.  It took Amazon &lt;em&gt;two and a half hours&lt;/em&gt; to post about it on their status site, even stating it began the same time we were paged.&lt;/p&gt;

&lt;p&gt;An additional aspect often missed is the value of the post-mortem.  When you have a major issue, customers lose confidence.  The post-mortem is a chance to try and gain some of that back.  It doesn't need to divulge every single detail, and you shouldn't look at it as an embarrassment.  When I think of a post-mortem, I look for 3 things: what happened, why it happened, and how it won't happen again.  It can be very effective at reassuring users if you can clearly show that you understand what went wrong and have a solid plan to ensure it won't happen again.  Glossing over the chance by simply saying &quot;we've fixed &lt;em&gt;it&lt;/em&gt;&quot; (without saying what &lt;em&gt;it&lt;/em&gt; is, in some fashion) can simply drive users away even more.&lt;/p&gt;

&lt;p&gt;Example of a post-mortem done wrong?  Look at &lt;a href=&quot;http://staff.tumblr.com/post/2127872280/downtime&quot;&gt;Tumblr's downtime in December&lt;/a&gt; or their &lt;a href=&quot;http://staff.tumblr.com/post/3959106211/update-regarding-security-issue&quot;&gt;security leak in March&lt;/a&gt;.  Look at the majority of AWS issues.  Typically, they don't do them.  They only do them when they &lt;a href=&quot;http://aws.amazon.com/message/65648/&quot;&gt;majorly&lt;/a&gt; &lt;a href=&quot;http://status.aws.amazon.com/s3-20080720.html&quot;&gt;screw up&lt;/a&gt;.  And when they do, there is an air of &quot;architect for us&quot; attitude.&lt;/p&gt;

&lt;p&gt;Here are a few quick things that would make any status dashboard serve its purpose well.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Update early, update often.&lt;/li&gt;
&lt;li&gt;Don't downplay the severity.&lt;/li&gt;
&lt;li&gt;Give ETAs on resolutions.&lt;/li&gt;
&lt;li&gt;Be very generous with ETAs (go worst case).&lt;/li&gt;
&lt;li&gt;Choose a good layout!  Look at the image to the side for Amazon's site during their issue.&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;Above all though, look at the status site from the perspective of your users.  Think about why they would come there and what information they're looking for.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Talk Cloudy To Me</title>
   <link href="http://invalidlogic.com//2011/04/27/talk-cloudy-to-me/"/>
   <link rel="license" type="text/html" href="http://creativecommons.org/licenses/by-nc-sa/3.0/" />
   <updated>2011-04-27T00:00:00-07:00</updated>
   <id>http://invalidlogic.com//2011/04/27/talk-cloudy-to-me</id>
   <content type="html">&lt;div style=&quot;float:right; margin-top: 5px; margin-left: 10px; margin-bottom: 5px&quot;&gt;&lt;img src=&quot;http://invalidlogic-blog.s3.amazonaws.com/2011-04-27-talk-cloud.jpg&quot; width=&quot;180&quot; height=&quot;119&quot; /&gt;&lt;/div&gt;


&lt;p&gt;I will be talking at the &lt;a href=&quot;http://www.meetup.com/cloudcomputing/events/16701362/&quot;&gt;Silicon Valley Cloud Computing Meetup&lt;/a&gt; this Saturday, April 30th.  If you are interested in the topic and in the area, please come out!  So far there are 500+ people registered to view panels discussing a wide variety of cloud-related topics throughout the day.&lt;/p&gt;

&lt;p&gt;I will be speaking on the panel titled &lt;strong&gt;&quot;What are public clouds still missing?&quot;&lt;/strong&gt;  I will be talking about what areas the public clouds still fall short in and where the road ahead will hopefully take us.&lt;/p&gt;

&lt;p&gt;Other speakers who will be there include notable people from VMware, including some of the &lt;a href=&quot;http://www.cloudfoundry.com/&quot;&gt;CloudFoundry&lt;/a&gt; engineers, Netflix, Cisco, Microsoft, and Rackspace.&lt;/p&gt;

&lt;p&gt;Definitely come and check it out!  Saturday starting at 1pm, hosted at the Microsoft Mountain View offices.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Amazon and what it means to you</title>
   <link href="http://invalidlogic.com//2011/04/22/amazon-and-what-it-means-to-you/"/>
   <link rel="license" type="text/html" href="http://creativecommons.org/licenses/by-nc-sa/3.0/" />
   <updated>2011-04-22T00:00:00-07:00</updated>
   <id>http://invalidlogic.com//2011/04/22/amazon-and-what-it-means-to-you</id>
   <content type="html">&lt;p&gt;Unless you were under a rock, you probably heard about the major issues in Amazon's US-East region yesterday.  Needless to say, a lot of businesses had a rough day yesterday.&lt;/p&gt;

&lt;p&gt;One of the most interesting aspects of yesterday was realizing how big Amazon's reach is.  Numerous well known startups and businesses were down, but beyond that, services not even on Amazon themselves were affected.  We &lt;a href=&quot;http://invalidlogic.com/2011/02/11/how-we-did-a-datacenter-migration-with-no-downtime/&quot;&gt;moved off EC2&lt;/a&gt; back in December, however we integrate with APIs from several other services, including some that are running on EC2.  We still felt the impact and were closely tracking Amazon's status.&lt;/p&gt;

&lt;p&gt;The event led to some interesting discussions though.&lt;/p&gt;

&lt;div style=&quot;width: 100%; text-align: center; margin: 10px 0;&quot;&gt;&lt;a href=&quot;http://twitter.com/#!/rtomayko/status/61132582489817088&quot;&gt;&lt;img src=&quot;http://invalidlogic-blog.s3.amazonaws.com/2011-04-22-rtomayko-tweet.jpg&quot; width=&quot;588&quot; height=&quot;260&quot; border=&quot;0&quot; style=&quot;border: 0&quot; /&gt;&lt;/a&gt;&lt;/div&gt;


&lt;p&gt;There has been a lot of blame going back and forth, with some leaning on the side of &quot;you should have planned for it&quot;.  That can be particularly harsh, in that most best practices for Amazon stressed leveraging multiple availability zones, while yesterday it was nearly an entire region that was out.  Still, there is an element of truth in the argument... &lt;em&gt;have a plan&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;Typically that is known as a Disaster Recovery plan.  They are easy to overlook until you realize you need one.  They're always one of those &quot;we should do that someday&quot; things, or maybe even something a company doesn't think it's big enough to need.&lt;/p&gt;

&lt;p&gt;I am no expert on disaster recovery (and there are plenty of consultants out there to help build a professional one), but there are some pretty basic things you can look at to at least have &lt;em&gt;a plan&lt;/em&gt; rather than &lt;em&gt;no plan&lt;/em&gt;.  Some services recovered pretty well yesterday, albeit with partial functionality, but they got something back up.  Others realized they were at the mercy of another provider and were stuck until then.&lt;/p&gt;

&lt;p&gt;There are some simple questions you can begin to answer for yourself for a simple plan:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What data is important?&lt;/strong&gt;  Your service runs on code, but it is powered by data.  What data is it dependent on?  Image assets, file uploads, databases?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Where does my data live, and how can I have it in multiple places?&lt;/strong&gt;  Take regular database backups and store them off site.  Keep a backup of assets.  It is important for them to be in more than one location since if that location disappears, you are SOL.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How much data loss is acceptable?&lt;/strong&gt;  If stuff goes down, you may lose some data.  If you want to limit the loss, you need to keep fresher copies of data.  More frequent backups, perhaps a live DB slave hosted somewhere else.  You need to either accept that some data loss is acceptable, or architect so that you won't have any data loss.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How fast can you provision servers and where?&lt;/strong&gt;  Geo distribution all the time can be expensive and unnecessary (unless you deem it is), but being able to have somewhere to quickly set up emergency servers is a major plus.  If you use AWS East, then maybe use AWS West for DR.  Make sure your account limits give you enough capacity, and have scripts and basic stuff in place and ready to be used.  We leverage Chef for all our systems, so we can launch new servers and get to work configuring them pretty quickly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What is your order of priorities?&lt;/strong&gt;  A service is often comprised of many smaller components.  Know which ones are the highest priority.  Know which functionality within your app is most important.  You'll be racing against the clock and have only so many people.  This lets them focus their attention to help get core functionality up as quickly as possible.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Do a fire drill.&lt;/strong&gt;  This is something I think we'll be doing pretty soon.  Out of the blue one day, do a mock fire drill.  You are down, your datacenter disappeared.  Get the service back up in an alternative place.  How long does it take you?  What were the pain points?  Beyond simply having a plan, you need to know the plan will work.&lt;/p&gt;

&lt;p&gt;And lastly, &lt;strong&gt;know your application&lt;/strong&gt;.  Know its requirements, constraints, and understand its bare minimums, most important pieces, etc.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Power through environments</title>
   <link href="http://invalidlogic.com//2011/04/14/power-through-environments/"/>
   <link rel="license" type="text/html" href="http://creativecommons.org/licenses/by-nc-sa/3.0/" />
   <updated>2011-04-14T00:00:00-07:00</updated>
   <id>http://invalidlogic.com//2011/04/14/power-through-environments</id>
   <content type="html">&lt;p&gt;At Involver, we have a pretty rich deployment environment that I think isn't very common.  It is normal to hear about places having a separate staging environment from the production environment, and sometimes having a staging and test environment.  We actually go a step further and have a total of 8 non-production environments.  They all serve different purposes, but are all heavily used.&lt;/p&gt;

&lt;p&gt;Our environments are broken up like this:&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;Production: duh, live stuff&lt;/li&gt;
&lt;li&gt;Staging: Used for QA, close to production, no experimental stuff, pretty good data set and history in the DB.&lt;/li&gt;
&lt;li&gt;Demo: Used for client-facing stuff that is under development.  Somewhat experimental, but not expected to be breaking or have a major impact.&lt;/li&gt;
&lt;li&gt;Perf: This environment is regularly repurposed and is used for benchmarking, testing infrastructure changes, and other large scale changes (Rails upgrades, DB upgrades, core app perf testing).&lt;/li&gt;
&lt;li&gt;Test1-5: These guys are pretty special.&lt;/li&gt;&lt;/ul&gt;


&lt;p&gt;The test environments are something special and are meant to be highly flexible.  They are used to deploy separate git branches to an environment that has all the different elements of our application fully orchestrated.  They are useful for testing volatile code, experimental code, or for performing QA on features or feature branches before the code gets merged into master.&lt;/p&gt;

&lt;p&gt;The deployment to these environments works by specifying the branch on deploy.  It will create the MySQL/Mongo databases, set up the DB schema, bootstrap the DB with some initial data, and then generate all the configuration files, including config files for other applications that it interacts with.  Each branch on each environment gets its own database.  And between deploys, the data for your branch is fully preserved.&lt;/p&gt;

&lt;p&gt;So say you deploy &quot;dev-fix-booboo&quot; to test3 for the first time.  It sets you up with a fresh dataset and you are good to go.  You do your thing, then later on, someone deploys another branch.  They get their own DB, not mucking up the stuff you previously had done, and when you deploy there again, you are right back where you left off.&lt;/p&gt;
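
&lt;p&gt;As a rough illustration of that per-branch, per-environment isolation, here is a minimal Ruby sketch.  This is not the actual deploy code: the &lt;code&gt;branch_database_name&lt;/code&gt; and &lt;code&gt;ensure_database&lt;/code&gt; helpers and the naming scheme are hypothetical, and an in-memory set stands in for the list of databases on the server.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;ruby&quot;&gt;require 'set'

# Hypothetical helper: build a database name that is unique per
# (environment, branch) pair, so a redeploy finds the same data again.
def branch_database_name(environment, branch)
  safe_branch = branch.downcase.gsub(/[^a-z0-9]+/, '_')
  # MySQL limits database names to 64 characters.
  format('%s_%s', environment, safe_branch)[0, 64]
end

# Create the database only on the first deploy of a branch to an environment.
# The set stands in for the databases that already exist.
def ensure_database(existing, name)
  return :reused if existing.include?(name)

  existing.add(name)
  :created
end

existing = Set.new
mine = branch_database_name('test3', 'dev-fix-booboo')
theirs = branch_database_name('test3', 'dev-other-feature')

puts mine                              # test3_dev_fix_booboo
p ensure_database(existing, mine)      # :created (fresh dataset)
p ensure_database(existing, theirs)    # :created (separate database)
p ensure_database(existing, mine)      # :reused (right where you left off)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Keying the name on both the environment and the branch is what keeps one branch's data from leaking into another's.  Truncating to 64 characters could make two very long branch names collide, so a real implementation would likely need something smarter there.&lt;/p&gt;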

&lt;p&gt;Since they are specific to the environment too, we often assign a long-running feature branch to an environment, so that the data used by QA sticks around longer, and smaller test branches can use them here and there.&lt;/p&gt;

&lt;p&gt;The full power of these comes from:&lt;/p&gt;

&lt;ol&gt;&lt;li&gt;Flexibility.  I have a branch I want to deploy without disturbing the other environments.  Sweet.&lt;/li&gt;
&lt;li&gt;Some element of persistence.  I can use it for a few days, come back, and be right where I left off.  No risk of someone polluting my stuff with bad data.&lt;/li&gt;
&lt;li&gt;QA can actually deploy to them and pick which is available.  For non-feature branch testing, this gives them great flexibility.&lt;/li&gt;&lt;/ol&gt;


&lt;p&gt;Of course, there are still some pitfalls:&lt;/p&gt;

&lt;ol&gt;&lt;li&gt;They don't mirror the full production setup.  They're usually one box and not spec'd for any kind of load.  They run well, but aren't going to handle lots of people.  Can't perf test here.&lt;/li&gt;
&lt;li&gt;There is the data persistence between switching branches, but often no rich DB history.  It is harder to test large datasets or legacy-type setups.&lt;/li&gt;
&lt;li&gt;Can't test infrastructure changes.  Deploys don't switch out gems or system configs, so if you want to test a change like that, you need to lock the environment until you're done.&lt;/li&gt;&lt;/ol&gt;


&lt;p&gt;All of this makes for a very flexible deployment process.  It allows devs to easily get code off of their machines and in a more controlled environment.  It allows us to have high confidence in code before it makes it to master and often even before the feature branch.  It also allows QA to do their testing without potentially deploying code they reject to staging or demo, where it might be customer facing or impact other development.&lt;/p&gt;

&lt;p&gt;Big props for it go to &lt;a href=&quot;http://twitter.com/mikewadhera&quot;&gt;@mikewadhera&lt;/a&gt;, who initially wrote the deployment process for it.  Since then, we (infra/ops team) have worked on making the environments single-system, which allows us to have more of them, and have iterated on making them more streamlined with developer and QA feedback.&lt;/p&gt;

&lt;p&gt;This kind of builds up to something I'll be talking about soon too.  Since we have a total of 9 different environments, this can lead to a very interesting chef workflow.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Serving Static Content Via POST From Nginx</title>
   <link href="http://invalidlogic.com//2011/04/12/serving-static-content-via-post-from-nginx/"/>
   <link rel="license" type="text/html" href="http://creativecommons.org/licenses/by-nc-sa/3.0/" />
   <updated>2011-04-12T00:00:00-07:00</updated>
   <id>http://invalidlogic.com//2011/04/12/serving-static-content-via-post-from-nginx</id>
   <content type="html">&lt;p&gt;Recently, a change was made on Facebook's end where their &amp;lt;fb:iframe&gt; FBML tag started doing POST request to get content.  Their documentation seemed to indicate their &quot;Post for Canvas&quot; change wouldn't affect the &amp;lt;fb:iframe&gt; tag, however our error logs spoke quite to the contrary.  We saw an influx in errors in places where we used &amp;lt;fb:iframe&gt; to pull in static content.  Nginx would simply return a 405 error for &quot;Method Not Allowed&quot;.  Nginx can't serve static content on a POST request.&lt;/p&gt;

&lt;p&gt;Some quick googling revealed some quick answers, but none of them worked.&lt;/p&gt;

&lt;p&gt;The most popular, &lt;a href=&quot;http://forum.nginx.org/read.php?2,2414,47301&quot;&gt;this thread&lt;/a&gt;, talks about a workaround that manipulates the error page for nginx.  First, there is some mixed syntax depending on the nginx version.  Some references seem to be to old 2007 syntax which doesn't work on the nginx 0.7 build we use.  But they all indicated that this &lt;em&gt;should&lt;/em&gt; have worked:&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;nginx&quot;&gt;&lt;span class=&quot;k&quot;&gt;error_page&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;405&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;200&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;@405&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;location&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;@405&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;kn&quot;&gt;root&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;...&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;This basically tells nginx to change the response code to 200 for 405 messages, and to use the @405 location entry to handle it.  Setting the root there is supposed to make it pull the requested document.  It didn't work though.  At best, I got it to return a 200 code, but no response body.  Essentially, it was still doing a POST request and it still didn't like it.&lt;/p&gt;

&lt;p&gt;I could, however, get it to be served by proxying it to our Rails application, if I enabled serving static content from Rails.  But that was less than ideal.&lt;/p&gt;

&lt;p&gt;But then, I was scanning through nginx's &lt;a href=&quot;http://wiki.nginx.org/HttpProxyModule&quot;&gt;proxy documentation&lt;/a&gt; and found what seemed like an ideal solution.  I could configure nginx to serve static content on another port, have the main server proxy it to the other port, and when proxying, change it into a GET request:&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;nginx&quot;&gt;&lt;span class=&quot;k&quot;&gt;upstream&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;static_backend&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;kn&quot;&gt;server&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;localhost&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;89&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;server&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;kn&quot;&gt;...&lt;/span&gt;
  &lt;span class=&quot;s&quot;&gt;error_page&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;405&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;200&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;@405&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  &lt;span class=&quot;kn&quot;&gt;location&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;@405&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;kn&quot;&gt;root&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;...&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;kn&quot;&gt;proxy_method&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;GET&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;kn&quot;&gt;proxy_pass&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;http://static_backend&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;c1&quot;&gt;# POST STATIC CONTENT&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;server&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;kn&quot;&gt;listen&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;89&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  &lt;span class=&quot;kn&quot;&gt;server_name&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;_&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  &lt;span class=&quot;kn&quot;&gt;root&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;...&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;



</content>
 </entry>
 
 <entry>
   <title>Our pain points with EC2 and how our move solved them</title>
   <link href="http://invalidlogic.com//2011/02/16/our-pain-points-with-ec2/"/>
   <link rel="license" type="text/html" href="http://creativecommons.org/licenses/by-nc-sa/3.0/" />
   <updated>2011-02-16T00:00:00-08:00</updated>
   <id>http://invalidlogic.com//2011/02/16/our-pain-points-with-ec2</id>
   <content type="html">&lt;p&gt;As I &lt;a href=&quot;http://invalidlogic.com/2011/02/11/how-we-did-a-datacenter-migration-with-no-downtime/&quot;&gt;mentioned in my last post&lt;/a&gt;, we made the move off of EC2 in December to our own cluster of machines.&lt;/p&gt;

&lt;p&gt;While EC2 had served us very well and helped get our systems to the
place they are today, we started reaching a point where we were
running into three main bottlenecks with the EC2 ecosystem and we couldn't find any way around them.&lt;/p&gt;

&lt;p&gt;I'll start off by saying that AWS is an awesome service and we couldn't
have grown to the level we are at today without them.  Amazon should
be an easy contender for any service just starting out.  Some of the
services they provide, especially things like Elastic Load Balancer
(ELB) and Elastic Block Store (EBS), are services that are simple to
start using, help tremendously as you grow, and simply aren't common
with traditional VPS hosting.&lt;/p&gt;

&lt;p&gt;However, here are some of the bottlenecks we faced and how we've
solved them in our new setup.&lt;/p&gt;

&lt;h3&gt;IO Performance&lt;/h3&gt;

&lt;p&gt;Anyone who is doing anything intensive on EC2 will tell you that IO
performance is horrendous.  We were battling EBS constantly on our
database servers and had serious questions about how we could continue to
operate, let alone scale, our DBs while in the cloud.  We met with AWS
engineers to review our configurations, but we were already following all
the best practices.  We had tuned our MySQL configuration as much as we
could, had the &quot;high-memory&quot; instances with higher IO priority and newer
CPUs, and were running 4 EBS volumes in RAID10.&lt;/p&gt;

&lt;p&gt;They tried to push Amazon RDS, however RDS is simply EC2 and EBS with
the same best practices.  While it might be slightly better, it wasn't going to
totally solve our problems.&lt;/p&gt;

&lt;p&gt;The biggest issue was IO latency.  We had database servers where it
was normal for them to average 20-30% IO wait.  We frequently had
spikes upwards of 60% to sometimes 90%.&lt;/p&gt;

&lt;p&gt;And it wasn't always disk performance, but IO as a whole.  We would at times provision a
new database server that would simply struggle to stay fully
replicated with no query load on it.  Either struggling with IO
latency, or sometimes even keeping a connection up to the master.&lt;/p&gt;

&lt;p&gt;There is unfortunately no solution.  Everything in AWS is a shared
resource.  With that comes variable performance and bad spikes due to
hot-spots.  It is difficult to predict or ensure consistent
expectations.&lt;/p&gt;

&lt;h3&gt;Support&lt;/h3&gt;

&lt;p&gt;At one point, we encountered an issue where we were seeing extreme IO
latency on our EBS volumes on our master MySQL DB server.  One of our
other ops guys was up for several hours in the middle of the night,
along with our CTO and Director of Product.  Unfortunately it was an
EBS issue throughout the entire zone and we had a large footprint in
that zone, including our master database server.&lt;/p&gt;

&lt;p&gt;The biggest thing we took issue with was that it was almost 3 hours
before it actually made it to Amazon's status dashboard.&lt;/p&gt;

&lt;p&gt;After that incident, we started looking at Amazon's Premium support,
however the support packages are really quite weak when you dig into
them.&lt;/p&gt;

&lt;p&gt;First of all, Premium is a straight 20% added to your bill.  You can't
select it per instance or anything.  We didn't care about premium
support on our staging systems, so we'd have to move those to another
account if we signed up.&lt;/p&gt;

&lt;p&gt;But the big thing... you get 24/7 phone support and a 1 hour
acknowledgement SLA.  Meaning they'll confirm you are having an issue
within 1 hour.  However, resolution is another thing.  Because
everything in AWS is a shared resource, they can't offer any sort of
priority resolution at all.&lt;/p&gt;

&lt;p&gt;So that EBS latency issue?  It took 3 hours to make their status
dashboard, and a total of 11 hours from when we noticed it to them
posting it was totally resolved.&lt;/p&gt;

&lt;p&gt;Paying for Premium support wouldn't have helped much.  It would still
be fixed for us at the same time.  And telling the CTO/Director &quot;AWS
has acknowledged it, it'll be fixed when it's fixed&quot; isn't the type of
message we like to deliver.&lt;/p&gt;

&lt;h3&gt;Scaling Costs&lt;/h3&gt;

&lt;p&gt;Amazon is great for services just starting out.  It is simple pay-as-you-go, you get availability of a lot of nice functionality with S3,
Elastic Load Balancer, and others, and it is quick to get bigger instances
as you grow and need more.&lt;/p&gt;

&lt;p&gt;However there comes a threshold where the costs start growing more
quickly than your scale.&lt;/p&gt;

&lt;p&gt;Want support?  Bam, extra 10%-20%.&lt;/p&gt;

&lt;p&gt;IO bottlenecks?  Amazon's answer is to use bigger instances (higher IO
priority) and spread the load across multiple instances.  Both of those mean
more money.&lt;/p&gt;

&lt;p&gt;Once you start reaching bottlenecks, the only answer within AWS is to
work around them, and that usually means more costs.&lt;/p&gt;

&lt;p&gt;Additionally, since it is so easy to provision new systems, it is easy
to get carried away without realizing how much the changes are costing you.&lt;/p&gt;

&lt;h3&gt;Insight&lt;/h3&gt;

&lt;p&gt;This is actually a bottleneck we didn't really realize until after we
moved.  It is amazing how little insight you have into your
architecture until you have full control over it.&lt;/p&gt;

&lt;p&gt;For instance, on the new setup we had nice zone separation between
systems and were seeing massive spikes in concurrent connections
through the firewall, 3x more than we expected.  They all ended up
being DNS traffic.  Since it was UDP, the &quot;connection&quot; count was
skewed, but we hadn't realized how much DNS traffic we generated.  In
AWS, we couldn't monitor low level traffic metrics in the same way,
track our aggregate DNS traffic, or anything.&lt;/p&gt;

&lt;h3&gt;Our Solutions&lt;/h3&gt;

&lt;p&gt;Our solutions to all of these were actually quite simple and are a
nice blend between old school architecture and modern pragmatism.&lt;/p&gt;

&lt;p&gt;Nowadays, you start to hear all this stuff about &quot;private clouds&quot;.
First of all, drop the bullshit.  &quot;Cloud&quot; is slapped on anything it
can be these days.  There is nothing new to the concept of &quot;private clouds&quot;.&lt;/p&gt;

&lt;p&gt;All it is is a virtualization cluster on equipment dedicated to you.
It's been around for 10+ years.  Clouds caught on because they were
simple to get started with and great at the small scale.  Now it's just
a marketing buzzword.&lt;/p&gt;

&lt;p&gt;Our approach was simple.  We separated our environment out into a
hybrid of dedicated systems for some things, and a large SAN backed
virtualization cluster for everything else.&lt;/p&gt;

&lt;h4&gt;Databases&lt;/h4&gt;

&lt;p&gt;Dedicated hardware is the perfect solution for IO intensive
workloads.  With this, we could actually specify what we wanted.
Types of disks, RAID configuration, capacity, etc.  We could go as
simple as RAID 10, up to SSD or even FusionIO.&lt;/p&gt;

&lt;p&gt;We moved all our MySQL and MongoDB systems to dedicated hardware, each
built to their own needs and planned ahead for a certain timeline.  We
then looked at several growth metrics.  For instance, MySQL would
likely need more CPU or disk capacity in the future.  MongoDB would
likely need more RAM in the future.&lt;/p&gt;

&lt;h4&gt;Virtualization&lt;/h4&gt;

&lt;p&gt;Almost everything else was virtualization.  We got several very big,
bulky virtualization systems.  12 cores, 24 threads, 128GB RAM, ohh
yeah.  How we slice and dice it is up to us.  We have everything
configured for full failover of the base chassis.&lt;/p&gt;

&lt;p&gt;At the SAN level, we left that up to the guys at Network Redux.  They
monitor the workload levels and plan for it accordingly.  The IO
doesn't scale infinitely, but rather we try to maintain an average
expectation so that growth is predictable.&lt;/p&gt;

&lt;h4&gt;Support&lt;/h4&gt;

&lt;p&gt;Support is probably one of the biggest and most dramatic changes.  AWS is
a complete black box.  Now, we have two people dedicated to our
account, as well as a 24-hour support desk we can call and have a tech pull
up all the details on our account.  We have the cellphone numbers of
the people dedicated to our account.  We have yet to call them at 2am,
however having it is more than we ever had with Amazon.&lt;/p&gt;

&lt;p&gt;We now have people proactively working to help us.  If IO load is
growing on the SAN, they reach out to us ahead of time, before it
becomes a problem on our end.&lt;/p&gt;

&lt;p&gt;If a hardware node throws an alert, they get paged at the same time
and will hop online to help us to investigate.&lt;/p&gt;

&lt;p&gt;This is probably one of the biggest pros, and our new provider, &lt;a href=&quot;http://networkredux.com/&quot;&gt;Network Redux&lt;/a&gt;, has been truly awesome in that regard.  They are proactive, and we deal with the same people over time, so they are familiar with our deployment, our change history, etc.&lt;/p&gt;

&lt;h4&gt;Costs&lt;/h4&gt;

&lt;p&gt;Costs though... surely all this raw hardware goodness, scalable IO,
and choice in your deployment costs a whole lot extra, doesn't it?
Not so.  You might actually be surprised how much cheaper it is.  I
can't give numbers, but I will say it was &lt;em&gt;significant&lt;/em&gt; and after the
move one of the main bullet points added by the CFO was &lt;em&gt;how much&lt;/em&gt;
cheaper it was.&lt;/p&gt;

&lt;p&gt;To make the CFO even happier, we had planned ahead for growth, where
we could handle spikes and add some additional systems without extra
costs, and had solid figures on how much the costs would increase.
&quot;When we need more capacity here, the next step is X, and the cost
will be $Y.&quot;  As more time goes on, we're getting better at predicting
when the growth will be needed, and showing the benefits.&lt;/p&gt;

&lt;h3&gt;Conclusions&lt;/h3&gt;

&lt;p&gt;Overall, Amazon is a great service.  Don't get me wrong.  It is truly
excellent for companies/services just starting out and has an awesome
growth path for scaling early on.&lt;/p&gt;

&lt;p&gt;However nothing scales infinitely.  Not your applications, and not the
benefits of a service like AWS.&lt;/p&gt;

&lt;p&gt;Once you reach a certain threshold, you benefit more from full control
over your environment and you get to the point where it is actually
cheaper than the cookie cutter services AWS provides.  Where that
threshold is depends on each case.  But it is important to know where
it is, even if you aren't there yet.&lt;/p&gt;

&lt;p&gt;For us, we improved our performance several times over, have more
control than ever, and did it all while actually saving money.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>How we did a datacenter migration with no downtime</title>
   <link href="http://invalidlogic.com//2011/02/11/how-we-did-a-datacenter-migration-with-no-downtime/"/>
   <link rel="license" type="text/html" href="http://creativecommons.org/licenses/by-nc-sa/3.0/" />
   <updated>2011-02-11T00:00:00-08:00</updated>
   <id>http://invalidlogic.com//2011/02/11/how-we-did-a-datacenter-migration-with-no-downtime</id>
   <content type="html">&lt;p&gt;Starting in October, we at Involver decided to take on a project migrate our entire infrasture off of Amazon EC2.  AWS had served us well, however there comes a point where the limitations of some of their services were becoming a major bottleneck.  The bottlenecks and how we've resolved them in our migration is a big enough topic for a follow up post, but we recognized that we needed more control over our environment and the specifications of all the systems.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Update:&lt;/strong&gt; Have posted the follow up on the &lt;a href=&quot;/2011/02/16/our-pain-points-with-ec2/&quot;&gt;bottlenecks we were facing on EC2&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;After nearly a month of evaluating providers, doing benchmarks on test systems, and mulling over all the details, we picked our provider.  We then spent another month refactoring various systems and re-evaluating all machine roles and system configurations.  Since we were no longer constrained by EC2 instance types, we started looking at how many CPUs and how much RAM is really best suited for each system.&lt;/p&gt;

&lt;p&gt;For our hosting provider, we ended up going with &lt;a href=&quot;http://networkredux.com&quot;&gt;Network Redux&lt;/a&gt;.  Of all the providers we looked at, both big names you'd recognize and some smaller ones you wouldn't, they proved the best fit.  Go to any provider and they can all get you the same server.  That is easy.  It's the other services, the skills of their staff, and quality of the total, big picture architecture that matter.  With Network Redux, a lot of it came down to the fact that they worked like us.  They hired top-notch people, they moved quickly on things, and had a short decision tree.  Case in point: the President is on your sales call and can say you'll have a test system with set specs in two days, vs the sales person needing to pull out an org chart and &quot;get back to you.&quot;&lt;/p&gt;

&lt;p&gt;After our month of refactoring, we finally flipped the switch the night of December 23rd.  We did an entire datacenter migration with no user downtime.  Even more, we didn't even have all of our production systems until that morning (thanks Dell! &amp;lt;/sarcasm&amp;gt;).&lt;/p&gt;

&lt;p&gt;So how did we pull it off?&lt;/p&gt;

&lt;p&gt;It boils down to an excellent team, delicate planning, and an awesome toolset.  Our operations team is only 3 people, but we manage what is now 8 environments where our code lives and about 100 systems.&lt;/p&gt;

&lt;p&gt;Everything in our environment is fully configured with chef.  With the migration, we refactored a lot of our recipes and put a strong emphasis on having everything in chef.  Each system was provisioned, cooked and recipes tweaked, then destroyed, reprovisioned, and recooked.  We repeated this until it was ready to go right after cooking.&lt;/p&gt;

&lt;h3&gt;MySQL&lt;/h3&gt;

&lt;p&gt;Our production database servers were the first to get set up on the production side.  We'd already validated all our chef recipes for both our MySQL and our MongoDB clusters on our staging environment beforehand.  With MySQL, we run a master-master setup with multiple read-only slaves.  We cooked each of the boxes and configured all of the slaves to point to what would be the new master.  We made a change to ensure that the new master and the EC2 master had different auto-increment offsets and we also ensured log_slave_updates was enabled.  We then imported a backup of our production database onto the new master.  It automatically replicated out to the passive master and read-only slaves.  We then set it up so our new master would replicate from our EC2 master.&lt;/p&gt;

&lt;p&gt;With this configuration, writes would be coming in on our EC2 master, get replicated to the new master, and since it has log_slave_updates on, those changes would then replicate out to the other database servers.  This also enabled us to switch over the site and easily allow writes to start coming in to the new environment.  If there were some lagging requests to the old site, they would replicate to the new environment without any collisions.  We also documented the position of the new master when the switch happened, so if we needed to revert, we could potentially set the EC2 master to replicate from the other master.&lt;/p&gt;
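
&lt;p&gt;To see why distinct auto-increment offsets avoid collisions while both masters may accept writes, here is a small Ruby simulation.  It is only a sketch: the offsets (1 and 2) and the shared &lt;code&gt;auto_increment_increment&lt;/code&gt; of 2 are assumed values for illustration, and the &lt;code&gt;generated_ids&lt;/code&gt; helper is hypothetical and ignores rows already in the table.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;ruby&quot;&gt;# Ids handed out by a server with the given auto_increment_offset and
# auto_increment_increment (simplified: the table starts out empty).
def generated_ids(offset, increment, count)
  Array.new(count) { |i| offset + (i * increment) }
end

# Assumed values for illustration only.
ec2_master_ids = generated_ids(1, 2, 5)
new_master_ids = generated_ids(2, 2, 5)

p ec2_master_ids   # [1, 3, 5, 7, 9]
p new_master_ids   # [2, 4, 6, 8, 10]

collision = ec2_master_ids.any? { |id| new_master_ids.include?(id) }
puts collision     # false
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Each master only ever generates ids from its own sequence, so a lagging write on the old master can replicate over without clashing with a row created on the new one.  This only holds when both servers use the same increment and different offsets.&lt;/p&gt;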

&lt;h3&gt;MongoDB&lt;/h3&gt;

&lt;p&gt;The MongoDB databases were migrated in a different fashion.  We were moving from a single replica set configuration to a sharded setup with two replica sets.  Because of this, we couldn't do the same early replication process we did with MySQL.  Mongo is primarily used for our analytics data, so with it, there is a primary collection that is most important and then others used for the calculations.  We cooked the database servers ahead of time and got the sharding configuration and everything fully set up beforehand.  Then the data migration was scripted out so that with a quick command, we would pull the most important data over first by exporting the data, transferring it, then importing it.  This way, it was migrated and sharded at the same time and fully automated.&lt;/p&gt;

&lt;h3&gt;Our Application&lt;/h3&gt;

&lt;p&gt;The morning of the 23rd, we came in, having just had the bulk of our production virtualization cluster set up the night before (thanks again for the delay Dell).  This is where the awesomeness of chef and our team truly shined.  We set to work and provisioned all of the production systems, cooked them, set them up in load balancing, verified security rules, and completed some of the final code tweaks and testing.  We didn't even deploy our final monitoring systems until that morning.   We did a few test deploys, verified all the parts of the app would come up, and that monitoring on them was accurate.&lt;/p&gt;

&lt;h3&gt;The Switch&lt;/h3&gt;

&lt;p&gt;Finally, around 10:45pm, we made the switch.  The actual cutover boiled down to just a DNS change.  TTLs had been changed to 60 seconds weeks earlier.  We just repointed the necessary hosts and within 60 seconds, we saw the majority of traffic cease on EC2 and come in on our new systems.&lt;/p&gt;

&lt;p&gt;Right after we made the DNS change, we also removed all of our old app servers from ELB (Elastic Load Balancer) except for two.  We turned nginx off on those two and turned on &lt;a href=&quot;/2010/02/12/migrating-datacenters-how-to-forward-traffic&quot;&gt;rinetd&lt;/a&gt;, which was already configured to forward any traffic to port 80 or 443 to our new environment.&lt;/p&gt;

&lt;p&gt;The benefits of the migration were almost instantly visible.  Noah, our CTO, was in the office with us to do acceptance testing right afterward.  We made the switch, verified DNS had updated, and when he loaded the first page, his first reaction was &quot;Holy shit... this is so much faster!&quot;  That there was all the validation we needed for the 2 months of work we had put in.&lt;/p&gt;

&lt;p&gt;Most impressive was looking back at data in New Relic from before and after the migration.  It was quite surprising how drastic the improvement was.&lt;/p&gt;

&lt;div style=&quot;text-align: center&quot;&gt;&lt;img src=&quot;http://invalidlogic-blog.s3.amazonaws.com/redux-switch.jpg&quot; width=&quot;647&quot; height=&quot;264&quot; /&gt;&lt;/div&gt;


&lt;p&gt;And it's gotten even better.  No longer having to fight fires with AWS, we've actually made tons of other enhancements.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>How a deprecated method can take down a process</title>
   <link href="http://invalidlogic.com//2011/02/07/how-a-deprecated-method-can-take-down-a-process/"/>
   <link rel="license" type="text/html" href="http://creativecommons.org/licenses/by-nc-sa/3.0/" />
   <updated>2011-02-07T00:00:00-08:00</updated>
   <id>http://invalidlogic.com//2011/02/07/how-a-deprecated-method-can-take-down-a-process</id>
   <content type="html">&lt;p&gt;At Involver, we have a pretty active background processing tier that handles a variety of things from long running tasks we don't want in the web app, pulling in content from external sources, and simple recurring maintenance tasks.&lt;/p&gt;

&lt;p&gt;Previously, we ran a single worker thread per JVM, but that setup led to a low worker-per-system density since it sucked up a lot of memory.  A few months ago, we came up with a way to run multiple workers within a single JVM and have it nicely thread safe by using a small embeddable JRuby VM started off the main process.  This worked excellently and doubled the number of workers we could run per machine, since they could share heap space.&lt;/p&gt;

&lt;p&gt;But then we recently began experiencing an issue where all the threads would just halt processing.  No errors, seemingly nothing wrong.  Heap usage was fine and a thread dump showed them seemingly ok, but they just sat.  We tried attaching a profiler, however it just output garbage when we tried to have it dump its data after the process got stuck.&lt;/p&gt;

&lt;p&gt;After 2 days of investigation, we finally managed to track the cause down to the fact that a change a few days before had caused us to start running over a function that used a deprecated method call, and the change meant we were hitting that line 500-1000 times per minute.&lt;/p&gt;

&lt;div style=&quot;float:right; margin-top: 5px; margin-left: 10px; margin-bottom: 5px&quot;&gt;&lt;img src=&quot;http://invalidlogic-blog.s3.amazonaws.com/gru-light-bulb.jpg&quot; width=&quot;400&quot; height=&quot;231&quot; title=&quot;not a light bulb, but still very shiny&quot; /&gt;&lt;/div&gt;


&lt;p&gt;At first, we suspected maybe JRuby had some sort of thread concurrency issue with calling deprecated methods, but that made no sense.  All calling a deprecated method does is write to the console!&lt;/p&gt;

&lt;p&gt;As Gru in Despicable Me would say... &quot;Light bulb!&quot;&lt;/p&gt;

&lt;p&gt;When running a single worker in a console, we saw the warnings, but when running multiple workers per JVM, we saw none.  The rabbit hole got deeper, but fortunately it led us to the solution, and we were able to confirm both the issue and the resolution.&lt;/p&gt;

&lt;p&gt;The way we spawned the child threads, we found stdout/stderr was not attached to anything.  They still use buffered IO, though.  There is a fixed buffer size, and if you try writing more than that, the write will block until the buffer is read from and space is freed.  Since nothing was reading from it, the threads would simply block indefinitely on writing out the deprecation warnings.&lt;/p&gt;
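
&lt;p&gt;To make the blocking behavior concrete, here is a minimal sketch that uses an ordinary pipe as a stand-in for an output stream nobody reads.  That stand-in is an assumption for illustration, not how our child threads were actually wired up.  It uses &lt;code&gt;write_nonblock&lt;/code&gt; so the script reports the full buffer instead of hanging the way a plain &lt;code&gt;puts&lt;/code&gt; would.&lt;/p&gt;

```ruby
# Sketch only: a pipe stands in for a stdout that nothing reads from.
reader, writer = IO.pipe

chunk = "a" * 1024      # same payload size as the 1 KB puts in the test class
bytes_written = 0
would_block = false

loop do
  begin
    # write_nonblock raises IO::WaitWritable once the buffer has no room,
    # where an ordinary blocking write would simply wait forever.
    bytes_written += writer.write_nonblock(chunk)
  rescue IO::WaitWritable
    would_block = true
    break
  end
end

puts "buffer full after #{bytes_written} bytes; a blocking write would hang here"
```

&lt;p&gt;The exact capacity depends on the operating system, so the script prints whatever it observes rather than assuming a number.&lt;/p&gt;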

&lt;p&gt;So we set up a test.  We created the following class on one of our boxes and spawned up the multi-worker process.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;ruby&quot;&gt;&lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Awesome&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;kill_me_softly&lt;/span&gt;
    &lt;span class=&quot;kp&quot;&gt;loop&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
      &lt;span class=&quot;nb&quot;&gt;puts&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&amp;quot;a&amp;quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1024&lt;/span&gt;
      &lt;span class=&quot;no&quot;&gt;Rails&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;logger&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;info&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;Alive!&amp;#39;&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;It came up, but would get hung in less than a second.  It would simply do no more.  The threads would lock up independently, one by one.  So the whole JVM wasn't frozen, just that child thread.  Hence why they all died at different times, but very close together.&lt;/p&gt;

&lt;p&gt;Then, to confirm it further and fix it, we changed the multi-worker process to reopen the stdio handles on child threads:&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;ruby&quot;&gt;&lt;span class=&quot;no&quot;&gt;STDIN&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;reopen&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&amp;quot;/dev/null&amp;quot;&lt;/span&gt;
&lt;span class=&quot;no&quot;&gt;STDOUT&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;reopen&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&amp;quot;/dev/null&amp;quot;&lt;/span&gt;
&lt;span class=&quot;no&quot;&gt;STDERR&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;reopen&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&amp;quot;/dev/null&amp;quot;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;Spawn one back up, and it begins streaming to no end, no lock ups, no problems.  Success!&lt;/p&gt;

&lt;p&gt;In the end, we changed the function to no longer use the deprecated method, and had the multi-worker process redirect stdio for child processes to an actual file, so that we can log any valuable messages we might otherwise be missing.  But that is hopefully the first and only time something as off as stdio buffering takes down a process.&lt;/p&gt;
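
&lt;p&gt;For reference, redirecting to a file rather than &lt;code&gt;/dev/null&lt;/code&gt; could look roughly like the sketch below.  The helper name &lt;code&gt;redirect_to_log&lt;/code&gt; and the log path in the usage comment are hypothetical, made up for illustration; this is not the exact code from the multi-worker process.&lt;/p&gt;

```ruby
# Hypothetical helper: point an IO handle at an append-mode log file.
def redirect_to_log(io, path)
  io.reopen(path, "a")  # swap the underlying stream for the file, appending
  io.sync = true        # write through immediately instead of buffering in Ruby
  io
end

# Usage sketch (the path is an assumption, pick one that fits your setup):
#   redirect_to_log(STDOUT, "/var/log/workers/child.log")
#   redirect_to_log(STDERR, "/var/log/workers/child.log")
```

&lt;p&gt;Appending keeps earlier output around across restarts, and turning on &lt;code&gt;sync&lt;/code&gt; means a message is on disk as soon as it is written, which helps when a process dies unexpectedly.&lt;/p&gt;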
</content>
 </entry>
 
 
</feed>
