Archive for September, 2011

September 20, 2011

Running Heroku on Heroku

[Image: Heroku logo]

This is a live summary taken from this talk given at Strange Loop.

Today Noah Zoschke (@nzoschke) will cover running Heroku on Heroku. Heroku, for those not familiar, is a cloud application platform as a service. It used to be a Ruby-application-as-a-service platform, but it has since been opened up to many other languages. Heroku has always been about getting rid of the need for servers; at least, the need for you to maintain servers. This talk is about bootstrapping and self hosting and all the benefits for the dev and operations cycles that come along with it – not to mention the benefits for your business.

The word bootstrapping has come to mean a self-sustaining process that proceeds without external help. There are many applications of this term: socioeconomics, business, statistics, linguistics (how a small child goes from no spoken ability to having it), biology (we all start as just a few cells, and our cells then figure things out), and of course computers (booting up is bootstrapping). We have a computer that is off, and we need to figure out how to get the system from that off state into a fully running and viable-for-work state. Bootstrapping also has a very specific meaning for compilers: if a compiler is written in the language that it itself compiles, then it is bootstrapped. We will talk about the compiler example for just a minute, for illustration purposes, before we get into what this could mean for services.

Self building/bootstrapping is something that almost all languages and compilers strive for. Bootstrapping is an excellent test for any compiler. It allows you to work on your compiler in a higher-level language, and it leads to a really great consistency check of the compiler itself. A compiler that can compile itself is also a good thing because it reduces the overall footprint of the tools needed to work on the compiler. There is of course a chicken-and-egg problem, and there are a number of strategies for handling it:

Build a compiler/interpreter for X in a different language Y
Use an earlier version of the compiler
Hand compile
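The classic way to exercise that consistency check is the three-stage fixed-point test: build the new compiler with a seed compiler (stage 1), rebuild it with itself (stage 2), rebuild once more with the result (stage 3), and verify that stages 2 and 3 are identical. Here is a toy Python sketch of that idea, with the "compiler" reduced to a deterministic text transform (every name here is illustrative):

```python
def run_compiler(binary, source):
    """Run a compiled compiler `binary` on compiler `source`.

    Toy model: every *correct* binary performs the same deterministic
    translation, so the output depends only on the source. Here the
    "translation" is just upper-casing the text.
    """
    return source.upper()

seed_binary = "SEED-COMPILER"                 # strategy 1 or 2: an existing compiler
new_compiler_source = "new compiler v2 source"

stage1 = run_compiler(seed_binary, new_compiler_source)  # built by the seed
stage2 = run_compiler(stage1, new_compiler_source)       # built by itself
stage3 = run_compiler(stage2, new_compiler_source)       # built by itself again

# The consistency check: once the compiler is self-hosting, another
# round of self-compilation must reproduce exactly the same binary.
assert stage2 == stage3
print("bootstrap fixed point:", stage2)
```

In a real bootstrap (GCC does exactly this), the stage 2 / stage 3 comparison is done bit-for-bit on the compiler binaries.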

Let's change terminology quickly: “self hosting” describes a computer program that produces new versions of that same program. This applies to compilers, as we illustrated, but it applies equally well to kernels, programming languages, and revision control systems, like git being maintained in git. There are more, such as text editors; vim is developed with vim, and so on. So, the question is:

“Is this an applicable metaphor for services and the cloud?”

We see the same properties and benefits in services and the cloud that we associate with compilers. At a simple level, Heroku hosts http://www.heroku.com on Heroku. Not very surprising – it would probably be more surprising to find out Heroku ran on Slicehost or something like that! (It does not.) There are a number of motivations, though, for taking self hosting further than this: dogfooding, efficiency, and separation of concerns. Heroku used to be one large Ruby app, and any time some developer screwed something up he could crash the whole system. There were all kinds of hoops that were jumped through to prevent this from happening. The ultimate solution ended up being self hosting. Features used to be added to this large Ruby app; now most features, like Heroku cron, are built as applications that actually run on Heroku itself, not in its codebase where they can cause problems.

Now take this even further, to something more heroic. Heroku has a whole separate database cloud service. This thing is large and a fairly big deal, and the whole thing runs on Heroku itself. Can we keep going with this, and take it even a step further?

Heroku Cloud Architecture

[Image: Noah Zoschke talking about Heroku cloud architecture]

The question is, what else can we self host? Take the compile parts of the architecture and run them on Heroku dynos; basically, new Heroku dynos will be compiled by a compile application running on top of Heroku itself. We want to run a platform that is not just for Sinatra apps or Rails apps etc. – we want a generic computing platform. Running Heroku applications on Heroku helps prove that we have one, or moves us there if we are not already.

Other motivations are effortless scaling, decreased surface area of the architecture, and build/compile symmetry. We want our build servers to look just like our runtime servers. The motivation here is obvious, and running compile on Heroku itself really gets us there. The other, and most important, motivation is to be able to focus on these secure, ephemeral containers, the dynos, and make them as secure and well factored as possible. If our business depends on these containers from top to bottom, we will be forced to make them as sound as possible.

Martin Logan (@martinjlogan) – also, if this kind of cloudy stuff floats your boat, you should check out Camp DevOps Conf in Chicago this October.

September 19, 2011

Glu-ing the Last Mile by Ken Sipe.

This post was blogged real time by @martinjlogan at Strange Loop 2011. Please forgive any errors.

I [Ken Sipe] spent the last year focused on continuous delivery, which is why I am so interested in this product. We will start this talk off with a commercial. You have of course heard of Puppet, and you might have heard of Chef. Now we have glu. I would actually have liked to call this talk Huffing Glu. So where does glu fit in? We need to start with the Agile Manifesto, particularly the principle that our highest priority is to satisfy the customer through early and continuous delivery. We need to not only develop good software but be able to deploy valuable software.

How long does it take you to get one line of code into production? If you had something significant to push into prod, how long would it take you to push that code into production? What does your production night look like? Are you ordering pizza for everyone to handle the midnight-to-3am call? Why do we do this? Because we have not automated. We are engineers and we automate things, but we have not even automated our own backyard. Even with simple rules, though, things can be complex. “Just push this single war out to production.” Well, even really simple things can get really complex in the real world. Anyone that can think can learn to move a pawn, but being a great chess player requires navigating a complex world.

When you look at most companies there are lots of scripts and people running procedures. When you look at LinkedIn, they deploy to thousands of servers every day. Glu is model based; over the last few years I have become totally sold on starting from models. Glu is model based, Gradle is model based, Puppet is model based. Chef is not. Puppet seems to be loved by ops and Chef by developers. I am definitely on the dev side, and I really love glu. Glu is fairly new; it came out in 2009. Outbrain uses glu and, unlike LinkedIn, which always has a human step in deployments even though they are quite automated, pushes code into production in a completely automated fashion.

[Image: statistics on the current usage of the glu project]

Before glu we had manual deployments. In a past life I used to automate production plants, and workers felt I was taking their jobs away. I was like, I don’t know, I am young and just doing my job – I am sure there is something else for you to do, right? The interesting thing is that ops people often feel the same way about DevOps – but there is definitely quite a lot for ops folks to do aside from running tedious processes at 3am.

Glu – big picture. Glu starts with a declarative model and computes the actions to be taken. Glu has three major components: agents, the orchestration engine, and ZooKeeper. (ZooKeeper is not built by the glu project.) All the glu components can be used separately, but in this presentation we will focus on using them all together. There are three concepts to focus on: the static model, scripts, and the live model generated as a combination of the previous two.

[Image: the model for a glu deployment]

ZooKeeper is a distributed coordination service for distributed applications. It is used in glu to maintain the state of the system. Each node in your system needs at least one agent; putting more agents on a node is possible but does not make much sense. The idea is that you have one node managed by an agent, and that agent is unique to a given fabric. Clearly deployment tools have to be written in a dynamic language 😉 – we use Groovy with glu. Agents, at the end of the day, are glu script engines. There is a Groovy API, the command line, and a REST API, all for handling and dealing with glu agents, so you have your pick. The heart of glu is really the orchestration engine itself.

The orchestration engine listens to events coming out of ZooKeeper via its tracker. The events that come out represent the current state of the system, and they are represented in JSON. These events represent the live model.

The static model describes, basically, where to deploy something and how. The static model is compared against the live model by the delta service in the orchestration engine, and a delta is calculated between the static and live models. It then becomes visible to the operator through the visualizer. Green in this visualization means that you have established the exact situation that you wanted in your static model.
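As a rough sketch of what that delta computation involves (the model shape and field names below are invented for illustration; glu's real JSON schema differs), desired entries from the static model are matched against what the live model reports:

```python
# Invented model shape: (node, app) -> expected/observed properties.
static_model = {
    ("node1", "webapp"): {"version": "2.0.0", "state": "running"},
    ("node2", "webapp"): {"version": "2.0.0", "state": "running"},
}

live_model = {
    ("node1", "webapp"): {"version": "2.0.0", "state": "running"},  # matches: green
    ("node2", "webapp"): {"version": "1.9.1", "state": "running"},  # stale: red
}

def compute_delta(static, live):
    """Return the list of differences an operator would see as red."""
    delta = []
    for key, wanted in static.items():
        actual = live.get(key)
        if actual is None:
            delta.append((key, "missing", wanted, None))
        elif actual != wanted:
            delta.append((key, "mismatch", wanted, actual))
    for key, actual in live.items():
        if key not in static:
            delta.append((key, "unexpected", None, actual))
    return delta

for entry in compute_delta(static_model, live_model):
    print(entry)
# An empty delta would mean everything is green: live state matches the model.
```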

[Image: the glu dashboard]

From the delta a deployment plan also gets created. How do we fix red in the visualization? How do we get to the state we indicated in our static model? A deployment plan is created; there are usually a serial plan and a parallel plan, and each has its advantages and disadvantages. Speed is an advantage of the parallel plan, but consistency is potentially sacrificed.
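To make the trade-off concrete, here is a small Python sketch (not glu's API; action names are made up) that runs the same set of fix-up actions serially and then in parallel – the parallel plan finishes sooner, but every node is mid-deployment at the same time:

```python
from concurrent.futures import ThreadPoolExecutor
import time

actions = ["upgrade webapp on node1", "upgrade webapp on node2",
           "upgrade webapp on node3"]

def execute(action):
    time.sleep(0.1)               # stand-in for the real deployment work
    return "done: " + action

# Serial plan: one action at a time. Slower, but at most one node is
# ever mid-deployment, so the system stays more consistent throughout.
start = time.monotonic()
serial_results = [execute(a) for a in actions]
serial_elapsed = time.monotonic() - start

# Parallel plan: all actions at once. Faster, but every node is briefly
# in an inconsistent state at the same time.
start = time.monotonic()
with ThreadPoolExecutor(max_workers=len(actions)) as pool:
    parallel_results = list(pool.map(execute, actions))
parallel_elapsed = time.monotonic() - start

print("serial: %.2fs  parallel: %.2fs" % (serial_elapsed, parallel_elapsed))
assert serial_results == parallel_results        # same end state either way
assert parallel_elapsed < serial_elapsed         # parallel wins on speed
```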

Glu scripts provide the instructions. There are six states: install, configure, start, stop, unconfigure, uninstall. These are mapped out in a Groovy script, and each state has a closure block associated with it. Glu scripts have a bunch of nice variables and services defined for you: a log is there for you, init parameters, and full access to the shell and system environment variables. The glu agent, again, is what handles and manages these scripts; it is basically a compute server for Groovy scripts.
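Real glu scripts are Groovy, with a closure per lifecycle phase; the following Python analogy (class and function names invented) just sketches the idea of the agent walking the phases forward to bring an app up and in reverse order to take it down:

```python
class ToyGluScript:
    """Illustrative stand-in for a glu script -- not glu's actual API."""

    UP = ["install", "configure", "start"]        # forward order
    DOWN = ["stop", "unconfigure", "uninstall"]   # teardown order

    def __init__(self):
        self.log = []   # real scripts get a logger; we just record calls

    def install(self):     self.log.append("install")
    def configure(self):   self.log.append("configure")
    def start(self):       self.log.append("start")
    def stop(self):        self.log.append("stop")
    def unconfigure(self): self.log.append("unconfigure")
    def uninstall(self):   self.log.append("uninstall")

def run_phases(script, phases):
    """The 'agent': invoke each phase callable in order."""
    for name in phases:
        getattr(script, name)()

script = ToyGluScript()
run_phases(script, ToyGluScript.UP)     # bring the app up
run_phases(script, ToyGluScript.DOWN)   # take it back down
print(script.log)
# ['install', 'configure', 'start', 'stop', 'unconfigure', 'uninstall']
```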

[Image: useful things present by default in glu scripts]

In order to test glu scripts we use the glu script test base. Tests are nice and easy to run from within any build system like Gradle (or Maven, if you feel the need for pain).

Glu is very focused on security. You can hook into LDAP, and everything is logged to an audit log.

Some differences between glu and Puppet: they are both model based as well as somewhat declarative – those are the similarities. Puppet is Ruby and glu is Groovy. The big difference, though, is that in glu delta computations are handled on the server side, so you can see deltas across nodes. In the Puppet world the deltas are computed at the agent/node level; in glu it is the orchestration engine and ZooKeeper that keep track of all of this. There are advantages and disadvantages to each approach. Puppet also has better infrastructure support. If you are really nuts you can run Puppet from glu. To me this is nuts though.

Finding glu can be a bit hard. Google seems to find it now in many cases, but the easiest thing to do is go to GitHub and search there. This will probably change over the short term as glu becomes more popular. Here is the GitHub url: https://github.com/linkedin/glu

Also, take a look at the upcoming Camp DevOps Conference – it’s gonna be totally sweet!