Posts tagged ‘devops’

June 4, 2013

Fresh Stats Comparing Traditional IT and DevOps Oriented Productivity

This is a guest post by Krishnan Badrinarayanan (@bkrishz), ZeroTurnaround

The word “DevOps” has been thrown around quite a lot lately. Job boards are awash with requisitions for “DevOps Engineers” with varying descriptions. What is DevOps, really?

In order to better understand what the fuss is all about, we surveyed 620 engineers to examine what they do to keep everything running like clockwork – from day-to-day activities and key processes to the tools they use and the challenges they face. The survey asked for feedback on how much time is spent improving infrastructure and setting up automation for repetitive tasks; how much time is typically spent fighting fires and communicating; and what it takes to keep the lights on. We then compared the responses from traditional IT teams with those from DevOps oriented teams. Here are the results, in time spent each week carrying out key activities:

[Chart: DevOps productivity stats]

Conclusions we can draw from the results

DevOps oriented teams spend slightly more time automating tasks

Writing scripts and automating processes have been part of the Ops playbook for decades. Shell scripts, Python and Perl are often used to automate repetitive configuration tasks, but with newer tools like Chef and Puppet, Ops folk can perform more sophisticated kinds of automation, such as spinning up virtual machines and tailoring them to the app’s needs with Chef recipes or Puppet manifests.

Both Traditional IT and DevOps oriented teams communicate actively

Respondents belonging to a DevOps oriented team spend 2 fewer hours communicating each week, possibly because DevOps fosters better collaboration and keeps Dev and Ops teams in sync with each other. Dev and Ops folk in Traditional IT teams, by contrast, spend over 7 hours each week communicating. This active dialogue helps them better understand challenges, set expectations and triage issues. How much of this communication can be deemed inefficient is subjective, but it is necessary to get both teams on board. Today, shared tooling, instant messaging, task managers and social tools also help bring everyone closer together in real time.

DevOps oriented teams fight fires less frequently

A key tenet of the DevOps methodology is to embrace the possibility of failures and be prepared for them. With alerts, continuous testing, monitoring and feedback loops that expose vulnerabilities and key metrics, teams are able to act quickly and proactively. Programmable infrastructure and automated deployments provide a quick recovery while minimizing user impact.

DevOps oriented teams spend less time on administrative support

This could be a result of better communication, a higher level of automation and the availability of self-service tools and scripts for most support tasks. If there’s a high level of provisioning and automation, there’s no reason why admin support shouldn’t dwindle down to a very small time drain. It could also mean that members of DevOps oriented teams help themselves more often rather than expecting to be supported by the system administrator.

DevOps oriented teams work fewer days after-hours

We asked our survey takers how many days per week they work outside of normal business hours. Here’s what we learned:

Days worked after hours    Traditional IT    DevOps Oriented
Average                    2.3               1.5
Standard Deviation         1.7               1.7

According to these results, DevOps team members lead a more balanced life, spend more time on automation and infrastructure improvement, spend less time fighting fires, and work fewer hours (especially outside of normal business hours).
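To put the after-hours figures in perspective, the gap between the two averages can be expressed in units of the reported standard deviation. The snippet below is only a back-of-the-envelope sketch: the function name is made up for illustration, and because the survey excerpt does not give group sizes, it assumes both groups are weighted equally when pooling the standard deviations.

```python
def standardized_gap(mean_a, mean_b, sd_a, sd_b):
    """Return (raw gap, gap in units of the pooled standard deviation).

    Assumption: both groups are weighted equally, since the group
    sizes are not given in the post.
    """
    gap = mean_a - mean_b
    pooled_sd = ((sd_a ** 2 + sd_b ** 2) / 2.0) ** 0.5
    return gap, gap / pooled_sd


# Figures from the table: Traditional IT vs DevOps Oriented.
gap, effect = standardized_gap(2.3, 1.5, 1.7, 1.7)
print("gap: %.1f days, %.2f standard deviations" % (gap, effect))
# prints "gap: 0.8 days, 0.47 standard deviations"
```

In other words, the 0.8-day difference is a little under half of one standard deviation.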

DevOps-related initiatives came out on top in 2012 and 2013, according to our survey. There’s a strong need for agility to respond to ever-changing and expanding market needs. Software teams are under pressure to help meet those needs, and the chart above supports the benefits of a DevOps approach.

Rosy Stats, but hard to adopt

How we got here

IT Organizational structures – typically Dev, QA, and Ops – have come to exist for a reason. The dev team focuses on innovating and creating apps. The QA team ensures that the app behaves as intended. The operations team keeps the infrastructure running – from the apps, network, servers, shared resources to third party services. Each team requires a special set of skills in order to deliver a superior experience in a timely manner.

The challenge

Today’s users increasingly rely on software and expect it to meet their constantly evolving needs 24/7, whether they’re at their desks or on their mobile devices. As a result, IT teams need to respond to change and release app updates quickly and efficiently without compromising on quality. Fail to do so, and they risk driving users to competitors or other alternatives.

However, releasing apps quickly comes with its own drawbacks. It strains functionally siloed teams and often results in software defects, delays and stress. Infrequent communication across teams further exacerbates the issue, leading to a snowball effect of finger-pointing and bad vibes.

Spurring cultural change

Both Dev and Ops teams bring a unique set of skills and experience to software development and delivery. DevOps is simply a culture that brings development and operations teams together so that, through understanding each other’s perspectives and concerns, they can build and deliver resilient, production-ready software products in a timely manner. DevOps is not NoOps. Nor is it akin to putting a Dev in Ops clothing. DevOps is synergistic, rather than cannibalistic.

DevOps is a journey

Instilling a DevOps oriented culture within your organization is not something that you embark on and chalk up as a success at the end. Adopting DevOps takes discipline and initiative to bring development and operations teams together. Read up on how other organizations approach adopting DevOps as a culture and learn from their successes and failures. Put into practice what makes sense within your group. Develop a maturity model that can guide you through your journey.

The goal is to make sure that dev and ops are on the same page, working together on everything, toward a common goal: continuous delivery of working software without handoffs, hand-washing, or finger-pointing.

Support the community and the cause

Dev and Ops need to look introspectively to understand their strengths and challenges, and look for ways to contribute towards breaking down silos. Together, they should seek to educate each other, culturally evolve roles, relationships, incentives, and processes and put end user experience first.

The DevOps community is small but burgeoning, and it’s easy to find ways to get involved, like with the community-driven explosion of DevOpsDays conferences that occur around the world.

Set small goals to be awesome

Teams should collaborate to set achievable goals and milestones that can get them on the path to embracing a DevOps culture. Celebrate small successes and focus on continuous improvement. Before you know it, you will surely but gradually reap the benefits of bringing in a DevOps approach to application development and delivery.

Start here

For deeper insights into IT Ops and DevOps Productivity with a focus on people, methodologies and tools, download a 35-page report filled with stats and charts.

October 13, 2012

QCon SF 2012, a DevOps field guide

by @mattokeefe


I am very excited to be attending QCon SF for the first time Nov 7-9. There is quite a bit of DevOps related content, and I will be live-blogging as much as possible. Meanwhile, here are some notes on what you might look forward to as an attendee.

Monday 11/5

Tutorial: Continuous Delivery – Jez Humble
Jez wrote the book on Continuous Delivery, which ties Agile and DevOps together into a pipeline of goodness. I saw him present at Camp DevOps last year, and it was awesome.

Tuesday 11/6

Tutorial: Implementing a Continuous Delivery Pipeline: From Commit to Deploy – John Esser & Dan Gilmer
More Continuous Delivery goodness.

Wednesday 11/7

Opening Keynote: Cool & Useless – Kevlin Henney
When I think of DevOps, I often think of addressing concerns around availability, reliability, and performance. There are so many variables with each solution to a given technology problem that it is wise to limit your toolkit just so you can get a handle on these concerns. So, I hope that this talk addresses the “cool & harmful” aspect as well in the sense that the more things you try, the more you get burned in production.

The realtime web: HTML5 WebSockets, Engine.IO, Socket.IO, SPDY, HTTP2.0 & Beyond – Guillermo Rauch
The realtime web promises a great leap forward in terms of UX. However I wonder how many Ops teams are prepared for some of these new standards and the impacts on infrastructure. For example, SPDY support is not yet provided in some infrastructure layers.

AppWatch – a big data application monitoring system for eBay – Bhaven Avalani and Yuri Finklestein
Large scale application and infrastructure monitoring… enough said.

How not to measure latency – Gil Tene
Measure Everything is a core tenet of DevOps. I always thought measuring latency was as simple as a pair of well-placed calls to System.currentTimeMillis (in Java), but this session’s agenda suggests otherwise. Gil will demonstrate and discuss some false assumptions and measurement techniques that lead to incorrect results.
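For reference, here is what the "pair of well-placed calls" approach looks like, written in Python rather than Java, along with one commonly cited way that such measurements can mislead: a mean that hides a slow outlier. This is only an illustrative sketch; the function names are invented here, and it is not taken from the session, whose actual content may differ.

```python
import statistics
import time


def timed_call(fn):
    """Time a single call with a pair of clock reads (the naive approach)."""
    start = time.perf_counter()  # monotonic clock, unaffected by wall-clock changes
    result = fn()
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return result, elapsed_ms


def summarize(latencies_ms):
    """Report the mean next to nearest-rank percentiles and the maximum."""
    ordered = sorted(latencies_ms)

    def percentile(p):
        # Nearest-rank percentile using integer ceiling division.
        rank = max(1, -(-p * len(ordered) // 100))
        return ordered[rank - 1]

    return {
        "mean": statistics.mean(ordered),
        "p50": percentile(50),
        "p99": percentile(99),
        "max": ordered[-1],
    }


# 99 fast requests and one very slow one (made-up numbers).
samples = [1.0] * 99 + [1000.0]
print(summarize(samples))
```

With these made-up samples the mean comes out near 11 ms even though the typical request took 1 ms and one request took a full second, which is the kind of detail a single average can obscure.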

Continuous Happiness – Chris Kelly
This talk should help to drive home the point that DevOps is a cultural movement, not a specific set of tools and processes.

Caching Hypermedia APIs – Tim Stokes
Caching is one of the best ways to improve performance and scalability. However it is easy to get it wrong so I am always trying to learn more about this topic.

Thursday 11/8 featuring an entire Continuous Delivery track

Keynote: NoSQL: Past, Present, Future – Eric Brewer
Eric Brewer authored the CAP Theorem, which is frequently referenced in discussions of design decisions related to NoSQL databases.

Changing Culture & Being a Force for Awesome – Jesse Robbins, Master of Disaster
Jesse is the cofounder of Opscode, the company behind Chef. He will be talking about hacking culture.

Open Space: Continuous Delivery
An unconference session, where we decide the discussion topics.

Product Development with Continuous Experimentation – Frank Harris and Nell Thomas
Etsy is a pioneer of continuous deployment, and this talk will be about how they take advantage of data to drive that process.

Large-Scale Continuous Testing in the Cloud – John Penix
John will talk about how Google runs millions of automated tests per day, using Cloud infrastructure.

Release Engineering at Facebook – Chuck Rossi
Chuck, Facebook’s first release engineer, will describe how they release hundreds of changes every day.

Adopting Continuous Delivery – Jez Humble
Jez will address the organizational, architectural and process factors that are important for adoption of Continuous Delivery.

Friday 11/9 featuring the “Architectures you’ve always wondered about” track

Keynote: Race Conditions, Distribution, Interactions–Testing the Hard Stuff and Staying Sane – John Hughes
This talk will be about new automated testing techniques for the most tricky test scenarios.

Scaling Pinterest – Marty Weiner and Yashwanth Nelapati
A Cloud Ninja and a Cloud Balrog will discuss server management on EC2, amongst other things.

Cloud Computing at Google – Randy Shoup
Randy will present design principles for building and maintaining highly-available planet-scale applications in the cloud, including isolation, failure tolerance, testability, and security.

Architecting for Continuous Delivery at – John Esser and Russell Barnett
This talk will describe how a service-oriented architecture can support Continuous Delivery.

Uncommon Sense – Scaling Youtube – Mike Solomon
Mike, one of the original YouTube engineers, will outline his philosophy on scaling, testing, and writing code.

Timelines at Scale – Raffi Krikorian
Raffi, who likes to break things at Twitter, will talk about building, managing, and debugging an infrastructure that supports hundreds of millions of users around the world.

San Francisco
photo: Håkan Dahlström

Besides the conference, I’m really looking forward to being in San Francisco. So many great restaurants, so little time!

September 30, 2012

Automating Cloud Applications using Open Source at BrightTag

This guest post is based on a presentation given by @mattkemp, @chicagobuss, and @codyaray at CloudConnect Chicago 2012

As a fast-growing tech company in a highly dynamic industry, BrightTag has made a concerted effort to stay true to our development philosophy. This includes fully embracing open source tools, designing for scale from the outset and maintaining an obsessive focus on performance and code quality (read our full Code to Code By for more on this topic).

Our recent CloudConnect presentation, Automating Cloud Applications Using Open Source, highlights much of what we learned in building BrightTag ONE, an integration platform that makes data collection and distribution easier.  Understanding many of you are also building large, distributed systems, we wanted to share some of what we’ve learned so you, too, can more easily automate your life in the cloud.


BrightTag utilizes cloud providers to meet the elastic demands of our clients. We also make use of many off-the-shelf open source components in our system including Cassandra, HAProxy and Redis. However, while each component or tool is designed to solve a specific pain point, gaps exist when it comes to a holistic approach to managing the cloud-based software lifecycle. The six major categories below explain how we addressed common challenges that we faced and it’s our hope that these experiences help other growing companies grow fast too.

Service Oriented Architecture

Cloud-based architecture can greatly improve scalability and reliability. At BrightTag, we use a service oriented architecture to take advantage of the cloud’s elasticity. By breaking a monolithic application into simpler reusable components that can communicate, we achieve horizontal scalability, improve redundancy, and increase system stability by designing for failure. Load balancers and virtual IP addresses tie the services together, enabling easy elasticity of individual components; and because all services are over HTTP, we’re able to use standard tools such as load balancer health checks without extra effort.
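As a rough illustration of why HTTP-only services make health checks cheap, the sketch below exposes a health resource and probes it the way a load balancer might. It uses only the Python standard library; the `/health` path, the JSON body and the function names are assumptions made for this example, not BrightTag's actual conventions.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class HealthHandler(BaseHTTPRequestHandler):
    """Minimal service that answers a health check on /health (hypothetical path)."""

    def do_GET(self):
        if self.path == "/health":
            body = json.dumps({"status": "ok"}).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, fmt, *args):
        pass  # keep the example quiet


def start_service(port=0):
    """Start the service on a background thread; port 0 picks a free port."""
    server = HTTPServer(("127.0.0.1", port), HealthHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def is_healthy(host, port, path="/health", timeout=2.0):
    """Probe a service the way a load balancer would: HTTP 200 means healthy."""
    url = "http://%s:%d%s" % (host, port, path)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:  # covers connection errors and non-2xx HTTP errors
        return False
```

Because every service speaks the same protocol, the same probe works for all of them; only the host, port and path change.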

Inter-Region Communication

Most web services require some data to be available in all regions, but traditional relational databases don’t handle partitioning well. BrightTag uses Cassandra for eventually consistent cross-region data replication. Cassandra handles all the communication details and provides a linearly scalable distributed database with no single point of failure.

In other cases, a message-oriented architecture is more fitting, so we designed a cross-region messaging system called Hiveway that connects message queues across regions by sending compressed messages over secure HTTP. Hiveway provides a standard RESTful interface to more traditional message queues like RabbitMQ or Redis, allowing greater interoperability and cross-region communication.
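The idea of shipping compressed messages over HTTP can be sketched in a few lines. Hiveway's real wire format and API are not described here, so everything below (the function names, the JSON envelope, the header choices) is hypothetical and only shows the general shape of the technique.

```python
import gzip
import json


def encode_message(queue, payload):
    """Wrap a payload for a named queue and gzip it for transport.

    Returns (headers, body), suitable for an HTTPS POST to a remote region.
    The envelope format is an assumption made for this sketch.
    """
    raw = json.dumps({"queue": queue, "payload": payload}).encode("utf-8")
    headers = {"Content-Type": "application/json", "Content-Encoding": "gzip"}
    return headers, gzip.compress(raw)


def decode_message(headers, body):
    """Reverse encode_message on the receiving side."""
    if headers.get("Content-Encoding") == "gzip":
        body = gzip.decompress(body)
    message = json.loads(body.decode("utf-8"))
    return message["queue"], message["payload"]


headers, body = encode_message("events", {"user": 42, "action": "click"})
print(decode_message(headers, body))
```

The receiving region would decode the body and hand the payload to its local queue, such as RabbitMQ or Redis.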

Zero Downtime Builds

Whether you run a website or a SaaS system, uptime is critical to the bottom line. To achieve 99.995% uptime, BrightTag uses a combination of Puppet, Fabric and bash to perform zero downtime builds. Puppet provides a rock-solid foundation for our systems. We then use Fabric to push out changes on demand. We use a combination of HAProxy and built-in health checks to make sure that our services are always available.
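The exact build steps are not spelled out here, but one common way to get zero downtime from a load balancer plus health checks is a rolling update: take one host out of rotation, update it, wait for it to pass its health check, then put it back. The sketch below shows that pattern in plain Python. The callables passed in are placeholders for whatever Fabric tasks or scripts would do the real work, and none of the names come from BrightTag's tooling.

```python
import time


def rolling_deploy(hosts, disable, deploy, is_healthy, enable,
                   max_checks=5, delay_s=1.0):
    """Update hosts one at a time so the remaining hosts keep serving traffic.

    disable/deploy/is_healthy/enable are caller-supplied callables that take
    a host name (placeholders in this sketch). Returns the hosts updated.
    Raises RuntimeError and stops the rollout if a host never becomes healthy.
    """
    updated = []
    for host in hosts:
        disable(host)  # remove from load balancer rotation
        deploy(host)   # push the new build
        for _ in range(max_checks):
            if is_healthy(host):
                break
            time.sleep(delay_s)
        else:
            # Leave the host out of rotation and halt before touching others.
            raise RuntimeError("%s failed health checks; halting rollout" % host)
        enable(host)   # back into rotation
        updated.append(host)
    return updated
```

Halting on the first unhealthy host keeps a bad build from spreading beyond a single machine.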

Network Connectivity

Whether you use a dedicated DNS server or /etc/hosts files, to keep a flexible environment functioning properly, you need to update your records. This includes knowing where your instances are on a regular and automatic basis. To accomplish this, we use a tool called Zerg, a Flask web app that leverages libcloud to abstract away the specific cloud provider API from the common operations we need to do regularly in all our environments.
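To make the record-keeping concrete, here is a small sketch of turning a list of instances into /etc/hosts entries. It does not use Zerg or libcloud; the instance dictionaries, the `example.internal` domain and the function name are assumptions standing in for whatever the cloud API actually returns.

```python
def render_hosts(instances, domain="example.internal"):
    """Render /etc/hosts content from instance records.

    Each instance is a dict with "name" and "private_ip" keys (a made-up
    shape for this sketch). Output is sorted by name so repeated runs
    produce identical files when nothing has changed.
    """
    lines = ["127.0.0.1 localhost"]
    for inst in sorted(instances, key=lambda i: i["name"]):
        lines.append("%s %s.%s %s" % (
            inst["private_ip"], inst["name"], domain, inst["name"]))
    return "\n".join(lines) + "\n"


instances = [
    {"name": "web2", "private_ip": "10.0.0.12"},
    {"name": "web1", "private_ip": "10.0.0.11"},
]
print(render_hosts(instances), end="")
```

Running something like this on a schedule, fed by the cloud provider's instance list, keeps name resolution in step with instances coming and going.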

HAProxy Config Generation

Zerg allows us to do more than just generate lists of instances with their IP addresses.  We can also abstractly define our services in terms of their ports and health check resource URLs, giving us the power to build entire load balancer configurations filled in with dynamic information from the cloud API where instances are available.  We use this plus some carefully designed workflow patterns with Puppet and git to manage load balancer configuration in a semi-automated way. This approach maximizes safety while maintaining an easy process for scaling our services independently – regardless of the hosting provider.
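A minimal sketch of the idea follows: a service defined abstractly by its port and health check URL, combined with a list of live instances, is enough to render an HAProxy backend section. The data shapes and function name are hypothetical and Zerg's real templates are not shown here; the emitted directives (`backend`, `option httpchk`, `server ... check`) are standard HAProxy configuration keywords.

```python
def render_backend(service, instances):
    """Render one HAProxy backend section.

    service:   dict with "name", "port" and "health_path" (assumed shape).
    instances: dicts with "name" and "private_ip", e.g. from a cloud API.
    """
    lines = [
        "backend %s" % service["name"],
        "    option httpchk GET %s" % service["health_path"],
    ]
    for inst in sorted(instances, key=lambda i: i["name"]):
        # "check" tells HAProxy to health-check this server.
        lines.append("    server %s %s:%d check" % (
            inst["name"], inst["private_ip"], service["port"]))
    return "\n".join(lines) + "\n"


service = {"name": "api", "port": 8080, "health_path": "/health"}
instances = [
    {"name": "api2", "private_ip": "10.0.1.12"},
    {"name": "api1", "private_ip": "10.0.1.11"},
]
print(render_backend(service, instances), end="")
```

Committing the generated file to git before Puppet distributes it gives a reviewable diff for every scaling change, which fits the semi-automated workflow described above.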


Monitoring

Application and OS level monitoring is important to gain an understanding of your system. At BrightTag, we collect and store metrics in Graphite on a per-region basis. We also expose a metrics service per-region that can perform aggregation and rollup. On top of this, we utilize dashboards to provide visibility across all regions. Finally, in addition to visualizations of metrics, we use open source tools such as Nagios and Tattle to provide alerting on metrics we’ve identified as key signals.
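For readers new to Graphite, feeding it a metric is simple: its Carbon listener accepts plaintext lines of the form "path value timestamp", by default on TCP port 2003. The sketch below formats and sends such lines. The metric path, with the region as its first component, and the function names are assumptions for illustration rather than BrightTag's actual naming scheme.

```python
import socket
import time


def format_metric(path, value, timestamp=None):
    """Format one metric in Graphite's plaintext protocol."""
    if timestamp is None:
        timestamp = time.time()
    return "%s %s %d\n" % (path, value, int(timestamp))


def send_metrics(lines, host, port=2003):
    """Send formatted metric lines to a Carbon plaintext listener."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall("".join(lines).encode("ascii"))


# Hypothetical per-region metric path: <region>.<host>.<metric>
line = format_metric("us-east.web1.requests", 42, 1349000000)
print(line, end="")
# prints "us-east.web1.requests 42 1349000000"
```

Putting the region first in the path is one way to keep per-region storage and cross-region dashboards easy to query, though other layouts work as well.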

There is obviously a lot more to discuss when it comes to how we automate our life in the cloud at BrightTag. We plan to post more updates in the near future to share what we’ve learned in the hopes that it will help save you time and headaches living in the cloud. In the meantime, check out our slides from CloudConnect 2012.

July 2, 2012

The Eight Hats of Data Visualization by Andy Kirk

What gets measured gets managed. Sometimes, however, it is difficult to measure because, well, at web scale there is just too much going on. This is where data visualization can help. In this Orbitz IDEAS talk, Andy Kirk presents some powerful techniques for thinking about data in terms of how it should be visualized. Don’t forget to watch the Q&A at the end. It is quite informative.

Andy Kirk presents “The 8 Hats of Data Visualization Design” from Orbitz IDEAS on Vimeo.

The nature of data visualization as a truly multi-disciplinary subject introduces many challenges. You might be a creative but how are your analytical skills? Good at closing out a design but how about the initial research and data sourcing? In this talk Andy Kirk will discuss the many different ‘hats’ a visualization designer needs to wear in order to effectively deliver against these demands. It will also contextualize these duties in the sense of a data visualization project timeline. Whether a single person will fulfill these roles, or a team collaboration will be set up to cover all bases, this presentation will help you understand the requirements of any visualization problem context.

Speaker: Andy Kirk is a freelance data visualization design consultant and trainer, and editor of the website, a popular data visualization blog. After graduating from Lancaster University with a B.Sc (hons) in Operational Research, he held a number of business analysis and information management positions at some of the largest organizations in the UK. Late 2006 provided Andy with a career-changing ‘eureka’ moment when he discovered the subject of data visualization and he has subsequently passionately pursued an expertise in the subject, completing a research Masters M.A (With Distinction) at the University of Leeds along the way. In February 2010 he launched the blog with the mission of providing readers with inspiring insights into the contemporary techniques, resources, applications and best practices in this exciting subject. His consultancy work and training courses extend this ambition, helping organizations of all shapes, sizes and domains enhance the analysis and communication of their data to maximize impact. Andy is currently working on his first book, with more to follow, and has been seen speaking at a number of important conference events, most notably as judge and presenter at Malofiej 20, the 20th anniversary of the Infographics World Summit in Pamplona, Spain.

June 7, 2011

Orbitz IDEAS Video: Teyo Tyree on Model Driven Management with Puppet

posted by @martinjlogan

Teyo Tyree, one of the founders of Puppet Labs, talks about model driven configuration management with Puppet. To be honest, I was really impressed by Teyo and the whole Puppet team, and I really appreciate their rigorous sysadmin culture. They seem to be very focused on the practical issues at hand and less interested in keeping up with the latest marketing buzzword of the day.

Teyo Tyree on: Model Driven Management with Puppet from Orbitz IDEAS on Vimeo.

During this video you will learn how Puppet works and what drives its architecture. You will get an understanding of how the model driven approach factors into Puppet. You will also learn how to leverage this in extending Puppet configuration management and integrating it with other systems.