Category canonical

The question method

Tonight I have been reading Improv Wisdom: Don’t Prepare, Just Show Up by Patricia Ryan Madson.

I first signed up for public speaking classes when I was about 7 or 8 years old, and have been continuously involved in some form of public speaking ever since. I have never encountered such a simple and brilliant tip for preparing your notes in a way that allows you to speak in a conversational, natural, improvised way while also ensuring that you don’t leave out vital information. After all, isn’t scripted the opposite of improvised?

The idea is this: write your notes as a series of short questions that you must answer while giving your speech or presentation. Make the questions as big or small as you need, and after you finish answering one question, you just go on to the next! Absolutely brilliant.

micro clouds

For a long time I’ve dreamed of a development environment that could easily spin up and down multiple lightweight containers wrapped around different service components. One way of doing this is with a tool called Vagrant [1], which lets me specify a base VM image for VirtualBox, runs Chef or Puppet recipes to configure the VM to match my webapp requirements, automatically mounts my project source directory so it is accessible inside the VM, and handles port mapping and other magic. It’s amazing, but a little too heavyweight, and it doesn’t scale to the multiple containers needed to match production. You can see a toy project I did using Vagrant [3]; it’s pretty neat to see the whole environment build up from nothing.

The latest release of OpenStack (shipped in Ubuntu 11.04) supports a very lightweight and fast container/virtualization technology called LXC [2]. If you are familiar with FreeBSD jails or Solaris containers, LXC has many similarities. This means that it is practical to run a dozen or more OpenStack VMs backed by LXC on a typical laptop. Another cool feature is that you can run LXC containers inside EC2 machines – in fact this is how cucumber-chef does acceptance testing of server configuration recipes [4].

I wonder if the default Launchpad.net developer setup could be changed to run in LXC managed by OpenStack? I imagine there would be many steps along the way, and perhaps the first one would be to use a single container, splitting out more containers as the Launchpad services rearchitecture moves along. We’d also need a HOWTO for installing and configuring OpenStack on Ubuntu 11.04 and grabbing a suitable Ubuntu 10.04.2 base image (to best match the Launchpad production servers). I am anxious for the day when I can decide to work on a Launchpad feature without needing to trash my local SSH, Apache, and Postgres configs – my involvement in projects is so infrequent and wide-ranging that I can’t really afford to have a dedicated Launchpad dev machine these days. Basically I want the default Launchpad dev setup to be something that lets me run a script on a brand new Ubuntu laptop or a fresh EC2 instance and, after a few minutes, get the entire environment configured and running in containers, and that also lets me do things like swap out service implementations by changing container configuration.

OpenStack is making heavy use of Launchpad, is a rising star in data centers around the world and is under active development, so it seems the ideal time to start using OpenStack for micro cloud development environments. What do you think?

[1] http://vagrantup.com/
[2] http://www.openstack.org/blog/2011/04/openstack-announces-cactus-release/
[3] https://code.launchpad.net/~statik/+junk/gpg-val
[4] https://github.com/Atalanta/cucumber-chef

The great liability of the engineer

A passage from To Engineer is Human – The Role of Failure in Successful Design

The great liability of the engineer compared to men of other professions is that his works are out in the open where all can see them. His acts, step by step, are in hard substance. He cannot bury his mistakes in the grave like the doctors. He cannot argue them into thin air or blame the judge like the lawyers. He cannot, like the architects, cover his failures with trees and vines. He cannot, like the politicians, screen his shortcomings by blaming his opponents and hope that the people will forget. The engineer simply cannot deny that he did it. If his works do not work, he is damned. That is the phantasmagoria that haunts his nights and dogs his days. He comes from the job at the end of the day resolved to calculate it again. He wakes in the night in a cold sweat and puts something on paper that looks silly in the morning. All day he shivers at the thought of the bugs which will inevitably appear to jolt its smooth consummation.

This quote is attributed to Herbert Hoover, speaking about being a mining engineer before entering politics. DevOps is full of those cold sweats: you fall asleep at night with the cold fingers of doom wrapped around your throat, whispering about Mean Time To Recovery.

Lazy test loading to deal with conflicting django settings

At work I have a bunch (ok, 3) of different Django projects in the same big code tree. Yes, I know we should split them up, thanks for pointing that out. Anyway, we are running Python unit tests using the trial test runner from Twisted, because it’s very nice and we also have some Twisted servers in this same code tree.

I have a problem with Django settings. There are some conflicting settings in the settings files used by the different Django servers. The solution seems easy – run tests for each Django server in a separate subprocess. The excellent subunit library should do just the trick: it even has IsolatedTestSuite and IsolatedTestCase classes that take care of forking and running in a separate process.

Except this doesn’t work, because when Python modules are imported for test discovery, they also indirectly end up importing django.conf.settings, and when the IsolatedTestSuite forks to run tests in a separate subprocess, that subprocess inherits the already polluted Python environment that has the (sometimes wrong) Django settings imported already.

I am convinced that this must be solvable, but have been banging my head against it for a while and don’t understand unittest discovery well enough to solve it. I’ve created a self-contained little example that demonstrates the problem in isolation here: https://code.edge.launchpad.net/~statik/+junk/subunit-demo/

I will gladly endure your taunts if you teach me a solution.
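
One direction that might sidestep the inherited state, sketched here under assumptions rather than offered as a tested fix: skip forking entirely and start a brand-new interpreter per project, so that nothing imported during discovery in the parent can leak into the child. The settings module names and the child command below are hypothetical stand-ins; the child only reports which settings module it was started with.

```python
import os
import subprocess
import sys


def run_in_fresh_interpreter(settings_module, code):
    """Run `code` in a new Python process with its own settings module.

    The child is a fresh interpreter, not a fork, so it starts with an
    empty sys.modules and cannot inherit settings imported by the parent.
    """
    env = dict(os.environ, DJANGO_SETTINGS_MODULE=settings_module)
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stdout.strip()


# Stand-in for "discover and run this project's tests": the child simply
# prints the settings module it sees in its environment.
CHILD_CODE = "import os; print(os.environ['DJANGO_SETTINGS_MODULE'])"

# Hypothetical settings modules, one per Django project in the tree.
PROJECTS = ["webui.settings", "api.settings", "admin.settings"]


def run_all(projects=PROJECTS):
    """Run every project in its own interpreter and collect the results."""
    return {name: run_in_fresh_interpreter(name, CHILD_CODE) for name in projects}


if __name__ == "__main__":
    for name, (code, output) in run_all().items():
        print(name, code, output)
```

In a real tree the child command would be the test runner invocation for that one project, and the important part is that the parent never imports any project code itself. This gives up the in-process result streaming that subunit provides, so it is a trade-off rather than a drop-in replacement.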

Teambuilding and culture in a distributed workplace

Matt tells me that company culture isn’t the imaginary thing you are trying to create, it’s what you actually end up with based on a million little things that happened in real life. You can’t fake it, you can’t wish it better, you have to accept the reality and look at the way things ended up despite what the policy said.

That’s not an argument against trying to change things, but it means that none of us individually can completely control the end result, we only get to contribute little pieces. As a manager who feels responsible for doing everything I can to contribute to a healthy, positive, sustainable work environment and an employee who wants to work in such a place, one of the most amazing feelings is seeing other people take up the challenge, and make little gestures which pull everyone together in feeling like a team, like we are interconnected. You know, that whole Ubuntu thing.

I want to tell you a true story. For Ubuntu One, the server development team that writes the code is a completely separate department from the operational sysadmin team that deploys the code. This sysadmin team also services Launchpad and all the other websites that Canonical operates. There are some natural tensions between stability and feature development, we have different managers, the sysadmin team is much smaller and spread around the world in terms of timezones, the Ubuntu release schedule deadlines are carved in wiki stone, and we all want to get rid of downtime on the site. All of these things are forces which would tend to drive the teams apart, rather than together. We talk about devops ideas, but that ideal doesn’t change the reality of people working together under pressure with somewhat conflicting goals. We call this particular group of sysadmins “LOSA”. We do much of our real time collaboration over IRC, and the sysadmins are usually idling in a number of IRC channels, one per product team that they support. Because we’re not always sure who will be on duty at a given time, we ended up with a convention of saying “losa ping” on the IRC channel when we needed something done: kicking off a planned code update, running an ad-hoc test query against the staging or production DB, etc.

losa ping love online services

Losa Ping

After a particularly exhausting few months where it felt like we were saying losa ping every 30 minutes, the losa team went off duty for a week to have an in person meetup, and do some of their planning and teambuilding out of reach of the daily barrage of phone and IRC interruptions. Philip had an idea for a joke, and arranged for a cake to be secretly delivered.

As you can imagine, it was a hit, and everyone felt a little closer together. But the story doesn’t end there.

Losa pong

Months have gone by and we all forgot about the cake. Last week the Ubuntu One team gathered for an in person meeting. We were just as remote, the sysadmin team wasn’t there, we were sequestered away from laptops and IRC. The last day we had our afternoon break for coffee, and were amazed to see another cake! Later Michael confessed, and everyone had this silly grin on their face that just wouldn’t go away.

Two departments spread over several countries and dozens of different cities. Different backgrounds, different daily pressures, different opinions on the right way to do things. And yet, two cakes over several months make everyone feel connected. The cake is a symptom of the culture. You can’t prescribe it, you can’t control it, but you can contribute to it and you sure can enjoy it. What a fantastic crew.

A few words about Ubuntu One servers

I promised Matt Griffin I would talk a bit about Ubuntu One servers and some of the work that we’ve done in order to keep up with all the new users that have signed up over the last 6 months since Ubuntu 9.10 came out, and preparations for the growth that we expect with the launch of the music store and phone sync features in Ubuntu 10.04. I’ll start by writing up some descriptions about the different moving parts that make up the server side of Ubuntu One.

Ubuntu One has many parts. All the parts on the client side are free software, and about half the parts on the server side are free software. There are two major components that are currently closed source – the Django web servers that implement the web interface for https://one.ubuntu.com and the Twisted servers that implement the server side of the file syncing protocol (https://launchpad.net/ubuntuone-storage-protocol). The Django web servers include some code related to integration with the music store partner that we are contractually not allowed to release. They also include some code that we’ve been pleased to be able to factor into libraries and release on their own (such as wsgi-oops and desktopcouch).

Aside from the file syncing protocol, the other two major channels to Ubuntu One services are the SyncML and CouchDB protocols. SyncML is used to support syncing of contacts from mobile phones, and that server code is open source (http://funambol.com/). The CouchDB replication protocol is used to support replication of bookmarks, Gwibber messages, Tomboy notes (sort of), Evolution contacts, and just about any other application that cares to integrate with desktopcouch. If you are an app developer, the quickly project and the desktopcouch library have some really cool recipes for easily cloud-enabling your application. All the CouchDB server side code is open source as well (http://couchdb.apache.org).

Out of all the stuff in Ubuntu One that I find interesting, I’m most proud of the way we are using CouchDB, because this technology does so much to preserve user autonomy over their data while also providing the convenience of replicating data through what could be called a personal cloud. If Ubuntu One goes away forever, all the data you have in CouchDB continues to work just fine, and all the applications integrated with desktopcouch continue to work just fine – you could even easily set up a separate CouchDB cloud and point all your machines to replicate to it instead of the Ubuntu One servers. For people who don’t feel like setting all that up, the apps will work out of the box with an optional Ubuntu One account. This ability for application developers to make use of a local data store that can automatically replicate if the users decide to enable Ubuntu One is something that I am convinced has huge potential for making users’ lives better without making them totally locked into Ubuntu One or any other service provider.
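
To make "point all your machines to replicate to it" a bit more concrete, here is a minimal sketch of the request CouchDB expects on its `_replicate` endpoint. The local port, the database name, and the target host are assumptions made up for illustration (desktopcouch does not necessarily listen on the default port), and the sketch only builds the request without sending it.

```python
import json
import urllib.request


def build_replication_request(local_couch, source_db, target_url, continuous=True):
    """Build (but do not send) a POST to CouchDB's /_replicate endpoint."""
    body = {"source": source_db, "target": target_url, "continuous": continuous}
    return urllib.request.Request(
        url=local_couch.rstrip("/") + "/_replicate",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical values: a local CouchDB on the default port, replicating a
# "contacts" database to a self-hosted server instead of Ubuntu One.
request = build_replication_request(
    "http://localhost:5984",
    "contacts",
    "https://couch.example.org/contacts",
)

# urllib.request.urlopen(request) would actually start the replication.
```

Swapping the target URL is all it takes to replicate somewhere else, which is the point about not being locked in.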

Finally, we have many somewhat boring servers running the standard things you run on any moderate-to-large web application: apache2, RabbitMQ, postgresql, squid, memcached, haproxy, iptables, nagios, etc. We’ve gone to some lengths to make sure that there are redundant paths to access the webserver farm, even though an average page load may touch apache->squid->haproxy->django->memcached->postgresql. For every server that we run, we try to have several smaller servers running rather than a single big server, so that we can scale horizontally if at all possible and do upgrades without taking the entire service out. And ‘service’ is not a very good description, since we can update phone sync servers without taking down file sync, bookmarks, the music store, and Tomboy notes. We have not yet split across multiple data centers, but are drawing up plans so we will be ready when the time comes.

There are two big changes that we’ve made on the back end that are not very visible to users but are still important. As of today, the biggest database we have is the one that helps keep track of the files you have stored in Ubuntu One, and that is now split across multiple ‘shards’, meaning that the data is partitioned so that even if a database goes down only some of the users are affected, not all users. This also lets us decrease our MTTR, or mean time to recovery, as well as improving performance of both the web site and the desktop file syncing client. We’re also putting the finishing touches on partitioning the CouchDB system, which has many, many small CouchDB databases replicated from each user’s desktop. Partitioning or sharding here accomplishes the same goals – don’t allow the whole service to go down even if a server fails, make backups easier and faster, improve performance by scaling horizontally.
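
As an illustration of the idea only (this is not a description of how the Ubuntu One databases are actually partitioned), a per-user shard lookup can be as small as the sketch below. The shard names and the hash-modulo scheme are assumptions chosen for brevity.

```python
import hashlib

# Hypothetical shard names; the real number and naming of shards is not
# stated anywhere in this post.
SHARDS = ["shard0", "shard1", "shard2", "shard3"]


def shard_for_user(user_id, shards=SHARDS):
    """Deterministically map a user id to exactly one shard."""
    digest = hashlib.sha1(str(user_id).encode("utf-8")).hexdigest()
    return shards[int(digest, 16) % len(shards)]


def affected_users(user_ids, failed_shard, shards=SHARDS):
    """Return only the users whose data lives on the failed shard."""
    return [u for u in user_ids if shard_for_user(u, shards) == failed_shard]


if __name__ == "__main__":
    users = list(range(20))
    print(affected_users(users, "shard0"))
```

A plain modulo like this forces data to move whenever the number of shards changes, so a production system would more likely keep an explicit user-to-shard lookup table. The failure-isolation property is the same either way: losing one shard touches only the users mapped to it.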

Another change that went live today is the new ‘dashboard’. If you have an Ubuntu One account and log in to the website, rather than immediately being directed to the files view, you are now shown a dashboard that provides more of an overview of what you have stored in Ubuntu One. Hopefully this new dashboard is more informative; it is also significantly ‘lighter’ and cheaper to render than the entire files view. Here is a screenshot:

We are continuing to develop features in public and might have a few more surprises coming before the Ubuntu 10.04 launch. If you want to try out the very latest code, we deploy new versions every hour to http://edge.one.ubuntu.com, and are always interested in feedback on new features that you see there.

I hope this was useful – if anyone has questions about Ubuntu One, I’ll do my best to answer in the comments or perhaps write a new blog post if lots of people want to know about the same thing.

Managing a widely distributed engineering team – are weekly reports a good idea?

Thinking about why I haven’t blogged in a while, I realized I put a lot of effort into writing things at work. Things that aren’t really confidential, but often somewhat project specific. When we first started building this team over a year ago I explicitly did away with most reports to help us discover what shape the team should be. A few weeks ago I finally asked the Ubuntu One team to start writing weekly reports to help broadcast knowledge across timezones and teams, and it struck me that some of the reasoning and content might be interesting to other people who are trying to manage distributed engineering teams. So here is a lightly edited version of the email I sent internally a few weeks ago. Do you manage or work on a distributed engineering team? I’d love to hear your feedback and suggestions. Here’s hoping that I fixed the comments on my blog this time around :)

These weekly engineering activity reports will replace the weekly
reports that the ops+ team was doing, and replace the daily standup
summaries that the foundations+ team was doing. Each team is totally
free to conduct their standup meetings using the format, time, and
medium that works best for them – the standup is about your immediate
work group doing planning for the day, the weekly report is about
telling the rest of the team (and company, if they care to look) what
you have been doing. I hope the status reporting we used to do on team
lead voice calls can be done away with, and we’ll be able to spend the
voice calls really solving problems rather than catching up with old
news – also the whole team will be able to read the status, not just
the people who were live on the call.

We humans operate at a variety of different cognitive and affective
levels. We think about things like lifelong values, short and medium
term goals, and immediate activities – but not at the same time. It’s
vital to spend time thinking and working at each of those levels. One
of the things I love about this team is that we are a team of doers,
not a team of talkers or dreamers or empty unimplementable ideas. The
risk though, is that we never slow down enough to jump up a level or
two and do a weekly review, and think about how our activity that week
supports our goals for the next 3 months, directions for the next 5
years, and the patterns that are emerging. Over time I have found
that forcing myself to do a weekly review, and write a report even if
it’s only for me really helps me find and maintain balance and job
satisfaction. I value the ability to move fast, to adapt to change,
to be non-linear, to just do it rather than wait for a committee. But
I’ve come to accept that taking a few minutes to switch back out of
action mode to reflect on the bigger picture is not just useless
overhead, mental paperwork, it’s a critical part of the process of
being a creative person who *actually ships*. Don’t worry, I still
endorse the cult of the done!

Also, I’ve increasingly seen and been told that people don’t know what
the rest of the team is working on – this is both a testament to how
hard communication is in an environment distributed over both
timezones and geography and a tribute to the incredibly diverse set of
projects we are privileged to tackle. But it can also lead to feeling
isolated, to not really having a shared understanding or vision of how
all the parts fit together. It’s my hope that by doing these weekly
reports together, we’ll have a lot better understanding of and
appreciation for each other. I’ve talked with the 4 team managers about these reports and gotten unanimous support.

This next section is trying to clarify some intentions based on
feedback and questions from them:

The goal of this report is not to find scapegoats, to assign blame, or
to point fingers. I don’t think we have a problem with people not
doing work. I think we have a big problem with people working their
hearts out somewhat invisibly, and not getting sufficient credit for
their work. The goal of this report is to massively increase
transparency, and often transparency can introduce some feelings of
vulnerability. That means if you have a bad week and get nothing done,
everyone else will know – and I think that’s a good thing. Not because
people will point fingers, but because bad weeks happen and everyone
should be able to know the real hard truth of whether progress is
happening or not. It means if you spend 75% of your work helping
salvage some other part of the project that ran into big trouble and
doing customer support, and didn’t meet your own plan for the week,
everyone will know and join in appreciating you for seizing the
opportunity to contribute to the overall team when it made more sense
than sticking to the plan. It means if you spend two weeks doing work
on an upstream project that massively improves our standing as thought
leaders, and land zero branches in our own code, everyone will
understand what you are up to and why it’s the right thing to be
doing; rather than silently wondering “why is Joe not producing any
code?”.

A report is not a substitute for real time proactive action. It is
yesterday’s news. What does this mean? If the DB server is going to
run out of room in 1 week, and Mary puts that in the report, and then
the DB crashes next week, I wouldn’t accept the excuse “but I wrote it
in the report, and nobody did anything” :) There should not be items
like “Joe: the release was late because the QA team did not give
feedback” “Tom: I couldn’t give feedback because the tests were
busted, so the release didn’t go out”. Instead, there should be items
like “Joe: missed the release due to a very short testing window,
worked with Tom on the phone to debug test problems, escalated a lucid
failure to the platform team, and finally got things uploaded 1 day
late.”

The report is going to be read – but maybe not the same day you write
it. Maybe the most valuable thing that comes out of your activity
report is when Susie Newhacker joins the company in 6 months to work
on a project, and scans through a bunch of weekly reports and
discovers a bunch of valuable links and branches that you reported on
this week. Maybe the most valuable thing is when you can approach your
manager and say “look, we really need to change my core hours or
change how rollouts are scheduled, I got pulled into emergency
firefighting/debugging 6 times in the last 4 weeks and missed
appointments with my family/friends each time”.

This email has gotten surprisingly long, but I hope it helps explain
why I’m requiring weekly reports and why I’m convinced they are not
just useless overhead. As we get some experience doing them, feel free
to propose additions/deletions/changes to the template. I’m looking
forward to the first activity report from everyone on Monday.

And here is the template I’m using for activity reports from each person on the team (managers also do a more big-picture report that talks about progress of various projects toward the overall roadmap):

  • Number of branches landed
    • One or two sentence description of each branch here, with a link
  • Number of code reviews worked on
  • Bugs or blueprints worked on and summary of each, with a link
  • Number of packages uploaded to PPA or Ubuntu or Debian
  • Upstream project contributions
  • Please list any absence last week and planned absence in the upcoming 3 weeks (including travel, sprints, conferences, and holidays)
  • Any other activities or achievements
  • Did you get pulled into any emergency unplanned work? What was it?
  • What is your plan for this week?
  • /dev/commentary – this is your chance to highlight something that the whole team should know about – a tip, a joke, a book, a blog, a tweet, a concern, some interesting news, a wild idea. More than a sentence, less than a chapter, links welcome.

Interested in porting Ubuntu i18n infrastructure from CDBS to debhelper

Ubuntu has some tools around i18n to make it easier to translate desktop software into many languages. Currently, some of that depends on stuff in CDBS (adding the gettext domain to .desktop files, stripping the translations from gconf schemas, etc.). This is all in /usr/share/cdbs/1/rules/langpack.mk. I would like to use debhelper instead of CDBS in my packages because I really like slide 45, but it doesn’t (yet) support this specific functionality that is mandatory for applications that go into Ubuntu main. I have no idea (yet) how to port this functionality, so I decided to start by writing down the goal. I’ve just started looking into debhelper code which is very nicely factored and documented, and here is the code (doesn’t seem all that impossible) that runs inside langpack.mk:

# -*- mode: makefile; coding: utf-8 -*-
# Copyright © 2006 Martin Pitt 
# Description: Rules for language pack support (POT file updating, and
# gettext domain key for .desktop/.directory/.server files)
#
# This program is free software; you can redistribute it and/or
# modify it under the terms of the GNU General Public License as
# published by the Free Software Foundation; either version 2, or (at
# your option) any later version.
#
# This program is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
# General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
# 02111-1307 USA.


_cdbs_scripts_path ?= /usr/lib/cdbs
_cdbs_rules_path ?= /usr/share/cdbs/1/rules
_cdbs_class_path ?= /usr/share/cdbs/1/class

ifndef _cdbs_rules_langpack
_cdbs_rules_langpack := 1

# try to build a POT file
common-post-build-arch:: langpack-mk-update-pot
common-post-build-indep:: langpack-mk-update-pot

langpack-mk-update-pot:
	if [ -d $(DEB_BUILDDIR)/po ]; then \
	    if grep -q intltool $(DEB_BUILDDIR)/po/Makefile*; then \
		if [ -x /usr/bin/intltool-update ]; then \
		    cd $(DEB_BUILDDIR)/po; /usr/bin/intltool-update -p --verbose || true; \
		elif [ -x $(DEB_BUILDDIR)/intltool-update ]; then \
		    cd $(DEB_BUILDDIR)/po; env XGETTEXT=/usr/bin/xgettext ../intltool-update -p --verbose || true; \
		else \
		    echo 'langpack.mk: po/Makefile* mentions intltool, but intltool-update is not available'; \
		    exit 1; \
		fi; \
	    elif [ -e $(DEB_BUILDDIR)/po/Makefile ]; then \
	        DOMAIN=$$(grep --max-count 1 '^GETTEXT_PACKAGE[[:space:]]*=' $(DEB_BUILDDIR)/po/Makefile | sed 's/^.*=[[:space:]]*\([^[:space:]]*\)/\1/'); \
	        if [ "$$DOMAIN" ]; then \
	            echo "langpack.mk: Generating $$DOMAIN.pot..."; \
	            make -C $(DEB_BUILDDIR)/po "$$DOMAIN.pot" || true; \
	        fi; \
	    fi; \
	fi

	if [ -d $(DEB_BUILDDIR)/help ]; then \
	    cd $(DEB_BUILDDIR)/help; make pot || true; \
	fi

# add translation domain to installed desktop/directory/schema files
$(patsubst %,binary-predeb/%,$(DEB_PACKAGES)) :: binary-predeb/%:
	echo "langpack.mk: add translation domain to $(cdbs_curpkg)"; \
	if [ -e $(DEB_BUILDDIR)/po/Makefile ]; then \
	    DOMAIN=$$(grep --max-count 1 '^GETTEXT_PACKAGE[[:space:]]*=' $(DEB_BUILDDIR)/po/Makefile | sed 's/^.*=[[:space:]]*\([^[:space:]]*\)/\1/'); \
	    if [ "$$DOMAIN" ]; then \
		for d in $$(find debian/$(cdbs_curpkg) -type f \( -name "*.desktop" -o -name "*.directory" \) ); do \
		    echo "langpack.mk: Replacing translations with domain $$DOMAIN in $$d..."; \
		    sed -ri '/^(Name|GenericName|Comment|X-GNOME-FullName)\[/d' $$d; \
		    echo "X-Ubuntu-Gettext-Domain=$$DOMAIN" >> $$d; \
		done; \
		for d in $$(find debian/$(cdbs_curpkg) -type f -name "*.server" ); do \
		    echo "langpack.mk: Adding translation domain $$DOMAIN to $$d..."; \
		    sed -i "s// $$d.new; mv $$d.new $$d; \
		done; \
	    fi; \
	fi
endif
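
As a way of writing down the goal, the .desktop/.directory part of that rule boils down to a small text transformation: drop the translated variants of a few keys and append the gettext domain. Here is a rough Python sketch of just that step, based on my reading of the sed and echo commands above. It is not a debhelper port (that would more naturally be a Perl dh_ command), and the function name is made up.

```python
import re

# Translated variants of these keys (for example "Name[de]=...") are removed;
# the untranslated "Name=..." lines are kept.
TRANSLATED_KEY = re.compile(r"^(Name|GenericName|Comment|X-GNOME-FullName)\[")


def add_gettext_domain(desktop_text, domain):
    """Strip inline translations and append X-Ubuntu-Gettext-Domain."""
    kept = [
        line
        for line in desktop_text.splitlines()
        if not TRANSLATED_KEY.match(line)
    ]
    kept.append("X-Ubuntu-Gettext-Domain=%s" % domain)
    return "\n".join(kept) + "\n"


if __name__ == "__main__":
    sample = (
        "[Desktop Entry]\n"
        "Name=Foo\n"
        "Name[de]=Fuu\n"
        "Comment=Bar\n"
        "Comment[fr]=Baz\n"
    )
    print(add_gettext_domain(sample, "foo"))
```

The POT file generation and the .server handling are separate steps and are not covered by this sketch.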

RTMP license forbids anyone from storing data?

I’m interested in the RTMP streaming protocol because I’m interested in video streaming. From reading the Wikipedia entry, I saw that Adobe published the RTMP spec. Surprisingly, it had a license accompanying it – I’m not used to seeing a license next to a specification, so I read the license rather than downloading the spec. Shockingly, the license contains this statement:

Prohibited Uses
The rights and licenses granted by Adobe in the RTMP Specification, including those granted in
the Patent License, are conditioned upon Your agreement to use the RTMP Specification for only
streaming video, audio and/or data content and not to make, have made, use, sell, offer to sell,
import or distribute: (i) any technology that intercepts streaming video, audio and/or data
content for storage in any device or medium; or (ii) any technology that circumvents
technological measures for the protection of audio, video and/or data content, including any of
Adobe’s secure RTMP measures.

It’s hard to believe there is a software developer on the planet who can promise not to make, have made, or use any technology that intercepts data for storage in any device or medium. How could anyone possibly make use of this specification given such a license?

chdist, for easily working on packages from multiple distributions

If you ever need to do backports, I recommend the chdist tool. While working on one.ubuntu.com, I have frequently found myself needing to try out packages on both Ubuntu 8.04 and Ubuntu 9.10 (currently under development), often backporting a package from Karmic to Hardy. I’ve been running 9.10 (Karmic) on my primary laptop since the first alpha in order to work on packages included in the desktop, but we also need to run many of those same packages (Erlang, CouchDB, python-desktopcouch) on our server farms in the data center. Last week Tom Haddon showed me chdist, which makes it considerably simpler to work on backports, especially grabbing source packages from an older distro version. http://packages.ubuntu.com is always nice for checking which version of a package is in the last few versions of Ubuntu, but chdist is even handier, since you can build APT trees for several different releases on the same machine without requiring much disk space. Now I just need to finally learn how to use kvm, and I’ll be able to test the backports as well as make them.
