Josh Samuelson

It’s my blog.

Control Stack

One idea that keeps kicking around in my brain is how hard it is to get back to work after being interrupted, and how computers are able to handle that with ease. I wonder if it might be possible to deal with interruptions the way a computer handles a Control Stack. For the uninitiated, let me explain how a computer deals with things.

For a simple program here’s how it works: the computer goes merrily along performing the actions you give it. For example, if you tell it something like print "hello", it will print out “hello” and then move on to the next thing. Maybe the next thing is print 2 + 2, and it’ll print out “4”, and so on. That’s all pretty useless and boring until you add the ability to interrupt what it’s doing.

Those interruptions take the form of functions. So, if you wanted to do something fancy, like count the number of letters in a word, you’d use a function. That might look something like this: print number_of_letters("hello"). Here “number_of_letters” is the name of the function and “hello” is the word we’re counting. Assuming we’ve defined “number_of_letters” somewhere (or someone else has and we’re using their library of functions), we’ll see “5” printed out on the console. But here’s the thing: the computer has to remember where it was before it went off to count those letters.

The way the computer handles this is with a data structure called a stack. A good analogy is when you’re reading a book and you find a word you don’t know. Now you could place a bookmark in the book and close it, then grab a dictionary, find your word, close the dictionary, and open your book at the bookmark. But keeping track of that bookmark is a hassle. A smarter way is to set the dictionary on top of the open book and find your word. When you’re done with the dictionary, you can move it out of the way and pick up where you left off. That’s how a control stack in a computer works, except it stacks many more levels. It’s a bit like reading a book in Spanish: you might need to look up a word in a Spanish/English dictionary, and if you didn’t understand the English you’d need to look that word up in an English dictionary.

The computer is able to handle a huge control stack and it allows you to do some fancy stuff. For example, for our “number_of_letters” function we could write it something like this:

define number_of_letters(word_to_count)
  count = 0
  for each letter in word_to_count
    add 1 to count
  return count

I hope that’s clear enough for people who don’t know programming; I don’t want to over-explain. That way of doing it is fine, but there is another way of defining that function that makes use of the control stack, and it’s called “recursion”:

define number_of_letters(word_to_count)
  if word_to_count is blank
    return 0
  return number_of_letters(word_to_count minus the last letter) + 1

So if we’re using it with “hello”, first it tries “hello”, then “hell”, then “hel”, then “he”, then “h”, and finally it’s blank. Like the stack of dictionaries, each function sends back a value as it closes. The number of letters in nothing is 0, so it picks up where it left off, which was number_of_letters("h"). But now that it has closed that top “book”, all it has to figure out is “0 + 1”, so number_of_letters("h") returns 1 and we can close that “book” and pick up where we left off with number_of_letters("he"). Since we’ve figured out number_of_letters("h") is equal to 1, we’ll return 1 + 1 from number_of_letters("he"), so number_of_letters("hel") will return 2 + 1, and so on. This is a trivial example of recursion, but it can be incredibly powerful, and it can save a ton of keeping track of the counts of things and elaborate loops. It also lets you make the computer smarter. All it needs to know how to do is check if a word is blank, how to drop the last letter from a word, and how to add. Hopefully this isn’t too hard to understand for someone who doesn’t know programming.
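
For anyone who wants to try it, here’s a rough Python translation of both versions. It’s only a sketch: the name number_of_letters_loop and the depth parameter are my own additions for illustration (depth just indents the output so you can watch each “book” being opened and closed).

```python
def number_of_letters_loop(word_to_count):
    """Count letters by walking through the word once."""
    count = 0
    for _letter in word_to_count:
        count += 1
    return count


def number_of_letters(word_to_count, depth=0):
    """Count letters recursively, printing each call as it opens and closes.

    depth is a hypothetical extra parameter used only to indent the trace.
    """
    indent = "  " * depth
    print(f"{indent}open  number_of_letters({word_to_count!r})")
    if word_to_count == "":
        result = 0
    else:
        # Drop the last letter, count what is left, then add 1 for the dropped letter.
        result = number_of_letters(word_to_count[:-1], depth + 1) + 1
    print(f"{indent}close number_of_letters({word_to_count!r}) -> {result}")
    return result


print(number_of_letters("hello"))
```

Each “open” line is a new book going on the pile, and each “close” line is one coming off with its answer.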

The point is that the computer handles all of this with ease because it has the control stack: it adds new tasks to the top of the stack, removes them as they’re completed, and painlessly picks up where it left off. What if we handled interruptions in the same way?

Here’s a human example: Checking your email when you start work in the morning.

Say you open your email and see there are 10 new emails. So you start from the top and read the first one.

So here’s what your stack looks like:

  • Check Email
  • Read First Email

Except, that isn’t quite right. Even though that’s the order you did the things, in a stack the first thing is on the bottom. As you add new tasks, they get put on top.

So the stack is really this:

  • Read First Email
  • Check Email

The first email is about how your long-lost relative in Nigeria has left you a substantial fortune and the sender just needs some money up front to pay the bank transfer fees. After reading that, your stack would look like this:

  • Delete SPAM
  • Read First Email
  • Check Email

But at that point you decide that you want to complain to the IT department about all the SPAM you’ve been getting, so before you delete it you send an email to IT.

  • Send Email to IT about SPAM
  • Delete SPAM
  • Read First Email
  • Check Email

Once you’ve sent the email to IT, you remove it from the stack (or to use the CS term, “pop” it off the top of the stack). So now you’re back to this:

  • Delete SPAM
  • Read First Email
  • Check Email

So you delete the SPAM, and pop that off the control stack, and since that was the first email, you can also pop “Read First Email” off the stack. That puts you back to this:

  • Check Email

Now there are 9 emails left, so you follow the same procedure. Since you deleted the SPAM, the 2nd message now becomes the “first”:

  • Read First Email
  • Check Email

This one turns out to be from your boss saying “Turn in your TPS reports, they were due yesterday!” That makes you realize that you need to actually write the reports before you can turn them in, so your stack turns into:

  • Finish TPS reports
  • Email TPS reports to boss
  • Read First Email
  • Check Email

And as you’re working on the reports you need to contact other people, look things up, get drafts reviewed, etc., etc. And everything is going fine until someone from IT comes by and asks about the SPAM email you contacted them about, or you go to get a cup of coffee and someone mentions something else you need to be working on, or your dentist’s office calls to remind you about your root canal tomorrow. Or all of those things plus the other 8 messages that you haven’t even looked at yet. And unlike the computer, your brain loses its control stack and you don’t remember that you need to finish the TPS reports until you get an email from your boss saying “Your reports are two days late!”

Ok so, I know it’s not realistic. Who only has 10 emails in their inbox every day?

My idea is somewhat simple, just use a stack of sticky notes to emulate the control stack. That way, even if you get 10 interruptions you could still jump right back to working on those TPS reports. Now, I’ve seen people use sticky notes to keep track of work and usually it takes the form of a decorative fringe around their computer monitor, with every blank surface covered. That is not what I mean. I’m talking about a simple stack, whatever currently has your attention is on top. So you intentionally can’t see what’s underneath until you’re ready to work on it. If something else comes up, just add another note to the top.

So yeah, that’s a little wasteful of sticky notes. Maybe a better idea would be to always keep a blank note on top. Then as you’re working along if you get interrupted you can just write on that note what you were doing. Once you deal with the interruption you can go right back to whatever that note says and the one below it when that’s done and so on. That keeps focus narrow, you don’t have to hold all the to-dos in your head while you’re doing one of them.
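
To make the sticky-note idea concrete, here’s a minimal Python sketch of it. The class name TaskStack and its methods are made up for illustration, and the task names are borrowed from the email example above. The one rule it enforces is that you only ever look at the top note.

```python
class TaskStack:
    """A pile of sticky notes where only the top note is visible."""

    def __init__(self):
        self._notes = []

    def push(self, task):
        """An interruption arrives: put a new note on top of the pile."""
        self._notes.append(task)

    def pop(self):
        """The current task is done: remove the top note and return it."""
        return self._notes.pop()

    def current(self):
        """Return the task on top, or None if the pile is empty."""
        return self._notes[-1] if self._notes else None


stack = TaskStack()
stack.push("Check Email")
stack.push("Read First Email")
stack.push("Delete SPAM")
stack.push("Send Email to IT about SPAM")

stack.pop()  # the email to IT has been sent
stack.pop()  # the SPAM has been deleted
stack.pop()  # the first email has been dealt with
print(stack.current())  # back to "Check Email"
```

There’s deliberately no way to peek further down the pile, which is the whole point: whatever is underneath can wait until you get back to it.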

It might even work with a Kanban style system. As interruptions come, sometimes they don’t need to be completed, but simply recorded and added to the backlog. So the control stack would always have “grab task from backlog” at the bottom. Or in reality it would have “Plan next project” at the bottom, something like this:

  • Do subtask
  • Do task
  • Grab next task from backlog
  • Plan next project

Although, most people probably don’t need something like “plan next project” at the bottom since once you’ve cleaned everything out you usually know how to decide what you want to do next.
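
Here’s one way the backlog variation might look in Python. This is a sketch under my own assumptions: the function names, the urgent flag, and the example tasks are all hypothetical, and “grab next task from backlog” is modeled as what happens whenever the stack runs empty.

```python
from collections import deque

# Hypothetical starting backlog; the oldest item is on the left.
backlog = deque(["Write TPS reports", "Answer remaining email"])
stack = []


def interrupt(task, urgent):
    """Urgent interruptions go on top of the stack; the rest are only recorded."""
    if urgent:
        stack.append(task)
    else:
        backlog.append(task)


def finish_current():
    """Pop the finished task; if the stack is now empty, grab from the backlog."""
    if stack:
        stack.pop()
    if not stack and backlog:
        stack.append(backlog.popleft())
    return stack[-1] if stack else None


first = finish_current()  # nothing in progress yet, so grab from the backlog
interrupt("Dentist reminder: root canal tomorrow", urgent=False)
interrupt("IT asks about the SPAM email", urgent=True)
after_it = finish_current()       # IT dealt with, back to the reports
after_reports = finish_current()  # reports done, grab the next backlog item
print(first, after_it, after_reports, sep="\n")
```

The dentist reminder never touches the stack. It just waits in the backlog until everything above it has been cleared.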

Well, there’s the idea, and probably way over-explained. I’m going to try it out. I’ve been doing pretty well at work getting my tasks done, but I think I end up wasting time and mental effort because I’ll get interrupted and need to remember what it was I was working on. This was much worse at previous jobs, where I was interrupted all the time.

I’ll report back on how it’s going, I can already think of tools that might improve the process like a simple app to keep track of the stack.

Rinse Repeat

The last few months have been a major crash course for me in configuration management and Puppet in particular. I wouldn’t call myself an expert, at least considering what counts as a puppet expert at Puppet Labs, but I feel like I know enough now that I could easily do my previous job almost entirely using puppet. A big part of my job now is building VMs that we will use in the classroom. The part that has really changed my perspective is how freely I can now throw something away that doesn’t work. My attitude with computers has often been that if something is seriously screwed up, it’s often simpler and faster to just start fresh rather than figure out some arcane issue. This is especially true with Windows desktop OSes. It’s also why I get a little annoyed when someone asks me to fix their computer, because if it were mine, I would just wipe it out and start fresh. Doing that too often can be a big pain, and with servers it can be seriously risky because it’s easy to miss something important.

That’s where puppet has really changed my perspective. I realized that I no longer think of the server (or the VM image in my case) as the “thing” I’m managing; that’s just the expression of the real thing. The puppet code is what I’m managing and maintaining. That lets me make and undo small changes as I work toward improving things, without having to worry how undoing a change might cause issues down the line. It’s so easy to start fresh that I don’t mind trying something out, throwing away the result, and trying another variation until I find the right solution.

I think this isn’t quite what puppet has been used for, or maybe what it was initially intended for, but I think it’s really the future of how people will manage most of their systems. Puppet is a great tool for addressing drift, i.e. when something has changed on one system that should otherwise be identical to the others. In the long run, I think it’s going to become simpler to just trash a system that has drifted and redeploy from the code.

If you think of it in terms of software it’s more obvious. If a piece of compiled software becomes corrupted, say because of an issue with the disk, you would never go into the file and try to repair it so that it matched a known good copy. No, the correct thing to do in that case is to simply redeploy the file. Now that so many servers are virtualized or containerized, I think it makes much more sense to think of them as files.

What’s been remarkable about this for me is how much it’s shifted my thinking. Even when working on projects at home I find myself wanting to set things up in puppet just so I don’t have to worry about it when I need to start fresh. It also has me doing something I’d never bothered with before for personal tinkering: documentation. I’m actually documenting my work as I go by writing the puppet code.

Contain Yourself

My most recent project has given me a chance to play with docker again. I first heard of docker when I was in the midst of trying to learn puppet. I remember thinking to myself “This is great, I don’t have to learn a whole new language, I can just do everything in shell scripts in my Dockerfile, if it gets screwed up I’ll just trash it and build a new one.”

Now that I understand how puppet works and what it’s really for, I realize how silly that way of thinking is. Using docker that way is essentially a more convenient way to manage golden images, but it’s still fraught with the maintenance nightmares that come from custom setup scripts and golden images. I think the idea of immutable infrastructure is appealing, but it doesn’t solve the same problems as configuration management.

What I have gotten a chance to explore is what else docker is good for. I’ve been using it for sandboxed nodes in a standalone mockup of a puppet based infrastructure, i.e. a Puppet Enterprise master that also hosts a bunch of docker containers that pretend to be nodes. As far as the master is concerned, they are totally separate machines and it offers a good way to simulate a larger environment in a lightweight way. That’s not what docker was designed to do, per se, but it’s working amazingly well.

Since I wasn’t really using it for its intended purpose at work, I decided to play around with it at home. I reformatted my iMac, which had been running Mint. It had been acting a little weird, with some package management corruption issues, the kind of thing I hate debugging, so I just wiped it out and installed Ubuntu 14.04. One of the things I lost in that was my installation of PLEX, which meant I couldn’t stream shows from the mac to my TV anymore (Sara was a bit bothered because she wanted to watch MASH). Since I had a fresh install with nothing but docker and puppet installed, I decided to try a dockerized version of PLEX. I found an appropriate image on docker’s website and ran it using Gareth Rushgrove’s docker module from the puppet forge. Basically all I had to do was declare the container with a couple of folders mapped, one for config and one for data, and that was it: PLEX was running again.

That’s what docker is good for, to save you from the dependency hell that so often comes with running applications. It lets you say “I want this application, here’s the config and data” instead of saying “I need to install a bunch of stuff I don’t care about, some of which might conflict with what I currently have installed and also might break because of network issues or corrupted packages, just so I can get to the application I actually want installed.” Docker is especially wonderful at handling packages that have conflicting dependencies. Because everything is inside the container, you just don’t have to worry about it.

The way I see it, docker solves a real problem that has driven SysAdmins totally nuts. Docker allows clean and easy packaging of applications, it removes the unknowns of deployment, and it allows apps’ dependencies to be pinned to a version. If designed correctly, containers also allow package maintainers to easily update internal dependencies and application code without worrying about all of the possible platforms that users might be running. It does this well. It does it so well that people begin to think they don’t actually need configuration management. Depending on your environment, you might not, or might not need much more than a simple manifest and puppet apply.

The thing is, puppet and docker solve different problems. Puppet is often used to solve some of the problems that docker solves, that’s true. For those cases, docker is better, hands down. Puppet does make it much easier to manage applications and all the hassle of updates and patches. But that isn’t really what puppet is for. Puppet is a way of defining infrastructure in a declarative way. It’s a way of defining the finished state of a system and a tool for making that happen. I’m really glad that Gareth built the docker module for puppet because it’s a great example of how puppet and docker work well together.

My PLEX server is a case in point: it’s defined by puppet, and the “application” (i.e. the docker container of the PLEX app and dependencies) is managed by puppet. Docker takes care of what docker is good at, puppet takes care of what puppet is good at.

Learning With Instant Feedback

I recently watched this somewhat mind-blowing video. I know he’s making a larger point about purpose in life, and I think he’s got a point, but really the part that blew me away was the idea of instant feedback not just as a “crutch” for learning, but as he put it “writing code without a blindfold on”. It also occurs to me that part of the reason more people don’t code is because it involves different parts of the brain, you have to constantly visualize data and what’s happening inside the code while at the same time using your analytical brain to actually create the code.

I know there are people who can do this; to some extent I’m one of them. I also know that I sometimes have trouble forming complete sentences when I’m deep in hack mode working on some problem. It’s also why something like IntelliSense in Visual Studio is so popular: it gives you instant context.

I used to think of that kind of thing as cheating, a crutch for those who couldn’t remember how to do things. In retrospect, that attitude is probably what kept me from doing more coding after college. It’s also an attitude that was reinforced in my CS classes.

Now that I actually have to do some programming as part of my work, I don’t really care about cheating. I’ll happily use every trick in the book to make things better or build them more quickly. This is especially true of copying or repurposing what others have done (with proper attribution obviously).

With that in mind, I’ve been trying to think of ways that I could make writing puppet manifests have a little more instant feedback. Khan Academy has a great project that they use for teaching basic programming with JavaScript. It allows a student to write their code and see the results even before they’re finished. This makes errors like missing semicolons at the end of a line almost impossible, because the feedback is so immediate.

I think some of that could be translated to puppet. For example, as a student types code into an editor, puppet-lint could be checking the syntax live. That shouldn’t be too tough and could maybe even be done by adapting Khan’s editor and just wiring it up to a different output. But the output of puppet is more tangible: in order to check the results of a manifest, it needs to be applied to an actual node and then the node’s state needs to be checked.

There are a few pieces to this puzzle:

  • A responsive editor with live feedback on code
  • A sandbox environment where puppet code can be tested
  • Serverspec (or similar) tests with a live display of environment status
  • Some way of arranging the whole thing

This isn’t really on my official project list, but I’m going to try to tackle this one. I’ll also write up as much as I can and make a blog series out of it.

Blog Reboot

Things have changed quite a bit for me in the last year and I’ve been feeling the urge to start blogging again. Since my previous blog was a bit of a mess and the content was all over the map, I decided to start fresh and keep the content more focused. This time I’m going to focus on projects, mostly tech projects, but maybe I’ll throw in the occasional cooking or brewing adventure.

The other reason for the reboot is that I’m sick of dealing with Wordpress’ nonsense. Most recently it was alerting me to an update and wasn’t letting me update. Rather than deal with the pain of fixing it by hand, and possibly redeploying and restoring, I thought I’d just give octopress a try.

My plan is to move this into S3 eventually but for now I’m just using the built-in github pages. I like the idea of replacing a heavy slow platform like wordpress with something totally static in S3. Once that’s done, I may even transfer my domain over to Amazon too and just get rid of my old hosting provider.

I’ll probably recycle some of my old blog, but some of it I’ll leave behind, for now I’m just going to focus on new content.