Software Engineers versus Software Writers

Several years ago I worked at Microsoft, and during that time I was involved in interviewing several hundred candidates. For those not familiar with the MS interview process (it may have changed), it is broken up into a number of one-hour interviews in which each interviewer probes the candidate for specific qualities (not skills – qualities; a person can be taught C# but they can’t easily be taught to work well on a team). My role in the interview was typically to gauge how effective candidates were at actually engineering a solution. To be specific, the focus was not on their proficiency with a particular language; I actually asked for pseudo-code because I didn’t want the candidate caught up in worrying about syntax problems (the IDE/compiler catches those for you anyway). What I was looking for was how the candidate made sure they understood the problem, and then what thought they put into designing a solution.

The problem I gave candidates to solve was the following:

  • Given a string of arbitrary characters, find every sequence that is a palindrome (a sequence of characters that reads the same left->right and right->left)
  • Do NOT mark nested palindromes (i.e. one palindrome contained within another). For example, aeraweabbadfrar should mark abba and rar, but NOT bb (nested within abba)
  • Overlapping, but not nested, palindromes should be identified. For example, aeraweabbabdf should mark both abba and bab
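
To make the three rules concrete, here is a minimal Python sketch of one possible approach: expand around every center to find the longest palindrome there, then discard any span that sits inside another. It is an illustration only, not the answer candidates were expected to give. The name find_palindromes, the min_len parameter, and the decision that two characters is the shortest sequence worth marking are all assumptions; the problem statement does not settle them.

```python
def find_palindromes(text, min_len=2):
    """Return (start, end) spans, end exclusive, of palindromes in text
    that are not nested inside another palindrome.

    min_len is an assumption: the spec never says how short a palindrome
    may be, so single characters are ignored here.
    """
    n = len(text)
    spans = set()

    # There are 2n - 1 possible centers: on each character (odd length)
    # and between each adjacent pair (even length).
    for center in range(2 * n - 1):
        left = center // 2
        right = left + center % 2
        while left >= 0 and right < n and text[left] == text[right]:
            left -= 1
            right += 1
        # The loop stops one step past the palindrome on each side.
        start, end = left + 1, right
        if end - start >= min_len:
            spans.add((start, end))

    # Any shorter palindrome on the same center is nested by definition,
    # so only the per-center maxima need a containment check.
    # Sort by start ascending, then longest first.
    ordered = sorted(spans, key=lambda span: (span[0], -span[1]))
    result = []
    furthest_end = -1
    for start, end in ordered:
        # Every earlier span starts at or before this one, so this span
        # is nested exactly when it does not reach past furthest_end.
        if end > furthest_end:
            result.append((start, end))
            furthest_end = end
    return result


def marked_substrings(text, min_len=2):
    """Convenience wrapper (hypothetical) returning the text of each span."""
    return [text[start:end] for start, end in find_palindromes(text, min_len)]
```

Under those assumptions, marked_substrings("aeraweabbadfrar") gives abba and rar, and marked_substrings("aeraweabbabdf") gives abba and bab, matching the two examples in the list.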

Within five seconds of giving the problem I knew when certain people were going to fail, because they picked up a marker and immediately started writing pseudo-code on my whiteboard. They were not going to arrive at anything resembling a working algorithm. For starters, what I provided was an incomplete spec. Without getting into implementation specifics (such as whether, in C/C++, the string is null-terminated or a length parameter is passed), here are some questions that should be asked:

  • Should upper and lower case characters be considered equivalent or different?
  • Are we talking only about basic Latin alphas? Do numbers count? Do extended Latin characters? Is ô equivalent to ö or are they different? Hell, are we even restricted to Latin chars?
  • How should the palindromes be marked? E.g. printed to screen, returned in an array, written to a file, etc.?
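
One way to see why the first two questions matter is to write the answers down as explicit options rather than leaving them buried in the algorithm. The Python sketch below is only an illustration under assumed answers: the function name normalize and the flags ignore_case and ignore_accents are hypothetical, and a real spec might call for something quite different (for example, restricting input to basic Latin letters).

```python
import unicodedata


def normalize(text, ignore_case=True, ignore_accents=False):
    """Prepare text for palindrome comparison under explicit spec choices.

    Both flags are assumptions standing in for answers the interviewer
    would have to give; neither default comes from the original problem.
    """
    if ignore_accents:
        # Decompose accented letters (NFD) and drop the combining marks,
        # so that ô and ö both compare equal to o.
        decomposed = unicodedata.normalize("NFD", text)
        text = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    else:
        # Compose to a single canonical form so that the same visible
        # character always compares equal to itself.
        text = unicodedata.normalize("NFC", text)

    if ignore_case:
        # casefold() is a more aggressive lower() meant for comparisons.
        # It can change the length of the string (ß becomes ss), so any
        # positions found afterwards refer to the normalized text.
        text = text.casefold()
    return text
```

With these defaults, normalize("Abba") returns abba, while ô and ö stay distinct unless ignore_accents=True is passed. The third question, how results are reported, would decide the return type of whatever function consumes this.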

The point of leaving out some details was to see if the person would just make assumptions, or if they would press me for a full problem definition. Now, there are all sorts of psychology at play during an interview that could cause a person to exhibit behavior not indicative of their reaction under normal work conditions, so this alone isn’t a make or break. However, asking questions to fully spec out a problem definition was a characteristic behavior of a person who was first really thinking about the problem. The other characteristic behavior was writing out a whole lot of sample inputs, covering a variety of input cases, to understand what sort of rules they needed to include in their logic BEFORE they started writing code. Potential variants in input included:

  • even vs. odd length string
  • even vs. odd length palindromes within string
  • no palindromes
  • one mega palindrome
  • many separated palindromes (i.e. palindromes with non-palindrome characters separating them in the string)
  • nested palindromes
  • single and multi overlapping palindromes (e.g. cattacat has both cattac and tacat – so only a single overlap. cattacata has cattac, tacat, and ata overlapping in a series)
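
The variants above translate naturally into a table of inputs and expected results that can be written before any real solution exists. The Python sketch below is one hedged way to do that. It pairs the table with a deliberately slow brute-force checker used only as a reference; the names brute_force_palindromes, CASES and failing_cases are hypothetical, and the two-character minimum length is an assumption the original problem leaves open.

```python
def brute_force_palindromes(text, min_len=2):
    """Reference checker: test every substring, then drop nested ones.

    Far too slow for real use; its only job is to be obviously correct.
    """
    n = len(text)
    spans = [
        (i, j)
        for i in range(n)
        for j in range(i + min_len, n + 1)
        if text[i:j] == text[i:j][::-1]
    ]

    def is_nested(inner):
        # A span is nested if some other palindrome span fully covers it.
        return any(
            outer != inner and outer[0] <= inner[0] and inner[1] <= outer[1]
            for outer in spans
        )

    return [text[i:j] for i, j in spans if not is_nested((i, j))]


# One entry per input variant; expected results are listed left to right.
CASES = {
    "abcd": [],                                   # even length, no palindromes
    "abcde": [],                                  # odd length, no palindromes
    "racecar": ["racecar"],                       # one mega palindrome (odd)
    "abba": ["abba"],                             # one mega palindrome (even)
    "xabbayrarz": ["abba", "rar"],                # separated palindromes
    "aeraweabbadfrar": ["abba", "rar"],           # nested bb is not marked
    "aeraweabbabdf": ["abba", "bab"],             # overlapping, not nested
    "cattacat": ["cattac", "tacat"],              # single overlap
    "cattacata": ["cattac", "tacat", "ata"],      # overlaps in a series
}


def failing_cases(finder=brute_force_palindromes):
    """Return (input, actual, expected) for every case the finder gets wrong."""
    return [
        (text, finder(text), expected)
        for text, expected in CASES.items()
        if finder(text) != expected
    ]
```

Any candidate solution can be passed in as finder; an empty list from failing_cases means it agrees with the table on every variant listed.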

And so forth. If a person did not at the very least spend a good deal of time brainstorming variants of input before they even started writing a solution, they were guaranteed to arrive at a solution that failed with at least some of the above variants. In my entire time in Redmond, interviewing hundreds of candidates, exactly 3 arrived at a working, pretty much perfect algorithm. A couple more created mostly working algorithms that maybe missed one or two specific variants, or had some simple off-by-one errors when traversing a string that a couple of minutes with an actual debugger would have found pretty easily – I didn’t hold that against people. The vast, vast majority produced crap, and by and large they were the ones that started trying to whiteboard code immediately.

I don’t know who to attribute the quote to, but I’ve heard it in a number of places: “Crappy programmers spend 90% of their time writing code, 10% of their time thinking; great programmers do the exact opposite.” It pretty accurately captures the few who did well in the interview and the multitude that did poorly. I also have a different manner of thinking about the same situation – making a distinction between Software Engineering and Software Writing. People who write software largely go about it the same way I go about this blog post – they have some general notion they want to convey, and start typing until they manage to convey it. At the end it isn’t going to be particularly well structured, and could benefit from a whole lot of editing. People who engineer software go about it completely differently – in the extreme, they go about it the way the folks writing shuttle software do. Engineering is a whole lot of time spent fully understanding the problem the code tries to solve, understanding all of the conditions the code will be used in, creating blueprints for the code, documenting everything so that QA and other devs can understand things easily, and only a very small amount of time actually writing code. Engineering is a whole lot of process, and it’s a fair bit of unfun work. That’s why coders by and large prefer to just sit down and write code, and really hate when companies try to dump a bunch of process on them.

A good engineering process pays for itself though. For starters, it’s pointless to write code to solve a problem or serve a function that isn’t well understood. Money is just being thrown away in that scenario. For example, in a past job I shot down a proposal by a dev team to expose error logs to end users to help debug issues when engaging support. For any number of reasons, including security, that isn’t a freaking solution. More specifically, the two real problems that needed to be solved were that enough customers were seeing issues with the product that this feature was proposed in the first place, and that the code wasn’t instrumented well enough to provide support with the error details needed to address the issue. Giving the customer server error logs doesn’t solve either problem, but it does create new ones. (I really wish I had the power to fire folks on the spot when they propose stuff like that.) On top of that though, ad hoc code tends to be less reusable, more error prone, of lower general quality, less secure, harder to test, and slower, with a shorter shelf life (because of the preceding points). You may get initial code faster if the dev just starts writing, but you also get your dinner quicker if the chef doesn’t spend time preparing it. In either case, just because it is faster doesn’t mean it is worth paying for, especially as the speed is an illusion. In the end all of the drawbacks of ad hoc code – the lower quality, increased number of errors, etc. – are going to cause the dev to spend much more time revisiting that code rather than working on other things. Test passes are going to take longer. The customer is going to be less happy with the finished product, so the next version is going to revisit things from the past version instead of forging new ground.

So my plea to companies is to really focus on building a great engineering process and then enforcing it. You will produce products faster, your customers will like them more, and for folks like me it makes it a whole lot easier to bake security into development (and thus make your products more secure). It’s a really tall order to get devs to engineer with security in mind when they follow no other engineering process. It’s damn easy when devs already have coding standards they follow that just need to be updated for security, when the design process is already very analytical and just needs a slight nudge to include security-specific concerns, when QA is already given ample documentation to really test something and can be shown how to include a handful of security tests as well, and so forth. For companies that want to produce secure software, the first question they should ask is whether they have a culture of software engineering or software writing. If it’s the latter, they have a lot more work ahead of them.

~ Joshbw
