Mission Statement

The "Planet Bruce" blog is dedicated to a four-fold mission:

* Improve the technical and non-technical skills of software developers.

* Address the communication gaps between management (both technical and non-technical) and software developers.

* Help software developers to increase their income and happiness by maximizing their utility and productivity to their clients and employers.

* Contribute to the understanding of best practices in software development and technical management.

Monday, February 17, 2014

Debugging Dark Matters

Learning to Debug in Familiar Surroundings


Welcome to Computer Science 101.

Today, I'll be covering the principles of debugging. In later posts, I'll cover the theory and practice of using a single-step debugger in software development. But this post is about the mindset and techniques needed to debug any problem or situation.

Let's start with a simple but powerful definition.

To define "debugging" as "solving problems" is circular, at best. We need a definition that gives us a roadmap as to how to solve problems. Here's mine:

Debugging is the systematic challenging of one's assumptions.

Bam!

If you're not challenging assumptions systematically, then, IMHO, you are not debugging. You may be guessing, experimenting, attempting to intuit, or daydreaming, but you aren't debugging.

Let's begin our sordid tale with one simple piece of knowledge: 
  • We've entered a room and it is dark.

That's likely a problem, and we need some light.

"But, soft! What light through yonder window breaks?"

Sorry, the quote from Shakespeare's Romeo and Juliet is not "Hark, what light..." And now you understand why we have to challenge our assumptions (in this case by googling for the exact quote).

Let's list some of our initial assumptions and then go about challenging them:
  • There is a lamp in the room
  • The lamp is broken
  • This room needs a working lamp
  • The light is supposed to illuminate the room "reasonably well"

Note that we cannot begin to debug a situation until we have defined both the problem and what constitutes success.

In this case, let's define the problem as, "The room is too dark for me to find my pack of cigarettes," and define the solution as, "Illuminate the room so that I can find my cigarettes."

Note that success should usually be defined as some productive outcome (I want my pack of cigarettes). Don't confuse the light in the room--a means to that end--with the goal itself.

So that leads to alternative approaches we might investigate:
  • Can I find the cigarettes another way? (maybe a basset hound can sniff them out)
  • Can I grope around in the dark and still find the cigarettes? (avoid the light issue altogether)
  • Can I buy other cigarettes or do I need this pack? (find an alternative goal)
  • Can I light the room with my lighter instead? (find an alternative solution)
  • Can I light the needed area with a flashlight? (temporary solution)

First Do No Harm, Then Gather the Spec


I must investigate the alternatives, such as using a lighter.

What if there is a gas leak in the room? Will using a lighter cause an explosion?

What non-destructive tests can I do?

Let's assume that in this case, the homeowner (project manager) has said, "I know you want to find your cigarettes, but I really want a permanent light/lamp in this room. It used to work, but now it doesn't."

You've now gained three really important pieces of information. The homeowner:
  • Doesn't care about your cigarettes
  • Wants the room lit
  • Claims the light "used to work"
You can take the homeowner at his or her word for the first two bullet points. Let's assume that you respect and agree with their position on these two. So, at least you've defined the problem and agreed on a metric for success.

The goal is to light the room, preferably with the existing (but broken) lamp. 

What about the claim that the light used to work?

If you haven't personally verified it, and you don't have 100% confidence in the person who did verify it, then you are not systematically challenging your assumptions and not debugging effectively.

I'm not suggesting you challenge the person directly, but I have seen countless engineering projects delayed because someone assumed that the initial claim was true. Take any unverified data with a grain of salt, not as the gospel.

The two most important pieces of information to gather when debugging an existing system are:
  • When was it last known to be working? 
  • What changed before it broke?
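
In software terms, those two questions can often be answered by bisecting the history between a known-good and a known-bad state. Here is a minimal sketch in Python, assuming an ordered list of versions and a hypothetical works() check that you supply yourself; the names first_bad and works are illustrative, not from any library. Note that it verifies the "used to work" claim before relying on it.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def first_bad(versions: Sequence[T], works: Callable[[T], bool]) -> T:
    """Return the earliest version for which works() is False.

    Assumes versions are ordered oldest to newest and that once the
    system breaks, it stays broken in every later version.
    """
    # Challenge the "it used to work" claim instead of trusting it.
    if not works(versions[0]):
        raise ValueError("oldest version is already broken")
    # Make sure the problem is actually present in the newest version.
    if works(versions[-1]):
        raise ValueError("newest version works; problem not reproduced")

    good, bad = 0, len(versions) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if works(versions[mid]):
            good = mid  # still fine here, so the break came later
        else:
            bad = mid   # already broken here, so the break came earlier
    return versions[bad]


# Hypothetical history: the lamp worked through Wednesday.
days = ["Mon", "Tue", "Wed", "Thu", "Fri"]
print(first_bad(days, lambda day: day in {"Mon", "Tue", "Wed"}))  # prints Thu
```

Once you know the first broken version, "what changed?" becomes a much smaller question: you only have to look at the difference between that version and the one just before it.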

Elements of the Light Bulb Circuit


Now is where your expertise comes in. You should have a mental model of how a light works, and all the components that go into a working light. 

In this case, we'll assume the following (but in real life, you would investigate to verify):
  • There is a floor lamp in the room
  • The lamp is plugged into an outlet
  • The outlet is controlled by a wall switch
  • There is an incandescent bulb in the lamp
  • The person reporting the problem is not blind or deaf, nor delusional, nor a pathological liar (be wary of the last two). 
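
One way to keep yourself honest is to write each assumption down next to a check that could prove it wrong. The following is only a sketch, assuming a made-up room dictionary standing in for whatever you can actually observe; challenge and the dictionary keys are hypothetical names for illustration.

```python
from typing import Callable, List, Tuple

Assumption = Tuple[str, Callable[[], bool]]


def challenge(assumptions: List[Assumption]) -> List[str]:
    """Return the descriptions of assumptions whose check failed or raised."""
    unverified = []
    for description, check in assumptions:
        try:
            ok = bool(check())
        except Exception:
            # A check we cannot even run counts as unverified.
            ok = False
        if not ok:
            unverified.append(description)
    return unverified


# Hypothetical observations; nobody has looked at the wall switch yet.
room = {"lamp_present": True, "plugged_in": True, "bulb_type": "LED"}

assumptions = [
    ("There is a floor lamp in the room", lambda: room["lamp_present"]),
    ("The lamp is plugged into an outlet", lambda: room["plugged_in"]),
    ("The outlet is controlled by a wall switch", lambda: room["switched_outlet"]),
    ("There is an incandescent bulb in the lamp", lambda: room["bulb_type"] == "incandescent"),
]

for description in challenge(assumptions):
    print("UNVERIFIED:", description)
```

In this made-up run, the last two assumptions are reported: one because it was never observed at all, the other because the observation contradicts it.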

The next thing you need to do is replicate the problem.

Once you have replicated the problem, you have verified many of the earlier assumptions, and you will be able to tell when/if you've fixed it. So let's try to fix it.
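
For a software problem, "replicating" usually means running the failing scenario more than once and counting how often it fails. A rough sketch, assuming you can wrap the scenario in a function that returns True when the problem appears (reproduction_rate, classify, and flaky are hypothetical names):

```python
import itertools
from typing import Callable


def reproduction_rate(problem_appears: Callable[[], bool], runs: int = 10) -> float:
    """Run the scenario several times; return the fraction of runs that failed."""
    if runs <= 0:
        raise ValueError("runs must be positive")
    failures = sum(1 for _ in range(runs) if problem_appears())
    return failures / runs


def classify(rate: float) -> str:
    """Turn a failure rate into a rough label."""
    if rate == 1.0:
        return "reproducible"
    if rate == 0.0:
        return "not reproduced"
    return "intermittent"


# Hypothetical flaky lamp: the problem shows up on every third attempt.
pattern = itertools.cycle([True, False, False])


def flaky() -> bool:
    return next(pattern)


print(classify(reproduction_rate(flaky, runs=9)))  # prints intermittent
```

The same function tells you later whether a fix worked: rerun it after the change and compare the rate with the one you recorded before.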

There is nothing wrong with making an initial guess. It is probably a burnt out lightbulb, so you might as well try swapping in another one.

Right?

Wrong. See next section.

Try the Easiest Thing(s) First


In this case, the easiest approach is to try these three things:
  1. Check that the lamp is plugged in
  2. Flip the wall switch on
  3. Test any switch on the lamp itself
Did you notice any hidden assumptions in the above list?

How do you know what position constitutes "on" for the wall switch? Is it a dimmer switch that needs to be pushed and/or turned? Is it a flip switch that is reversed (or connected to a secondary switch) so that you may have to push it down to turn it on?

You need not be 100% sure of any of these things, but at least achieve reasonable certainty, or make a note of any areas of uncertainty for future investigation.

The most important principle at this point is to change only a single component or element state at a time. Record the results and make sure they are reproducible.
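
Here is a small sketch of that principle in Python. It assumes the system can be described as a dictionary of settings and that you have some works() check; the lamp model below, including the reversed wall switch, is invented purely for illustration.

```python
from typing import Callable, Dict, List, Tuple

Config = Dict[str, object]


def one_change_at_a_time(
    baseline: Config,
    candidates: Config,
    works: Callable[[Config], bool],
) -> List[Tuple[str, object, bool]]:
    """Try each candidate change alone against the baseline and record the result."""
    results = []
    for name, value in candidates.items():
        trial = dict(baseline)  # fresh copy, so changes never pile up
        trial[name] = value     # exactly one difference from the baseline
        results.append((name, value, works(trial)))
    return results


def lamp_lights(config: Config) -> bool:
    # Hypothetical lamp whose wall switch is wired "upside down".
    return (
        config["plugged_in"] is True
        and config["wall_switch"] == "down"
        and config["lamp_switch"] == "on"
    )


baseline = {"plugged_in": True, "wall_switch": "up", "lamp_switch": "on"}
candidates = {"wall_switch": "down", "lamp_switch": "off"}

for name, value, ok in one_change_at_a_time(baseline, candidates, lamp_lights):
    print(f"{name} -> {value}: {'works' if ok else 'still dark'}")
```

Because each trial starts from a fresh copy of the baseline, every recorded result can be attributed to exactly one change, and anyone can rerun the loop to confirm it.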

Swap Out Broken Pieces For Known Working Components


So, your best guess is that the lamp, lamp switch, wall switch, and cord are not the obvious culprits.

Is it a system in which you can remove the broken bulb (like a string of Christmas lights) and test the remaining ones?

If not, you'll need a known working bulb to swap in.

How do you establish a known working bulb? Well, one good way is to take the bulb from another room where it is already lighting a similar lamp.

Your goal in this phase should be to isolate or eliminate potentially broken components.

If swapping the bulb doesn't work, then maybe something else is broken.

In that case, it is prudent to carry another lamp (or other electric device) into the room, plug it in, and make sure the wall switch and outlet are working. Thus, you can continue to narrow down the likely problem.
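
The swapping strategy can be sketched in code as well. This is a toy model, assuming each component can be replaced independently and that you have a works() check for the whole system; the part names and the find_faulty helper are hypothetical.

```python
from typing import Callable, Dict, List

System = Dict[str, str]


def find_faulty(
    suspect: System,
    known_good: System,
    works: Callable[[System], bool],
) -> List[str]:
    """Swap in one known-good part at a time.

    Returns the slots whose replacement alone makes the system work.
    """
    fixes = []
    for slot, replacement in known_good.items():
        trial = dict(suspect)      # start from the broken setup each time
        trial[slot] = replacement  # swap exactly one part
        if works(trial):
            fixes.append(slot)
    return fixes


# Hypothetical setup: lamp-A is the broken part, but we don't know that yet.
broken_parts = {"lamp-A"}


def room_is_lit(system: System) -> bool:
    return not (set(system.values()) & broken_parts)


suspect = {"bulb": "bulb-A", "lamp": "lamp-A", "outlet": "outlet-A"}
known_good = {"bulb": "bulb-B", "lamp": "lamp-B"}

print(find_faulty(suspect, known_good, room_is_lit))  # prints ['lamp']
```

An empty result is informative too: either the fault is in a part you had no known-good replacement for (here, the outlet), or more than one part is broken at the same time.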

Obtain and Learn to Use Appropriate Debugging Tools


So, at this point, we've either narrowed down the problem to something in the room, or eliminated all of the items in the immediate room as the culprit.

Let's assume something in the room is broken, but we don't know if it is the wall switch, the lamp, the lamp switch, or the outlet.

What hidden assumptions have we made?

Have you considered these:
  • Is the problem reproducible or intermittent?
  • Is there potentially more than one source of the problem?
For this discussion, let's assume you've narrowed it down to the lamp being the problem.

Do you go back to the homeowner and ask if you can permanently move another lamp into the room, or do you need to fix this lamp?

Is there an alternative outlet in the room that you can use? (perhaps one without a wall switch)

What tools and techniques can you use to debug the light switch and outlet?
  • Do you have a screwdriver to open the switch plate or outlet cover?
  • Do you have an electrical meter with a good battery?
  • Do you know how to use them without electrocuting yourself?

Expand Your Horizons to Consider the Entire System


Let's suppose that you've verified the operation of the in-room components (lamp, lightbulb, switch, etc.) but the light still doesn't work.

What other assumptions have you made that are perhaps invalid and need to be explored further?
  • Is the circuit breaker for the room tripped?
  • Does the house have power from the main panel or a sub panel?
  • Is power out in the entire neighborhood?
  • Is power out for large areas?
Why does it matter if power is out for the entire neighborhood versus the entire state?

Two reasons:
  • If the power is out for just your house or neighborhood, have you called the power company to notify them?
  • If the power is out for a large area, how does it likely impact the schedule for restoring power? Have you made a back-up contingency for an extended outage, such as a generator? 

Keep working through the elements with the standard process described above. When in doubt, do some research or ask for some help. When stuck, get some food or sleep, or just time away from the problem to clear your head.

Let There Be Light


So what have we learned about debugging in today's blog post?
  • Gather the requirements - identify both the problem and viable solutions
  • Reproduce the problem
  • Identify and challenge your assumptions
  • Verify assumptions by testing
  • Swap out components to narrow down the problem
  • Use debugging tools to find more details about the problematic component
  • Consider outside sources of error
  • Make contingency plans

Conclusion


We've covered a lot of things that may seem obvious, silly, or superficial.

But, believe me, these are time-tested techniques that work every time to debug any situation.

When you are struggling with a problem, remind yourself of this mantra:

"Basic debugging techniques apply"

That is, you should fall back on these fundamental techniques when you are stuck on a problem.

If you follow this process religiously, you will see how well it works, and you will solve problems that have flummoxed you or co-workers in the past.

Happy debugging!
