Tuesday, November 24, 2009

Ban The Debugger

How much logging do you really need in your application?

I was visiting a client recently to help diagnose some problems with one of their applications. I asked their support people to send me the log file, which seemed like a good place to start. I got a file of about 2.5K, which seemed kinda small to me. I opened it up and found just one exception and its stack trace, so I got back to the support people to check. Yep, that was it: logging cranked up to max and the only output was one stack trace. Not even the date and time of the problem. Wow!

A technique I find very useful prior to go-live is to Ban the Debugger. Developers get very used to just firing up the debugger when fixing issues or diagnosing problems. This is fine, but it means that no one looks at the log files from the point of view of someone who only has those available to find out what is going on. For our colleagues in support and operations, the log files are all they have to find and fix issues.

So during development, prior to go-live, I stop the developers from using the debugger. Instead I ask them to spend at least some time trying to fix the issue based solely on the log files and whatever else our friends in support will have available once we have gone live. This usually leads to a big upswing in the amount of logging, and it's logging we know helps to fix issues. Of course sometimes you do need the debugger, but hopefully only after we've used the logging to narrow down the problem area and to understand what the users were trying to do at the time.

So back to the client above - it turned out the technical lead had never worked in support and that the support team had not really been represented during development. We owe our colleagues better than this. Perhaps Banning the Debugger for a time during development might help.
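
To make the complaint about the client's log concrete, here is a minimal sketch of the kind of output a support person could actually work from: every line carries a date and time, and the log records what the user was trying to do before the stack trace appears. It assumes Python's standard logging module; the "orders" logger name and the place_order function are hypothetical, chosen only for illustration.

```python
import io
import logging


def build_logger(stream):
    """Return a logger whose every line carries a timestamp, level and source."""
    logger = logging.getLogger("orders")  # hypothetical component name
    logger.setLevel(logging.DEBUG)
    logger.handlers.clear()
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s %(message)s")
    )
    logger.addHandler(handler)
    return logger


def place_order(logger, user, quantity):
    """Hypothetical operation, logged so support can see what the user attempted."""
    logger.info("place_order start user=%s quantity=%s", user, quantity)
    try:
        if quantity <= 0:
            raise ValueError("quantity must be positive")
        logger.info("place_order ok user=%s", user)
        return True
    except ValueError:
        # logger.exception logs at ERROR level and appends the stack trace.
        logger.exception("place_order failed user=%s quantity=%s", user, quantity)
        return False


buffer = io.StringIO()  # stands in for the log file support would receive
log = build_logger(buffer)
place_order(log, "alice", 0)
output = buffer.getvalue()
print(output)
```

The stack trace is still there, but it is now preceded by a timestamped line saying who was doing what, which is the part the client's 2.5K file was missing.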

7 comments:

JimWi said...

I think if any two programmers used the same logging format this would be great...and of course there are the devs who have a binary approach to logging and you either get nothing or, quite literally, a line of log for every line of code.

Another option of course is to provide the support function with a very 'thin' debugger app so they can view the code and locate the problem. Why this is good, in my opinion, is because variable/register values etc. can be directly manipulated in the debugger to recreate the failure scenario. This depends on the abilities of the support staff I guess...

auxbuss said...

Good article. I sort out corporate computing messes of various scales. A debugger is cool to find out where something has been and whether it goes where you think it went. Sometimes -- often early in a gig -- it's the only way. But if you aren't developing on your logs, then you aren't developing maintainable software. And if you aren't developing maintainable software, then you've got a lot to learn.

ketchup said...

Banning the debugger is like cutting your legs off. Doing so just before going live is even worse, because your developers are, at this point, used to their debugger; if you have to force them to use logfiles, you're probably without an efficient debugging method just before due day. I would recommend against it.

Next point: Too verbose a logfile is worse than useless, because it hides vital information in tons and tons of junk (if it really is in there). Developers tend to love stacktraces in logfiles, which is fine for debugging purposes, but useless for admins, who have to read logfiles of many different applications. As a rule of thumb, anything with a stacktrace attached is routed as a bug back to development.

So I suggest keeping an eye on your logfiles early in development. Make sure they are readable, not only by developers (because most logfiles are not read by developers), and completely free of junk. Ask one of your admin staff if he likes your logfiles and make sure he can find configuration problems himself. If he complains, listen. (Mind you, developers reading logfiles to find a wrong line in a configuration file is a brilliant way to throw away real money.)

Finally, log any exception which does not imply a configuration error with full stacktrace. No need to be shy here, a bug is a bug. But hey, that's what all people do, right?
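
The rule above (readable one-liners for configuration problems, full stacktraces for everything else) could be sketched roughly as follows. This assumes Python's standard logging module; ConfigError, run_step and the "missing db_host" message are hypothetical names made up for the example.

```python
import io
import logging


class ConfigError(Exception):
    """Hypothetical exception type for problems an admin can fix in a config file."""


def run_step(logger, step):
    """Run a callable and log any failure according to who has to act on it."""
    try:
        step()
    except ConfigError as err:
        # Admin-facing: one readable line, no stack trace.
        logger.error("Configuration problem: %s", err)
    except Exception:
        # Developer-facing: unexpected failure, full stack trace attached.
        logger.exception("Unexpected error, please report to development")


def bad_config():
    raise ConfigError("missing 'db_host' in app.ini")


def bug():
    return 1 / 0


buffer = io.StringIO()  # stands in for the logfile
logger = logging.getLogger("app")
logger.handlers.clear()
logger.propagate = False
logger.setLevel(logging.INFO)
handler = logging.StreamHandler(buffer)
handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
logger.addHandler(handler)

run_step(logger, bad_config)
config_output = buffer.getvalue()
run_step(logger, bug)
bug_output = buffer.getvalue()[len(config_output):]
print(config_output + bug_output)
```

With this split, an admin scanning the file can act on any line without a stacktrace, and anything with one attached goes back to development.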

Teemu said...

We built quite an elaborate logging system for our game libraries.

It has really started to pay off and my usage of the debugger has dramatically decreased. With debug traces, we can see the context of the problem, which is critical for reasoning about the execution in complex problem cases.

Building a good logging/tracing system is not a trivial task. There are a lot of aspects and design issues you need to consider: flexible tracing syntax, conditional tracing & conditional compilation of tracing for performance reasons, pretty printing, callstacks, robustly sending traces to a server, etc. Different projects have different tracing needs, but I still find it odd that there aren't more quality open source tracing libraries in common usage.
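
As an illustration of the conditional tracing idea only (this is not the game library described above), here is a minimal sketch using Python's standard logging module. It is a runtime check rather than conditional compilation, which Python has no direct equivalent of. The "game.physics" logger name, expensive_dump and trace_state are hypothetical.

```python
import io
import logging

calls = {"count": 0}  # counts how often the costly formatter actually runs


def expensive_dump(state):
    """Hypothetical pretty-printer standing in for a costly trace formatter."""
    calls["count"] += 1
    return ", ".join(f"{k}={v}" for k, v in sorted(state.items()))


def trace_state(logger, state):
    # Only pay for formatting when DEBUG tracing is switched on.
    if logger.isEnabledFor(logging.DEBUG):
        logger.debug("state: %s", expensive_dump(state))


buffer = io.StringIO()  # stands in for the trace output
tracer = logging.getLogger("game.physics")
tracer.handlers.clear()
tracer.propagate = False
tracer.addHandler(logging.StreamHandler(buffer))

state = {"x": 1, "vy": -3}

tracer.setLevel(logging.WARNING)
trace_state(tracer, state)  # tracing off: the formatter never runs

tracer.setLevel(logging.DEBUG)
trace_state(tracer, state)  # tracing on: one trace line is written

print(buffer.getvalue())
```

The guard means tracing can stay in the code permanently while costing little more than one level check when it is switched off.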

Wes Williams said...

you cannot debug most real production apps. the performance hit is way too much.

i think he is suggesting that requiring that the logs be read will require them to design the log output just as they would any other part of the app. this is real functionality that has a real role/user that requires it.

how about a user story:

In order to be able to quickly investigate and resolve user issues
As a customer support specialist
I need a log that will indicate what was happening, when it occurred and where exceptions occurred.

maybe you need this for every feature you are adding.

Maht said...

Someone said it more succinctly in 1979

"The most effective debugging tool is still careful thought, coupled with judiciously placed print statements." -- Brian Kernighan

and followed it up with

"Everyone knows that debugging is twice as hard as writing a program in the first place. So if you're as clever as you can be when you write it, how will you ever debug it?"

JimWi said...

I guess what I was trying to say is that, as with most things in life, it depends on the situation.

When promoting code to production I feel having some kind of debugging mechanism ready to troubleshoot any issues caused by the production environment is key to rapid resolution.

Once in the production environment, and from then on, proper and useful logging is the way to go...

And as a previous post stated, logging (just like security) should be integrated into the code as it is being written, not treated as an afterthought. We use a library for logging that gets reused each time...in theory at least.