Tuesday, November 24, 2009

Ban The Debugger

How much logging do you really need in your application? I was visiting a client recently to help diagnose some problems with one of their applications. I asked their support people to send me the log file, which seemed like a good place to start. I got a file of about 2.5K, which seemed kinda small to me. I opened it up and found just one exception and its stack trace, so I got back to the support people to check. Yep, that was it: logging cranked up to max and the only output was one stack trace. Not even the date and time of the problem. Wow!

A technique I find very useful prior to go live is to Ban the Debugger. Developers get very used to just firing up the debugger when fixing issues or diagnosing problems. This is fine, but it means that no one looks at the log files from the point of view of someone who only has those available to find out what is going on. For our colleagues in support and operations, the log files are all they have to find and fix issues.

So during development prior to go live I stop the developers from using the debugger. Instead I ask them to spend at least some time trying to fix the issue based solely on the log files and whatever else our friends in support will have available once we have gone live. This usually leads to a big upswing in the amount of logging, and it's logging we know helps to fix issues. Of course sometimes you do need the debugger, but hopefully only after we've used the logging to narrow down the problem area and to understand what the users were trying to do at the time.

So back to the client above. It turned out the technical lead had never worked in support and that the support team had not really been represented during development. We owe our colleagues better than this. Perhaps Banning the Debugger for a time during development might help.
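
As a rough illustration of the kind of log output that helps someone who only has the log file, here is a minimal Python sketch using the standard `logging` module. The logger name `orders`, the `submit_order` function and its parameters are all hypothetical and not taken from the client's application. The point is only that every line carries a date and time, and that a failure is logged together with what the user was trying to do.

```python
import io
import logging


def build_logger(stream):
    """Return a logger whose lines carry a timestamp, level and logger name."""
    logger = logging.getLogger("orders")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.handlers.clear()   # avoid duplicate handlers if called twice
    logger.propagate = False  # keep output on our handler only
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s %(message)s")
    )
    logger.addHandler(handler)
    return logger


def submit_order(logger, user_id, order_id, quantity):
    """Hypothetical operation that logs what the user was attempting."""
    logger.info(
        "submit_order start user=%s order=%s quantity=%s",
        user_id, order_id, quantity,
    )
    try:
        if quantity <= 0:
            raise ValueError("quantity must be positive")
        logger.info("submit_order ok user=%s order=%s", user_id, order_id)
        return True
    except ValueError:
        # logger.exception writes the stack trace along with the context,
        # so the trace is never the only thing in the file.
        logger.exception(
            "submit_order failed user=%s order=%s quantity=%s",
            user_id, order_id, quantity,
        )
        return False


if __name__ == "__main__":
    buffer = io.StringIO()  # stands in for the log file
    log = build_logger(buffer)
    submit_order(log, "u1", "o1", 0)
    print(buffer.getvalue())
```

Reading the output with no debugger attached is a quick check of whether the log alone says when the problem happened, who hit it and what they were doing.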

Wednesday, November 04, 2009

Technology Lightning Talks in Chicago

Some of my colleagues on the ThoughtWorks Technology Advisory Board will be delivering Lightning Talks next week in Chicago. I'm not speaking myself but will be attending. If you are in Chicago and would like to come along, follow this link for registration information. I think it's a pretty great list of speakers and am looking forward to hearing them myself.

Monday, November 02, 2009

Databases and Separation of Concerns

A continuing source of pain on projects is Object Relational Mapping and databases. One contributor to this pain is the mixture of two concerns, which seems to occur on nearly every project that uses a database. I think spending a moment to think about these two different kinds of usage is worthwhile. So what are these two concerns?

A. Persist State Needed for Recovery
This is just working state saved so that when the application restarts we can continue processing. For example, the current state of a customer order or work in progress on a very long running calculation.

B. Data Saved for Reporting and Querying
This is data saved so it can be queried later on, perhaps to allow end of month reports to be generated or user trends to be tracked. It is not needed to recover the working state of an application.

Many teams try to overload (A) to achieve (B). The sorts of problems this can cause are:

i) Data Volume
The volumes of data needed for (A) tend to be smaller, since it's current working data as opposed to historical data. When historical data accumulates in the same tables, this can show itself as performance issues for the application as queries become slow over time.

ii) Object Design
The object design needed for (A) and (B) is not necessarily the same. This often shows itself as fields or object relationships being created with names like "history" or "recordOf", so an object design created for (A) becomes overloaded with things needed for (B). This again causes performance issues, as the number of objects and the amount of data pulled into memory by the ORM can increase. It also means a simple state update can touch a lot of tables as we try to achieve (B) at the same time, and chances are the indexes needed for historical queries will start to slow these updates down.

iii) Confusing Code
As with any other area where we fail to achieve separation of concerns, the code can become confusing. For example, state change operations become implicitly overloaded to create and persist data needed for historical reasons.

iv) Archiving
Archiving of historical data becomes problematic, because you can't cleanly identify what data in the DB can be safely moved out to a historical database or deleted without impacting the functionality of the application itself.

It won't always help, but separating these two concerns can provide clarity and address some kinds of performance issues. I certainly think it's worth calling out these concerns as different kinds of requirement even if you end up using the same implementation for both.
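
To make the distinction concrete, here is a minimal Python sketch using the standard `sqlite3` module, with two in-memory databases standing in for a working store (A) and a reporting store (B). The table names, column names and functions are hypothetical, and using two physically separate databases is just one possible arrangement, not a recommendation. Note that the two writes here are not in a single transaction, which is a trade-off a real design would need to consider.

```python
import sqlite3


def open_stores():
    """Create a working store (A) and a separate reporting store (B)."""
    working = sqlite3.connect(":memory:")
    reporting = sqlite3.connect(":memory:")
    # (A) one row per order, overwritten on each state change
    working.execute(
        "CREATE TABLE order_state (order_id TEXT PRIMARY KEY, status TEXT NOT NULL)"
    )
    # (B) append-only history, indexed for reporting queries only
    reporting.execute(
        "CREATE TABLE order_history (order_id TEXT, status TEXT, changed_at TEXT)"
    )
    reporting.execute(
        "CREATE INDEX idx_history_changed_at ON order_history (changed_at)"
    )
    return working, reporting


def save_state(working, order_id, status):
    """State update for recovery: touches only the small working table."""
    working.execute(
        "INSERT OR REPLACE INTO order_state (order_id, status) VALUES (?, ?)",
        (order_id, status),
    )
    working.commit()


def record_history(reporting, order_id, status, changed_at):
    """Explicit, separate write of data kept for reporting and querying."""
    reporting.execute(
        "INSERT INTO order_history (order_id, status, changed_at) VALUES (?, ?, ?)",
        (order_id, status, changed_at),
    )
    reporting.commit()


def archive_before(reporting, cutoff):
    """Remove history older than cutoff; the working store is never touched."""
    cursor = reporting.execute(
        "DELETE FROM order_history WHERE changed_at < ?", (cutoff,)
    )
    reporting.commit()
    return cursor.rowcount


if __name__ == "__main__":
    working, reporting = open_stores()
    save_state(working, "o1", "NEW")
    record_history(reporting, "o1", "NEW", "2009-10-01")
    save_state(working, "o1", "SHIPPED")
    record_history(reporting, "o1", "SHIPPED", "2009-11-02")
    print(working.execute("SELECT * FROM order_state").fetchall())
    print(archive_before(reporting, "2009-11-01"))
```

In this arrangement the working table stays small, the history write is a visible step in the code instead of a side effect of a state change, and archiving only ever looks at the reporting store.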