Friday, 23 March 2012

I imagine a software troubleshooting team should look like this...

Troubleshooting (debugging) at the system level plays a major role in embedded firmware/software development. Sometimes there is a dedicated team for it. In my opinion, the members of this team are special in three ways: how they carry out their debugging activities (moving from a hypothesis to proving and defining the problem, and then suggesting a solution), the dynamic nature of the problems they face, and their readiness to explore new areas of the system.

A team like this may not always be in the limelight, and its activities may not always be visible to the other teams involved in product development over the entire product development lifecycle. Sometimes this team is called in only when a problem becomes a blocking issue.

I spent a lot of time thinking about how a troubleshooting team could surface and establish itself as a front-line player throughout the product development lifecycle.

Well, a lot has been said and written on the art of firmware/software troubleshooting, but one thing I have not yet come across is a "TROUBLESHOOTER'S DIARY". Every troubleshooter should maintain a troubleshooting diary, which should contain:
  • A QRC (quick reference card) of all debugging commands.
  • Short notes on tool usage.
  • Basic data flow and control flow models of the modules in the system.
  • Short notes on verification methods.
  • Instrumentation/debugging code. 
  • Parsing scripts.
  • Statistics of issues debugged in a week and in a month (type of issue, frequency of occurrence, preventive/corrective action taken).
  • (If the issue is new) A case study. 
  • Templates/automation scripts for all kinds of reports he/she prepares.
  • There can be more...
A unit-test bug found and fixed by a system-level troubleshooter should be highlighted in the statistics corner of his/her diary, since finding a unit-level bug at the system level (especially when the system is concurrent) is tedious and deserves an accolade.
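
As a rough illustration of the statistics corner described above, here is a minimal Python sketch that tallies diary entries per week. The DiaryEntry fields, the weekly_stats function, and the sample entries are all hypothetical, invented only for this example; a real diary would use whatever fields suit the team.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass
class DiaryEntry:
    """One debugged issue as recorded in the diary (hypothetical layout)."""
    day: date
    issue_type: str              # e.g. "race condition", "memory leak"
    action: str                  # preventive/corrective action taken
    unit_test_bug: bool = False  # unit-level bug caught at system level


def weekly_stats(entries):
    """Group entries by ISO (year, week) and count issue types per week."""
    stats = {}
    for entry in entries:
        iso = entry.day.isocalendar()
        key = (iso[0], iso[1])  # (ISO year, ISO week number)
        week = stats.setdefault(key, {"by_type": Counter(), "unit_test_bugs": 0})
        week["by_type"][entry.issue_type] += 1
        if entry.unit_test_bug:
            week["unit_test_bugs"] += 1
    return stats


# Made-up sample entries, for illustration only.
entries = [
    DiaryEntry(date(2012, 3, 19), "race condition", "added mutex around queue"),
    DiaryEntry(date(2012, 3, 21), "race condition", "reviewed ISR locking",
               unit_test_bug=True),
    DiaryEntry(date(2012, 3, 22), "memory leak", "freed buffer on error path"),
]

stats = weekly_stats(entries)
for (year, week), data in sorted(stats.items()):
    print(year, week, dict(data["by_type"]), "unit-test bugs:", data["unit_test_bugs"])
```

The same grouping could be keyed by month instead of ISO week to produce the monthly figures.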

The troubleshooting team should hold a "DIARY-MEETING" every week or every month. The members should discuss each other's diary entries for the week/month and then present quantified data/statistics on their effort, the issues fixed/analyzed, and the actions taken to the entire team. This meeting should also identify new or strange issues and prepare a case study on each of them.

The prepared case study should be shared with the development team and with newcomers to the troubleshooting team, so that they can learn from past mistakes and the issues never recur or, if they do recur, the analysis does not take much time.

  • Every patch released by the integration team should be verified and signed off by this team before release. This can save a lot of painstaking effort.
  • Troubleshooters should also be invited to module-level design reviews, as over time they become well acquainted with the system.

This may be an existing model in many organizations where the process model supports it.
But it may be a missing link in some organizations where time to market is a critical factor and multiple products are pipelined.

Apart from this, there should be a "Troubleshooting Skills Improvement Plan", which should contain schedules for:
  • Workshops on various components of the system.
  • Brainstorming sessions.
  • Data analysis sessions.
  • Sessions by troubleshooting Pandits.
  • and more.......
In my imagination, this looks like the perfect way to make a troubleshooting team a front-line player in software product development.

Having said all this, I would also say that I am not an expert in this area. However, I aspire to be.