Tuesday, April 27, 2010

Jinx! (Beta) Visual Studio Plug-in to help debug your multithreaded nightmare (err… um… multithreaded code…)

Visual Studio Debugger Team Blog - Jinx: Visual Studio plug-in for debugging multi-threaded code

“Today I’m going to introduce a plug-in for Visual Studio (still in beta) that helps to speed up finding concurrency bugs in multi-threaded applications.

Example of a concurrency bug

Consider an application that has two threads (“Thread A” and “Thread B”) that share a common stack. Each thread reads (pops) one value off the stack and then writes (pushes) one value back onto the stack; between pushes and pops, the threads do other work. During testing the application almost always works correctly, but occasionally it crashes. The test team records data inputs, machine configurations, and how the application was used. Unfortunately, after hours of this they are still unable to reliably reproduce the crash. It turns out that this is a concurrency bug (a bug that occurs only when the events that produce the crash happen with exactly the “right” timing) in the stack_push() function, which is why the crash is so hard to reproduce.
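The stack_push() listing from the original post is not part of this excerpt. As a rough illustration of the kind of bug being described, here is a minimal sketch in C++. The Stack layout, the body of stack_push(), and the replay_unlucky_interleaving() helper are all assumptions made for this example, not the code from the post. The helper replays one bad ordering by hand on a single thread so the outcome is repeatable.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical shared stack; the layout is an assumption for illustration.
struct Stack {
    std::array<int, 8> items{};
    std::size_t top = 0;  // index of the next free slot
};

// Unsynchronized push: three separate steps with no lock around them.
void stack_push(Stack& s, int value) {
    std::size_t slot = s.top;        // step 1: read the shared index
    assert(slot < s.items.size());   // guard against overflow in this sketch
    s.items[slot] = value;           // step 2: write into that slot
    s.top = slot + 1;                // step 3: publish the new top
}

// Replays, on a single thread, the interleaving in which Thread B runs its
// step 1 before Thread A has finished steps 2 and 3.
Stack replay_unlucky_interleaving() {
    Stack s;
    std::size_t slot_a = s.top;  // Thread A, step 1: sees slot 0
    std::size_t slot_b = s.top;  // Thread B, step 1: also sees slot 0
    s.items[slot_a] = 1;         // Thread A, step 2
    s.top = slot_a + 1;          // Thread A, step 3
    s.items[slot_b] = 2;         // Thread B, step 2: overwrites A's value
    s.top = slot_b + 1;          // Thread B, step 3: top is still 1
    return s;
}
```

Two pushes should leave two values on the stack, but in the replayed ordering both threads claim slot 0, so Thread A's value is lost and top ends at 1. A later pop can then find the stack unexpectedly empty. In real code the usual remedy would be to make the three steps indivisible, for example by holding a std::mutex for the duration of the push.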

Introduction to Jinx

Jinx works by making a copy of the application’s state while it is executing and then running multiple "simulations" of the application in the background, trying to force concurrency bugs to appear. Since concurrency bugs normally occur in or around code that accesses shared data, Jinx adds artificial wait states to the simulations so that shared memory accesses occur as close together as possible. In this way, it can potentially reproduce concurrency issues such as the one described above in far fewer runs than it would take for the right order of events to occur naturally on the system.
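The excerpt does not show how Jinx does this internally, so the following is only a toy model of the underlying idea, not Jinx's mechanism. It assumes a push split into two steps (read the top index, then write the slot and publish the new top) and tries every ordering of those steps for two threads. All names here (ModelStack, ModelThread, run_step, and so on) are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical model of the shared stack.
struct ModelStack {
    int items[4] = {0, 0, 0, 0};
    std::size_t top = 0;
};

// One modeled thread performing a single push in two steps.
struct ModelThread {
    int value;         // value this thread pushes
    std::size_t slot;  // private copy of top, taken in step 0
    int next_step;     // 0 = read top, 1 = write slot and publish top
};

// Runs the next step of thread t against the shared stack.
void run_step(ModelStack& s, ModelThread& t) {
    if (t.next_step == 0) {
        t.slot = s.top;
    } else {
        s.items[t.slot] = t.value;
        s.top = t.slot + 1;
    }
    ++t.next_step;
}

// Replays one schedule such as "ABAB"; returns true if a push was lost.
bool schedule_loses_push(const std::string& schedule) {
    ModelStack s;
    ModelThread a{1, 0, 0};
    ModelThread b{2, 0, 0};
    for (char who : schedule) {
        run_step(s, who == 'A' ? a : b);
    }
    return s.top != 2;  // two pushes should leave two items
}

// Counts how many orderings of the four steps lose a push.
int count_failing_schedules() {
    std::string schedule = "AABB";  // sorted, so every ordering is visited
    int failures = 0;
    do {
        if (schedule_loses_push(schedule)) ++failures;
    } while (std::next_permutation(schedule.begin(), schedule.end()));
    return failures;
}
```

In this model, every ordering in which both threads read the top index before either one writes loses a push; only the two fully serialized orderings ("AABB" and "BBAA") are safe. In a real application the threads spend most of their time on other work, so these overlapping orderings come up rarely, which is why lining up the shared accesses on purpose can expose the bug so much sooner.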

Unfortunately, once the bug is reproduced, locating the problem code can be much harder. One issue that can interfere with correctly locating the problem is called overshoot. Overshoot occurs when one thread causes another thread to crash, and the problem thread then continues to execute for a short period of time before the processor halts all of the threads. The problem thread is now at a location that is nowhere near where the bug occurred, making discovery of the faulty code difficult. To address overshoot, Jinx introduces a feature called SmartStop, which holds the problem thread on the last line of code that communicated with the shared data, making discovery of the offending code much easier. In the example above, SmartStop would stop Thread A in the stack_push() function, since this was the last point of communication before the crash.”



Interesting… Even with all the multithreaded goodness in VS2010, there’s still room to grow. Plus I just liked the name of this project… :p
