Why am I babbling about it? Well, in order to answer that I have to introduce you to yet another programmer's best friend: DTrace (too lazy to write more, go read about it yourself!). In my opinion, if there's one particular attribute of DTrace that makes it so appealing, it has to be its simplicity. I mean, look at what it achieves; in a completely unobtrusive manner, you're suddenly in the know about the inner workings of the deepest layers of your code. All you need is knowledge of the D language and a shell. Such information about the behavior of your code has never been so easily accessible. Valgrind, similarly, gives you a wealth of information about your code -- at a much lower level, though -- but the process you have to go through to arrive at such information is very complicated in comparison, unless a tool that extracts the exact information you need is already available. So, I've been thinking, what if DTrace's ease and accessibility could be brought to Valgrind, so that anyone could write a Valgrind tool in a scripting language? Although the complexity of writing a Valgrind tool is not as big an obstacle for Valgrind's intended audience as it would be for DTrace's, I believe such a possibility would still be appreciated.
It's been a while since I started playing with this idea in my head, but for a few weekends now, I've been actually trying to come up with a prototype. The idea was to write a Valgrind tool that's composed of a scripting language interpreter and a set of interface functions/objects that act as a bridge between the script and the Valgrind tool API. How the code is instrumented is completely dictated by the script -- the tool is simply an environment in which the script can be executed.
Initially, I thought of writing my own interpreter for a language that I had intended to make as similar as possible to D. Fortunately, it didn't take long before I realized this was not the wisest thing to do. It made much more sense to embed an interpreter for an existing language such as Ruby or Python. I settled on Ruby, since I found it easy to write C extensions for, and Ruby 1.9 was moving ahead nicely in the performance race. Filled with excitement, I immediately ran into the first roadblock. Valgrind has this incredibly tight restriction of not allowing tools to use any libc calls. Instead, it offers its own implementation of a subset of libc functions. Good luck trying to modify a modern, full-fledged language interpreter to use those!
However, I didn't give up. I downloaded the Ruby interpreter source and tried to switch it over to Valgrind's own libc substitutes. It was tough. I first had to get rid of every dependency on an external library, since such libraries also rely on libc and I can't easily change them. During that process, I had to disable some language modules, but then I came across code that looked like trouble: code using traps and setjmp. My gut told me this wouldn't play nicely with Valgrind, and it was difficult to replace anyway, so I started looking for a less demanding interpreter.
This time around I did my homework better, and spent more time comparing scripting languages. I finally chose Lua: it's light, fast, compilable, its interpreter is smaller and simpler than Ruby's, and it too can be extended easily in C. In less than a day, I was able to get it to compile inside a Valgrind tool and actually run simple code snippets at certain events, so basically, phase 1 of the project is done. I called the tool luagrind, and for starters, here's a sample script that simply dumps the command line options passed to the tool:
-- Script: script.lua
print("Valgrind command line options:")
-- vg.clos is the table of command line options created by luagrind
for i, v in pairs(vg.clos) do
    print(i, "=", v)
end
When luagrind starts, it creates a "Lua state" where a single global class called "vg" lives. The script simply dumps the "clos" table which is also created by luagrind using the C API, and contains all command line options. To invoke luagrind using this script, the following command is used:
valgrind --tool=luagrind --luascript=script.lua --arg1=v1 --arg2=v2 date
It doesn't matter which binary we're running here as we're not instrumenting anything yet. In the example above, I chose "date". The output looks like this:
Valgrind command line options:
luascript = script.lua
arg2 = v2
arg1 = v1
Sat Jun 7 03:17:38 EEST 2008
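For the curious, here's a rough idea of how options like these could be split into the key/value pairs that end up in "clos". This is only a sketch under my own assumptions, not luagrind's actual code: the names clo_pair and parse_clo are made up for illustration, and it uses plain libc string functions for brevity, which a real Valgrind tool would have to swap for Valgrind's own substitutes.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical key/value pair used only for this illustration. */
typedef struct {
    char key[64];
    char value[256];
} clo_pair;

/* Split an option of the form "--name=value" into its key and value.
   Returns 1 on success, 0 if the string does not have that shape or
   does not fit into the fixed-size buffers. */
static int parse_clo(const char *opt, clo_pair *out)
{
    const char *eq;
    size_t klen, vlen;

    assert(opt != NULL && out != NULL);

    if (strncmp(opt, "--", 2) != 0)
        return 0;
    opt += 2;                               /* skip the leading "--" */

    eq = strchr(opt, '=');
    if (eq == NULL || eq == opt)            /* no '=' or empty name */
        return 0;

    klen = (size_t)(eq - opt);
    vlen = strlen(eq + 1);
    if (klen >= sizeof out->key || vlen >= sizeof out->value)
        return 0;

    memcpy(out->key, opt, klen);
    out->key[klen] = '\0';
    memcpy(out->value, eq + 1, vlen + 1);   /* also copies the terminating NUL */
    return 1;
}
```

Each pair parsed this way could then be stored in the Lua table through the C API, for instance with lua_pushstring followed by lua_setfield.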
A lot of work still has to be done. The "vg" class has to grow to support all important operations, such as defining the needs of the tool, exposing the VEX instruction set (used internally by Valgrind as an intermediate representation of instructions), and most importantly, directing the instrumentation process. More on this when more is done!