Processing the output of any program on the fly

You run a tool for hours, save its output to a logfile, then parse the log for errors only to realize that there was a critical message in the first few minutes of the run. What a waste of time.

You can instead process the stdout of a program on the fly, using a unix pipe and your most powerful ally, the bash shell. Greg’s wiki explains how to read stdout as it is produced.

Let’s start with a basic example: You want to stop at the first error.

set -o pipefail
(echo "foo"; echo "error"; echo "baz") |
(
  while read -r; do
    if echo "$REPLY" | grep -q "error"; then
      echo "Found error: $REPLY" >&2
      exit 2
    fi
    echo "$REPLY"
  done
)
exit $?

First we pretend to have a process which outputs messages, modeled here by the echo statements. They are grouped in a subshell (the parentheses open a subshell) and the output of this subshell is piped to the next command down the line, which is a subshell too. The second subshell contains the while loop, which reads its stdin one line at a time and greps each line for errors. When an error is detected, the offending line is copied to stderr and an exception is thrown with the exit statement. If a line contains no error, it is simply copied to stdout with echo. At the end, we re-throw the exit code with exit $? so the caller knows this script has encountered an error.

There are a few shell things in there. The parentheses ( ... ) create a subshell. The while read loop reads stdin one line at a time. $REPLY is a bash built-in variable which read fills in when it is given no variable name. The $? built-in variable holds the exit code of the last command, function or script that was run. In this script, the last thing run is the pipeline ending in the subshell with the while loop, so $? holds the exit code of that subshell (and, thanks to set -o pipefail, a failure anywhere in the pipeline is not masked by a successful last stage).
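These built-ins are easy to see in isolation. A minimal standalone sketch (not part of the error filter above):

```shell
#!/bin/bash
# read with no variable name stores the line in the built-in $REPLY;
# -r keeps backslashes in the input literal
printf 'hello world\n' | while read -r; do
  echo "got: $REPLY"
done

# $? holds the exit code of the last command that ran
false
echo "exit code of false: $?"   # prints 1
true
echo "exit code of true: $?"    # prints 0
```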

Here is another version, where we look for both errors and a “must see” expression.

set -o pipefail
var=1
(echo "foo" && echo "bar" && echo "baz") |
(
  while read -r; do
    if echo "$REPLY" | grep -q "error"; then
      echo "Found error: $REPLY" >&2
      exit 2
    fi
    if echo "$REPLY" | grep -q "must see"; then
      var=0
    fi
    echo "$REPLY"
  done
  exit $var
)
exit $?

Here we have a variable var which we clear when we find the “must see” string. In this example the expression is never found, so after the loop exits, exit $var throws a non-zero exit code (var is initialized to 1 at the beginning). The exit code is re-thrown after the loop has exited so callers of this script will know how it ended.

You can do all kinds of sophisticated things here, such as counting lines, printing a few hundred lines beyond an error before exiting, or counting errors and aborting after you’ve seen N of them. You can store the output to a logfile with I/O redirection. It gets a bit hairy when you also want to use tee, but it can be done.

Your first book when you go into ASIC verification should not be the Art of Verification with Vera, but rather the Advanced Bash-Scripting Guide. Read all you can about the shell; it will not be wasted. Hanging out on IRC in #bash helps too. When you reduce every program to its essence, you realize all you ever need is the exit code. You do care about what thousands of log files have to say, but first and foremost you want to know: pass or fail? The exit code is the answer, no matter how sophisticated your entire verification environment becomes: cmd && echo pass || echo fail.


3 Responses to Processing the output of any program on the fly

  1. Etan says:

    I agree that Bash is useful, and that processing errors as soon as possible is a much better approach, as you describe. However, when you delve into career advice in the last paragraph, I think that is a bit of a stretch. First — if learning a scripting language, Bash would not be my recommendation. It is nowhere near as powerful as Python / Ruby / Perl / TCL. To choose Bash as the scripting language of choice would be a mistake. Second — and more fundamentally, you should not have to go through these hoops — the verification environment should have hooks so it fails after N errors, or N ps after N errors, or one fatal error message.

  2. Colin Marquardt says:

    Good article.
    The simple “can do all kinds of sophisticated things here” actually means
    “*must* do all kinds of sophisticated things here” however.

    Things that can make your life hard are e.g.:
    timing violation reports at a time in simulation when they can be ignored;
    occurrence of the word “error” etc. in otherwise innocent places;
    wrapped-around lines that would need to be unwrapped to correctly grep on them;
    STDOUT/STDERR not being an exact replication of the logfile;
    helper programs that do not set the exit code correctly;
    grave messages not being marked up as error or failure consistently.

    And I’m sure this list is not even close to being exhaustive…

  3. Martin d'Anjou says:

    Etan, I wish you were right. I wish verification environments were simpler. But I am afraid Colin is right: there is more to it than well written test cases that exit when they fail.

    Sometimes, an already “working” script and its companion parsing script have to be inserted in the existing verification flow. This is when parsing output on the fly is useful and can replace the parsing script.

    By the way, Verilog does not return with a proper exit code, so in order to obtain one, the output has to be parsed and a proper exit code produced.
