On 03/10/2009 09:11 AM, Alex Rousskov wrote:
> It does not matter what the testbed script tries to do in this case. The
> facts are that the overloaded all-am target in root Makefile is:
>
> 1) not always the last target evaluated by make
> 2) sometimes evaluated when previous targets have failed
>
> This should be enough to see the design flaw, IMO. You can argue that
> you have carefully scripted testbed's make calls to minimize the harmful
> effects of that flaw, but they are not completely eliminated, require
> constant attention as the script and Makefiles are updated, and the flaw
> is still there. Why not just fix the flaw? That is what the proposed
> patch does.
A fresh case in point:
TESTING: layer-02-maximus
BUILD: .././test-suite/buildtests/layer-02-maximus.opts
Build Failed:
Build Successful.
make[1]: Leaving directory `/EsiLayout/btlayer-02-maximus'
This, of course, can be fixed by grepping more than just the last log
line, but the proper fix is not the endless grep fight, but the robust
and simple design based on the exit-code.
Thank you,
Alex.
>>> 2) The string is logged when make fails if make is told to ignore
>>> errors. This does not happen in the build-test context (by default), but
>>> often happens during developement, when I ran "make -k" to get more
>>> errors than one. It is rather confusing to see the "Build Successful"
>>> message when there are hundreds of errors above it.
>> Would you not define a successful run when make finishes all its tasks
>> as requested to do so?
>
> The successful make execution is the one that finished all its tasks
> successfully (the all-am target is just one of the tasks). You can
> detect a successful run by examining the return code. That is why we
> (including the testbed's test-suite/buildtest.sh script!) often write
>
> make foo && make bar
>
> instead of writing
>
> make foo;
> if grep magic $log
> then
> make bar
> fi
>
> The latter is pretty much what the current test-builds.sh script
> attempts to do, and it is known to cause troubles at least for some of
> us, is more complex, less reliable, yet we are still arguing whether we
> should work harder on implementing that idea correctly!
>
>>> 3) "Make" is designed to exit with an error when there is an error. We
>>> should not be re-implementing that logic. The problems with the current
>>> script and the above caveats are all cases in point.
>>>
>>> Overall, it seems strange to me to reject the change that uses the
>>> correct interface to detect build errors while the current code is using
>>> a half-broken hack.
>> I agree we should not use a hack when there is a legit method of
>> detecting all and any errors. I' just recalling past history of the
>> development of the testbed when this exact approach left some errors
>> uncaught.
> Why did it leave some errors uncaught? Could it be due to all other bugs
> in the testbed? After checking script revision history, I suspect that
> was exactly the case:
>
> r9080: Correct overall approach, but ignores top-level errors.
> r9133: Correct overall approach, quits on the first top-level error.
> r9181: Correct overall approach, but incorrectly added error reporting
> breaks it.
> ... error reporting improves, but the bug is not noticed ...
> r9345: Correct overall approach killed in favor of a buggy "grep
> error" design.
> r9376: A "grep success" check added, in hope to fix "grep error"
> design bugs:
>
> "append success message to build so testbed can catch it and find
> any error without having to enumerate all failure cases."
>
> The unnecessary fight to detect more and more errors stemmed from an
> incorrect addition of error reporting in r9181. The correct approach
> probably worked, but was broken by that addition, and is now blamed for
> problems caused by the error reporting and then error detection code :-(.
>
> Alex.
Received on Tue Mar 10 2009 - 22:34:54 MDT
This archive was generated by hypermail 2.2.0 : Wed Mar 11 2009 - 12:00:03 MDT