I have been working on a large agile development and it has made me think a lot about build problems during continuous integration. I see three problems:
1. The build fails when any of the tests fail. This is the number ONE reason for the build going red. IMO the build should only go red when there is a failure to build; testing is something else. The reason for doing it this way is, of course, that a failing test indicates broken code somewhere, i.e. something has stopped working. However, what often happens is that one area needs more test data, the data is added, and this breaks tests in a completely different area.
2. When the build is red, developers are blocked from checking in their changes. This means that when the build goes green, a flurry of changes is committed in a very short space of time, which often makes the build go red again. Thus the project is very much stop-start-stop. The stopped periods tend to last a long time, because a build takes a long time, and that in turn is because it takes a long time to run all the tests.
3. Fixing defects does not take first priority. There may be dozens of defects even when the build is green. These will be filed in a bug reporting system but there are no failing tests for them. Hence, when a build goes red due to a failing test, stopping to fix it fixes a relatively minor problem whilst many more problems are left unsolved.
I welcome suggestions on how to improve on this situation. Here are some initial thoughts:
First, I would distinguish between a failure to actually build (e.g. due to a compilation error or cyclic dependency) and failing tests. I would only make the build go red on the former. Under these conditions when the build goes red I would programmatically stop all checkins. I would arrange things so that change sets only get a final commit once they have passed through the build. The build without tests is quite quick, taking only a few minutes, so IMO this would not be too onerous.
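As a rough sketch of that distinction, assuming the CI job can report each stage as either a "build" stage or a "test" stage (the `StageResult` type, the `kind` labels, and the extra "yellow" status are all hypothetical, not taken from any particular CI tool), the red/green decision could look like this:

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    GREEN = "green"    # everything built and every test passed
    YELLOW = "yellow"  # built fine, but at least one test failed
    RED = "red"        # failed to build (compile error, cyclic dependency, ...)


@dataclass
class StageResult:
    name: str
    kind: str      # "build" or "test" (hypothetical labels)
    passed: bool


def classify(results):
    """Go red only on a build failure; test failures are reported separately."""
    if any(r.kind == "build" and not r.passed for r in results):
        return Status.RED
    if any(r.kind == "test" and not r.passed for r in results):
        return Status.YELLOW
    return Status.GREEN


def checkins_allowed(status):
    """Block check-ins only while the build itself is broken."""
    return status is not Status.RED
```

With this arrangement a failing acceptance test is still visible (yellow) but does not stop anyone from committing; only a genuine failure to build does.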
Second, I would establish a rule that defects must be fixed before there is any coding for new functionality. This is supposed to be the agile way anyway. I would try to find a way whereby a failing test automatically raises a defect, ensuring that it gets fixed as rapidly as possible. I think that when such test failures are the result of fragile tests that are overly sensitive to the addition of new test data, this rule will quickly make such tests less sensitive.
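A minimal sketch of the "failing test raises a defect" idea, assuming the test run can hand over a list of (test name, failure message) pairs. The `DefectTracker` class is a hypothetical in-memory stand-in; a real bug tracker would need its own client code:

```python
from dataclasses import dataclass, field


@dataclass
class DefectTracker:
    """In-memory stand-in for a real bug tracker (hypothetical)."""
    open_defects: dict = field(default_factory=dict)
    next_number: int = 1

    def raise_for_failures(self, failed_tests):
        """File one defect per failing test, skipping tests that already have one."""
        new_ids = []
        for test_name, message in failed_tests:
            if test_name in self.open_defects:
                continue  # an intermittent test should not flood the tracker
            defect_id = f"DEF-{self.next_number}"
            self.next_number += 1
            self.open_defects[test_name] = {"id": defect_id, "summary": message}
            new_ids.append(defect_id)
        return new_ids
```

Keying defects by test name means a test that fails repeatedly produces a single open defect rather than one per build.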
Can anyone suggest any other ways to avoid the build problems I have described?
> I have been working on a large agile development and it has made me think a lot about build problems during continuous integration. I see three problems:
> 1. the build fails when any of the tests fail. This is the number ONE reason for the build going red. IMO the build should only go red when there is a failure to build. Testing is something else. The reason for it being done this way is, of course, that when a test fails it indicates that somewhere there is broken code, i.e. something has stopped working. However, what often happens is that an area requires more test data so it is added and this breaks the tests that are in a completely different area.
> 2. When the build is red this blocks the developers from checking in their changes. This means that when the build goes green a flurry of changes are committed in a very short space of time which often makes the build go red again. Thus the project is very much stop-start-stop. The periods when it is stopped tend to last a long time, due to the long time it takes to do a build because it takes a long time to run all the tests.
> 3. Fixing defects does not take first priority. There may be dozens of defects even when the build is green. These will be filed in a bug reporting system but there are no failing tests for them. Hence, when a build goes red due to a failing test, by stopping and fixing it a relatively minor problem is fixed whilst many more problems are left unsolved.
> I welcome suggestions on how to improve on this situation. Here are some initial thoughts:
> First, I would distinguish between a failure to actually build (e.g. due to a compilation error or cyclic dependency) and failing tests. I would only make the build go red on the former. Under these conditions when the build goes red I would programmatically stop all checkins. I would arrange things so that change sets only get a final commit once they have passed through the build. The build without tests is quite quick, taking only a few minutes, so IMO this would not be too onerous.
> Second, I would establish a rule that defects must be fixed before there is any coding for new functionality. This is supposed to be the agile way anyway. I would try to find a way whereby a failing test automatically raises a defect, ensuring that it gets fixed as rapidly as possible. I think that when such test failures are as a result of fragile tests that are overly sensitive to the addition of new test data this rule will quickly make such tests less sensitive.
> Can anyone suggest any other ways to avoid the build problems I have described?
The standard XP answer is that you should never have failing unit tests. Period. End of discussion.
So I'd ask why failing unit tests are happening in the integration process in the first place? This should be a very unusual occurrence, not an everyday occurrence.
At a guess, the relevant part of the test suite isn't being run on the developers' machines. If this is so, then that's the problem that needs to be fixed.
There are, of course, a dozen other possibilities. Most XP discussion takes place on Yahoo Groups in the extremeprogramming mailing list. You'll get a much more thorough discussion there.
On 7 Sep, 16:45, John Roth <johnro...@gmail.com> wrote:
> On Sep 6, 11:59 am, Andrew <marlow.and...@googlemail.com> wrote:
> > I welcome suggestions on how to improve on this situation. Here are some initial thoughts:
> > First, I would distinguish between a failure to actually build (e.g. due to a compilation error or cyclic dependency) and failing tests. I would only make the build go red on the former.
> The standard XP answer is that you should never have failing unit tests. Period. End of discussion.
I should have mentioned that the failing tests are always at integration or acceptance level. Of course the unit tests should always run as part of the build and any unit test failure should cause the build to fail.
> So I'd ask why failing unit tests are happening in the integration process in the first place? This should be a very unusual occurrence, not an everyday occurrence.
See above.
> At a guess, the relevant part of the test suite isn't being run on the developer's machines. If this is so, then that's the problem that needs to be fixed.
The test suite IS being run on the developers' machines.
> There are, of course, a dozen other possibilities. Most XP discussion takes place on yahoo Groups in the extremeprogramming mailing list. You'll get a much more thorough discussion there.
On Sep 8, 12:59 am, Andrew <marlow.and...@googlemail.com> wrote:
> I should have mentioned that the failing tests are always at integration or acceptance level. Of course the unit tests should always run as part of the build and any unit test failure should cause the build to fail.
XP doesn't distinguish between unit and integration tests; code is integrated on the developers' machines. What the integration system is for is to catch the, hopefully very rare, occasions where two developers have made incompatible changes and a race condition occurs.
In the same sense, previously passing automated acceptance tests should pass on the developers' machines before the code is checked in.
> > There are, of course, a dozen other possibilities. Most XP discussion takes place on yahoo Groups in the extremeprogramming mailing list. You'll get a much more thorough discussion there.
Let me emphasize that you'll get a much more thorough discussion on the Yahoo Groups ExtremeProgramming list than you will here. You're lucky that anyone at all is still looking at it - this Usenet group is pretty much dead.
On 9 Sep, 09:48, Paul Sinnett <paul.sinn...@gmail.com> wrote:
> On Sep 8, 7:59 am, Andrew <marlow.and...@googlemail.com> wrote:
> > The test suite IS being run on the developers machine.
> Can you describe a case where a test passes on the developer's machine but fails on the server?
Yes. The Selenium tests are failing due to timing issues. Sometimes the test code waits for something asynchronous to happen and does not wait long enough. These errors are intermittent.
> What is the cause of the error?
Problems in the test code. It is very dangerous to have to wait for asynchronously computed results in tests.
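One common way to make such waits less fragile is to poll for the expected condition up to a deadline instead of sleeping for a fixed time. The following is only a generic sketch of that technique; the `wait_until` helper and its parameters are hypothetical and not taken from the tests discussed here. Selenium's own bindings offer explicit waits for the same purpose (`WebDriverWait` in the Python bindings).

```python
import time


def wait_until(condition, timeout=10.0, interval=0.1):
    """Poll condition() until it returns a truthy value or the timeout elapses.

    Returns the truthy value, or raises TimeoutError if the deadline passes.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(interval)
```

A test would then call something like `wait_until(lambda: page_shows_result(), timeout=30)` (with `page_shows_result` standing in for whatever check the test needs), so a slow server delays the test rather than failing it, while a genuine failure still surfaces once the timeout expires.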
On Sep 11, 1:54 am, Andrew <marlow.and...@googlemail.com> wrote:
> On 9 Sep, 09:48, Paul Sinnett <paul.sinn...@gmail.com> wrote:
> > On Sep 8, 7:59 am, Andrew <marlow.and...@googlemail.com> wrote:
> > > The test suite IS being run on the developers machine.
> > Can you describe a case where a test passes on the developer's machine but fails on the server?
> Yes. The selenium tests are failing due to timing issues. Sometimes the test code waits for something asynchronous to happen and does not wait long enough. These errors are intermittent.
> > What is the cause of the error?
> Problems in the test code. It is very dangerous to have to wait for asynchronously computed results in tests.
Do these timing problems show up in production, or are they an artifact of the way that tests are being written?
On Sep 11, 8:54 am, Andrew <marlow.and...@googlemail.com> wrote:
> Yes. The selenium tests are failing due to timing issues. Sometimes the test code waits for something asynchronous to happen and does not wait long enough. These errors are intermittent.
My guess is that a pure XP approach is unsuited to the project you are describing. I think you'll need to make adjustments for these problems. You describe the size of the project as "large," whereas XP was designed for a team of a dozen. You have real-time requirements, whereas XP was designed on (I think I'm correct in saying) an offline batch processing program.
I'm sure it's possible to run your project in an agile way. But I think trying to follow XP by the book might not be the best approach in this case.