Eppur Si Muove

 
 
I gave my boss a long talk on build and integration systems today.  He wanted to know what we have found to be best practices and what difficulties we encounter with our build and integration systems.  He asked for the info so that he can give his boss a write-up about how good our build systems are.
I didn't really feel that my boss got any of the technical ideas I was trying to get across.  But they are important and I want someone to understand, so here are some incomplete experiences with maintaining build and integration systems:
Developer environments -
It's very important that people can develop in self-contained environments.  At the very least this means they have to have a local database that they can modify as they add functionality, or that will contain special data sets that they need.  A local db allows developers to do their work without impacting everyone on the team and without the rest of the team impacting them.  When a developer is ready to publish his work, the common dev database can be updated and he can commit his code as well.
<experience> = In my experience a lot of developers have no interest in maintaining their own local database or in using the command line tools that most dbs come with.  They want to shovel code out the door, and shoveling code is just so much easier when someone else manages their database.  They will try to do their work in the common use databases, and even make changes to them.
It's also important that the external services other than the database can be mocked out -- and these days there are more and more of those external services.  Being able to selectively detach/attach external dependencies allows developers to continue work if some service goes down, allows them to set up certain situations that are hard to duplicate, and allows them to work offline or on the road - i.e. be more productive.
<experience> = Having configurable dependencies is extra work and will probably involve the build system.  The build system is not seen as interesting.  Extra work does not increase the rate of shoveling.  Developers will avoid these types of tasks and hope no one notices.
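To make "configurable dependencies" a bit more concrete, here is a minimal sketch in Python (chosen only for brevity, the idea is the same in Java).  Every name in it is made up for illustration: the rate service, the RATE_SERVICE setting and the price_in function are assumptions, not anything from a real project.  The point is only that the code under development asks a factory for its dependency, and a configuration switch decides whether it gets the real client or an in-memory fake.

```python
import os


class RealRateService:
    """Stand-in for a client that would call a remote service (hypothetical)."""

    def rate(self, currency):
        raise RuntimeError("network access required")


class FakeRateService:
    """In-memory replacement with fixed answers that are easy to control."""

    def __init__(self, rates):
        self._rates = dict(rates)

    def rate(self, currency):
        return self._rates[currency]


def make_rate_service(env=None):
    """Pick the implementation from configuration.

    RATE_SERVICE is an invented setting name; "real" is the default.
    """
    env = os.environ if env is None else env
    if env.get("RATE_SERVICE", "real") == "fake":
        return FakeRateService({"EUR": 1.0, "USD": 1.1})
    return RealRateService()


def price_in(currency, amount_eur, service):
    """Application code only sees the rate() method, not which class provides it."""
    return round(amount_eur * service.rate(currency), 2)


# With the switch set to "fake" this runs with no network at all.
service = make_rate_service({"RATE_SERVICE": "fake"})
print(price_in("USD", 10, service))
```

In a Maven build the same switch would more likely be a profile or a properties file than an environment variable, but that wiring is exactly the "extra work" mentioned above.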
Database -
In general one wants to have a structured method of updating databases, and the most practical method I have seen is migrations: small, ordered change scripts that are applied to every database in the same sequence.  This works for relational dbs and it also works for no-sql databases.  There is a lot of talk about how no-sql databases make migrations unnecessary.  I have not found that in my own work with no-sql dbs (CouchDB).  When I develop I want my data well defined.  I feel it leads to more robust code.
<experience> = Developers still don't want anything to do with sql and will generally try to find someone else to write scripts for them.  Most importantly, though, developers will go along with a migration system and consider it a good idea.
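For anyone who has not seen a migration system, the core of it is small.  The sketch below is not any particular tool, just the idea in Python with the standard sqlite3 module: an ordered list of change scripts plus a table that records which ones have already run.  The schema_version table name and the customer table are assumptions made up for the example.

```python
import sqlite3

# Ordered, append-only list of (version, SQL) steps.  The tables are invented.
MIGRATIONS = [
    (1, "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"),
    (2, "ALTER TABLE customer ADD COLUMN email TEXT"),
]


def migrate(conn):
    """Apply every migration newer than the recorded version.

    Returns the list of versions that were applied on this call.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_version (version INTEGER NOT NULL)"
    )
    current = conn.execute(
        "SELECT COALESCE(MAX(version), 0) FROM schema_version"
    ).fetchone()[0]

    applied = []
    for version, sql in MIGRATIONS:
        if version > current:
            conn.execute(sql)
            conn.execute(
                "INSERT INTO schema_version (version) VALUES (?)", (version,)
            )
            applied.append(version)
    conn.commit()
    return applied


conn = sqlite3.connect(":memory:")
first = migrate(conn)   # a fresh database gets both steps
second = migrate(conn)  # running again finds nothing left to do
print(first, second)
```

Because the runner only looks at the recorded version, a developer's local db, the common dev db and production can each be at a different step and still be brought up to date by the same command.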
Unit Testing -
The real story is that testing allows one to embed knowledge in the code, allows everyone to change code more safely when requirements change, encourages developers to think about code chunks in terms of their specification, and encourages better design.  Unit tests don't 'catch bugs'.  Unit tests do not necessarily lead to 'fewer bugs' and 'more reliable code'.
<experience> = These days all managers say their application has a suite of unit tests and their developers are writing unit tests.  This is mostly bullshit posturing.  Most developers still don't write tests, and a lot of developers will throw in a few tests for their managers during the qa, post qa / pre deploy stage.  Where I work, at a worldwide, top ten traffic volume web site, the unit tests have devolved into essentially assertTrue(true).  People are under pressure to get stuff out, and it takes only a few people who don't want to maintain other people's test cases for it all to fall apart.  But we have tons of test cases - and they always pass.
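Here is what the difference looks like, sketched with Python's unittest (the JUnit version is line for line the same idea).  The shipping_cost function and its rule are invented for the example.  The first test case is the assertTrue(true) style: it calls the code and asserts nothing about it.  The second one writes the specification down, including the boundary.

```python
import unittest


def shipping_cost(order_total):
    """Hypothetical rule: orders of 50 or more ship free, others pay a flat 5."""
    return 0 if order_total >= 50 else 5


class VacuousTest(unittest.TestCase):
    def test_shipping(self):
        shipping_cost(10)      # the result is thrown away
        self.assertTrue(True)  # can never fail, whatever the code does


class ShippingSpecTest(unittest.TestCase):
    def test_orders_below_threshold_pay_flat_fee(self):
        self.assertEqual(shipping_cost(49), 5)

    def test_threshold_itself_ships_free(self):
        self.assertEqual(shipping_cost(50), 0)


def run(case):
    """Run one TestCase class and return its unittest.TestResult."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(case)
    result = unittest.TestResult()
    suite.run(result)
    return result


for case in (VacuousTest, ShippingSpecTest):
    outcome = run(case)
    print(case.__name__, outcome.testsRun, outcome.wasSuccessful())
```

Both suites are green today.  Only the second one would turn red if someone changed the threshold, and only the second one tells the next developer what the rule is supposed to be.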
Integration Testing -
One wants to test database code against a real database.  Mocking doesn't cut it here, and small changes to sql can often have unexpected consequences.
<experience> = It's a pain in the ass to use tools like DbUnit.  Spring is the way to go here.  One wants to have a special database for the integration tests and always have it in a well defined state.  Developers will write integration tests, but it may be difficult to get them to understand the difference between an integration test and a unit test (they may not understand what a mock object is).  It's important to have a separate project that is just for integration tests and to tell the developers that that is where the db tests go.  Developers are more willing to write integration tests than unit tests.  Make writing integration tests easy and a team will tend to see payback from efforts in this area.
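The "well defined state" part is the piece people skip, so here is a minimal sketch of it.  This is not the Spring setup, just the shape of the idea in Python: the test talks to an actual SQL engine (an in-memory SQLite database stands in for the dedicated integration database, which is an assumption of the example), and every test rebuilds the schema and fixture rows before it runs.  The customer table and the query are made up.

```python
import sqlite3
import unittest


def find_active_customers(conn):
    """The database code under test (a hypothetical query)."""
    rows = conn.execute(
        "SELECT name FROM customer WHERE active = 1 ORDER BY name"
    )
    return [name for (name,) in rows]


class CustomerQueryIntegrationTest(unittest.TestCase):
    def setUp(self):
        # Fresh database per test, so every test starts from the same known rows.
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute("CREATE TABLE customer (name TEXT, active INTEGER)")
        self.conn.executemany(
            "INSERT INTO customer VALUES (?, ?)",
            [("bob", 1), ("alice", 1), ("carol", 0)],
        )

    def tearDown(self):
        self.conn.close()

    def test_only_active_customers_are_returned_sorted(self):
        self.assertEqual(find_active_customers(self.conn), ["alice", "bob"])


result = unittest.TestResult()
unittest.defaultTestLoader.loadTestsFromTestCase(
    CustomerQueryIntegrationTest
).run(result)
print(result.testsRun, result.wasSuccessful())
```

Note that the sql really executes here.  A typo in the WHERE clause or a wrong ORDER BY fails this test, which a mocked connection would never notice.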
Automated testing -
I have used Selenium and it works fine.  There are other tools here that work on pattern recognition, but those are very new -- I'm interested though.
<experience> = Keep these tests simple.  These types of tests tend to be very fragile as they test the ui, which tends to change a lot on most projects.  Still, simple functionality tends to break more than one would expect, so even simple tests here can be effective.  Keep the test suite here simple and of limited size.
Continuous Integration Servers -
A CIS gives one confidence that one is always ready to deploy.  Do a daily build every night, and have the CIS deploy that to a dev server and run smoke/automated tests.  Have your customers keep in touch with the latest code work by checking the dev server.  In addition to a daily build, build all products and run all integration tests continuously.
<experience> = I use Hudson. I'm used to it. I like it.  All the CIS pretty much do the same thing.  Management needs to buy into continuous integration, though, so that when the build fails it is everyone's first priority to fix the build.  I have yet to meet a manager who made the build a first priority.
Build Systems -
Run your unit tests whenever you build your product.
<experience> = I use Maven right now.  It does what I need it to do.
Static analysis -
Checkstyle, FindBugs, Simian, JLint
<experience> = The above are a few that I have used and integrated into Ant / Maven during builds.  I currently use Simian, which I find the most worthwhile.  I like FindBugs as well.  I have used Checkstyle but I don't integrate it into builds (I may use it in my own dev environment).  Everyone has their own style and I'm not going to impose my ideas on someone else.  I do like the statistics / metrics that Checkstyle offers (for instance the complexity measures) and I have added those to build systems, but my experience is that when one of the metrics goes high and is flagged in the build there is always strong resistance to changing the code in question.  It's fine... it works and there is no good reason... it can't be implemented any other way... I don't know how to...  There are always reasons, and it's not worth the trouble.  On the other hand, FindBugs is seen as helpful by developers and tells them about problems they were not aware of.  Simian is a hard call.  For some people it's a good reminder, for other people it just makes them better cheaters.
All of the above practices are valuable, but not always in the way that we read about.  We hear a lot about the practices above and how they'll help us build better products.  But there is a gulf between the rosy visions and possibilities that we might hope for and the real world that we have to develop in.  Still, good build and integration practices are one essential among the many practices that, when combined, help teams put out better code.
do your best, Marco


