Blog Archives

Build and integration systems

9/17/2010

I gave my boss a long talk on build and integration systems today. He wanted to know what we have found to be best practices and what difficulties we run into with our build and integration systems. He asked for the info so that he can give his boss a write-up about how good our build systems are.
Yet I didn't really feel that my boss got any of the technical ideas I was trying to get across. Since they are important and I want someone to understand, here are some (incomplete) experiences with maintaining build and integration systems:
Developer environments -
It's very important that people can develop in self-contained environments. At the very least this means they have a local database that they can modify as they add functionality, or that can hold the special data sets they need. A local db allows developers to do their work without impacting everyone on the team and without the rest of the team impacting them. When a developer is ready to publish his work, the common dev database can be updated and he can commit his code as well.
<experience> = In my experience a lot of developers have no interest in maintaining their own local database, or in using the command line tools that most dbs come with. They want to shovel code out the door, and shoveling code is just so much easier when someone else manages their database. They will try to do their work in, and even make changes to, common use databases.
It's also important that external services other than the database can be mocked out, and these days there are more and more of those external services. Being able to selectively detach/attach external dependencies allows developers to continue working if some service goes down, to set up situations that are hard to duplicate, and to work offline or on the road, i.e. to be more productive.
<experience> = Having configurable dependencies is extra work and will probably involve the build system. The build system is not seen as interesting. Extra work does not increase the rate of shoveling. Developers will avoid these types of tasks and hope no one notices.
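Here is a minimal sketch of what a detachable dependency can look like. It is in Python only for brevity (the same shape works in Java), and every name in it (the rate service, the RATE_SERVICE setting, the canned rates) is made up for illustration, not taken from any real system.

```python
import os
from typing import Dict, Mapping, Protocol


class RateService(Protocol):
    """Hypothetical external dependency: looks up a currency exchange rate."""

    def rate(self, currency: str) -> float: ...


class RemoteRateService:
    """Stand-in for the real client; a real one would make a network call."""

    def rate(self, currency: str) -> float:
        raise ConnectionError("remote rate service is not reachable from here")


class FakeRateService:
    """Local fake with canned data, so work can continue offline."""

    def __init__(self, rates: Dict[str, float]) -> None:
        self._rates = rates

    def rate(self, currency: str) -> float:
        return self._rates[currency]


def make_rate_service(env: Mapping[str, str] = os.environ) -> RateService:
    """Choose the implementation from configuration.

    RATE_SERVICE is an assumed setting name; it defaults to the fake.
    """
    if env.get("RATE_SERVICE", "fake") == "remote":
        return RemoteRateService()
    return FakeRateService({"EUR": 0.5, "USD": 1.0})


def price_in(currency: str, usd_amount: float, service: RateService) -> float:
    """Application code only sees the interface, never the concrete class."""
    return usd_amount * service.rate(currency)
```

The point of the factory is that switching between the real service and the fake is a configuration change, not a code change, which is also why this kind of work tends to end up touching the build system.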
Database -
In general one wants to have a structured method of updating databases, and the most practical method I have seen is migrations. This works for relational dbs and it also works for no-sql databases. There is a lot of talk about how no-sql databases make migrations unnecessary. I have not found that to be true in my own work with no-sql dbs (CouchDB). When I develop I want my data well defined. I feel it leads to more robust code.
<experience> = Developers still don't want anything to do with sql and will generally try to find someone else to write scripts for them. Most importantly, though, developers will go along with a migration system and consider it a good idea.
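To make the migration idea concrete, here is a rough sketch of a tiny migration runner. It uses Python's built-in sqlite3 module so it runs anywhere. The schema_version table, the users table and the two migrations are assumptions for the example, not how any particular migration tool does it.

```python
import sqlite3

# Hypothetical migrations: (version, SQL) pairs, applied strictly in order.
MIGRATIONS = [
    (1, "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"),
    (2, "ALTER TABLE users ADD COLUMN email TEXT"),
]


def current_version(conn):
    """Return the highest migration recorded in the database, or 0 if none."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_version (version INTEGER NOT NULL)"
    )
    row = conn.execute("SELECT MAX(version) FROM schema_version").fetchone()
    return row[0] or 0


def migrate(conn):
    """Apply every migration newer than the recorded version.

    Returns the list of versions that were applied on this run.
    """
    applied = []
    version = current_version(conn)
    for number, sql in MIGRATIONS:
        if number > version:
            conn.execute(sql)
            conn.execute(
                "INSERT INTO schema_version (version) VALUES (?)", (number,)
            )
            conn.commit()
            applied.append(number)
    return applied
```

Running migrate twice against the same database applies nothing the second time. That is the property that lets a developer's local db, the common dev db and production all be brought to the same state by the same scripts.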
Unit Testing -
The real story is that testing allows one to embed knowledge in the code, makes it easier for everyone to change code when requirements change, encourages developers to think about code chunks in terms of their specification, and encourages better design. Unit tests don't 'catch bugs'. Unit tests do not necessarily lead to 'fewer bugs' and 'more reliable code'.
<experience> = These days all managers say their application has a suite of unit tests and their developers are writing unit tests. This is mostly bullshit posturing. Most developers still don't write tests, and a lot of developers will throw in a few tests for their managers during the qa, post qa / pre deploy stage. Where I work, at a worldwide, top ten traffic volume web site, the unit tests have devolved into essentially assertTrue(true). People are under pressure to get stuff out, and it takes only a few people who don't want to maintain other people's test cases for it all to fall apart. But we have tons of test cases - and they always pass.
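The difference between a test that always passes and a test that embeds knowledge is easy to show. This is an illustrative sketch using Python's unittest (JUnit looks much the same); the shipping rule and its numbers are invented for the example.

```python
import unittest


def shipping_cost(order_total):
    """Hypothetical rule: orders of 50.00 or more ship free, otherwise 4.99."""
    return 0.0 if order_total >= 50.0 else 4.99


class VacuousTest(unittest.TestCase):
    def test_shipping(self):
        shipping_cost(10.0)    # runs the code but checks nothing
        self.assertTrue(True)  # passes no matter what the code does


class ShippingSpecTest(unittest.TestCase):
    """Each test name and assertion records one piece of the specification."""

    def test_orders_below_threshold_pay_flat_rate(self):
        self.assertEqual(shipping_cost(49.99), 4.99)

    def test_threshold_itself_ships_free(self):
        self.assertEqual(shipping_cost(50.0), 0.0)
```

Both classes show up as green in a build report, and both can be run with python -m unittest. Only the second one tells the next developer what the threshold is, and only the second one fails when somebody changes it.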
Integration Testing -
One wants to test database code against a real database. Mocking doesn't cut it here, and small changes to sql can often have unexpected consequences.
<experience> = It's a pain in the ass to use tools like DbUnit. Spring is the way to go here. One wants to have a special database for the integration tests and always have it in a well defined state. Developers will write integration tests, but it may be difficult to get them to understand the difference between an integration test and a unit test (they may not understand what a mock object is). It's important to have a separate project that is just for integration tests and tell the developers that that is where the db tests go. Developers are more willing to write integration tests than unit tests. Make writing integration tests easy and a team will tend to see payback from efforts in this area.
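As a rough illustration of "a real database in a well defined state", here is a sketch that rebuilds an in-memory SQLite database before every test. It is a stand-in for the idea, not the Spring setup mentioned above, and the orders table, fixture rows and query are assumptions made for the example.

```python
import sqlite3
import unittest

# Hypothetical schema and fixture data that every test starts from.
SCHEMA = "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)"
FIXTURE = [(1, "ann", 20.0), (2, "ann", 30.0), (3, "bob", 5.0)]


def total_for(conn, customer):
    """Code under test: the SQL itself is what the test exercises."""
    row = conn.execute(
        "SELECT COALESCE(SUM(total), 0) FROM orders WHERE customer = ?",
        (customer,),
    ).fetchone()
    return row[0]


class OrdersIntegrationTest(unittest.TestCase):
    def setUp(self):
        # Fresh database in a known state before every single test.
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute(SCHEMA)
        self.conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", FIXTURE)

    def tearDown(self):
        self.conn.close()

    def test_sums_only_the_requested_customer(self):
        self.assertEqual(total_for(self.conn, "ann"), 50.0)

    def test_unknown_customer_totals_zero(self):
        self.assertEqual(total_for(self.conn, "zed"), 0)
```

Nothing is mocked here: the query really runs, so a small change to the sql (dropping the WHERE clause, say) breaks the first test right away.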
Automated testing -
I have used Selenium and it works fine. There are other tools here that work on pattern recognition, but those are very new -- I'm interested though.
<experience> = Keep these tests simple. These types of tests tend to be very fragile as they test the ui, which tends to change a lot on most projects. Still, simple functionality tends to break more than one would expect, so even simple tests here can be effective. Aim for a simple test suite of limited size here.
Continuous Integration Servers -
CIS gives one confidence that one is always ready to deploy. Do a daily build every night, have the CIS deploy it to a dev server and run smoke/automated tests. Have your customers keep in touch with the latest code work by checking the dev server. In addition to the daily build, build all products and run all integration tests continuously.
<experience> = I use Hudson. I'm used to it. I like it. All the CIS pretty much do the same thing. Management needs to buy into continuous integration, though, so that when the build fails it is everyone's first priority to fix the build. I have yet to meet a manager who made the build a first priority.
Build Systems -
Run your unit tests whenever you build your product.
<experience> = I use Maven right now. It does what I need it to do.
Static analysis -
Checkstyle, FindBugs, Simian, JLint
<experience> = The above are a few that I have used and integrated into Ant / Maven builds. I currently use Simian, which I find the most worthwhile. I like FindBugs as well. I have used Checkstyle but I don't integrate it into builds (I may use it in my own dev environment). Everyone has their own style and I'm not going to impose my ideas on someone else. I do like the statistics / metrics that Checkstyle offers (for instance the complexity measures) and I have added those to build systems, but my experience is that when one of the metrics goes high and is flagged in the build there is always strong resistance to changing the code in question. It's fine...it works and there is no good reason...it can't be implemented any other way...I don't know how to... There are always reasons, and it's not worth the trouble. On the other hand, FindBugs is seen as helpful by developers and tells them about problems they were not aware of. Simian is a hard call. For some people it's a good reminder; for other people it just makes them better cheaters.
All of the above practices are valuable, but not always in the way that we read about. We hear a lot about these practices and how they'll help us build better products, but there is a gulf between the rosy visions we might hope for and the real world that we have to develop in. Still, good build and integration practices are an essential part of the many practices that, when combined, help teams put out better code.
do your best, Marco


Programming productivity

9/5/2010


I have run into a few articles on developer productivity this week. The articles have got me thinking enough to add a few comments of my own.
First, one of the articles that I ran into was an 'ask' dialogue on Hacker News. There were a lot of ideas proposed, but what I found particularly interesting was that as the dialogue progressed it became mostly about debugging. For some people debuggers were useless; all they needed was 'println'. Or maybe logs were enough. Or for some people once they had their test suite in place that was enough; if anything went wrong they could just look at their test suite and that took them to the root cause. And then of course some people found debuggers pretty valuable. Some statements that I thought were pretty valid:
Debuggers are useful for figuring out unfamiliar code when just reading the code doesn't give one any insight.
Debuggers are also useful when something has gone wrong and all the tests look fine and all the code looks fine and one is looking for clues for where to start.
But overall, given the topic of programmer productivity and the fact that the dialogue converged almost exclusively on debugging, my take was that however one likes to do it, program introspection is a major enabling technology. We either automate the introspection (log, println) or we do it manually (debugger). Code tends to behave differently from what our best efforts intended, and we don't end up very productive if we can't introspect our running code. Use all the tools you have at your disposal and use them well.
The second point about productivity I want to make is that there is no single productivity secret. In the 1980s the Japanese car producers were making major inroads in the US market and people would ask how they could make higher quality cars for a lower cost than US manufacturers. What's their secret? The secret was a multitude of small changes to how they produced cars that in the end added up to a big difference. And I shouldn't say added up. To put it plainly, multiple small improvements in productivity compound multiplicatively, and they start to make a noticeable difference a lot faster than one would believe based on our natural intuitions. If one wants to be more productive, expect to improve in a lot of small ways and then have those new habits compound at some point into a visible difference.
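A quick bit of arithmetic shows the gap between adding and compounding. The 3% gain and the count of 30 improvements are made-up numbers picked only to illustrate the effect, not measurements of anything.

```python
def added(gain, count):
    """What intuition suggests: each improvement adds its gain to the baseline."""
    return 1 + gain * count


def compounded(gain, count):
    """What actually happens: each improvement multiplies what came before."""
    return (1 + gain) ** count


if __name__ == "__main__":
    # Illustrative assumption: thirty separate improvements of 3% each.
    print(f"added:      {added(0.03, 30):.2f}x")
    print(f"compounded: {compounded(0.03, 30):.2f}x")
```

With these assumed numbers the additive guess comes out around 1.9x, while the compounded result is about 2.4x, and the gap keeps widening as more small improvements pile up.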
To be more productive one also has to become more productive where we, as developers, spend the most time. Most of our time is spent trying to figure out how to implement a feature, trying to figure out what existing code does, and trying to figure out why code doesn't behave as expected. We spend very little of our time actually writing code; estimates are on the order of 10%. So one isn't going to be more productive by coding faster. One should concentrate one's efforts where the fat is: thinking and understanding. So write code that has clear architectural intent (macro level) and is understandable at the micro level as well. Write code that is as simple as possible. Write code with clear contracts that fails fast. And solve every problem once (see the previous post on writing reliable code).
The final point is that productivity is ultimately limited by our primary constrained resource, time. One can make more time through automation, if there is enough payback from that. And to the extent possible, one can also use time well by identifying tasks that are well thought out and have the most customer value. Don't just do work; do the 'right work'.
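As a small sketch of what "clear contracts that fail fast" can mean in practice: the function below states its contract in one line and rejects bad input at the door, instead of returning a strange number that someone has to trace back later. The discount function and its limits are hypothetical, chosen only for the example.

```python
def apply_discount(price, percent):
    """Return price reduced by percent.

    Contract: price >= 0 and 0 <= percent <= 100.
    """
    # Fail fast: reject contract violations here, with a message that says why.
    if price < 0:
        raise ValueError(f"price must be non-negative, got {price!r}")
    if not 0 <= percent <= 100:
        raise ValueError(f"percent must be between 0 and 100, got {percent!r}")
    return price * (100 - percent) / 100
```

A caller that passes a percent of 250 gets a ValueError naming the bad value right away, which is exactly the time spent on "why doesn't this code behave as expected" that we are trying to cut down.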

do your best, Marco

