Eppur Si Muove

 
 
In software development there is always the trade-off between getting X coded fast and getting X coded so that it has properties like -- it deals properly with possible inputs, it handles errors properly, it has tests, it follows standard code conventions, it's legible, it has normal usability characteristics...
Most of the time, though, people seem to be impressed by speed -- 'wow! you all got that done really fast.'  Last week Thursday my boss was explaining to the big boss that we had a new application request from San Jose.  We could understand about 80% of the initial spec, and that was probably enough to get started working on the code design ideas, maybe some database design, put some infrastructure in place... The big boss's reaction was 'Great! I'll come by next Wednesday and you can demo the application for me.'  And so demo preparations ensued with, naturally, a great deal of energy, and in the end we put together a little dog and pony show for the dude.
It's surprising to me that a senior executive at a well-known software technology company would have so little knowledge of what an application consists of (especially in Java) that he would think to himself 'Let's see: Friday, Monday, Tuesday -- three days should be enough to write an application.'
My own boss should know better.  He has been a developer, but I found out during the pre-demo coding efforts that he didn't have any conception of coding time himself.  Several times during the three coding days I got requests for additional features.  My response was 'Fine, that will take about an hour to code; here are my current feature priorities; you know how much time I have; where do you want to fit that in?'  'An hour? Can't you do that like right now, like, in 1 or 2 minutes?'  People who haven't worked with the code of an application seem to have very little ability to judge times even when they have coding experience... Still, I can't help but expect a bit more in this case.
And the results from the demo... the big boss thought that 'it' was really great.  He said a couple of times during the demo, 'That was built really fast.'  Evidently that speed was important.  And his advice to us was to 'continue to add stuff fast and get the application out there'.
There is a balance to be struck here in making applications.  There has to be long term thinking and short term thinking.  At any point in time one has to have a long term design - or maybe it's better to call it a long term philosophy for the application.  What will it do / not do?  How will it do that?  What's the application's mission?  What is best for the users?  That long term philosophy guides the overall focus of the work.  And then there are the short term needs like 'I need to get the login page working now.  Let's just get that working the simplest way possible and make it perfect later.'  As one develops an application one works back and forth between the long term focus and needs and the short term ones.  And as one moves forward, the long term philosophy changes, which changes the short term focus.  Each feeds back to the other in a long learning process.  Most importantly, that learning process takes time, and there is no way to short circuit that.
So looking back, at the end of the week, I have an 'application' written with no overall coherent philosophy of what we are building.  But 'it' was built really fast.  And the big boss says that is good.  And it has also been a horrible waste, part of which we will throw out and part of which we will rewrite.  So how good is that now?
 
This week there has been a lot of design work going on in the group (we are gearing up for development on the next project), and I want to recap my personal experiences with modeling this week.
I worked mostly on designing the WSDL for our project's SOAP interface this week.  I did what I call a tracer bullet design using Haskell.  Basically, I flesh out the data types and the interfaces in code, but there is no actual implementation code.  I just want to see if everything communicates intent well and is consistent as far as the compiler is concerned.  Haskell is nice for this work.  I can express a lot in a very small space, so I get a very good visual overview of the final system I'm shooting for in one or two short files.  It's like having a good map in front of me; I can take in all the important points in one view.  Writing Haskell is also very close to writing in a specification language, so there is the added advantage of being very precise while being very expressive at the same time.  And lastly, though I didn't expect it beforehand, I found that transforming my Haskell code into WSDL was very intuitive and straightforward.
Although it should be clear from the context, it's only useful to do this tracer bullet development with a compiled language.  Designing this way, one is using the language and the compiler together as a system design tool.  And the better one can express ideas via the language's type system, the more design help the compiler will be able to provide.
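To make this concrete, here is a minimal sketch of the kind of tracer bullet file I mean.  The domain (a little account service) and all the names in it are invented for illustration; the real work was our project's SOAP interface:

    -- Tracer bullet design: data types and type signatures only.
    -- Every body is 'undefined'; nothing runs, but the compiler
    -- checks that all the pieces fit together.
    module AccountService where

    data AccountId = AccountId String
    data Account   = Account { owner :: String, balance :: Integer }
    data Fault     = NotFound | Unauthorized

    -- Each top level signature reads like one operation of the interface.
    lookupAccount :: AccountId -> Either Fault Account
    lookupAccount = undefined

    transfer :: AccountId -> AccountId -> Integer -> Either Fault ()
    transfer = undefined

From a file like this, the step to WSDL felt natural: the data declarations map to schema types and the type signatures map to operations.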
One other idea in modeling that I have never heard expressed anywhere else... whenever one has a design, one should always compare it to an existing design as a stress test.  This is the same as the idea in statistics where one builds a model with data from a training set but evaluates the model against a separate set of data that is never consulted during the modeling process.  A lot of models may have good predictive power against their training set; not so many look as good against the test / evaluation set (and then the real world is another hurdle still).  With software, the idea is to review one's model against a design model put together by another engineer, or against a similar system designed somewhere else, or, if one is working on a new generation of software X, to check that one can still express the concepts of the old generation of software X in the new model.  A good model encompasses the constructs of (i.e., the accumulated knowledge of) other engineers and other attempts.
So to end this post, I will say to you what I felt today at work.  1.  UML pictures are long on pretty and short on constraints and trade-offs.  Long term, very little of those pictures will survive the translation to real code in a real system.  2.  People came before us and spent a lot of time thinking about the same business problems we are trying to model.  We may be smarter and have better thoughts, but how could we prove that?

Do your best -- Marco
 
It's been about a month since I mentioned scrum as a methodology, and I'd just like to do a quick recap on a few points from my last post.  The first note: my wife told me a few nights ago that her scrum team was not going to deploy on time; rather, they had decided that they were going to deploy a week later than planned, because the team was not confident that their code was ready for QA.  No big deal actually.  This kind of stuff happens all the time, which was one of the points of the last post.  Once one steps away from the perfection of the books, reality still has to be dealt with.
Her other comment to me was that her team no longer wanted to use the scrum software they had been using, Rally, and had decided to just use a whiteboard instead.  About a month ago Rally was the best thing since cornflakes.  I remember one night I got an unasked-for 20-minute introduction to it extolling all its features.  Again, there is theory and there is practice.
Companies provide tools for their workers to manage projects, but what kind of a tool is needed for a developer to do his work?  It's not enough to just give us a tool -- 'this is great, use it'.  In practice a tool has to provide value to the people that are using it, or it ends up like the ~$1000+ a year per person licence for Rally: unused.  Or, more likely, at the companies I have been a part of, one has to force people to use it and attach some kind of punishment for not doing so.  At my last job, reliably filling out forms once a week, to provide metrics to management, became an element in our year end review.  Part of the problem with this type of monitoring is that the people you want to stay with the company don't give a shit about year end reviews because (a) they are pulling far more than their weight and suspect that you need them amidst the mass of incompetence and (b) they can go some place else where people won't pester them if it's an issue.
I've spent some time thinking about what managerial tools would help me develop and I have done a bit of experimenting here and there.  I have a few thoughts about what is good and what is bad.  And I am going to start with the bad.
Tools that collect metrics (the popular term for data these days) are less than worthless.  The big objection to a statement like that is that management needs to instill accountability, and needs 'insight into the process' so that they can then 'improve the process' through repeated virtuous cycles.  There are multiple problems here.  First, usually proper metrics are not collectable for the type of work I do.  Several years ago I worked for a CMM 5 certified dev group, and I remember on my first day of work there was a big debate between the dev lead and the project manager about how the number of lines of code should be reported.  This is not a very good metric to start with, and the debate was over the question -- do we include comment lines in the count of lines of code?  After consultation with the facility manager the answer was -- yes, comments are counted in lines of code.  Good metrics are hard to come up with, and the bottom line is that good people in an organization know who among their direct peers is doing good work, but those same people are rarely able to quantify their opinions.
The above story brings out a second problem with metrics.  The people that provide metrics have a very strong self interest to make the metrics appear as good as possible.  For metrics that are going to be visible across an organization, both the individuals and the operating units that provide the metrics will generally do whatever they need to do to make those numbers be whatever upper management wants to see.  People are intimately concerned with their careers, and the more focused they are, the better cheaters they will become.
The last problem with metrics, which particularly peeves me as I have done a lot of statistical work, is that data is collected in a very uncontrolled manner and then analysed by people who lack the background to do the analysis -- who have never, for instance, heard of a Gaussian distribution.  At the CMM 5 company I mentioned above, the statistics were put together by a secretary, and then management reviewed the final numbers and added fudge factors to make the numbers come out more to their preference.  So try not to force tools on people just to obtain metrics.  Most likely the final result will just be an accumulation of the wrong kind of data, analysed incorrectly and then falsified to show what people want their superiors to see.
The last 'bad' for today is tools that add steps to work processes.  There are people in companies whose reason for living is to build processes, and unfortunately these same people are the ones who seem to drive the tool building efforts within an organization.  Using computers for that kind of activity just introduces two liabilities (i.e., code and process steps) into an organization.  My feeling here is that any tool that is built has to be created with the specific intent of removing some task that has to be done manually, making some current task unnecessary, or replacing someone's job or a class of jobs with a computer.  There is always a cost benefit trade-off for new tools and, just to keep everyone honest, I expect that when I perturb my system with a new tool, that perturbation takes me immediately to a more optimal equilibrium on the 'work smarter' curve...
And with that, I'll try to get to some ideas about what is 'good' at a later date.

Do your best - Marco
 
Scrum is very big in the company right now.  The teams that are able all seem to want to transform into scrum teams.  It's a badge showing one is forward thinking.  My wife herself moved to a scrum team two weeks ago, and when I asked her what her experience was so far, her response was 'scrum really really works'.  How should I take that, considering that, statistically speaking, she's not pulling from a large data set?  Well, I know she has been interested in doing scrum for the past year or so.  Is there any confirmation bias going on there?
I have nothing against scrum, but over the years I have worked on many projects, some as a developer, some as a lead, some using more agile methodologies, some using more formal methodologies.  There is no one right way to success, and that is what I'm feeling these days as scrum is promoted as the answer, as people focus on learning the one scrum way, the right way, and, at the worst, as people look for the step-by-step scrum rules that they can follow to the letter.
My own team has been doing scrum for the past year, and when I compare this team with all the teams and projects that I have seen over the past 11 years of development, scrum hasn't made this team more productive or more effective than the average.  Why is that?  On one hand it matters who is on the team.  Are they effective?  Do they have good judgement?  Do they get stuff done?  Do they get the right stuff done?  Do they have foresight or are they reactive?  Good people are described positively by the answers to those questions.  And good people contribute effectively regardless of the official methodology that management endorses.  A second factor is the culture of a team.  It's hard to graft a series of practices onto a team and have them result in anything different unless that team already has the potential to work with those practices.  To give a metaphor for this, the problem here is very much the same as a problem in developmental economics.  For some time economists have known that if one takes laws and systems from first world countries and transfers them to third world countries, the result is not a third world country that transforms into a first world country.  The result is no improvement.  Or, putting the same idea in a more immediate context, the US can try bringing democracy and freedom to Iraq, but Iraq still isn't going to be a beacon of freedom in the Middle East.  It will end up being Iraq as it's always been, and maybe a bit worse.  Good methods don't make effective teams; rather, effective teams have good methods, because there is an internal culture that promotes good methods intrinsically.
In all, just as it's better to have societies with good rules, it's better to have teams with good rules.  But the key is to have thoughtful people of like minds to guide themselves along, because there are never enough rules to cover the infinite, unexpected imperfections of the real world.  Good luck to those people who have found their scrum religion.  But for me, I just end up thinking of the words of Antoine de Saint-Exupéry -- 'Every religion claims it knows how to make men but none can say in advance what kind of men it will make'.  And so it goes for methodologies and teams as well.

Do your best - Marco
 
I'm always experimenting, looking for ways to be a better developer.  Here are a few of the practices that, as I look back over the last 10 years, I feel have made me a more effective developer.
1) Tracer bullet design
When I'm thinking through a design, I find that it's really helpful to skip the UML diagrams and work out the design directly with a shell of empty code.  I write the interfaces.  I put in some implementations, but they are just empty shells with no actual code realized yet.  This allows me to experiment with different naming conventions and see if I am communicating my intent through the language.  With code in place I can see how dependencies develop between modules.  Naturally, I can also compile my work as I go along.  Being able to compile helps me to catch any potential mistakes in my thinking; it holds me to a rigorous standard of correctness.  And lastly, once I am done and satisfied with the design, I have code.  I may even have a build system in place.  Having code is a tangible result, and that counts more to me than a set of pretty UML pictures.
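As a small illustration, here is the kind of shell I mean, in Haskell since that is what I have been using for design work; the names (Report, Store, FileStore) are made up for the example:

    -- A design shell: an interface plus an empty implementation.
    -- None of this runs; the point is to see whether the names and
    -- module dependencies communicate intent, and to let the
    -- compiler check the design as it evolves.
    module ReportStore where

    data Report = Report { title :: String, body :: String }

    -- The interface that other modules will depend on.
    class Store s where
      save :: s -> Report -> IO ()
      load :: s -> String -> IO (Maybe Report)

    -- An empty shell implementation; enough for dependent code to compile.
    data FileStore = FileStore FilePath

    instance Store FileStore where
      save = undefined
      load = undefined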
2) Unit testing
People think that unit testing is a practice that will decrease bugs in delivered code.  In my experience unit testing doesn't catch a lot of bugs.  Rather, the main benefit from unit testing is that it has changed how I write code.  I have been forced to develop code that's more abstract, more configurable, and less coupled, because that style of code is the only code that can be tested easily.  On the same idea, I find it very useful when learning a new language to do a lot of testing.  Testing provides quick feedback, which is nice for learning; it also forces one to learn the idioms that language provides for abstraction.  Lastly, I should also mention an important benefit from writing unit tests and having good code coverage: one can refactor with confidence when one's application has to undergo change.  Without that coverage there is no way to rework code with confidence on a large project -- one is stuck with the past, trying to monkey-patch as one goes along.
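A toy sketch of the style change, using HUnit (my choice of test package here, nothing more): isExpired takes the current time as a plain argument instead of reading the clock itself, and that decoupling is exactly what testability forces on you:

    import Test.HUnit

    -- Passing 'now' in as an argument, rather than calling the system
    -- clock inside the function, is what makes this trivially testable.
    isExpired :: Integer -> Integer -> Bool
    isExpired now deadline = now > deadline

    tests :: Test
    tests = TestList
      [ "past deadline"   ~: isExpired 100 50 ~?= True
      , "before deadline" ~: isExpired 10  50 ~?= False
      ]

    main :: IO ()
    main = do
      _ <- runTestTT tests
      return ()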
3) Design by Contract
It's really hard to write correct, robust code.  Thinking in terms of preconditions / postconditions / invariants, and adding asserts to code, has helped me write correctly working code the first time through.  I also find that thinking in terms of DBC, along with unit testing, changes one's thinking towards a more specification oriented frame of mind.  The thinking is more rigorous, and that influence works its way down to the code.
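A minimal sketch of what that looks like, with an invented withdraw function; in Haskell the assert from Control.Exception does the job (GHC strips asserts under -O unless -fno-ignore-asserts is given):

    import Control.Exception (assert)

    -- Precondition: 0 < amount <= balance.
    -- Postcondition: the resulting balance is non-negative.
    withdraw :: Integer -> Integer -> Integer
    withdraw balance amount =
      assert (amount > 0 && amount <= balance) $   -- precondition
        let balance' = balance - amount
        in  assert (balance' >= 0) balance'        -- postcondition

The asserts cost almost nothing to write, and they force the preconditions and postconditions to be stated explicitly instead of living only in my head.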
4) Dynamic Languages
For fast development, go with a dynamic language, an IDE that supports that language, and a debugger.  As an example, I worked on a supporting tool for our main project at work using Groovy in Eclipse.  The tool was of fair complexity, and it was finished in a week.  Comparing the development speed to comparable tools I have written in Java, I would say I was at LEAST twice as productive working in Groovy, and there were no negative effects in other areas.  That said, I'll just emphasize that the productivity seems not to be a function of a dynamic language alone.  I also develop for my own pleasure in Clojure / SLIME, and I don't see the productivity gains there.  I've done a fair amount of Perl in Emacs; good productivity, but no huge gains there.  Also, as a slightly non-topical reference point, I haven't seen much in the way of gains using Haskell in Emacs.  My explanation is that using a dynamic language along with the full support of an IDE sets the basis for remarkable productivity gains from day one of working with that language.
And you? What practices have been valuable to you as a developer?
 
I'm going to mention a principle which is part of the long term construction of software systems.  This principle centers on the following contradiction: fixing bugs doesn't make your software any better.  Well, that's a strange thing to say.  Of course every bug fix makes my software better.  That's one less bug hiding within the constantly changing innards of my system.

Well, maybe I should phrase it like this.  Every bug one fixes makes one's software better, but only when the bug is fixed in such a way that it provides the answer to the question, "What do I have to do to make sure this never happens again?"  This is the supervising question behind all of the child bug fix questions that I will list below.  But before I get to that list, I think I'll tell a few stories of the application of the supervising question.

Many years ago, I inherited the batch and reporting system of a growing financial institution.  Important but not the most glamorous work.  Think telephone calls at 1-2:00 in the morning to the effect of "xxx batch job failed, I don't see any error messages but the whole batch system is down and I can't get it restarted."  Those calls happened, and problems had to be solved that night for the company to be up and processing the next day.  And then there were nightly failures of the non-show-stopper jobs and reports.  Every day I would come to work and spend the whole morning tracking down the cause of failures from the list of broken jobs that could "wait till morning."

In this kind of situation it's a good thing to keep an issue notebook and write in it every problem that occurs.  What are the symptoms?  What are the error messages?  How did I solve the issue to get the processing done?  What did I try?  What do I think the root cause is?  A record is nice to have if one is tracking problems over time.  Sometimes problems are related, or they will reoccur in other contexts.  Your notebook will ultimately help you over time as you add the final resolution to each entry by answering the question -- What do I have to do to make sure this never happens again?  I can pretty much guarantee that if you have a system which averages 4-5 failures per day, you will have a system which averages 1 failure per month after a year.  That's not perfect, but it's about a two orders of magnitude improvement, and that's not bad either.

Once you are at that point you can get past just thinking about system failures, because system failures are just brutish things that hit one over the head and say "fix me."  "What do I have to do to make sure this never happens again?" demands a different attitude to constructing systems.  Here is another example.  Shortly before I left the above-mentioned financial organization, we had a batch emergency.  It seems the batch operator at 4:00 in the morning killed the invoicing process.  He couldn't see any activity in the logs, and as far as he could tell the process was hung.  He called the admin on call, and Frank said "kill it."  A bad call?  Well, it meant that when the system came up the next morning we couldn't tell our partners how many millions of dollars they had to send us that day.  It also meant that we had no idea what financial data had been processed and what data hadn't.  Was the db in an inconsistent state?  The problems turned out to be tractable; the program hadn't hung, it was just a heavy processing day, and after tracing each thread through the logs we figured out where we stood.
That was a bad morning for Frank, I think.  I don't know what transpired in meetings that morning, but I remember Frank coming up to me in the afternoon, heavily affected, and saying "I'm really sorry about screwing up batch."  "You know, Frank, it's not your fault.  The fault is in the design.  This program ran in such a way that its state of execution was indistinguishable from a program that was hung.  Not only that, but batch is so complicated that no one knows what jobs do what, which jobs can be killed safely, and which jobs can't.  The only people that can figure it out are the engineers who have to work with it all the time, and not even we can make decisions with confidence.  The management of the software group has known all this and has decided to allocate resources elsewhere.  This is a failure of the organization; you did the best you could have done, and that's fine."  My point here is that when people fail while operating complex systems, it's the system's fault.  It's the designer's fault.  And they have to ask the question.  You know the question?

I said before that "What do I have to do to make sure this never happens again?" is the supervising question of system improvement.  But sometimes it helps to have a bit more guidance that will lead one to the supervising question.  So here are some derivative questions that I have listed in the little notebook that I usually carry with me.  Here we go --

Can this type of problem be automated away?
What could have been done to catch this error right away?
Does this error belong to a class of errors?  What is that class, and how can I uncover them?  Remove them?  Prevent them?
How could this bug have been caught automatically?
How did this bug get into the system, and how could we eliminate that entry route?
What is the root cause of this error?
Could I write a test to prevent this bug, or its variants, from being reintroduced?
Could I introduce a new type that would block errors like this from being introduced?
Is this caused by a design weakness?  What is a design that would be fail-safe?
Can I eliminate this error by eliminating human interaction with the program?

Lastly, learn from other people's failures.  Amazon S3 went down worldwide for eight hours a week ago.  It seems the ultimate cause was minor corruption in messages sent from one part of the system, which eventually cascaded throughout the whole system.  I thought of this this week when I put together a tar file for deployment later this month and added an MD5 checksum file to the distribution.  That tar file has to be sent from Shanghai to Austin on deployment day, and the software is mission critical, as they say, and, yeah, we should check just to be sure once it reaches its destination.  What's the chance of minor corruption?  But one should get into good habits; one day good habits will be good to have.  So one more question should be added to the list here --

How would I prevent something like that happening to me?

Well, that's a lot of questions.  One can't always do everything perfectly, and I guess fixing everything perfectly is a tall order.  But that's the basics as I see them.  It's only one question -- what do I have to do to make sure that never happens again?

Do your best.