How Did Technology End Up on the Sunday Morning Talk Shows?

It has been two months since the Healthcare.gov launch and by now nearly every American has heard or witnessed the poor performance of the websites. Early on, only one of every five users was able to actually sign in to Healthcare.gov, while poor performance and unavailable systems continue to plague the federal and some state exchanges. Performance was still problematic several weeks into the launch and even as of Friday, November 30, the site was down for 11 hours for maintenance. As of today, December 1, the promised ‘relaunch day’, it appears the site is ‘markedly improved’ but there are plenty more issues to fix.

What a sad state of affairs for IT. So, what do the Healthcare.gov website issues teach us about large project management and execution? Or further, about quality engineering and defect removal?

Soon after the launch, former federal CTO Aneesh Chopra, in an Aspen Institute interview with The New York Times’ Thomas Friedman, shrugged off the website problems, saying that “glitches happen.” Chopra compared the Healthcare.gov downtime to the frequent appearances of Twitter’s “fail whale” as heavy traffic overwhelmed that site during the 2010 soccer World Cup.

But given that the size of the signup audience was well known and that website technology is mature and well understood, how could the government create such an IT mess? Especially given how much lead time the government had (more than three years) and how much it spent on building the site (estimated between $300 million and $500 million).

Perhaps this is not quite so unusual. Industry research suggests that large IT projects are at far greater risk of failure than smaller efforts. A 2012 McKinsey study revealed that 17% of IT projects budgeted at $15 million or higher go so badly as to threaten the company’s existence, and more than 40% of them fail. As bad as the U.S. healthcare website debut is, there are dozens of examples, both government-run and private, of similar debacles.

In a landmark 1995 study, the Standish Group established that only about 17% of IT projects could be considered “fully successful,” another 52% were “challenged” (they didn’t meet budget, quality or time goals) and 30% were “impaired or failed.” In a recent update of that study conducted for ComputerWorld, Standish examined 3,555 IT projects between 2003 and 2012 that had labor costs of at least $10 million and found that only 6.4% of them were successful.

Combining the inherent problems associated with very large IT projects with outdated government practices greatly increases the risk factors. Enterprises of all types can track large IT project failures to several key reasons:

  • Poor or ambiguous sponsorship
  • Confusing or changing requirements
  • Inadequate skills or resources
  • Poor design or inappropriate use of new technology

Unfortunately, strong sponsorship and solid requirements are difficult to come by in a political environment (read: Obamacare), where too many individual and group stakeholders have reason to argue with one another and change the project. Applying the political process of lengthy debates, consensus-building and multiple agendas to defining project requirements is a recipe for disaster.

Furthermore, based on my experience, I suspect the contractors doing the government work encouraged changes, as they saw an opportunity to grow the scope of the project with much higher-margin work (change orders are always much more profitable than the original bid). Inadequate sponsorship and weak requirements were undoubtedly combined with a waterfall development methodology and overall big bang approach usually specified by government procurement methods. In fact, early testimony by the contractors ‘cited a lack of testing on the full system and last-minute changes by the federal agency’.

Why didn’t the project use an iterative delivery approach to hone requirements and interfaces early? Why not start with healthcare site pilots and betas months or even years before the October 1 launch date? The project was underway for three years, yet nothing was made available until October 1. And why did the effort leverage only an already occupied pool of virtualized servers that had little spare capacity for a major new site? For less than 10% of the project costs, a massive dedicated farm could have been built. Further, there was no backup site, nor were any monitoring tools implemented. And where was the horizontal scaling design within the application to enable easy addition of capacity for unexpected demand? It is disappointing to see such basic misses in non-functional requirements and design in a major program for a system that is not that difficult or unique.

These basic deliverables and approaches appear to have been missed entirely in the implementation of the website. Further, the website code appears to have been quite sloppy, not even using common caching techniques to improve performance. Thus, in addition to suffering from weak sponsorship and ambiguous requirements, this program failed to leverage well-known best practices for the technology and design.
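As a rough illustration of the kind of common caching technique meant here, the sketch below memoizes an expensive lookup so that repeated requests for the same data skip the slow path. It is not based on the actual Healthcare.gov code; the function names, the `CALLS` counter and the simulated delay are all hypothetical.

```python
import time
from functools import lru_cache

# Counts how often the slow path actually runs (for illustration only).
CALLS = {"backend": 0}


def load_plan_summary_uncached(state: str) -> str:
    """Hypothetical stand-in for an expensive database or service lookup."""
    CALLS["backend"] += 1
    time.sleep(0.01)  # simulate a slow backend call
    return f"Plan summary for {state}"


@lru_cache(maxsize=1024)
def load_plan_summary(state: str) -> str:
    """Cached wrapper: repeat requests for the same state skip the backend."""
    return load_plan_summary_uncached(state)


if __name__ == "__main__":
    for _ in range(1000):
        load_plan_summary("VA")
    # The backend is hit once; the other 999 requests are served from memory.
    print("backend calls:", CALLS["backend"])
```

An in-process cache like this only suits data that changes rarely and is safe to share across users; a real site would also need an expiry and invalidation policy.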

One would have thought that, given the scale of and expenditure on the program, top technical resources would have been allocated to ensure these practices were used. The feds are scrambling with a “surge” of tech resources for the site. And while the new resources and leadership have made improvements so far, the surge will bring its own problems. It is very difficult to add resources effectively to an already large program. New ideas introduced by the “surge” resources may be neither accepted nor easily integrated. And if the issues are deeply embedded in the system, it will be difficult for the new team to fully fix the defects. For every 100 defects identified in the first few weeks, my experience with quality suggests there are two to three times as many still buried in the system. Furthermore, one has to wonder: if the project couldn’t handle the “easy” technical work (sound website design and horizontal scalability), how will it handle the more difficult challenges of data quality and security?

These issues will become more apparent in the coming months when the complex integration with backend systems from other agencies and insurance companies becomes stressed. And already the fraudsters are jumping into the fray.

So, what should be done and what are the takeaways for an IT leader? Clear sponsorship and proper governance are table stakes for any big IT project, but in this case more radical changes are in order. Why have all 36 states and the federal government roll out their healthcare exchanges in one waterfall or big bang approach? The sites that are working reasonably well (such as the District of Columbia’s) were developed independently. Divide the work up where possible, and move to an iterative or spiral methodology. Deliver early and often.

Perhaps even use competitive tension by having two contractors compete against each other for each such cycle. Pick the one that worked the best and then start over on the next cycle. But make them sprints, not marathons. Three- or six-month cycles should do it. The team that meets the requirements, on time, will have an opportunity to bid on the next cycle. Any contractor that doesn’t clear the bar gets barred from the next round. Now there’s no payoff for a contractor encouraging endless changes. And you have broken up the work into more doable components that can then be improved in the next implementation.

Finally, use only proven technologies. And why not ask the CIOs or chief technology architects of a few large-scale Web companies to spend a few days reviewing the program and designs at appropriate points? It’s the kind of industry-government partnership we would all like to see.

If you want to learn more about how to manage (and not to manage) large IT programs, I recommend “Software Runaways,” by Robert L. Glass, which documents some spectacular failures. Reading the book is like watching a traffic accident unfold: It’s awful but you can’t tear yourself away. Also, I expand on the root causes of and remedies for IT project failures in my post on project management best practices.

And how about some projects that went well? Here is a great link to the 10 best government IT projects in 2012!

What project management best practices would you add? Please weigh in with a comment below.

Best, Jim Ditmore

This post was first published in late October in InformationWeek and has been updated for this site.

IT Project Delivery – Dismal Government Projects Track Record

In the past several weeks, I have posted on project management best practices, and we talked about the track record of the IT industry as being, at best, a mixed bag. It turns out that for the UK government, that would be putting a very positive spin on its IT projects. The Times recently ran an article* detailing the eight worst areas in the government, which have cost the taxpayers dearly. IT projects accounted for three blatant failures of the eight and had a major hand in two of the remaining five. How did it come to this? Why is the practice of IT responsible for more than half of the UK government’s wasteful initiatives?

And these are not minor blowups. The Fire and Rescue Plan, which started out as a 120 million pound effort to consolidate 46 control rooms into 9 regional centres, was finally axed after it cost 469 million pounds! This was a straightforward project (how many of us have consolidated call centres, trading floors or command centres in the past 10 years? I would venture 50% of your firms have done this), yet it cost nearly 4 times the estimate and never even delivered! And that is not the worst one: the NHS records project has cost 6.4 billion pounds to date, with at least 2.7 billion of that wasted. While there are 60 million citizens in the UK, many of us in industry have customer bases in the tens of millions where we keep critical financial data safe and accessible for our customers. So while I would agree the health records project breaks new ground in some areas, it is not the Manhattan Project. This is a very doable project that, given the monies spent, should already have delivered significant benefit and capability. And yet very little has been delivered, and there is low confidence this will change in the near future for the program. Overall, how can IT projects have 3 of the 8 slots of government failure areas when the defense industry has only 1 (and you could argue IT projects contributed heavily to that one)?
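The scale of these overruns is easy to check with a little arithmetic. The short sketch below uses only the figures quoted above; the helper names are illustrative.

```python
def overrun_multiple(estimate: float, actual: float) -> float:
    """How many times the original estimate was actually spent."""
    return actual / estimate


def wasted_share(total_cost: float, wasted: float) -> float:
    """Fraction of the total spend that was wasted."""
    return wasted / total_cost


# Fire and Rescue Plan: 120 million pound estimate, 469 million pounds spent.
fire_multiple = overrun_multiple(120, 469)
print(f"Fire and Rescue overrun: {fire_multiple:.1f}x")  # prints 3.9x

# NHS records project: 6.4 billion pounds spent, at least 2.7 billion wasted.
nhs_share = wasted_share(6.4, 2.7)
print(f"NHS records wasted share: at least {nhs_share:.0%}")  # prints 42%
```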

I think this dismal track record in government for IT projects is due to some common issues and a few unique ones. First, there is typically poor and ambiguous sponsorship, compounded by very weak and changing requirements. Too many parties and groups in government have a stake and a reason to argue over and change the project. Applying the political process of lengthy debate, consensus and influence to defining and running a project is a recipe for disaster. And I suspect the contractors doing the work likely encouraged changes and debate, as this was an opportunity to grow scope with much higher-margin work (change orders are always much more profitable than the original bid). Second, the approach undoubtedly used a waterfall method. Given the size, scope and vast array of stakeholders, each step (e.g. requirements definition) took an extremely long time. An elongated schedule with a cumbersome and bloated program structure to match the stakeholder complexity would certainly have multiplied the costs. Still, it takes even more to cause such spectacular blowups.

There is an excellent book, Software Runaways, that documents such ‘death march’ projects. ‘Death march’ projects are the kind of massive program that everyone knows is doomed to failure, and yet everyone is still lashed to the ship on this voyage to failure. It is a fascinating read and in some ways like watching a traffic accident unfold: it’s pretty awful but you can’t tear yourself away. Of course, the UK government and its contractors do not have a monopoly on such spectacular program failures (though they certainly seem to be doing their best to enable additional chapters to be written). What is relevant here, though, is that the book does an excellent job of reviewing about a dozen of the more interesting past IT program failures and identifying the root causes. These root causes include the poor sponsorship and ill-defined requirements we discussed above. But the book also describes the mentality that sets into a large program team on its ‘death march’. In essence, even though the members of the team know there are massive flaws in the program, the complexities, different agendas and influences of a large program team often leave them unable to repair those flaws from within. Even worse, when an external party identifies the flaws, the program team bands together to defend against such external attacks at all costs. Their identity has become so caught up in the program that they would create a Potemkin village to demonstrate that there are no flaws.

So, it is the instinct of these large program organizations to assume a life of their own, with all the members now vested in the program’s survival (not its delivery but its survival), that, when combined with major flaws (such as ill-defined requirements or the wrong methodology), creates the spectacular failures. Or, put another way, it is how humans work together within large programs that, if based on poor practices, can multiply the negative results to such an irrational level.

With that in mind, what are some approaches to prevent this from occurring? The basics, such as ensuring there is clear sponsorship and proper steering committees, should help, but more radical changes to the approach would be better. Let’s take the command centre consolidation. Why do all 46 into 9 in one waterfall or ‘big bang’ approach? Instead, take two regions of the 9 and do two pilots, each with its own sponsors, steering committee and contractor. Set an overall schedule for them to deliver to a well-defined but high-level set of requirements or outcomes. The team that completes its work on time and meets the requirements will have an opportunity to bid on the next two regions to be consolidated. A team that does not meet the bar will see its contractor barred from the next round of work, and a negative performance mark will go on the sponsors and government leads. Now the payoff for a contractor encouraging endless changes by government is gone. Further, you are breaking up the work into more doable components that can then be improved in the next regional implementation. Smaller problem sets are eminently more doable than massive ones. By changing the approach to more incremental work with short cycles, and aligning program structure and incentives to getting real results, I think you would find a dramatic difference in the delivery of the project.

While these changes would certainly improve project delivery, I am sure there are several other elements that have caused impact and problems. What would you change? How do we get IT projects to not be a huge portion of wasted taxpayer funds?

I look forward to your comments.

Best, Jim

* The Times, which is a very good newspaper, unfortunately does not provide access to its articles via the internet without a subscription. If you have a subscription, the article title is ‘Scandal of the big spenders who have cost taxpayers dear’ published on January 9, 2012.

Ensuring Project Success: Best practices in Project Delivery

Project delivery is one of the critical services that IT provides. Unfortunately, in the industry, the track record of project delivery for IT is at best a mixed bag. A number of different studies over the past five years put the industry success rate below 50%. A good reference in fact is the Dr. Dobb’s site, where a 2010 survey found project success rates to be:

  • Ad-hoc projects: 49% are successful, 37% are challenged, and 14% are failures.
  • Iterative projects: 61% are successful, 28% are challenged, and 11% are failures.
  • Agile projects: 60% are successful, 28% are challenged, and 12% are failures.
  • Traditional projects: 47% are successful, 36% are challenged, and 17% are failures.

So, obviously not a stellar track record in the industry. And while you may feel that you are doing a good job of project delivery, typically this means the most visible projects are doing okay or even well, but there are issues elsewhere. In this post and in the next few, I will provide a quick primer so you can check that you are leveraging the key project delivery best practices that enable a more successful track record.

Key project delivery practice areas: These areas are project initiation, project communications and reporting, project management, release management, and program management. There are also a few critical techniques to overcome obstacles, including how to get the right resource mix, doing the tough stuff first, and when to use a waterfall approach versus incremental or agile. I will cover the first two areas today and the rest over the next few weeks.

Project Initiation – I have rarely been part of an organization (and I suspect the same is true for you) where the demand for IT to deliver projects is less than the supply of IT resources to execute those projects. In fact, this over-demand is often chronic, and our IT response sometimes exacerbates the situation. Because of our desire to meet the business’s wishes and to show progress, we make the mistake of initiating projects before they or our teams are ready to start. It is critical to ensure that you have effective sponsorship, a good bead on requirements and adequate resources before you initiate the project. There is no issue with doing concept or project exploration, initial requirements and designs, or high-level planning and estimation, but these must all be separated from the formal project effort by a strong entry gate that ensures you have the sponsorship, understanding of the deliverable, and adequate resources. Most projects that fail (and remember, this is probably half of the projects you do) started to fail at the beginning. They were started without clear business ownership or decision-makers, or with a very broad or ambiguous description of what is to be delivered, or when you were already over-committed and without senior business analysts or technology designers. This is just a waste of money, resource time, and business opportunity.

Address this by setting a robust entry gate and being disciplined about when to start a project. If the business sponsors aren’t there, this is a discussion for you to have with that business leader. If the project is defined too broadly or is ambiguous, don’t fall into that trap. Take the two or three most well-defined needs and split off, or chunk, the project so that it starts with just that piece. The rest will fall into place as the work is done and everyone understands the situation better. You and the business may decide not to progress further. If you decide to proceed, then start up the next chunk as a separate project. And, as much as you want to get more done, don’t start the project before you have resources. As you operate your project ‘factory’ at extremely high utilization levels, each additional piece of work you add makes you less efficient and more costly. You must use higher-priced contractors rather than internal resources. And your key experts who must do the work now have to juggle one more thing and will be less effective. It would be far better for your 500-resource team to do, say, 80 projects effectively rather than 110 projects ineffectively. It is like a server that does not have enough memory for the applications active on the system: more and more time is spent on overhead, paging things in and out, rather than on real work. The end result is less work done in total even though more projects are active. This is a common trap that is overcome by knowing your effectiveness range and then simply prioritizing with the business. If something critical has come up, then for it to be started you must ice other projects and take them off the table. Make these choices with your business partners. If you maintain this balance, then you will get more projects done in the year, and that should be the true measure of delivery (not how many projects you started).
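The entry gate described above can be expressed as a simple checklist. The sketch below is one possible shape for it, not a prescribed tool: the class, field names and messages are hypothetical, and the capacity of 80 is borrowed from the example in the text.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ProjectProposal:
    """Illustrative record of what is known about a proposed project."""
    name: str
    has_named_sponsor: bool
    requirements_defined: bool
    resources_available: bool


def entry_gate(proposal: ProjectProposal, active_projects: int, capacity: int) -> List[str]:
    """Return the reasons a project should not start yet (empty list means start)."""
    blockers = []
    if not proposal.has_named_sponsor:
        blockers.append("no clear business sponsor or decision-maker")
    if not proposal.requirements_defined:
        blockers.append("scope too broad or ambiguous; split off a well-defined chunk")
    if not proposal.resources_available:
        blockers.append("senior analysts or designers are not available")
    if active_projects >= capacity:
        blockers.append("portfolio is at capacity; ice another project first")
    return blockers


if __name__ == "__main__":
    proposal = ProjectProposal("Customer portal", True, False, True)
    # Assume the team runs effectively at about 80 concurrent projects.
    for reason in entry_gate(proposal, active_projects=80, capacity=80):
        print("Blocked:", reason)
```

Returning the list of blockers, rather than a bare yes or no, gives you the agenda for the conversation with the business sponsor.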

Project reporting and communication – Doing a project is a team effort. You have many staff with different backgrounds and skills from different organizations that must come together to deliver on a single blueprint to a common goal. With such a diverse team, effective communications are paramount. And yet, often, the only formal communications are those that are directed towards senior management. There are three primary audiences with which a project manager should communicate formally: senior management, the business customer, and the project team. Further, the project manager should leverage the same detailed reporting and analysis used to manage the project, quickly reformatting it into a digestible report for each audience.

Oftentimes, you find the project managers are burdened by multiple reporting systems where they are manually entering the same information two or three times. And then middle management for both IT and the business demand additional reports on metrics that are not useful and forms for approval that are bureaucratic and repetitive. Meanwhile, the critical data (e.g. risk items, resource utilization, critical path delays) are not reported broadly if at all. And the project manager is overwhelmed with the busy work versus the real task at hand.

So streamline the reporting process. Ensure your team has a single, effective project reporting tool, and invest in one if it does not. I recommend that all but the smallest projects produce a weekly ‘4-Box’ report. This one-pager can be used for all three primary audiences and ensures the project manager, the sponsor and the key stakeholders are paying attention to the important aspects of the project.

I will be placing a 4-Box sample on the reference page for your use later this month. But the key components are simple:

  • a landscape page with 4 quadrants, a left margin column and a header section
  • the header consists of the Project Title centered, the Project Status in color boxes in the far upper right corner, and the project mission in small font (it is always amazing how many people work on a project and do not know what it is intended to do, thus the mission on every communication)
  • the left margin contains the names and phone numbers of the project manager, sponsor, and all key participants and stakeholders. Thus everyone knows who to call if there is an issue or question
  • The upper left quadrant contains a brief description and the key milestones for the project with dates and a status indicator (e.g. completed, underway, etc)
  • The lower left quadrant box contains accomplishments and progress for the past week (or time period). There should be a brief description of progress and a listing of the key milestones or tasks completed.
  • The upper right quadrant is a listing of the key risks and issues the project faces. They should be catalogued and a status indicated (e.g. Mitigated, Underway, Open) with a color status as well.
  • The bottom right quadrant should provide what will get done this next week or time period, by milestone or significant task, with dates and owners.
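The components above map naturally onto a small data structure. The sketch below is only one way to capture them and is not the 4-Box sample itself; the class name, field names and plain-text layout are assumptions, and a real report would be laid out as a landscape page.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FourBoxReport:
    """One-page weekly status report; field names are illustrative."""
    title: str
    status: str                  # e.g. "Green", "Amber", "Red" (assumed scale)
    mission: str
    contacts: List[str]          # left margin: names and phone numbers
    milestones: List[str]        # upper left quadrant
    accomplishments: List[str]   # lower left quadrant
    risks: List[str]             # upper right quadrant
    next_steps: List[str]        # lower right quadrant

    def render(self) -> str:
        """Render the report as plain text, one labeled section per box."""
        sections = [
            ("Contacts", self.contacts),
            ("Milestones", self.milestones),
            ("Accomplishments this period", self.accomplishments),
            ("Risks and issues", self.risks),
            ("Planned for next period", self.next_steps),
        ]
        lines = [f"{self.title} [Status: {self.status}]", f"Mission: {self.mission}"]
        for heading, items in sections:
            lines.append("")
            lines.append(heading + ":")
            lines.extend(f"  - {item}" for item in items)
        return "\n".join(lines)


if __name__ == "__main__":
    report = FourBoxReport(
        title="Control Room Consolidation Pilot",
        status="Amber",
        mission="Consolidate two regional control rooms into one centre",
        contacts=["Project manager: (name, phone)", "Sponsor: (name, phone)"],
        milestones=["Site selected: completed", "Cutover rehearsal: underway"],
        accomplishments=["Network links installed"],
        risks=["Staff training behind schedule: Open"],
        next_steps=["Run cutover rehearsal (owner: operations lead)"],
    )
    print(report.render())
```

Because the same object feeds every audience, the project manager enters the information once and no one receives a differently spun version.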

Additional information can be used to augment the report, but the key is that you can now use the same one page to communicate effectively with all your audiences. This ensures everyone is on the same page (literally) with a minimum of effort. Note also that we avoid the ‘creative writing’ of project status reports, on which so many organizations waste time and which they use to put an optimistic spin on the project progress. Instead, just the facts.

By aligning your project process and teams to these two best practice approaches, you will find:

  • you are not starting projects before they are ready to be started
  • you will run your project factory at optimal output and effectiveness
  • you will lighten the overhead load on your project managers, so they can do more real work
  • your project teams will be on the same page (and thus more effective)
  • you and your businesses will know what is going on and can identify issues much earlier and solve them more quickly

In essence, you will deliver projects more successfully.

What are the variations on these approaches that you have used with success? What would you do differently? What other areas of project delivery are problematic that you have solutions for? Have a wonderful holiday and I look forward to your perspectives.

Best, Jim