
Critical software failures that are common, identifiable, and preventable

The Top 3 Edge Cases You Can't Afford to Miss



Press and Magazine Publications

Engineering for a Safer World: The Mind and Mission of Ann Marie Neufelder

International Business Times


Mission Ready Software Launches Requs Software FMEA for a New Era of Safe and Reliable Software Intensive Systems

Yahoo Finance


Top Ten Development Practices Proven to Improve Software Reliability

We have identified 10 common but ineffective development practices, each of which can be replaced with a practice proven to improve software reliability and on-time delivery. This is the first installment of a 10-part educational series. We will also hear some key insights about each practice from a senior engineering leader in the medical device, defense, and avionics industries with over 35 years of experience. This month we will cover the most effective practice: shorter release cycles.
The 10 most effective development practices
Since 1993, we’ve been benchmarking the reliability of software-intensive, mission-critical systems across more than 150 software projects spanning the defense, space, aerospace, energy, electronics, healthcare, and other industries. Our research clearly shows that some popular development practices are not as effective as people think. Our benchmarking database records the actual outcome of each software project as successful, mediocre, or distressed, according to the objective criteria in the table below.
Reliability Measurements of Distressed, Mediocre and Successful Software

#1 – Ten Most Effective Practices – Smaller, More Frequent Release Cycles

This month we will discuss the most effective development practice for reducing defect density – smaller, more frequent release cycles. The figure below illustrates actual data from several software projects. On the y-axis is the actual defect density of the software once deployed – the total software defects found by customers divided by the normalized effective size of the software in KSLOC. On the x-axis is the engineering cycle time – the total months spent in development and test prior to deployment. The data is color-coded by the actual outcome of the software project – distressed, mediocre, or successful.
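For readers who want to reproduce the y-axis metric, here is a minimal Python sketch of the defect density calculation; the defect count and size figures in it are hypothetical examples, not values from our benchmarking database.

```python
def defect_density(field_defects: int, effective_size_ksloc: float) -> float:
    """Defect density = defects found by customers per thousand source lines of code (KSLOC)."""
    if effective_size_ksloc <= 0:
        raise ValueError("effective size must be positive")
    return field_defects / effective_size_ksloc

# Hypothetical example: 42 customer-reported defects against a 120 KSLOC release
print(f"{defect_density(42, 120.0):.2f} defects per KSLOC")  # prints 0.35 defects per KSLOC
```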

The figure below shows that when the engineering cycle is greater than 18 months, there were no successful releases. When the engineering cycle is less than 9 months, nearly all of the projects were either successful or mediocre. The fact that smaller cycles are less risky is nothing new – our data simply supports what many software experts have suspected all along. Smaller cycles are one of the key concepts of agile, incremental, and spiral development.

Engineering cycle time versus defect density
The bad practice of “aiming big” with longer release cycles dates back to the early 1980s, when I started working as a software engineer. So, why are longer release cycles problematic? In short, longer release cycles are more likely to result in a significantly late software project. Our benchmarking data[1] clearly shows that whenever a software project becomes late, the reliability of the software drops significantly, since the software development and testing efforts are almost always compromised. These are just a few of the reasons why longer release cycles cause both late software and unreliable software:
• Software releases become late one day at a time. The longer the release cycle, the more likely people are to waste one day at a time.
• A “kick the can” mindset among all of the engineers leads to late starts.
• Software people think they have plenty of time to make up for the late start. In our database of software projects, there were no cases in which anyone made up for the lost time.
• It is the same mindset as the college student who is given a month to do a project but starts working on it the night before.
• People tend not to think about anything that’s not due this year.
• If they aren’t thinking about it, then it’s not likely that they are working towards it.
• Longer release cycles mean it takes much longer to find out that the software wasn’t what the customer wanted, that the requirements were misunderstood, or that there are serious design flaws.

So the question is – how does one reduce the engineering cycle so as to “aim small, miss small”? There is one way to definitely NOT shorten the release cycle, and that is to arbitrarily chop the release time but not the scope. Shortening the release cycle means splitting the functionality that used to be in one big release into smaller releases. It does not mean taking less time to develop the same functionality. If that worked, I wouldn’t be writing this paper. The right answer depends on how long the cycle is now. If the engineering cycle is well beyond 18 months, trying to shorten it to a few months may be a bit aggressive. Reducing the cycle time to not more than a year may be a more reasonable first step. More than one pass may be required to achieve the ideal release cycle.

One alternative is to have alternating feature and bug fix releases. For example, each type of release is 8-9 months apart, but the feature releases and bug fix releases are offset from each other by 4-5 months. That means the customer is getting something every 4-5 months. The odd-numbered releases, for example, can be new feature releases while the even-numbered releases can be bug fix releases.
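As an illustration only, the short Python sketch below lays out such an alternating calendar; the start date and the 4-month offset (so each release type recurs every 8 months) are hypothetical values chosen to match the example above.

```python
from datetime import date

def add_months(d: date, months: int) -> date:
    """Shift a date forward by whole months (day clamped to the 28th for simplicity)."""
    total = d.month - 1 + months
    return date(d.year + total // 12, total % 12 + 1, min(d.day, 28))

def alternating_schedule(start: date, count: int, offset_months: int = 4):
    """Yield alternating feature and bug-fix releases, one every offset_months."""
    for i in range(count):
        kind = "feature" if i % 2 == 0 else "bug fix"
        yield f"Release {i + 1} ({kind})", add_months(start, i * offset_months)

# Hypothetical calendar: the customer gets something every 4 months,
# while each release type recurs every 8 months.
for name, when in alternating_schedule(date(2025, 1, 15), count=6):
    print(name, when.isoformat())
```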

The biggest obstacle to reducing the engineering cycle is the culture change and dealing with the many bad excuses for the long cycles. Software engineers may complain about the “additional overhead” of smaller releases as opposed to larger, less frequent releases. This is largely nonsense. If there is substantial overhead in “releasing” the software, then the software group isn’t doing what they are supposed to do throughout development to prepare for a release. Marketing people tend to have the biggest angst about shorter releases because they fear that whatever does not make the early release will never happen, or they are simply unwilling to prioritize any feature. If the release cycle is set to be smaller AND engineering doesn’t backtrack on the schedule by allowing extra scope, the marketing people will eventually realize that this is their best bet for getting features predictably on time.

Perspective from an Experienced Engineering Leader
“Without question, delivering smaller, more frequent releases is more effective than aggregating features, functions, bug fixes, etc. into larger releases that require eighteen months or more to complete.  There are a number of reasons for this but I will focus on a few which I believe have the greatest impact.  Shorter development cycles have the benefit of 1) constraining the scope of any one release; 2) establishing a cadence of regular, timely releases; and 3) building the capabilities of the development team.
Limiting releases to a defined time period of several months has the effect of constraining scope.  High predictability in development starts with an accurate estimation of resources and time required regardless of the task.  Estimating smaller scope tasks is inherently easier than estimating large, long-duration often inter-related activities.  Likewise, the accuracy of estimation is also improved by the build-up of experience and historic data which comes with more frequent release cycles.  The constrained scope has the effect of limiting the extent of unintended consequences arising from design changes as individuals are more likely to analyze and comprehend the impact of additions or modifications they are making across the broader codebase.  Larger scope directly translates to increased complexity.
Regular, timely releases instill confidence in the broader organization (e.g. Marketing, Customer Support, Manufacturing, etc.) that there will be opportunities subsequent to the currently planned release to address their needs.  The increased confidence has the effect of reducing the pressure to expand the scope of the “current” release due to a lack of certainty regarding the timing of the “next” release.  Regular releases also provide a ready vehicle for addressing emergent concerns such as safety issues or security vulnerabilities without significant change to the program underway or an unacceptable delay in delivering a resolution to the market.
Software development requires a high degree of teamwork.  Like any other activity which involves teams, the team gets better with practice.  Instituting a release cycle measured in months allows the team to fully exercise all elements of the development process – architecture design, estimating, resource planning, coding, testing, documentation, and release delivery – at least once every year and ideally two to three times per year.  All of these activities improve with repetition.  It’s also important to keep in mind that as a person progresses through his or her career, they often change roles every two to three years.  With longer release cycles, it’s quite likely that persons in key roles such as Product Owner, Architect, Scrum Master, Program Manager, etc. may have experienced only one or, at most, two releases before they transition to a new role if the typical release timeline is 18+ months.

For these reasons, I highly recommend instituting a regular rhythm of major release projects planned for 9-12 month duration with minor updates and bug fixes planned for 3-6 month duration, depending on the nature of your product.  Doing this will improve code quality and team productivity.”

Tom Neufelder, Retired Senior Vice President Philips Healthcare – Diagnostics Imaging


Next month we will discuss the second ineffective software development practice – allowing software engineers to work on auto-pilot.

[1] “The Cold Hard Truth About Reliable Software”, edition 6i, 2019, Ann Marie Neufelder

Top 10 Common Practices that Lead to Software Failures


#3 Requirements testing is necessary but insufficient

Since the 1980s there has been an “either-or” mentality to software testing: either you test the requirements or you test the design/code. It is a common myth that testing the requirements is all that is needed. The fact is that if requirements-based testing were sufficient, there would be no failed projects and no world events due to software failures. Clearly this popular testing approach is not working. The facts[1] show that organizations that develop reliable software on time do both requirements and design/code testing. Here is why.

Testing only requirements may, at best, cover 40% of the code. Requirements-based testing won’t cover:

· Endurance or peak loading (caused the Iowa Democratic Primary Caucus and SCUD missile attack failures)
· Timing (caused the Therac-25 and 2003 Northeast blackout failures)
· Data definition (caused the Ariane 5 and F-22 International Date Line failures)
· State transitions (multiple events due to dead states, prohibited state transitions, etc.)
· Logic (caused the AT&T Mid-Atlantic outage in 1991)
· Fault injection (incorrect fault handling caused the Apollo 11 lunar landing, Qantas Flight 72, Solar and Heliospheric Observatory spacecraft, and NASA Spirit rover failures) – see the test sketch after this list
· Requirements that are missing crucially important details (another cause of the F-22 International Date Line failure)
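To make the fault injection bullet concrete, here is a minimal, hypothetical Python sketch (the sensor and estimator names are invented for illustration and are not taken from any of the systems listed above): the test deliberately forces a dependency to fail and verifies that the fault-handling path degrades gracefully rather than crashing. No written requirement may ever mention the primary sensor failing, which is exactly the kind of path a requirements-only test suite misses.

```python
import unittest
from unittest import mock

class AltitudeEstimator:
    """Toy component that averages two redundant sensor readings."""
    def __init__(self, read_primary, read_backup):
        self.read_primary = read_primary
        self.read_backup = read_backup

    def estimate(self) -> float:
        try:
            return (self.read_primary() + self.read_backup()) / 2.0
        except IOError:
            # Fault-handling path: fall back to whichever sensor still responds.
            for sensor in (self.read_primary, self.read_backup):
                try:
                    return sensor()
                except IOError:
                    continue
            raise RuntimeError("all altitude sensors failed")

class FaultInjectionTest(unittest.TestCase):
    def test_primary_sensor_failure_is_handled(self):
        # Inject the fault: the primary sensor raises instead of returning a value.
        primary = mock.Mock(side_effect=IOError("primary sensor offline"))
        backup = mock.Mock(return_value=1200.0)
        self.assertEqual(AltitudeEstimator(primary, backup).estimate(), 1200.0)

if __name__ == "__main__":
    unittest.main()
```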


Why is this approach so popular? Answer:

· Engineers have difficulty understanding “necessary but not sufficient”.
· People who don’t understand software engineering started this myth in the 1980s because they assumed that testing all requirements is equivalent to testing all code.
· Software engineers hate to test design and code and hence do their best to propagate this myth.


So how does one change this popular but ineffective approach? Requirements management tools such as DOORS present obstacles for testing anything but requirements. Some alternatives include:

· Pull more details into the software requirements specification. The more detailed the SRS, the more code coverage you get when testing.
· Include pictures and tables as informative information (#4 on the top ten list)
· Develop testable requirements for what can go wrong (#8 on the top ten list) – see the example after this list
· Include testing of design (#6 on the top ten list)
· Test the mission and not just one requirement at a time (#8 on the top ten list)
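As a hypothetical illustration of developing testable requirements for what can go wrong, the requirement wording, the PumpController class, and the test below are invented for this article; the point is that once the failure behavior is written down explicitly, it can be verified directly instead of being left implicit.

```python
# Hypothetical requirement SR-114: "If the commanded pump rate is outside
# 0-500 mL/hr, the controller shall reject the command and retain the last
# valid rate."

class PumpController:
    def __init__(self):
        self.rate_ml_per_hr = 0.0

    def command_rate(self, rate: float) -> bool:
        if not (0.0 <= rate <= 500.0):
            return False                   # reject and keep the last valid rate
        self.rate_ml_per_hr = rate
        return True

def test_out_of_range_rate_is_rejected():
    pump = PumpController()
    assert pump.command_rate(300.0) is True
    assert pump.command_rate(9999.0) is False   # the "what can go wrong" case
    assert pump.rate_ml_per_hr == 300.0         # last valid rate preserved
```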

Perspective from an Experienced Engineering Leader

There’s an old adage in software development that states “you can’t test in quality.”  Others state “software doesn’t break.”  Both of these miss the point as they relate to testing of code in development.  Highly Accelerated Life Testing (HALT) and Highly Accelerated Stress Screening (HASS) for electro-mechanical systems demonstrate that you can, and must, utilize testing which forces components to failure to identify and correct inherent design weaknesses and to objectively characterize the reliability of the system as a whole.  While it’s arguable that software does not “break” in the classic sense of a physical change which no longer conforms to original specifications, it does “fail” when it operates in a manner that was unintended with consequences which may range from undetectable by the user to catastrophic depending on the nature of the failure and the effect that failure has on application or system operation.  Just like the breakage of a mechanical component, software failures often require a repair function although it may be in the form of an application restart or system reboot as opposed to component reconditioning or replacement.  A comprehensive approach to testing the software is necessary for ensuring software quality and reliability and requirements-based testing alone is insufficient to meet these goals.

Requirements-based testing is, by definition, only as good as the written requirements on which it is based.  Functions that may have been implemented but not explicitly defined in the requirement set may not be tested at all while those which are ambiguous or lack sufficient definition of edge cases may be only partially tested.  In a complex application or system, there is often a one-to-many or many-to-many relationship of requirements to test cases.  In these situations, completeness of the test cases is dependent on the level of understanding and degree of rigor of the engineers developing and executing the test cases.

To ensure user satisfaction, it is important that a subset of the requirements reflect the user needs of the application or system from which lower-level requirements can then be derived.  The US Food and Drug Administration (FDA) refers to the demonstration of fulfillment of user needs as Validation with a demonstration of conformance to requirements defined as Verification.  The inclusion of actual or representative users during Validation testing is an important element in demonstrating that their needs are met with the application as developed.

The only way to fully understand the completeness of testing is the use of coverage tools that monitor the application in operation to determine which lines of code are being executed during the test cycle.  Even full structural coverage is not enough as the code may operate differently based on conditional flow and the variables which determine the flow. The Federal Aviation Administration (FAA) has adopted a standard for code coverage requirements based on the severity of consequence in the event of a software failure.  DO-178B/C defines five levels of risk for software components as shown in the table below.
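[The DO-178B/C table from the original article is not reproduced here. In rough summary, the standard assigns Level A to software whose failure condition would be catastrophic and requires the most rigorous structural coverage (including modified condition/decision coverage), Levels B and C (hazardous and major failure conditions) require progressively less structural coverage, Level D (minor) is satisfied by requirements-based testing, and Level E (no safety effect) carries no such objectives.]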


As you can see from the table, in avionics software, requirements testing is sufficient only for Level D components, which have at most a minor impact on safety in the event of failure.  While I do not believe it is necessary to test to equivalence with Level A requirements for all software, I do believe that testing of most commercial applications falls far short of even Level C requirements.  If the code is not tested to the extent of its potential operating conditions, it cannot be a surprise when it behaves in an unintended fashion.
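To illustrate why line-level coverage alone can mislead, consider this small Python example (the dosing function and test are hypothetical, invented for this article). A single test executes every statement, yet one combination of the compound condition is never exercised; running a coverage tool such as coverage.py in branch mode would flag the untested path.

```python
def allowed_dose(weight_kg: float, renally_impaired: bool) -> float:
    """Toy dosing rule with a compound condition."""
    dose = 0.5 * weight_kg
    if weight_kg > 100 and not renally_impaired:
        dose += 10.0              # extra amount only for heavy, non-impaired patients
    return dose

def test_every_line_runs():
    # This one test reaches every statement above -> 100% line coverage...
    assert allowed_dose(120.0, renally_impaired=False) == 70.0

# ...yet the path where the condition is false is never taken, so a defect such as
# accidentally coding `or` instead of `and` would still pass this test suite.
```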

Test automation is generally required to effectively and efficiently reproduce the conditions which drive variations in conditional flow, including error handling, which may result in unintended or undesired operating behavior.  It is important that automated testing recreate as much as possible the operational conditions which will be experienced by the application throughout its lifecycle.  After completion of unit test by the development engineer, along with integration test for components delivered from multiple developers, latent failures generally are not encountered the first time or two a function is called.  Hundreds, thousands, or millions of cycles may be required for the conditions to arise which result in the failure.  These conditions could be the result of common coding issues such as resource or memory leaks or insufficient storage capacity.  Manual testing cannot efficiently execute the number of cycles required for these issues to be exhibited, nor is it conducive to reproducing the behavior when a failure is encountered.

In summary, a comprehensive approach to testing which includes confirmation of user needs, an objective measure of the code exercised, and the use of automation to emulate lifecycle conditions, in addition to demonstration of full requirements coverage, is necessary to ensure the quality and reliability of the code being delivered.
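The following minimal sketch shows the kind of automated endurance cycling described above; the ImageCache component, the planted leak, and the 100,000-cycle budget are all hypothetical, chosen only to show how repetition surfaces a defect that one or two calls never would.

```python
import tracemalloc

class ImageCache:
    """Toy component with a deliberately planted leak: finished jobs are never evicted."""
    def __init__(self):
        self._jobs = {}

    def process(self, job_id: int) -> None:
        self._jobs[job_id] = bytearray(1024)   # result retained forever -> slow leak

def soak_test(cycles: int = 100_000, budget_mb: float = 10.0) -> None:
    """Run the same operation many times and fail if memory keeps growing."""
    tracemalloc.start()
    cache = ImageCache()
    baseline = tracemalloc.get_traced_memory()[0]
    for i in range(cycles):
        cache.process(i)
    growth_mb = (tracemalloc.get_traced_memory()[0] - baseline) / 1e6
    tracemalloc.stop()
    # One or two calls would never reveal the leak; 100,000 cycles make it obvious,
    # so this assertion fails and flags the planted defect.
    assert growth_mb < budget_mb, f"memory grew {growth_mb:.1f} MB over {cycles} cycles"

if __name__ == "__main__":
    soak_test()
```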


Tom Neufelder, Retired Senior Vice President Philips Healthcare, Diagnostics Imaging
Next month we will discuss the fourth most ineffective software development practice – “Using words when pictures are better”.
[1] “The Cold Hard Truth About Reliable Software”, edition 6i, 2019, Ann Marie Neufelder

Since 1993, we’ve been benchmarking the reliability of software-intensive, mission-critical systems across more than 150 software projects spanning the defense, space, aerospace, energy, electronics, healthcare, and other industries. Our benchmarking database records the actual outcome of each software project as successful, mediocre, or distressed, according to the objective criteria in this table.

We have identified 10 common but ineffective development practices, each of which can be replaced with a practice proven to improve software reliability and on-time delivery.

This is the third installment of a 10-part educational series and covers regular reviews with software engineers and their leads. We will also hear some key insights about each practice from a senior engineering leader in the medical device, defense, and avionics industries with over 35 years of experience. This month we will cover the practice of frequent reviews with software engineers. See last month’s installment if you missed it.