Software Reliability Definitions

Below are some commonly asked questions concerning software reliability definitions.

Definition of software failure

As per the IEEE 1633, a software failure includes all of the following:

(A) The inability of a system or system component to perform a required function within specified limits

(B) The termination of the ability of a product to perform a required function, or its inability to perform within previously specified limits.
(C) A departure of program operation from program requirements.

 

NOTE 1—A failure may be produced when a fault is encountered and a loss of the expected service to the user results.

NOTE 2—There may not be a one-to-one relationship between faults and failures. This can happen if the system has been designed to be fault tolerant. It can also happen if a fault does not result in a failure either because it is not severe enough to result in a failure or does not manifest into a failure due to the system not achieving that operational or environmental state that would trigger it.  

Contrary to popular myth:
1. All of the above definitions apply to software. (A), (B), and (C) are all to be considered when measuring, predicting, or improving software reliability. Reliability engineers tend to consider only the failures that cause termination of service (B) or failure to perform at all (A). However, if the software perfectly performs the wrong function as specified by the requirements, that is also a failure.
 

2. Systematic software failures are counted as failures. 

3. When calculating an actual failure rate, all occurrences of the same defect are counted as failures. Unique occurrences of the same defect are only used when estimating the total defects.
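As a minimal sketch of this counting rule, using a hypothetical failure log with made-up defect IDs and times:

```python
# Hypothetical operational log: (hours at failure, defect id).
# Defect D1 is encountered three times, but it is still one defect.
failure_log = [(10, "D1"), (25, "D2"), (40, "D1"), (55, "D3"), (70, "D1")]
total_hours = 100.0

# Actual failure rate: every occurrence counts, repeats included.
failure_rate = len(failure_log) / total_hours  # 5 / 100 = 0.05 failures/hour

# Estimating total defects: each unique defect counts only once.
unique_defects = len({defect_id for _, defect_id in failure_log})  # 3
```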

What is Software Reliability?

Technically speaking, software reliability is defined as the probability of failure-free operation over some period of time.  However, the term has also been used to describe a collection of development practices aimed at improving software reliability or reducing software defects.  There are software reliability models for predicting failure rate, defect density, and availability during every part of the software lifecycle.  Within software engineering, reliability simply means that the software you develop performs its function as specified.
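Under the common simplifying assumption of a constant failure rate λ, that probability is R(t) = e^(−λt). A minimal sketch, where the rate and mission time are made-up numbers:

```python
import math

def reliability(failure_rate: float, hours: float) -> float:
    """Probability of failure-free operation over `hours`,
    assuming a constant failure rate (the exponential model)."""
    return math.exp(-failure_rate * hours)

# Assumed: 0.001 failures/hour over a 100-hour mission.
r = reliability(0.001, 100)   # about 0.905
```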

History of Software Reliability

The first recorded software failure in modern times was the Mariner 1 failure in 1962. Over the last 58 years, software reliability modeling has evolved from models used very late in development to models that can be used very early in development. The IEEE 1633 Recommended Practices for Software Reliability, 2016 discusses these models plus the software failure modes effects analysis, reliability driven testing, and additional definitions of software reliability.

Types of Software Reliability Models

Software Reliability Models

Software Reliability Prediction models

These models predict defects or defect density early in the software development process, without using any actual test data. Their primary purpose is to predict the reliability metrics before the code is complete, allowing for cost-effective alternatives in the event that the reliability objectives cannot be met.

Software Reliability Growth Models 

Also known as software reliability estimation models, these estimate remaining defects, failure rate, and related metrics based on observed failure and defect data. They are used later in development, once the software is in a testable state, to make release decisions.

However, because they are used once the code is developed, there are limited alternatives if the software failure rate objectives aren’t being met in the scheduled time frame. The general exponential model, the Weibull model, and the logarithmic model are examples of software reliability growth models.
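As an illustration, the general exponential (Goel–Okumoto) model expresses expected cumulative failures as μ(t) = a(1 − e^(−bt)). The parameters below are assumed for the sketch; in practice they are fitted to observed failure data:

```python
import math

def cum_failures(t: float, a: float, b: float) -> float:
    """General exponential (Goel-Okumoto) growth model: expected
    cumulative failures by test time t, where a is the total expected
    number of defects and b is the per-defect detection rate."""
    return a * (1.0 - math.exp(-b * t))

# Assumed parameters (not fitted): 120 total expected defects,
# detection rate of 0.02 per hour of testing.
a, b = 120.0, 0.02
expected_by_100h = cum_failures(100.0, a, b)  # expected failures by 100 h
remaining = a - expected_by_100h              # estimated remaining defects
```

The release decision then hinges on whether `remaining` (and the implied failure intensity) meets the objective within the remaining schedule.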

Factors that affect software reliability

  • Product factors – These pertain largely to the code and include popular metrics such as complexity.
  • Product risks – These are things that make it more difficult to develop the software.
  • People – These are factors that make it easier for the software people to be productive in developing and testing the software.
  • Process – These are factors that ensure that the software people can repeat the way they develop software from one software project to the next.
  • Techniques – These are the most overlooked factors and represent the specific methods that the software engineers employ when developing the specifications, design, test plans, etc. These should not be confused with the software process.

The most accurate software reliability prediction models are those that measure all 5 factors. These include but aren’t limited to the Shortcut model, Full-scale model, Neufelder Assessment Model and the Rome Laboratory TR-92-52 models.

Software Reliability Assessment

An assessment is used to establish a software reliability range and is employed early in the software development lifecycle. Typically, software reliability assessments predict defect density as well as other metrics. They are the first step in predicting software failure rate, MTTF, availability, etc. Software reliability assessments measure any of the 5 factors proven to affect reliable software: product, risks, people, process, and technique.
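A minimal sketch of that chain from defect density to failure rate and MTTF, where every number is an illustrative assumption rather than an actual assessment output:

```python
# Assumed assessment output: defect density in defects per KSLOC
# (thousand source lines of code).
defect_density = 0.5
size_ksloc = 80                    # estimated size of the code base
predicted_defects = defect_density * size_ksloc        # 40.0 defects

# Assumed simplification: each latent defect manifests at a fixed rate.
per_defect_rate = 0.0005           # failures per defect per operating hour
failure_rate = predicted_defects * per_defect_rate     # 0.02 failures/hour
mttf = 1.0 / failure_rate          # mean time to failure: 50 hours
```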

The software assessment predicts one of seven clusters, from distressed to world-class, each associated with specific ranges of defect density, probability of on-time delivery, and defect removal percentage (also called defect removal efficiency, or DRE).

The assessment is based on all 5 factors.

Assessments are also used to establish alternatives for improving the reliability of the software based on facts and not opinions.

Software reliability statistics

Question: How reliable is software? Answer: It ranges from world-class to distressed. Contrary to popular myth, the software organizations that had fewer defects were also more likely to stay on schedule. When their software was late, it was late by a smaller margin.

The top-level software failure modes

At the very highest level, there are three things that can and do go wrong when developing software.

The most overlooked top-level failure mode is that the specification is missing crucially important details.  The second most overlooked is that the specification itself is incorrect.  This means the software specification, not the customer or system specification.  The third is that the code does not conform to the specifications.

At the next level of abstraction, these things can go wrong.  Each of the items below can apply to all three top-level failure modes.  For example, state management can be faulty because of the specifications or because the code wasn’t written to the specifications.

  1. Faulty state management
  2. Faulty sequences
  3. Faulty timing
  4. Faulty error handling
  5. Faulty data
  6. Faulty processing
  7. Faulty functionality

Common myth #1

“Software failures” and “software failure modes” are often interchanged.  The primary failure mode for hardware is wear-out.  This failure mode doesn’t apply to software.  The “software doesn’t fail” myth arose because reliability engineers think of “wear-out” when they think of “fail”.

Common myth #2

Reliability engineers think that because the code does exactly what the programmer coded it to do, software cannot fail.  This myth is partly due to the faulty interchanging of wear-out and failure, and partly due to the faulty assumption that the software engineer knows exactly what the code is required to do and that the software specifications and design meet the system requirements.  Software systems today are massive and complex.  There is a near-infinite number of possible inputs and paths for even medium-sized systems.  The software engineers can’t know everything.  Even with perfect software specifications and perfect software design, there would still be faults due to the code not meeting the specifications and design.