A detailed study of the characteristics of different types of faults is necessary to enhance the accuracy of software reliability estimation. Over the last three decades, several software reliability growth models have been proposed that consider the possible existence of two types of faults in software: (1) independent and (2) dependent faults. In these software reliability growth models, it is assumed that the removal of a leading or independent fault causes the detection of corresponding dependent faults. In practice, it is observed that some dependent faults in software are removed during the removal of other faults. Moreover, dependent faults may have different characteristics, which cannot be ignored. Considering these facts, a detailed study of the different characteristics of both dependent and independent faults has been performed, and based on this study, dependent faults have been categorized into different categories. Furthermore, a new software reliability growth model has been proposed with a revised concept of fault dependency under imperfect debugging by introducing fault removal proportionality. In addition, the effect of the change point on the model’s parameters due to different environmental factors has been considered. The fault reduction factor is considered as a proportionality function. Experimental results establish that the proposed model performs better with respect to the estimated and predicted cumulative number of faults on several real software failure datasets.
For high reliability calculation efficiency and evaluation accuracy, saddlepoint approximation techniques have been introduced into design and optimization under uncertainties. Using saddlepoint approximation has two prerequisites: all random information must be tractable and the saddlepoint equations must be easy to solve. However, these requirements cannot always be met in complex multidisciplinary systems: random variables are sometimes intractable, or the saddlepoint equations are highly nonlinear. To tackle these problems, this study presents an efficient reliability-based multidisciplinary design optimization using a combination of saddlepoint approximation and the third-moment method. A simplified alternative cumulant generating function can be constructed efficiently from the first, second and third moments of a random variable. This cumulant generating function can then be used to approximate the cumulative distribution function and the probability density function of the random variable. Moreover, to obtain better efficiency, the framework of sequential optimization and reliability analysis is introduced in this study. The corresponding formulation of the proposed reliability-based multidisciplinary design optimization is given in detail. Two test problems are solved to show the application of the proposed method.
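To illustrate the mechanics involved, a minimal Python sketch of the classical Lugannani–Rice saddlepoint CDF approximation is given below for a gamma variable whose exact cumulant generating function is known; the approach above would instead substitute a CGF constructed from the first three moments, and all parameter values here are illustrative.

```python
# Minimal sketch of the Lugannani-Rice saddlepoint CDF approximation for a
# Gamma(alpha, theta) variable with a known CGF; illustrative values only.
import numpy as np
from scipy.stats import norm, gamma

alpha, theta = 3.0, 2.0                 # shape, scale (assumed demo values)

def K(t):  return -alpha * np.log(1.0 - theta * t)           # CGF
def K2(t): return alpha * theta**2 / (1.0 - theta * t)**2    # K''(t)

def saddlepoint_cdf(x):
    s = 1.0 / theta - alpha / x         # solves K'(s) = x in closed form here
    w = np.sign(s) * np.sqrt(2.0 * (s * x - K(s)))
    u = s * np.sqrt(K2(s))
    return norm.cdf(w) + norm.pdf(w) * (1.0 / w - 1.0 / u)

x = 9.0
print(saddlepoint_cdf(x))                   # approximate CDF
print(gamma.cdf(x, a=alpha, scale=theta))   # exact CDF for comparison
```

With these values the approximation agrees with the exact CDF to about three decimal places, which is the behaviour that motivates saddlepoint methods in reliability calculations.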
The verification of safety requirements becomes crucial in critical systems where human lives depend on their correct functioning. Formal methods have often been advocated as necessary to ensure the reliability of software systems, albeit with a considerable effort. In any case, such an effort is cost-effective when verifying safety-critical systems. Often, safety requirements are expressed using safety contracts, in terms of assumptions and guarantees.
To facilitate the adoption of formal methods in the safety-critical software industry, we propose a methodology based on well-known modelling languages such as the unified modelling language and the object constraint language. The unified modelling language is used to model the software system, while the object constraint language is used to express the system safety contracts within the unified modelling language model. In the proposed methodology, a unified modelling language model enriched with object constraint language constraints is transformed into a Petri net model that enables us to formally verify such safety contracts. The methodology is evaluated on an industrial case study. The proposed approach allows an early safety verification to be performed, which increases the confidence of software engineers while designing the system.
Reliability analysis of consecutive k-out-of-n systems and their generalizations has attracted a great deal of attention in the literature. Such systems have been used to model telecommunication networks, oil pipeline systems, vacuum systems in accelerators, spacecraft relay stations, etc. In this paper, nonrecursive closed-form equations are presented for the reliability functions and mean time to failure values of consecutive k-out-of-n systems consisting of two types of nonidentical components. The results are illustrated for the reliability evaluation of an oil pipeline system.
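As a point of reference for the closed-form results above, the following minimal sketch implements the standard recursion for a linear consecutive k-out-of-n:F system with independent, possibly nonidentical components; the component reliabilities are illustrative, and a system with two component types is simply a reliability list containing two values.

```python
# Standard recursion for a linear consecutive k-out-of-n:F system with
# independent, nonidentical components: the system fails when k consecutive
# components fail. p[i] is the reliability of component i+1.
def consecutive_k_out_of_n_F(p, k):
    n = len(p)
    q = [1.0 - pi for pi in p]
    R = [1.0] * (n + 1)              # R[j] = reliability of the first j components
    for j in range(k, n + 1):
        run = 1.0
        for i in range(j - k, j):    # probability components j-k+1 .. j all fail
            run *= q[i]
        if j == k:
            R[j] = R[j - 1] - run
        else:
            R[j] = R[j - 1] - p[j - k - 1] * run * R[j - k - 1]
    return R[n]

# Two component types, e.g. alternating reliabilities 0.95 and 0.90:
p = [0.95, 0.90, 0.95, 0.90, 0.95, 0.90]
print(consecutive_k_out_of_n_F(p, k=2))
```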
Multipath Transmission Control Protocol is a new networking protocol that allows data transmission and retransmission across multiple minimal paths. It can reduce data transmission time and ensure data integrity. In computer networks, the capacity of each arc should be considered stochastic because of situations such as failures, partial failures, and maintenance. This article focuses on a stochastic retransmission flow network to evaluate data transmission reliability, defined as the probability that the data can be transmitted successfully through multiple minimal paths within a transmission time threshold. We first propose an algorithm to generate all minimal capacity vectors satisfying both constraints and requirements and then obtain the data transmission reliability. A numerical example and a large-scale case study of the Pan-European Research and Education Network are presented to illustrate the algorithm.
Mechanical systems and their components usually have multiple failure modes and different performance states. Most existing system reliability modelling theories are developed on the basis of binary logic, which lacks sufficient ability to describe the above phenomena. In this article, dynamic Bayesian network theory is employed to evaluate the multi-state reliability of a hydraulic lifting system. First, failure mode and effect analysis and the structural analysis and design technique are applied together to analyse the functionalities and failure modes of the components. Afterwards, the time factor is integrated into the model by considering the state transitions of the components. In this way, the multi-state reliability model of the system is established as a dynamic Bayesian network. Reliability assessment and diagnostic analysis are performed by taking advantage of the dynamic Bayesian network’s bi-directional reasoning ability, and the results are in good agreement with the actual situation. This shows that the proposed approach is effective and convenient for multi-state reliability modelling and analysis of mechanical systems.
A two-stage model is developed between a company and a government. The government, representing the general public, earns taxes on production and chooses the tax rate in stage 1. The company allocates its resources into productive effort and safety effort. The disaster probability is modeled as a contest between the disaster magnitude and the two players’ safety efforts. Three new propositions are developed. First, both the government’s and the company’s safety efforts decrease in the unit safety effort costs, and the company’s safety effort increases in the unit production cost and in the company’s resources. Second, both players’ safety efforts are inverse U-shaped in the disaster magnitude. Third, the company’s safety effort increases, and the government’s safety effort decreases, in taxation. Taxation can thus mitigate companies’ incentive to free ride on governments’ provision of safety efforts.
In this article, a least squares optimization method for multi-fault detection and isolation has been revisited and validated through simulation and experimentation on a pedagogical hydrostatic transmission system. A nonlinear regression analysis has been performed on the state equations, obtained from the bond graph model of the system, to estimate the unknown parameters as part of system identification. The model, assigned the estimated and some known parameters, was then validated against the responses from the test rig. The rig was designed to impose faults (one at a time and/or simultaneously) in different components for the purpose of experimental validation of multi-fault detection and isolation. The model-based fault isolation was done using structural analysis of constraint relations called analytical redundancy relations, the numerical evaluation of which yields residuals. The robustness of the fault isolation was addressed through a linear fractional transformation approach to ensure that residuals remain bounded within adaptive thresholds in the no-fault situation. Finally, the isolated faulty parameters were estimated through a particle swarm optimization algorithm for fault sizing. This article is directed towards corroboration of the existing fault isolation methodologies through experimentation on a power hydraulic circuit.
This article is within the context of decision models aimed at the maintenance of structures and infrastructures in civil engineering. The contribution relies on the construction of a degradation model oriented toward risk analysis. The proposed model can be defined as a meta-model in the sense that it is based on observations while incorporating key features of the degradation process necessary for the maintenance decision. We illustrate the construction of the degradation model with the crack propagation of a submerged reinforced concrete structure subject to chloride-induced corrosion. Furthermore, a set of numerical illustrations is performed to demonstrate the advantages and applicability of the proposed approach in risk management and maintenance contexts.
This article provides some views on how managers should think about risk when facing challenging decision-making situations with large uncertainties and high values at stake. The article is based on the thesis that managers are well qualified for the proper understanding and management of such risk and uncertainties, but that there has been a lack of suitable conceptual frameworks available. We argue that the key to success in this respect is to be open to broader risk perspectives than the common damage probability view. Risk also has to reflect uncertainties, knowledge, and potential surprises. A security case taken from the oil and gas industry is used to illustrate the discussion. The main aim of the article is to guide managers and analysts on the conceptualization of risk and on which techniques and principles to adopt in order to understand, assess, manage, and communicate risk in relation to situations of large uncertainties and high values at stake.
Fatigue crack propagation is a stochastic phenomenon due to the inherent uncertainties originating from material properties, environmental conditions and cyclic mechanical loads. Stochastic processes thus offer an appropriate framework for modelling and predicting crack propagation. In this paper, fatigue crack growth is modelled and predicted by a piecewise-deterministic Markov process associated with deterministic crack laws. First, a regime-switching model is used to express the transition between the Paris regime and the rapid propagation that occurs before failure. Both regimes of propagation are governed by a deterministic equation whose parameters are randomly selected in a finite state space, which has been adjusted from real data available in the literature. The crack growth behaviour is well captured, and the transition between the two regimes is well estimated by a critical stress intensity factor range. The second purpose of our investigation deals with the prediction of the fatigue crack path and its variability based on measurements taken at the beginning of the propagation. The results show that our method, based on this class of stochastic models associated with an updating method, provides a reliable prediction and can be an efficient tool for safety analysis of structures in a large variety of engineering applications. In addition, the proposed strategy requires only little information to be effective and is not time-consuming.
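A minimal simulation sketch of the regime-switching idea is shown below, assuming Paris’ law da/dN = C(ΔK)^m with a switch to a faster regime once ΔK exceeds a critical range; the parameter sets and thresholds are illustrative, not those fitted in the article.

```python
# Regime-switching crack growth sketch: Paris regime, then a rapid-propagation
# regime once the stress intensity factor range exceeds dK_c. All values are
# illustrative placeholders.
import numpy as np

rng = np.random.default_rng(0)
Y, dsigma = 1.0, 100.0              # geometry factor, stress range (MPa)
dK_c = 30.0                         # critical stress intensity factor range
paris_set = [(1e-11, 3.0), (2e-11, 2.9)]   # finite state space, Paris regime
rapid_set = [(5e-11, 3.4), (8e-11, 3.3)]   # finite state space, rapid regime

def cycles_to_failure(a0=1e-3, a_fail=0.05, dN=100, max_cycles=2_000_000):
    C, m = paris_set[rng.integers(len(paris_set))]   # randomly selected params
    switched, a, N = False, a0, 0
    while a < a_fail and N < max_cycles:
        dK = Y * dsigma * np.sqrt(np.pi * a)
        if not switched and dK > dK_c:               # regime switch at dK_c
            C, m = rapid_set[rng.integers(len(rapid_set))]
            switched = True
        a += C * dK**m * dN                          # Euler step over dN cycles
        N += dN
    return N

print([cycles_to_failure() for _ in range(5)])       # five sample crack paths
```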
This article describes a Monte Carlo–based approach for reconstructing missing information in a dataset used by General Electric for reliability analysis, which contains data coming from field observations at inspection of gas turbine components. The approach is based on a combination of the maximum likelihood estimation technique to estimate the failure model parameters, the Fisher information matrix to estimate the confidence intervals on the estimated parameters, and a double-loop Monte Carlo approach to estimate the missing equivalent starts (i.e. turbine state data lacking the associated equivalent starts). The proposed methodology reduces the uncertainty in the estimation of the turbine model parameters. The results of the application of the novel approach to a real industrial dataset are discussed, along with a sensitivity analysis quantifying the robustness of the methodology to different dataset sizes.
It is common industrial practice for machines to work under variable operational conditions. This article develops an imperfect maintenance model for machines working under variable operational conditions, assuming that the failure time distribution of machines under different operational conditions follows the accelerated failure time model. To deal with the case in which information about future operational conditions over the planning time interval is unknown in advance, it proposes a preventive maintenance policy based on the current hazard rate function. An example is studied, and the results are compared with the cases in which the whole operational condition information is given in advance and in which the influence of operational conditions is not considered. The results reveal the necessity of integrating operational conditions into preventive maintenance optimization. In addition, the performance of the proposed policy has been explored for different parameters of the operational conditions.
A system subject to accumulating deterioration and continuous monitoring is analyzed in this article. The system deterioration is modeled using a gamma process, and the system is considered failed when its degradation level exceeds a failure threshold. The maintenance team takes a fixed time to start the maintenance actions. To prevent downtime, an alert signal is sent in advance to the maintenance team when the degradation level of the system exceeds a preventive threshold. At the maintenance time, three maintenance actions can be performed: preventive replacement, corrective replacement, and imperfect repair. We assume that the repair is imperfect in the sense that it reduces a part of the degradation accumulated by the system since the last maintenance action. Under these assumptions, integral equations fulfilled by different performance measures are obtained. Numerical examples are given that illustrate the analytical results.
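The following minimal Monte Carlo sketch, with illustrative parameters, mimics the setting above: a stationary gamma degradation process, an alert at a preventive threshold, and a fixed maintenance-team delay; it estimates the probability that the failure threshold is crossed before maintenance begins.

```python
# Gamma-process degradation with a preventive (alert) threshold M, failure
# threshold L, and a fixed maintenance-team delay. Parameters are illustrative.
import numpy as np

rng = np.random.default_rng(1)
a, b = 0.8, 1.0          # gamma process: shape per unit time, scale
M, L = 6.0, 10.0         # preventive (alert) and failure thresholds
delay, dt = 3.0, 0.05    # maintenance-team delay, simulation step

def fails_before_maintenance():
    x, t, alert_time = 0.0, 0.0, None
    while True:
        x += rng.gamma(a * dt, b)      # independent stationary gamma increments
        t += dt
        if alert_time is None and x >= M:
            alert_time = t             # alert sent to the maintenance team
        if alert_time is not None and t >= alert_time + delay:
            return x >= L              # degradation state when maintenance starts

runs = 20_000
print(np.mean([fails_before_maintenance() for _ in range(runs)]))
```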
The delay time model is a practical way to model random occurrences of failures and the effect of inspection and maintenance actions on the reliability of a repairable system. The delay time model involves two random variables describing the time of initiation of defects and time to failure after the defect initiation. This article presents a clear and structured approach to the evaluation of maintenance cost using the theory of stochastic renewal processes. This article derives the mean, variance, skewness and kurtosis of the maintenance cost in a finite time horizon. Furthermore, the probability distribution of cost is accurately estimated using the Hermite polynomial model. Using the cost distribution, the value at risk is estimated and proposed as a measure to optimize the maintenance program.
Hygiene, safety, and environment risk assessment is becoming a major challenge for companies in the area of security, and it constitutes a precondition in the definition of the strategy to be adopted. The vagueness and uncertainty of input parameters, disputes over opinions between decision-makers, and the absence of integrated models for overall hygiene, safety, and environment risk assessment are handicaps in assessing the acceptability of risks. In this article, we propose an integrated model based on fuzzy logic to assess the overall hygiene, safety, and environment risk of machines. This model makes it possible to organize the machines into a hierarchy, to rank management systems according to the priority of each machine, and to rank the actions to be implemented by priority within each system. The proposed model is implemented in the fuzzy logic toolbox of MATLAB using the Mamdani algorithm. A case study is carried out in the Mineral Waters of Oulmes company in order to test the proposed model. A comparison shows that the proposed model offers more accurate and precise results than classical methods.
Survivability is a crucial property for systems – such as critical infrastructures or military Command and Control Information Systems – that provide essential services, since the latter must remain operational even when the system is compromised by attacks or faults. This article proposes a model-driven method and a tool – MASDES – to assess the survivability requirements of critical systems. The method exploits the use of (1) the (mis)use case technique and UML profiling for the specification of the survivability requirements and (2) Petri nets and model checking techniques for the requirement assessment. A survivability assessment model is obtained from an improved specification of misuse cases, which encompasses essential services, threats and survivability strategies. The survivability assessment model is then converted into a Petri net model for verifying survivability properties through model checking. The MASDES tool has been developed within the Eclipse workbench and relies on the Papyrus tool for UML. It consists of a set of plug-ins that enable (1) the creation of a survivability system view using UML and profiling techniques and (2) the verification of survivability properties. In particular, the tool performs model transformations in two steps. First, a model-to-model transformation generates, from the survivability view, a Petri net model and properties to be checked in a tool-independent format. Second, model-to-text transformations produce the Petri net specifications for the model checkers. A military Command and Control Information System has been used as a case study to apply the method and to evaluate the MASDES tool within an iterative-incremental software development process.
Remaining Useful Life (RUL) estimation plays an important role in implementing a condition-based maintenance (CBM) program, since it can provide sufficient time for the maintenance crew to act before an actual system failure. This prognostic task becomes harder when several deterioration mechanisms co-exist within the same system due to the variability and dynamics of its operating environment, since the RUL obviously depends on the mode that the system is following. In this paper, we propose a multi-branch modeling framework to deal with such problems. The proposed model consists of several branches, each of which represents a deterioration mode and is considered as a hidden Markov model. The system’s conditions are modeled by several discrete meaningful states, such as "good", "minor defect", "maintenance required" and "failure", which are easy to interpret for maintenance personnel. Furthermore, these states are considered to be "hidden" and can only be revealed through observations. These observations are the condition monitoring information in the CBM context. The performance of the proposed model is evaluated through numerical studies. The results show that the multi-branch model can outperform the standard single-branch hidden Markov model in RUL estimation, especially when the "distance" between the deterioration modes is considerable.
The paper describes a framework for testing a class of safety-critical concurrent systems implemented using shared resource specifications. Shared resources contain declarative specifications of process interaction that can be used to derive, in a model-driven way, the most critical parts of a concurrent system. Here, we propose their use to build a state-based model that helps in testing a real implementation of the resource. The framework has been implemented using Erlang and QuickCheck, and its source code is available. The paper also provides a novel parametric operational semantics for shared resources with scheduling policy annotations, a methodology to guide test-case generation from the shared resource specifications, and a classification of common mistakes. We illustrate our framework by applying it to testing Java implementations of a prototypical automated shipping plant.
Monte Carlo simulation is a useful technique to propagate uncertainty through a quantitative model, but that is all. When quantitative modelling is used to support decision-making, a Monte Carlo simulation must be complemented by a conceptual framework that assigns a meaningful interpretation to uncertainty in output. Depending on how the assessor or decision maker chooses to perceive risk, the interpretation of uncertainty and the way uncertainty ought to be treated and assigned to input variables in a Monte Carlo simulation will differ. Bayesian Evidence Synthesis is a framework for model calibration and quantitative modelling, originating from complex meta-analysis in medical decision-making, that can conceptually frame a Monte Carlo simulation. We ask under which perspectives on risk Bayesian Evidence Synthesis is a suitable framework. The discussion is illustrated by Bayesian Evidence Synthesis applied to a population viability analysis used in ecological risk assessment and to a reliability analysis of a repairable system informed by multiple sources of evidence. We conclude that Bayesian Evidence Synthesis can conceptually frame a Monte Carlo simulation under a Bayesian perspective on risk. It can also frame an assessment under a general perspective on risk, since Bayesian Evidence Synthesis provides principles of predictive inference that constitute an unbroken link between evidence and assessment output, opening up for uncertainty quantification that takes qualitative aspects of knowledge into account.
We talk about dynamic reliability when the reliability parameters of the system, such as the failure rates, vary according to the current state of the system. In this article, several versions of a benchmark on dynamic reliability taken from the literature are examined. Each version deals with particular aspects such as state-dependent failure rates, failure on demand, and repair. In dynamic reliability evaluation, the complete behavior of the system has to be taken into account, instead of only the failure propagation as in fault tree analysis. To this aim, we exploit dynamic Bayesian networks and the software tool RADYBAN (Reliability Analysis with DYnamic BAyesian Networks), with the goal of computing the system unreliability. Given the coherence between the results returned by dynamic Bayesian network analysis and those obtained by means of other methods, together with the possibility of computing diagnostic indices, we propose dynamic Bayesian networks and RADYBAN as a valid approach to dynamic reliability evaluation.
The linear Wiener process–based degradation model is commonly used for lifetime assessment and remaining useful life estimation. This article addresses the effects of mis-specification of the linear Wiener process on remaining useful life estimation. First, we study the effects of model mis-specification on parameter estimation and the lifetime distribution. Then, the effects of model mis-specification on remaining useful life estimation and predictive maintenance decision-making are analysed through numerical examples and a case study. The results show that the effect of mis-specifying the linear Wiener process without measurement error as one with measurement error is negligible. Under the inverse condition, however, the mis-specification can result in premature maintenance or maintenance after failure, which increases the maintenance costs.
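As a minimal sketch of the baseline model above: under the linear Wiener process X(t) = x + μt + σB(t) without measurement error, the remaining useful life is the first passage time to the failure threshold and follows an inverse Gaussian distribution; all parameter values below are illustrative.

```python
# RUL under the linear Wiener degradation model without measurement error:
# the first passage time to threshold D is inverse Gaussian distributed.
import numpy as np

mu, sigma = 0.5, 0.3      # drift and diffusion (illustrative)
x, D = 2.0, 10.0          # current degradation level and failure threshold

def rul_pdf(t):
    """Inverse Gaussian first-passage density of the remaining useful life."""
    d = D - x
    return d / (sigma * np.sqrt(2 * np.pi * t**3)) * \
           np.exp(-(d - mu * t)**2 / (2 * sigma**2 * t))

t = np.linspace(0.1, 40, 4000)
pdf = rul_pdf(t)
print("mean RUL  (numeric):", float(np.sum(t * pdf) * (t[1] - t[0])))
print("mean RUL (analytic):", (D - x) / mu)   # equals d / mu for mu > 0
```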
The formation of inorganic scale, particularly calcium carbonate (CaCO3), is a persistent and one of the most serious and costly problems in the oil and gas industries. It may cause partial to complete plugging, blocking valves, tubing and flowlines, and thereby reduce production rates. This article proposes the use of support vector regression to build a nonlinear mapping between a set of variables (surface cladding, material, temperature, pressure, brine composition, and fluid velocity) and the scale build-up. The support vector regression is fed with data gathered from laboratory tests carried out on coupons that simulate realistic downhole conditions encountered in oil well bores from the pre-salt fields in Brazil. The proposed failure prediction framework is comprehensive, as it entails the stages of hyperparameter tuning, variable selection, and uncertainty analysis, which are addressed by a combination of particle swarm optimization and bootstrap with support vector regression. The obtained results suggest that the bootstrapped particle swarm optimization + support vector regression is a valuable tool that may be used to support condition-based maintenance decisions.
The survival signature has recently been presented as an attractive concept to aid the quantification of system reliability. It has similar characteristics to the system signature, which is well established, but contrary to the latter it is easily applicable to systems with multiple types of components. We present an introductory overview of the survival signature together with new results to aid computation. We develop nonparametric predictive inference for system reliability using the survival signature. The focus is on the failure time of a system, given failure times of tested components of the same types as used in the system.
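A minimal sketch of how the survival signature can be computed by enumeration for a small two-type system is given below; the structure function, component assignment and survival probabilities are illustrative.

```python
# Survival signature phi(l1, l2) for a small two-type system: the probability
# the system works given exactly l1 type-1 and l2 type-2 components work,
# with the working components chosen uniformly within each type.
from itertools import combinations
from math import comb

type1 = [0, 1]          # component indices of type 1 (illustrative)
type2 = [2, 3]          # component indices of type 2

def structure(up):      # system works iff (c0 and c2) or (c1 and c3)
    return (up[0] and up[2]) or (up[1] and up[3])

def phi(l1, l2):
    total = working = 0
    for s1 in combinations(type1, l1):
        for s2 in combinations(type2, l2):
            up = [i in s1 or i in s2 for i in range(4)]
            working += structure(up)
            total += 1
    return working / total

def system_survival(p1, p2):
    """P(system works) when each type-k component works with probability pk."""
    return sum(phi(l1, l2)
               * comb(2, l1) * p1**l1 * (1 - p1)**(2 - l1)
               * comb(2, l2) * p2**l2 * (1 - p2)**(2 - l2)
               for l1 in range(3) for l2 in range(3))

print([[phi(l1, l2) for l2 in range(3)] for l1 in range(3)])
print(system_survival(0.9, 0.8))   # matches the exact value 0.9216 here
```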
Vossloh SKL14 spring clips (a track fastening system) are vulnerable to fatigue damage during their lifetime due to excitations caused by traffic loads. This article develops a method for reliability analysis of SKL14 spring clips based on a fracture mechanics approach. First, a linear dynamic analysis of the track is carried out to calculate dynamic responses under various traffic loads. The displacement time histories are applied to a finite element model of the Vossloh SKL14 clip to obtain cyclic stresses. The equivalent stress range is described by a lognormal distribution. Then, a fracture reliability analysis is carried out to assess crack propagation in the spring clip based on Paris’s law; this fatigue crack growth is dominated by a Mode I mechanism. A linear limit state function based on fracture mechanics is derived in terms of random variables, and the first-order reliability method is employed for reliability estimation. Finally, the influence of the various random variables on the overall probability of failure is studied through sensitivity analysis.
An efficient method for time-dependent fatigue reliability assessment of mechanical components under random loadings is proposed. Fatigue damage induced by random loading is normally a high-cycle fatigue problem. The randomness of high-cycle fatigue damage is treated in two respects: uncertainty quantification of the external random loading, and uncertainty quantification of the fatigue property of the structural component. The former is characterized by a Gaussian distribution derived from the rainflow cycle distribution, the median stress-life (S-N) curve, and the linear damage accumulation rule. The latter is described by the probabilistic stress-life (P-S-N) curve based on the log-normal distribution. The proposed method combines these two aspects to evaluate the expectation and confidence interval of fatigue reliability. Finally, a numerical example is provided to verify the effectiveness of the developed approach. A comparison with the bootstrap method is also carried out and shows the reasonable accuracy of the proposed method.
Due to the inherent uncertainty associated with various factors in the design stage, considering uncertainty is important in system design. In this article, a redundancy allocation problem with an active strategy and choice of component type is studied, in which the system engineer has insufficient knowledge about the exact values of some characteristics of components, such as reliability and cost. The impreciseness is considered in terms of fuzzy numbers with triangular and trapezoidal membership functions. To achieve a robust design under different realizations of uncertain parameters, robust models are developed, which is the first attempt in the area of redundancy allocation problems under fuzziness. In the worst case, extreme values of the uncertain parameters are considered. In the realistic case, the uncertain parameters are handled with the help of the credibilistic approach of fuzzy programming and the expected value of fuzzy numbers. In other words, the robust model makes a trade-off between the expected value of system reliability as a performance measure, the deviation of system reliability, and the constraint violation, where the penultimate one assures optimality robustness and the last one preserves feasibility robustness. The proposed models can help risk-averse system/product designers and managers to deal easily with the inherent uncertainty in the design stage. At the end, numerical examples are presented and the results are analyzed.
With the number of load applications as the life parameter, a method for modeling the life probability distribution of mechanical components is studied. First, the failure behavior of mechanical components is analyzed, and a method for describing this failure behavior is developed using the stress–strength interference theory. Then, based on the failure behavior, the life probability density function, life cumulative distribution function, reliability, failure rate, and mean life of a component, with the number of load applications as the life parameter, are derived; these can embody the effects of parameters including stress, strength and its degradation. Finally, for cases with different stress and strength parameters, the life probability distribution characteristics, reliability, failure rate, mean life, and reliable life of components are studied. The results show that the life probability distribution characteristics of components depend on the specific values of the parameters describing the failure behavior. With the proposed method and model, as long as the parameters of strength, stress, strength degradation, and so on are known, the life probability distribution functions of components can be calculated, the evolution of component reliability and failure rate with the number of load applications can be obtained, and the mean life and reliable life of components can be determined.
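A minimal numerical sketch of the underlying stress–strength calculation is given below: for a fixed (but random) strength and i.i.d. stresses, reliability after n load applications is R(n) = E[F_stress(S)^n]; strength degradation is omitted for brevity, and all distribution parameters are illustrative.

```python
# Stress-strength interference with the number of load applications n as the
# life parameter: R(n) = integral of f_strength(x) * F_stress(x)**n dx.
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

mu_S, sd_S = 600.0, 40.0     # strength (MPa), illustrative
mu_s, sd_s = 450.0, 50.0     # stress per load application (MPa), illustrative

def reliability(n):
    f = lambda x: norm.pdf(x, mu_S, sd_S) * norm.cdf(x, mu_s, sd_s) ** n
    return quad(f, mu_S - 8 * sd_S, mu_S + 8 * sd_S)[0]

for n in (1, 10, 100, 1000):
    print(n, reliability(n))
# R(n) decreases in n, while the implied failure rate per load decreases:
# components with low strength tend to fail early (a selection effect).
```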
Over the years, condition monitoring of rotating machines has been extensively applied to enhance equipment reliability and maintenance cost-effectiveness through the early detection and reliable diagnosis of incipient machine faults. Earlier studies suggest that bispectrum analysis is a good tool for detecting and distinguishing rotor-related faults in rotating machines with a significantly reduced number of vibration sensors. Here, trispectrum analysis is also applied to the measured vibration data, so as to explore its usefulness in the diagnosis. It is observed that the trispectrum further improves the reliability of fault diagnosis in rotating machines. This article presents the results and observations related to the bispectrum and trispectrum analyses for fault diagnosis, through an experimental rig on which different faults were simulated.
To evaluate and optimize the design of space tracking, telemetry and command systems, it is important to perform mission reliability analysis of the tracking, telemetry and command system. Considering the complexity of the tracking, telemetry and command system configuration and mission process, it is nearly infeasible to model and analyze the system manually. For accuracy and efficiency reasons, system designers need an integrated set of methods and tools for modeling specifications and performing reliability analysis. This article presents an XML-based (extensible markup language–based) schema named reliability modeling language to formally represent the data and information necessary for building the mission reliability model of a tracking, telemetry and command system. To facilitate the evaluation of mission reliability measures, we propose the improved extended object–oriented Petri net formalism, an extension of object-oriented Petri nets, to perform mission reliability simulation and analysis. The standard descriptive model in reliability modeling language can be automatically and directly transformed into an extended object–oriented Petri net model by applying model transformation rules and an algorithm. The proposed approach is illustrated and validated by examples that consider complex situations such as component phase dependence, non-exponential failure rates, instantaneous repair, and different work start and end times. The simulation results show a good approximation compared with the results of analytical models.
Probability of failure on demand is commonly used to measure the performance of safety-instrumented systems in the low-demand mode, while probability of failure per hour is the measure in the high-demand mode. In the current IEC 61508, a demand frequency of once per year is regarded as the borderline between the two modes. However, few explanations can be found of why the borderline lies there. This study focuses on the intermediate area between the two demand modes, examines how well probability of failure on demand adapts to different demand rates, and then proposes discrimination criteria for the demand modes based on a Markov analysis of this adaptability. According to these criteria, in the high-demand mode, where probability of failure on demand is not an effective measure, the equipment has a higher probability of entering a hazardous state before a proof test while the safety-instrumented system is unavailable; otherwise, the safety-instrumented system runs in the low-demand mode. The mean downtime of the safety-instrumented system before its failure is revealed by a proof test can help to locate the borderline, which is thus influenced by the configuration of the safety-instrumented system.
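The borderline question can be made concrete with a minimal sketch under assumed illustrative rates for a 1oo1 system: PFDavg ≈ λ_DU τ/2, and the low-demand hazardous-event frequency approximation λ_dem · PFDavg becomes questionable once it approaches λ_DU itself, which is the intermediate region analysed above.

```python
# Low-demand approximation of the hazardous-event frequency for a 1oo1
# safety-instrumented system; rates and proof-test interval are illustrative.
lam_du = 1e-6          # dangerous undetected failure rate (per hour)
tau = 8760.0           # proof-test interval: one year (hours)
pfd_avg = lam_du * tau / 2

for dem_per_year in (0.1, 1.0, 10.0, 100.0):
    lam_dem = dem_per_year / 8760.0
    haz_low = lam_dem * pfd_avg        # low-demand-mode hazard-rate estimate
    print(f"{dem_per_year:6.1f} demands/yr: approx hazard rate "
          f"{haz_low:.2e}/h vs PFH bound {lam_du:.2e}/h")
# Once lam_dem * pfd_avg approaches lam_du, the PFD-based view loses meaning
# and the system should be treated in the high-demand (PFH) mode; with these
# numbers the crossover occurs near a few demands per year.
```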
This article concerns fault diagnosis and prognosis for stochastic discrete event systems. For this purpose, partially observed stochastic Petri nets are introduced that include the sensors used to measure events and markings and the Markovian stochastic dynamics used to represent failure processes. Timed observation sequences result from this modeling, and the probabilities of timed and untimed marking trajectories consistent with a given timed observation sequence are systematically computed. Diagnosis in terms of fault probability is obtained as a consequence and compared with the belief of faults that is usually used for diagnosis issues. Confidence factors based on fault probabilities are also proposed. Finally, state estimation and fault prediction are investigated, and probability of future faults is obtained as a consequence. An application case is studied to illustrate the method.
Fault tree analysis is a powerful and computationally efficient technique for safety analysis and reliability prediction. It decomposes an undesired failure into multiple possible root causes by constructing a sub-event tree and expanding it into basic events. Classical reliability theory, which uses probability theory to quantify the uncertainties of basic events, encounters many challenges when failure data are limited. In this case, uncertainty quantification should be carried out based on subjective information, such as experts’ assessments or engineers’ experience. As a generalization of probability theory, imprecise probability theory can quantify subjective information as upper and lower expectations or previsions. In this article, a fault tree analysis algorithm incorporating subjective information into imprecise reliability models of basic events is proposed to calculate the failure interval of a lubricating oil warning system.
Underground pipeline structures may exhibit multiple failure modes, any of which can lead to system failure. These failure modes are time-variant processes, and the failure rate increases over time. The failure modes may be correlated due to common random variables. In many cases, failure modes are assumed to be independent, and underground pipeline failure is evaluated by neglecting correlations between failure modes. However, neglecting correlations may lead to gross errors in pipeline reliability analysis. Correlations between time-dependent failure modes due to corrosion-induced deflection, buckling, wall thrust and bending stress for a buried flexible steel pipe have been assessed in this study. The reliability index and system failure probability have been analysed using Monte Carlo simulation. Parametric analysis indicates that soil modulus, soil density, pipe stiffness and external loading are the most influential random variables. The estimated reliability can be utilised to develop maintenance strategies during the pipe service lifetime in order to avoid unexpected failure or collapse.
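A minimal Monte Carlo sketch of the correlation effect is given below: two illustrative limit states share the soil modulus, so their failures are correlated, and the series-system failure probability is compared with the value obtained under the (incorrect) independence assumption. The limit states and distributions are stand-ins, not those assessed in the article.

```python
# Why ignoring correlation between failure modes can mislead: two limit
# states share the soil modulus E. All numbers are illustrative.
import numpy as np

rng = np.random.default_rng(2)
n = 1_000_000
E = rng.normal(10.0, 2.0, n)       # shared soil modulus
W = rng.normal(5.0, 1.0, n)        # external load
t = rng.normal(8.0, 0.5, n)        # wall thickness

g1 = 1.5 * E + t - 3.0 * W - 2.0   # mode 1 fails when g1 < 0
g2 = 2.0 * E - 2.5 * W - 4.0       # mode 2 fails when g2 < 0

p1, p2 = np.mean(g1 < 0), np.mean(g2 < 0)
p_sys = np.mean((g1 < 0) | (g2 < 0))     # series-system failure (correlated)
p_indep = 1 - (1 - p1) * (1 - p2)        # independence assumption
print(f"P(mode1)={p1:.4f}  P(mode2)={p2:.4f}")
print(f"system: correlated={p_sys:.4f}  independence assumption={p_indep:.4f}")
```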
Modeling and analyzing the behavior of mechanical systems is a promising way to achieve higher stability, reliability, availability and operability. Accurately describing the complex working principle of a mechanical system, especially stepwise operation, is one of the crucial issues in developing a model for monitoring the state of a mechanical system. This article introduces a practical method for system behavior modeling and failure analysis of mechanisms with multi-operation principles, using high-level Petri nets as the modeling language. The importance of faults and the failure propagation mechanism are investigated via the state vector and the mutation vector. A case study validates the effectiveness of the proposed method and provides clues for identifying weak links and evaluating the reliability of the solar array system.
The hub location problem has been one of the most interesting areas of location problems in recent decades. Hubs are critical centers that play a significant role in networks such as logistics, distribution and transportation networks. It is therefore necessary to design reliable hub networks that can withstand different interruptions. In this article, we propose a new concept in hub location problems, called the preventive reliable hub location problem, which is based on passive defense definitions. When valuable objects are transported via networks, many intentional crimes such as attack or robbery may occur. To make transportation faster and cheaper, it is desirable to use hub and spoke networks. The preventive reliable hub location problem is accordingly designed to secure the network and prevent loss of values. To make the network more reliable and protect all common properties, three new objects have been added to the usual hub networks: fake hubs, fake allocations and fake flows. A mixed integer linear programming model is used to formulate the preventive reliable hub location problem, which is solved using data from the literature. Finally, the results are presented.
Solder joint fatigue is a major concern in circuit board assemblies. A batch of FR4 test boards with different component layouts was assembled. These test boards were then put into an environmental chamber and subjected to thermal cycling. During the test, the samples were removed from the chamber at given intervals and examined by acoustic micro imaging. The solder joint reliability from origin to failure was obtained by processing these acoustic micro imaging data. The impact of different floor plans and component layouts on solder joint reliability was analysed. Remarkably, the results show that the floor plan and component layout have a significant influence on solder joint reliability: components placed in a mirrored configuration in a circuit board assembly have lower reliability than non-overlapping configurations and single-sided assembly. Finally, a stress-based finite element model with two simulation scenarios was developed to correlate the reliability findings with the experimental results.
Combining the advantages of moving least squares approximation, a new method for estimating higher-order conditional moments is established, which is useful in importance analysis and supplements the standard variance-based importance analysis. On the other hand, after obtaining the first four moments, the probability density function can be emulated by means of the Edgeworth expansion procedure; thereby, a new method to compute the moment-independent importance measure δ proposed by Borgonovo is presented in this article. Two examples are employed to demonstrate that it is necessary to analyze higher-order conditional moments in importance analysis. At the same time, we study the feasibility of the Edgeworth expansion-based method for estimating the index δ by applying it to these examples.
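A minimal sketch of the Edgeworth step is shown below: a density is emulated from its first four moments and checked against a chi-square distribution whose moments are known in closed form; the example and its parameters are illustrative.

```python
# Edgeworth expansion of a density from its first four moments, tested
# against a chi-square(k) variable with known skewness and excess kurtosis.
import numpy as np
from scipy.stats import norm, chi2

k = 8                                 # chi-square degrees of freedom
mean, var = k, 2.0 * k
skew = np.sqrt(8.0 / k)               # standardized third moment
exkurt = 12.0 / k                     # excess kurtosis

def he(n, z):                         # probabilists' Hermite polynomials
    if n == 3: return z**3 - 3*z
    if n == 4: return z**4 - 6*z**2 + 3
    if n == 6: return z**6 - 15*z**4 + 45*z**2 - 15

def edgeworth_pdf(x):
    s = np.sqrt(var)
    z = (x - mean) / s
    corr = (1 + skew / 6 * he(3, z)
              + exkurt / 24 * he(4, z)
              + skew**2 / 72 * he(6, z))
    return norm.pdf(z) / s * corr

for x in (4.0, 8.0, 14.0):
    print(x, edgeworth_pdf(x), chi2.pdf(x, k))   # expansion vs exact density
```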
In the subsea oil and gas industry, new systems and new technologies are often met with skepticism, since the operators fear that they may fail and lead to production loss, costly repair interventions, and hydrocarbon leakages to the sea. Before a new system is accepted, the producer has to convince the operator that it is fit for use and has high reliability. This is often done through a technology qualification program. An important part of the technology qualification program is to predict the failure rate of the new system in its future operational context. Identifying potential problems and estimating the failure rate at an early stage in the system development process are important owing to the high cost of design modifications later in the development process. This article presents a practical approach to reliability prediction of new subsea systems based on available operational data from similar, known systems from the topside environment and a comparison between the two systems. The application of the approach is illustrated by an example of a subsea pump.
Performability relates the performance (throughput) and reliability of software systems whose normal behaviour may degrade owing to the existence of faults. These systems, naturally modelled as discrete event systems using shared resources, can incorporate fault-tolerant techniques to mitigate such degradation. In this article, compositional fault-tolerant models based on Petri nets are proposed, which make performability sensitivity analysis easier. In addition, two methods to compensate for the existence of faults are provided: an iterative algorithm to compute the number of extra resources needed, and an integer-linear programming problem that minimises the cost of incrementing resources and/or decrementing fault-tolerant activities. The applicability of the developed methods is shown on a Petri net that models a secure database system.
When tests are performed in scenarios such as reliability demonstration, two extreme possibilities are to perform all required tests simultaneously or to test all units sequentially. From the perspective of testing time, the former is typically preferred, but in high-risk scenarios, for example when the failure of a tested unit could have disastrous consequences, it is better to have the opportunity to stop testing after a failure occurs. An analogous situation appears in medical testing, with patients being the ‘units’, when new medication is to be tested to confirm its functionality while possibly severe (side) effects are not yet known. There is a wide range of test scenarios between these two extremes, with groups of units being tested simultaneously. This article discusses such scenarios in a basic setting, assuming that the total number of required tests has been set, for example based on other criteria or legislation. A new criterion to guide the choice of suitable test group sizes is presented. Throughout, the aim is for high reliability, with testing stopped if any unit fails, following which the units will not be approved. Any consecutive actions, such as improvement of the units or dismissing them, are not part of the main considerations in this article. While in practice the development of complex models and decision approaches may appear to be required, a straightforward argument is presented, which leads to results that can be widely applied and easily communicated.
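The trade-off can be sketched with a simple calculation under assumed notation: n zero-failure tests are run sequentially in groups of size g, stopping at the first failure; larger groups shorten the campaign but expose more units before a failure can stop it. The numbers below are illustrative.

```python
# Group testing trade-off: n required zero-failure tests in groups of size g,
# groups run one after another, testing stops at the first failure.
def group_testing_summary(n, g, R):
    """n tests, group size g (g divides n), per-unit survival probability R."""
    m = n // g                        # number of sequential test stages
    # stage j is reached with probability R**(g*(j-1)), so the expected
    # number of units actually put on test before stopping is:
    exp_units = g * sum(R ** (g * (j - 1)) for j in range(1, m + 1))
    p_all_pass = R ** n               # probability the campaign approves the units
    return m, exp_units, p_all_pass

n, R = 60, 0.98
for g in (1, 5, 12, 30, 60):
    m, units, p = group_testing_summary(n, g, R)
    print(f"g={g:3d}: stages={m:3d}, expected units tested={units:6.2f}, "
          f"P(all pass)={p:.3f}")
```

Note that P(all pass) is the same for every group size; the criterion separating group sizes is the balance between test duration (stages) and expected exposure (units tested).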
The fragmentation of a network is used to understand the effects of element removals on its cohesion. Minimal information is required to fragment a network, namely its topology. Continuous fragmentation of a network can be used to uncover important/critical elements in the network. This article proposes a bi-objective optimization model that, when solved, provides the most economical network fragmentation strategies for increasing element fragmentation cost. After describing and solving the model, the manuscript shows, via experimentation, how the results of the model can be used as a surrogate metric for understanding element importance in real service networks. The experimentation is complemented with a classical example of social network analysis. The results show that the proposed fragmentation models can be used as a guide to identify sets of elements that contribute to the successful performance of a system.
This article presents a reliability-based topology optimization design of a linear micromotor, including a multitude of cantilever piezoelectric microbimorphs. The design is considered for quasi-static and linear conditions, and a relatively new computational approach called the smoothed finite element method is applied. Since microfabrication methods are used for manufacturing this type of actuator, the uncertain variables become very important. Hence, these variables are treated as constraints during the topology optimization design process, and reliability-based topology optimization is conducted. To avoid the overly stiff behavior of finite element method modeling, the cell-based smoothed finite element method (a branch of the smoothed finite element method) is used for this problem. Here, after finding the most effective random design variables using the performance measure approach and first-order reliability approximation, the topology optimization procedure is implemented in order to find an optimum piezoelectric volume fraction (as an unknown constraint for the first step) using the piezoelectric material with penalization and polarization model and the method of moving asymptotes optimizer. After determining the problem constraints, the topology optimization design follows. This algorithm is called reliability-based design optimization using an independent approach. Numerical tests show that the final characteristics of the optimized model using cell-based smoothed finite element methods are improved compared with standard finite element methods.
Assessing the reliability of systems with mobile components, that is, components whose locations and interactions change during the mission of the system, raises a number of specific modeling issues. In this article, we compare two candidate modeling formalisms for doing so: AltaRica and PEPA nets. We study their respective advantages and drawbacks, and we show the benefits of a cross-fertilization.
In the UK railway industry, it is necessary to show that risks relating to any design solution are as low as reasonably practicable. This is a legal requirement, and in certain instances complying with a standard may be enough; in other circumstances, however, a formal risk assessment may have to be performed. It seems clear that there is a continuum between the two positions, but how do we know what to do and whether that is enough? This article seeks to address this question. The UK risk acceptance approach, including the as-low-as-reasonably-practicable principle, is explored, and the recent initiative of the common safety method is discussed. An example of a compliance safety process against standards is given using a case study based upon changes to rolling stock. A further example, in which risk assessment and a cost benefit analysis were employed to support a safety argument for a non-compliant gradient, is then presented, followed by concluding remarks.
In this article, the maximum likelihood estimates of the model parameters under step-stress partially accelerated life tests (SSPALT) are obtained assuming the Weibull distribution with Type-II censored data. The confidence bounds of the parameters are also obtained. In addition, optimum step-stress test plans are developed. The optimum test plan determines the optimal stress change point that minimizes the generalized asymptotic variance of the maximum likelihood estimators of the model parameters, thereby improving the quality of the statistical inference.
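A minimal sketch of the censoring mechanics is given below: Weibull maximum likelihood estimation from the r smallest of n simulated lifetimes (Type-II censoring), with the step-stress structure omitted for brevity; all parameter values are illustrative.

```python
# Weibull MLE under Type-II censoring: only the r smallest of n lifetimes are
# observed; the remaining n - r units are censored at the r-th failure time.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(3)
n, r = 50, 30
shape_true, scale_true = 1.8, 100.0
x = np.sort(scale_true * rng.weibull(shape_true, n))[:r]   # observed failures

def neg_loglik(theta):
    k, lam = theta                     # Weibull shape and scale
    if k <= 0 or lam <= 0:
        return np.inf
    z = x / lam
    ll = np.sum(np.log(k / lam) + (k - 1) * np.log(z) - z**k)  # failure terms
    ll += (n - r) * (-(x[-1] / lam) ** k)   # n - r units censored at x[(r)]
    return -ll

res = minimize(neg_loglik, x0=[1.0, np.mean(x)], method="Nelder-Mead")
print("MLE shape, scale:", res.x)
```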
In non-probabilistic structural uncertainty analysis, uncertain input variables, such as loads and material properties, propagate to the output responses, which include displacement, stress, compliance, etc. To measure the effect of these non-probabilistic input variables on the output response, two new uncertainty importance measures based on the non-probabilistic reliability index are discussed. For a linear limit state function, analytical solutions of the importance measures are derived. To reduce the computational effort, the discretization method and the surrogate model method are presented to calculate the two importance measures in the case of a non-linear limit state. Finally, four examples demonstrate that the proposed importance measures can effectively describe the effect of the input variables on the reliability of the structural system, and that the established methods can effectively compute the two importance measures.
This article considers the problem of evaluating the reliability of systems subject to common-cause failures. The existing efficient decomposition and aggregation approach for common-cause failure analysis generates and solves a number of reduced reliability problems separately and then aggregates the results of those problems based on the total probability theorem to obtain the overall system reliability. We propose an enhanced decision diagram-based analysis method, which requires less computer storage and is more efficient than the existing efficient decomposition and aggregation solution, by generating a single compact diagram to model all the reduced reliability problems, which share an isomorphic sub-decision diagram. By using multiple-valued variables to encode common-cause failures, the decision diagram-based method offers a simple and straightforward model evaluation process that automatically implements the results aggregation of the efficient decomposition and aggregation method. The application and advantages of the proposed method are illustrated through detailed analyses of an example computer system subject to common-cause failures.
System signatures provide a powerful framework for reliability assessment for systems consisting of exchangeable components. The use of signatures in nonparametric predictive inference has been presented and leads to lower and upper survival functions for the system failure time, given failure times of tested components. However, deriving the system signature is computationally complex. This article presents how limited information about the signature can be used to derive bounds on such lower and upper survival functions and related inferences. If such bounds are sufficiently decisive they also indicate that more detailed computation of the system signature is not required.
Warm standby sparing is a fault-tolerance technique that attempts to improve system reliability while balancing system energy consumption and recovery time. However, when the imperfect fault coverage effect (an uncovered component fault can propagate and cause the whole system to fail) is considered, the reliability of a warm standby sparing system can decrease with an increasing level of redundancy. This article studies the reliability of a warm standby sparing system subject to imperfect fault coverage, in particular fault-level coverage, where the coverage probability of a component depends on the number of failed components in the system. The suggested approach is combinatorial and based on a generalized binary decision diagram technique. The complexity of the binary decision diagram construction is analyzed, and several case studies are given to illustrate the application of the approach.
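The phenomenon can be reproduced with a minimal Monte Carlo sketch under stated assumptions: any component failure, active or standby, is uncovered with probability 1 - c and then crashes the whole system, otherwise a spare takes over if one remains; exponential lifetimes, rates, coverage and mission time are all illustrative.

```python
# Warm standby with imperfect fault coverage: adding spares can hurt, because
# each standby failure is another chance for an uncovered, system-killing fault.
import numpy as np

rng = np.random.default_rng(4)
lam_a, lam_s, c, t_miss = 1e-3, 2e-4, 0.95, 1000.0  # active/standby rates, coverage

def survives(n_spares):
    t, spares = 0.0, n_spares
    while True:
        rate = lam_a + spares * lam_s        # one active unit + warm spares
        t += rng.exponential(1.0 / rate)     # time to the next component failure
        if t >= t_miss:
            return True
        if rng.random() > c:                 # uncovered failure: system lost
            return False
        if rng.random() < lam_a / rate:      # covered failure of the active unit
            if spares == 0:
                return False
            spares -= 1                      # a spare takes over
        else:
            spares -= 1                      # covered failure of a warm spare

runs = 50_000
for k in range(9):
    rel = np.mean([survives(k) for _ in range(runs)])
    print(f"{k} warm spares: mission reliability = {rel:.4f}")
```

With these illustrative numbers the reliability first rises with redundancy and then flattens and declines, mirroring the effect the article analyses combinatorially.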
This article presents methods of safety and reliability analysis based on systemic-structural activity theory, an alternative psychological framework to cognitive psychology. Systemic-structural activity theory understands human activity during task performance as a structured system of mental and motor actions and operations in which cognition, behavior and motivation are integrated by self-regulation mechanisms toward achieving a conscious goal. Systemic-structural activity theory methods of algorithmic, time-structure and complexity analysis incorporating the use of the MTM-1 (method-time measurement system) system to describe motor actions are demonstrated and discussed using the example of a small-serial production operation. These methods, which generate detailed models of human activity during task performance, are particularly useful at the early stages of the design and development process.
In nuclear power plants, probabilistic risk assessment insights contribute to achieving safe design and operation. In this context, the decision-making process must be robust, and uncertainties must be taken into account and controlled. In general, the uncertainties in a nuclear probabilistic risk assessment context can be categorized as either aleatory or epistemic. The epistemic uncertainty, which can be subdivided into parameter and model uncertainties, is recognized to have an important impact on the actual results of probabilistic risk assessment. Traditionally, epistemic uncertainty analysis in nuclear probabilistic risk assessment relies on the probabilistic approach, in which parameter uncertainty is treated by assigning a probability distribution, e.g. the log-normal one, and model uncertainty can be taken into account through sensitivity studies. Several recent studies have recognized that such an approach has limitations regarding the impact of assigning a probability distribution when operating feedback data are rare. In order to overcome this limitation, in this article we propose a comprehensive approach for uncertainty analysis, from the modeling of parameter and model uncertainties to the final step of the decision-making process, using the Dempster-Shafer theory, which is recognized to be more general than the probabilistic approach. We also show that the traditional probabilistic approach, currently used in probabilistic risk assessment practice, can be fully integrated into this framework. Finally, the proposed framework is illustrated and compared with the traditional approach through a practical example from EDF Nuclear Power Plants probabilistic risk assessment applications. Some discussions and conclusions for industrial probabilistic risk assessment contexts are also given.
Among energy resources, the energy obtained from nuclear power plants is very important for the prosperity of any country. Living probabilistic safety assessment is a growing field that provides a high level of safety for nuclear power plants. Living probabilistic safety assessment comprises different techniques; among them, this article presents a method to update reliability data. This method is based on the Binomial likelihood function and its conjugate beta distribution for the demand failure probability, and the Poisson likelihood function and its conjugate gamma distribution for the operational failure rate. The method uses generic data for the beta and gamma prior distributions, which are updated using the reliability data update method. Reliability data update is a computer-based program used to update nuclear power plant data according to changing conditions. By updating the living probabilistic safety assessment, it is possible to obtain an online risk monitor system that can be helpful in severe accident conditions, as in the Fukushima accident, and make the man–machine system friendly.
This article proposes the use of affine arithmetic as an alternative approach for assessing the effects of uncertainties in transition rates on the steady-state probabilities of each possible state of a system represented by a Markov model. Affine arithmetic is an extension of interval arithmetic that is able to track the dependencies between variables throughout calculations and to provide strict bounds. Several examples illustrate the proposed approach, and results are compared with those of other approaches, such as interval arithmetic, Monte Carlo simulation and the solution of linear systems of simultaneous equations.
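The dependency-tracking property that distinguishes affine arithmetic from plain interval arithmetic can be shown in a few lines. The following is a minimal sketch of an affine form supporting addition and subtraction only; the transition-rate value is a hypothetical example.

```python
class AffineForm:
    """Minimal affine form x0 + sum_i xi*eps_i, with each eps_i in [-1, 1]."""
    _next_id = 0

    def __init__(self, center, radius=0.0):
        self.x0 = center
        self.terms = {}                    # noise-symbol id -> coefficient
        if radius:
            self.terms[AffineForm._next_id] = radius
            AffineForm._next_id += 1

    def __add__(self, other):
        out = AffineForm(self.x0 + other.x0)
        for i in set(self.terms) | set(other.terms):
            out.terms[i] = self.terms.get(i, 0.0) + other.terms.get(i, 0.0)
        return out

    def __sub__(self, other):
        neg = AffineForm(-other.x0)
        neg.terms = {i: -c for i, c in other.terms.items()}
        return self + neg

    def interval(self):
        r = sum(abs(c) for c in self.terms.values())
        return (self.x0 - r, self.x0 + r)

x = AffineForm(0.05, 0.01)    # e.g. a transition rate 0.05 +/- 0.01
print((x - x).interval())     # (0.0, 0.0): the shared noise symbol cancels
# Plain interval arithmetic would return [-0.02, 0.02] for x - x, because
# it forgets that both operands are the same uncertain variable; this is
# the source of the overly wide bounds that affine arithmetic avoids.
```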
The conventional power system is going through a paradigm change towards the smart grid, incorporating technological innovations for sensing, communicating, applying intelligence and exercising control through feedback. Recently, phasor measurement units (PMUs) have been used extensively for the purposes of sensing and communication. In many system planning studies, planners experience conflicts when taking into account the requirements related to communication systems, the environment and geographic configurations, owing to the lack of proper spatial co-ordination. A geographic information system (GIS) provides a rich set of functions to view the power system network and to explore its geospatial relations.
The main contribution of this article is to investigate the impact of topological attributes on the commissioning of phasor measurement units so as to ensure reliability through different phasor measurement unit connectivity configurations.
Comparative studies have been carried out among different phasor measurement unit connectivity configurations to determine the most reliable data exchange methodology. Case studies related to the eastern grid of India corroborate the potential of geographic information systems, taking pragmatic spatial aspects into account, for phasor measurement unit placement.
Currently, a high percentage of accidents in railway systems are attributed to human factors. As a consequence, safety engineers try to take this factor into account in risk assessment. However, human reliability data are very difficult to quantify, so qualitative methods are often used in railway system risk assessments. Modeling human errors through purely probabilistic approaches has shown limitations in quantifying the qualitative aspects of human factors. This article presents an original method to account for the human factor by combining evidential networks and fault tree analysis.
Statistical flowgraph models have proven useful for analysis and modeling of complex systems viewed as multistate processes that lead to outcomes such as degraded operation or failure. This article provides an engineering-oriented introduction to statistical flowgraph models: system representation, setting up a flowgraph model, parameter estimation, solution of the model (using either a frequentist or Bayesian approach), and interpretation of model outputs. The method is illustrated with a model for piping reliability in a nuclear power plant, and compared with alternative solution methods.
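The core flowgraph mechanics can be sketched briefly: each branch carries a transmittance (branch probability times the waiting-time moment generating function), and the overall transmittance between an input and output state follows from Mason's rule. The example below is a toy three-state model with invented rates and branch probability, not the nuclear piping application of the article.

```python
import sympy as sp

s = sp.symbols("s")
lam1, lam2, p = 2.0, 4.0, 0.3   # hypothetical rates and branch probability

# Branch transmittances: branch probability times the waiting-time MGF.
# Exponential holding times give the MGF lam / (lam - s) for s < lam.
T12 = 1.0 * lam1 / (lam1 - s)       # operating -> degraded
T23 = p * lam2 / (lam2 - s)         # degraded -> failed
T21 = (1 - p) * lam2 / (lam2 - s)   # degraded -> repaired (feedback loop)

# Mason's rule for the single forward path 1->2->3 with loop 1->2->1:
T13 = (T12 * T23) / (1 - T12 * T21)

# T13(0) = 1 confirms failure is eventually reached; the mean time to
# failure is the derivative of the overall MGF at s = 0.
mttf = sp.diff(T13, s).subs(s, 0)
print(float(T13.subs(s, 0)), float(mttf))   # 1.0, 2.5
```

The closed-form transmittance is what a frequentist or Bayesian fit then targets: estimate the branch parameters from data, and read waiting-time moments or first-passage distributions off the solved flowgraph.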
Reliability analysis has become an integral part of system design and operation. This is especially true for systems performing critical tasks, such as mass transportation systems, which explains the numerous advances in the field of reliability modeling. More recently, studies involving Bayesian networks have proven relevant for representing complex systems and performing reliability studies. In previous works, a generic methodology was introduced for developing a decision support tool to evaluate complex system maintenance strategies. This article deals with the development of such a decision tool dedicated to the maintenance of Paris metro rails. Indeed, to fulfill high performance levels of safety and availability (the latter being especially critical at peak hours), operators need to estimate, hour by hour, their ability to prevent or detect broken rails. To address this problem, a decision support tool was developed; the aim of this article is to use it to evaluate, compare and optimize various operating and maintenance strategies.
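As a toy illustration of the kind of hour-by-hour figure such a tool automates, the sketch below compares two hypothetical monitoring strategies; all probabilities are invented for illustration and are not the Paris metro values.

```python
# Hypothetical hour-by-hour comparison of monitoring strategies.
p_break = 1e-5                      # P(rail breaks during a given hour), assumed

strategies = {
    # P(detect | break) within the hour, for two candidate strategies
    "track-circuit only":     0.80,
    "track-circuit + patrol": 0.95,
}

for name, p_detect in strategies.items():
    p_undetected = p_break * (1 - p_detect)
    print(f"{name:24s} P(undetected break/h) = {p_undetected:.1e}")
# Ranking such figures against a risk target, hour by hour and line by
# line, is what allows operating and maintenance strategies to be
# compared and optimized.
```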
The aim of this article is twofold: to propose a methodology for probabilistic prognosis, and to examine how the prognosis result impacts the maintenance process. First, the prognosis problem is defined mathematically: it consists in computing the distribution of the remaining useful life of the system conditional on the information available on-line. Taking on-line information into account makes it possible to provide a specific prognosis for each system according to its own history. Second, a global methodology is proposed for the case where the state of the system and its degradations are modeled by a Markov process. The method is essentially a two-step technique: it requires, on the one hand, the computation of the conditional law of the system given the available observations and, on the other hand, the computation of the reliability of the system. Reliability computation techniques are proposed for the case where the Markov process is a piecewise deterministic Markov process. The method is illustrated on an aeronautic example: a pneumatic valve within the bleed air system, used to provide regulated air (pressure, temperature) in the cabin. Finally, the prognosis result is used to support maintenance optimization on an illustrative example, which highlights that the prognosis improves the maintenance decision mainly when the on-line information is accurate enough.
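For the simpler case of a finite-state continuous-time Markov chain (rather than the piecewise deterministic process treated in the article), the conditional remaining-useful-life distribution reduces to a matrix exponential. The sketch below uses an invented three-state degradation model to show how conditioning on the observed state changes the prognosis.

```python
import numpy as np
from scipy.linalg import expm
from scipy.integrate import trapezoid

# Hypothetical 3-state degradation model: 0 = good, 1 = degraded,
# 2 = failed (absorbing). Generator rates are invented for illustration.
Q = np.array([[-0.10,  0.10, 0.00],
              [ 0.00, -0.25, 0.25],
              [ 0.00,  0.00, 0.00]])

def rul_survival(Q, state, t):
    """P(RUL > t | current state) = probability of not having failed by t."""
    P = expm(Q * t)                 # transition matrix over horizon t
    return P[state, :2].sum()       # mass remaining on non-failed states

ts = np.linspace(0.0, 80.0, 801)
for state in (0, 1):
    surv = np.array([rul_survival(Q, state, t) for t in ts])
    mean_rul = trapezoid(surv, ts)  # E[RUL] = integral of the survival fn
    print(f"observed state {state}: mean RUL ~ {mean_rul:.1f} h")
# Conditioning on the observed state (14 h vs 4 h here) is what makes the
# prognosis specific to each system rather than a fleet-wide average.
```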
The modeling of joint probability distributions of correlated variables and the evaluation of reliability under incomplete probability information remain a challenge that has not been studied extensively. This article investigates the effect of the copulas used to model dependence structures between variables on reliability under incomplete probability information. First, a copula-based method is proposed to model the joint probability distributions of multiple correlated variables with given marginal distributions and correlation coefficients. Second, a reliability problem is formulated and a direct integration method for calculating the probability of failure is presented. Finally, an example reliability problem is analyzed to demonstrate the effect of copulas. The joint probability distribution of multiple variables with given marginal distributions and correlation coefficients can be constructed using copulas in a general and flexible way, but the probabilities of failure produced by different copulas can differ considerably, and this difference increases as the probability of failure decreases. The reliability index defined by the mean and standard deviation of a performance function cannot capture the difference in the probabilities of failure produced by different copulas. In addition, the Gaussian copula, often adopted out of expedience without proper validation, produces only one of the various possible values of the probability of failure, and that value may be biased towards the non-conservative side. The tail dependence of copulas has a significant influence on reliability.
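The central point, that two copulas matching the same marginals and the same rank correlation can yield different failure probabilities, can be checked by Monte Carlo. The sketch below uses an invented performance function and marginal distribution; it replaces the article's direct integration with sampling for brevity.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
N, tau = 500_000, 0.5                    # sample size, Kendall's tau
marg = stats.lognorm(s=0.25, scale=5.0)  # common marginal for X1, X2 (assumed)

def failure_prob(u1, u2, threshold=16.0):
    """P(X1 + X2 > threshold) for a hypothetical performance function."""
    x1, x2 = marg.ppf(u1), marg.ppf(u2)
    return np.mean(x1 + x2 > threshold)

# Gaussian copula: rho = sin(pi*tau/2) matches the given Kendall's tau.
rho = np.sin(np.pi * tau / 2)
z = rng.multivariate_normal([0, 0], [[1, rho], [rho, 1]], size=N)
u_gauss = stats.norm.cdf(z)

# Clayton copula with the same tau: theta = 2*tau/(1 - tau), sampled
# by the standard conditional inversion method.
theta = 2 * tau / (1 - tau)
u1 = rng.uniform(size=N)
v = rng.uniform(size=N)
u2 = ((v ** (-theta / (theta + 1)) - 1) * u1 ** (-theta) + 1) ** (-1 / theta)

print("Gaussian copula pf:", failure_prob(u_gauss[:, 0], u_gauss[:, 1]))
print("Clayton copula pf: ", failure_prob(u1, u2))
# Same marginals, same rank correlation, yet the tail behaviour of the
# copula changes the estimated probability of failure.
```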
In this article, a generalized procedure for estimating the probabilistic fatigue life of steel plate railway bridge girders with welded connections, considering plate breathing and a loading spectrum, is presented. The procedure combines a probabilistic S–N curve with Palmgren-Miner's fatigue damage accumulation rule. One of the features of the study is the determination of the effect of the modeling error associated with the S–N curve on the estimated fatigue life of the bridge girders. Expressions for the probability density functions of the number of cycles to failure and of the accumulated fatigue damage are obtained in closed form when modeling error is neglected and by Monte Carlo simulation when it is included. The use of the proposed procedure is illustrated on two railway plate girder bridges designed according to Indian Railway Standards. From the results obtained, it is noted that plate breathing is an important mechanism to be considered when estimating the fatigue life of railway bridges. The results also bring out the importance of carrying out fatigue reliability analysis of the different bridge spans when establishing railway line reliability, as the reliability of the weakest span governs the line reliability.
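The deterministic backbone of such a procedure, an S–N curve combined with Miner's linear damage summation over a loading spectrum, is sketched below. The curve constants and spectrum are illustrative, not the Indian Railway Standards values.

```python
# Hypothetical S-N curve N(S) = A * S**(-m) and loading spectrum; constants
# are illustrative, not taken from any design standard.
A, m = 2.0e12, 3.0

def cycles_to_failure(S):
    """Median cycles to failure at stress range S (MPa) from the S-N curve."""
    return A * S ** (-m)

# Loading spectrum: (stress range in MPa, applied cycles per year)
spectrum = [(60.0, 2.0e5), (80.0, 5.0e4), (100.0, 1.0e4)]

# Palmgren-Miner rule: damage fractions add linearly; failure is expected
# when the accumulated damage D reaches 1.
damage_per_year = sum(n / cycles_to_failure(S) for S, n in spectrum)
print(f"D per year = {damage_per_year:.3e}")
print(f"deterministic fatigue life ~ {1.0 / damage_per_year:.0f} years")
# The probabilistic procedure treats the S-N constant (and hence N(S)) as
# random, so D and the fatigue life acquire probability distributions
# rather than single values.
```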
Non-probabilistic reliability sensitivity analysis for structural systems plays an important role in identifying the key design variables that strongly affect structural reliability. The traditional non-probabilistic model assumes that all interval variables are mutually independent; however, this assumption may not hold in practical engineering. In this article, dependency between interval variables is introduced into the non-probabilistic model through both inequality and equality constraints. A non-probabilistic index model and an optimization method for structural systems with interval variables, whose state of dependence is determined by constraints, are proposed on the basis of the existing non-probabilistic index theory. A linear optimization model serves as an alternative when the nonlinear optimization model cannot find a solution. A non-probabilistic reliability sensitivity analysis model and optimization method for such structural systems are then established based on the finite difference method. The proposed method is demonstrated via several examples.
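For a linear performance function, the constrained interval propagation reduces to two linear programs, and a common non-probabilistic index is the midpoint-to-radius ratio of the resulting response interval. The sketch below uses an invented performance function and dependence constraint, and this index definition is an assumption standing in for the article's specific formulation.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical linear performance function g(x) = 3*x1 - 2*x2 + 6 with
# interval variables x1 in [1, 2], x2 in [2, 4] and a dependence
# (inequality) constraint x1 + x2 <= 4.5.
c = np.array([3.0, -2.0])
bounds = [(1.0, 2.0), (2.0, 4.0)]
A_ub, b_ub = [[1.0, 1.0]], [4.5]

g_min = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds).fun + 6.0
g_max = -linprog(-c, A_ub=A_ub, b_ub=b_ub, bounds=bounds).fun + 6.0

g_c = (g_max + g_min) / 2   # midpoint of the response interval
g_r = (g_max - g_min) / 2   # radius of the response interval
eta = g_c / g_r             # a non-probabilistic reliability index
print(f"g in [{g_min:.2f}, {g_max:.2f}], eta = {eta:.2f}")
# eta > 1 means g stays positive over the whole constrained box. Dropping
# the dependence constraint widens the interval to [1, 8] and lowers eta,
# which is exactly why the state of dependence cannot be ignored.
```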
The stochastic gamma process model is widely used to model a variety of degradation phenomena in engineering structures and components. If degradation in a component population can be measured accurately over time, the statistical estimation of the gamma process parameters is a relatively straightforward task. However, in most practical situations, degradation data are collected through in-service, non-destructive inspection methods, which invariably contaminate the data with random noise (or sizing error). A proper estimation method is therefore needed to filter the effect of sizing error out of the measured degradation data.
This article presents an efficient method for estimating the parameters of the gamma process model, based on a novel use of the Genz transform and the quasi-Monte Carlo method in maximum likelihood estimation. The examples presented show that the proposed method is very efficient compared with the Monte Carlo method currently used for this purpose in the literature.
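The sketch below illustrates the underlying idea of integrating the sizing error out of the likelihood with low-discrepancy points; it uses a plain quasi-Monte Carlo marginalization rather than the article's Genz-transform formulation, and every parameter value and the single-inspection setup are assumptions made for illustration.

```python
import numpy as np
from scipy import stats, optimize
from scipy.stats import qmc

rng = np.random.default_rng(1)

# Simulated data: true degradation X_i ~ Gamma(alpha*t_i, scale=1/beta),
# each unit inspected once, with Gaussian sizing error of known sigma.
alpha_true, beta_true, sigma = 0.8, 2.0, 0.15
t = rng.uniform(5.0, 20.0, size=60)              # inspection times (h)
x = rng.gamma(alpha_true * t, 1.0 / beta_true)   # true (unobserved) degradation
y = x + rng.normal(0.0, sigma, size=x.shape)     # noisy measurements

# Sobol points used to integrate the sizing error out of the likelihood:
# L_i = E_X[ phi(y_i - X) ], with X drawn from the gamma law via its ppf.
u = qmc.Sobol(d=1, seed=1).random(512).ravel()
u = np.clip(u, 1e-9, 1 - 1e-9)

def neg_log_lik(params):
    alpha, beta = np.exp(params)   # log-parametrization keeps both positive
    ll = 0.0
    for ti, yi in zip(t, y):
        xq = stats.gamma.ppf(u, alpha * ti, scale=1.0 / beta)  # QMC nodes
        ll += np.log(np.mean(stats.norm.pdf(yi - xq, scale=sigma)) + 1e-300)
    return -ll

res = optimize.minimize(neg_log_lik, np.log([1.0, 1.0]), method="Nelder-Mead")
print("alpha, beta estimates:", np.exp(res.x))
# Replacing the Sobol points with pseudo-random draws gives the plain Monte
# Carlo likelihood the article compares against; the low-discrepancy points
# achieve the same accuracy with far fewer nodes.
```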