Quantifying the unquantifiable: the outcome of a clinical case must be quantified to make it successful

Today's efforts of system engineers to assure adequate and reliable performance of medical devices (MD) and instrumentation, not to mention the performance of the medical personnel, are, as a rule, unquantifiable. It is argued by this author that the successful outcome of a medical mission, or of a more or less typical clinical extraordinary situation, cannot be expected if it is not quantified, and that, because of the various inevitable intervening uncertainties, this quantification should be done on a probabilistic basis. Nothing and nobody is perfect. In effect, the difference between highly reliable and insufficiently reliable medical equipment/instrumentation, or between the performance of a highly qualified medical doctor and that of a mediocre physician, is "merely" the difference in the level of their, actually never-zero, probability of failure. It is important, therefore, that such a probability is assessed in advance and made adequate for the medical equipment and clinical tasks of importance. This probability cannot be high, of course, but, as far as medical instrumentation or devices are concerned, it does not have to be lower than necessary either: it has to be predicted and made adequate for a particular medical product and application. Devices that "never fail" are most likely "over-engineered," i.e., are more robust than they could and should be, and, because of that, could be more costly than necessary.

often exhibit, nonetheless, premature field failures. Are these methodologies and practices, and particularly the accelerated test procedures, adequate [6]?
• Do electronic industries need new approaches to qualify their products, and if they do, what should be done differently [7]?
• Could the existing practices be improved to an extent that, if the product passed the reliability tests, there is a way to assure that it will perform satisfactorily in the field [8]?
• Could the operational (field) reliability of an electronic product, and particularly in medical electronics, where high reliability of devices is especially critical [9,10], be assured if it is not predicted, i.e., not quantified [11,12]?
• And if such quantification is found to be necessary, could that be done on a deterministic, i.e., non-probabilistic, basis [13]?
• Should MD manufacturers keep shooting for an unpredictable and, perhaps, unachievable very long product lifetime, such as, say, twenty years or so? Or, considering that every five years a new generation of devices, including MDs, is developed and appears on the market, and that such long-time predictions are rather shaky, to say the least, should the manufacturers settle for a shorter, but well-substantiated, predictable, trustworthy, physically and economically feasible and, to the extent possible, assured lifetime, with an adequate anticipated probability of failure [14,15]?
• And how should such a lifetime be related to the acceptable (adequate and, if appropriate, even specified) probability of failure for a particular product and application [16,17]?
• Since understanding the reliability physics underlying possible electronic material and device failures is critical, and so is accelerated testing in making a viable electron device into a reliable product, is there an alternative to, or at least a suitable modification of, the currently widely used highly accelerated life testing (HALT), a "black box" that supposedly improves reliability, but does not quantify it, even on a deterministic basis [18]?
• Is a highly focused and highly cost-effective failure-oriented accelerated test (FOAT) the right accelerated life test [7,18-20] and the right extension and modification of HALT [21]?
• Considering that the principle of superposition does not work in reliability engineering, how should one establish the list of the crucial accelerated tests and the adequate, i.e., physically meaningful, stressors, their combinations and levels?
• The Boltzmann-Arrhenius-Zhurkov (BAZ) equation [22-29] was recently suggested as a suitable analytical model that could be used to bridge the gap between what one observes as the experimental FOAT data and what will most likely happen in actual operating conditions for the device of interest. What are the merits and the shortcomings of this kinetic model?
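The role of the BAZ equation as a test-to-field bridge can be sketched numerically. The minimal Python illustration below assumes the commonly cited BAZ form for the mean time to failure, tau = tau0 * exp((U0 - gamma*sigma)/(kT)), in which the stressor sigma lowers the effective activation energy U0; all parameter values here are hypothetical and serve only to show how a FOAT-fitted model extrapolates from accelerated-test to field conditions.

```python
import math

K_B = 8.617333262e-5  # Boltzmann constant, eV/K

def baz_mttf(tau0_s, u0_ev, gamma, sigma, temp_k):
    """Mean time to failure per the BAZ equation:
    tau = tau0 * exp((U0 - gamma*sigma) / (k*T)).
    The stressor sigma (e.g., mechanical stress) effectively
    lowers the activation energy U0. All inputs are hypothetical."""
    return tau0_s * math.exp((u0_ev - gamma * sigma) / (K_B * temp_k))

# Hypothetical example: extrapolate from an accelerated test at 398 K
# to field conditions at 333 K, at the same stress level.
mttf_field = baz_mttf(1e-4, 0.7, 1e-4, 100.0, 333.0)
mttf_test = baz_mttf(1e-4, 0.7, 1e-4, 100.0, 398.0)
acceleration_factor = mttf_field / mttf_test
```

The same exercise, run with FOAT-derived values of tau0, U0 and gamma, is what connects the observed test data with the most likely field lifetime.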
• The best engineering product is, as is known, the best compromise between the requirements for its reliability, measurable cost-effectiveness and, also measurable, shortest-possible time-to-market [30]; but what about reliability and its appropriate quantifying characteristics? It goes without saying that, to make any optimization possible, the reliability of such a product should also be quantified, but what is the simplest and most trustworthy way of doing that?
• The bathtub curve [31], the experimental "reliability passport" of a mass-fabricated product, reflects the inputs of two critical irreversible processes: the statistics-of-failure process, which results in a reduced failure rate with time (this is particularly evident from the infant-mortality portion of the curve), and the physics-of-failure (aging, degradation) process, which leads to an increased failure rate with time (this trend is explicitly exhibited by the wear-out portion of the bathtub diagram). Could these two critical processes be separated [32]? The need for that is due to the obvious incentive to minimize the role and the rate of aging, and this incentive is especially significant for products like lasers, solder-joint interconnections and others, which are characterized by long wear-out portions, and when it is economically infeasible to restrict the product's lifetime to the steady-state situation, in which the two irreversible processes in question compensate each other.
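To make the separation question concrete, here is a minimal Python sketch of a bathtub-type failure rate, modeled (as an assumption for illustration, not a fitted model) as a constant steady-state rate plus a decreasing statistics-of-failure term and an increasing physics-of-failure (aging) term; all numbers are hypothetical.

```python
import math

def bathtub_rate(t_hours, lam_steady=1e-6, lam_infant=1e-5,
                 t_infant=500.0, lam_wear=1e-6, t_wear=5e4, m=3.0):
    """Illustrative bathtub failure rate (failures/hour):
    a decreasing infant-mortality (statistics-of-failure) term,
    an increasing wear-out (physics-of-failure) term, and a
    constant steady-state rate where the two roughly compensate.
    All default parameter values are hypothetical."""
    infant = lam_infant * math.exp(-t_hours / t_infant)  # statistics of failure
    wear_out = lam_wear * (t_hours / t_wear) ** m        # physics of failure (aging)
    return lam_steady + infant + wear_out
```

Because the two contributions enter additively, each can, in principle, be estimated separately: the decreasing term from early-life failure statistics and the increasing term from degradation (aging) measurements.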
• A related question has to do with the fact that real-time degradation is a very slow process. Could physically meaningful and cost-effective methodologies for measuring and predicting the degradation (aging) rates and consequences be developed? Could the BAZ model [33] be applied to provide the quantitative assessment here?
• What is the possible role of analytical ("mathematical") predictive modeling [34]? If the predictions based on computer simulations and on analytical modeling (these two modeling approaches are based, as is known, on different assumptions and use different calculation methods and techniques) agree, then there is a good reason to believe that the obtained data are sufficiently accurate and trustworthy [35].
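As a toy illustration of such a cross-check (the model and numbers here are arbitrary, chosen only because P(X < Y) for independent normal variables has a simple closed form), one can compare an analytical prediction against a simulation of the same quantity:

```python
import math
import random
from statistics import NormalDist

def cross_check(n=100_000, seed=1):
    """Compare the analytical result Phi((mu_y - mu_x)/sqrt(sx^2 + sy^2))
    for P(X < Y), with X ~ N(mu_x, sx) and Y ~ N(mu_y, sy), against a
    Monte Carlo estimate of the same probability. Parameters are arbitrary."""
    mu_x, sx, mu_y, sy = 10.0, 2.0, 13.0, 1.5
    analytic = NormalDist().cdf((mu_y - mu_x) / math.hypot(sx, sy))
    rng = random.Random(seed)
    hits = sum(rng.normalvariate(mu_x, sx) < rng.normalvariate(mu_y, sy)
               for _ in range(n))
    return analytic, hits / n
```

Close agreement between the two independently obtained numbers is exactly the kind of evidence of trustworthiness this bullet refers to.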
• And how could/should the above questions be answered, and the approaches taken be modified and extended, in application to outer-space engineering [36,37]?

HF and its role
Human error (HE) (see, e.g., [38,39]) affects, to a greater or lesser extent, all aspects of human activity. The ability to understand the nature of various critical HEs and to minimize their likelihood is therefore essential. The long-term HCF should always be considered vs. the elevated short-term MWL that the human has to cope with to successfully complete a critical task or withstand an off-normal (emergency) situation. It is argued that both the traditional cognitive/mental workload (MWL) [40-50] and the human capacity factor (HCF) should be considered when quantifying the most likely outcome of a HITL-related mission, medical case, or extraordinary situation. The famous 2009 US Airways "miracle-on-the-Hudson" successful ditching [73] and the infamous 1998 Swissair "UN-shuttle" disaster [73] are good illustrations of this statement. The input data in the publication [73] are hypothetical, but realistic, and it is the approach, not the numbers, that is, in the author's opinion, the major merit of the analysis; it has attracted quite a number of references in the ergonomics literature. As the co-inventor of the calculus, the great mathematician Gottfried Leibniz, put it, "there are things in this world, far more important than the most splendid discoveries; it is the methods by which they were made." It has been shown, particularly, that it was the exceptionally high HCF of Captain Sullenberger ("Sully") and his crew that made a reality of what seemed, at first glance, to be a "miracle." The highly professional and, in general, highly qualified Swissair crew exhibited inadequate performance (quantified in our analysis as a relatively low HCF level) in the much less challenging off-normal situation they encountered. The Swissair crew made several serious HEs and, as a result, crashed the aircraft.
In addition to the application of the suggested new double-exponential probability distribution function (DEPDF) based approach [62], it has been shown, using the well-known convolution approach of applied probability [2], that the probability of a safe landing/ditching can be evaluated by comparing the (random) operation time (which consists of the decision-making time and the actual landing/ditching time) with the "available," also random, of course, time within which the landing must be completed.
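The comparison just described can be sketched as follows. The distributions and parameters below are hypothetical placeholders (the actual input data in [73] differ), and the convolution of the decision-making and landing times is handled implicitly by sampling the two terms and summing them:

```python
import math
import random

def prob_safe_landing(n=200_000, seed=42):
    """Monte Carlo estimate of P(T_decision + T_landing < T_available):
    the (random) operation time is the sum of the decision-making time
    and the actual landing/ditching time, and it must not exceed the
    (also random) available time. All distributions and parameter
    values are hypothetical placeholders."""
    rng = random.Random(seed)
    safe = 0
    for _ in range(n):
        t_decision = rng.lognormvariate(math.log(15.0), 0.40)   # seconds
        t_landing = rng.lognormvariate(math.log(120.0), 0.25)   # seconds
        t_available = rng.normalvariate(180.0, 20.0)            # seconds
        if t_decision + t_landing < t_available:
            safe += 1
    return safe / n
```

With trustworthy input distributions (obtained, e.g., from flight-simulator data), the same comparison yields the sought probability of a safe outcome.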
A similar approach can be used when evaluating, say, the outcome of a surgery, and such an effort is considered by the author at present as future work. The developed formalisms, after trustworthy input data are obtained (using, e.g., flight simulators [70] and/or by applying the Delphi method (see, e.g., [2])), might be applicable even beyond the vehicular or medical domains and can be employed in various HITL situations in which a long-term high HCF is imperative and the ability to quantify it in comparison with the anticipated short-term MWL is desirable.
It has been suggested that the MWL vs. HCF comparison always be considered as a suitable way to quantify human performance.
In the simplest case, such a failure should be attributed to an insufficient HCF, i.e., to the situation when a human has to cope with a relatively high MWL without possessing the qualities that would enable him/her to do so. Note that adequate trust, which is briefly addressed below, is often also an important HCF constituent.
It is noteworthy that the ability to evaluate the "absolute" level of the MWL, important as it might be for numerous existing non-comparative evaluations, is less critical in our MWL vs. HCF approach: it is the relative levels of the MWL and the HCF, and the comparative assessments and evaluations of their levels and likelihoods, that are important. The author does not intend, of course, to come up with an accurate, complete, ready-to-go, "off-the-shelf" type of MWL vs. HCF methodology, in which, as they say, all the i's are dotted and the t's are crossed, but rather intends to show how the powerful and flexible PPM methods and techniques could be effectively employed to quantify the role of the HF by comparing, on a probabilistic basis, the actual and/or possible MWL and the available or required HCF levels, so that an adequate, sufficient and quantified, preferably probabilistic, safety factor is assured. Note that testing on a flight simulator [67] and possible accelerated/preliminary testing in health care are analogous to HALT and FOAT in electronics reliability engineering, including medical electronics.
Here is how the major principles ("the ten commandments") of our HCF vs. MWL approach could be summarized and formulated:
1. HCF is viewed as an appropriate quantitative measure (not necessarily and not always probabilistic) of the human ability to cope with an anticipated elevated short-term MWL;
2. It is the relative levels of the MWL and HCF (whether deterministic or random) that determine the probability of human non-failure in a particular HITL situation;
3. Such a probability cannot be low, but need not be higher than necessary either: it has to be adequate for a particular anticipated application and situation;
4. When adequate human performance is imperative, the ability to quantify it is highly desirable, and is even a must, especially if one intends to optimize and assure adequate human performance;
5. One cannot assure such a performance by just conducting today's routine human-psychology-based efforts (which might provide appreciable improvements, but do not quantify human behavior and performance; in addition, these efforts might be unnecessarily costly), and/or by just following the existing "best practices" that are not aimed at a particular situation or application; the events of interest are certainly rare events, and "best practices" might not be applicable;
6. MWLs and HCFs should consider, to the extent possible, the most likely anticipated situations; obviously, the MWLs are, and the HCFs should be, different for a jet fighter pilot, for a pilot of a commercial aircraft, or for a helicopter pilot, as well as for different healthcare-related cases, performers and situations, and therefore should be approached, assessed and, if necessary and appropriate, even specified differently;
7. PPM is an effective means for improving the state of the art in the HITL field: nobody and nothing is perfect, and the difference between a failed human performance and a successful one is "merely" in the level of the probability of non-failure; this statement is true for practically any field of human activity;
8. FOAT on a flight simulator is viewed as an important constituent part of the PPM concept in various HITL and aerospace-engineering-related situations, but could and, perhaps, even should be considered and conducted for many healthcare-related endeavors; such accelerated testing will certainly improve our understanding of the factors underlying possible failures; this effort might be complemented by the Delphi (experts' opinion) effort;
9. Extensive predictive modeling (PM), and especially PPM, is another important constituent of the approach and, in combination with the highly focused and highly cost-effective FOAT, is a powerful and effective means to quantify and perhaps nearly eliminate human failures in a number of critical missions and off-normal/extraordinary situations;
10. Consistent, comprehensive and psychologically meaningful PPM assessments can lead to the most feasible HITL qualification (certification) methodologies, practices and specifications.
Our HCF vs. MWL approach considers elevated (off-normal) random relative HCF and MWL levels with respect to the ordinary (normal, pre-established) deterministic HCF and MWL values. These values could and should be established on the basis of the existing human-psychology practices.
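The relative-levels idea can be sketched with a double-exponential function of the DEPDF type. The particular functional form below is an assumption made for illustration only, not the published DEPDF: its only purpose is to show a probability of human non-failure that decreases with the relative MWL and is restored toward its baseline as the relative HCF grows.

```python
import math

def p_nonfailure(mwl_ratio, hcf_ratio, p0=1.0):
    """Illustrative double-exponential probability of human non-failure.
    mwl_ratio = MWL / (normal MWL), hcf_ratio = HCF / (normal HCF).
    This functional form is a hypothetical sketch of the DEPDF idea:
    P decreases with MWL and approaches the baseline p0 as HCF grows."""
    return p0 * math.exp(-mwl_ratio ** 2 * math.exp(-hcf_ratio ** 2))
```

Whatever the exact published form, the qualitative point stands: it is the relative MWL and HCF levels, not their absolute values, that set the probability of non-failure.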
As has been indicated, adequate trust [73-76] is an important HCF constituent. It is shown particularly [76], using the DEPDF-based approach, that the entropy of this distribution, when applied to the trustee (a human, a technology, a methodology or a concept), can be viewed as an appropriate quantitative characteristic of the propensity of a decision-maker to an under-trust or an over-trust judgment and, as a consequence, of the likelihood of making a mistake or an erroneous decision. From Shakespeare's "love all, trust a few" and "don't trust the person who has broken faith once" to today's Lady Gaga's "trust is like a mirror, you can fix it if it's broken, but you can still see the crack in that mother f*cker's reflection," the importance of human-human trust has been addressed by numerous writers, politicians and psychologists. It was the 19th-century South Dakota politician Frank Crane who seems to have been the first to indicate the importance of an adequate trust in human relationships: "You may be deceived if you trust too much, but you will live in torment unless you trust enough."
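The entropy argument can be made concrete numerically. The sketch below computes the differential entropy H = -∫ f(x) ln f(x) dx of a trust-level density (any density supplied as a callable, not the published DEPDF) by the trapezoidal rule; a broad, high-entropy density would indicate an undecided, and hence error-prone, truster.

```python
import math

def differential_entropy(pdf, lo, hi, n=10_000):
    """Differential entropy H = -integral of f(x)*ln f(x) dx over [lo, hi],
    computed with the trapezoidal rule for a density given as a callable."""
    dx = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        f = pdf(lo + i * dx)
        if f > 0.0:
            weight = 0.5 if i in (0, n) else 1.0
            total += weight * (-f * math.log(f))
    return total * dx

# Sanity check against a known closed form: the uniform density on
# [0, 2] has differential entropy ln(2).
h_uniform = differential_entropy(lambda x: 0.5, 0.0, 2.0)
```

The routine is generic: substituting the fitted trust-level density for the trustee of interest yields the entropy-based under-/over-trust characteristic discussed above.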
The analysis in [76] is, in a way, an extension and a generalization of the recent Kaindl and Svetinovic publication [75], and addresses some important aspects of the HITL problem for safety-critical missions and extraordinary situations. It is argued that the role and significance of trust can and should be quantified when preparing such missions, including healthcare-related (such as, e.g., surgical) missions. The author is convinced that otherwise the concept of an adequate trust simply cannot be effectively addressed and included in an engineering or a medical technology, design methodology or human activity, when there is a need to assure a successful and safe outcome of a particular engineering or medical effort or an aerospace or military mission.
It has been shown, particularly [76], that the calculated entropy of the DEPDF for the random HCF, when applied to the trustee, can be viewed as an appropriate quantitative characteristic of the propensity of a human to an undesirable under-trust or over-trust. Captain Sullenberger, the above-mentioned hero of the miracle-on-the-Hudson event, did possess such a quality. He "avoided over-trust": 1) in the ability of the first officer, who was flying the aircraft when it took off from LaGuardia airport, to successfully cope with the situation when the aircraft struck a flock of Canada geese and lost engine power: he took over the controls, while the first officer began going through the emergency-procedures checklist in an attempt to find information on how to restart the engines; and 2) in the possibility, suggested by the air traffic controllers, of returning to LaGuardia or diverting to Teterboro. He also "avoided under-trust" (as FDR put it, "the only thing we have to fear is fear itself"): 1) in his own skills, abilities and extensive experience that would enable him to successfully cope with the situation (the 57-year-old Capt. "Sully" was a former fighter pilot, a safety expert, an instructor and a glider pilot); that was the rare case when "team work" was not the right thing to pursue; 2) in the aircraft structure, which would be able to successfully withstand the slam of the water during ditching and, in addition, would enable slow enough flooding after ditching (it turned out that the crew did not activate the "ditch switch" during the incident, but Capt. Sully later noted that it probably would not have been effective anyway, since the water impact tore holes in the plane's fuselage much larger than the openings sealed by the switch); 3) in the aircraft safety equipment that was carried in excess of that mandated for the flight; 4) in the outstanding cooperation and excellent cockpit resource management among the flight crew, who trusted their captain and exhibited outstanding team work (that is where such work was needed and was useful) during the landing and the rescue operation; 5) in the fast response from, and effective help of, the various ferry operators located near the USS Intrepid museum and the ability of the rescue team to provide timely and effective help; and 6) in the good visibility as an important contributing factor to the success of his effort. As is known, the crew was later awarded the Master's Medal of the Guild of Air Pilots and Air Navigators for the successful "emergency ditching and evacuation, with the loss of no lives… a heroic and unique aviation achievement… the most successful ditching in aviation history."

Future work
We would like to suggest several possible next steps (future work) that could be conducted using, when necessary, simulators to correlate the accepted DEPDF with the existing practice and to make this distribution applicable for the evaluation of the roles of the MWL and HCF not only in the general field of ergonomics science [77] and in various HITL-related navigation situations, including avionic [78], automotive driving [79,80], railway obstruction [81] and even outer-space-related missions [82-85], but also in connection with various medical electronic devices and critical healthcare tasks, missions and problems. These areas have a lot in common, as well as, of course, numerous differences and quite a few critical specifics.