Cite this as: Wu J, Zha P (2022) Clinical trials cannot provide sufficient accuracy for studying weak factors necessary for curing chronic diseases. Glob J Cancer Ther 8(1): 021-033. DOI: 10.17352/2581-5407.000044
Copyright License: © 2022 Wu J, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Chronic diseases are still known as incurable diseases, and we suspect that the medical research model is unfit for characterizing chronic diseases. In this study, we examined accuracy and reliability required for characterizing chronic diseases, reviewed implied presumptions in clinical trials and assumptions used in statistical analysis, examined sources of variances normally encountered in clinical trials, and conducted numeric simulations by using hypothetical data for several theoretical and hypothetical models. We found that the sources of variances attributable to personal differences in clinical trials can distort hypothesis test outcomes, that clinical trials introduce too many errors and too many inaccuracies that tend to hide weak and slow-delivering effects of treatments, and that the means of treatments used in statistical analysis have little or no relevance to specific patients. We further found that a large number of uncontrolled co-causal or interfering factors normally seen in human beings can greatly enlarge the means and the variances or experimental errors, and the use of high rejection criteria (e.g., small p values) further raises the chances of failing to find treatment effects. As a whole, we concluded that the research model using clinical trials is wrong on multiple grounds under any of our realistic theoretical and hypothetical models, and that misuse of statistical analysis is most probably responsible for failure to identify treatment effects for chronic diseases and failure to detect harmful effects of toxic substances in the environment. We proposed alternative experimental models involving the use of single-person or mini optimization trials for studying low-risk weak treatments.
Medicine started emerging as modern medicine after the Industrial Revolution in the 18th century. Over the last 150 years, medicine has accomplished some astonishing achievements. Most achievements are in the areas of treating acute conditions such as bodily injuries, infections, poisoning, pain, and trauma. In each of those cases, drugs are not used to restore impaired or lost balance in the body. Despite the success in treating acute diseases, medicine has failed to find cures for chronic diseases. The main evidence for its failure includes: (1) nearly half of all adult Americans suffer from at least one chronic disease, which is equivalent to approximately 45% of the population, or 133 million people; (2) nearly all chronic diseases are officially listed as incurable diseases in medical references, a long list of chronic diseases is still without a cure, and many types of cancer are considered incurable and terminal; (3) in 2009, 7 out of 10 deaths in the U.S. could be attributed to chronic diseases, and heart disease, cancer, and stroke account for more than half of all deaths each year. We estimated that the total number of premature annual deaths attributed to chronic diseases is about 30 million in the world based on total death data [1,2].
The failure to find cures is best reflected in cancer. A systematic review concluded that the complete response rates of chemotherapy for later-stage cancer have remained static at about 7.4%. The response rate of thyroid cancer treatment was 22.1%-27.1%, with complete response rates of 2.5%-2.8%. A recent study examined the most promising cancer treatment methods and concluded: “The claimed ‘targeted’ therapies that may or may not extend remission of cancer for a few months should not be accepted any longer as ‘cure’ by oncologists, scientists or patients”. The prevalence of chronic diseases in the U.S. has become a huge burden on the U.S. economy. In a study done by the Milken Institute, the annual economic impact on the U.S. economy of the most common chronic diseases was calculated to be more than $1 trillion, which could balloon to $5.7 trillion by 2050.
We see that medicine advances on two distinct tracks. It achieves good results in the treatment of acute diseases but fails to find cures for chronic diseases. The clear dividing line between the two kinds of diseases seems to indicate that the performance difference is related to the medical research and practice models. In this article, we explore whether the population-based model, such as clinical trials, has inherent limitations that prevent medical researchers from finding cures.
Our purpose is to examine the performance of clinical trials and statistical methods in the context of characterizing chronic diseases.
We suspect that human individuals introduce very large variances to any measured health properties so that clinical trials are unfit for studying chronic diseases. To prove it, we will use the following model assumptions:
Treatment that affects a trial outcome: $s_1 \sim N(\mu_1, \sigma_1^2)$
True error: $\varepsilon \sim N(0, \sigma_E^2)$
Other causal or interfering factors: $s_2, s_3, \ldots, s_k$, with, for example, $s_2 \sim N(\mu_2, \sigma_2^2)$
s2, s3,…, sk represent anything that could influence measured health properties relevant to the trial outcomes. They may be substantial cause factors, independent causal factors, indirect causal factors, covariates (independent factors or confounding factors), etc. The only qualification criterion is that they can affect the intended treatment so that they must be considered in practice.
In a clinical trial, the effect of treatment s1 must be much larger than the total combined effects of ε and all of s2, s3,…, sk so that s2, s3,…, sk can be ignored or treated as part of the error ε for convenience.
The model assumption in our study is that s1 is close to ε and also close to one or all of s2, s3,…, sk. For example, in a clinical trial to study a cancer treatment, the trial outcome may be judged by observing patients’ average survival times. A large number of factors shown in Table 1 are known to affect patients’ survival times.
Those factors may be traced to genotypes, lifestyle, diets, physical activities and exercise, toxic compound levels in the body, viral infections, gut microbiota, other health problems, etc. In this study, it is further assumed that they affect patients’ survival times randomly. Each of such factors may appear in some patients but not in other patients.
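The model assumptions above can be sketched in a few lines of Python. This is only an illustrative sketch: the additive combination of effects, the function name `simulate_outcome`, and every numeric value are assumptions made here and are not taken from the study.

```python
import random
import statistics

def simulate_outcome(rng, treatment, factors, error_sd, treated=True):
    """Draw one patient's measured health property.

    Assumption (not from the study): effects combine additively.
    treatment: (mean, sd) of s1; factors: list of (mean, sd) for s2..sk.
    """
    value = rng.gauss(0.0, error_sd)          # true error, N(0, sigma_E^2)
    for mean, sd in factors:                  # uncontrolled factors s2..sk
        value += rng.gauss(mean, sd)
    if treated:
        value += rng.gauss(*treatment)        # treatment s1, N(mu_1, sigma_1^2)
    return value

rng = random.Random(0)
# Hypothetical values: the treatment is no stronger than each interfering factor.
treatment = (1.0, 1.0)
factors = [(1.0, 1.0), (1.0, 1.0), (1.0, 1.0)]
treated_arm = [simulate_outcome(rng, treatment, factors, 1.0) for _ in range(5000)]
control_arm = [simulate_outcome(rng, treatment, factors, 1.0, treated=False)
               for _ in range(5000)]
print(statistics.mean(treated_arm) - statistics.mean(control_arm))  # near mu_1
print(statistics.stdev(control_arm))  # spread of the apparent error
```

With these hypothetical numbers the spread of the control arm is about twice the true error, because the three uncontrolled factors are absorbed into it.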
Our question is whether a randomized controlled trial can accurately detect the effects of s1 and what could be done to increase the chance of actually detecting a treatment effect that is similar to or weaker than other causal and interfering factors. To answer this question, we used a randomized controlled trial model and a mini optimization trial model to evaluate their respective performance. The basic designs of the two types of trials are shown in Tables 2 and 3.
Model A: A Randomized Controlled Trial is shown in the table below.
The human subjects are allocated to the two arms randomly. The table shows only one potential way of allocation solely for illustration purposes. The effects of s1 on health properties are close to or even smaller than those of any of the 3 interfering factors s2, s3, s4. For example, s2 may be exercise, s3 a dietary adjustment, and s4 stress management. They affect patients’ survival times just as chemotherapy or another treatment (s1) does.
Model B: An Optimization Trial is used where all of the causal and interfering factors s1, s2, s3, s4 are used as one single treatment package for the purpose of raising total treatment effects.
In this optimization design, all other non-treatment causal and interfering factors (s5, s6,…, sk) must be sufficiently small and thus can be bundled into the error term. We call this design an optimization trial because as many important factors as possible are used as the treatment to deliver the maximum treatment effects. Here, all important causal factors and interfering factors (s1, s2, s3, s4) that can be identified and used are bundled as one treatment package.
We then evaluate how the optimization trial increases the chance to determine the true effects of the treatment package and how to increase the chance of finding cures for chronic diseases.
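The contrast between Model A and Model B can be illustrated with a small simulation. The sketch below is hypothetical: it assumes additive effects, gives every factor the same N(1, 1) effect, assumes the Model B control arm receives none of the packaged factors, and uses a Welch t statistic as the test statistic. None of these choices is taken from the study.

```python
import math
import random
import statistics

def welch_t(a, b):
    """Welch t statistic for the difference between two sample means."""
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / se

def draw(rng, n_factors):
    """Sum of n_factors effects, each N(1, 1), plus a true error N(0, 1)."""
    return sum(rng.gauss(1.0, 1.0) for _ in range(n_factors)) + rng.gauss(0.0, 1.0)

rng = random.Random(1)
n = 200  # hypothetical number of subjects per arm

# Model A: s2..s4 act on both arms uncontrolled; only s1 differs.
t_model_a = welch_t([draw(rng, 4) for _ in range(n)],
                    [draw(rng, 3) for _ in range(n)])

# Model B: s1..s4 form the treatment package; the control arm receives none.
t_model_b = welch_t([draw(rng, 4) for _ in range(n)],
                    [draw(rng, 0) for _ in range(n)])

print(t_model_a, t_model_b)
```

Under these assumptions the package in Model B both widens the gap between the arm means and removes the bundled factors from the control arm's spread, so its statistic comes out much larger than that of Model A.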
Our focus is on how to determine the true treatment effect when the treatment is influenced by one or more interfering factors. Our initial focus is the accuracy and reliability required to detect the true effect of the treatment.
The history of clinical trial development reveals that most early clinical trials were used to investigate malnutrition, infections, and wounds (except rheumatism). No effort has been made in that history to understand the inherent limitations of clinical trials. We also note that the functional approach used for machines is inherently incompatible with the population approach (C, Sup.). The population approach cannot be used in diagnosing and repairing machines built from different blueprints. A population approach may be used to study the properties of only “nearly identical units.” Differences, if any, must not cause any functional imbalance, structural misfits, fuel imbalance, flow imbalance, heat imbalance, etc. The population approach has not been used to address mechanical problems.
Whether a health problem can be studied by clinical trials depends on the purpose of the study. A threshold requirement is that the treatment’s effect on the health property is sufficiently larger than the experimental error. This requirement can be satisfied in studies of strong treatments such as painkillers, surgery, antibiotic drugs, sedative drugs, etc. In those cases, differences among persons will not significantly alter results.
“Chronic diseases are defined broadly as conditions that last 1 year or more and require ongoing medical attention or limit activities of daily living or both”. We show that the level of balance required in a human body is much higher than the degree of match between parts in a machine. Health problems can arise from small biochemical imbalances, which result in small changes in the structure, shape, and capacity of body components (A, Sup.). As shown in those examples, the deviations in biochemical and cellular processes that cause chronic diseases are “infinitesimally small.” Net departures from ideal numbers are often in the range of a tenth of a percent to a few percent of the ideal personal number. Most net conversion rates must have the right values, and small departures from ideal numbers in either direction can be the cause of chronic diseases.
The population approach is extended to all areas of medicine, but one problem that has never been studied is personal variation. Two big sources are different genotypes and phenotypes [7,8]. The chance of a match between two unrelated persons is like that of a DNA match (1 in 113 billion based on 9 loci; 1 in 400 trillion based on 13 loci). In addition, personal differences further arise from different emotional states. The personal differences that are important to health may alternatively be expressed in terms of diet, lifestyle, emotional state, culture, environment, sex, medication history, etc. Personal differences are reflected in reference ranges of laboratory tests for human beings, which are established by empirical methods. The reference ranges reflect measured variances in any health properties in a population, depending on personal differences in genotypes, phenotypes, daily fluctuations, and measurement error. It could be infinitely large. Each of the health properties of a person may fall at a distinctive position within the corresponding population’s reference range. No person would have all of his or her health properties match the population’s means. Differences between two persons can be inferred from differences in body size/shape, organ size/shape, structural strengths, skin colors, physical capacities, emotional conditions, etc. Differences are also reflected in diagnostic data, image data, health conditions, disease histories, etc.
Reference ranges of more than five hundred health properties are published [Access Medicine]. The measured value of each health property for any person will fall at a distinctive point within the range shown in Table S1 (D, Sup.). If each reference range is divided into N levels that can be resolved by the detection resolution, the total number of variants across all health properties would be the product of the numbers of levels of all reference ranges. The list of recognized chemicals in the reference range table is not complete. All departures of a person’s measurements of health properties from the population’s means are necessary to correct a genetic weakness or to maintain the phenotype and thus are presumed to be important in maintaining health and preventing diseases.
Personal differences must be considered in treating chronic diseases. First, when persons are sufficiently different, they cannot be treated as the same or similar units in a clinical trial because their differences can interfere with the measured health properties in the trial. Second, the values of health properties cannot be used as parameters for predicting chronic diseases. Such health properties cannot be correlated with conversion rates of metabolites or with the net or final size of tissue structures. Conversion rates of metabolites normally depend on multiple parameters. Health properties may fluctuate on a daily, weekly, monthly, or yearly basis between their lowest and highest values. Chronic diseases may arise when health properties in a person depart from optimal values for a sufficient duration. Cures for such diseases would require correcting such small departures. Finally, personal differences, which are reflected in the health properties shown in Table S1, affect both the disease process and the healing process. Personal numbers such as vitamin levels, heavy metal levels, HDL, LDL, cholesterol, platelet count, red blood cell count, white blood cell count, fatty acid levels, glucose levels, triglycerides, etc. can be altered by changing a large number of lifestyle factors.
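The size of this product can be made concrete with a short calculation. The sketch below is illustrative only: the function name `count_variants` and the choice of 10 resolvable levels for each of 500 properties are assumptions, not figures from the study.

```python
import math

def count_variants(levels_per_property):
    """Number of distinct profiles: product of resolvable levels per property."""
    return math.prod(levels_per_property)

# Hypothetical: 500 health properties, each resolved into 10 levels.
profiles = count_variants([10] * 500)
print(len(str(profiles)) - 1)  # order of magnitude as a power of ten
```

Even at this coarse resolution the number of possible profiles is far larger than any human population, which is the sense in which no two persons can be expected to match.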
From conducting a review of clinical trial development history, we found that none of the old studies we could find discussed personal differences, interfering factors, and their effects on a weak treatment effect (B, Sup.). In a traditional clinical trial, the treatment effect is much stronger than the experimental error so interfering factors will not alter the trial outcome (Figure 1). Absolute reference in our figures is an imagined health property that could be measured when the treatment is not applied. Since a chronic disease is caused by lost balances, the absolute references define the disease state when those deviated balances such as excessive omega 6/3 fatty acid ratio, excessive heavy metal levels, physical inactivity, abnormal gut microbiota, lack of dietary fibers, abnormal emotional state, etc. are not corrected.
An absolute reference exists in a patient but cannot be applied to a population. It may be applied to a small number of “sufficiently similar patients” only if the research focus is limited to a small number of interfering factors.
In studying a strong treatment effect (Figure 1), an assumption can be made that all persons can be treated as identical units because the strong treatment effect cannot be distorted by interfering effects in meaningful amounts. Any differences caused by personal differences are so small that they can be properly neglected. In this situation, randomization is sufficiently good, and the use of clinical trials is justified as a good approximation. After the error and interfering factors are summed up, resulting in a new distribution under line 3 (E, Sup.), the treatment effect is still much stronger than the combined effects of the error and the interfering factor. Even if many interfering factors exist, their effects could still be neglected.
In studying a chronic disease (Figure 2), the treatment effect is weak relative to two interfering factors shown under the line 2. When the two interfering factors and the error are summed up, they generate an apparent error distribution under the line 3. The mean of this apparent error is the sum of the means of the error and the means of the two interfering factors. Without considering the interfering factors, the trial is to find the differences between the treatment and the error under line 1. If the interfering factors are considered, the trial determines the treatment effect under line 4 relative to the apparent error under line 3. The trial may be unable to find the treatment effect if the data comes out with the treatment’s effect at a lower tail region and the error at the upper tail region.
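The apparent error described above can be written out explicitly. The following is a sketch under added assumptions: the two interfering factors are labeled $s_2$ and $s_3$ with means $\mu_2, \mu_3$ and variances $\sigma_2^2, \sigma_3^2$, they are taken to be independent of each other and of the true error, and the symbol $\varepsilon_{\mathrm{app}}$ is introduced here only for illustration.

```latex
\begin{aligned}
\varepsilon_{\mathrm{app}} &= \varepsilon + s_2 + s_3,\\
\mathrm{E}[\varepsilon_{\mathrm{app}}] &= 0 + \mu_2 + \mu_3,\\
\mathrm{Var}[\varepsilon_{\mathrm{app}}] &= \sigma_E^2 + \sigma_2^2 + \sigma_3^2 .
\end{aligned}
```

Under these assumptions the treatment has to stand out against a spread of $\sqrt{\sigma_E^2 + \sigma_2^2 + \sigma_3^2}$ rather than $\sigma_E$ alone.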
In the worst situation (Figure 3), the effect of one or more interfering factors is larger than the effect of a treatment. In this case, the error under line 1 and the interfering factor under line 2 merge to become a large apparent error with large variances under line 3. The treatment and the apparent error have a large overlap region (if the profile under line 3 is moved onto line 4 horizontally). A trial may come out with the treatment effect falling in the lower tail region while the apparent error falls in the upper tail region, resulting in a finding that the treatment is negative relative to the control. This result clearly contradicts the model assumption that the treatment has a weak effect, indicated by the letter A.
When the weak treatment overlaps the apparent experimental error as shown in Figures 2 and 3, the trial is meaningless. Nothing can correct this problem, which arises from breaching the basic presumption that the treatment effect must be much larger than the experimental error.
Figure 4 shows how an optimization trial by including the interfering factor (which appears in Figure 3) as part of the treatment will dramatically improve the chance to determine the treatment effect. Optimization with both the original treatment and the interfering factor will reduce the variances of the apparent error and increase the difference (designated by A+B) between the mean of the whole treatment package and the control.
In studying chronic disease, all persons must be regarded as different.
Cancer provides the best example in this regard. Each tumor is unique due to its genetic and epigenetic basis and exogenous exposures such as dietary and lifestyle factors. If a treatment protocol developed from population data could be used to cure the disease of a particular person, one would have to wrongly argue that the health properties of the person are unimportant to diseases and that phenotypes can be freely changed. Health properties are not quantities that can be summed up and averaged, and a treatment protocol based on population data cannot be applied to any specific person as a cure for a chronic disease. This might be the reason why medicine could not find cures by using clinical trials.
If a statistical analysis of clinical trial data yields a “significant difference” over a large number of interfering factors, such a treatment must be very strong. Such a strong treatment is unlikely to correct the many weak causal factors behind chronic diseases. This might be a reason that treatment protocols from clinical trials can control symptoms quickly but are unable to restore sophisticated balances in human bodies.
Massive differences among individual persons are anticipated to affect the accuracy and reliability of clinical trials required for studying and characterizing chronic diseases. In a large clinical trial, a measured health property such as survival time or hazard ratio depends on the nature of the disease, the effect of the treatment, all uncontrolled interfering factors, and their interactions. Naturally, all those factors are added to the error term. The final conclusion of the trial depends on the treatment effect relative to the bloated error term. If many factors are not controlled, the presumption that the treatment effect is much larger than the experimental error fails and the result is incorrect. We will show a list of uncontrolled factors that can be seen in clinical trials.
The above table shows only a few exemplar factors that normally influence chronic diseases including cancer. The exact working mechanisms are unimportant to our analysis. Those factors affect a treatment result for chronic disease if the treatment is evaluated by measuring a health property such as survival time, hazard ratio, chemical analysis data, structure’s size, biochemical process speeds, etc. They affect measured health properties by causal effects or by influencing one or more causal factors. Some factors may work like confounding factors.
Variances of each factor arise also from an error in measuring the factor and the mechanisms at which the factor affects the measured health property. For example, it is impossible to accurately measure the intensity, amount, and duration of exercise. Even if exercise were used with precise accuracy, actually delivered effects on the health property would depend on personal conditions.
Surgery is considered a factor influencing cancer cell re-population by different mechanisms. Exercise is found to be an important adjunct therapy in the management of cancer, based on a large number of studies. Physical inactivity is one important cause of most of 35 chronic diseases. Chronic stress can dramatically speed up cancer metastasis [11,12]. A prior surgery can dramatically alter the body’s ability to resist cancer return and its growth speed [13-16]. Age affects the cancer incidence rate by a sixth power. Age, body mass index, dietary saturated fat, and EPA and DHA omega-3 fatty acids can affect the body’s inflammation potential. Many uncontrolled factors may be orders of magnitude stronger than the treatment’s effects when their effects are looked at in the long term.
In clinical trials, most of those factors are not controlled or cannot be accurately controlled. For example, surgery cannot be well controlled. If patients in a typical trial have been operated on previously, the amount of tissue loss and surgical locations are dictated by medical needs. Ages may be classified by age groups but their effects cannot be well controlled due to personal differences. Besides, two persons at the same age may have very different biological ages. Most lifestyle factors cannot be measured accurately and thus are anticipated to have different effects. Since people have different lifestyles, their prior lifestyles may have residual effects on health properties after their lifestyles are changed per required treatment.
If a clinical trial is designed to study a weak factor, tens to hundreds of other uncontrolled factors with similar levels of effects are “bundled” into the error term. All of those factors affect human subjects in both the treatment and the control arms; and due to randomization, they do not cause a meaningful difference between the treatment’s mean and the control’s mean. Each interfering factor raises both the mean and the variances of the apparent experimental error term (Figures 2 and 3). We will show that statistical analysis not only fails to correct such a problem but makes the problem worse by failing to recognize weak treatment effects.
A randomized controlled trial does not automatically deliver a precise estimate of the average treatment effect, and it yields an unbiased estimate for the sample selected for the trial. The cited study discussed many problems but did not discuss the inherent biases that arise when a treatment effect is weak while multiple interfering factors exist. Accordingly, no attempt has been made to understand the merit of using the multiple-factor optimization method.
One common type of statistical analysis is to compare the mean of the treatment with that of the control by conducting a hypothesis test. Our simulation shows that the statistical outcome depends on the degree of data dispersion. In cancer cases, if the survival times become more widely dispersed, the rejection point for the null hypothesis (expressed through the t-statistic, F-statistic, etc.) will shift toward a higher value. This means that a weak treatment effect is very likely to be dismissed as random error (see all examples in F-I, Sup.). This can be seen from Figures 2 and 3 as well.
When a hypothesis test is conducted using the t distribution, a health property is observed before and after treatment, and the paired difference is used in the test. The rejection point depends on how similarly patients respond to the treatment. If they respond in exactly the same way, even a small treatment effect can be recognized. However, large differences in patients’ responses will cause the rejection point to move toward a larger value for the same p-value, and thus the test fails to recognize the effect of the treatment (F, Sup.). In a test of two population means, large differences within each treatment group will likewise cause the rejection point to shift toward a larger value (G, Sup.).
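The dependence of the test outcome on data dispersion can be illustrated with a small Monte Carlo sketch. It is not the simulation reported in the supplement: the function name `detection_rate`, the sample size, the effect size, and the cut-off of 2.0 (a rough stand-in for the 5% critical value) are all assumptions chosen for illustration.

```python
import math
import random
import statistics

def detection_rate(sd, effect=1.0, n=30, trials=500, t_crit=2.0, seed=0):
    """Fraction of simulated two-arm trials whose Welch t exceeds t_crit.

    t_crit=2.0 is a rough stand-in for the 5% critical value (assumption).
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        treat = [rng.gauss(effect, sd) for _ in range(n)]
        ctrl = [rng.gauss(0.0, sd) for _ in range(n)]
        se = math.sqrt(statistics.variance(treat) / n + statistics.variance(ctrl) / n)
        if (statistics.mean(treat) - statistics.mean(ctrl)) / se > t_crit:
            hits += 1
    return hits / trials

# Same true effect, increasingly dispersed data.
for sd in (1.0, 2.0, 5.0):
    print(sd, detection_rate(sd))
```

The true effect is identical in every run; only the dispersion changes, and the fraction of trials that recognize the effect falls as the dispersion grows.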
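A worked example of the paired test makes this point concrete. The two data sets below are hypothetical and were chosen so that both have a mean change of 2; the only real constant is the two-sided 5% critical value of the t distribution with 4 degrees of freedom.

```python
import math
import statistics

def paired_t(differences):
    """t statistic for paired (after minus before) differences against zero."""
    n = len(differences)
    return statistics.mean(differences) / (statistics.stdev(differences) / math.sqrt(n))

T_CRIT = 2.776  # two-sided 5% critical value of t with 4 degrees of freedom

uniform_response = [2.0, 2.1, 1.9, 2.0, 2.0]   # hypothetical, mean change 2
varied_response = [8.0, -4.0, 6.0, -3.0, 3.0]  # hypothetical, mean change 2

print(paired_t(uniform_response) > T_CRIT)  # True: effect recognized
print(paired_t(varied_response) > T_CRIT)   # False: same mean effect is missed
```

Both groups improve by the same average amount, but only the group whose members respond alike clears the rejection point.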
In conducting variance analysis, uncontrolled interfering factors affect the health property to be measured. The test outcome depends on the ratio of the variances of the treatment to the variances of the random error. If interfering factors are not controlled, they will go into the error term and thus reduce the ratio of treatment variances to error variances. The uncontrolled factors cause the F statistic to shift to a lower value so that the F test will be more likely to accept the null hypothesis (H, Sup.).
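The effect of an uncontrolled factor on the F statistic can be sketched as follows. The example is hypothetical: it uses a two-group one-way analysis of variance, an additive uncontrolled factor drawn from N(1, 2²) for every subject, and arbitrary sample sizes; the helper `f_statistic` is written out here rather than taken from any library.

```python
import random
import statistics

def f_statistic(groups):
    """One-way ANOVA F: between-group mean square over within-group mean square."""
    values = [x for g in groups for x in g]
    grand = statistics.mean(values)
    means = [statistics.mean(g) for g in groups]
    ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

rng = random.Random(2)
n = 500  # hypothetical subjects per arm
control = [rng.gauss(0.0, 1.0) for _ in range(n)]
treated = [rng.gauss(1.0, 1.0) for _ in range(n)]
f_controlled = f_statistic([control, treated])

# Same subjects, plus an uncontrolled factor N(1, 2^2) acting on every subject.
control_u = [x + rng.gauss(1.0, 2.0) for x in control]
treated_u = [x + rng.gauss(1.0, 2.0) for x in treated]
f_uncontrolled = f_statistic([control_u, treated_u])

print(f_controlled, f_uncontrolled)
```

The uncontrolled factor leaves the difference between the arm means roughly unchanged but inflates the within-group mean square, so the F statistic drops.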
Interfering effects of uncontrolled factors cannot be corrected by any other statistical analysis method, including the χ² goodness-of-fit test and the common frequency test (J-K, Sup.). Some statistical methods take into account only sample-drawing errors, and others may address specific problems, but none have the power to correct this fundamental flaw, which must be addressed by raising measurement accuracy. The problem cannot be cured by methods such as randomization and stratification (L-M, Sup.). Simpson’s Paradox is also powerful proof that different persons cannot be treated as the same in a clinical trial (N, Sup.).
Prior studies on the benefits of randomization in clinical trials are focused on how randomization reduces systematic biases [20,21] and prevents selection biases. When human subjects are randomized, all interfering factors that affect trial outcomes can be similarly allocated to the treatment group and the control group. This similarity in their effects allows for statistical inferences on the treatment effects. While those points are correct in the context of studying a strong treatment as shown in Figure 1, the method does not work when the effects of the interfering factors are close to the treatment’s effect. Those studies did not consider how combining multiple factors as a single treatment can dramatically raise the capability to detect treatment effects.
Simpson’s Paradox can be fully explained by the interfering factors. The paradox occurs when the marginal association between two categorical variables is qualitatively different from the partial association between the same two variables after controlling for one or more other variables. The real cause of Simpson’s Paradox is large variances at the personal level, and health properties from different persons cannot be studied in a single model. In characterizing a chronic disease, each person must be treated as a unique system. A distinctive regression curve is presumed to exist for each person. Pooling data from different people in a regression analysis is an attempt to find one regression curve among different systems. Such a regression curve cannot be right except by accident. The regression pattern will change when one or more important factors are controlled. Parameters from the regression may be applicable to a population, but the population does not have diseases. Thus, any treatment developed using population data cannot cure diseases for any specific person. The pattern of Simpson’s Paradox implies that such regression data is improperly combined.
The problem discussed above is rooted in the fact that massive personal differences in clinical trials affect a measured health property. No statistical analysis, nor any other method under the sun, can correct this problem, which is like a bad laboratory report based on data generated with an erratic household scale. Causal and interfering factors include health factors that patients can correct and factors that patients cannot change. Some interfering factors are called covariates; examples include sex, age, trial site, disease characteristics, disease prognosis, etc. A presumed fix is to achieve balance among treatment and control arms with the hope that the conclusions of a clinical study are not sensitive to covariates. However, none of the proposed methods in the CHMP Guideline can correct the biases of clinical trials because those measures cannot reduce the variances of the error term. In another study, attempts were made to evaluate different methods for correcting baseline imbalances. That study focused only on pre- and post-treatment scores and how different analytical methods affect biases but did not address how interfering factors distort the true effects of weak factors. The problem cannot be addressed by covariance analysis.
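The reversal between marginal and partial association can be shown with a small numeric example. The counts below are illustrative, patterned on a widely used textbook example, and are not data from this study; the subgroup labels "mild" and "severe" are placeholders for any uncontrolled factor.

```python
# (successes, patients) per subgroup; illustrative counts only.
treatment_a = {"mild": (81, 87), "severe": (192, 263)}
treatment_b = {"mild": (234, 270), "severe": (55, 80)}

def rate(successes, patients):
    """Success rate within one group."""
    return successes / patients

def pooled(groups):
    """Success rate after pooling all subgroups together."""
    return rate(sum(s for s, _ in groups.values()),
                sum(n for _, n in groups.values()))

for name in treatment_a:
    # A does better inside every subgroup...
    print(name, rate(*treatment_a[name]) > rate(*treatment_b[name]))
# ...yet B looks better once the subgroups are pooled.
print("pooled", pooled(treatment_a) < pooled(treatment_b))
```

Every comparison prints `True`: the treatment that is better within each subgroup looks worse in the pooled data because the subgroups are unevenly represented in the two arms.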
We also show that health properties are not the types of things that can be summed up and averaged (O, Sup.). Good personal health is achieved by maintaining sophisticated balances. Beneficial effects and adverse or negating effects happen in different patients, and they cannot be averaged in reality. This unique problem arises in the context of characterizing chronic diseases. It is safely assumed that chronic diseases are caused by imbalances, which can be caused by a disturbance in two opposite directions. Each biochemical pathway must be maintained at a proper speed relative to other pathways, and changing the speed of this pathway in either way can disturb this balance. The same amount of qualitative change from a right pathway speed in one person cannot be used to compensate for the same amount of change in an opposite way in another person. However, statistical analysis is based on an assumption that health properties are fungible and transferable and thus can be summed up and averaged. This assumption cannot hold in reality. An identical amount of departure from the population’s mean has different impacts on different patients. The same amount of change may cure, hurt or kill a patient, depending on the specific conditions of the person.
Statistical analysis is based on an oversimplified and unrealistic assumption that health properties can be treated as fungible property. The statistical analysis adds negating effects to the sum of the treatment and thus lowers the treatment’s mean. This also results in wrong results like the sum of 20% positive effects and 20% negative effects is equal to no effect. In reality, one can avoid negating effects by avoiding applying the treatment to mismatched patients and can deliver 20% positive effects. For those obvious reasons, a treatment protocol developed from a clinical trial predictably fails to work on real human beings. Optimization focusing on a single patient is the only way to avoid this fundamental flaw. This problem is less critical when health properties among “sufficiently similar subjects” are summed up or averaged to get rid of fluctuations caused by uncontrollable errors.
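The averaging problem can be illustrated with a toy calculation. The numbers are hypothetical: ten patients, half of whom respond with a 20% improvement and half with a 20% deterioration in some health property.

```python
import statistics

# Hypothetical per-patient changes in a health property, in percent.
matched = [20, 20, 20, 20, 20]          # patients the treatment suits
mismatched = [-20, -20, -20, -20, -20]  # patients the treatment harms

trial_mean = statistics.mean(matched + mismatched)  # pooled trial average
matched_mean = statistics.mean(matched)             # treat matched patients only
print(trial_mean, matched_mean)
```

The pooled average reports no effect even though no individual patient experienced "no effect"; restricting the treatment to matched patients recovers the full benefit.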
Based on a hypothetical model study, where each of the k factors can influence the health property by the same amount, using k factors as a treatment is superior to the treatment using a single factor (P, Sup.). If each of the k factors has the same treatment effect and the same variance in the health property, and that variance is similar to the experimental error, using an optimization trial to optimize the health property with k factors will raise the treatment effect by k times and raise the T statistic, Z statistic, and F statistic by about k*√k. The sensitivity and ability of a hypothesis test to detect the true treatment effect increase with the number of interfering factors.
When the total number of factors is increased from 1 to 2, 5, 10, and 100, all statistics will increase by about 2.8, 11.2, 32, and 1000 times, respectively.
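The multipliers quoted above follow directly from the k*√k expression and can be checked with a few lines of Python. This is only an arithmetic check of the stated formula; the function name `statistic_gain` is an illustrative choice.

```python
import math


def statistic_gain(k):
    """Approximate gain in the T, Z, or F statistic for k bundled factors (k * sqrt(k))."""
    return k * math.sqrt(k)


# Evaluate the formula for the factor counts mentioned in the text.
gains = {k: round(statistic_gain(k), 1) for k in (1, 2, 5, 10, 100)}
for k, gain in gains.items():
    print(k, gain)
```

For k = 10 the formula gives about 31.6, which the text rounds to 32.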
In conducting a valid experiment, one fundamental requirement is that the accuracy and reliability of detection technologies for detecting a treatment’s effect must be sufficiently higher than those for detecting the experimental error. This requirement can hold only in studying strong treatments that can stand out over the apparent experimental error. In medical research, this requirement becomes that detectable treatment’s effect must be much larger than the apparent experimental error. The failure of this presumption in clinical trials can be rephrased as one that the alternative hypothesis (the effect attributed to treatment) is too close to the apparent experimental error so that the data set tends to come out with its test statistic falling within an acceptance region. This results in an outcome of failing to recognize a weak treatment effect.
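The relationship between effect size, apparent error, and the chance of landing in the acceptance region can be sketched with a textbook power calculation. The example below assumes a one-sided, one-sample Z test with known standard deviation and normally distributed data; these assumptions, the function name `z_test_power`, and the numbers used are illustrative and are not taken from the paper's models.

```python
from statistics import NormalDist


def z_test_power(effect, sigma, n, alpha=0.05):
    """Power of a one-sided, one-sample Z test with known sigma.

    effect: true mean shift under the alternative hypothesis
    sigma:  standard deviation of the apparent experimental error
    n:      number of subjects
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)   # boundary of the acceptance region
    shift = effect * (n ** 0.5) / sigma      # standardized true effect
    return 1 - std_normal.cdf(z_crit - shift)


# Same error and sample size; only the size of the effect changes.
strong = z_test_power(effect=1.0, sigma=1.0, n=25)
weak = z_test_power(effect=0.1, sigma=1.0, n=25)
print(round(strong, 3), round(weak, 3))
```

Under these assumptions, an effect as large as the error is detected almost always, while an effect one tenth of the error is missed in most trials of the same size.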
All statistical analysis methods are premised on the model assumptions, and every model assumption including the test hypothesis must be correct. The fluctuations caused by beneficial and adverse/negating effects among different patients are not the same as drawing errors or true random errors in typical statistical models. The effects of uncontrolled factors may be merged into the experimental error only if the total experimental error is still sufficiently smaller than the treatment’s effect. Big data dispersion in the statistical analysis may not be ignored. When the experimental error in a clinical trial is close to the treatment’s effect, such a trial will generate meaningless results.
Lack of required accuracy and reliability is inherent in clinical trials used to characterize chronic diseases. Chronic diseases, by definition, progress slowly. This means that changes in any measured health property such as hazard ratio, organ function, survival time, or other measured chemical data in any given time interval are infinitesimally small. Thus, the accuracy and reliability required to accurately characterize chronic diseases are much higher than those for studying acute diseases.
Compared with mechanical systems such as cars, planes, etc, human beings are the most unfit subjects for clinical trials because of a massive number of genetic differences and phenotypes [7,8]. In addition, the personal differences are further increased by different emotional states of human beings. Since the massive personal differences in clinical trials interfere with the accurate assessment of any health properties, it is impossible to detect weak and slow-delivering treatment effects. By using clinical trials, medical researchers cannot accurately determine what can cure chronic diseases and what harms personal health in the long term.
Statistical analysis has been widely abused over a long history. Misuse of statistical analysis in medical research is a well-known problem that has been discussed in a large number of studies [28-32]. Problems discussed in those cited studies are in addition to the model flaws we have discussed above.
Our simulation results from all different models consistently show that the effects of clinical trials are one-way biased when the trial is used to evaluate a weak treatment. The averaging operation tends to reduce the treatment mean, and this effect is not reflected in any assumption in basic statistical models. The statistical mean, µs, must be smaller or much smaller than µb, the actual beneficial mean when the treatment is correctly applied only to matched patients (for example, Vitamin D is used only on those with Vitamin D deficiency but not on those with Vitamin D excess). This effect is described by a degrading factor g=µs/µb, which reflects the degree of “indiscriminate application” of the treatment. This value is in the range of 0 to 1. Statistical analysis is unfit for studying chronic diseases. If a measured health property is influenced by multiple interfering factors, a study focusing on a single treatment with other factors randomized will increase the chance of rejecting the treatment as having no effect on the health property. Hundreds of trials, with each focusing on one single factor, will result in failure to find an effect for any of the factors.
Clinical trials distort hypothesis tests by enlarging the error term, and statistical analysis reduces the treatment’s effects through the averaging effect. They both work in favor of rejecting the treatment. If a clinical trial results in rejection of the null hypothesis, the finding will likely stand, except that the true treatment effects may actually be larger than the determined values. However, if a hypothesis test outcome is acceptance of the null hypothesis, it may be wrong due to the negating effects and interfering effects. Therefore, conclusions in a good number of published studies should be interpreted differently. This one-way bias can be traced to the irreconcilable conflicts among massive personal differences, required high measurement accuracy and reliability, weak and slow effects of treatments for chronic diseases, and the unique roles of imbalances in chronic diseases.
Personal health is influenced by diets, nutrition, exercises, mind regulation, chronic stress, fears, etc. Many of those factors work like double-edged swords: they can benefit some patients, but hurt others if they are misused to destroy some established balances. The effects of nutrition and diets are expected to be highly random and unpredictable due to different personal lifestyles. In such a trial, the apparent error is inflated by dietary factors acting as uncontrolled interfering factors.
Findings from a clinical trial represent only an abstract population and are inapplicable to real patients as far as chronic diseases are concerned. A large number of factors in diet, lifestyles, exercise and emotional states, etc. can affect cancer outcomes, and thus, each study focusing on one single or a few factors will result in rejecting each factor as a potential treatment.
By creating false acceptances, misused statistical analysis keeps rejecting weak and slow-delivering treatment effects. This explains why a clinical trial could not positively affirm a single lifestyle factor’s curative benefits even though it is found to be a significant risk factor of the disease in other types of long-term studies.
Clinical trials are primarily responsible for promoting mainly surgery, synthetic drugs, and radiation as “scientifically valid” treatments and rejecting potentially tens of thousands of non-medical weak and slow treatments, which would be one to two orders of magnitude more powerful if they were used collectively in optimization trials.
Clinical trials are most probably the main culprits that preclude mankind from finding cures for chronic diseases. It is reasonable to infer that clinical trials are in part responsible for creating current national health epidemics in the U.S., China, and many other nations in the world.
A serious problem is the cumulative toxic effects of environmental pollutants, contaminants, food additives, pesticide residues, herbicides, industrial chemicals, etc. By focusing on a single toxic agent in each trial, each such study cannot catch a weak and slow-delivering toxic agent. However, multiple toxic agents always work together in human bodies. A negative finding could be “caused” by the interference of other similar or stronger toxic agents and similar or stronger interfering effects. Most known toxic substances co-exist in human bodies. If a hundred similar factors are studied at the same time, Z statistic, T statistic, and F statistic could be 1000 times more than a counterpart in the clinical trial focusing on a single factor. When a large number of similarly harmful factors attack the human in the control, each of the toxic agents is naturally hidden as “the experimental error.” However, several, tens, or even hundreds of toxic agents can slowly damage human bodies. This single toxic agent can be identified only if all those toxic substances are not present in trial subjects. Findings from studying one or few toxic agents at a time do not reflect the real damages of multiple toxic agents to the human body.
To find a cure for chronic disease, a required capability is determining which factors can speed up the disease’s progression and which can slow down or reverse its progression. Considering massive differences among human subjects and a large number of interfering factors, clinical trials are unfit for establishing treatment protocols. Optimization trials using multiple factors as treatment provide much better chances for finding cures for chronic diseases. We will show three huge gains below.
First, the biggest gain from using an optimization trial is to avoid negating effects caused by indiscriminate application of the treatment. For a single factor treatment, an optimization trial can raise beneficial effects by (1/g), where g=µs/µb. It can be 1 to any reasonable number (See treatments C to G in Table 7S, Sup.). In clinical trials, the same treatment is indiscriminately used on all patients in the treatment group, yet many lifestyle factors can disturb various balances in two opposite directions. If those factors are randomly used against all patients in the treatment group or subgroup, their true beneficial effects on some patients can be “nullified” by their negating effects on other patients (per the analysis for the model in Table S7). In an optimization trial, controllable factors are used as part of treatment and are used on only the patients who need them. Sufficiently similar patients are selected in such a trial.
Second, we have shown that a large number of interfering factors directly interfere with clinical trials. They have different levels of effects and different variances. They can be used as part of a treatment package for chronic diseases. Thus, a wise strategy is to include multiple factors that would affect disease outcomes as a treatment package. The apparent error distribution in a patient can be estimated by the mean, $\mu_t = \mu_1 + \mu_2 + \dots + \mu_k$, and the variance, $\sigma_t^2 = \sigma_E^2 + \sigma_1^2 + \sigma_2^2 + \dots + \sigma_k^2$, when the interfering factors are not used as part of the treatment. An improvement can be achieved by using Model B. By bundling all controllable co-causal and interfering factors as a treatment package, the total treatment’s effect is raised by k times while the error variances are reduced by about √k according to the Model B analysis.
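The k and √k factors stated above can be reproduced with a short back-of-the-envelope derivation. It is only a sketch under simplifying assumptions made here for illustration: there are k factors in total, each with the same effect $\mu$ and the same variance $\sigma^2$ as the experimental error ($\sigma_E^2 = \sigma^2$); the factors act additively and independently; the sample size is the same in both designs; and bundling the factors into the treatment removes their variances from the error term. The symbols $Z_1$ (single-factor trial with the other k-1 factors left in the error) and $Z_k$ (bundled trial) are introduced only for this sketch.

$$
Z_1 \propto \frac{\mu}{\sqrt{\sigma_E^2 + (k-1)\sigma^2}} = \frac{\mu}{\sigma\sqrt{k}},
\qquad
Z_k \propto \frac{k\mu}{\sigma},
\qquad
\frac{Z_k}{Z_1} = k\sqrt{k}.
$$

Under these assumptions the factor k comes from the summed effects and the factor √k from the smaller standard deviation of the apparent error.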
The total gain in treatment effects existing in an optimization trial over a clinical trial is (1/g)*k while all test statistics such as T statistic, Z statistic, and F statistic used in hypothesis tests are increased by (1/g)*k*√k, where 1/g is attributed to avoiding negating effects, k is attributed to the additive effect of multiple treatment factors, and √k is attributed to a reduction in the error variances. Their collective impacts could be huge. This conclusively shows why medicine could not find “scientific evidence” for any treatment based on a single lifestyle factor.
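The combined gain described above can be evaluated numerically. The sketch below simply codes the two expressions, (1/g)*k for the treatment effect and (1/g)*k*√k for the test statistics; the function name `optimization_gain` and the example values g = 0.5 and k = 10 are assumptions chosen for illustration, not values from the paper.

```python
import math


def optimization_gain(g, k):
    """Return (gain in treatment effect, gain in test statistic).

    g: degrading factor, 0 < g <= 1 (g = 1 means no negating effects)
    k: number of factors bundled into the treatment package, k >= 1
    """
    if not 0 < g <= 1:
        raise ValueError("g must be in the interval (0, 1]")
    if k < 1:
        raise ValueError("k must be at least 1")
    effect_gain = k / g                              # (1/g) * k
    statistic_gain = effect_gain * math.sqrt(k)      # (1/g) * k * sqrt(k)
    return effect_gain, statistic_gain


# Hypothetical example: half of the benefit is lost to negating effects, ten factors.
effect_gain, statistic_gain = optimization_gain(g=0.5, k=10)
print(effect_gain, round(statistic_gain, 1))
```

The value g = 0 is excluded because the expression 1/g is undefined there; it corresponds to a pooled mean of exactly zero.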
This conclusion is backed up, surprisingly, by all of the simulation results for three hypothesis tests in every model we used in the Supplement. Thus, we assume the gain is an inherent estimate (but not a precise number due to the complexity of the human body). Moreover, the actual gain is predicted to be more than (1/g)*k or (1/g)*k*√k.
If interfering factors are matched to patients, the true gain in the treatment effect is more than k times. We assume that some adverse effects which cannot be directly measured are not reflected in the negating effects and thus could not get into the g value. The true treatment effects could be further raised by avoiding adverse side effects. In contrast, optimization trials are suited to lifestyle factors, natural remedies, and mild or safe environmental factors, which do not implicate serious side effects. Even though the variances of the treatment’s mean X could approach zero, the √k term most probably cannot be ignored by approximation in studying chronic diseases.
The inevitable conclusion of clinical trial’s invalidity is strongly resonant with ancient medical practices such as herbal formulations and practices under the ancient holistic model. This ancient holistic model stresses the need to work on the whole body by using a large number of natural compounds or multiple treatment methods.
Based on the strength of our evidence as a whole, we reject clinical trials as a misused wrong experimental method for studying weak and slow health properties and propose optimization trials as a replacement. Optimization trial is suitable for studying weak, slow-delivering, and natural remedies, but may not be used to study the side effects of single or a few synthetic drugs.
One solution is using a single human subject in a clinical trial. In this case, a control cannot be found because no two persons are similar in the world. Thus, the person’s condition before the treatment is used as a control. This is essentially what was once used in ancient medical systems. The treatment effect is assessed by comparing the health properties before the treatment and after the treatment. One problem is that if the treatment lasts a long time, the aging process can interfere with trial results and other previously used treatments may influence the current treatment. Some practical adjustments must be made. The trustworthiness of findings should be established by replicating the same trial several times. Acceptance of this approach would require examining the rationale of using clinical trials. The notion that treatment is good for all people in the population is clearly incorrect as far as chronic diseases are concerned.
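The before/after comparison with replication described above can be sketched in a few lines. The readings below are invented for illustration, and the helper names `before_after_changes` and `consistent_direction` are hypothetical; the sketch only checks whether repeated single-person trials change the measured health property in the same direction, and it does not correct for aging or carry-over from earlier treatments.

```python
from statistics import mean


def before_after_changes(replications):
    """Return the mean change (after minus before) for each replication.

    Each replication is a pair (before_values, after_values) for one person.
    """
    return [mean(after) - mean(before) for before, after in replications]


def consistent_direction(changes):
    """True if there is at least one change and all changes share the same sign."""
    if not changes:
        return False
    return all(c > 0 for c in changes) or all(c < 0 for c in changes)


# Hypothetical readings of one health property from three repeated trials.
replications = [
    ([5.0, 5.2, 4.8], [5.6, 5.9, 5.5]),
    ([5.1, 4.9, 5.0], [5.4, 5.7, 5.6]),
    ([4.9, 5.0, 5.1], [5.5, 5.3, 5.8]),
]
changes = before_after_changes(replications)
print([round(c, 2) for c in changes], consistent_direction(changes))
```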
An alternative solution is controlling all influencing factors in a mini-trial so that the variances from interfering factors are minimized. It is difficult to get rid of the massive personal differences which are presumed to interfere with trial outcomes. What could be achieved in practice is using “sufficiently similar subjects” in the mini-trial. To investigate a treatment in a trial, all significant co-causal and interfering factors including those listed in Table 3 and other known factors should be controlled. For example, relevant genetics, diet, exercises, toxic agent levels, medication use history, sex, age, race, emotional states, etc. are controlled. When the variances from those factors are controlled, the trial’s sensitivity will be dramatically increased. By using sufficiently similar subjects, weak, slow-delivering and natural treatment effects can be detected with increased sensitivity. To see whether the treatment works on patients with similar important health properties, a second or third mini-trial is conducted. After a series of mini-trials has been done, a researcher can see when the treatment works and under what conditions the treatment works.
Personal genetics and emotional states are difficult to control. Personal genetics can be controlled by selecting human subjects. To control those factors, one should focus on their nexuses to measured health properties. If a treatment works on a particular biochemical process, subjects with known genes that control the process should be selected or avoided, but other genes with little effect may be neglected.
Emotional states should be stressed if they are predicted to play significant roles in influencing measured health properties. When human subjects are nearly identical, variances attributable to personal differences will be dramatically reduced. In personalized medicine, randomization, subject selection bias, and statistical analysis have limited utility and should not control experimental designs.
When clinical trials involve a small number of sufficiently similar patients, statistical analysis should not be concerned. When all significant factors are controlled, measured health properties may be treated as ordinary variables and thus statistical analysis can be avoided or used as a mere causal check. P-values 0.5±0.15 (or any other suitable numbers) may be used because the trustworthiness of trial findings is established by replicating mini-trials. For a single person trial, statistical analysis cannot be used in most situations unless the study purpose is examining things caused by instruments or sampling technologies, and the trustworthiness of findings should be established by repeating the same or similar trial. All details on controlled factors should be documented for replicating the trial.
The single-person or mini optimization trial can be used to study a combination of factors. Cancer is clearly responsive to lifestyle changes involving a large number of factors. When tens to hundreds of factors are controlled, their combined effects are added up in some ways while co-causal and interfering factors are dropped out from the error term. All co-causal and interfering factors are used to promote healing in the treatment arm. Such an experimental design will dramatically increase the detection sensitivity of the trial and raise the treatment’s effect.
Optimization trials are superior to clinical trials for studying weak effects. By recognizing the validity of single-person trials and mini optimization trials, personal medical miracles can be conveniently studied. The research focus is not on experimental designs, evidence quality, statistical analysis, selection bias, rejection criteria, etc, but the delivery of predictable cures which can be tailored to all specific patients including “minority patients.” This mission cannot be accomplished by the indiscriminate application of treatments in clinical trials.
Our findings are not applicable to clinical trials, the findings of which are not used as the basis for treating diseases. If the purposes of research are to explore costs and resource allocations, their validity is not subject to the same analysis. Also, if clinical trials are used to study disease mechanisms as a way to control health costs, they still provide useful information for policymakers.
It has been routinely assumed that the measured health property in a trial is mainly attributed to treatment. However, this presumption is always breached in studying chronic diseases. Thus, our findings are applicable to any clinical trial. When a weak treatment plus at least one interfering factor affects the measured health property, the validity of trial outcomes depends on the relative size of the treatment’s effect to those of the interfering factor. Moreover, concerning chronic diseases, health properties are different from person to person. This implies that a true cure must be formulated for each specific patient, and treatment established by population trials cannot restore balance for specific patients except by accident.
We note that the effects of interfering factors are not linearly additive, their effects may vary in degrees, their variances are not similar, their distributions may not be normal, and many factors may interact with each other in complex ways.
However, they affect the mean and variances of the experimental error in certain ways. The effects from all interfering factors are added up linearly or non-linearly. When the causal and interfering factors are bundled into the error term, they ruin the trial. If they are bundled into a treatment, test statistics increase as a result of the addition of all co-causal and interfering factors and are further enlarged by an empirical multiplying factor that is attributed to the reduced variances of the apparent experimental error.
If a clinical trial’s design breaches any core assumption, its findings are incorrect for that reason. If the breach is sufficient to change trial outcomes, the trial is invalid without regarding the validity of our findings. Thus, whether or not those assumptions used in our models hold will not affect our conclusions. Our findings underscore the importance of adhering to model presumptions in designing clinical trials and conducting statistical analysis.
By examining the machine repairing model and accuracy and reliability requirements for studying chronic diseases, we found that the one-treatment-for-a-population approach is flawed as far as it is used in studying chronic diseases. Clinical trials are good only if the treatments under study are sufficiently strong or when all human subjects can be treated as “nearly identical units” as in classical probability trials or classical clinical trials. Neither of the two conditions is met when clinical trials are used to characterize chronic diseases. Randomized clinical trials are unable to deliver the required accuracy and reliability due to the massive personal differences attributable to genotypes, phenotypes, and emotional states of individual persons.
We further found that clinical trials and statistical analysis are fundamentally flawed on multiple grounds as revealed in numerous hypothetical models such as a multiple causes/treatments model, multiple interfering factor model, two population means hypothesis test, paired data hypothesis test, F-test in variance analysis, etc. We found that health properties are not the types of fungible things that can be summed up and averaged because all human beings must be treated as different things. Beneficial effects and adverse effects happen in different persons with different meanings, and cannot be averaged in reality. Statistical analysis degrades the performance of the treatment by averaging beneficial and negating effects within each treatment or subgroup. This averaging operation dramatically degrades the treatment effects. In conducting statistical analysis, the poor accuracy problem becomes one in which the total experimental error is close to or even larger than the treatment’s effects under the alternative hypothesis. Both the means and the variances of randomized and uncontrolled co-causal and interfering factors are added to those of the error term as an apparent error. When the apparent “error” is far too large relative to the effects of the treatment, the data set tends to come out with test statistics falling in the region of acceptance of the null hypothesis, thus resulting in false acceptance of the null hypothesis or false rejection of true treatment effects. Those fatal flaws are expected to be present under most circumstances. No statistical method, no other methods under the Sun, can ever correct this great bias that arises from breaching the core presumption used in the statistical model. Thus, clinical trials are invalid and have been misused in studying chronic diseases.
Our model analysis shows that optimization trials can dramatically increase the chances of determining treatment effects compared with randomized clinical trials. Based on a multiple interfering factor model, where k co-causal or interfering factors can influence a measured health property by the same degree, a treatment package using all k factors is much better than using a single treatment. If each of the k factors has the same treatment effect and the same variances, an optimization trial to evaluate the health property by using all k factors will raise the total treatment effect by (1/g)*k times relative to a randomized trial (where g is a degrading factor caused by misapplication of treatment to patients, with its value from 0 to 1), and raise the T statistic, Z statistic, or F statistic by about (1/g)*k*√k. Assuming that a treatment has no negating effects, when the total number of the factors is increased from 1 (without any interfering factor) to 2, 5, 10, and 100, the T statistic, Z statistic, and F statistic will increase by approximately 2.8, 11.2, 32, and 1000 times, respectively. Moreover, by avoiding negating effects, an optimization trial using k factors as a treatment package can raise the treatment effect potentially by one to several orders of magnitude relative to randomized clinical trials. The gain cannot be eliminated by increasing the patient number in the trial. The findings show why studies using clinical trials cannot produce “scientifically valid” evidence in support of using a single lifestyle factor as a cure for chronic disease.
The misuse of clinical trials is predictably responsible for the failure to find treatment effects for chronic diseases and failure to identify harmful effects of toxic compounds in the environment. In sum, the clinical trial should be rejected because it offers no chance to find cures under any of our theoretical and practical models mimicking real diseases. Our findings may be similarly applicable to randomized controlled trials used in social sciences, environmental studies, life sciences, etc. as long as those required conditions are met.
Supplementary information is provided.