Example simulation 2. Exponential decay.

Many biological and chemical events follow an exponential decay model. For more information on this model, see Example model 2. Exponential decay. We'll compare three ways to express this model. In all cases, one of the parameters is the starting point, which we will call Start. The second parameter quantifies how rapidly the curve decays. We will compare three ways to express this value: as a rate constant in units of inverse time, as a time constant in units of time, or as a log(rate constant). The three equations are:
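Written out from the definitions above, the three forms would look like this. Only Start is named in the text; the symbols k_rate and k_time are labels chosen here for illustration, and reading "log" as base 10 is an assumption.

```latex
\begin{align*}
Y &= \mathrm{Start}\cdot e^{-k_{\mathrm{rate}}\, t} \\
Y &= \mathrm{Start}\cdot e^{-(1/k_{\mathrm{time}})\, t} \\
Y &= \mathrm{Start}\cdot e^{-10^{\log k_{\mathrm{rate}}}\, t}
\end{align*}
```

All three describe the same curve, since k_time = 1/k_rate and the third form simply fits the logarithm of k_rate instead of k_rate itself.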
Which distribution is closer to Gaussian, rate constants or time constants?

Is it better to express the exponential decay equation in terms of rate constant, time constant, or log(rate constant)? We'll answer the question by simulating data.

First, we need to choose some parameters. We chose a curve that starts at Y=100 and decays exponentially towards 0 with a rate constant (koff) of 0.3 min⁻¹, which corresponds to a half-life of a bit more than 2 minutes (ln(2)/koff ≈ 2.3 min). Our simulations generated 10 data points equally spaced between 0 and 20 minutes, adding Gaussian random error with a standard deviation of 10. The graph below shows three sample simulations.

We simulated 5000 sets of data, and fit each data set to the exponential decay model expressed in three ways. The distributions of the rate constant, time constant, and log(rate constant) are shown in the following figures, which also superimpose ideal Gaussian distributions.
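The simulation described above can be sketched in Python. This is a minimal illustration, not the Prism script used for the original analysis: it assumes NumPy and SciPy are available, uses `scipy.optimize.curve_fit` as a stand-in for Prism's fitting, treats log as base 10, and uses hypothetical names (`simulate_fits`, `MODELS`, and so on). The default number of data sets is smaller than the 5000 in the text to keep the run short.

```python
import numpy as np
from scipy.optimize import curve_fit

# Parameters taken from the text: Start = 100, k = 0.3 per minute, SD = 10.
START, K_TRUE, SD = 100.0, 0.3, 10.0
TIMES = np.linspace(0.0, 20.0, 10)  # 10 equally spaced points, 0 to 20 min


def decay_rate(t, start, k):
    """Exponential decay expressed with a rate constant (inverse time)."""
    return start * np.exp(-k * t)


def decay_tau(t, start, tau):
    """Exponential decay expressed with a time constant (time)."""
    return start * np.exp(-t / tau)


def decay_logk(t, start, logk):
    """Exponential decay expressed with log10 of the rate constant (assumed base)."""
    return start * np.exp(-(10.0 ** logk) * t)


# Each entry: model function and the true value of its second parameter.
MODELS = {
    "rate": (decay_rate, K_TRUE),
    "tau": (decay_tau, 1.0 / K_TRUE),
    "logk": (decay_logk, np.log10(K_TRUE)),
}


def simulate_fits(n_sets=500, seed=0):
    """Simulate noisy data sets and fit each one with all three models.

    Returns a dict mapping model name to an array of best-fit values of
    the second parameter (k, tau, or log10 k).
    """
    rng = np.random.default_rng(seed)
    best = {name: [] for name in MODELS}
    for _ in range(n_sets):
        y = decay_rate(TIMES, START, K_TRUE) + rng.normal(0.0, SD, TIMES.size)
        for name, (model, true_value) in MODELS.items():
            try:
                popt, _ = curve_fit(model, TIMES, y, p0=[START, true_value])
            except (RuntimeError, ValueError):
                continue  # skip the rare data set where the fit fails
            best[name].append(popt[1])
    return {name: np.array(values) for name, values in best.items()}


if __name__ == "__main__":
    for name, values in simulate_fits().items():
        print(name, "mean:", values.mean(), "SD:", values.std(ddof=1))
```

Histograms of the three returned arrays would play the role of the figures referred to above.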
At first glance, all three distributions look roughly Gaussian. Looking more carefully, you can see that the distribution of time constants is skewed to the right. Careful scrutiny reveals that the rate constant distribution is also a bit skewed. These impressions can be confirmed by a normality test. The results are shown in the following table.
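As a rough illustration of such a normality check, the sketch below computes the Kolmogorov-Smirnov distance, that is, the largest gap between the empirical cumulative distribution of a sample and a Gaussian cumulative distribution with the same mean and SD. The function name `ks_distance` is hypothetical, and NumPy and SciPy are assumed. The P value is deliberately left out: when the mean and SD are estimated from the same sample, the standard KS P value is no longer valid and a corrected method is needed, which is not shown here.

```python
import numpy as np
from scipy.stats import norm


def ks_distance(sample):
    """Largest discrepancy between the empirical CDF of `sample` and a
    Gaussian CDF whose mean and SD are estimated from the sample."""
    x = np.sort(np.asarray(sample, dtype=float))
    n = x.size
    cdf = norm.cdf(x, loc=x.mean(), scale=x.std(ddof=1))
    # The empirical CDF jumps at each point, so check the gap on both
    # sides of every jump.
    gap_above = np.max(np.arange(1, n + 1) / n - cdf)
    gap_below = np.max(cdf - np.arange(0, n) / n)
    return max(gap_above, gap_below)


if __name__ == "__main__":
    # Illustrative values only: Gaussian "rate constants" and their
    # reciprocals, which are skewed to the right.
    rng = np.random.default_rng(0)
    k = rng.normal(0.3, 0.05, 5000)
    print("KS distance, k:  ", ks_distance(k))
    print("KS distance, 1/k:", ks_distance(1.0 / k))
```

Taking the reciprocal of a Gaussian sample skews it to the right, so the second distance should come out larger than the first.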
The KS value is the largest discrepancy between the actual cumulative distribution and an ideal cumulative Gaussian distribution (expressed as fractions). See The results of normality tests. None of the distributions are far from Gaussian. The distribution of time constants is the furthest from Gaussian; the distribution of log(rate constant) is closest.

The P value answers the following question: If the true distribution is Gaussian, what is the chance of obtaining a KS value as large as, or larger than, the one we observed? The distributions of both rate constants and time constants deviate significantly from the Gaussian ideal, while the distribution of log(rate constant) is indistinguishable from Gaussian. Because we simulated so many data sets, the KS test has the power to detect even modest deviations from a Gaussian distribution.

How accurate are the confidence intervals of time constants and rate constants?

Before tackling the complexities of the confidence intervals reported by nonlinear regression, first review the meaning of confidence intervals reported by linear regression. For example, assume that a linear regression presented the CI of the slope as 8.2 to 11.3. If you can accept all the assumptions of the analysis, this means that you can be 95% certain this range includes the true slope. More precisely, if you analyze many data sets, you'd expect that 95% of the confidence intervals will contain the true value, and 5% will not. When analyzing a particular experiment, the true value is unknown, so you can't know whether or not the confidence interval includes the true value. All you can know is that there is a 95% chance that the interval contains the true value.

With nonlinear regression, the situation is not so simple. The confidence intervals reported by Prism (and virtually all other nonlinear regression programs) are based on some mathematical simplifications. They are called "asymptotic" or "approximate" confidence intervals. They are calculated assuming that the equation is linear, but are applied to nonlinear equations. This simplification means that the intervals can be too optimistic. While labeled 95% confidence intervals, they may contain the true value less than 95% of the time.

Using simulated data, we can ask how often the reported 95% confidence interval contains the true value. To do this, we changed the script to also save the SE of the best-fit value. We then imported the best-fit values and standard errors into Microsoft Excel, computed the high and low limits of the confidence interval (see Confidence intervals of best-fit values), and asked whether or not each interval contained the true value. When analyzing data, you don't know the true value, so you can't know whether the confidence interval contains the true value or not. With simulated data, you know the true value, so you can answer that question. The results are:
No matter how we expressed the model, the confidence intervals contained the true value almost 95% of the time. The difference between 95% confidence and 93% confidence is unlikely to alter your interpretation of experimental results.
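The coverage check can be sketched as follows. This is an illustration under stated assumptions rather than the original Prism-plus-Excel workflow: it uses `scipy.optimize.curve_fit`, takes the asymptotic SE from the diagonal of the covariance matrix it returns, builds each interval as best-fit value ± t* × SE with 8 degrees of freedom (10 points minus 2 parameters), and treats log as base 10. The names `ci_coverage` and `MODELS` are hypothetical.

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import t as student_t

# Parameters taken from the text: Start = 100, k = 0.3 per minute, SD = 10.
START, K_TRUE, SD = 100.0, 0.3, 10.0
TIMES = np.linspace(0.0, 20.0, 10)

# Each entry: model function and the true value of its second parameter.
MODELS = {
    "rate constant": (lambda t, s, k: s * np.exp(-k * t), K_TRUE),
    "time constant": (lambda t, s, tau: s * np.exp(-t / tau), 1.0 / K_TRUE),
    "log(rate constant)": (
        lambda t, s, logk: s * np.exp(-(10.0 ** logk) * t),
        np.log10(K_TRUE),
    ),
}


def ci_coverage(n_sets=300, seed=1):
    """Fraction of simulated data sets whose asymptotic 95% CI contains
    the true parameter value, for each way of writing the model."""
    rng = np.random.default_rng(seed)
    dof = TIMES.size - 2                   # data points minus parameters
    t_crit = student_t.ppf(0.975, dof)     # two-sided 95% critical value
    hits = {name: 0 for name in MODELS}
    used = {name: 0 for name in MODELS}
    for _ in range(n_sets):
        y = START * np.exp(-K_TRUE * TIMES) + rng.normal(0.0, SD, TIMES.size)
        for name, (model, truth) in MODELS.items():
            try:
                popt, pcov = curve_fit(model, TIMES, y, p0=[START, truth])
            except (RuntimeError, ValueError):
                continue  # skip the rare data set where the fit fails
            se = np.sqrt(pcov[1, 1])       # asymptotic standard error
            if not np.isfinite(se):
                continue
            low = popt[1] - t_crit * se
            high = popt[1] + t_crit * se
            hits[name] += int(low <= truth <= high)
            used[name] += 1
    return {name: hits[name] / used[name] for name in MODELS}


if __name__ == "__main__":
    for name, fraction in ci_coverage().items():
        print(f"{name}: {100 * fraction:.1f}% of intervals contain the true value")
```

With only a few hundred simulated data sets the fractions fluctuate by a percentage point or two from run to run, so a larger `n_sets` is needed to distinguish 93% from 95% reliably.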
All contents copyright © 1999 by GraphPad Software, Inc. All rights reserved.