curvefit.com. Guide to nonlinear regression.

curvefit.com was created by GraphPad Software, Inc. Send comments or questions to the author of these pages, Dr. Harvey Motulsky, president of GraphPad Software.

In April 2003, GraphPad released Prism 4 and published Fitting Models to Biological Data using Linear and Nonlinear Regression. This book includes all the information that comprises curvefit.com, and much more. You can read this book as a pdf file.

Example simulation 2. Exponential decay.

Many biological and chemical events follow an exponential decay model. For more information on this model, see Example model 2. Exponential decay. We'll compare three ways to express this model. In all cases, one of the parameters is the starting point, which we will call Start. The second parameter quantifies how rapidly the curve decays. We will compare three ways to express this value: as a rate constant in units of inverse time, as a time constant in units of time, or as a log(rate constant). The three equations are:

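$$Y = \mathrm{Start} \cdot e^{-k_{\mathrm{rate}} \, t}$$

$$Y = \mathrm{Start} \cdot e^{-t / k_{\mathrm{time}}}$$

$$Y = \mathrm{Start} \cdot e^{-10^{\log k_{\mathrm{rate}}} \, t}$$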
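The three forms describe the same curve, which can be checked numerically. The following is a minimal sketch, not part of Prism: the function names are ours, and we assume the log in log(rate constant) is base 10.

```python
import math

def decay_rate(t, start, k_rate):
    """Y = Start * exp(-k_rate * t); k_rate is in units of inverse time."""
    return start * math.exp(-k_rate * t)

def decay_tau(t, start, tau):
    """Y = Start * exp(-t / tau); tau is the time constant, in units of time."""
    return start * math.exp(-t / tau)

def decay_log_rate(t, start, log_k):
    """Y = Start * exp(-(10 ** log_k) * t); log_k is log10(rate constant) (assumed base 10)."""
    return start * math.exp(-(10.0 ** log_k) * t)

k_rate = 0.3                  # rate constant in min^-1 (the value used in the simulations)
tau = 1.0 / k_rate            # equivalent time constant, in minutes
log_k = math.log10(k_rate)    # equivalent log(rate constant)

for t in (0.0, 2.0, 10.0):
    print(t,
          decay_rate(t, 100.0, k_rate),
          decay_tau(t, 100.0, tau),
          decay_log_rate(t, 100.0, log_k))
```

Each row should show the same Y value three times, apart from floating-point rounding. The models differ only in which quantity the regression treats as the parameter.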

Which distribution is closer to Gaussian, rate constants or time constants?

Is it better to express the exponential decay equation in terms of rate constant, time constant, or log(rate constant)? We'll answer the question by simulating data.

First, we need to choose some parameters. We chose a curve that starts at Y=100 and decays exponentially towards 0 with a rate constant (koff) of 0.3 min⁻¹, which corresponds to a half-life of a bit more than 2 minutes (ln(2)/koff). Our simulations generated 10 data points equally spaced between 0 and 20 minutes, adding Gaussian random error with a standard deviation of 10. The graph below shows three sample simulations.
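One simulated data set with these settings could be generated as in the sketch below. This is not the Prism script used for this page. The constant and function names are ours, the seed is arbitrary, and we assume "equally spaced between 0 and 20 minutes" includes both endpoints.

```python
import math
import random

START = 100.0    # Y at time zero
K_OFF = 0.3      # rate constant, min^-1
SD = 10.0        # standard deviation of the Gaussian scatter
N_POINTS = 10    # points equally spaced from 0 to 20 minutes (endpoints assumed included)

def simulate(rng):
    """Return (times, ys) for one simulated experiment."""
    times = [20.0 * i / (N_POINTS - 1) for i in range(N_POINTS)]
    ys = [START * math.exp(-K_OFF * t) + rng.gauss(0.0, SD) for t in times]
    return times, ys

half_life = math.log(2) / K_OFF    # ln(2)/koff, about 2.31 minutes
rng = random.Random(1)             # fixed, arbitrary seed so the run is repeatable
times, ys = simulate(rng)

print(f"half-life = {half_life:.2f} min")
for t, y in zip(times, ys):
    print(f"{t:6.2f}  {y:8.2f}")
```

Calling simulate(rng) repeatedly with the same generator yields independent data sets, which is all the later steps need.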

We simulated 5000 sets of data, and fit each data set to the exponential decay model expressed in three ways. The distributions of the rate constant, time constant, and log(rate constant) are shown in the following figures, which also superimpose ideal Gaussian distributions.

At first glance, all three distributions look roughly Gaussian. Looking more carefully, you can see that the distribution of time constants is skewed to the right. Careful scrutiny reveals that the rate constant distribution is also a bit skewed. These impressions can be confirmed by a normality test. The results are shown in the following table.

Model      Rate constant   Time constant   Log(rate constant)
KS         0.06359         0.07169         0.01339
P value    P < 0.0001      P < 0.0001      P > 0.10

The KS value is the largest discrepancy between the actual cumulative distribution and an ideal cumulative Gaussian distribution (expressed as fractions). See The results of normality tests. None of the distributions is far from Gaussian. The distribution of time constants is the furthest from Gaussian; the distribution of log(rate constant) is closest. The P value answers the following question: If the true distribution is Gaussian, what is the chance of obtaining a KS value as large as, or larger than, the one we observed? The distributions of both rate constants and time constants deviated significantly from the Gaussian ideal, while the distribution of log(rate constant) is indistinguishable from Gaussian.

Because we simulated so many data sets, the KS test has the power to detect even modest deviations from a Gaussian distribution.
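The KS value described above can be computed directly from a list of best-fit values. The sketch below is our own illustration, not Prism's implementation. It assumes the reference Gaussian takes its mean and SD from the sample itself, and it does not compute the P value. The two test samples are synthetic, chosen only to show a near-Gaussian and a clearly skewed case.

```python
import math
import random

def normal_cdf(x, mean, sd):
    """Cumulative Gaussian distribution evaluated at x."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def ks_distance(values):
    """Largest gap between the empirical CDF and a Gaussian CDF whose
    mean and SD are estimated from the same values (an assumption)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    d = 0.0
    for i, v in enumerate(sorted(values)):
        f = normal_cdf(v, mean, sd)
        # The empirical CDF steps from i/n to (i+1)/n at v; check both sides.
        d = max(d, abs(f - i / n), abs((i + 1) / n - f))
    return d

rng = random.Random(0)                                # arbitrary seed
gaussian = [rng.gauss(0.0, 1.0) for _ in range(5000)]
skewed = [math.exp(v) for v in gaussian]              # lognormal, strongly right-skewed

print("Gaussian sample:", ks_distance(gaussian))
print("Skewed sample:  ", ks_distance(skewed))
```

The skewed sample should give a much larger distance than the Gaussian one. With 5000 values, even a small distance can produce a tiny P value, which is the point made about power above.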

How accurate are the confidence intervals of time constants and rate constants?

Before understanding the complexities of the confidence intervals reported by nonlinear regression, first review the meaning of confidence intervals reported by linear regression.

For example, assume that a linear regression presented the CI of the slope as 8.2 to 11.3. If you can accept all the assumptions of the analysis, this means that you can be 95% certain this range includes the true slope. More precisely, if you analyze many data sets, you'd expect that 95% of the confidence intervals will contain the true value, and 5% will not. When analyzing a particular experiment, the true value is unknown so you can't know whether or not the confidence interval includes the true value. All you can know is that there is a 95% chance that the interval contains the true value.

With nonlinear regression, the situation is not so simple. The confidence intervals reported by Prism (and virtually all other nonlinear regression programs) are based on some mathematical simplifications. They are called "asymptotic" or "approximate" confidence intervals. They are calculated assuming that the equation is linear, but are applied to nonlinear equations. This simplification means that the intervals can be too optimistic. While labeled 95% confidence intervals, they may contain the true value less than 95% of the time.

Using simulated data, we can ask how often the reported 95% confidence interval contains the true value. To do this, we changed the script to also save the SE of the best-fit value. We then imported the best-fit values and standard errors into Microsoft Excel, computed the high and low limits of the confidence interval (see Confidence intervals of best-fit values), and asked whether or not each interval contained the true value. When analyzing data, you don't know the true value, so you can't know whether the confidence interval contains the true value or not. With simulated data, you know the true value, so you can answer that question. The results are:

Model                 Fraction of "95% confidence intervals" that contain the true value
Rate constant         93.72%
Time constant         94.44%
Log(rate constant)    94.94%

No matter how we expressed the model, the confidence intervals contained the true value almost 95% of the time. The difference between 95% confidence and 93% confidence is unlikely to alter your interpretation of experimental results.
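The coverage check can be reproduced in outline without Prism or Excel. The sketch below is our own stand-in, not the script used for this page, and rests on several assumptions. It uses a small hand-written damped least-squares (Levenberg-Marquardt style) fit that starts from the true values. It fits the rate-constant form once and converts the best-fit value and asymptotic SE to the other two forms, which should match fitting those forms directly, since least-squares estimates and asymptotic SEs transform consistently between parameterizations. It builds each interval as best-fit value ± t × SE with 8 degrees of freedom, assumes log is base 10, and runs fewer simulations than the 5000 used here.

```python
import math
import random

START, K_TRUE, SD = 100.0, 0.3, 10.0
TIMES = [20.0 * i / 9 for i in range(10)]  # 10 points from 0 to 20 minutes
T_CRIT = 2.306  # two-tailed 95% t value for 10 - 2 = 8 degrees of freedom

def sum_sq(y, s, k):
    """Sum of squared residuals for Y = s * exp(-k * t)."""
    return sum((yi - s * math.exp(-k * ti)) ** 2 for ti, yi in zip(TIMES, y))

def normal_eq(y, s, k):
    """Entries of J'J (a, b, c) and J'r (g_s, g_k) at the current parameters."""
    a = b = c = g_s = g_k = 0.0
    for ti, yi in zip(TIMES, y):
        e = math.exp(-k * ti)
        j_s, j_k = e, -s * ti * e  # dY/ds and dY/dk
        r = yi - s * e
        a += j_s * j_s
        b += j_s * j_k
        c += j_k * j_k
        g_s += j_s * r
        g_k += j_k * r
    return a, b, c, g_s, g_k

def fit(y, s=START, k=K_TRUE, max_iter=50):
    """Damped least-squares fit; returns (best-fit k, asymptotic SE of k)."""
    lam, best = 1e-3, sum_sq(y, s, k)
    for _ in range(max_iter):
        a, b, c, g_s, g_k = normal_eq(y, s, k)
        a_d, c_d = a * (1 + lam), c * (1 + lam)  # damp the diagonal
        det = a_d * c_d - b * b
        d_s = (c_d * g_s - b * g_k) / det
        d_k = (a_d * g_k - b * g_s) / det
        trial = sum_sq(y, s + d_s, k + d_k)
        if trial < best:  # accept the step and relax the damping
            done = best - trial < 1e-10 * best
            s, k, best, lam = s + d_s, k + d_k, trial, lam / 10
            if done:
                break
        else:  # reject the step and damp harder
            lam *= 10
    a, b, c, _, _ = normal_eq(y, s, k)
    var_k = (best / (len(y) - 2)) * a / (a * c - b * b)
    return k, math.sqrt(var_k)

def coverage(n_sims=1000, seed=0):
    """Fraction of nominal 95% intervals that contain the true value."""
    rng = random.Random(seed)
    true = {"rate constant": K_TRUE,
            "time constant": 1.0 / K_TRUE,
            "log(rate constant)": math.log10(K_TRUE)}
    hits = dict.fromkeys(true, 0)
    for _ in range(n_sims):
        y = [START * math.exp(-K_TRUE * t) + rng.gauss(0.0, SD) for t in TIMES]
        k, se_k = fit(y)
        # Convert the best-fit value and its asymptotic SE to the other forms.
        est = {"rate constant": (k, se_k),
               "time constant": (1.0 / k, se_k / k ** 2),
               "log(rate constant)": (math.log10(k), se_k / (k * math.log(10)))}
        for name, (value, se) in est.items():
            if abs(value - true[name]) <= T_CRIT * se:
                hits[name] += 1
    return {name: h / n_sims for name, h in hits.items()}

for name, frac in coverage().items():
    print(f"{name}: {frac:.1%}")
```

We would expect the printed fractions to be in the same neighborhood as the table above, though not identical, because the random numbers and the number of simulations differ.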

How to compare parameter distributions using Prism


All contents copyright © 1999 by GraphPad Software, Inc. All rights reserved.