# Statistics Topics

Alpha Risk
The probability of rejecting a null hypothesis when it is true. It is the probability of making a Type I error.
Arithmetic Mean
The arithmetic mean of n numbers is the sum of the numbers divided by n.
Autocovariance
The covariance of a process with itself at two different points in time, i.e. the degree to which a function is correlated with a time-shifted copy of itself.
Autoregressive Moving Average Model
Essentially an infinite impulse response filter with both poles (the autoregressive part) and zeros (the moving average part), with some additional interpretation placed on it.
Average Deviation
The absolute value of the difference from the mean for each data value, summed, then divided by the number of values.
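The average deviation described above can be sketched in a few lines of Python. The function name `average_deviation` and the sample values are illustrative assumptions, not part of the original entry.

```python
def average_deviation(values):
    """Mean of the absolute differences between each value and the mean."""
    n = len(values)
    mean = sum(values) / n
    return sum(abs(v - mean) for v in values) / n


sample = [2, 4, 6, 8]  # illustrative data; the mean is 5
print(average_deviation(sample))  # deviations 3, 1, 1, 3 -> 8 / 4 = 2.0
```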
Bernoulli Trial
An experiment with only two possible outcomes, such as success or failure, heads or tails, or good or bad.
Beta Distribution
A distribution used for continuous random variables which are constrained to lie between 0 and 1.
Beta Risk
The probability of not rejecting a null hypothesis when it is false. It is the probability of making a Type II error.
Binomial Distribution
A distribution which gives the probability of observing successes in a fixed number of independent Bernoulli trials.
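As a minimal sketch, the probability of exactly k successes in n independent Bernoulli trials, each with success probability p, is C(n, k) p^k (1 - p)^(n - k). The function name `binomial_pmf` is a hypothetical helper introduced for illustration.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent Bernoulli trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Two heads in four tosses of a fair coin: 6 * 0.25 * 0.25
print(binomial_pmf(2, 4, 0.5))  # 0.375
```

Summing `binomial_pmf(k, n, p)` over k = 0..n should give 1, up to floating-point rounding.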
Capability Indices
Indices computed in the process capability procedure to measure how well a sample of data conforms to process specifications.
Class Boundary
A point that is the left endpoint of one class interval, and the right endpoint of another class interval.
Class Interval
The non-overlapping intervals of a histogram.
Complementary Probability
Considering probabilities in decimal form, two probabilities whose sum is equal to one.
Concordant
A pair of cases for two ordered data variables in which values for the first case are either both higher or both lower than the values of the variables for the second case.
Conditional Probability
The probability of an event occurring given that another event also occurs.
Confidence Interval
A range of values which is believed to include, with a preassigned degree of confidence, the true characteristic of the lot or universe.
Confidence Level
The degree of desired trust or assurance in a given result.
Confidence Limits
The upper and lower extremes of the confidence interval.
Continuous Data
Data whose values may take on any value within a finite or infinite interval.
Controlled Experiment
An experiment that uses the method of comparison to evaluate the effect of a treatment by comparing treated subjects with a control group, who do not receive the treatment.
Covariance
A measure of the joint variability of a pair of numeric variables.
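One common way to compute covariance from paired samples is sketched below. The entry does not give a formula, so the n - 1 divisor (the usual sample convention) and the function name `sample_covariance` are assumptions.

```python
def sample_covariance(xs, ys):
    """Sample covariance of two equal-length sequences (divisor n - 1)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of paired deviations from the two means
    total = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return total / (n - 1)


print(sample_covariance([1, 2, 3], [2, 4, 6]))  # (2 + 0 + 2) / 2 = 2.0
```

A positive result indicates that the two variables tend to move together; a negative result indicates that they tend to move in opposite directions.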
Cumulative Probability
The probability that a random variable will be less than or equal to a specified value.
Data
A series of facts or statements that may have been collected, stored, processed or manipulated but have not been organized.
Data Mining
Using automated data analysis techniques to find themes or relationships.
Data Processing
The execution of a systematic sequence of operations performed upon data. Synonymous with information processing.
Deciles
The 10th, 20th, 30th, ..., 90th percentile points.
Discordant
A pair of cases for two ordered data variables in which the value of one variable for the first case is higher or lower than its value in the second case, and the relative relationship is switched for the second variable.
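The Concordant and Discordant definitions can be illustrated by counting pairs of cases. This is a sketch only: the function name `count_pairs` is hypothetical, and treating tied pairs as neither concordant nor discordant is an assumption.

```python
from itertools import combinations


def count_pairs(xs, ys):
    """Return (concordant, discordant) counts over all pairs of cases."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:    # both variables move the same way
            concordant += 1
        elif direction < 0:  # the ordering is switched for the second variable
            discordant += 1
        # direction == 0 is a tie and is counted as neither
    return concordant, discordant


print(count_pairs([1, 2, 3], [1, 3, 2]))  # (2, 1)
```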
Distribution
A probability function which describes the relative frequency of occurrence of data values when sampled from a population.
Double Blind Experiment
Neither the subjects nor the people evaluating the subjects know who is in the treatment group and who is in the control group.
End Point Convention
In histograms, you need to decide where to count values that are on the exact boundary between two intervals: either in the left or in the right interval.
Experimental Probability
The chances of something happening, based on repeated testing and observing results.
Frequency Distribution
An organized display of a set of data that shows how often each different piece of data occurs.
Gaussian Curve
Normal Distribution.
Gaussian Distribution
A continuous probability distribution that often gives a good description of data that cluster around the mean.
Geometric Mean
A statistic calculated by multiplying n data values together and taking the n-th root of the result.
Harmonic Mean
The harmonic mean of two numbers a and b is 2ab/(a + b).
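The arithmetic, geometric and harmonic means can be compared side by side. The sketch below generalises the two-number harmonic mean to n values as n divided by the sum of reciprocals, which reduces to 2ab/(a + b) when n = 2. The function names are illustrative and positive inputs are assumed.

```python
from math import prod


def arithmetic_mean(values):
    return sum(values) / len(values)


def geometric_mean(values):
    # n-th root of the product of the n values
    return prod(values) ** (1 / len(values))


def harmonic_mean(values):
    # n divided by the sum of reciprocals
    return len(values) / sum(1 / v for v in values)


pair = [2, 8]
print(arithmetic_mean(pair))  # (2 + 8) / 2 = 5.0
print(geometric_mean(pair))   # sqrt(16) = 4.0
print(harmonic_mean(pair))    # 2 * 2 * 8 / (2 + 8) = 3.2
```

For positive values that are not all equal, the harmonic mean is smaller than the geometric mean, which is in turn smaller than the arithmetic mean.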
Histogram
A graphical display showing the distribution of data values in a sample by dividing the range of the data into non-overlapping intervals and counting the number of values which fall into each interval.
Inter-Quartile Range
For a list of numbers this is the upper quartile minus the lower quartile.
Joint Probability
The probability of two or more events happening at the same time.
Law of Averages
The average of independent observations of random variables that have the same probability distribution is increasingly likely to be close to the expected value of the random variables as the number of observations grows.
Law of Large Numbers
In repeated, independent trials with the same probability p of success in each trial, the percentage of successes is increasingly likely to be close to the chance of success as the number of trials increases.
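A small simulation can illustrate this behaviour. The sketch below assumes a pseudo-random generator with a fixed seed so that runs are repeatable; the function name `success_fraction` and the chosen trial counts are illustrative.

```python
import random


def success_fraction(num_trials, p, seed=0):
    """Fraction of successes in num_trials simulated Bernoulli(p) trials."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    successes = sum(1 for _ in range(num_trials) if rng.random() < p)
    return successes / num_trials


# The fraction tends to settle near p = 0.5 as the number of trials grows
for n in (10, 1_000, 100_000):
    print(n, success_fraction(n, 0.5))
```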
Least Squares
Any statistical procedure that involves minimizing the sum of squared differences.
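As one concrete instance, fitting a straight line to paired data by least squares has a closed-form solution. The sketch below assumes a single explanatory variable; `fit_line` and the data are illustrative.

```python
def fit_line(xs, ys):
    """Slope and intercept minimising the sum of squared vertical differences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


xs = [1, 2, 3]
ys = [2, 3, 5]
slope, intercept = fit_line(xs, ys)
# Differences between observed and fitted values; they sum to roughly zero
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
print(slope, intercept, residuals)
```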
Lower Quartile
The 25th percentile, calculated by ordering the data from smallest to largest and finding the value which lies 25% of the way up through the data.
Maximum Likelihood Estimate
The value of a parameter that maximizes the likelihood function, i.e. the value under which the observed data are most probable for the assumed statistical model.
Mean
The sum of all values in the data, divided by the number of values.
Mean Time to Failure
The measured operating time of a system or component divided by the number of failures that occurred during that time.
Mode
The most frequently occurring value in a sequence of numbers.
Multimodal Distribution
A distribution with more than one mode.
Negative Binomial Distribution
A discrete probability distribution useful for characterizing the number of Bernoulli trials needed to reach a specified number of successes. Sometimes called the Pascal distribution.
Normal Distribution
A continuous probability distribution that often gives a good description of data that cluster around the mean. Also known as Gaussian Distribution or Bell Curve.
Odds
The ratio of the probability that an event will happen to the probability that it will not.
Ordinal Data
A set of data is said to be ordinal if the values belonging to it can be put in order or have a rating scale attached.
Pareto Distribution
A heavy-tailed distribution used for random variables which are constrained to be greater than or equal to some positive minimum value.
Partial Correlation
A measure of the strength of the relationship between two or more numeric variables having accounted for their joint relationship with one or more additional variables.
Pascal Distribution
A discrete probability distribution useful for characterizing the number of Bernoulli trials needed to reach a specified number of successes. See Negative Binomial Distribution.
Poisson Distribution
A distribution often used to express probabilities concerning the number of events per unit.
Population
The entire set of individuals, items or values about which conclusions are to be drawn, and from which samples are taken.
Probability
A number between 0 and 1 which represents how likely an event is to occur.
Process Capability
A measurable property of a process, assessed relative to its specification.
Pure Error
Variability between observations made at the same values of the independent variable or variables.
Quartiles
Statistics which divide the observations in a numeric sample into 4 intervals, each containing 25% of the data.
Random Experiment
An experiment or trial whose outcome is not perfectly predictable, but for which the long-run relative frequency of outcomes of different types in repeated trials is predictable.
Randomised Controlled Experiment
A controlled experiment in which the assignment of subjects to the treatment group or control group is done at random, e.g. by drawing straws.
Rayleigh Distribution
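A continuous distribution for non-negative random variables, such as the magnitude of a two-dimensional vector whose components are independent, zero-mean normal variables with equal variance. An example is the variation of wave height in a sea where swell is the main component.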
Relative Standard Deviation
A measure of precision, calculated by dividing the standard deviation for a series of measurements by the average measurement.
Residual
The observed value minus the predicted value.
Residual Plot
A plot of the residuals from the regression against the explanatory variable.
Sample
A set of observations, usually considered to have been taken from a much larger population.
Sample Size
The number of elements in a sample from a population.
Sample Survey
A survey based on the responses of a sample of individuals, rather than the entire population.
Skewed Distribution
A distribution that is not symmetrical.
Spatial Sampling
Sampling in two or more dimensions.
SPC
Abbreviation of Statistical Process Control.
Standard Deviation
Standard deviation is the square root of the variance.
Standard Error
The standard deviation divided by the square root of the number of data values.
Standard Normal Distribution
A normal distribution with a mean equal to 0 and a standard deviation equal to 1.
Statistic
Anything that can be calculated from a sample of data.
Statistical Model
A statistical model is used to describe the relationship between a dependent variable Y and one or more independent variables.
Statistical Process Control
Statistical techniques to measure and analyse the extent to which a process deviates from a set standard.
Student's t Distribution
A probability distribution which is very similar in shape to the standard normal distribution.
Summary Statistics
A single number representation of the characteristics of a set of data. Usually given by measures of central tendency and measures of dispersion.
Target Population
The entire group a researcher is interested in, the group about which the researcher wishes to draw conclusions.
Theoretical Probability
The chances of events happening as determined by calculating results that would occur under ideal circumstances.
Time Series
A sample of data values collected at equally spaced points in time.
t test
A hypothesis test based on Student's t distribution.
Type I Error
Incorrectly rejecting a true null hypothesis. The probability of such an error is the alpha risk.
Type II Error
Not rejecting a false null hypothesis. The probability of such an error is called the beta risk.
Unbiased
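Having no bias. An estimator is unbiased if its expected value equals the true value of the quantity being estimated.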
Uncontrolled Experiment
An experiment in which there is no control group.
Upper Bound
A plausible upper limit to the true value of a quantity, usually not a true statistical confidence limit.
Upper Quartile
The 75th percentile, calculated by ordering the data from smallest to largest and finding the value which lies 75% of the way up through the data.
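The quartile definitions ("the value which lies 25% or 75% of the way up through the data") leave open how to handle positions that fall between two data values. The sketch below assumes linear interpolation between neighbouring ordered values, which is only one of several conventions in use; `percentile` is a hypothetical helper.

```python
def percentile(values, fraction):
    """Value lying `fraction` of the way up through the ordered data."""
    ordered = sorted(values)
    position = fraction * (len(ordered) - 1)
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    weight = position - lower
    # Interpolate linearly between the two neighbouring ordered values
    return ordered[lower] + weight * (ordered[upper] - ordered[lower])


data = [7, 1, 9, 3, 5, 2, 8, 4, 6]
lower_quartile = percentile(data, 0.25)  # 3.0
upper_quartile = percentile(data, 0.75)  # 7.0
inter_quartile_range = upper_quartile - lower_quartile
print(lower_quartile, upper_quartile, inter_quartile_range)  # 3.0 7.0 4.0
```

Other conventions can give slightly different quartiles for small samples, so results from different software packages may not match exactly.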
Variance
The square of the difference from the mean for each data value, summed and divided by one less than the number of values.
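The Variance, Standard Deviation, Standard Error and Relative Standard Deviation entries chain together, as the following sketch shows. The function name `sample_variance` and the data values are illustrative.

```python
from math import sqrt


def sample_variance(values):
    """Sum of squared differences from the mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


data = [2, 4, 4, 4, 5, 5, 7, 9]               # mean is 5
variance = sample_variance(data)               # 32 / 7
std_dev = sqrt(variance)                       # square root of the variance
std_error = std_dev / sqrt(len(data))          # std dev / sqrt(number of values)
relative_std_dev = std_dev / (sum(data) / len(data))  # std dev / average
print(variance, std_dev, std_error, relative_std_dev)
```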
Weibull Distribution
A distribution used for random variables which are constrained to be greater than or equal to 0.

Subjects: Mathematics