<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://en.formulasearchengine.com/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=38.109.87.242</id>
	<title>formulasearchengine - User contributions [en]</title>
	<link rel="self" type="application/atom+xml" href="https://en.formulasearchengine.com/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=38.109.87.242"/>
	<link rel="alternate" type="text/html" href="https://en.formulasearchengine.com/wiki/Special:Contributions/38.109.87.242"/>
	<updated>2026-05-02T01:10:09Z</updated>
	<subtitle>User contributions</subtitle>
	<generator>MediaWiki 1.43.0-wmf.28</generator>
	<entry>
		<id>https://en.formulasearchengine.com/index.php?title=Observed_information&amp;diff=16563</id>
		<title>Observed information</title>
		<link rel="alternate" type="text/html" href="https://en.formulasearchengine.com/index.php?title=Observed_information&amp;diff=16563"/>
		<updated>2013-06-18T18:34:52Z</updated>

		<summary type="html">&lt;p&gt;38.109.87.242: /* Fisher information */ Elaboration that Fisher information corresponds to a single observation distributed according to the hypothetical model.&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Refimprove|date=February 2008}}&lt;br /&gt;
&lt;br /&gt;
The &#039;&#039;&#039;sample mean&#039;&#039;&#039; or &#039;&#039;&#039;empirical mean&#039;&#039;&#039; and the &#039;&#039;&#039;sample covariance&#039;&#039;&#039; are [[statistic]]s computed from a collection of data on one or more [[random variables]]. The sample mean is a [[vector (mathematics)|vector]], each of whose elements is the sample mean of one of the random variables{{spaced ndash}}that is, each of whose elements is the [[arithmetic average]] of the observed values of one of the variables. The sample covariance matrix is a square [[Matrix (mathematics)|matrix]] whose &#039;&#039;i, j&#039;&#039; element is the sample covariance (an estimate of the population covariance) between the sets of observed values of two of the variables and whose &#039;&#039;i, i&#039;&#039; element is the sample variance of the observed values of one of the variables. If only one variable is observed, then the sample mean is a single number (the arithmetic average of the observed values of that variable) and the sample covariance matrix is also simply a single value (the sample variance of the observed values of that variable).&lt;br /&gt;
&lt;br /&gt;
==Sample mean==&lt;br /&gt;
{{main|Arithmetic mean}}&lt;br /&gt;
&lt;br /&gt;
Let &amp;lt;math&amp;gt;x_{ij}&amp;lt;/math&amp;gt; be the &#039;&#039;i&#039;&#039;&amp;lt;sup&amp;gt;th&amp;lt;/sup&amp;gt; independently drawn observation (&#039;&#039;i=1,...,N&#039;&#039;) on the &#039;&#039;j&#039;&#039;&amp;lt;sup&amp;gt;th&amp;lt;/sup&amp;gt; random variable (&#039;&#039;j=1,...,K&#039;&#039;). These observations can be arranged into &#039;&#039;N&#039;&#039;&lt;br /&gt;
column vectors, each with &#039;&#039;K&#039;&#039; entries, where the &#039;&#039;K&#039;&#039;×1 column vector of the &#039;&#039;i&#039;&#039;&amp;lt;sup&amp;gt;th&amp;lt;/sup&amp;gt; observations on all variables is denoted &amp;lt;math&amp;gt;\mathbf{x}_i&amp;lt;/math&amp;gt; (&#039;&#039;i=1,...,N&#039;&#039;).&lt;br /&gt;
&lt;br /&gt;
The &#039;&#039;&#039;sample mean vector&#039;&#039;&#039; &amp;lt;math&amp;gt;\mathbf{\bar{x}}&amp;lt;/math&amp;gt; is a column vector whose &#039;&#039;j&#039;&#039;&amp;lt;sup&amp;gt;th&amp;lt;/sup&amp;gt; element &amp;lt;math&amp;gt;\bar{x}_{j}&amp;lt;/math&amp;gt; is the average value of the &#039;&#039;N&#039;&#039; observations of the &#039;&#039;j&#039;&#039;&amp;lt;sup&amp;gt;th&amp;lt;/sup&amp;gt; variable:&lt;br /&gt;
&lt;br /&gt;
:&amp;lt;math&amp;gt; \bar{x}_{j}=\frac{1}{N}\sum_{i=1}^{N}x_{ij},\quad j=1,\ldots,K. &amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Thus, the sample mean vector contains the average of the observations for each variable, and is written&lt;br /&gt;
&lt;br /&gt;
:&amp;lt;math&amp;gt; \mathbf{\bar{x}}=\frac{1}{N}\sum_{i=1}^{N}\mathbf{x}_i. &amp;lt;/math&amp;gt;&lt;br /&gt;
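As a quick numeric sketch (the data values below are hypothetical, not taken from the article), the sample mean vector is just the column-wise average of the observation vectors:

```python
# Hypothetical data: N = 4 observation vectors x_i, each on K = 2 variables.
data = [[1.0, 2.0],
        [3.0, 4.0],
        [5.0, 6.0],
        [7.0, 8.0]]

def sample_mean(xs):
    """The j-th element is (1/N) * sum over i of x_ij."""
    n, k = len(xs), len(xs[0])
    return [sum(x[j] for x in xs) / n for j in range(k)]

print(sample_mean(data))  # [4.0, 5.0]
```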
&lt;br /&gt;
==Sample covariance==&lt;br /&gt;
{{move portions|Estimation of covariance matrices|section=y|small=left|date=February 2013}}&lt;br /&gt;
The &#039;&#039;&#039;sample covariance matrix&#039;&#039;&#039; is a &#039;&#039;K&#039;&#039;-by-&#039;&#039;K&#039;&#039; [[Matrix (mathematics)|matrix]] &amp;lt;math&amp;gt;\textstyle \mathbf{Q}=\left[  q_{jk}\right]  &amp;lt;/math&amp;gt; with entries &lt;br /&gt;
&lt;br /&gt;
:&amp;lt;math&amp;gt; q_{jk}=\frac{1}{N-1}\sum_{i=1}^{N}\left(  x_{ij}-\bar{x}_j \right)  \left( x_{ik}-\bar{x}_k \right), &amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
where &amp;lt;math&amp;gt;q_{jk}&amp;lt;/math&amp;gt; is an estimate of the [[covariance]] between the &#039;&#039;j&#039;&#039;&amp;lt;sup&amp;gt;th&amp;lt;/sup&amp;gt;&lt;br /&gt;
variable and the &#039;&#039;k&#039;&#039;&amp;lt;sup&amp;gt;th&amp;lt;/sup&amp;gt; variable of the population underlying the data.&lt;br /&gt;
In terms of the observation vectors, the sample covariance is&lt;br /&gt;
:&amp;lt;math&amp;gt;\mathbf{Q} = {1 \over {N-1}}\sum_{i=1}^N (\mathbf{x}_i-\mathbf{\bar{x}}) (\mathbf{x}_i-\mathbf{\bar{x}})^\mathrm{T}.&amp;lt;/math&amp;gt;&lt;br /&gt;
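The entry-wise definition of the sample covariance can be checked directly on a small hypothetical data set; this is an illustrative sketch, not library code:

```python
# Hypothetical data: N = 4 observations (rows are x_i) on K = 2 variables.
data = [[1.0, 2.0],
        [3.0, 4.0],
        [5.0, 6.0],
        [7.0, 8.0]]

def sample_covariance(xs):
    """K-by-K matrix with q_jk = (1/(N-1)) * sum_i (x_ij - xbar_j)(x_ik - xbar_k)."""
    n, k = len(xs), len(xs[0])
    mean = [sum(x[j] for x in xs) / n for j in range(k)]
    q = [[0.0] * k for _ in range(k)]
    for x in xs:
        d = [x[j] - mean[j] for j in range(k)]   # deviation from the sample mean
        for j in range(k):
            for m in range(k):
                q[j][m] += d[j] * d[m] / (n - 1)
    return q

Q = sample_covariance(data)
print(Q[0][0])  # 20/3, the sample variance of the first variable
```

Because the two hypothetical variables here are perfectly correlated, every entry of Q equals 20/3.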
&lt;br /&gt;
Alternatively, arranging the observation vectors as the columns of a matrix,&lt;br /&gt;
:&amp;lt;math&amp;gt;\mathbf{F} = \begin{bmatrix}\mathbf{x}_1 &amp;amp; \mathbf{x}_2 &amp;amp; \dots &amp;amp; \mathbf{x}_N \end{bmatrix},&amp;lt;/math&amp;gt;&lt;br /&gt;
gives a matrix with &#039;&#039;K&#039;&#039; rows and &#039;&#039;N&#039;&#039; columns.&lt;br /&gt;
The sample covariance matrix can then be computed as&lt;br /&gt;
:&amp;lt;math&amp;gt;\mathbf{Q} = \frac{1}{N-1}( \mathbf{F} - \mathbf{\bar{x}} \,\mathbf{1}_N^\mathrm{T} ) ( \mathbf{F} - \mathbf{\bar{x}} \,\mathbf{1}_N^\mathrm{T} )^\mathrm{T}&amp;lt;/math&amp;gt;,&lt;br /&gt;
where &amp;lt;math&amp;gt;\mathbf{1}_N&amp;lt;/math&amp;gt; is an &#039;&#039;N&#039;&#039;×1 vector of ones.&lt;br /&gt;
If the observations are arranged as rows instead of columns, so &amp;lt;math&amp;gt;\mathbf{\bar{x}}&amp;lt;/math&amp;gt; is now a 1×&#039;&#039;K&#039;&#039; row vector and &amp;lt;math&amp;gt;\mathbf{M}=\mathbf{F}^\mathrm{T}&amp;lt;/math&amp;gt; is an &#039;&#039;N&#039;&#039;×&#039;&#039;K&#039;&#039; matrix whose column &#039;&#039;j&#039;&#039; is the vector of &#039;&#039;N&#039;&#039; observations on variable &#039;&#039;j&#039;&#039;, then applying transposes &lt;br /&gt;
in the appropriate places yields&lt;br /&gt;
:&amp;lt;math&amp;gt;\mathbf{Q} = \frac{1}{N-1}( \mathbf{M} -  \mathbf{1}_N \mathbf{\bar{x}} )^\mathrm{T} ( \mathbf{M} - \mathbf{1}_N \mathbf{\bar{x}} ).&amp;lt;/math&amp;gt;&lt;br /&gt;
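Assuming NumPy is available, the matrix form above can be verified against `np.cov`, which likewise treats rows as variables and divides by &#039;&#039;N&#039;&#039;&amp;nbsp;−&amp;nbsp;1 by default (the data values are hypothetical):

```python
import numpy as np

# Hypothetical data: columns of F are the N = 4 observation vectors (K = 2).
F = np.array([[1.0, 3.0, 5.0, 7.0],
              [2.0, 4.0, 6.0, 8.0]])
N = F.shape[1]
xbar = F.mean(axis=1, keepdims=True)       # K-by-1 sample mean vector
centered = F - xbar @ np.ones((1, N))      # F - xbar * 1_N^T
Q = centered @ centered.T / (N - 1)

# np.cov's default (rowvar=True, denominator N-1) gives the same estimate.
print(np.allclose(Q, np.cov(F)))  # True
```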
&lt;br /&gt;
==Discussion==&lt;br /&gt;
The sample mean and the sample covariance matrix are [[Bias of an estimator|unbiased estimates]] of the [[mean]] and the [[covariance matrix]] of the [[random vector]] &amp;lt;math&amp;gt;\textstyle \mathbf{X}&amp;lt;/math&amp;gt;, a row vector whose &#039;&#039;j&#039;&#039;&amp;lt;sup&amp;gt;th&amp;lt;/sup&amp;gt; element (&#039;&#039;j = 1, ..., K&#039;&#039;) is one of the random variables.&amp;lt;ref name=&amp;quot;JohnsonWichern2007&amp;quot;&amp;gt;{{cite book|author1=Richard Arnold Johnson|author2=Dean W. Wichern|title=Applied Multivariate Statistical Analysis|url=http://books.google.com/books?id=gFWcQgAACAAJ|accessdate=10 August 2012|year=2007|publisher=Pearson Prentice Hall|isbn=978-0-13-187715-3}}&amp;lt;/ref&amp;gt; The sample covariance matrix has &amp;lt;math&amp;gt;\textstyle N-1&amp;lt;/math&amp;gt; in the denominator rather than &amp;lt;math&amp;gt;\textstyle N&amp;lt;/math&amp;gt; due to a variant of [[Bessel&#039;s correction]]: in short, the sample covariance relies on the difference between each observation and the sample mean, but the sample mean is slightly correlated with each observation, since it is defined in terms of all the observations. If the population mean &amp;lt;math&amp;gt;\operatorname{E}(\mathbf{X})&amp;lt;/math&amp;gt; is known, the analogous unbiased estimate &lt;br /&gt;
&lt;br /&gt;
:&amp;lt;math&amp;gt; q_{jk}=\frac{1}{N}\sum_{i=1}^N \left(  x_{ij}-\operatorname{E}(X_j)\right)  \left( x_{ik}-\operatorname{E}(X_k)\right), &amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
using the population mean, has &amp;lt;math&amp;gt;\textstyle N&amp;lt;/math&amp;gt; in the denominator. This is an example of why in probability and statistics it is essential to distinguish between [[random variable]]s (upper case letters) and [[Realization (probability)|realizations]] of the random variables (lower case letters).&lt;br /&gt;
&lt;br /&gt;
The [[maximum likelihood]]  [[Estimation of covariance matrices|estimate of the covariance]]&lt;br /&gt;
&lt;br /&gt;
:&amp;lt;math&amp;gt; q_{jk}=\frac{1}{N}\sum_{i=1}^N \left(  x_{ij}-\bar{x}_j \right)  \left( x_{ik}-\bar{x}_k \right) &amp;lt;/math&amp;gt; &lt;br /&gt;
&lt;br /&gt;
for the [[Gaussian distribution]] case has &#039;&#039;N&#039;&#039; in the denominator as well. The ratio of 1/&#039;&#039;N&#039;&#039; to 1/(&#039;&#039;N&#039;&#039;&amp;amp;nbsp;&amp;amp;minus;&amp;amp;nbsp;1) approaches 1 for large&amp;amp;nbsp;&#039;&#039;N&#039;&#039;, so the maximum likelihood estimate approximately equals the unbiased estimate when the sample is large.&lt;br /&gt;
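The two denominators differ only by the factor (&#039;&#039;N&#039;&#039;&amp;nbsp;−&amp;nbsp;1)/&#039;&#039;N&#039;&#039;, which the following sketch (with hypothetical data) makes concrete:

```python
# One hypothetical variable: compare the unbiased (N-1) and ML (N) estimates.
xs = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
n = len(xs)
m = sum(xs) / n
ss = sum((x - m) ** 2 for x in xs)   # sum of squared deviations
unbiased = ss / (n - 1)              # Bessel-corrected sample variance
mle = ss / n                         # Gaussian maximum likelihood estimate
print(mle / unbiased)                # the ratio is (n-1)/n, here 7/8
```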
&lt;br /&gt;
==Variance of the sample mean==&lt;br /&gt;
{{main|Standard error of the mean}}&lt;br /&gt;
For each random variable, the sample mean is a good [[estimator]] of the population mean, where a &amp;quot;good&amp;quot; estimator is defined as being efficient and unbiased. Of course the estimator will likely not be the true value of the [[Statistical population|population]] mean since different samples drawn from the same distribution will give different sample means and hence different estimates of the true mean. Thus the sample mean is a [[random variable]], not a constant, and consequently has its own distribution. For a random sample of &#039;&#039;N&#039;&#039; observations on the &#039;&#039;j&#039;&#039;&amp;lt;sup&amp;gt;th&amp;lt;/sup&amp;gt; random variable, the sample mean&#039;s distribution itself has mean equal to the population mean &amp;lt;math&amp;gt;E(X_j)&amp;lt;/math&amp;gt; and variance equal to  &amp;lt;math&amp;gt; \frac{\sigma^2_j}{N},&amp;lt;/math&amp;gt; where &amp;lt;math&amp;gt;\sigma^2_j&amp;lt;/math&amp;gt; is the variance of the random variable &#039;&#039;X&#039;&#039;&amp;lt;sub&amp;gt;j&amp;lt;/sub&amp;gt;.&lt;br /&gt;
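A seeded Monte Carlo sketch (the sample size and population variance are chosen arbitrarily) illustrates that the variance of the sample mean is close to &amp;lt;math&amp;gt;\sigma^2_j / N&amp;lt;/math&amp;gt;:

```python
import random

random.seed(0)
N, sigma2, trials = 25, 4.0, 20000     # hypothetical sample size and variance
means = []
for _ in range(trials):
    sample = [random.gauss(0.0, sigma2 ** 0.5) for _ in range(N)]
    means.append(sum(sample) / N)      # one realization of the sample mean
grand = sum(means) / trials
var_of_mean = sum((m - grand) ** 2 for m in means) / (trials - 1)
print(var_of_mean)                     # close to sigma2 / N = 0.16
```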
&lt;br /&gt;
==Weighted samples==&lt;br /&gt;
{{main|Weighted mean}}&lt;br /&gt;
{{move portions|Weighted mean|section=y|small=left|date=February 2013}}&lt;br /&gt;
&lt;br /&gt;
In a weighted sample, each vector &amp;lt;math&amp;gt;\textstyle \textbf{x}_{i}&amp;lt;/math&amp;gt; (each set of single observations on each of the &#039;&#039;K&#039;&#039; random variables) is assigned a weight &amp;lt;math&amp;gt;\textstyle w_i \geq0&amp;lt;/math&amp;gt;. Without loss of generality, assume that the weights are [[Normalizing constant|normalized]]:&lt;br /&gt;
&lt;br /&gt;
:&amp;lt;math&amp;gt; \sum_{i=1}^{N}w_i = 1. &amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
(If they are not, divide the weights by their sum).&lt;br /&gt;
Then the [[weighted mean]] vector &amp;lt;math&amp;gt;\textstyle \mathbf{\bar{x}}&amp;lt;/math&amp;gt; is given by &lt;br /&gt;
&lt;br /&gt;
:&amp;lt;math&amp;gt; \mathbf{\bar{x}}=\sum_{i=1}^N w_i \mathbf{x}_i,&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
and the elements &amp;lt;math&amp;gt;q_{jk}&amp;lt;/math&amp;gt; of the weighted covariance matrix &amp;lt;math&amp;gt;\textstyle \mathbf{Q}&amp;lt;/math&amp;gt; are&lt;br /&gt;
&amp;lt;ref name=&amp;quot;Galassi-2007-GSL&amp;quot;&amp;gt;Mark Galassi, Jim Davies, James Theiler, Brian Gough, Gerard Jungman, Michael Booth, and Fabrice Rossi. [http://www.gnu.org/software/gsl/manual GNU Scientific Library - Reference manual, Version 1.15], 2011. &lt;br /&gt;
[http://www.gnu.org/software/gsl/manual/html_node/Weighted-Samples.html Sec. 21.7 Weighted Samples]&amp;lt;/ref&amp;gt;&lt;br /&gt;
&lt;br /&gt;
:&amp;lt;math&amp;gt; q_{jk}=\frac{\sum_{i=1}^{N}w_i}{\left(\sum_{i=1}^{N}w_i\right)^2-\sum_{i=1}^{N}w_i^2}&lt;br /&gt;
\sum_{i=1}^N w_i \left(  x_{ij}-\bar{x}_j \right)  \left( x_{ik}-\bar{x}_k \right)  . &amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If all weights are the same, &amp;lt;math&amp;gt;\textstyle w_{i}=1/N&amp;lt;/math&amp;gt;, the weighted mean and covariance reduce to the sample mean and covariance above.&lt;br /&gt;
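The weighted formulas can be sketched in plain Python; with equal weights the result reduces to the unweighted sample mean and covariance above (the data and weights are hypothetical):

```python
def weighted_mean_cov(xs, w):
    """Weighted mean vector and covariance matrix; weights are normalized first."""
    k = len(xs[0])
    s = sum(w)
    w = [wi / s for wi in w]                 # normalize so sum(w) == 1
    mean = [sum(wi * x[j] for wi, x in zip(w, xs)) for j in range(k)]
    # With normalized weights the prefactor simplifies to 1 / (1 - sum(w_i^2)).
    factor = 1.0 / (1.0 - sum(wi * wi for wi in w))
    q = [[factor * sum(wi * (x[j] - mean[j]) * (x[m] - mean[m])
                       for wi, x in zip(w, xs))
          for m in range(k)] for j in range(k)]
    return mean, q

# Equal weights reproduce the ordinary sample mean and covariance.
data = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
mean, q = weighted_mean_cov(data, [1.0, 1.0, 1.0, 1.0])
print(mean)     # [4.0, 5.0]
print(q[0][0])  # 20/3, the unbiased sample variance of the first variable
```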
&lt;br /&gt;
==Criticism==&lt;br /&gt;
The sample mean and sample covariance are widely used in statistics and applications as measures of [[Location parameter|location]] and [[statistical dispersion|dispersion]], respectively; they are likely the most common such measures, since they are easily calculated and possess desirable characteristics.&lt;br /&gt;
&lt;br /&gt;
However, they suffer from certain drawbacks; notably, they are not [[robust statistics]], meaning that they are sensitive to [[outliers]]. As robustness is often a desired trait, particularly in real-world applications, robust alternatives may prove desirable, notably [[quantile]]-based statistics such as the [[sample median]] for location,&amp;lt;ref&amp;gt;[http://www.edge.org/q2008/q08_16.html#kosko The World Question Center 2006: The Sample Mean], [[Bart Kosko]]&amp;lt;/ref&amp;gt; and the [[interquartile range]] (IQR) for dispersion. Other alternatives include [[Trimmed estimator|trimming]] and [[Winsorising]], as in the [[trimmed mean]] and the [[Winsorized mean]].&lt;br /&gt;
&lt;br /&gt;
==See also==&lt;br /&gt;
&lt;br /&gt;
*[[Unbiased estimation of standard deviation]]&lt;br /&gt;
*[[Estimation of covariance matrices]]&lt;br /&gt;
*[[Scatter matrix]]&lt;br /&gt;
&lt;br /&gt;
==References==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;references/&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[[Category:Covariance and correlation]]&lt;br /&gt;
[[Category:Estimation for specific parameters]]&lt;br /&gt;
[[Category:Summary statistics]]&lt;br /&gt;
[[Category:Matrices]]&lt;br /&gt;
[[Category:U-statistics]]&lt;/div&gt;</summary>
		<author><name>38.109.87.242</name></author>
	</entry>
</feed>