<!-- The \,\! are to keep the formulas rendered as PNG instead of HTML. Please don't remove them. -->
In statistical [[decision theory]], where we are faced with the problem of estimating a deterministic parameter (vector) <math>\theta \in \Theta</math> from observations <math>x \in \mathcal{X},</math> an [[estimator]] (estimation rule) <math>\delta^M \,\!</math> is called '''[[minimax]]''' if its maximal [[Risk function|risk]] is minimal among all estimators of <math>\theta \,\!</math>. In a sense this means that <math>\delta^M \,\!</math> is an estimator which performs best in the worst possible case allowed in the problem.

==Problem setup==
Consider the problem of estimating a deterministic (not [[Bayes estimator|Bayesian]]) parameter <math>\theta \in \Theta</math> from noisy or corrupted data <math>x \in \mathcal{X}</math> related through the [[conditional probability distribution]] <math>P(x|\theta)\,\!</math>. Our goal is to find a "good" estimator <math>\delta(x) \,\!</math> for estimating the parameter <math>\theta \,\!</math>, one which minimizes some given [[risk function]] <math>R(\theta,\delta) \,\!</math>. Here the risk function is the [[expected value|expectation]] of some [[loss function]] <math>L(\theta,\delta) \,\!</math> with respect to <math>P(x|\theta)\,\!</math>. A popular example of a loss function is the squared error loss <math>L(\theta,\delta)= \|\theta-\delta\|^2 \,\!</math>, for which the risk function is the [[mean squared error]] (MSE).
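As a concrete illustration, the risk of an estimator can be approximated by Monte Carlo simulation. The sketch below is a minimal illustration, not part of the source article: the binomial observation model, the function names, and the parameter values are all chosen for the example. It estimates the MSE risk of the sample-mean rule and compares it with the known closed form <math>\theta(1-\theta)/n</math>.

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)

def mse_risk(estimator, theta, n=20, trials=200_000):
    """Monte Carlo approximation of R(theta, delta) = E[(delta(x) - theta)^2]
    for x ~ Binomial(n, theta), i.e. the risk under squared error loss."""
    x = rng.binomial(n, theta, size=trials)
    return np.mean((estimator(x, n) - theta) ** 2)

sample_mean = lambda x, n: x / n  # delta(x) = x/n, exact risk theta*(1-theta)/n

for theta in (0.1, 0.5, 0.9):
    print(theta, mse_risk(sample_mean, theta), theta * (1 - theta) / 20)
</syntaxhighlight>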

Unfortunately, in general the risk cannot be minimized, since it depends on the unknown parameter <math>\theta \,\!</math> itself (if we knew the actual value of <math>\theta \,\!</math>, we would not need to estimate it). Therefore additional criteria for finding an optimal estimator in some sense are required. One such criterion is the minimax criterion.

==Definition==
'''Definition:''' An estimator <math>\delta^M:\mathcal{X} \rightarrow \Theta \,\!</math> is called '''minimax''' with respect to a risk function <math>R(\theta,\delta) \,\!</math> if it achieves the smallest maximum risk among all estimators, meaning it satisfies

: <math>\sup_{\theta \in \Theta} R(\theta,\delta^M) = \inf_\delta \sup_{\theta \in \Theta} R(\theta,\delta). \, </math>
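To see the definition in action, the sketch below compares the worst-case (supremum over <math>\theta</math>) risk of the maximum-likelihood rule <math>x/n</math> with that of the constant-risk rule derived in Example 1 below, for a binomial observation. The closed-form risk expression for linear rules is elementary (variance plus squared bias); the variable names and grid are illustrative choices.

<syntaxhighlight lang="python">
import numpy as np

n = 20
thetas = np.linspace(0.0, 1.0, 201)

def risk_linear(a, b, theta):
    # Exact risk of a linear rule delta(x) = a*x + b for x ~ Binomial(n, theta):
    # E[(a*x + b - theta)^2] = a^2 * Var(x) + (a*n*theta + b - theta)^2
    return a**2 * n * theta * (1 - theta) + (a * n * theta + b - theta) ** 2

rules = {
    "MLE x/n": (1 / n, 0.0),
    "Example 1 rule": (1 / (n + np.sqrt(n)), 0.5 * np.sqrt(n) / (n + np.sqrt(n))),
}
for name, (a, b) in rules.items():
    print(name, "worst-case risk:", risk_linear(a, b, thetas).max())
</syntaxhighlight>

The maximum-likelihood rule has worst-case risk <math>1/(4n)</math> (attained at <math>\theta = 1/2</math>), while the Example 1 rule achieves the smaller value <math>1/(4(1+\sqrt{n})^2)</math> uniformly.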

==Least favorable distribution==
Logically, an estimator is minimax when it is the best in the worst case. Continuing this logic, a minimax estimator should be a [[Bayes estimator]] with respect to a least favorable prior distribution of <math>\theta \,\!</math>. To demonstrate this notion, denote the average risk of the Bayes estimator <math>\delta_{\pi} \,\!</math> with respect to a prior distribution <math>\pi \,\!</math> as

: <math>r_\pi = \int R(\theta,\delta_{\pi}) \, d\pi(\theta). \, </math>

'''Definition:''' A prior distribution <math>\pi \,\!</math> is called least favorable if for any other distribution <math>\pi ' \,\!</math> the average risk satisfies <math>r_\pi \geq r_{\pi '}. \, </math>

'''Theorem 1:''' If <math>r_\pi = \sup_\theta R(\theta,\delta_\pi), \, </math> then:

#<math>\delta_{\pi}\,\!</math> is minimax.
#If <math>\delta_{\pi}\,\!</math> is a unique Bayes estimator, it is also the unique minimax estimator.
#<math>\pi\,\!</math> is least favorable.

'''Corollary:''' If a Bayes estimator has constant risk, it is minimax. Note that this is not a necessary condition.

'''Example 1, Unfair coin:''' Consider the problem of estimating the "success" rate of a [[Binomial distribution|Binomial]] variable, <math>x \sim B(n,\theta)\,\!</math>. This may be viewed as estimating the rate at which an [[Fair coin|unfair coin]] falls on "heads" or "tails". In this case the Bayes estimator with respect to a [[Beta distribution|Beta]]-distributed prior, <math>\theta \sim \text{Beta}(\sqrt{n}/2,\sqrt{n}/2), \, </math> is

:<math>\delta^M=\frac{x+0.5\sqrt{n}}{n+\sqrt{n}}, \, </math>

with constant Bayes risk

:<math>r=\frac{1}{4(1+\sqrt{n})^2}  \, </math>

and, according to the Corollary, is minimax.
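The form of this Bayes estimator follows from Beta-binomial conjugacy: the posterior mean under a <math>\text{Beta}(a,b)</math> prior is <math>(x+a)/(n+a+b)</math>, which for <math>a=b=\sqrt{n}/2</math> reduces to the displayed rule. The sketch below checks the constant-risk claim numerically; the choice of <math>n</math> and the sample sizes are arbitrary.

<syntaxhighlight lang="python">
import numpy as np

n = 20
a = b = np.sqrt(n) / 2  # Beta(sqrt(n)/2, sqrt(n)/2) prior parameters

# Beta-binomial conjugacy: the posterior mean under a Beta(a, b) prior is
# (x + a)/(n + a + b), which here reduces to (x + 0.5*sqrt(n))/(n + sqrt(n)).
delta = lambda x: (x + a) / (n + a + b)

rng = np.random.default_rng(1)
for theta in (0.05, 0.3, 0.5, 0.8):
    x = rng.binomial(n, theta, size=200_000)
    print(theta, np.mean((delta(x) - theta) ** 2))  # ~ constant across theta
print("predicted constant risk:", 1 / (4 * (1 + np.sqrt(n)) ** 2))
</syntaxhighlight>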

'''Definition:''' A sequence of prior distributions <math> {\pi}_n\,\!</math> is called least favorable if for any other distribution <math>\pi '\,\!</math>,
:<math>\lim_{n \rightarrow \infty} r_{\pi_n} \geq r_{\pi '}. \, </math>

'''Theorem 2:''' If there is a sequence of priors <math> \pi_n\,\!</math> and an estimator <math>\delta\,\!</math> such that <math>\sup_{\theta} R(\theta,\delta)=\lim_{n \rightarrow \infty} r_{\pi_n} \,\!</math>, then:

#<math>\delta\,\!</math> is minimax.
#The sequence <math>{\pi}_n\,\!</math> is least favorable.

Notice that no uniqueness is guaranteed here. For example, the ML estimator of Example 2 below may be attained as the limit of Bayes estimators with respect to a [[Uniform distribution (continuous)|uniform]] prior <math>\pi_n \sim U[-n,n]\,\!</math> with increasing support, and also with respect to a zero-mean normal prior <math>\pi_n \sim N(0,n \sigma^2) \,\!</math> with increasing variance. So neither the resulting minimax ML estimator nor the least favorable prior is unique.
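The normal-prior case is easy to verify directly: for a single observation <math>x \sim N(\theta,\sigma^2)</math>, the Bayes estimator under the conjugate prior <math>\theta \sim N(0,n\sigma^2)</math> is the posterior mean, a shrinkage of <math>x</math> that tends to <math>x</math> itself as <math>n \rightarrow \infty</math>. A minimal numerical sketch (the observed value is arbitrary):

<syntaxhighlight lang="python">
sigma2 = 1.0
x = 2.5  # an arbitrary observed value, x ~ N(theta, sigma^2)

# For the conjugate prior theta ~ N(0, n*sigma^2), the Bayes estimator is the
# posterior mean (n*sigma^2/(n*sigma^2 + sigma^2)) * x = (n/(n + 1)) * x,
# which tends to the ML estimator delta_ML(x) = x as n -> infinity.
for n in (1, 10, 100, 1000):
    tau2 = n * sigma2
    print(n, tau2 / (tau2 + sigma2) * x)
</syntaxhighlight>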

'''Example 2:''' Consider the problem of estimating the mean of a <math>p\,\!</math>-dimensional [[Normal distribution|Gaussian]] random vector, <math>x \sim N(\theta,I_p \sigma^2)\,\!</math>. The [[maximum likelihood]] (ML) estimator for <math>\theta\,\!</math> in this case is simply <math>\delta_{ML}=x\,\!</math>, and its risk is

: <math>R(\theta,\delta_{ML})=E\left[\|\delta_{ML}-\theta\|^2\right]=\sum_{i=1}^p E\left[(x_i-\theta_i)^2\right]=p \sigma^2. \, </math>

[[Image:MSE of ML vs JS.png|thumb|right|350px|MSE of maximum likelihood estimator versus James–Stein estimator]]

The risk is constant, but the ML estimator is actually not a Bayes estimator, so the Corollary of Theorem 1 does not apply. However, the ML estimator is the limit of the Bayes estimators with respect to the prior sequence <math>\pi_n \sim N(0,n \sigma^2) \,\!</math>, and hence is indeed minimax according to Theorem 2. Nonetheless, minimaxity does not always imply [[Admissible decision rule|admissibility]]. In fact, in this example the ML estimator is known to be inadmissible whenever <math>p > 2\,\!</math>. The famous [[James–Stein estimator]] dominates the ML estimator whenever <math>p > 2\,\!</math>. Though both estimators have the same risk <math>p \sigma^2\,\!</math> when <math>\|\theta\| \rightarrow \infty\,\!</math>, and they are both minimax, the James–Stein estimator has smaller risk for any finite <math>\|\theta\|\,\!</math>. This fact is illustrated in the figure on the right.
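This dominance is easy to reproduce by simulation. The sketch below estimates the risk of the ML estimator and of the basic (non-positive-part) James–Stein estimator by Monte Carlo for several values of <math>\|\theta\|</math>; the dimension, seed, and sample counts are arbitrary choices.

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(2)
p, sigma2, trials = 10, 1.0, 50_000

def james_stein(x):
    # Basic (non-positive-part) James-Stein shrinkage toward the origin.
    return (1 - (p - 2) * sigma2 / np.sum(x**2, axis=1, keepdims=True)) * x

for norm in (0.0, 2.0, 5.0, 20.0):
    theta = np.zeros(p)
    theta[0] = norm
    x = theta + np.sqrt(sigma2) * rng.standard_normal((trials, p))
    mse_ml = np.mean(np.sum((x - theta) ** 2, axis=1))              # ~ p*sigma^2
    mse_js = np.mean(np.sum((james_stein(x) - theta) ** 2, axis=1))  # strictly smaller
    print(norm, mse_ml, mse_js)
</syntaxhighlight>

The ML risk stays near <math>p\sigma^2</math> for every <math>\|\theta\|</math>, while the James–Stein risk is smaller throughout and approaches <math>p\sigma^2</math> only as <math>\|\theta\|</math> grows.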

==Some examples==
In general it is difficult, often even impossible, to determine the minimax estimator. Nonetheless, in many cases a minimax estimator has been determined.

'''Example 3, Bounded normal mean:''' Consider the problem of estimating the mean of a normal vector <math>x \sim N(\theta,I_n \sigma^2)\,\!</math> when it is known that <math>\|\theta\|^2 \leq M\,\!</math>. The Bayes estimator with respect to a prior which is uniformly distributed on the boundary of the bounding [[sphere]] is known to be minimax whenever <math>M \leq n\,\!</math>. The analytical expression for this estimator is

:<math>\delta^M=\frac{nJ_{n+1}(n\|x\|)}{\|x\|J_{n}(n\|x\|)}, \, </math>

where <math>J_{n}(t)\,\!</math> is the modified [[Bessel function]] of the first kind of order ''n''.
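The displayed expression can be evaluated numerically. The sketch below is a literal transcription of the formula, using SciPy's modified Bessel function of the first kind <code>iv</code> for the article's <math>J_n</math>. Note that the expression as printed is a scalar; interpreting it, for instance, as the magnitude of an estimate along <math>x/\|x\|</math> is an assumption of this illustration, not a claim from the source.

<syntaxhighlight lang="python">
import numpy as np
from scipy.special import iv  # modified Bessel function of the first kind, I_v

def delta_M(x, n):
    """Literal evaluation of n*J_{n+1}(n*||x||) / (||x||*J_n(n*||x||)),
    reading J as the modified Bessel function of the first kind (SciPy's iv)."""
    r = np.linalg.norm(x)
    return n * iv(n + 1, n * r) / (r * iv(n, n * r))

print(delta_M(np.array([0.3, -0.2, 0.1]), n=3))
</syntaxhighlight>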

==Asymptotic minimax estimator==

The difficulty of determining the exact minimax estimator has motivated the study of asymptotically minimax estimators: an estimator <math>\delta'</math> is called <math>c</math>-asymptotically (or approximately) minimax if

:<math>\sup_{\theta\in\Theta} R(\theta,\delta')\leq c \inf_\delta \sup_{\theta \in \Theta} R(\theta,\delta).</math>

For many estimation problems, especially in the non-parametric estimation setting, various approximate minimax estimators have been established. The design of an approximate minimax estimator is intimately related to the geometry of <math>\Theta</math>, such as its [[metric entropy number]].

==Relationship to robust optimization==
[[Robust optimization]] is an approach to solving optimization problems under uncertainty in the knowledge of the underlying parameters.<ref name=kassam/><ref name=ben_tal/> For instance, the [[Minimum mean square error|MMSE Bayesian estimation]] of a parameter requires knowledge of the parameter correlation function. If this correlation function is not perfectly known, a popular minimax robust optimization approach<ref name=verdu/> is to define a set characterizing the uncertainty about the correlation function, and then to pursue a minimax optimization over the uncertainty set and the estimator, respectively. Similar minimax optimizations can be pursued to make estimators robust to certain imprecisely known parameters. For instance, a study dealing with such techniques in the area of signal processing can be found in the book by Nisar.<ref name=nisar_book/>
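A toy version of this minimax robust approach follows; it is an illustration under simplified assumptions, not the method of the cited works. Consider a scalar signal <math>x = \theta + v</math> with <math>\theta \sim N(0,s)</math>, noise <math>v \sim N(0,\sigma^2)</math>, and the prior variance <math>s</math> known only to lie in an interval (the "uncertainty set"). A grid search then finds the linear estimator gain minimizing the worst-case MSE over that set.

<syntaxhighlight lang="python">
import numpy as np

sigma2 = 1.0
s_lo, s_hi = 0.5, 2.0  # uncertainty set for the prior variance of theta

def mse(a, s):
    # x = theta + v, theta ~ N(0, s), v ~ N(0, sigma2), linear rule delta(x) = a*x:
    # E[(a*x - theta)^2] = (a - 1)^2 * s + a^2 * sigma2
    return (a - 1) ** 2 * s + a**2 * sigma2

a_grid = np.linspace(0.0, 1.0, 1001)
s_grid = np.linspace(s_lo, s_hi, 101)
worst = np.array([mse(a, s_grid).max() for a in a_grid])
a_star = a_grid[worst.argmin()]
print("minimax gain:", a_star, " MMSE gain for worst case:", s_hi / (s_hi + sigma2))
</syntaxhighlight>

In this toy model the worst-case MSE is attained at the largest prior variance, so the minimax gain coincides with the MMSE gain designed for the worst case.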

R. Fandom Noubiap and W. Seidel (2001) developed an algorithm for calculating a Gamma-minimax decision rule when Gamma is given by a finite number of generalized moment conditions. Such a decision rule minimizes the maximum of the integrals of the risk function with respect to all distributions in Gamma. Gamma-minimax decision rules are of interest in robustness studies in Bayesian statistics.
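A minimal sketch of the Gamma-minimax idea follows, with two simplifications that are this illustration's and not the paper's: Gamma is reduced to a finite set of candidate priors (rather than a moment-constrained set), and the decision rules are restricted to a one-parameter family of linear rules for a binomial observation.

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(3)
n = 20

def risk(c, theta):
    # Exact risk of the linear rule delta(x) = (x + c)/(n + 2c) under squared error.
    a, b = 1 / (n + 2 * c), c / (n + 2 * c)
    return a**2 * n * theta * (1 - theta) + (a * n * theta + b - theta) ** 2

# Gamma, simplified here to a finite set of candidate priors (samples of theta).
gamma = [rng.beta(1, 1, 50_000), rng.beta(2, 2, 50_000)]

c_grid = np.linspace(0.0, 5.0, 201)
worst = np.array([max(risk(c, th).mean() for th in gamma) for c in c_grid])
print("Gamma-minimax c:", c_grid[worst.argmin()])
</syntaxhighlight>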

==References==
*E. L. Lehmann and G. Casella (1998), ''Theory of Point Estimation'', 2nd ed. New York: Springer-Verlag.
*F. Perron and E. Marchand (2002), "On the minimax estimator of a bounded normal mean," ''Statistics and Probability Letters'' '''58''': 327–333.
*J. O. Berger (1985), ''Statistical Decision Theory and Bayesian Analysis'', 2nd ed. New York: Springer-Verlag. ISBN 0-387-96098-8.
*R. Fandom Noubiap and W. Seidel (2001), "An Algorithm for Calculating Gamma-Minimax Decision Rules under Generalized Moment Conditions," ''Annals of Statistics'' '''29''' (4): 1094–1116.
* {{Cite journal
 |first=C. |last=Stein |authorlink=Charles Stein (statistician)
 |year=1981
 |title=Estimation of the mean of a multivariate normal distribution
 |journal=[[Annals of Statistics]]
 |volume=9 |issue=6 |pages=1135–1151
 |doi=10.1214/aos/1176345632 |mr=630098 |zbl=0476.62035
}}
{{Reflist|refs=
<ref name=verdu>S. Verdu and H. V. Poor (1984), "On Minimax Robustness: A general approach and applications," ''IEEE Transactions on Information Theory'' '''30''': 328–340.</ref>
<ref name=kassam>S. A. Kassam and H. V. Poor (1985), "Robust Techniques for Signal Processing: A Survey," ''Proceedings of the IEEE'' '''73''': 433–481.</ref>
<ref name=ben_tal>A. Ben-Tal, L. El Ghaoui, and A. Nemirovski (2009), ''Robust Optimization''. Princeton: Princeton University Press.</ref>
<ref name=nisar_book>M. Danish Nisar (2011), [http://www.shaker.eu/shop/978-3-8440-0332-1 ''Minimax Robustness in Signal Processing for Communications'']. Shaker Verlag. ISBN 978-3-8440-0332-1.</ref>
}}

{{DEFAULTSORT:Minimax Estimator}}
[[Category:Decision theory]]
[[Category:Estimation theory]]