When studying local extreme points of functions of several (real) variables, a typical textbook exercise is to consider the polynomial
$$p(x,y) = x^2 + xy + y^2 - 3x - 3y + 4.$$
Here we have $p_x = 2x + y - 3$ and $p_y = x + 2y - 3$, so the only critical point of $p$ is $(1,1)$. Since $p_{xx} = 2 > 0$ and the Hessian of $p$ is the positive definite matrix
$$\begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix},$$
it follows that $(1,1)$ is a local minimum of $p$ and, since it is the only critical point, it is in fact an absolute minimum, with $p(1,1) = 1$. $p$ being a polynomial, it is reasonable to expect that there is an algebraic explanation as to why $1$ is its minimum, and why it lies at $(1,1)$.
After all, this is what happens in one variable: If $p(x) = ax^2 + bx + c$ and $a > 0$, then
$$p(x) = a\left(x + \frac{b}{2a}\right)^2 + \left(c - \frac{b^2}{4a}\right),$$
and $p$ obviously has a minimum at $x = -\frac{b}{2a}$, and this minimum is $c - \frac{b^2}{4a}$.
The polynomial $p$ of the example above can be analyzed this way as well. A bit of algebra shows that we can write
$$p(x,y) = \left(x + \frac{y-3}{2}\right)^2 + \frac{3}{4}\,(y-1)^2 + 1,$$
and it follows immediately that $p$ has a minimum value of $1$, achieved precisely when both $x + \frac{y-3}{2} = 0$ and $y - 1 = 0$, i.e., at $(1,1)$. (One can go further, and explain how to go in a systematic way about the 'bit of algebra' that led to the representation of $p$ as above, but let's leave that for now.)
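Here is a minimal sympy check of the computations above (the only assumption is a standard sympy installation; the symbol names mirror the text):

```python
# Check the critical point, the Hessian, and the sum-of-squares form of p.
import sympy as sp

x, y = sp.symbols('x y')
p = x**2 + x*y + y**2 - 3*x - 3*y + 4

# Unique critical point and positive definiteness of the Hessian.
crit = sp.solve([sp.diff(p, x), sp.diff(p, y)], [x, y])
H = sp.hessian(p, (x, y))
print(crit, H.is_positive_definite)  # {x: 1, y: 1} True

# The completed-square representation, and the minimum value p(1, 1).
rep = (x + (y - 3)/2)**2 + sp.Rational(3, 4)*(y - 1)**2 + 1
print(sp.expand(p - rep) == 0, p.subs({x: 1, y: 1}))  # True 1
```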
What we did with $p$ is not a mere coincidence. Hilbert's 17th of the 23 problems of his famous address to the Second International Congress of Mathematicians in Paris, 1900, asks whether every polynomial $p(x_1, \dots, x_n)$ with real coefficients which is non-negative for all (real) choices of $x_1, \dots, x_n$ is actually a sum of squares of rational functions. (A rational function is a quotient of polynomials.) A non-negative polynomial is usually called positive definite, but I won't use this terminology here.
If Hilbert's problem had an affirmative solution, this would provide a clear explanation as to why such a polynomial is non-negative: squares, and hence sums of squares, never take negative values.
Hilbert himself showed that a non-negative polynomial $p$ is a sum of squares of polynomials in any of the following cases:
$p$ is a polynomial in only one variable;
$p$ is a quadratic polynomial (as in our example above), in any number $n$ of variables;
$p$ is a quartic polynomial in two variables.
Emil Artin solved Hilbert's problem (in the affirmative) in 1927. The stronger conclusion, that every such polynomial is a sum of squares of polynomials, is unfortunately false in general. Here is a counterexample:
$$M(x,y) = x^4 y^2 + x^2 y^4 - 3 x^2 y^2 + 1.$$
That this polynomial is non-negative follows from the inequality between the arithmetic and the geometric means; see example 1 below. This polynomial appears in the paper of Marie-Françoise Roy, The role of Hilbert's problems in real algebraic geometry, in Proceedings of the ninth EWM Meeting, Loccum, Germany 1999, and one can also check that it is not a sum of squares of polynomials by applying the algorithm I refer to below.
Although Hilbert himself was aware that some non-negative polynomials are not sums of squares of polynomials, I believe it was the logician Raphael Robinson who exhibited the first explicit example of such a polynomial, in his nice paper Some definite polynomials which are not sums of squares of real polynomials, in Selected questions of algebra and logic (collection dedicated to the memory of A. I. Mal'cev) (Russian), pp. 264-282, Izdat. "Nauka" Sibirsk. Otdel., Novosibirsk, 1973.
Here I quote from the MathSciNet review by W. Ellison:
The key result is the following lemma: A polynomial $p(x,y)$ of degree at most 3 that vanishes at 8 of the 9 points $(x,y)$ with $x, y \in \{-1, 0, 1\}$ must also vanish at the 9th point.

The polynomial
$$f(x,y) = x^2 (x^2 - 1)^2 + y^2 (y^2 - 1)^2$$
vanishes to the 2nd order at each of the 9 points, and we can find a sextic curve $g(x,y) = 0$ that does not pass through $(0,0)$ but does have double points at each of the other 8 points, for example
$$g(x,y) = (x^2 - 1)(y^2 - 1)(x^2 + y^2 - 1).$$
It follows that the function $g/f$ is bounded above in the whole $(x,y)$-plane. Thus, for a suitable $\lambda > 0$ (in fact one can take $\lambda = 1$) it follows that
$$R(x,y) = f(x,y) - \lambda\, g(x,y)$$
is positive definite, and that $R$ vanishes at the 8 points other than $(0,0)$ while $R(0,0) = \lambda > 0$. It is now clear that we cannot have $R = \sum_i h_i^2$ with real polynomials $h_i$, for these polynomials are at most cubics and so would have to vanish at $(0,0)$, since they vanish at the other 8 points.
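With $f$ and $g$ as displayed above and $\lambda = 1$, here is a short sympy sanity check of the construction (the variable names follow the quote):

```python
# Verify that R = f - g vanishes at the 8 non-origin points but not at (0,0).
import itertools
import sympy as sp

x, y = sp.symbols('x y')
f = x**2*(x**2 - 1)**2 + y**2*(y**2 - 1)**2
g = (x**2 - 1)*(y**2 - 1)*(x**2 + y**2 - 1)
R = sp.expand(f - g)
print(R)  # Robinson's non-negative, non-SOS polynomial

# R vanishes at the 8 points (x, y) != (0, 0) with x, y in {-1, 0, 1} ...
pts = [p for p in itertools.product((-1, 0, 1), repeat=2) if p != (0, 0)]
print(all(R.subs({x: a, y: b}) == 0 for a, b in pts))  # True

# ... but not at the 9th point, so no sum of squared cubics can equal it.
print(R.subs({x: 0, y: 0}))  # 1
```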
It is interesting to see that several logicians have worked on this problem, notably Robinson, Henkin, and Kreisel.
This is a good place to mention that there are algorithms that, given a non-negative polynomial, decide whether it admits a representation as a sum of squares of polynomials, and if so, find such a sum. This is the main result of the nice paper An algorithm for sums of squares of real polynomials, by Victoria Powers and Thorsten Woermann, J. Pure and Appl. Alg. 127 (1998), 99-104. (As of this writing, the paper is also available at Dr. Powers's webpage.)
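The sketch below illustrates only the core Gram-matrix idea behind such algorithms (a polynomial $p$ of degree $2d$ is a sum of squares exactly when $p = m^T Q m$ for some positive semidefinite matrix $Q$, with $m$ the vector of monomials of degree at most $d$), solved numerically rather than exactly as in the Powers-Woermann paper. It assumes sympy and cvxpy (with the SCS solver) are installed:

```python
# Gram-matrix test for sums of squares, run on the counterexample M above.
from collections import defaultdict
import sympy as sp
import cvxpy as cp

x, y = sp.symbols('x y')
p = x**4*y**2 + x**2*y**4 - 3*x**2*y**2 + 1  # the polynomial M

# Monomials of degree <= 3 (half the degree of p).
basis = [x**i * y**j for i in range(4) for j in range(4) if i + j <= 3]
n = len(basis)
Q = cp.Variable((n, n), symmetric=True)

# Group the entries of Q by the monomial that basis[i]*basis[j] produces.
groups = defaultdict(list)
for i in range(n):
    for j in range(n):
        m = sp.expand(basis[i] * basis[j])
        groups[(int(sp.degree(m, x)), int(sp.degree(m, y)))].append((i, j))

# Coefficient matching: m^T Q m and p must agree monomial by monomial.
p_poly = sp.Poly(p, x, y)
constraints = [Q >> 0]
for e, cells in groups.items():
    target = float(p_poly.coeff_monomial(x**e[0] * y**e[1]))
    constraints.append(sum(Q[i, j] for i, j in cells) == target)

prob = cp.Problem(cp.Minimize(0), constraints)
prob.solve(solver=cp.SCS)
print(prob.status)  # expected: 'infeasible' -- M is not a sum of squares
```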
Many natural inequalities that appear frequently in analysis can effectively be stated as asserting that certain polynomials are non-negative, so in fact one can (at least in principle) prove these inequalities by simply displaying a representation of the given polynomials as sums of squares.
Examples
1. The inequality between the arithmetic and the geometric means asserts that
$$\frac{a_1 + a_2 + \dots + a_n}{n} \ge \sqrt[n]{a_1 a_2 \cdots a_n}$$
for all non-negative numbers $a_1, \dots, a_n$, with equality only when $a_1 = a_2 = \dots = a_n$. This classical result is one of the most quoted inequalities in analysis. Notice that it is equivalent to the assertion that the polynomial
$$q(x_1, \dots, x_n) = x_1^{2n} + x_2^{2n} + \dots + x_n^{2n} - n\, x_1^2 x_2^2 \cdots x_n^2$$
is non-negative, and therefore plausibly expressible as a sum of squares: First, replace the $a_i$ with $x_i^2$, to dispense with the restriction that the $a_i$ must be non-negative. Then, replace $x_i$ with $x_i^n$, to remove the $n$-th roots.
As suggested, one may try to prove this inequality by simply exhibiting $q$ as a sum of squares. Such a representation was actually found by Hurwitz in 1891, in his paper Über den Vergleich des arithmetischen und des geometrischen Mittels, which can be found in his Math. Werke, 505-507, Basel, E. Birkhäuser, 1933.
To write the corresponding representation in a palatable way, let's introduce some notation. Given a function $f(x_1, \dots, x_n)$, let's write $\sum_! f$ for the result of adding all the expressions of the form $f(x_{i_1}, \dots, x_{i_n})$, for all possible rearrangements $i_1, \dots, i_n$ of the indices $1, 2, \dots, n$ (there are $n!$ such possible rearrangements, where $n!$, $n$-factorial, is defined by setting $0! = 1$ and $(n+1)! = (n+1) \cdot n!$ for $n \ge 0$).

For example, $\sum_! x_1^2 = (n-1)!\,(x_1^2 + x_2^2 + \dots + x_n^2)$ and $\sum_! x_1 x_2 \cdots x_n = n!\; x_1 x_2 \cdots x_n$.
Write $\phi_k$ for
$$\phi_k = \sum_!\, \left(x_1^{2(n-k)} - x_2^{2(n-k)}\right)\left(x_1^2 - x_2^2\right) x_3^2 x_4^2 \cdots x_{k+1}^2,$$
for $k = 1, 2, \dots, n-1$ (for $k = 1$, the trailing product $x_3^2 \cdots x_{k+1}^2$ is empty, i.e., equal to $1$). Notice that each $\phi_k$ is clearly non-negative, since
$$\left(x_1^{2m} - x_2^{2m}\right)\left(x_1^2 - x_2^2\right) = \left(x_1^2 - x_2^2\right)^2 \left(x_1^{2(m-1)} + x_1^{2(m-2)} x_2^2 + \dots + x_2^{2(m-1)}\right),$$
so each summand of $\phi_k$ is itself a sum of squares of polynomials. Then
$$x_1^{2n} + x_2^{2n} + \dots + x_n^{2n} - n\, x_1^2 x_2^2 \cdots x_n^2 = \frac{1}{2\,(n-1)!}\left(\phi_1 + \phi_2 + \dots + \phi_{n-1}\right).$$
Notice that this expression makes it obvious that equality can only occur if the $x_i^2$ are all equal.

For example, for $n = 3$ this gives
$$x^6 + y^6 + z^6 - 3 x^2 y^2 z^2 = \frac{1}{2}\left(x^2 + y^2 + z^2\right)\left(\left(x^2 - y^2\right)^2 + \left(y^2 - z^2\right)^2 + \left(z^2 - x^2\right)^2\right).$$
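Here is a quick sympy verification of the displayed identity for small $n$; the helper names are mine, and the permutation sum implements the $\sum_!$ notation directly:

```python
# Symbolically verify the Hurwitz-style identity above for n = 2, 3, 4.
import itertools
from math import factorial
import sympy as sp

def sym_sum(f, xs):
    # The permutation sum: add f over all n! rearrangements of the variables.
    return sum(f(*perm) for perm in itertools.permutations(xs))

def hurwitz_check(n):
    xs = sp.symbols(f'x1:{n + 1}')
    def phi(k):
        def term(*v):
            tail = sp.Mul(*[w**2 for w in v[2:k + 1]])  # x_3^2 ... x_{k+1}^2
            return ((v[0]**(2*(n - k)) - v[1]**(2*(n - k)))
                    * (v[0]**2 - v[1]**2) * tail)
        return sym_sum(term, xs)
    lhs = sum(v**(2*n) for v in xs) - n * sp.Mul(*[v**2 for v in xs])
    rhs = sum(phi(k) for k in range(1, n)) / (2 * factorial(n - 1))
    return sp.expand(lhs - rhs) == 0

print([hurwitz_check(n) for n in (2, 3, 4)])  # [True, True, True]
```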
2. Another frequently used classical inequality is the Cauchy-Schwarz inequality
$$\left(\sum_{i=1}^n a_i b_i\right)^2 \le \left(\sum_{i=1}^n a_i^2\right)\left(\sum_{i=1}^n b_i^2\right),$$
with equality only if the tuples $(a_1, \dots, a_n)$ and $(b_1, \dots, b_n)$ are proportional. This result is usually shown by analyzing the discriminant of a non-negative quadratic polynomial, and it is also widely presented in the more compact form
$$|\vec x \cdot \vec y\,| \le \|\vec x\|\, \|\vec y\|$$
for $\vec x, \vec y$ arbitrary vectors in $\mathbb{R}^n$ and $\cdot$ the usual dot product. Letting
$$q(a_1, \dots, a_n, b_1, \dots, b_n) = \left(\sum_{i=1}^n a_i^2\right)\left(\sum_{i=1}^n b_i^2\right) - \left(\sum_{i=1}^n a_i b_i\right)^2,$$
the inequality is seen to be equivalent to the assertion that the polynomial $q$ is non-negative, and so (as in the previous example) it should be possible to prove this by exhibiting $q$ as a sum of squares. The resulting identity is due to Lagrange, namely,
$$\left(\sum_{i=1}^n a_i^2\right)\left(\sum_{i=1}^n b_i^2\right) - \left(\sum_{i=1}^n a_i b_i\right)^2 = \sum_{1 \le i < j \le n} \left(a_i b_j - a_j b_i\right)^2,$$
from which the case of equality also follows immediately.
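Lagrange's identity is easy to confirm symbolically; here is a short sympy check for several small $n$ (the function name is mine):

```python
# Expand both sides of Lagrange's identity and compare them as polynomials.
import itertools
import sympy as sp

def lagrange_check(n):
    a = sp.symbols(f'a1:{n + 1}')
    b = sp.symbols(f'b1:{n + 1}')
    lhs = (sum(ai**2 for ai in a) * sum(bi**2 for bi in b)
           - sum(ai*bi for ai, bi in zip(a, b))**2)
    rhs = sum((a[i]*b[j] - a[j]*b[i])**2
              for i, j in itertools.combinations(range(n), 2))
    return sp.expand(lhs - rhs) == 0

print([lagrange_check(n) for n in (2, 3, 4, 5)])  # [True, True, True, True]
```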
3. Let me close with an example of a classical inequality that leads to non-negative polynomials for which I do not know whether they admit representations as sums of squares of polynomials. This inequality is due to Minkowski: For non-negative $a_1, \dots, a_n, b_1, \dots, b_n$, we have
$$\left(\prod_{i=1}^n (a_i + b_i)\right)^{1/n} \ge \left(\prod_{i=1}^n a_i\right)^{1/n} + \left(\prod_{i=1}^n b_i\right)^{1/n}.$$
As in example 1 above, we can rewrite this as the assertion that a polynomial is non-negative, namely,
$$q(x_1, \dots, x_n, y_1, \dots, y_n) = \prod_{i=1}^n \left(x_i^{2n} + y_i^{2n}\right) - \left(x_1^2 \cdots x_n^2 + y_1^2 \cdots y_n^2\right)^n.$$
That the inequality holds is a consequence of the inequality between the arithmetic mean and the geometric mean. For example, expanding the first product, one finds, for each $k$ with $0 \le k \le n$, exactly $\binom{n}{k}$ terms with exactly $k$ many factors of the form $x_i^{2n}$ and $n - k$ many factors of the form $y_j^{2n}$. For any $i$ with $1 \le i \le n$, precisely $\binom{n-1}{k-1}$ many of these terms contain the factor $x_i^{2n}$, so, by the arithmetic mean-geometric mean inequality, the sum of these terms is at least $\binom{n}{k}$ times the product of the $x_i^2$ to the power $k$ times the product of the $y_i^2$ to the power $n - k$ (since $n \binom{n-1}{k-1} = k \binom{n}{k}$, each $x_i$ appears in the geometric mean with exponent $2k$, and similarly each $y_i$ with exponent $2(n-k)$), which is precisely the corresponding term when we expand the $n$-th power we are subtracting in $q$.
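As a sanity check on the displayed form of $q$, one can evaluate it at many random rational points in exact arithmetic; this of course checks only non-negativity, not the sum-of-squares question:

```python
# Exact-arithmetic spot check that q >= 0 at random rational points, n = 2, 3.
import random
import sympy as sp

def q_value(xs, ys):
    # q = prod_i (x_i^{2n} + y_i^{2n}) - (x_1^2...x_n^2 + y_1^2...y_n^2)^n
    n = len(xs)
    first = sp.Integer(1)
    px = sp.Integer(1)
    py = sp.Integer(1)
    for xi, yi in zip(xs, ys):
        first *= xi**(2*n) + yi**(2*n)
        px *= xi**2
        py *= yi**2
    return first - (px + py)**n

random.seed(0)
for n in (2, 3):
    pts = ([sp.Rational(random.randint(-20, 20), 10) for _ in range(2*n)]
           for _ in range(2000))
    print(n, all(q_value(v[:n], v[n:]) >= 0 for v in pts))  # True for both
```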
However, it is not clear to me that $q$ is a sum of squares of polynomials. The problem one encounters in trying to apply the representation I displayed in example 1 is similar to the problem one finds in trying to apply that representation to the polynomial
$$x^4 y^2 + x^2 y^4 - 3 x^2 y^2 + 1$$
from above: namely, since for example $x^4 y^2$ is not a perfect sixth power, the representation from example 1 gives us the polynomial as a sum of squares of functions involving cube roots rather than polynomials.
Update (Dec. 22, 2012): Péter E. Frenkel and Péter Horváth have shown (June, 2012) that Minkowski's inequality can indeed be proved by showing that $q$ as above is a sum of squares; see Minkowski's inequality and sums of squares, at the arXiv. (Their paper has appeared in Cent. Eur. J. Math. 12 (2014), no. 3, 510-516.)
For Young’s inequality with rational exponents, see Purely “algebraic” proof of Young’s inequality (December, 2012), at Math.Stackexchange.
(Sep. 6, 2017): For more on sum-of-squares proofs of the AM-GM inequality, see Kazumasa Fujiwara and Tohru Ozawa, Identities for the Difference between the Arithmetic and Geometric Means, Int. Journal of Math. Analysis, Vol. 8, 2014, no. 31, 1525-1542. See also these posts at MathOverflow and Math.Stackexchange: 1, 2, 3.
Many thanks for all of this. I was thinking about this problem and, actually, I produced another expression for the AM-GM polynomial as a sum of squares which is more complicated than Hurwitz's. So, thanks again for your link to Hurwitz's work.
Now, concerning your question at the end, there is a well-known algorithm that determines whether a positive polynomial is a sum of squares. Have you already tested it, say for $q$ with $n = 3$?
I agree that this may be confusing. I have slightly edited that example, hopefully for the better. Thanks!
IMHO it would be clearer to just use $x_i$ throughout (except in the very first equation). What really confused me was that the sum of squares was not really a sum of squares (until I figured out that you had replaced the $a_i$ with $x_i^2$).