Barna’s inequality

December 30, 2014

(Disclaimer: This is planned to be part of a larger set of notes on the dynamics of Newton’s method.)

We present Barna’s proof of the following result: Suppose that f is a polynomial all of whose roots are real and N is is associated Newton’s function. If r is a root of f and its immediate basin of attraction I_r=(a,b) is bounded, then |N'(a)|>1 and |N'(b)|>1.

1. Introduction

If f is a differentiable function, we define N=N_f, Newton’s function for f, by

\displaystyle N_f(x)=x-\frac{f(x)}{f'(x)}

for all values of x such that f'(x)\ne 0. Under reasonable assumptions on f, N can be extended (by continuity) so that it is also defined at those points x such that f(x)=f'(x)=0. Note that if f'(x)\ne 0, then N'(x) exists, and we have

\displaystyle N'(x)=\frac{f(x)f''(x)}{(f'(x))^2}.

The function N is of course the function obtained through the application of the familiar Newton’s method for approximating roots of f. Recall that (under reasonable assumptions on f) the method starts with a guess x_0 for a root of f, and refines this guess successively, with each new guess x_{n+1} being obtained from the previous one x_n by replacing f with its linear approximation at x=x_n, that is, with the line y-x_n=f'(x_n)(x-x_n), and letting x_{n+1} be the value of x where this approximation equals 0, that is, x_{n+1}=N_f(x_n).

One can say much about what makes a function “reasonable”, but for the purposes of this note it suffices that polynomials certainly fall under this category. One easily verifies that if (f is a real valued polynomial and) f(x^*)=0 then there is an open neighborhood I of x^* such that if x_0\in I, then for all n we have that x_n is defined, x_n\in I, and \lim_{n\to \infty}x_n=x^*. This is perhaps easiest to see if we assume that x^* is a simple zero, that is, f(x^*)=0\ne f'(x^*). In this case N'(x^*)=0 and, by continuity, N(x) and N'(x) are defined, and |N'(x)|<1, for all x in a neighborhood (x^*-\epsilon,x^*+\epsilon) of x^*. But if x is in this neighborhood, by the mean value theorem we have that |N(x)-x^*|=|N(x)-N(x^*)|<|x-x^*|. It follows that N(x) is also in this neighborhood, and that successive applications of N result in a sequence that converges to x^*.

The largest interval I about x^* such that for any point x_0 in I the sequence x_0,x_1,\dots,x_{n+1}=N(x_n),\dots is well defined and converges to x^*, we call the immediate basin of attraction of x^*. One can verify that I is open, that N(I)=I, and that if I is bounded, say I=(a,b), then N(a)=b and N(b)=a, so that \{a,b\} is a cycle of period 2 for N. By the observations above, we know that if c is whichever of a and b is closest to x^*, then |N'(c)|\ge 1.

The study of the dynamics of Newton’s method is very interesting. Nowadays, most work on the subject is part of the more general topic of complex dynamics, but the study of the dynamical behavior on \mathbb R deserves to be better known. In this note I present a result of Barna, who in the late 1950s and early 1960s began such a study. The result is an inequality that improves the observation at the end of the previous paragraph. This inequality is useful in studying the chaotic behavior of the method for polynomials of degree at least 4, see for instance [SU84].

2. Barna inequality

Theorem (Barna [Bar61]). Suppose that f is a polynomial all of whose roots are real. If r is a root of f and its immediate basin of attraction I_r=(a,b) is bounded, then |N'(a)|>1 and |N'(b)|>1.

Barna’s argument is completely elementary, but it is not easy to locate it in modern literature.

Read the rest of this entry »

Neugebauer’s theorem

December 30, 2014

The problem of characterizing when a given function is a derivative has been studied at least since 1911 when Young published a paper recognizing its relevance:

W. H. Young. A note on the property of being a differential coefficient, Proc. Lond. Math. Soc., 9 (2), (1911), 360–368.

The paper opens with the following comment (Young refers to derivatives as “differential coefficients”):

Recent research has provided us with a set of necessary and sufficient conditions that a function may be an indefinite integral, in the generalised sense, of another function, and the way has thus been opened to important developments. The corresponding, much more difficult, problem of determining necessary and sufficient conditions that a function may be a differential coefficient, has barely been mooted; indeed, though we know a number of necessary conditions, no set even of sufficient conditions has to my knowledge ever been formulated, except that involved in the obvious statement that a continuous function is a differential coefficient. 

It is more or less understood nowadays that no completely satisfactory characterization is possible. We know that derivatives are Darboux continuous (that is, they satisfy the intermediate value property), and are Baire one functions (that is, they are the pointwise limit of a sequence of continuous functions). But this is not enough: For instance, any function f such that f(x)=\sin(1/x) for x\ne 0, and f(0)\in[-1,1] is Darboux continuous and Baire one, but only the function with f(0)=0 is a derivative.

Andrew Bruckner has written excellent surveys on derivatives and the characterization problem. See for instance:

Andrew M. Bruckner and John L. Leonard. Derivatives. Amer. Math. Monthly, 73 (4, part II), (1966), 24–56. MR0197632 (33 #5797).

Andrew M. Bruckner. Differentiation of real functions. Second edition. CRM Monograph Series, 5. American Mathematical Society, Providence, RI, 1994. xii+195 pp. ISBN: 0-8218-6990-6. MR1274044 (94m:26001).

Andrew M. Bruckner. The problem of characterizing derivatives revisited. Real Anal. Exchange, 21 (1),  (1995/96), 112–133. MR1377522 (97g:26004).

Here I want to discuss briefly a characterization obtained by Neugebauer, see

Christoph Johannes Neugebauer. Darboux functions of Baire class one and derivatives. Proc. Amer. Math. Soc., 13 (6), (1962), 838–843. MR0143850 (26 #1400).

For concreteness, I will restrict discussion to functions f:[0,1]\to\mathbb R, although this makes no real difference. Whenever an interval is mentioned, it is understood to be nondegenerate. For any closed subinterval I\subseteq [0,1], we write I^\circ for its interior, and \mathrm{lh}(I) for its length. Given a point x\in[0,1], we write I\to x to indicate that \mathrm{lh}(I)\to 0 and x\in I.

Theorem (Neugebauer). A function f:[0,1]\to\mathbb R is a derivative iff to each closed subinterval I of {}[0,1] we can associate a point x_I\in I^\circ in such a way that the following hold: 

  1. For all x\in[0,1], if I\to x, then f(x_I)\to f(x), and 

  2. For all  closed subintervals I,I_1,I_2 of {}[0,1], if I=I_1\cup I_2 and {I_1}^\circ\cap {I_2}^\circ=\emptyset, then \mathrm{lh}(I)f(x_I)=\mathrm{lh}(I_1)f(x_{I_1})+\mathrm{lh}(I_2)f(x_{I_2}).

Read the rest of this entry »