# Non-linear equation, numerical methods

Iteration methods for the solution of non-linear equations.

By a non-linear equation one means (see [1]–[3]) an algebraic or transcendental equation of the form

$$ \tag{1 } \phi ( x) = 0, $$

where $ x $ is a real variable and $ \phi ( x) $ a non-linear function, and by a system of non-linear equations a system of the form

$$ \tag{2 } \left . \begin{array}{c} \phi _ {1} ( x _ {1} \dots x _ {N} ) = 0 , \\ {\dots \dots \dots } \\ \phi _ {N} ( x _ {1} \dots x _ {N} ) = 0 , \\ \end{array} \right \} $$

that is not linear; the solutions of (2) are $ N $- dimensional vectors $ x = ( x _ {1} \dots x _ {N} ) $. Equation (1) and system (2) can be treated as a non-linear operator equation:

$$ \tag{3 } L ( u) = f $$

with a non-linear operator $ L $ acting from a finite-dimensional vector subspace $ H _ {N} $ into $ H _ {N} $.

Numerical methods for the solution of a non-linear equation (3) are called iteration methods if they are defined by the transition from a known approximation $ u ^ {n} $ at the $ n $- th iteration to a new iteration $ u ^ {n+} 1 $ and allow one to find in a sufficiently large number of iterations a solution of (3) within prescribed accuracy $ \epsilon $. The most important iteration methods for the approximate solution of (3), both of a general and of a special form, characteristic for discrete (grid) methods for solving boundary value problems for equations and systems of partial differential equations of strongly-elliptic type, are the object of study of the present article. Non-linear operator equations connected with the discussion of infinite-dimensional spaces (see, for example [4]–[8]) are a very broad mathematical concept, including as special cases, for example, non-linear integral equations and non-linear boundary value problems. Numerical methods for the approximate solution of them include also methods for their approximation by finite-dimensional equations; these methods are treated separately.

One of the most important methods for solving an equation (3) is the simple iteration method (successive substitution), which assumes that one can replace (3) by an equivalent system

$$ \tag{4 } u = P ( u) , $$

where $ u = ( u _ {1} \dots u _ {N} ) $ is an element of a finite-dimensional normed space $ H _ {N} $ and $ P $, mapping $ H _ {N} $ into $ H _ {N} $, is a contractive operator:

$$ \tag{5 } \exists q < 1 \ \forall u , v \ \| P ( u) - P ( v) \| \leq \ q \| u - v \| . $$

Then by a general contractive-mapping principle (see [1]–[4] and Contracting-mapping principle) equation (1) has a unique solution, the method of simple iteration

$$ \tag{6 } u ^ {n+} 1 = P ( u ^ {n} ) ,\ \ n = 0 , 1 \dots $$

converges for any initial approximation $ u ^ {0} $, and the error $ z ^ {n} $ at the $ n $- th iteration satisfies the estimate

$$ \| z ^ {n} \| \leq \ q ^ {n} ( 1- q ) ^ {-} 1 \| u ^ {0} - u ^ {1} \| . $$

Suppose that some solution $ \overline{u}\; $ of (3) has a surrounding ball $ S ( \overline{u}\; , R ) \equiv \{ {v } : {\| v - \overline{u}\; \| \leq R } \} $ such that the system (3) considered together with the additional condition

$$ \tag{7 } u \in S ( \overline{u}\; , R ) $$

is equivalent to (4) considered together with (7), and that (5) holds for $ u = \overline{u}\; $ and any $ v = \overline{u}\; + z $ with $ \| z \| \leq R $. Then for a choice of an initial approximation $ u ^ {0} $ from $ S ( \overline{u}\; , R ) $ in the method (6) convergence of $ u ^ {n} $ to $ \overline{u}\; $ with an error estimate $ \| z ^ {n} \| \leq q ^ {n} R $ is guaranteed.

For twice continuously-differentiable functions $ \phi _ {i} $, if a good initial approximation to a solution of the system (2) is available, frequently an effective means of improving the accuracy is the Newton–Kantorovich method, in which an equation $ \phi _ {1} ( x _ {1} \dots x _ {N} ) = 0 $ from (2) determining a certain surface $ P _ {i} $ is replaced by the equation of the tangent surface to $ P _ {i} $ at the point $ x ^ {n} $, where $ x ^ {n} $ is a previously obtained approximation to a solution of (2) (see [1]–[5]). Under certain additional conditions the Newton–Kantorovich method leads to an error estimate of the form

$$ \| z ^ {n} \| \leq \ c _ {1} ( c \| z ^ {0} \| ) ^ {2 ^ {n} } , $$

where $ c _ {i} $ and $ c $ are certain constants. At every iteration of this method one has to solve the system of linear algebraic equations with matrix

$$ A ^ {(} n) \equiv \ \left . \left ( \frac{\partial \phi _ {i} }{\partial x _ {j} } \right ) \right | _ {x = x ^ {n} } . $$

Sometimes one keeps this matrix fixed for several iterations, sometimes one replaces the derivatives $ \partial \phi _ {i} / \partial x _ {j} $ by difference approximations.

The Newton–Kantorovich method belongs to the group of linearization methods (3). Another method in this group is the secant method.

A large number of iteration methods (the so-called descent methods) (see [1]–[3], [9], [10]–[13]) are based on replacing the solving of the equations (3) by minimizing a certain functional $ I ( u) $( cf. also Descent, method of). For example, for $ I ( u) $ one can take

$$ \tag{8 } I ( u) \equiv \| L ( u) - f \| ^ {2} . $$

In a number of cases when the initial non-linear equations are the Euler equations for the problem of minimizing a certain functional $ I ( u) $, such a variational formulation of the problem is even more natural; the operators $ L $ in similar situations are gradients of the functional $ I ( u) $ and are called potential operators (see [5]–[6]). Among the several versions of descent methods one can mention the methods of coordinate-wise descent, several gradient methods and, in particular, the method of steepest descent, the method of conjugate gradients, and others, and also their modifications (see [2], [9], [10]–[13]). A number of iteration methods for the solution of the equations (3), describing a certain stationary state, can be treated as discretizations of the corresponding non-stationary problems. Therefore, methods of this class are called stabilization methods (cf. Adjustment method and, for example, [2]). Examples of such non-stationary problems are problems described by a system of ordinary differential equations

$$ A _ {1} \frac{d ^ {2} u }{d t ^ {2} } + A _ {2} \frac{d u }{d t } + L ( u) - f = 0 . $$

The introduction of an additional independent variable is characteristic for the method of differentiation with respect to a parameter (a continuation method, see [5], [13]). Its essence consists in the introduction of an auxiliary parameter $ \lambda \in [ 0 , 1 ] $, the choice of continuously-differentiable functions $ F _ {i} ( x , \lambda ) $, and replacement of (2) by the system

$$ \tag{9 } F _ {i} ( x , \lambda ) = 0 ,\ \ i = 1 \dots N ,\ \ 0 \leq \lambda \leq 1 ; $$

for $ \lambda = 0 $ the system (9) must be easy to solve, and the functions $ F _ {i} ( x , 1 ) $ must coincide with the $ \phi _ {i} ( x) $, $ i = 1 \dots N $. Generally speaking, the system (9) determines

$$ x ( \lambda ) \equiv \ ( x _ {1} ( \lambda ) \dots x _ {N} ( \lambda ) ) $$

as a function of $ \lambda $, and the required solution of (2) is $ x ( 1) $. If (9) is differentiable with respect to $ \lambda $, the result is a system of ordinary differential equations

$$ \tag{10 } \sum _ { i= } 1 ^ { N } \frac{\partial F _ {i} }{\partial x _ {j} } \frac{d x _ {j} }{d \lambda } + \frac{\partial F _ {i} }{\partial \lambda } = 0 ,\ \ i = 1 \dots N . $$

If one solves the Cauchy problem for it on $ 0 \leq \lambda \leq 1 $ with initial conditions that are solutions of the system

$$ F _ {i} ( x , 0 ) = 0 ,\ \ i = 1 \dots N , $$

then one finds a solution of (2). Discretization of (10) with respect to $ \lambda $ leads to a numerical method for solving the system (2).

In the method of continuation with respect to a parameter (cf. Continuation method (to a parametrized family, for non-linear operators)) the system (9) is solved for $ \lambda = \tau \dots m \tau $, $ m \tau = 1 $, and for every one of these values of $ \lambda $ one applies a certain iteration method with an initial approximation that agrees with the approximation obtained by solving the system for the preceding value of $ \lambda $. Both these methods are in essence iteration methods for solving (2) with a special procedure for finding a good initial approximation.

In the case of systems the problem of localizing the solution causes great difficulty. Since the majority of iteration methods converge only in the presence of a fairly good approximation to a solution, the two methods described above often make it possible to dispense with the need to localize a solution directly. For localization one also frequently uses theorems based on topological principles and on the monotonicity of operators (see [4]–[8]).

For the solution of equations (1), which are the simplest special cases of (3), the number of known iteration methods applicable in practice is very large (see, for example, [1]–[3], [12], [14]). Apart from the ones already considered one can mention, for example, the iteration methods of higher order (see [1], [14]), which include Newton's methods as a special case, and the many iteration methods especially oriented to finding real or complex roots of polynomials

$$ p _ {n} ( z) \equiv \ a _ {n} z ^ {n} + a _ {n-} 1 z ^ {n-} 1 + \dots + a _ {0} , $$

where the $ a _ {i} $ are real or complex numbers (see [1], [2]).

The problem of localizing the solutions of (1) reduces to a search for an interval at the ends of which the continuous function $ \phi $ takes values of opposite sign. When $ \phi $ is a polynomial, the task is less subtle, since theoretical bounds are known (see [1]) and there are methods for finding all roots with required accuracy without giving good initial approximations to them (see [12]).

Iteration methods for solving equations (3) that arise in grid analogues of non-linear boundary value problems for partial differential equations are special cases of methods for solving grid systems (see, for example, [13], [15]–[19]). Probably one of the most intensively applied methods for solving (3) is a modified method of simple iteration, which can be can be written in the form

$$ \tag{11 } B u ^ {n+} 1 = B u ^ {n} - \gamma ( L ( u ^ {n} ) - f ) , $$

where (3) is regarded as an operator equation in the $ N $- dimensional space $ H _ {N} = H $, and $ B \in S $, where $ S $ stands for the set of symmetric positive linear operators mapping $ H $ into itself. An expedient study of such methods should proceed not in $ H $, but in the space $ H _ {B} $ with the new scalar product

$$ ( u , v ) _ {H _ {B} } \equiv \ ( B u , v ) ,\ \ \| u \| ^ {2} \equiv \ ( B u , u ) , $$

where $ ( u , v ) $ is the scalar product in $ H $.

If the operator $ L $ satisfies the conditions of strict monotonicity and is Lipschitz continuous:

$$ \tag{12 } \forall u , v \ ( L ( u) - L ( v) , u - v ) \geq \delta _ {0} \| u - v \| _ {B} ^ {2} ,\ \ \delta _ {0} > 0 , $$

$$ \tag{13 } \forall u , v \ \| L ( u) - L ( v) \| {B ^ {- 1 } } ^ {2} \leq \delta _ {2} \| u - v \| _ {B} ^ {2} , $$

then (3) has a unique solution and the method (11), for a suitable choice of $ \gamma $, converges for any $ u ^ {0} $ with error estimate

$$ \tag{14 } \| z ^ {n} \| _ {B} \leq q ^ {n} \| z ^ {0} \| _ {B} , $$

where $ q \equiv q ( \delta _ {0} , \delta _ {2} , \gamma ) < 1 $( see [13], [15]).

In the most general version of this theorem it suffices to require that (12) and (13) hold for a solution $ u $ and all $ v $ in a ball $ S _ {B} ( u , R ) \equiv \{ {v } : {\| v - u \| \leq R } \} $ and that $ u ^ {0} $ lies in this ball (see [13]). In this case the constants $ \delta _ {0} $ and $ \delta _ {2} $ may depend on $ R $. To verify these conditions it suffices, e.g., to localize $ u $ by means of a priori estimates, obtaining $ \| u \| _ {B} \leq R _ {0} $, and then to take $ S _ {B} ( u , R ) \equiv S _ {B} ( u , R _ {1} - R _ {0} ) $, provided that (12) and (13) hold for any $ u $ and $ v $ in $ S _ {B} ( 0 , R _ {1} ) $, where $ R _ {1} > R _ {0} $. The constant $ q $ in (14) can be diminished if $ L $ is differentiable and for its derivatives $ L _ {v} ^ \prime $, represented as the sum of the symmetric part $ L _ {v , \mathop{\rm sym} } ^ \prime $ and the skew-symmetric part $ L _ {v , \textrm{ skew } } ^ \prime $ the inequalities

$$ \forall v \in S _ {B} ( u , R ) \ \sigma _ {0} B \leq L _ {v , \mathop{\rm sym} } ^ \prime \leq \sigma _ {1} B ,\ \sigma _ {0} > 0 ; $$

$$ \forall w \| L _ {v , \textrm{ skew } } ^ \prime w \| _ {B ^ {- 1 } } ^ {2} \leq \sigma _ {2} \| w \| _ {B} ^ {2} $$

are known. Then $ q \equiv q ( \gamma , \sigma _ {0} , \sigma _ {1} , \sigma _ {2} ) $( see [11], [13], [15]). Sometimes in the discussion of certain types of non-linearity it is advisable to use instead of (12) and (13) inequalities like

$$ ( L ( u) - L ( v) , B ( u - v ) ) \geq \overline \delta \; _ {0} \| B ( u - v ) \| ^ {2} ,\ \ \overline \delta \; _ {0} > 0 ; $$

$$ \| L ( u) - L ( v) \| ^ {2} \leq \overline \delta \; _ {2} \| B ( u - v ) \| ^ {2} $$

(see [13]). For the operators $ B $ in (11) one can use, for example, splitting difference operators (the method of alternating directions) or factorized difference operators (the alternating-triangle method, the incomplete matrix factorization method), and others. Most attractive from the asymptotic point of view is the use of operators $ B $ such that the constants $ \delta _ {i} $ or $ \overline \delta \; _ {i} $ do not depend on the dimension of the space $ H _ {N} $( see [13]) and the operators $ B $ are sufficiently simple. Along these lines one succeeds in a number of cases in constructing iteration methods that make it possible to find a solution of (3) with accuracy $ \epsilon $ at the expense of altogether $ O ( N \mathop{\rm log} N | \mathop{\rm log} \epsilon | ) $( or even $ O ( N \mathop{\rm log} N ) $) arithmetic operations for $ \epsilon \approx N ^ {- \alpha } $, $ \alpha > 0 $( see [13]), if the computational work in the evaluation of $ L ( u ^ {n} ) $ for a given $ u ^ {n} $ can be estimated by $ O ( N) $. To verify conditions of the type (12) and (13), in many cases the grid (difference) analogues of the Sobolev imbedding theorems turn out to be very powerful (see [13]). It is important to take into account the specific nature of the non-linearity. For example, when $ L ( u) = \Lambda u + Pu $, where $ \Lambda $ is a positive linear operator and $ P $ a quadratic non-linear operator having the property of "skew symmetryskew-symmetry" (that is, $ ( P u , u ) = 0 $ for all $ u $), one often succeeds in obtaining the constant $ \delta _ {2} ( R) $ in any ball $ S _ {B} ( 0 , R ) $ and $ \delta _ {0} $ depending only on $ \| u \| _ {B} $; then (11) converges for any $ u ^ {0} $( see [13]). In a number of cases one can replace the original problem on the basis of a priori estimates by an equivalent one for which the required conditions hold in the whole space.

#### References

[1] | I.S. Berezin, N.P. Zhidkov, "Computing methods" , Pergamon (1973) (Translated from Russian) |

[2] | N.S. Bakhvalov, "Numerical methods: analysis, algebra, ordinary differential equations" , MIR (1977) (Translated from Russian) |

[3] | J.M. Ortega, W.C. Rheinboldt, "Iterative solution of non-linear equations in several variables" , Acad. Press (1970) |

[4] | M.A. Krasnosel'skii, G.M. Vainikko, P.P. Zabreiko, et al., "Approximate solution of operator equations" , Wolters-Noordhoff (1972) (Translated from Russian) |

[5] | S.G. Mikhlin, "The numerical performance of variational methods" , Wolters-Noordhoff (1971) (Translated from Russian) |

[6] | M.M. Vainberg, "Variational method and method of monotone operators in the theory of nonlinear equations" , Wiley (1973) (Translated from Russian) |

[7] | H. Gajewski, K. Gröger, K. Zacharias, "Nichtlineare Operatorengleichungen und Operatorendifferentialgleichungen" , Akademie Verlag (1974) |

[8] | J.-L. Lions, "Quelques méthodes de résolution des problèmes aux limites nonlineaires" , Dunod (1969) |

[9] | F.P. Vasil'ev, "Lectures on methods for solving extremal problems" , Moscow (1974) (In Russian) |

[10] | B.N. Pshenichnyi, Yu.M. Danilin, "Numerical methods in extremal problems" , MIR (1978) (Translated from Russian) |

[11] | R. Glowinski, "Numerical methods for nonlinear variational problems" , Springer (1984) |

[12] | V.V. Voevodin, "Numerical methods of algebra. Theory and algorithms" , Moscow (1966) (In Russian) |

[13] | E.G. D'yakonov, "Minimization of computational work. Asymptotically-optimal algorithms" , Moscow (1989) (In Russian) |

[14] | J.F. Traub, "Iterative methods for the solution of equations" , Prentice-Hall (1964) |

[15] | A.A. Samarskii, E.S. Nikolaev, "Numerical methods for grid equations" , 1–2 , Birkhäuser (1989) (Translated from Russian) |

[16] | R. Glowinski, J.-L. Lions, R. Trémolières, "Numerical analysis of variational inequalities" , North-Holland (1981) (Translated from French) |

[17] | W. Hackbusch, "Multigrid solution of continuation problems" R. Ansorge (ed.) Th. Meis (ed.) W. Törnig (ed.) , Iterative solution of nonlinear systems of equations , Lect. notes in math. , 953 , Springer (1982) pp. 20–44 |

[18] | F. Thomasset, "Implementation of finite element methods for Navier–Stokes equations" , Springer (1981) |

[19] | W. Hackbusch (ed.) V. Trottenberg (ed.) , Multigrid Methods. Proc. Köln-Porz, 1981 , Lect. notes in math. , 960 , Springer (1982) |

#### Comments

#### References

[a1] | J.W. Daniel, "The approximate minimization of functionals" , Prentice-Hall (1971) |

**How to Cite This Entry:**

Non-linear equation, numerical methods.

*Encyclopedia of Mathematics.*URL: http://encyclopediaofmath.org/index.php?title=Non-linear_equation,_numerical_methods&oldid=49333