
The Path to Energy Minimum: Fundamentals of Geometry Optimization

In molecular modeling studies, the first step to realistically reflect the physical behavior of a structure is to bring it to its energetically most stable state. This process is known as geometry optimization, and it involves rearranging the positions of a molecule’s atoms in space in such a way that the system's potential energy is minimized. Whether you are working with small organic compounds or large biomolecules, an optimized structure is essential for a successful molecular dynamics simulation or quantum chemical calculation. This article will cover the fundamental principles of geometry optimization and the algorithms commonly used in this process.


The identification (or “location”) of a stationary point on the potential energy surface (PES) (for more details, refer to our article on PES), i.e., proving the existence of such a point and computing its geometry and energy, is referred to as geometry optimization. The stationary point of interest may be a minimum, a transition state, or—less commonly—a higher-order saddle point. Finding a minimum is typically called energy minimization or simply minimization, whereas locating a transition state is specifically referred to as transition state optimization (1).

Figure 1. Geometry optimization toward a minimum yields the local minimum closest to the input structure. The input structure A′ is moved toward the minimum point A; similarly, B′ proceeds toward B. In contrast, finding a transition state (TS) typically requires a specialized algorithm: this algorithm moves the initial structure A toward the transition state TS (1).
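
To make the distinction between these stationary points concrete, the short sketch below (plain NumPy; the example gradient and Hessian are hypothetical numbers, not results for any real molecule) classifies a PES point by counting the negative eigenvalues of its Hessian: none for a minimum, exactly one for a transition state, and more than one for a higher-order saddle point.

```python
import numpy as np

def classify_stationary_point(gradient, hessian, tol=1e-6):
    """Classify a PES point from its gradient and Hessian.

    A stationary point has a (near-)zero gradient; the number of
    negative Hessian eigenvalues then distinguishes a minimum (0),
    a transition state (1), or a higher-order saddle point (>1).
    """
    if np.linalg.norm(gradient) > tol:
        return "not a stationary point"
    eigenvalues = np.linalg.eigvalsh(hessian)   # Hessian is symmetric
    n_negative = int(np.sum(eigenvalues < -tol))
    if n_negative == 0:
        return "minimum"
    if n_negative == 1:
        return "transition state (first-order saddle point)"
    return f"higher-order saddle point (order {n_negative})"

# Hypothetical 2-D example: zero gradient, one negative curvature -> TS
g = np.zeros(2)
H = np.array([[2.0, 0.0], [0.0, -1.0]])
print(classify_stationary_point(g, H))  # transition state (first-order saddle point)
```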

Progressing from an input structure toward the nearest minimum on the potential energy surface (PES) is quite straightforward in the case of a one-dimensional PES of a diatomic molecule. In such a scenario, simply adjusting the bond length is sufficient to identify the value corresponding to the lowest energy. However, on any other type of surface, efficient geometry optimization requires a sophisticated algorithm. One needs to determine which direction to move in and how far to proceed in that direction.
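
As a minimal illustration of this one-dimensional case, the sketch below scans the bond length of a model diatomic described by a Morse potential and simply picks the lowest-energy value; the potential parameters are illustrative, roughly H₂-like, rather than fitted data.

```python
import numpy as np

def morse(r, D_e=4.75, a=1.94, r_e=0.74):
    """Morse potential (eV) for a model diatomic; parameters are illustrative."""
    return D_e * (1.0 - np.exp(-a * (r - r_e)))**2

# One-dimensional "optimization": scan the bond length on a fine grid
# and pick the lowest-energy point.
r_grid = np.linspace(0.4, 3.0, 2000)            # bond lengths in angstroms
energies = morse(r_grid)
r_min = r_grid[np.argmin(energies)]
print(f"optimal bond length ~ {r_min:.3f} angstrom")  # ~0.740, the input r_e
```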

In general, it is not possible to reach the nearest minimum from the input structure in a single step. However, with a reasonable initial geometry, modern geometry optimization algorithms typically reach the minimum in about ten steps (1). The most commonly used algorithms in geometry optimization (2) rely on the first and second derivatives of the energy with respect to the geometric parameters.

Figure 2. An illustration representing the geometry optimization of chemical molecules. (The image was generated using AI.)

Commonly Used Methods in Geometry Optimization


Newton's Method

Newton's method is a classical iterative scheme used to solve a nonlinear system of equations f(x) = 0 or to minimize a multivariate function f(x); these root-finding and optimization problems are closely related. Although the method is commonly attributed to Newton, or to Newton and Raphson, Fourier and Simpson also made significant contributions to its development. To derive the iteration process of Newton's method for the minimization of a one-dimensional function f(x), a quadratic (second-order) approximation around the current point xₖ is used instead of a linear one (3):

f(x) ≈ f(xₖ) + f′(xₖ)(x − xₖ) + ½ f″(xₖ)(x − xₖ)²

Since f(xₖ) is constant, minimizing the second and third terms on the right-hand side of this equation yields the following iteration process:

xₖ₊₁ = xₖ − f′(xₖ) / f″(xₖ)

This Newton scheme is well-defined for minimizing the function f(x) as long as the second derivative f″(xₖ) at the point xₖ is nonzero.
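
A minimal sketch of this one-dimensional Newton iteration is shown below; the test function and starting point are arbitrary choices for illustration.

```python
def newton_minimize_1d(f_prime, f_double_prime, x0, tol=1e-8, max_iter=50):
    """Iterate x_{k+1} = x_k - f'(x_k)/f''(x_k) toward a stationary point.

    The result is a minimum when f'' stays positive near the solution.
    """
    x = x0
    for _ in range(max_iter):
        h = f_double_prime(x)
        if h == 0:
            raise ZeroDivisionError("second derivative vanished; step undefined")
        step = f_prime(x) / h
        x -= step
        if abs(step) < tol:
            break
    return x

# Toy example: f(x) = (x - 2)^2 + 1, with f'(x) = 2(x - 2) and f''(x) = 2.
x_star = newton_minimize_1d(lambda x: 2 * (x - 2), lambda x: 2.0, x0=10.0)
print(x_star)  # 2.0, reached in a single step because f is quadratic
```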

 

Quasi-Newton (QN) Method

The quasi-Newton (QN) method avoids computing the actual Hessian matrix and instead builds up curvature information as the algorithm progresses. In practice, it is usually an approximation B̃ to the inverse of the Hessian that is updated, so that the term Hₖ⁻¹ gₖ in the Newton iteration is replaced by B̃ₖ gₖ. Here, B̃ₖ is shorthand for B̃(xₖ).

The Hessian approximation Bₖ is derived to satisfy the quasi-Newton (secant) condition Bₖ₊₁ sₖ = yₖ, where sₖ = xₖ₊₁ − xₖ and yₖ = gₖ₊₁ − gₖ. Different variants of the QN method define different update formulas that satisfy this condition. Since memory is a critical resource in large-scale applications, the matrix Bₖ (or B̃ₖ) is not stored explicitly as an n × n matrix; instead, its action is expressed through several vector operations. In practice, Bₖ is updated by adding a low-rank update matrix Uₖ (3).
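
As an illustration, the sketch below implements one widely used QN variant, the BFGS update of the inverse-Hessian approximation, on a toy quadratic function. The dense-matrix form is kept for readability (a large-scale code would use the limited-memory, vector-only form alluded to above), and the exact line search is possible here only because the test function is quadratic.

```python
import numpy as np

def bfgs_inverse_update(B_inv, s, y):
    """BFGS rank-two update of the inverse-Hessian approximation.

    With s = x_{k+1} - x_k and y = g_{k+1} - g_k, the updated matrix
    satisfies the quasi-Newton (secant) condition B_inv_new @ y = s.
    """
    rho = 1.0 / (y @ s)
    V = np.eye(len(s)) - rho * np.outer(s, y)
    return V @ B_inv @ V.T + rho * np.outer(s, s)

# Toy quadratic f(x) = 0.5 x^T A x - b^T x, whose gradient is A x - b.
A = np.array([[3.0, 0.5], [0.5, 2.0]])
b = np.array([1.0, 1.0])
grad = lambda x: A @ x - b

x = np.zeros(2)
B_inv = np.eye(2)                        # initial guess for the inverse Hessian
for _ in range(20):
    g = grad(x)
    if np.linalg.norm(g) < 1e-10:
        break
    p = -B_inv @ g                       # QN direction replaces -H^{-1} g
    lam = -(g @ p) / (p @ (A @ p))       # exact line search (quadratic case only)
    x_new = x + lam * p
    B_inv = bfgs_inverse_update(B_inv, x_new - x, grad(x_new) - g)
    x = x_new
print(x, np.linalg.solve(A, b))          # the two vectors should agree
```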

 

Conjugate Gradient (CG) Method

Nonlinear conjugate gradient (CG) methods are another commonly used optimization approach for large-scale problems in which memory usage and computational performance are critical. These methods were first developed in the 1960s by combining the linear CG method (an iterative technique for solving the linear system Ax = b, where A is an n × n matrix) with line-search techniques. In each step of the nonlinear CG method, the search vector pₖ is defined by a recursive formula, and a line search is then applied along it. The iteration process that defines the search vectors is given as follows:

xₖ₊₁ = xₖ + λₖ pₖ,   pₖ₊₁ = −gₖ₊₁ + βₖ pₖ

Here, p₀ = −g₀ is taken, and λₖ is the step length determined by the line search. The parameter βₖ, which is scheme-dependent and determines the search vectors, can be chosen, for example, as in the Fletcher–Reeves scheme:

βₖ = (gₖ₊₁ᵀ gₖ₊₁) / (gₖᵀ gₖ)

If f were a convex quadratic function and the line search were performed exactly (i.e., λₖ were chosen so that xₖ + λₖ pₖ minimizes f exactly along the direction pₖ), the method would reduce to the linear conjugate gradient (CG) method. This reduction is important because the linear CG method is known to terminate in at most n steps in exact arithmetic (3).
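
A minimal nonlinear CG sketch with the Fletcher–Reeves choice of βₖ is given below. Instead of the exact line search discussed above, it uses a simple backtracking (Armijo) search plus a steepest-descent restart safeguard; both are common practical choices rather than part of the CG definition, and the test function is an arbitrary convex bowl.

```python
import numpy as np

def nonlinear_cg(f, grad, x0, tol=1e-8, max_iter=200):
    """Nonlinear CG with the Fletcher-Reeves parameter beta_k."""
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    p = -g                                    # p_0 = -g_0
    for _ in range(max_iter):
        if np.linalg.norm(g) < tol:
            break
        if g @ p >= 0:                        # safeguard: restart along -g
            p = -g
        lam = 1.0                             # backtracking (Armijo) line search
        while f(x + lam * p) > f(x) + 1e-4 * lam * (g @ p):
            lam *= 0.5
        x = x + lam * p
        g_new = grad(x)
        beta = (g_new @ g_new) / (g @ g)      # Fletcher-Reeves formula
        p = -g_new + beta * p                 # recursive search-vector update
        g = g_new
    return x

# Toy example: f(x, y) = (x - 1)^2 + 10 (y - 2)^2, minimized at (1, 2).
f = lambda v: (v[0] - 1)**2 + 10 * (v[1] - 2)**2
grad = lambda v: np.array([2 * (v[0] - 1), 20 * (v[1] - 2)])
print(nonlinear_cg(f, grad, [0.0, 0.0]))      # ~ [1.0, 2.0]
```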


Steepest Descent Method

The classical steepest descent method is one of the oldest methods for minimizing a general nonlinear function (4). The steepest descent method, also known as the gradient descent method, was first proposed by Cauchy (5). In his original paper, Cauchy suggested the use of the gradient as a way to solve a nonlinear equation of the form:

f(x₁, x₂, …, xₙ) = 0

Here, f is a real-valued continuous function that is non-negative and remains continuous at least within certain limits. The basis of the method is the simple observation that a continuous function should decrease, at least initially, when a step is taken along the negative gradient direction. The only challenge then is deciding how to choose the step length.

While the best step length is easy to compute for special cases such as convex quadratic functions, in the general case it usually requires minimizing the function along the negative gradient direction. Despite its simplicity, the steepest descent method has played an important role in the development of optimization theory. Unfortunately, because the method converges quite slowly on most real-world problems, it is not widely used; more powerful methods such as the conjugate gradient method or quasi-Newton methods are employed instead (4).
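
For comparison with the sketches above, here is a minimal steepest descent implementation with the same backtracking step-length rule; on the same toy function it typically needs many more iterations than the CG or quasi-Newton sketches, illustrating the slowness just described.

```python
import numpy as np

def steepest_descent(f, grad, x0, tol=1e-6, max_iter=5000):
    """Cauchy's method: repeatedly step along the negative gradient."""
    x = np.asarray(x0, dtype=float)
    for k in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            return x, k                      # converged after k iterations
        lam = 1.0                            # backtracking (Armijo) line search
        while f(x - lam * g) > f(x) - 1e-4 * lam * (g @ g):
            lam *= 0.5
        x = x - lam * g
    return x, max_iter

x_star, n_iter = steepest_descent(
    lambda v: (v[0] - 1)**2 + 10 * (v[1] - 2)**2,
    lambda v: np.array([2 * (v[0] - 1), 20 * (v[1] - 2)]),
    [0.0, 0.0],
)
print(x_star, n_iter)  # ~ [1.0, 2.0], after noticeably more steps than CG
```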

 

 

Geometry optimization is one of the cornerstones of molecular modeling processes and an important step in improving the accuracy of simulations. Obtaining the stable structures of molecules is of great importance not only for theoretical calculations but also for practical applications. This process provides a solid foundation for more accurate energy calculations, dynamic simulations, and molecular interaction analyses. Geometry optimization plays a critical role in obtaining correct results, especially when working with complex systems. The fundamental principles and methods discussed in this article aim to guide researchers and modeling enthusiasts in conducting more efficient and accurate simulations.



References:

1. Lewars, E. G. (2016). Computational Chemistry: Introduction to the Theory and Applications of Molecular and Quantum Mechanics. Springer.

2. Cramer, C. J. (2004). Essentials of Computational Chemistry. Wiley.

3. Schlick, T. (2010). Molecular Modeling and Simulation: An Interdisciplinary Guide (2nd ed.). Springer. https://doi.org/10.1007/978-1-4419-6351-2

4. Meza, J. C. (2010). Steepest descent. Wiley Interdisciplinary Reviews: Computational Statistics, 2(6), 719–722. https://doi.org/10.1002/wics.117

5. Cauchy, A. (1847). Méthode générale pour la résolution des systèmes d'équations simultanées. Comptes Rendus de l'Académie des Sciences de Paris, 25, 536–538.
