- Research
- Open access
- Published:
Fast computation of function composition derivatives for flatness-based control of diffusion problems
Journal of Mathematics in Industry volume 14, Article number: 15 (2024)
Abstract
The chain rule is a standard tool in differential calculus to find derivatives of composite functions. Faà di Bruno’s formula is a generalization of the chain rule and states a method to find high-order derivatives. In this contribution, we propose an algorithm based on Faà di Bruno’s formula and Bell polynomials (Bell in Ann Math 29:38–46, 1927; Parks and Krantz in A primer of real analytic functions, 2012) to compute the structure of derivatives of function compositions. The application of our method is showcased using trajectory planning for the heat equation (Laroche et al. in Int J Robust Nonlinear Control 10(8):629–643, 2000).
1 Introduction
The chain rule is a standard tool to differentiate function compositions in differential calculus. The first few derivatives can usually be found manually in case of simple functions, but the calculation complexity increases by the order of differentiation. Thus, high-order derivatives are hard to find without the use of a computer.
In the research of flatness-based control we need a high number of derivatives to design open-loop control algorithms for large-scale systems like partial differential equations. An introduction to motion planning for systems governed by partial differential equations can be found in [3]. The smooth bump function
with \(T>0\) and \(\omega > 1\) is a typical example, which need to be differentiated several times to compute the flatness-based input signal. Bump function (1) and its derivatives are essential to find open-loop control signals for diffusion-convection-reaction systems because bump function (1) is a so called Gevrey-class function, which means that it is particularly smooth. Further information about the control design and the Gevrey class can be found in [3] in Chap. 6 and appendix B.1, and in [2, 4, 5].
Function compositions like Eq. (1) can be differentiated numerically, symbolically or with algorithmic differentiation. Numerical approaches like finite differences may not be applicable on high-order derivatives because they depend strongly on the order of approximation. Exact derivatives can be found with symbolic differentiation, which is provided by computer algebra systems like Maxima or Mathematica. However, the computed derivatives need to be transferred to source code for further steps of the control design. The authors of [6] implemented the derivatives of bump function (1) up to a order of 43 with Matlab. Algorithmic differentiation gained a lot of research interest in recent years because it is mainly used in optimization and machine learning to find gradients and Hessians [7].
In this article, we describe an alternative way to find the derivatives of arbitrary function compositions \(\phi (\gamma (t))\). Our proposed approach is based on Bell polynomials and Faà di Bruno’s formula [1] and calculates the structure of all derivatives up to an order n. The algorithm is implemented in Julia programming language [8] and is available on GitHub [9]. Related work about the computation of Bell polynomials and Faà di Bruno’s formula can be found in [10] and [11]. In Sect. 2, we introduce Faà di Bruno’s formula and explain its construction with Bell polynomials. The algorithm to build the structure of derivatives is noted in Sect. 3. The performance of this algorithm and its implementation are examined in Sect. 4. Finally, we demonstrate the application of our method on bump function (1) in Sect. 5.
2 Bell polynomials and Faà di Bruno’s formula
A function composition \(\phi (\gamma (t)) = \phi \circ \gamma (t)\) is differentiated using the chain rule as \(\frac{d}{d t} \phi (\gamma (t)) = \frac{d}{d \gamma} \phi (\gamma (t)) \frac{d}{d t} \gamma (t)\) with the assumption that \(\gamma : \mathbb{R} \rightarrow \mathbb{R}\) and \(\phi : \mathbb{R} \rightarrow \mathbb{R}\) are sufficiently continuously differentiable. An increasing order of differentiation leads to more and more complicated and error-prone manual calculations with an increasing number of terms. The chain rule for the n-th order derivative with \(n \geq 1\) is generalized by Faà di Bruno’s formula
where \(B_{n,k}\) denote the partial Bell polynomials, see [12] and [1, page 17]. The partial Bell polynomials are defined generally by
where all \(j_{i} \in \mathbb{N}_{0}\) and conditions \(\sum_{i}^{n-k+1} j_{i} = k\) and \(\sum_{i}^{n-k+1} i j_{i} = n\) have to be satisfied. Henceforth, we denote the derivatives of γ as \(x_{i} = \gamma ^{(i)}(t)\). Equation (3) is rather a formal description of the Bell polynomials than a way to calculate them. Instead, partial Bell polynomials are calculated recursively by
with \(a_{n,i} = \binom{n-1}{i-1}\) and for \(n \geq 1\), \(k \leq n\). Recursion (4) is initialized by \(B_{0,0}(\cdot )= 1\), \(B_{n,0}(x_{1},\ldots , x_{n+1}) = 0\) for \(n \geq 1\) and \(B_{n,k}(x_{1},\ldots ,x_{n-k+1}) = 0\) for \(n \geq 0\) and \(k > n\). To find \(B_{n,k}\) for any useful combination \((n,k)\) we need to know all previous Bell polynomials, e.g. \(B_{n-1,k-1}\), \(B_{n-2,k-1}\), \(B_{n-2,k-2}\), etc. This means, that the calculation of a n-th order derivative as in Eq. (2) using \(B_{n,k}\) depends on all the derivatives with \(B_{\nu ,k}\) for \(\nu \in \{1,2,\ldots ,n-1\}\). If we consider \(B_{\nu ,k}\) and identify \(k=1\) and \(k=\nu \) in (4), then we yield the first and last monomial
where \(a_{\nu ,\nu} = \binom{\nu -1}{\nu -1} = 1\) and \(a_{\nu ,1} = \binom{\nu -1}{0} = 1\).
We are not interested in executing Faà di Bruno’s formula (2) with Bell polynomials (4) for each time step over and over again. Instead, we calculate once the structure of all partial Bell polynomial \(B_{\nu ,k}\) with recursion (4). Each partial Bell polynomial \(B_{\nu ,k}\) consists of \(m_{\nu ,k}\) monomials
with \(\mu \in \{1,2,\ldots , m_{\nu ,k}\}\), \(\nu \in \{1,2,\ldots ,n\}\) and \(n\geq 1\) as the highest order of all derivatives. We compute and store exponents \(j_{\mu ,i}\) and coefficients \(\alpha _{\nu ,k,\mu}\). In this manner, we have to execute the recursive algorithm (4) only once for building the polynomial structures - the monomial coefficients and the corresponding exponents - and not for every function evaluation of \(\frac{d^{\nu}}{d t^{\nu}} \phi (\gamma (t))\). Finally, the known derivatives \(\frac{d^{\nu}}{ds^{\nu}} \phi (s)\) and \(\frac{d^{\nu}}{dt^{\nu}} \gamma (t)\) are inserted in the computed structures. This means, that the algorithm to build the structure of derivatives is independent of the specific function composition.
3 Algorithmic representation
The procedure to build the polynomial structures is split into two parts: firstly, calculating the exponents \(j_{\mu ,i}\) and secondly, finding the coefficients \(\alpha _{\nu ,k,\mu}\). We begin with the computation of exponents \(j_{\mu ,i}\) of monomial \(\beta _{\nu ,k,\mu}\) from Equation (7) which are stored in vector
The notation of \(j_{\mu ,i}\) of vector \(J_{\nu ,k,\mu}\) is shortened to improve the readability and it refers to \(j_{\nu ,k,\mu ,i}\) with \(i\in \{1,\ldots ,\nu -k+1\}\). In this setting, we interpret the multiplication
with \(i\in \{1,\ldots ,\nu -k+1\}\) as vector addition
with unit vector \(e_{i} = (0_{i-1}, 1, 0_{n-i})^{\top}\). Furthermore, all vectors \(J_{\nu ,k,\mu}\) are summarized as matrix
The number of columns \(m_{\nu ,k}\) (or equally m) increases by ν and k. This behavior is discussed in Sect. 4. Next, we present the algorithm to compute the vectors \(J_{\nu ,k,\mu}\). We implement three nested loops to iterate over \(k \in \{1,2,\ldots ,\nu \}\) (inner loop) and \(\nu \in \{1,2,\ldots , n\}\) (outer loop) where n is the highest order of differentiation.
3.1 Step 1: initialization
We set the maximum order of differentiation n and use identity \(B_{0,0}=1\) to initialize the first element \(\mathcal{J}_{0,0} = [J_{0,0,1}] = [0_{n}]\) as a zero vector. Further, we adopt the identities in (5), (6) and find for each iteration of the loop \(\nu \in \{1,2,\ldots n\}\) the first and last entry as
3.2 Step 2: matrix computation
In the inner loop \(k \in \{2,\ldots ,\nu -1\}\), we define matrix
and apply for all \(i\in \{1,\ldots ,\nu -k+1\}\) the matrix summation
as described in Equation (9). The result of matrix summation (11) is saved for \(i=1\) as \(\tilde{\mathcal{J}}_{\nu ,k} \leftarrow \mathcal{T}\) and for \(i>1\) we apply the concatenation \(\tilde{\mathcal{J}}_{\nu ,k} \leftarrow [\tilde{\mathcal{J}}_{\nu ,k}, \mathcal{T}]\). We use \(\tilde{\mathcal{J}}_{\nu ,k} = [\tilde{J}_{\nu ,k,1}, \ldots , \tilde{J}_{\nu ,k,m} ]\) as temporary storage because we are only interested in the unique columns of \(\tilde{\mathcal{J}}_{\nu ,k}\). Therefore, in the end of each iteration k we filter for unique entries and save the result as
3.3 Step 3: coefficients and final formulation
We derive the coefficients \(\alpha _{\nu ,k,\mu}\) from the definition of the partial Bell polynomials (3) as
where \(j_{\mu ,i}\) (or equally \(j_{\nu ,k,\mu ,i}\)) are the matrix elements of \(\mathcal{J}_{\nu ,k}\) as introduced in Equations (8), (10). Finally, we find all partial Bell polynomials by evaluating
Consequently, we yield the ν-th derivative as
4 Applicability and performance
Several methods exist to calculate derivatives via numerical, symbolic or algorithmic differentiation as mentioned in Sect. 1. Our proposed method shall evince an additional path beside symbolic and algorithmic differentiation. The proposed algorithm in Sect. 3 calculates the polynomial structure of all derivatives up to the n-th order of any function composition \(\phi (\gamma (t)) = \phi \circ \gamma (t)\). Thus, the algorithm does not depend on any specific functions ϕ and γ. However, the derivatives \(\frac{d^{\nu}}{ds^{\nu}} \phi (s)\) and \(\frac{d^{\nu}}{dt^{\nu}} \gamma (t)\) for \(\nu \in \{1,\ldots ,n\}\) must exist and need to be implemented in source code to finally compute the derivatives. Table 1 contains a list of some example functions and their ν-th order derivatives, which can be used for \(\phi (x)\), \(\gamma (x)\), \(\frac{d^{\nu}}{dx^{\nu}}\phi (x)\) and \(\frac{d^{\nu}}{dx^{\nu}}\gamma (x)\), to demonstrate the algorithm. The advantage of these example function is that the derivatives can be directly implemented in source code. The Gamma function in Table 1 is described by \(\Gamma (z) = \int _{0}^{\infty} t^{z-1} \exp (-t) \,dt\).
This means that our approach is not based on numerical, symbolic or algorithmic differentiation. In particular, it does not use symbolic computation to find the derivatives. However, the authors cannot exclude that existing computer algebra systems or symbolic toolboxes does not use ideas or parts of our presented approach.
One of the main challenges of our approach is its performance because each new derivative depends on all of its previous derivatives. The described algorithm consists of two parts: the computation of matrices \(\mathcal{J}_{\nu ,k}\) and coefficients \(a_{\nu ,k,\mu}\). The algorithm to compute \(\mathcal{J}_{\nu ,k}\) consists of three nested loops. In each ν-th outer iteration the number of inner for-loop iterations is found by
This quadratic scaling increases the computing time significantly in case of high-order derivatives like \(n > 20\). The number of iterations \(\operatorname{iter}_{\mathcal{J}}(\nu )\) is listed in Table 2 and the computing time is portrayed in Fig. 1 (a). We observe from the computations of \(\mathcal{J}_{\nu ,k}\), that the number of its columns can be described by
for \(k \in \{1,\ldots ,\nu -1\}\) with \(m_{\nu ,\nu}=1\). This integer sequence is known as A008284 [13]. Thus, for each derivative ν we need
iterations to calculate the coefficients \(\alpha _{\nu ,k,\mu}\). The resulting number of iterations \(\operatorname{iter}_{\alpha}(\nu )\) is listed in Table 2 and the computing time to find the coefficients \(\alpha _{\nu ,k,\mu}\) is displayed in Fig. 1 (b). We find that the computation of \(\alpha _{\nu ,k,\mu}\) performs much slower than \(\mathcal{J}_{\nu ,k}\). We already discussed, that the computation of the ν-th element of \(\mathcal{J}_{\nu ,k}\) depends on all previous ones. In contrast to \(\mathcal{J}_{\nu ,k}\), the ν-th element of \(\alpha _{\nu ,k,\mu}\) is computed separately. Thus, the computation of \(\alpha _{\nu ,k,\mu}\) might be intensively parallelized to reduce the computing time. The computing times in Fig. 1 are measured for the computation of \(\mathcal{J}_{\nu ,k}\) and \(\alpha _{\nu ,k,\mu}\) separately for \(n\in \{1,2,\ldots ,40\}\) derivatives. This procedure is executed 20 times to find the mean computing time and its upper and lower bounds are found for both cases. The high values in Fig. 1 (a) for \(n\in \{2,\ldots ,5\}\) only occur in the first cycle of the 20 executions. The performance tests were carried out with Julia Version 1.9.3, processor Intel Core i5-8350U CPU with 1.70 GHz × 8, Ubuntu 22.04.3 LTS (64-bit).
5 Application example
In this section, we apply the presented methods on bump function (1) to generate the input signal
with \(\hat{\Omega}:=\int _{0}^{T} \Omega _{\omega ,T}(\tau ) \,\mathrm{d}\tau \) for a one-dimensional heat conduction model. This control design is introduced and discussed in [2]. A step-by-step derivation of input signal (14) is noted in [5]. We consider the temperature \(\theta : [t,x] \in [0,T] \rightarrow \mathbb{R}_{\geq 0}\) in a one-dimensional rod with length \(L>0\). Input function (14) shall steer the measured temperature \(y(t) = \theta (t,L)\) of the linear heat equation
from \(y(0)=0\) to \(y(T)=y(0)+1=1\). The input signal is applied on the left side and a thermal isolation is assumed on the right side via the boundary conditions
We set the parameters as \(\omega =2\), \(T=10\) seconds, \(L=1\) meter and the initial value as \(\theta (0,x)=0\). Theoretically, an infinite number of derivatives of bump function (1) is necessary to compute input signal \(u(t)\). We assume \(n=10\) for implementation purposes because the series of input signal (14) converges after few iterations, see also [2, 5]. Bump function (1) is formulated as \(\exp ( f(g(t)) )\) for \(t\in (0,T)\) where
The algorithm builds the polynomial structure once and the computed structure is applied here two times: in the first step on \(f(g(t))=:q(t)\) and in the second step on \(\Omega _{\omega ,T}(t) = \exp (q(t))\). The derivatives of \(f(s)\) and \(g(t)\) are found as
See also the derivatives in Table 1. Our algorithm computes matrices \(\mathcal{J}_{\nu ,k}\) and coefficients \(a_{\nu ,k,\mu}\). We yield the derivatives \(\frac{d^{\nu}}{dt^{\nu}} q(t)\) with Eq. (13). Next, we use the same data \(\mathcal{J}_{\nu ,k}\) and \(a_{\nu ,k,\mu}\) with \(\frac{d^{\nu}}{dt^{\nu}} q(t)\) and Eq. (13) to compute the derivatives \(\frac{d^{\nu}}{dt^{\nu}} \Omega _{\omega ,T}(t) = \frac{d^{\nu}}{dt^{\nu}} \exp (q(t))\). Bump function \(\Omega _{\omega ,T}(t)\) and its computed derivatives Ω̇, Ω̈ and \(\Omega ^{(8)}\), \(\Omega ^{(9)}\), \(\Omega ^{(10)}\) are portrayed in Fig. 2. One notes that the minima and maxima increase with the order of differentiation. The input function \(u(t)\) is computed as in Eq. (14) using the derivatives \(\Omega _{\omega ,T}^{(\nu )}\) and is depicted in Fig. 3.
6 Conclusion
In this contribution, we presented a procedure to compute the polynomial structure of derivatives of function compositions using Bell polynomials and Faà di Bruno’s formula. The presented algorithm is implemented as Julia source code and is applied to an example function from flatness-based control. One of the main challenges of our proposed method is the performance of the computation of coefficients \(\alpha _{\nu ,k,\mu}\). These costs may be reduced in subsequent development of the proposed method using parallelization as briefly noted in Sect. 4. In further research, symbolic and algorithmic differentiation methods need to be compared substantially with Faà di Bruno’s approach to improve the computational differentiation of function compositions.
References
Parks HR, Krantz SG. A primer of real analytic functions. Basel: Birkhäuser; 2012.
Laroche B, Martin P, Rouchon P. Motion planning for the heat equation. Int J Robust Nonlinear Control. 2000;10(8):629–43.
Meurer T. Control of higher–dimensional PDEs: flatness and backstepping designs. Berlin: Springer; 2012.
Utz T, Graichen K, Kugi A. Trajectory planning and receding horizon tracking control of a quasilinear diffusion-convection-reaction system. IFAC Proc Vol. 2010;43(14):587–92.
Scholz S, Berger L, Lebiedz D. Benchmarking of flatness-based control of the heat equation. 2023. arXiv preprint. arXiv:2307.16764.
Fischer F, Gabriel J, Kerschbaum S. coni-a Matlab toolbox facilitating the solution of control problems. Zenodo. 2021. Available: https://zenodo.org/record/6420876.
Baydin AG, Pearlmutter BA, Radul AA, Siskind JM. Automatic differentiation in machine learning: a survey. J Mach Learn Res. 2018;18:1–43.
Bezanson J, Edelman A, Karpinski S, Shah VB. Julia: a fresh approach to numerical computing. SIAM Rev. 2017;59(1):65–98.
Scholz S. BellBruno.jl. Zenodo. 2023. Available: https://doi.org/10.5281/zenodo.7685927.
Di Nardo E, Guarino G, Senato D. A new algorithm for computing the multivariate Faa di Bruno’s formula. Appl Math Comput. 2011;217(13):6286–95.
Taghavian H. A fast algorithm for computing Bell polynomials based on index break-downs using prime factorization. 2020. arXiv preprint. arXiv:2004.09283.
Bell ET. Partition polynomials. Ann Math 1927;29:38–46.
The OEIS Foundation Inc. A008284. 2023. Available: https://oeis.org/A008284 [Accessed: Nov. 09, 2023].
Danish S, Krumbiegel J. Makie.jl: flexible high-performance data visualization for Julia. J Open Sour Softw. 2021;6(65):3349.
Rackauckas C. contributors: SciML/DifferentialEquations.jl. Zenodo. 2022. Available: https://doi.org/10.5281/zenodo.7239171.
Acknowledgements
Stephan Scholz thanks Ferdinand Fischer for the correspondence on Gevrey functions and related approaches.
Funding
Open Access funding enabled and organized by Projekt DEAL. Both authors are funded by the state of Baden-Württemberg (Germany) for their positions at University of Applied Sciences Ravensburg-Weingarten (Hochschule Ravensburg-Weingarten, 88250 Weingarten, Germany).
Author information
Authors and Affiliations
Contributions
SS wrote this article and developed the software. LB proved the results and improved the quality of this article. All authors read and approved the final manuscript.
Corresponding author
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Scholz, S., Berger, L. Fast computation of function composition derivatives for flatness-based control of diffusion problems. J.Math.Industry 14, 15 (2024). https://doi.org/10.1186/s13362-024-00143-y
Received:
Accepted:
Published:
DOI: https://doi.org/10.1186/s13362-024-00143-y