Differentiable ODE Solvers#

WIP

It’s a mess…

These notes need to be cleaned up. For the time being, I’m throwing them below at the speed of thought.

\[\newcommand{\bmx}[1]{\begin{bmatrix}#1\end{bmatrix}}\]

Iterative Methods#

Differentiation Mathematics#

Consider the parameterized dynamical system \(\dot{x} = f(x,p,t)\). Here, \(p\) is the vector of parameters.

Define \(S(p,t) \triangleq \frac{\partial x(p,t)}{\partial p}\). This is known as the sensitivity matrix.

\[\begin{split}\begin{align} \dot{x} &= f(x,p,t),\\ \dot{S} &= \frac{\partial f(x,p,t)}{\partial x} S + \frac{\partial f(x,p,t)}{\partial p}. \end{align}\end{split}\]

Euler’s Formula#

WIP

RK4#

The math for differentiating through Runge-Kutta methods is slightly more involved.

First, here’s the classical Runge-Kutta method (RK4):

(1)#\[\begin{split}\begin{align} x_{n+1} &= x_n + \frac{h}{6}(k_1 + 2k_2 + 2k_3 + k_4), \\ k_1 &\triangleq f(x_n,p,t),\\ k_2 &\triangleq f(x_n+h\frac{k_1}{2},p,t_n+\frac{h}{2}),\\ k_3 &\triangleq f(x_n+h\frac{k_2}{2},p,t_n+\frac{h}{2}),\\ k_4 &\triangleq f(x_n+hk_3,p,t_n+h),\\ \end{align}\end{split}\]

To solve for \(S(p,t)\), there are two possible methods:

Propagate \(S(p,t)\) using RK4
Compute \(\frac{\partial k_i}{\partial p}\) for \(i=\{1,2,3,4\}\).

Method 1: The dynamics for \(S(p,t)\) depend on \(x\) in addition to \(S\). Let’s define a new state:

\[\begin{split}z(p,t) \triangleq \bmx{x(p,t) \\ S(p,t)}\end{split}\]

Next, let’s give the new state some dynamics:

(2)#\[\begin{split}\dot{z} &= g(z,p,t), &= \bmx{f(x,p,t) \\ \frac{\partial f(x,p,t)}{\partial x} S + \frac{\partial f(x,p,t)}{\partial p}}\end{split}\]

Fantastic. Let’s apply (1) to the dynamics in (2):

\[\begin{split}z_{n+1} &= z_n + \frac{h}{2}(k_1 + 2k_2 + 2k_3 + k_4), \\ k_1 &\triangleq g(z_n,p,t),\\ k_2 &\triangleq g(z_n+h\frac{k_1}{2},p,t_n+\frac{h}{2}),\\ k_3 &\triangleq g(z_n+h\frac{k_2}{2},p,t_n+\frac{h}{2}),\\ k_4 &\triangleq g(z_n+hk3,p,t_n+h).\end{split}\]

That’s a good start, but let’s expand the \(z(p,t)\) terms in each \(k_i\) expression to see what’s going on under the hood. Notice that we can add \(z(p_1,t_1) + g(x_2,p_2,t_2)\) together like this:

\[\begin{split}\begin{align} z(p_1,t_1) + g(x_2,p_2,t_2) &= \bmx{x(p_1,t_1) \\ S(p_1,t_1)} + \bmx{f(x_2,p_2,t_2) \\ \frac{\partial f(x_2,p_2,t_2)}{\partial x} S(p_2,t_2) + \frac{\partial f(x_2,p_2,t_2)}{\partial p}} \\ &= \bmx{x(p_1,t_1) + f(x_2,p_2,t_2) \\ S(p_1,t_1) + \frac{\partial f(x_2,p_2,t_2)}{\partial x} S(p_2,t_2) + \frac{\partial f(x_2,p_2,t_2)}{\partial p}} \end{align}\end{split}\]

If you squint, the outcome of that sum has sort of an “\(x\)-component” and an “\(S\)-component” just like the state \(z(p,t)\). We’ll need some abbreviated notation to keep things readable. Let’s define \(k_{i}^x\) as the “\(x\)-component” of \(k_i\) and \(k_{i}^S\) as the “\(S\)-component” portion of \(k_i\).1

\[\begin{split}k_1 &\triangleq \bmx{f(x_n,p,t_n) \\ \frac{\partial f(x_n,p,t_n)}{\partial x} S_n + \frac{\partial f(x_n,p,t_n)}{\partial p}},\\ k_2 &\triangleq \bmx{f(x_n + h\frac{k_1^x}{2},p,t_n+\frac{h}{2}) \\ \frac{\partial f(x_n+\frac{k_1^x}{2},p,t_n+\frac{h}{2})}{\partial x} (S + h\frac{k_1^S}{2}) + \frac{\partial f(x_n+h\frac{k_1^x}{2},p,t_n+\frac{h}{2})}{\partial p}},\\ k_3 &\triangleq \bmx{f(x_n + h\frac{k_2^x}{2},p,t_n+\frac{h}{2}) \\ \frac{\partial f(x_n+\frac{k_2^x}{2},p,t_n+\frac{h}{2})}{\partial x} (S + h\frac{k_2^S}{2}) + \frac{\partial f(x_n+h\frac{k_2^x}{2},p,t_n+\frac{h}{2})}{\partial p}},\\ k_4 &\triangleq \bmx{f(x_n+hk_3^x,p,t_n+h) \\ \frac{\partial f(x_n+hk_3^x,p,t_n+h)}{\partial x} (S_n + hk_3^S) + \frac{\partial f(x_n+hk_3^x,p,t_n+h)}{\partial p}},\\\end{split}\]

A little messier than I expected, but nothing too scary.

Method 2: WIP