Optimal Control of a Spring System
In yesterday's post, I investigated spring systems, inspired by the DiffTaichi paper, which is about differentiable physics simulations and how to control them using neural networks.
Since I'm trying to better understand what this paper is about, the goal of this post is to use their simple example as a starting point. What's nice about their work is that the code is open-sourced. In particular, the simple mass spring example is here.
Defining the problem
The problem involves three points connected by three springs:
import numpy as np
import holoviews as hv
hv.extension('bokeh', 'matplotlib')
x = np.array([[0.3, 0.5],
              [0.3, 0.4],
              [0.4, 0.4]])
spring_anchors = np.array([[0, 1],
                           [1, 2],
                           [2, 0]])
def plot_points(x, spring_anchors):
    """Plots all x as 2D points, connected by springs indicated by spring_anchors."""
    lines = []
    for i in range(spring_anchors.shape[0]):
        a, b = spring_anchors[i]
        lines.append(hv.Curve(x[[a, b], :]))
    lines = hv.Overlay(lines)
    scatter = hv.Scatter((x[:, 0], x[:, 1])).opts(hv.opts.Scatter(size=30, padding=0.2))
    labels = hv.Labels({('x', 'y'): x, 'text': [str(n) for n in range(x.shape[0])]}, ['x', 'y'], 'text')
    return scatter * lines * labels
plot_points(x, spring_anchors)
The points have masses, and the springs have stiffnesses as well as rest lengths; the whole system follows (damped) Newtonian dynamics (here, I implemented this with numpy and accelerated it using numba; the corresponding equations are written out after the animation below). This leads to the following type of behaviour:
import numba

@numba.jit()
def compute_spring_forces(x, spring_anchors, spring_lengths, spring_stiffnesses):
    """Computes the spring forces."""
    forces = np.zeros_like(x)
    for i in range(spring_anchors.shape[0]):
        a, b = spring_anchors[i]
        x_a, x_b = x[a], x[b]
        dist = x_a - x_b
        # small constant avoids a division by zero if two points coincide
        length = np.linalg.norm(dist) + 1e-4
        F = (length - spring_lengths[i]) * spring_stiffnesses[i] * dist / length
        forces[a] += -F
        forces[b] += F
    return forces

@numba.jit()
def time_step_integration(x, v, forces, masses, dt, damping):
    """Integrates the forces using semi-implicit Euler time integration with damping."""
    s = np.exp(-dt * damping)
    v_new = s * v + dt * forces / masses.reshape(-1, 1)
    x_new = x + dt * v_new
    return x_new, v_new
@numba.jit()
def forward(x, spring_anchors, spring_lengths, spring_stiffnesses, steps=1024, dt=0.001, damping=20):
    # note: masses is taken from the global scope (defined below, before forward is first called)
    new_x = x.copy()
    new_v = np.zeros_like(x)
    snapshots = []
    snapshots.append(new_x.copy())
    for t in range(steps):
        forces = compute_spring_forces(new_x, spring_anchors, spring_lengths, spring_stiffnesses)
        new_x, new_v = time_step_integration(new_x, new_v, forces, masses, dt=dt, damping=damping)
        snapshots.append(new_x.copy())
    return snapshots
def make_animation(snapshots, n_frames):
    # n_frames is used as the stride between displayed snapshots
    hmap = hv.HoloMap({i: plot_points(snapshots[i], spring_anchors) for i in range(0, len(snapshots), n_frames)},
                      kdims=['snapshot'])
    return hv.output(hmap, holomap='scrubber')
spring_stiffnesses = 10 * np.array([1., 1., 1.])
spring_lengths = np.array([0.15, 0.15, 0.1 * 2 ** 0.5])
masses = np.ones_like(spring_stiffnesses)
snapshots = forward(x, spring_anchors, spring_lengths, spring_stiffnesses)
make_animation(snapshots, 20)
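As a side note, here are the dynamics that the code above implements, written out (my notation, not taken from the paper). A spring with stiffness $k$ and rest length $\ell_0$ connecting points $a$ and $b$ exerts the force

$$F_a = -k \left( \lVert x_a - x_b \rVert - \ell_0 \right) \frac{x_a - x_b}{\lVert x_a - x_b \rVert}, \qquad F_b = -F_a,$$

and the damped semi-implicit Euler update with time step $\Delta t$, damping $\gamma$ and mass $m$ reads

$$v_{t+1} = e^{-\gamma \Delta t} \, v_t + \Delta t \, \frac{F_t}{m}, \qquad x_{t+1} = x_t + \Delta t \, v_{t+1}.$$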
Setting an objective
The cited paper proposes a way of controlling this simulation. Let's say that our objective is for the triangle formed by the three points to have a certain area at the end of the simulation, and that we can modify the rest lengths of the three simulated springs. How do we accomplish this?
First of all, we need to define a cost function: the squared difference between the triangle's area and the target value we want to reach. Knowing the final positions of the points, we can easily compute the triangle's area:
@numba.jit()
def triangle_area(x):
    x01 = x[0] - x[1]
    x02 = x[0] - x[2]
    # Triangle area from cross product
    area = np.abs(0.5 * (x01[0] * x02[1] - x01[1] * x02[0]))
    return area

@numba.jit()
def cost_function(x):
    """Difference between area and target area of triangle, squared."""
    target_area = 0.2
    area = triangle_area(x)
    return np.square(area - target_area)
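In symbols (again my notation): writing $x_T$ for the final point positions produced by forward from rest lengths $\ell$, $A$ for the triangle area and $A^\star = 0.2$ for the target, the cost is

$$C(\ell) = \left( A(x_T(\ell)) - A^\star \right)^2.$$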
triangle_area(snapshots[0]), triangle_area(snapshots[-1])
(0.005000000000000001, 0.008243853015880287)
We can use the cost function to evaluate the last configuration in our first simulation:
print(f'sanity check: {(triangle_area(snapshots[-1]) - 0.2)**2}, cost function: {cost_function(snapshots[-1])}')
sanity check: 0.036770419906195326, cost function: 0.036770419906195326
Gradient descent
Now, how do we improve the state of things? A simple way to proceed is gradient descent. What we need is the gradient of our cost function with respect to the three rest length parameters of our springs, which we can approximate using finite differences. Intuitively, this means putting a number on the question: if I increase the rest length of the first spring by a little, how much does the cost change?
We can do this by increasing the rest length of the first spring by a small amount and observing how the cost changes.
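In formula form (my notation), this is a forward-difference estimate with a small step $\delta$ (here $\delta = 0.001$), where $e_i$ denotes the unit vector along rest length $i$:

$$\frac{\partial C}{\partial \ell_i} \approx \frac{C(\ell + \delta \, e_i) - C(\ell)}{\delta}.$$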
spring_lengths = np.array([0.1, 0.1, 0.1 * 2 ** 0.5])
snapshots = forward(x, spring_anchors, spring_lengths, spring_stiffnesses)
cost0 = cost_function(snapshots[-1])
spring_lengths[0] += 0.001
snapshots = forward(x, spring_anchors, spring_lengths, spring_stiffnesses)
cost1 = cost_function(snapshots[-1])
print(f'initial cost {cost0:.5f}, after changing first rest length {cost1:.5f}')
print(f'gradient d cost / d rest_length approx. {(cost1 - cost0)/0.001:.4e}')
initial cost 0.03803, after changing first rest length 0.03802
gradient d cost / d rest_length approx. -1.1613e-02
We can automate this method to compute the gradient with respect to all three rest lengths.
@numba.jit()
def compute_approximate_gradient(x, spring_lengths, dl=0.001):
    gradient = np.zeros_like(spring_lengths)
    # starting point
    snapshots = forward(x, spring_anchors, spring_lengths, spring_stiffnesses)
    ref_cost = cost_function(snapshots[-1])
    for rest_length_i in range(3):
        # note: the perturbation modifies spring_lengths in place and is not undone,
        # so later components are evaluated with the earlier perturbations still applied
        spring_lengths[rest_length_i] += dl
        snapshots = forward(x, spring_anchors, spring_lengths, spring_stiffnesses)
        cost1 = cost_function(snapshots[-1])
        gradient[rest_length_i] = (cost1 - ref_cost) / dl
    return gradient
approx_grad = compute_approximate_gradient(x, np.array([0.1, 0.1, 0.1 * 2 ** 0.5]))
approx_grad
array([-0.01161341, -0.02329161, -0.02888142])
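As an aside, a central-difference estimate is usually more accurate than the forward difference for the same step size. Here is a minimal sketch of what that could look like (not part of the original notebook); it reuses forward, cost_function, spring_anchors and spring_stiffnesses from above and works on copies of the rest lengths instead of modifying them in place:
def central_difference_gradient(x, spring_lengths, dl=0.001):
    """Central-difference estimate of d cost / d rest_lengths (sketch)."""
    gradient = np.zeros_like(spring_lengths)
    for i in range(spring_lengths.shape[0]):
        lengths_plus = spring_lengths.copy()
        lengths_minus = spring_lengths.copy()
        lengths_plus[i] += dl
        lengths_minus[i] -= dl
        cost_plus = cost_function(forward(x, spring_anchors, lengths_plus, spring_stiffnesses)[-1])
        cost_minus = cost_function(forward(x, spring_anchors, lengths_minus, spring_stiffnesses)[-1])
        gradient[i] = (cost_plus - cost_minus) / (2 * dl)
    return gradient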
If you're familiar with gradients, you know that making a small step in the direction of the negative gradient will yield a lower cost.
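In update form (my notation), with learning rate $\eta$:

$$\ell \leftarrow \ell - \eta \, \nabla_\ell C(\ell).$$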
spring_lengths = np.array([0.1, 0.1, 0.1 * 2 ** 0.5])
learning_rate = 0.01
spring_lengths += -learning_rate * approx_grad
spring_lengths
array([0.10011613, 0.10023292, 0.14171017])
Let's see if the cost is lower (as we expect).
snapshots = forward(x, spring_anchors, spring_lengths, spring_stiffnesses)
print(f'cost before {cost0:.5e}, cost after: {cost_function(snapshots[-1]):.5e}')
cost before 3.80279e-02, cost after: 3.80222e-02
Now we can just iterate this procedure and, hopefully, the cost will keep going down. We stop early once the cost drops below a given threshold.
spring_lengths = np.array([0.1, 0.1, 0.1 * 2 ** 0.5])
learning_rate = 0.0001
costs = []
for iters in range(1000):
    approx_grad = compute_approximate_gradient(x, spring_lengths)
    spring_lengths += -learning_rate * approx_grad
    snapshots = forward(x, spring_anchors, spring_lengths, spring_stiffnesses)
    cost = cost_function(snapshots[-1])
    costs.append(cost)
    if iters % 20 == 0:
        print(f'iter {iters}: cost {cost:.5e} abs(grad).max(): {np.abs(approx_grad).max()}')
    # crude schedule: shrink the step size aggressively once the cost gets small
    # (the elif branch can never trigger, since cost < 1e-4 already satisfies cost < 1e-3)
    if cost < 1e-3:
        learning_rate /= 20
    elif cost < 1e-4:
        learning_rate /= 20
    if cost < 1e-6:
        break
iter 0: cost 3.79989e-02 abs(grad).max(): 0.028881420714124695
iter 20: cost 3.73775e-02 abs(grad).max(): 0.032907591919373314
iter 40: cost 3.66764e-02 abs(grad).max(): 0.036820428564642016
iter 60: cost 3.58980e-02 abs(grad).max(): 0.04060547434433376
iter 80: cost 3.50449e-02 abs(grad).max(): 0.04424828076796117
iter 100: cost 3.41202e-02 abs(grad).max(): 0.04773435699673839
iter 120: cost 3.31271e-02 abs(grad).max(): 0.05104915860242032
iter 140: cost 3.20692e-02 abs(grad).max(): 0.05417809002744578
iter 160: cost 3.09503e-02 abs(grad).max(): 0.057106511473857535
iter 180: cost 2.97747e-02 abs(grad).max(): 0.05981974681625729
iter 200: cost 2.85467e-02 abs(grad).max(): 0.06230309135549575
iter 220: cost 2.72713e-02 abs(grad).max(): 0.06454181908291737
iter 240: cost 2.59534e-02 abs(grad).max(): 0.06652118943992896
iter 260: cost 2.45985e-02 abs(grad).max(): 0.06822645366166141
iter 280: cost 2.32121e-02 abs(grad).max(): 0.06964286081604515
iter 300: cost 2.18004e-02 abs(grad).max(): 0.07075566364107833
iter 320: cost 2.03696e-02 abs(grad).max(): 0.07155012426875162
iter 340: cost 1.89262e-02 abs(grad).max(): 0.07201151990283636
iter 360: cost 1.74772e-02 abs(grad).max(): 0.07212514850811133
iter 380: cost 1.60297e-02 abs(grad).max(): 0.07187633455057324
iter 400: cost 1.45912e-02 abs(grad).max(): 0.07125043482003601
iter 420: cost 1.31695e-02 abs(grad).max(): 0.07023284436214784
iter 440: cost 1.17726e-02 abs(grad).max(): 0.06880900253478615
iter 460: cost 1.04088e-02 abs(grad).max(): 0.06696439920233187
iter 480: cost 9.08693e-03 abs(grad).max(): 0.06468458107785832
iter 500: cost 7.81574e-03 abs(grad).max(): 0.061955158218690576
iter 520: cost 6.60450e-03 abs(grad).max(): 0.05876181067957301
iter 540: cost 5.46265e-03 abs(grad).max(): 0.05509029532381124
iter 560: cost 4.39997e-03 abs(grad).max(): 0.050926452795313146
iter 580: cost 3.42646e-03 abs(grad).max(): 0.0462562146462224
iter 600: cost 2.55241e-03 abs(grad).max(): 0.04106561062318034
iter 620: cost 1.78836e-03 abs(grad).max(): 0.035340776102765196
iter 640: cost 1.14510e-03 abs(grad).max(): 0.029067959676872715
iter 660: cost 6.34191e-04 abs(grad).max(): 0.022241165066200174
iter 680: cost 2.66088e-04 abs(grad).max(): 0.014841976378954496
iter 700: cost 5.21528e-05 abs(grad).max(): 0.006851570491115766
How does the cost decrease over the iterations? Let's plot it:
hv.Curve(costs).redim.label(x='iterations', y='cost function')
Let's look at the final simulation:
make_animation(snapshots, n_frames=20)
Let's check if we've reached the objective and what the rest lengths are now.
triangle_area(snapshots[-1])
0.19927232110562043
Indeed this works: we have reached a triangle area of approximately 0.2.
spring_lengths
array([0.81642608, 0.81785103, 0.86011158])
As a side note, these are not the rest lengths I expected to find, since the paper mentions [0.600, 0.600, 0.529]. Still, our solution reaches the target area just as well; a given area can evidently be achieved by more than one set of rest lengths.
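As another aside, the same cost function could also be handed to an off-the-shelf optimizer instead of hand-rolled gradient descent. Here is a minimal, hypothetical sketch using scipy.optimize.minimize with the gradient-free Nelder-Mead method (scipy is not used in the original notebook, and I haven't tuned this):
from scipy.optimize import minimize

def cost_of_rest_lengths(lengths):
    """Cost of the final configuration reached from the given rest lengths."""
    snaps = forward(x, spring_anchors, lengths, spring_stiffnesses)
    return cost_function(snaps[-1])

# Nelder-Mead only needs cost evaluations, no gradients
result = minimize(cost_of_rest_lengths, np.array([0.1, 0.1, 0.1 * 2 ** 0.5]),
                  method='Nelder-Mead')
print(result.x, result.fun)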
Now, the next question would be: what does it mean for the simulation to be differentiable at every timestep? And how does the neural network control this simulation as described in the paper?
This post was entirely written using the Jupyter Notebook. Its content is BSD-licensed. You can see a static view or download this notebook with the help of nbviewer at 20191030_SpringOptimalControl.ipynb.