
How to make your own GDTH-PMP Controller

  • Writer: Brock Marcinczyk
  • Jan 12
  • 10 min read

Updated: Jan 13


A step-by-step guide for an arbitrary robotic manipulator

Required Tools: 

  • A Computer with a Programming Language

  • A Symbolic Toolbox Package (Not mandatory, but STRONGLY recommended.)


Hello everyone! In this post, I will discuss how you can make your own personalized Gradient Descent Time Horizon - Pontryagin’s Minimum Principle (GDTH-PMP) control algorithm. This algorithm is strongly intertwined with the fundamental principles of the Calculus of Variations, and as such, all the major control law components will be derived explicitly from a variational approach. The post is divided into two consecutive sections: 

  • Rigid Body Dynamics

    • Position Vectors / Basis Vectors

    • State Coupled Inertia 

    • Energy Formulation

    • Euler-Lagrange Equations 

  • Control Law

    • Gradient Descent

      • Cost function formulation

        • Hamilton’s Principle

        • Khatib-Weighting

    • Optimal Control Formulation 

      • Pontryagin’s Minimum Principle Derivation 

      • The Optimal Manipulator Equation 


Let us now discuss the derivation associated with the arbitrary manipulator. Here, I use an almost identical manipulator to the one in the post titled Optimal Control and Structurally-Informed Gradient Optimization of a Custom 4-DOF Rigid-Body Manipulator. I will derive the equations of motion using Euler–Lagrange mechanics. For a system of this type, the Euler–Lagrange method is not only sufficient but, in my view, decisively superior. 


Though Newton–Euler is computationally efficient and real-time capable, it introduces a level of bookkeeping that becomes increasingly cumbersome, especially for higher-DOF manipulators. Even with computer algebra systems, Newton–Euler remains tedious, since the manipulator equations of motion have to be assembled explicitly from vector products, rather than letting the calculus do its own thing. 


Euler–Lagrange, on the other hand, yields


  • A compact, global description of the system dynamics

  • Direct access to the mass matrix, Coriolis terms, and gravitational effects



Most importantly, the control law is derived entirely from Variational Calculus, so Euler-Lagrange falls naturally into the scope of this control law.


You may ask yourself this:


“Wait, but isn’t Euler–Lagrange harder than something like Newton’s Second Law?”


Not really. With Euler–Lagrange, all you need are the position vectors and the basis unit vectors. From there, you just take derivatives and let the math do the work. Voilà — you get the equations of motion!

No force diagrams from hell. Just geometry, energy, and calculus doing their thing.

Rigid Body Dynamics


Figure 1.) Robotic Manipulator to be used in GDTH-PMP tutorial

For the manipulator mechanics in this tutorial, I use spherical coordinates instead of DH parameters to define position vectors and basis unit vectors, since they align more naturally with the system geometry and avoid heavy coordinate-frame bookkeeping.

You can reach the same result with DH, but for continuous dynamics and variational formulations, spherical coordinates are simply superior.


Plus, the spherical basis vectors arise naturally as the columns of the Cartesian-to-spherical rotation matrix, so you can literally read them off—no extra derivation required (and yes, the same matrix is on Wikipedia).
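For reference, here is the standard set of spherical unit vectors I'm alluding to (physics convention: θ measured from the vertical, ϕ azimuthal; the cumulative angles described below get substituted in place of θ and ϕ):

```latex
\hat{\mathbf{r}} = \begin{bmatrix} \sin\theta\cos\phi \\ \sin\theta\sin\phi \\ \cos\theta \end{bmatrix}, \qquad
\hat{\boldsymbol{\theta}} = \begin{bmatrix} \cos\theta\cos\phi \\ \cos\theta\sin\phi \\ -\sin\theta \end{bmatrix}, \qquad
\hat{\boldsymbol{\phi}} = \begin{bmatrix} -\sin\phi \\ \cos\phi \\ 0 \end{bmatrix}
```

Stacked side by side, these are exactly the columns of the rotation matrix between Cartesian and spherical frames.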

You don’t need to build a full coordinate transformation to use spherical coordinates; just be careful with angle accumulation. If your link angles are cumulative—as they are here—then the corresponding θ and 𝜙 terms should simply be added algebraically.



Here, I define the basis vectors in cumulative form as follows in the radial basis:


Similarly, for the angle basis vectors:



The subscripts on θ₁₂ indicate that the angle is cumulative, corresponding to the motor angle, i.e. θ₁₂ = θ₁ + θ₂.

Now, I define the base frame position relative to the first link as follows

Similarly for link 1 center of mass position:

Then for the second link's center of mass when the actuator is fully retracted:


Finally, for the actuator's center of mass plus the position of the object mass location for when the object is being picked up:


The corresponding derivatives take the following form (remember the chain rule here; it matters!):
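As a minimal sketch of why the chain rule matters here, consider a single planar link (a hypothetical stand-in, not this post's 4-DOF geometry) and differentiate its tip position symbolically:

```python
import sympy as sp

t, l = sp.symbols('t l', positive=True)
theta = sp.Function('theta')(t)  # joint angle as a function of time

# tip position of a single planar link (hypothetical example)
p = sp.Matrix([l * sp.cos(theta), l * sp.sin(theta)])

# d/dt p(theta(t)): the chain rule automatically brings in theta_dot
v = p.diff(t)
print(v)
```

The velocity picks up a factor of θ̇ in every component; forgetting that factor is the classic mistake these derivatives punish.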
















The next step is to define the moments of inertia in generalized coordinates. I recommend using this approach—not because it is required, but because it is usually more intuitive to express inertia in the basis-vector form:

For example, the inertial distribution of the iᵗʰ link of the robotic manipulator is given by:




I also do not recommend using products of inertia, since they tend to be computationally inefficient. With this formulation, you can directly take the inertia values from your CAD model and plug them into the simulation, which makes the plant significantly more accurate in terms of its inertial distribution.


Next, we define the Kinetic Energy as follows:


Finally, we define the potential energy of the system:
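In generic rigid-body form (standard results; the specific expressions for this arm differ in detail), the two energies read:

```latex
T = \sum_i \left( \tfrac{1}{2}\, m_i\, \dot{\mathbf{r}}_i \cdot \dot{\mathbf{r}}_i
  + \tfrac{1}{2}\, \boldsymbol{\omega}_i^{\mathsf T} \mathbf{I}_i\, \boldsymbol{\omega}_i \right), \qquad
V = \sum_i m_i\, g\, h_i(\mathbf{q})
```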

Note that the base-frame potential energy cancels out automatically—both in the gradient descent algorithm and in the equations of motion. This happens because the manipulator mount does not experience gravitational motion: the robot is rigidly fixed to the base frame, so its height never changes.


However, the base frame does contribute to the kinetic energy through its inertial distribution, which appears via the familiar parallel-axis terms when computing the mass matrix.


Note that a very good approximation of motor behavior, in the sense of mechanical dynamics, is a virtual spring; by choice, I also added virtual damping (not required!).

Finally, after T and V are defined, we apply the Lagrange equation:

Then, you plug and chug this formulation into the Euler–Lagrange equations for each coordinate:


To keep your local computer from hating you, I recommend rewriting the Euler–Lagrange equations in standard manipulator form. Simply put, matrices and computers get along a lot better than giant symbolic expressions as one giant column vector.
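Spelled out in generic notation (the specific matrices for this arm live in the images above), the Euler–Lagrange equations and their standard manipulator-form repackaging read:

```latex
\mathcal{L} = T - V, \qquad
\frac{d}{dt}\frac{\partial \mathcal{L}}{\partial \dot{q}_i} - \frac{\partial \mathcal{L}}{\partial q_i} = \tau_i
\quad \Longrightarrow \quad
M(\mathbf{q})\,\ddot{\mathbf{q}} + C(\mathbf{q}, \dot{\mathbf{q}})\,\dot{\mathbf{q}} + \mathbf{g}(\mathbf{q}) = \boldsymbol{\tau}
```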


Because I’m using virtual springs, the potential energy gradient is basically doing double duty: spring forces plus gravity, wrapped into one clean term.

Before moving on to gradient descent, I strongly recommend integrating the dynamics in their unconstrained, uncontrolled form using a numerical integrator to make sure everything behaves sensibly. Alternatively, you can formally prove correctness like a nerd and earn five imaginary bonus points by using the following formula as an additional sanity check:


Note that the bold S symbol is used to indicate a skew symmetric matrix.
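That sanity check is the classic passivity property: Ṁ − 2C is skew-symmetric when C is built from Christoffel symbols. Here is a sketch verifying it symbolically on a hypothetical planar 2-link point-mass arm (a stand-in, not this post's 4-DOF manipulator):

```python
import sympy as sp

t = sp.symbols('t')
m1, m2, l1, l2 = sp.symbols('m1 m2 l1 l2', positive=True)
q1, q2 = sp.Function('q1')(t), sp.Function('q2')(t)
q = sp.Matrix([q1, q2])
qd = q.diff(t)

# point-mass positions for a planar 2-link arm (hypothetical stand-in)
p1 = sp.Matrix([l1 * sp.cos(q1), l1 * sp.sin(q1)])
p2 = p1 + sp.Matrix([l2 * sp.cos(q1 + q2), l2 * sp.sin(q1 + q2)])

# kinetic energy T = sum of (1/2) m v.v
T = (m1 * (p1.diff(t).T @ p1.diff(t))[0]
     + m2 * (p2.diff(t).T @ p2.diff(t))[0]) / 2

# mass matrix from T = (1/2) qd' M qd
M = sp.Matrix(2, 2, lambda i, j: sp.simplify(T.diff(qd[i]).diff(qd[j])))

# Coriolis matrix from Christoffel symbols of the first kind
C = sp.zeros(2, 2)
for i in range(2):
    for j in range(2):
        for k in range(2):
            C[i, j] += sp.Rational(1, 2) * (
                M[i, j].diff(q[k]) + M[i, k].diff(q[j]) - M[k, j].diff(q[i])
            ) * qd[k]

S = sp.simplify(M.diff(t) - 2 * C)
print(sp.simplify(S + S.T))  # skew-symmetry: S + S' should be the zero matrix
```

If S + Sᵀ does not simplify to zero, your mass matrix or Coriolis terms are wrong, which is exactly the kind of bug this check exists to catch.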


Your plots should look something like the ones above. They don’t need to converge to any fancy target angle yet, but if your dynamics are correct, you should at least see sinusoidal or damped transient behavior—the same kind of response you’d expect from a vibrations course or your first transfer-function class.



If your plots look like random noise or blow up to infinity, congrats: your equations are cooked!


That concludes Part 1 of the derivation. The physics derivation is done — now we make it move.



Controls


Now that the dynamics are finished, we can move on to the controls portion of the project. We’ll use Hamilton’s formulation to set up a quadratic cost. That starts by grabbing the mass matrix from the Euler–Lagrange equations we just derived; from this, you'll define the Hamiltonian as follows.

And honestly, I'm not obsessed with the cost value itself. The gradient is the real prize, because that’s what actually drives the update law. Uh oh! We're forgetting something: the error term!


Here, I define the error in terms of the center of mass of the first link, to act as a sort of "hoisting" function that keeps the robot's first link relatively high when it has to pick up heavy objects. We define the end-effector position vector as follows:

I use “des” to mark the desired position, but the real work is in defining the weights. Rather than hard-coding some massive constant and hoping for the best, e.g. β = 10000, we’ll compute them dynamically using Khatib’s operational-space formulation.

Unlike Lyapunov, who requires an entire proof just to show that a controller is stable, Khatib more or less says: “Hand me your mass matrix and your Jacobian, and we’re good.”


Different philosophies—but one of them gets you to a working controller a lot faster.

To define what I like to call the Khatib weights, we first introduce an intermediate quantity I’ll refer to as “fat-lambda”:


In this expression, ε and ϵ are small regularization terms added for numerical stability, and the absolute value is included for extra safety. Then—and this part matters—define the weights numerically, not symbolically (unless you enjoy watching your computer catch fire):

Finally, we take only the diagonal entries of this matrix so the weights remain in operational space. The off-diagonal terms exist, but for this application, they’re more noise than value.
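A minimal numerical sketch of that weight computation, assuming the regularized operational-space inertia Λ = (J M⁻¹ Jᵀ + εI)⁻¹ (the function name and toy matrices below are hypothetical):

```python
import numpy as np

def khatib_weights(M, J, eps=1e-6):
    """Diagonal of the regularized operational-space inertia matrix.

    Lambda = inv(J @ inv(M) @ J.T + eps * I); eps is the small
    regularization term, and the absolute value adds extra safety,
    mirroring the post.
    """
    JMJ = J @ np.linalg.solve(M, J.T)            # J * M^-1 * J'
    Lam = np.linalg.inv(JMJ + eps * np.eye(JMJ.shape[0]))
    return np.abs(np.diag(Lam))                  # keep only the diagonal

# toy example: random SPD mass matrix and a 3x4 Jacobian
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
M = A @ A.T + 4.0 * np.eye(4)    # symmetric positive definite
J = rng.standard_normal((3, 4))
w = khatib_weights(M, J)
print(w.shape)  # (3,)
```

Note that everything here is numeric from the start, which is exactly the point: evaluate M and J at the current state first, then invert.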


Alright—finally, the part we’ve all been waiting for: the cost function!


Simply put, the cost is defined as the Hamiltonian plus the β weights packaged into a diagonal matrix. Put those together, and voilà—you get a controllable Hamiltonian that combines the rigid-body dynamics with Khatib-weighted operational-space terms.
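Schematically (my shorthand, not the exact expression in the image), that cost has the shape:

```latex
J(\mathbf{X}) \;=\; \underbrace{\tfrac{1}{2}\,\dot{\mathbf{q}}^{\mathsf T} M(\mathbf{q})\,\dot{\mathbf{q}} + V(\mathbf{q})}_{\text{Hamiltonian (dynamics)}}
\;+\; \underbrace{\tfrac{1}{2}\,\mathbf{e}^{\mathsf T}\,\mathrm{diag}(\boldsymbol{\beta})\,\mathbf{e}}_{\text{Khatib-weighted error}}
```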


Now we’re going to take the gradient with respect to the state vector X:

This is what lets us actually define the gradient-descent update law. To do that, we take the partial derivative of the cost function with respect to each state variable and stack the result into a column vector:

OK, so we have the partial derivatives, but how can we actually control this robot with gradient descent if we don't have initial velocities defined? This part is easy: we take the gravity component of the potential gradient, this guy here circled in red:

This gives us the 'moment' component of the equations, which we use in an empirical relationship governed by our motor actuation limits. With this, we define what I like to call 'the initial torque vector.'

If you are reading this, following the equations step by step, and wondering, "Wait... what about ϕ?" — you can resolve it through classical structural mechanics (what I did), using helical gear equations and reaction forces from standard statics — and yes, my Marcinczyk Reaction Calculator is built for exactly that! Alternatively, you can use Lagrange multipliers if you can tie kinematic constraints directly to the azimuthal coordinate… but fair warning: that route tends to get messy fast, and it’s a great way to make your computer regret being born.


Next, we initialize the velocities using a classical empirical relationship. If the reaction forces get too large, the robot simply doesn’t move. That’s intentional. That’s physics doing its job.


And yes—if the ratio in that term exceeds one, the solver will throw a big, angry error. Consider it a built-in reality check. Oh—and that sgn term is what actually drives the robot toward the final position. Remember from calculus / pre-calc that a direction vector is just final minus initial? Same idea here.


We take the sign of that difference and use it to tell the robot which way to move. No magic—just basic math doing honest work.
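In code, that direction logic is essentially one line (the joint values and torque limits below are hypothetical toy numbers):

```python
import numpy as np

tau_max = np.array([2.0, 2.0, 0.5])   # hypothetical actuation limits
q0 = np.array([0.0, 0.5, -0.2])       # current joint state
qf = np.array([0.3, 0.1, 0.4])        # target joint state

# sgn(final - initial) picks each joint's direction; the limits cap magnitude
tau0 = np.sign(qf - q0) * tau_max
print(tau0.tolist())  # [2.0, -2.0, 0.5]
```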

Finally, we use Euler integration to step the kinematics forward in time. On top of that, we can optionally apply gradient smoothing using a momentum-based update rule (again—optional, but nice if you want things to behave).

The β terms here (not the Khatib β's) are used purely as smoothing weights, with different subscripts depending on the state. The same idea applies to the learning rate: η defines its decay rate, and α₀ is the initial learning rate.
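A self-contained sketch of the smoothing-plus-decay update on a toy quadratic cost (the symbol names mirror the post's notation; the cost itself is hypothetical):

```python
import math

def gd_momentum(grad, x0, alpha0=0.1, eta=0.01, beta=0.9, iters=500):
    """Gradient descent with momentum smoothing and a decaying rate.

    beta smooths the gradient (exponential moving average), eta sets
    the learning-rate decay, and alpha0 is the initial learning rate.
    """
    x, v = list(x0), [0.0] * len(x0)
    for k in range(iters):
        alpha = alpha0 * math.exp(-eta * k)            # decayed learning rate
        g = grad(x)
        for i in range(len(x)):
            v[i] = beta * v[i] + (1.0 - beta) * g[i]   # momentum smoothing
            x[i] -= alpha * v[i]
    return x

# toy quadratic cost with minimum at (1, -2)
grad = lambda x: [2.0 * (x[0] - 1.0), 2.0 * (x[1] + 2.0)]
xf = gd_momentum(grad, [0.0, 0.0])
print(xf)
```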

For the initial learning rates, I set the angular terms to a nice, loud 10⁻³. The reason is simple: my motors can actually handle aggressive motion, so there’s no point in babying the optimizer when the hardware is built to move fast.



The linear actuator, however, is a different story.


So I drop its learning rate to 3e−8. I don’t want it converging too quickly, because that’s a great way to turn a budget actuator into a smoke machine. Slow, stable, and not on fire beats fast and dead every time.


Finally, once all the gradients converge, you record the time at which that happens and call it T, the terminal time. This value then gets fed directly into the optimal controller.

In conclusion, after all the math, hand-waving, and emotional damage, your algorithm should end up looking something like this:


Note that the stopping condition should be based on joint-space error, not task-space error. Doing it this way lets you shut down each joint at the correct time, instead of relying on end-effector behavior alone. You get much tighter control over how the manipulator actually behaves during gradient descent.


Before you even think about defining the optimal controller, I strongly recommend running what I call the “torture tests.” This means generating around 100 completely random target points within the robot’s reachable workspace and analyzing the resulting trajectories. This step is critical—solver failures can quietly produce incorrect boundary conditions, and if you don’t catch that early, everything downstream becomes fiction.
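A sketch of target generation for the torture test, assuming (hypothetically) a spherical-shell workspace with inner and outer radii standing in for the robot's real reachable set:

```python
import math
import random

def sample_targets(n, r_min, r_max, seed=0):
    """Draw n random reachable targets via rejection sampling inside a
    spherical shell (a stand-in for the robot's real workspace)."""
    rng = random.Random(seed)
    targets = []
    while len(targets) < n:
        p = [rng.uniform(-r_max, r_max) for _ in range(3)]
        if r_min <= math.hypot(*p) <= r_max:   # keep only reachable points
            targets.append(p)
    return targets

targets = sample_targets(100, 0.1, 0.8)
print(len(targets))  # 100
```

Feed each sampled target to the descent loop and log the terminal Euclidean error; outliers usually point at solver failures or bad boundary conditions.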


When I ran my torture tests, I got lucky and ended up with an average Euclidean error of about 10 mm; running the torture tests repeatedly yields similar error magnitudes.





You’ll also notice that the error peaks near actuator singularities—which is exactly what you’d expect. That’s Khatib doing his job: he really doesn’t like singular configurations, and he makes sure the controller knows it too.



Alright, now it’s time to define the optimal controller—this is the spicy part. Here we use Pontryagin’s Minimum Principle to derive the optimized kinematics in generalized coordinates, so we can tell the motors exactly what to do.


See the pattern yet? This whole controller is built around variational calculus. Why? Because variational calculus is cool, and Pontryagin, when you do it symbolically (in my experience), gives you smooth, physically sane force and torque profiles instead of jittery numerical nonsense.


We formulate the optimal controller as follows. I’m using shorthand for the state vector here because writing everything out in full gets ugly fast—and the equations separate nicely anyway.


And yep—notice something familiar? That same terminal time T we computed from gradient descent shows up here too. No coincidence. The optimizer finds when to stop, and Pontryagin tells us how to get there.



Alright, time to lay down the formal definitions, because no variational optimal control method is legally allowed to exist without these.



Here, the parameter λ defines the costates, which act like time-varying Lagrange multipliers and time-varying control gains. I also introduce the control Hamiltonian, denoted with a star to separate the control terms from the dynamics.
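For reference, the standard first-order conditions of Pontryagin's Minimum Principle in generic form (which the equations that follow specialize to this manipulator) are:

```latex
H^{*}(\mathbf{x}, \mathbf{u}, \boldsymbol{\lambda}) = L(\mathbf{x}, \mathbf{u}) + \boldsymbol{\lambda}^{\mathsf T} \mathbf{f}(\mathbf{x}, \mathbf{u}), \qquad
\dot{\boldsymbol{\lambda}} = -\frac{\partial H^{*}}{\partial \mathbf{x}}, \qquad
\frac{\partial H^{*}}{\partial \mathbf{u}} = \mathbf{0}, \qquad
\dot{\mathbf{x}} = \frac{\partial H^{*}}{\partial \boldsymbol{\lambda}}
```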



The control Hamiltonian is given by:



Written in explicit form, this becomes:



Satisfying optimality yields:

Solving for the optimal control input yields the derivative of the costate:



Plugging this back into the Controls Hamiltonian yields:


Then, satisfying the condition of the costate dynamics and the state dynamics yields:

Next, we integrate the state dynamics as follows:



Finally, we define the boundary conditions, and decouple the optimal trajectories using the matrix formulation:

q₀ and q_f are the initial and final states. Invert the matrix, and boom—you get the optimized kinematics. Plug those back into the manipulator equations numerically, and now you’ve got the optimal manipulator equation.

Do it numerically. Your CPU will thank you.
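As a worked instance of that matrix inversion, assume (hypothetically) double-integrator joint kinematics with a quadratic control cost, for which the PMP-optimal trajectory is a cubic in time; the boundary-condition matrix for one joint then looks like this:

```python
import numpy as np

def cubic_trajectory(q0, qd0, qf, qdf, T):
    """Invert the 4x4 boundary-condition matrix for one joint.

    Rows enforce q(0), qdot(0), q(T), qdot(T) for the cubic
    q(t) = c0 + c1*t + c2*t^2 + c3*t^3 (names are hypothetical).
    """
    A = np.array([[1.0, 0.0, 0.0,     0.0],
                  [0.0, 1.0, 0.0,     0.0],
                  [1.0, T,   T**2,    T**3],
                  [0.0, 1.0, 2.0 * T, 3.0 * T**2]])
    return np.linalg.solve(A, np.array([q0, qd0, qf, qdf], dtype=float))

# rest-to-rest move from 0 to 1 over a hypothetical terminal time T = 2
c = cubic_trajectory(0.0, 0.0, 1.0, 0.0, 2.0)
q = lambda t: c[0] + c[1] * t + c[2] * t**2 + c[3] * t**3
print(round(q(2.0), 6))  # hits qf = 1.0 at the terminal time
```

Solving the linear system numerically, rather than inverting a symbolic matrix, is exactly the "your CPU will thank you" part.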



Finally, here’s what happens next: you run this entire pipeline inside a for-loop, once per waypoint, so you can generate the full force/torque trajectories across the task.

Pretty much, it’s a for-loop over the gradient-descent control law across multiple waypoints. Each time the solver converges, the result is plugged back into the manipulator dynamics to generate the corresponding trajectory segment.


That said, trajectory planning still requires care. Your waypoints must be physically reachable by the robot. If you define targets that violate the robot’s kinematics—for example, a Euclidean distance smaller than what the link geometry allows, or a point outside the workspace—the solver will fail or produce meaningless results.

In short: good geometry in, good physics out.



From these results, you should get smooth, bounded force/torque profiles; each trajectory segment should look something like this.



And there it is: the robot casually dropping a comically heavy 4 kg Coke can into the trash bin.



The supporting MATLAB code, covering the derivations, torture test, ODE45 sanity check, and trajectory planning, is included below.



Wikipedia contributors. (2025, December 16). Spherical coordinate system. Wikipedia. https://en.wikipedia.org/wiki/Spherical_coordinate_system


 
 