This report documents the current state of, and latest work done towards, an evolving multidisciplinary project that has been passed down through more than 10 years of students within the Cambridge University Engineering Department. The core theme of this ongoing work is to produce a self-balancing rider-less unicycle.
In this iteration of the work, the focus is on a smaller unicycle, and producing an affine controller capable of balancing it via a machine-learning route, rather than by forming a detailed model and applying control theory. This involves applying a technique known as Pilco, for which a Matlab toolbox exists, maintained by the department. This toolbox has already been shown to be successful when applied to a simulation of a large unicycle.
The small unicycle was the subject of a very similar project last year, in which some electrical and mechanical problems were identified and fixed, and many learning experiments were performed, with limited success.
A number of outstanding issues were raised, of which this work addresses:
automating the transfer of data to and from the robot,
dealing with initial orientations by using data from the accelerometer,
and building a more representative unicycle model in simulation.
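As a rough illustration of the accelerometer item above, the following minimal sketch estimates roll and pitch from a single reading taken while the robot is at rest. The function name and the axis convention (x forward, y left, z up, with the sensor reading +g on z when upright) are assumptions made for this example, not details taken from the robot's firmware.

```python
import math


def tilt_from_accelerometer(ax, ay, az):
    """Estimate (roll, pitch) in radians from a stationary accelerometer reading.

    Assumed convention (hypothetical, for illustration only): body axes are
    x forward, y left, z up, and an upright, stationary sensor reads (0, 0, +g).
    Yaw cannot be recovered from gravity alone, so it is not returned.
    """
    # Roll is the rotation about the x axis: it mixes the y and z components.
    roll = math.atan2(ay, az)
    # Pitch is the rotation about the y axis: it moves gravity onto the x axis.
    pitch = math.atan2(-ax, math.hypot(ay, az))
    return roll, pitch


if __name__ == "__main__":
    g = 9.81
    # A reading from a sensor rolled by 0.1 rad about its x axis.
    print(tilt_from_accelerometer(0.0, g * math.sin(0.1), g * math.cos(0.1)))
```

The estimate is only meaningful while the unicycle is not accelerating, which is why it suits setting the initial orientation rather than tracking it during a rollout.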
Numerous new problems were found in the existing software and electronics during attempts to replicate previous experiments.
Since many of these stemmed from misunderstandings of how orientation is represented in 3D, an appendix is included that summarizes these ideas.
Rather than continuing to use these slow full-system experiments to expose problems, this work focussed on critically reviewing the source code and performing simple hardware tests.
Code review revealed many hard-to-decipher pieces of code, for which new abstractions were written to ease their readability, applying modern software engineering principles – often exposing bugs in the process.
Overall, a large amount of software was rewritten, which represented the bulk of the work on this project.
Minor improvements were also made to the human interfaces to the software, both graphical and logical in nature – greatly improving its transparency, enabling problems to be more easily identified.
Simulations were performed to evaluate a claim that a 17° roll restriction was impairing learning progress. The claim was shown to be true, but a simple remedy, adding a cost function term, was proposed and tested; it diminished this effect in simulation without requiring a mechanical redesign.
Despite this, learning efforts on the hardware were even less successful than before, and did not meet the expectations set by simulations.
In outlining future work for this project, it is noted that the Pilco framework would benefit from the application of Automatic Differentiation, possibly necessitating a programming language change.
It is also argued that an affine controller is unlikely ever to be satisfactory, and a thought experiment is used to show that a quadratic controller is desirable.
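For reference, the two controller classes can be written as follows. The symbols are assumptions introduced here for illustration: $\mathbf{x}$ is the state, $u_i$ is the $i$-th actuator command, and $\mathbf{w}_i$, $b_i$ and $Q_i$ are learned parameters, with $Q_i$ taken to be symmetric.
\[
  u_i^{\text{affine}}(\mathbf{x}) = \mathbf{w}_i^{\top}\mathbf{x} + b_i ,
  \qquad
  u_i^{\text{quadratic}}(\mathbf{x}) = \mathbf{x}^{\top} Q_i\,\mathbf{x} + \mathbf{w}_i^{\top}\mathbf{x} + b_i .
\]
The affine form has a constant gain, $\partial u_i / \partial \mathbf{x} = \mathbf{w}_i$, whereas the quadratic form has a state-dependent gain, $\partial u_i / \partial \mathbf{x} = 2 Q_i \mathbf{x} + \mathbf{w}_i$, and it reduces to the affine controller when $Q_i = 0$.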
Finally, observations are made that the state constraints need to be integrated into the prediction process to avoid underestimating loss.
The full text of this report in PDF format (complete with hyperlinks), selected raw data used within, and links to all the source code used, will be made available online at \url{https://github.com/eric-wieser/masters-thesis/releases}.