I finally solved the pendulum swing-up task the other day. Here's a picture of it balancing itself:
The main change that made this possible is a fairly major one: I no longer use radial basis functions. Instead, I discretize each incoming continuous input into separate "boxes" (to use the terminology from the literature) and enumerate every possible combination of boxes across the inputs. In the case of the pendulum, say we discretize the two inputs (pendulum angle and angular velocity) into 12 boxes each. The intermediate state representation is then an array of 12 x 12 = 144 possible combinations, each one representing a unique state. It's a brute-force method, and the number of combinations grows quickly as inputs are added, but it's very reliable and easy to understand. I may come back to radial basis functions and Hebbian learning mechanisms later to form a more compact state representation.
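To make the box idea concrete, here is a minimal Python sketch of that kind of discretization. It is not Verve's actual code: the function names, the input ranges (angle in [-pi, pi], angular velocity in [-8, 8]), and the one-hot encoding of the active state are all assumptions made for illustration.

```python
import math

def box_index(value, low, high, num_boxes):
    """Map a continuous value in [low, high] to a box index in [0, num_boxes - 1]."""
    # Clamp so out-of-range readings fall into the edge boxes.
    clamped = min(max(value, low), high)
    fraction = (clamped - low) / (high - low)
    # The upper bound itself would map to num_boxes, so cap it at the last box.
    return min(int(fraction * num_boxes), num_boxes - 1)

def state_index(values, ranges, boxes_per_input):
    """Combine the per-input box indices into one index over all combinations."""
    index = 0
    for value, (low, high) in zip(values, ranges):
        index = index * boxes_per_input + box_index(value, low, high, boxes_per_input)
    return index

def one_hot_state(values, ranges, boxes_per_input):
    """Return a list with one entry per combination; only the active one is 1.0."""
    state = [0.0] * (boxes_per_input ** len(values))
    state[state_index(values, ranges, boxes_per_input)] = 1.0
    return state

# Hypothetical ranges for the pendulum's angle and angular velocity.
PENDULUM_RANGES = [(-math.pi, math.pi), (-8.0, 8.0)]
BOXES_PER_INPUT = 12

state = one_hot_state((0.0, 0.0), PENDULUM_RANGES, BOXES_PER_INPUT)
print(len(state))
```

With two inputs and 12 boxes each, the list has 144 entries, matching the count above. A third input at the same resolution would need 1,728 entries, which is why a more compact representation may be worth revisiting.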
Here are some pictures of the pendulum's (simple) neural network before and after learning the task. Excitatory connections are green, inhibitory ones are red. Each connection's diameter represents the magnitude of its weight.
Here's the list of tasks solved so far, with the number of inputs and outputs given as (inputs/outputs):
- N-armed bandit (0/10)
- hot plate (1/3)
- signaled hot plate (2/3)
- 2D signaled hot plate (3/5)
- pendulum swing up (2/3)
Next up: the inverted pendulum (aka cart-pole)...
One more thing: I used SWIG to generate Python bindings. Verve seems to work pretty well as a Python module. I don't know if I'll use it much right away, but it's good to have around.