So what's new over the past 2 months? Mainly, I have been creating lots of different tests (unit tests and "learning tests"). I was on the brink of finishing the pendulum swing-up task... for about two weeks. It got frustrating after seeing the pendulum almost hold itself up for the 1000th time, so I took a step back and decided to go about things more scientifically. I developed a framework for what I'm calling "learning tests" - simple learning tasks that can be thrown together quickly and added to an ever-growing suite. Every time I make a significant change in how the agents work, I can run the suite of learning tests and make sure the agents perform each one above some threshold. I also plot the results of each one using gnuplot (http://www.gnuplot.info). Here are a few example plots (green is overall performance, red is temporal difference error):
Also, I wrote a bunch of unit tests to make sure each functional component is working correctly even after I make big changes. (I decided to write my own C++ unit testing framework. Check here if you're interested: http://quicktest.sf.net)
The purpose of adding all these tests is so I can scale up the learning task complexity a little at a time and find out exactly where problems occur. This seems much more reasonable than throwing a bunch of complicated systems together (learning agent, physics simulation, etc.) and hoping the result can learn efficiently.
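To make the "learning test" idea a bit more concrete, here is a minimal sketch of what such a suite could look like. It is not the actual framework: the names LearningTest and runSuite are made up for illustration, and I'm assuming each test boils down to a function that trains an agent and returns a single performance number to compare against a threshold.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical description of one learning test (names are illustrative,
// not taken from the real framework).
struct LearningTest
{
    std::string name;
    std::function<double()> run; // trains an agent, returns its final performance
    double threshold;            // minimum performance required to pass
};

// Runs every test in the suite and returns the names of the tests whose
// performance fell below their threshold. An empty result means all passed.
std::vector<std::string> runSuite(const std::vector<LearningTest>& suite)
{
    std::vector<std::string> failures;
    for (const LearningTest& test : suite)
    {
        assert(test.run && "every learning test needs a run function");
        const double performance = test.run();
        if (performance < test.threshold)
        {
            failures.push_back(test.name);
        }
    }
    return failures;
}
```

Adding a new task is then just a matter of appending another entry to the suite, which is what keeps the suite cheap to grow one small step in complexity at a time.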
Another big change has been in the area of input representation. Now the agents' inputs are much more customizable. You can choose between 'continuous' and 'discrete' input channels. Continuous inputs are real values within [-1, 1]. They have a 'resolution' option which determines how many radial basis functions are used when encoding the input signal as a population code. Discrete inputs have a finite number of possible options. You could have a discrete input channel with 13 possible options representing ace, 2, 3, ..., 9, 10, jack, queen, king. Or you could have one with 2 possible options: alarm on, alarm off.
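Here is a rough sketch of the two encodings, just to illustrate the idea. This is not Verve's actual code: the function names are hypothetical, and I'm assuming Gaussian basis functions with evenly spaced centers and a width equal to the center spacing for the continuous case, and a simple one-of-N vector for the discrete case.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Encodes a continuous input in [-1, 1] as a population code. One Gaussian
// radial basis function is used per unit of 'resolution'; centers are spread
// evenly over [-1, 1]. The Gaussian width (equal to the center spacing) is
// an assumption made for this sketch.
std::vector<double> encodeContinuous(double value, std::size_t resolution)
{
    assert(resolution >= 2);
    assert(value >= -1.0 && value <= 1.0);
    const double spacing = 2.0 / static_cast<double>(resolution - 1);
    std::vector<double> activations(resolution);
    for (std::size_t i = 0; i < resolution; ++i)
    {
        const double center = -1.0 + spacing * static_cast<double>(i);
        const double distance = (value - center) / spacing;
        activations[i] = std::exp(-0.5 * distance * distance);
    }
    return activations;
}

// Encodes a discrete input as a one-of-N vector: the chosen option is set
// to 1 and every other option to 0.
std::vector<double> encodeDiscrete(std::size_t option, std::size_t numOptions)
{
    assert(option < numOptions);
    std::vector<double> activations(numOptions, 0.0);
    activations[option] = 1.0;
    return activations;
}
```

With this sketch, encodeContinuous(0.0, 5) peaks at the middle basis function and falls off symmetrically on either side, while encodeDiscrete(12, 13) would represent the king in the 13-option card example (counting options from 0).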
I'm continually excited about working on this project. It helps my morale to have the unit tests and learning tests. I feel like I can keep making incremental progress this way.
Oh, one more thing... I just bought a new Dell Dimension 9100: dual core 2.8 GHz (to inspire me to make Verve multithreaded :) ), 1 GB RAM, nVidia GeForce 6800, and a 24" LCD.