Thursday, November 16, 2006

Videos - Reinforcement Learning Control Tasks

These videos are from two reinforcement learning control problems I set up for my master's thesis work last year.


Pendulum Swing-Up
A physically simulated pendulum is controlled by a Verve agent (http://verve-agents.sourceforge.net). Based on simple reinforcement signals (+1 when the pendulum is close to vertical, -1 otherwise), the agent learns to swing the pendulum upright and balance it after about 60 trials.
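To make the reward scheme concrete, here is a minimal sketch of a signal of that shape in Python. It is not the code from the thesis: the function name, the angle convention (0 radians means upright), and the 10-degree tolerance are all assumptions chosen for illustration, since "close to vertical" is not given a number here.

```python
import math

# Assumed tolerance: how close counts as "close to vertical" is a guess.
UPRIGHT_TOLERANCE_RAD = math.radians(10.0)


def pendulum_reward(angle_from_vertical_rad, tolerance_rad=UPRIGHT_TOLERANCE_RAD):
    """Return +1.0 when the pendulum is near upright, -1.0 otherwise.

    The angle is measured from the upright position (assumed convention).
    """
    # Wrap the angle into [-pi, pi] so full rotations do not affect the result.
    wrapped = math.atan2(math.sin(angle_from_vertical_rad),
                         math.cos(angle_from_vertical_rad))
    return 1.0 if abs(wrapped) <= tolerance_rad else -1.0
```

A signal like this never tells the agent how to swing up. It only marks whether the current state is near upright, so the swinging strategy has to be discovered through trial and error.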






Cart-Pole/Inverted Pendulum
A physically simulated cart is controlled by a Verve agent (http://verve-agents.sourceforge.net) in order to balance an attached pole. Based on simple reinforcement signals (-1 when the pole falls over or the cart goes off the edge of the platform, +1 otherwise), the agent learns to balance the pole for 30 minutes after about 600 trials.
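As a rough illustration, a reward signal of this shape could be sketched in Python as follows. This is not the code from the thesis: the function name, the 45-degree "fallen" threshold, and the 2-meter platform half-width are hypothetical values picked only to make the example runnable.

```python
import math

# Assumed limits: neither threshold is stated in this post.
MAX_POLE_ANGLE_RAD = math.radians(45.0)  # beyond this the pole counts as fallen
MAX_CART_OFFSET_M = 2.0                  # half-width of the platform


def cart_pole_reward(pole_angle_rad, cart_position_m,
                     max_angle_rad=MAX_POLE_ANGLE_RAD,
                     max_offset_m=MAX_CART_OFFSET_M):
    """Return -1.0 on failure (pole fell or cart left the platform), else +1.0.

    The pole angle is measured from vertical and the cart position from the
    center of the platform (both assumed conventions).
    """
    pole_fell = abs(pole_angle_rad) > max_angle_rad
    cart_off_platform = abs(cart_position_m) > max_offset_m
    return -1.0 if (pole_fell or cart_off_platform) else 1.0
```

Either failure condition alone produces the -1, so the agent has to keep the pole up and keep the cart on the platform at the same time.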


4 comments:

Anonymous said...

Hi, I've been looking for a ready-to-use neural net toolset to try out with Simbad and Robocode (see SourceForge). Verve is one of the candidates. I was wondering if you knew of anyone using Verve in other projects? Your website doesn't have any details and the forum is somewhat lacking in contacts.

Also, do your libraries perform any callbacks to the main application? It looks pretty self-contained: it only needs to be given input, and the main program retrieves an action when it's ready.

Thanks for your hard work. All the best! Sean Underwood

Tyler Streeter said...

Hi Sean,

It looks like Simbad and Robocode are Java frameworks. Were you planning on using a Java version of Verve? If so, I would suggest using SWIG to generate Java bindings. (Look at the provided Python bindings for an example. You can probably use the same SWIG interface file.)

I don't know of anyone using Verve in a project right now. Version 0.1.0 has been downloaded almost 200 times, so I'll bet there are people out there using it. What details do you need? I haven't posted any tutorials (which would be extremely helpful for beginners), but there is a pseudo code sample on the website and full API documentation. Also, there are sample applications with the SDK.

You're right - Verve is pretty self-contained. It doesn't make callbacks to the application.

Ravi said...

Wow!
Very nice! Which tools did you use to produce these videos? Specifically, what did you do to capture your screen and convert to Flash?

Thanks in advance.

Tyler Streeter said...

Ravi-

For these videos I saved screenshots directly from my code (i.e., constantly grabbing the OpenGL frame buffer and saving it as image files), then converted the images to a DivX .avi file using VirtualDub (http://www.virtualdub.org). Then I uploaded the .avi file to Google Video, which converts it to Flash.

Now VirtualDub has a nice screen capture mode, so I would probably use it to generate the .avi directly instead of saving frames from my code.

Tyler