I've been thinking for a while about how to make the training system work smoothly with the desired behaviors. So far I've been creating a separate C++ class for each desired behavior. This works fairly well, but they usually take a while to tweak. It would be nice to have a separate script file for each desired behavior. So... I've been learning to use Lua. I think Lua scripts will work nicely for this purpose. I won't need anything too fancy, and it seems like Lua is getting pretty popular with game developers.
I hit a roadblock yesterday when I was thinking about how Lua would work with Verve. The users will have full control over their character models and virtual environments, but the desired behaviors (whether scripts or C++ functions/classes) will need to have full access to the character model and possibly the environment to decide how well the character's controller is doing.
The way I did this with my desired behavior C++ classes was this (using "walking" as an example): each frame/time step a pointer to the character would be passed to the Walking class. This class would examine the character's body (how far it had moved forward, how many steps it had taken) and update the fitness score. The problem with doing this in Lua scripts is that the users would have to create all the Lua/C++ bindings to let the script access the character and environment. This would work, but it makes the API uglier.
So I came up with this idea: the user will create a set of metrics to measure in the C++ code. For example, let's say the user is training a simulated robot to stack blocks. Each time step the user could call Robot->GetMetrics() and maybe Environment->GetMetrics(). These return the number of steps the robot has taken, how much energy he has used, the height of his head, the number of blocks stacked up in the environment, etc. Notice that some of the numbers might not be needed for all desired behaviors; they are just a bunch of number that might be used. Then, the numbers all get passed to the Stacking Blocks Lua script which measures only the number of blocks stacked and returns a fitness score. Another Lua script (Jumping) might use other numbers (the head height) to calculate a fitness score.
I'll test this method to see how well it works. I want the API to be very clean, and I don't want the users to have to write tons of extra code. To give them the freedom to use their own character code, though, they'll still have to do some extra work.
I think I may also give them the option of using Lua scripts or not. They may just want to write the desired behaviors in C++ anyway (maybe because they don't know/want to learn Lua or because the Lua scripts might slow down the training process).
No comments:
Post a Comment