1

ROS rl (texplore) package

asked 2012-08-01 19:22:37 -0600

ocli

updated 2012-08-02 07:45:12 -0600

joq

Hello,

I understand that the rl texplore stack can model a continuous-state environment. I'm having some trouble running the agent/environment packages with a custom environment, because my problem is modelled with an 'infinitely' large state space.

Is this possible, or is there a known workaround? Or will I have to cut my losses and discretize the state-space?

Thanks for any knowledge on this area.

Regards,

Oliver


2 Answers

2

answered 2012-08-02 07:56:39 -0600

Nan Jiang

I have looked into the code in rl texplore a little. From what you describe, I assume you are using a model-based agent. As far as I know, although continuous regression methods such as m5tree are available as modelling techniques, you still have to discretize your state space for planning/learning (from memory, they did not implement value function approximation algorithms), so an infinitely large state space may be problematic.

I do not know whether you are doing planning or learning. Planning should be OK with an infinitely large state space. For learning, the problem is that you cannot keep an infinitely large table of Q-values (I think their code asks for parameters that limit the range of each attribute in the state vector). You could implement your own function approximation algorithm to solve this.


Comments

Sorry I didn't clarify a few things. I'm trying to compare a few planning methods (model-based), with an agent that I'm planning on integrating into the ROS rl architecture. This new technique uses value function approximation so I was hoping ROS had integrated a similar approach already.

ocli (2012-08-02 20:19:21 -0600)

Thanks for the post though, I didn't realise they had integrated a continuous regression method.

ocli (2012-08-02 20:20:20 -0600)
1

answered 2012-08-09 09:52:13 -0600

toddhester

Hi,

Sorry for the slow reply. I do have methods to model continuous domains in the package, such as using M5 regression trees or linear regression models. However, all of the planning methods that I have implemented for the model-based methods eventually store a discrete value function. So for the planning you would have to discretize.

For the UCT planning, the planning rollouts happen in continuous space, using the continuous model. Then the values are updated back to a discretized state. So this planning method is less dependent on a good discretization, i.e. you won't have state aliasing problems if the discretization is too coarse, as each next state in the rollout still uses the continuous model and state. The discretization is only necessary for UCT(lambda) with lambda < 1, where the update for a state bootstraps from the value of the next state. If you run it with lambda=1, the agent's current state is only updated with the full return of the planning rollout, and the discretization isn't really necessary.
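To illustrate why lambda = 1 removes the dependence on stored state values, here is a minimal sketch of a lambda-return computed backwards along one rollout. It is not the package's code: the function name `lambda_returns`, its arguments, and the choice to treat the return beyond the end of the rollout as zero are all assumptions made for illustration.

```python
def lambda_returns(rewards, next_values, gamma, lam):
    """Backward recursion for the lambda-return along one rollout.

    rewards[t]     : reward received at step t of the rollout
    next_values[t] : current value estimate of the state reached after
                     step t (in a tabular planner this lookup is what
                     needs a discretized state)
    gamma          : discount factor
    lam            : lambda in [0, 1]
    """
    returns = [0.0] * len(rewards)
    g = 0.0  # assumption: nothing is added beyond the end of the rollout
    for t in reversed(range(len(rewards))):
        # Mix the bootstrapped estimate with the sampled return.
        g = rewards[t] + gamma * ((1.0 - lam) * next_values[t] + lam * g)
        returns[t] = g
    return returns


# With lam = 1 the stored values drop out of the result entirely.
full = lambda_returns([1.0, 1.0, 1.0], [10.0, 20.0, 30.0], gamma=0.5, lam=1.0)
# With lam = 0 each return is a one-step bootstrap from the next state's value.
one_step = lambda_returns([1.0, 1.0, 1.0], [10.0, 20.0, 30.0], gamma=0.5, lam=0.0)
```

In this sketch, `next_values` is multiplied by `(1 - lam)`, so at lam = 1 the table lookup contributes nothing and only the rollout's own rewards matter.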

So if you want to do planning in a continuous and infinite domain, I would recommend planning with UCT(lambda) and setting lambda = 1. The model-based methods take as input the minimum and maximum values for each state feature, which are used to bound the values that are updated, as well as the number of discrete values you want to discretize each feature into. If you can put some bounds on the state features (even ones that are wildly too big) and then pass 1 as the nstates parameter (the number of discrete values per feature), I think it should work.
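As a rough picture of what bounding and discretizing each feature means, the sketch below clamps every feature to its [min, max] range and maps it to one of a fixed number of bins. This is only an illustration under assumed names (`discretize`, `mins`, `maxs`, `n_bins`); the package's actual discretization may differ in detail.

```python
def discretize(state, mins, maxs, n_bins):
    """Map a continuous state vector to a tuple of bin indices.

    Each feature is clamped to [mins[i], maxs[i]] and then split into
    n_bins equal-width bins. Assumes maxs[i] > mins[i] and n_bins >= 1.
    """
    indices = []
    for x, lo, hi in zip(state, mins, maxs):
        x = min(max(x, lo), hi)            # clamp out-of-range values
        width = (hi - lo) / n_bins
        idx = int((x - lo) / width)
        indices.append(min(idx, n_bins - 1))  # x == hi falls in the last bin
    return tuple(indices)


# Four bins per feature: nearby states share a cell.
cell = discretize([0.5, -3.0], mins=[0.0, -10.0], maxs=[1.0, 10.0], n_bins=4)

# One bin per feature: every state, however extreme, maps to the same cell.
single = discretize([1e9, -1e9], mins=[0.0, -10.0], maxs=[1.0, 10.0], n_bins=1)
```

With one bin per feature the table has a single entry, which is harmless only when nothing bootstraps from it, i.e. in the lambda = 1 case described above.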

Another alternative is to adapt some of the model-free methods with function approximation (e.g. Q-learning with tile coding) as the planner for the model-based method using the continuous models.
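For reference, a minimal sketch of Q-learning with tile coding is shown below. It is a generic illustration, not code from the package: the class name `TileCodedQ`, its parameters, and the simple offset-grid tilings are assumptions. Weights are stored sparsely in a dict, so the state features do not need bounds.

```python
class TileCodedQ:
    """Linear Q-function over several offset grid tilings (hypothetical sketch)."""

    def __init__(self, n_actions, n_tilings=8, tile_width=1.0,
                 alpha=0.1, gamma=0.95):
        self.n_actions = n_actions
        self.n_tilings = n_tilings
        self.tile_width = tile_width
        self.alpha = alpha / n_tilings  # split the step size across tilings
        self.gamma = gamma
        self.weights = {}               # sparse: only updated tiles are stored

    def _tiles(self, state, action):
        # One active tile per tiling; each tiling is shifted by a fraction
        # of the tile width so nearby states share some, but not all, tiles.
        keys = []
        for k in range(self.n_tilings):
            offset = k * self.tile_width / self.n_tilings
            coords = tuple(int((x + offset) // self.tile_width) for x in state)
            keys.append((k, coords, action))
        return keys

    def value(self, state, action):
        return sum(self.weights.get(key, 0.0)
                   for key in self._tiles(state, action))

    def update(self, state, action, reward, next_state, done=False):
        # Standard Q-learning target: reward plus discounted greedy value.
        target = reward
        if not done:
            target += self.gamma * max(self.value(next_state, a)
                                       for a in range(self.n_actions))
        error = target - self.value(state, action)
        for key in self._tiles(state, action):
            self.weights[key] = self.weights.get(key, 0.0) + self.alpha * error
        return error


q = TileCodedQ(n_actions=2, n_tilings=4, tile_width=1.0, alpha=0.5, gamma=0.9)
first_error = q.update((0.3,), 0, reward=1.0, next_state=(0.3,), done=True)
for _ in range(50):
    q.update((0.3,), 0, reward=1.0, next_state=(0.3,), done=True)
```

Used as a planner, the `update` calls would be fed transitions sampled from the learned continuous model rather than from the real environment; how that would plug into the package's planner interface is not shown here.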

Thanks, Todd


Stats

Asked: 2012-08-01 19:22:37 -0600

Seen: 351 times

Last updated: Aug 09 '12