rl-texplore-ros-pkg References

This page lists references for the code provided in rl-texplore-ros-pkg, including the reinforcement_learning stack and all of its packages.

  • Beeson P, O’Quin J, Gillan B, Nimmagadda T, Ristroph M, Li D, Stone P (2008) Multiagent interactions in urban driving. Journal of Physical Agents 2(1):15–30
  • Brafman R, Tennenholtz M (2001) R-Max - a general polynomial time algorithm for near-optimal reinforcement learning. In: Proceedings of the Seventeenth International Joint Conference on Artificial Intelligence (IJCAI), pp 953–958
  • Breiman L (2001) Random forests. Machine Learning 45(1):5–32
  • Dietterich T (1998) The MAXQ method for hierarchical reinforcement learning. In: Proceedings of the Fifteenth International Conference on Machine Learning (ICML), pp 118–126
  • Hester T, Stone P (2010) Real time targeted exploration in large domains. In: Proceedings of the Ninth International Conference on Development and Learning (ICDL)
  • Hester T, Quinlan M, Stone P (2012) A real-time model-based reinforcement learning architecture for robot control. In: Proceedings of the IEEE International Conference on Robotics and Automation (ICRA)
  • Hester T, Stone P (2012) Intrinsically motivated model learning for a developing curious agent. In: Proceedings of the Eleventh International Conference on Development and Learning (ICDL)
  • Hester T, Stone P (2013) TEXPLORE: Real-time sample-efficient reinforcement learning for robots. Machine Learning 90(3):385–429
  • Kocsis L, Szepesvári C (2006) Bandit based Monte-Carlo planning. In: Proceedings of the Seventeenth European Conference on Machine Learning (ECML)
  • Konidaris G, Barto A (2007) Building Portable Options: Skill Transfer in Reinforcement Learning. In: Proceedings of the Twentieth International Joint Conference on Artificial Intelligence (IJCAI)
  • McCallum A (1996) Learning to use selective attention and short-term memory in sequential tasks. In: From Animals to Animats 4: Proceedings of the Fourth International Conference on Simulation of Adaptive Behavior
  • Moore A, Atkeson C (1993) Prioritized sweeping: Reinforcement learning with less data and less real time. Machine Learning 13:103–130
  • Quigley M, Conley K, Gerkey B, Faust J, Foote T, Leibs J, Wheeler R, Ng A (2009) ROS: an open-source robot operating system. In: ICRA Workshop on Open Source Software
  • Quinlan R (1986) Induction of decision trees. Machine Learning 1:81–106
  • Quinlan R (1992) Learning with continuous classes. In: 5th Australian Joint Conference on Artificial Intelligence, World Scientific, Singapore, pp 343–348
  • Rummery G, Niranjan M (1994) On-line Q-learning using connectionist systems. Tech. Rep. CUED/F-INFENG/TR 166, Cambridge University Engineering Department
  • Strehl A, Diuk C, Littman M (2007) Efficient structure learning in factored-state MDPs. In: Proceedings of the Twenty-Second AAAI Conference on Artificial Intelligence, pp 645–650
  • Sutton R (1990) Integrated architectures for learning, planning, and reacting based on approximating dynamic programming. In: Proceedings of the Seventh International Conference on Machine Learning (ICML), pp 216–224
  • Sutton R, Barto A (1998) Reinforcement Learning: An Introduction. MIT Press, Cambridge, MA
  • Tanner B, White A (2009) RL-Glue: Language-independent software for reinforcement-learning experiments. Journal of Machine Learning Research 10:2133–2136
  • Watkins C (1989) Learning from delayed rewards. PhD thesis, University of Cambridge

Wiki: rl-texplore-ros-pkg/rl_references (last edited 2013-02-08 19:35:16 by ToddHester)