
Paper Details

임무수행을 위한 개선된 강화학습 방법
An Improved Reinforcement Learning Technique for Mission Completion

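권우영 (Graduate School of Information and Communications, Hanyang University); 이상훈 (Department of Electrical, Electronic, Control and Instrumentation Engineering, Hanyang University); 서일홍 (Graduate School of Information and Communications, Hanyang University)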
  • Abstract

    Reinforcement learning (RL) has been widely used as a learning mechanism for artificial life systems. However, RL usually suffers from slow convergence to the optimum state-action sequence, or sequence of stimulus-response (SR) behaviors, and may not work correctly in non-Markov processes. This paper addresses both problems. First, to cope with slow convergence, state-action pairs that are considered disturbances to the optimum sequence are eliminated from long-term memory (LTM); such disturbances are found by a shortest-path-finding algorithm. This process is shown to improve learning speed. Second, to partly solve the non-Markov problem, a stimulus that is met frequently during the search process is classified as a sequential percept for a non-Markov hidden state, so that a correct behavior for the hidden state can be learned as in a Markov environment. Several simulation results are presented to show the validity of the proposed learning techniques.
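
    The abstract does not state which shortest-path algorithm is used or how LTM is laid out, so the following is only a minimal sketch of the first idea under stated assumptions: transitions are deterministic, LTM is a plain list of (state, action, next_state) triples recorded during one successful episode, and breadth-first search stands in for the path-finding step. The names shortest_path and prune_disturbances are hypothetical and do not come from the paper.

```python
from collections import deque


def shortest_path(transitions, start, goal):
    """Breadth-first search over the observed transitions.

    Returns the (state, action) pairs along one shortest path from
    start to goal, or None if the goal was never reached.
    """
    # Build an adjacency list from the recorded experience.
    graph = {}
    for state, action, next_state in transitions:
        graph.setdefault(state, []).append((action, next_state))

    # parent[s] holds the (previous_state, action) that first reached s.
    parent = {start: None}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state == goal:
            break
        for action, next_state in graph.get(state, []):
            if next_state not in parent:
                parent[next_state] = (state, action)
                queue.append(next_state)

    if goal not in parent:
        return None

    # Walk back from the goal to the start to recover the path.
    path = []
    node = goal
    while parent[node] is not None:
        prev_state, action = parent[node]
        path.append((prev_state, action))
        node = prev_state
    path.reverse()
    return path


def prune_disturbances(ltm, start, goal):
    """Split LTM into pairs on a shortest path and the rest (disturbances)."""
    path = shortest_path(ltm, start, goal)
    if path is None:
        # No route to the goal is known, so nothing can be pruned yet.
        return None, []
    on_path = set(path)
    disturbances = [(s, a) for s, a, _ in ltm if (s, a) not in on_path]
    return path, disturbances


# Hypothetical episode: the detour B -> X -> B does not help reach G.
episode = [
    ("A", "right", "B"),
    ("B", "up", "X"),
    ("X", "down", "B"),
    ("B", "right", "C"),
    ("C", "right", "G"),
]
kept, removed = prune_disturbances(episode, "A", "G")
print(kept)
print(removed)
```

    In this sketch the detour pairs are the ones reported as disturbances; only the pairs on the shortest path would then be reinforced. A stochastic environment or weighted step costs would need a different search (for example Dijkstra's algorithm), which is outside what the abstract specifies.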


  • Keywords

    reinforcement learning; delayed reward; Markov process; batch process

