Monte Carlo Tree Search Guided by Symbolic Advice for MDPs

Duration: 29 mins 28 secs

Share this media item:

Embed this media item:

<iframe width="_width_" height="_height_" src="https://sms.cam.ac.uk/media/3527397/embed" frameborder="0" scrolling="no" allowfullscreen></iframe>

Choose size:

About this item

Available Formats

About this item

Description:	Raskin, J Wednesday 12th May 2021 - 18:00 to 18:30


Created:	2021-05-18 11:01
Collection:	Verified software: from theory to practice
Publisher:	Isaac Newton Institute for Mathematical Sciences
Copyright:	Raskin, J
Language:	eng (English)


Abstract:	In this talk, we consider the online computation of a strategy that aims at optimizing the expected average reward in a Markov decision process. The strategy is computed with a receding horizon and using Monte Carlo tree search (MCTS). We augment the MCTS algorithm with the notion of symbolic advice, and show that its classical theoretical guarantees are maintained. Symbolic advice are used to bias the selection and simulation strategies of MCTS. We describe how to use QBF and SAT solvers to implement symbolic advice in an efficient way. We illustrate our new algorithm using the popular game Pac-Man and show that the performances of our algorithm exceed those of plain MCTS as well as the performances of human players.

Abstract:

In this talk, we consider the online computation of a strategy that aims at optimizing the expected average reward in a Markov decision process. The strategy is computed with a receding horizon and using Monte Carlo tree search (MCTS). We augment the MCTS algorithm with the notion of symbolic advice, and show that its classical theoretical guarantees are maintained. Symbolic advice are used to bias the selection and simulation strategies of MCTS. We describe how to use QBF and SAT solvers to implement symbolic advice in an efficient way. We illustrate our new algorithm using the popular game Pac-Man and show that the performances of our algorithm exceed those of plain MCTS as well as the performances of human players.

Available Formats

Format	Quality	Bitrate	Size
MPEG-4 Video	640x360	947.82 kbits/sec	204.56 MB	View	Download
WebM	640x360	361.01 kbits/sec	77.96 MB	View	Download
iPod Video	480x270	479.88 kbits/sec	103.57 MB	View	Download
MP3	44100 Hz	249.84 kbits/sec	53.98 MB	Listen	Download
Auto *	(Allows browser to choose a format it supports)