@mishoo
Created November 20, 2011 15:35
Unit 12.3 — value iteration
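(defpackage :value-iteration
  (:use :cl))

(in-package :value-iteration)

(defun value-iteration (&key grid is-term actions next-state cost gamma)
  "Run one in-place sweep of value iteration over GRID.
NEXT-STATE is called with (action i j) and returns a list of (i j p)
outcomes. Returns true if any cell value changed during the sweep."
  (loop :with changed = nil
        :for i :from 0 :to (1- (array-dimension grid 0))
        :finally (return changed)
        :do (loop :for j :from 0 :to (1- (array-dimension grid 1))
                  :with val
                  ;; Terminal cells keep their fixed value.
                  :unless (funcall is-term i j) :do
                    ;; Best expected value over all actions.
                    (let ((best (loop :for a :in actions
                                      :maximize (reduce #'+
                                                        (funcall next-state a i j)
                                                        :key (lambda (m)
                                                               ;; This i and j are the successor
                                                               ;; cell; they shadow the outer ones.
                                                               (destructuring-bind (i j p) m
                                                                 (* p (aref grid i j))))))))
                      (setf val (+ cost (* gamma best))))
                    (when (/= val (aref grid i j))
                      (setf changed t
                            (aref grid i j) val)))))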
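
(defun test ()
  "Iterate to a fixed point on a 2x4 grid and return the value array."
  (let ((a (make-array '(2 4)
                       :initial-contents '((0 0 0 0)
                                           (-100 0 0 100)))))
    ;; Sweep repeatedly until a sweep changes nothing.
    (loop :while
          (value-iteration
           :grid a
           :cost -4
           :gamma 1
           ;; Terminal states: bottom-left (-100) and bottom-right (100).
           :is-term (lambda (i j)
                      (and (= i 1)
                           (or (= j 0) (= j 3))))
           :actions '(:east :west :north :south)
           ;; Each action moves as intended with probability 0.8 and in the
           ;; opposite direction with probability 0.2; moves off the grid
           ;; are clamped to the nearest cell.
           :next-state (lambda (action i j)
                         (flet ((ibound (i)
                                  (if (< i 0) 0
                                      (min i (1- (array-dimension a 0)))))
                                (jbound (j)
                                  (if (< j 0) 0
                                      (min j (1- (array-dimension a 1))))))
                           (case action
                             (:east `((,i ,(jbound (1+ j)) 0.8)
                                      (,i ,(jbound (1- j)) 0.2)))
                             (:west `((,i ,(jbound (1- j)) 0.8)
                                      (,i ,(jbound (1+ j)) 0.2)))
                             (:south `((,(ibound (1+ i)) ,j 0.8)
                                       (,(ibound (1- i)) ,j 0.2)))
                             (:north `((,(ibound (1- i)) ,j 0.8)
                                       (,(ibound (1+ i)) ,j 0.2))))))))
    a))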
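
;; A possible way to inspect the result of TEST, sketched as an illustration.
;; PRINT-GRID is a hypothetical helper, not part of the original gist; it
;; assumes GRID is a rank-2 array of real numbers, like the array TEST returns.
(defun print-grid (grid &optional (stream *standard-output*))
  "Print GRID one row per line, each value with two decimals in a 10-wide column."
  (dotimes (i (array-dimension grid 0))
    (dotimes (j (array-dimension grid 1))
      (format stream "~10,2F" (aref grid i j)))
    (terpri stream)))

;; Usage, from the REPL:
;; (print-grid (test))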