同 project2 一样,project4 也是要求我们编写代码来控制 pacman 的行动来刷分。
在这一次的任务中,最初,整个地图是不可见的。
pacman 需要根据探索到的已知信息来推断地图上的房子哪一个有食物,哪一个是鬼屋。
最终进入食物屋吃到食物。
为了从已知信息中推断出隐含信息,这里使用了贝叶斯网络来进行推导。
解决本任务需要对贝叶斯网络有一定了解,这里简单介绍一下。
贝叶斯网络维基百科词条:https://zh.wikipedia.org/wiki/%E8%B2%9D%E6%B0%8F%E7%B6%B2%E8%B7%AF
贝叶斯网络是一种概率图型模型,其表现形式是有向无环图。 一般而言,贝叶斯网络的有向无环图中的节点表示随机变量。 连接两个节点的箭头代表此两个随机变量是具有因果关系或是非条件独立的; 若两个节点间以一个单箭头连接在一起,表示其中一个节点是“因(parents)”,另一个是“果(descendants or children)”,两节点就会产生一个条件概率值。 而两个节点间若没有箭头则为条件独立。
如上图中,共有三个变量X2、X4、X5。
其中,X4-->X5, X4-->X2, X5-->X2。
X4-->X5 表示: 在 X4 发生的前提下,X5 发生的概率。
Question 1需要我们构造一个初始的贝叶斯网络。
我们先要了解项目中贝叶斯网络的构建方法。
在代码框架中,bayesNet.py 实现了贝叶斯网络的表示,并提供了一些贝叶斯网络的构建工具,还有一些使用示例。
# printStarterBayesNet() 给出了一些贝叶斯网络的构建例子 # This is the example V structured Bayes' net from the lecture # on Bayes' nets independence. # Constructing Bayes' nets: variables list variableList = ['Raining', 'Ballgame', 'Traffic'] # Constructing Bayes' nets, edge list: (x, y) means edge from x to y edgeTuplesList = [('Raining', 'Traffic'), ('Ballgame', 'Traffic')] # Construct the domain for each variable (a list like) variableDomainsDict = {} variableDomainsDict['Raining'] = ['yes', 'no'] variableDomainsDict['Ballgame'] = ['yes', 'no'] variableDomainsDict['Traffic'] = ['yes', 'no'] # None of the conditional probability tables are assigned yet in our Bayes' net bayesNet = constructEmptyBayesNet(variableList, edgeTuplesList, variableDomainsDict)
从上面的示例可知,我们要提供变量列表、边元组列表以及变量取值来构造贝叶斯网络。
所以,结合上面的贝叶斯网络结构图,构建变量列表、变量关系元组列表以及变量的取值列表,就可以很简便地构造出网络。
def constructBayesNet(gameState): """ Question 1: Bayes net structure Construct an empty Bayes net according to the structure given in the project description. There are 5 kinds of variables in this Bayes net: - a single "x position" variable (controlling the x pos of the houses) - a single "y position" variable (controlling the y pos of the houses) - a single "food house" variable (containing the house centers) - a single "ghost house" variable (containing the house centers) - a large number of "observation" variables for each cell Pacman can measure You *must* name all position and house variables using the constants (X_POS_VAR, FOOD_HOUSE_VAR, etc.) at the top of this file. The full set of observation variables can be obtained as follows: for housePos in gameState.getPossibleHouses(): for obsPos in gameState.getHouseWalls(housePos) obsVar = OBS_VAR_TEMPLATE % obsPos In this method, you should: - populate `obsVars` using the procedure above - populate `edges` with every edge in the Bayes Net (a tuple `(from, to)`) - set each `variableDomainsDict[var] = values`, where `values` is the set of possible assignments to `var`. These should again be set using the constants defined at the top of this file. """ obsVars = [] edges = [] variableDomainsDict = {} "*** YOUR CODE HERE ***" # 获取所有的Observation 并规定它的值域 for housePos in gameState.getPossibleHouses(): for obsPos in gameState.getHouseWalls(housePos): obsVar = OBS_VAR_TEMPLATE % obsPos obsVars.append(obsVar) variableDomainsDict[obsVar] = OBS_VALS # 变量表 variableList = [X_POS_VAR, Y_POS_VAR] + HOUSE_VARS + obsVars # 变量表中各变量的值域(可能取值) variableDomainsDict[X_POS_VAR] = X_POS_VALS variableDomainsDict[Y_POS_VAR] = Y_POS_VALS variableDomainsDict[FOOD_HOUSE_VAR] = HOUSE_VALS variableDomainsDict[GHOST_HOUSE_VAR] = HOUSE_VALS # 在节点间加边 结点间的条件关系可以参考官网上的指导 edges.append((X_POS_VAR, FOOD_HOUSE_VAR)) edges.append((Y_POS_VAR, FOOD_HOUSE_VAR)) edges.append((X_POS_VAR, GHOST_HOUSE_VAR)) edges.append((Y_POS_VAR, GHOST_HOUSE_VAR)) for obsVar in obsVars: edges.append((FOOD_HOUSE_VAR, obsVar)) edges.append((GHOST_HOUSE_VAR, obsVar)) # 通过变量表、变量间关系、遍历取值构建贝叶斯网络 ABayesNet = bn.constructEmptyBayesNet(variableList, edges, variableDomainsDict) return ABayesNet, obsVars "*** END YOUR CODE HERE ***"
Question 2要求我们填写Y变量的条件变量表。
仿照X变量的条件变量表,填写每一个条件变量的概率即可。
def fillYCPT(bayesNet, gameState): """ Question 2: Bayes net probabilities Fill the CPT that gives the prior probability over the y position variable. See the definition of `fillXCPT` above for an example of how to do this. You can use the PROB_* constants imported from layout rather than writing probabilities down by hand. BOTH_TOP_VAL, BOTH_BOTTOM_VAL, LEFT_TOP_VAL, LEFT_BOTTOM_VAL """ from layout import PROB_BOTH_TOP from layout import PROB_BOTH_BOTTOM from layout import PROB_ONLY_LEFT_TOP from layout import PROB_ONLY_LEFT_BOTTOM yFactor = bn.Factor([Y_POS_VAR], [], bayesNet.variableDomainsDict()) "*** YOUR CODE HERE ***" # 设置 Y_POS_VAR 的各个取值的概率 yFactor.setProbability({Y_POS_VAR: BOTH_TOP_VAL}, PROB_BOTH_TOP) yFactor.setProbability({Y_POS_VAR: BOTH_BOTTOM_VAL}, PROB_BOTH_BOTTOM) yFactor.setProbability({Y_POS_VAR: LEFT_TOP_VAL}, PROB_ONLY_LEFT_TOP) yFactor.setProbability({Y_POS_VAR: LEFT_BOTTOM_VAL}, PROB_ONLY_LEFT_BOTTOM) bayesNet.setCPT(Y_POS_VAR, yFactor) "*** END YOUR CODE HERE ***"
Question 3 要求我们实现自己的条件概率的乘积方法。
输入为一个条件概率列表,输出是列表中条件概率的乘积结果,乘积结果也是一个条件概率。
网站上给了一些该函数的使用例子:
观察这些例子可以发现,用贝叶斯网络可以很简单地解决条件概率的问题。
条件概率的乘积问题,可以转换为寻找融合后贝叶斯网络的根节点。
以 P(V, W | X, Y, Z) * P(X, Y | Z) 为例:
P(V, W | X, Y, Z) 可以表示为一个简单的贝叶斯网络:
其中条件变量X, Y, Z 为父节点;非条件变量 V, W为子节点。
P(X, Y | Z) 可表示为:
其中条件变量Z 为父节点;非条件变量X, Y 为子节点。
观察两个贝叶斯网络可以发现,能作为条件变量的,必须是贝叶斯网络中最上层的根节点。
同理,P(V, W | X, Y, Z) * P(X, Y | Z) 也可以表示成一个贝叶斯网络。
我们可以画出 P(V, W | X, Y, Z) * P(X, Y | Z) = P(V, W, X, Y | Z) 的网络表示:
可以发现,P(V, W | X, Y, Z) * P(X, Y | Z) 的贝叶斯网络就是 P(V, W | X, Y, Z) 和 P(X, Y | Z) 的网络进行融合得到的,
而结果的条件变量也是它的贝叶斯网络的最上层根节点。
因此,条件概率的乘积可以转换为寻找融合后的贝叶斯网络的最上层节点。
一种实现想法是,沿着融合后的关系链往上找,最上层的变量即为条件变量。
但是,框架给我们提供了一种更简便的实现方法:
在Factor表示的条件概率中,Conditional的和Unconditional的是分开存储的。
意思说我们可以知道每个条件概率贝叶斯树的根节点。
所以,可以推断:
1、在Unconditional中出现过的变量一定不为条件变量,
因为它已经有父节点了,融合后也不可能成为最上层节点。
2、在Conditional中出现的变量,如果它没有在Unconditional中出现过,则一定为条件变量。
因为它融合后不可能为子节点。
因此,实现步骤为:
1、找出乘积因子中所有的变量
2、将变量分类为条件变量和非条件变量
3、去除条件变量中在非条件变量中出现过的变量
这就寻找出了结果的条件变量和非条件变量。
4、为结果贝叶斯树中的每一个条件组合赋上概率值
5、构造结果Factor,返回。
def joinFactors(factors): """ Question 3: Your join implementation Input factors is a list of factors. You should calculate the set of unconditioned variables and conditioned variables for the join of those factors. Return a new factor that has those variables and whose probability entries are product of the corresponding rows of the input factors. You may assume that the variableDomainsDict for all the input factors are the same, since they come from the same BayesNet. joinFactors will only allow unconditionedVariables to appear in one input factor (so their join is well defined). Hint: Factor methods that take an assignmentDict as input (such as getProbability and setProbability) can handle assignmentDicts that assign more variables than are in that factor. Useful functions: Factor.getAllPossibleAssignmentDicts Factor.getProbability Factor.setProbability Factor.unconditionedVariables Factor.conditionedVariables Factor.variableDomainsDict """ # typecheck portion setsOfUnconditioned = [set(factor.unconditionedVariables()) for factor in factors] if len(factors) > 1: intersect = functools.reduce(lambda x, y: x & y, setsOfUnconditioned) if len(intersect) > 0: print("Factor failed joinFactors typecheck: ", factors) raise ValueError("unconditionedVariables can only appear in one factor. \n" + "unconditionedVariables: " + str(intersect) + "\nappear in more than one input factor.\n" + "Input factors: \n" + "\n".join(map(str, factors))) """iterate over assignments in newFactor.getAllPossibleAssignmentsDicts(): iterate over factors: Call factor.getProbability(assignment) [and do something with it so you can do the setProbability step in the next line] setProbability in the newFactor.""" "*** YOUR CODE HERE ***" #获取所有的Factor allFactors=[factor for factor in factors] conditional=set([]) unconditional=set([]) # 将变量分为conditional和unconditional的 for factor in allFactors: for f in factor.conditionedVariables(): conditional.add(f) for f in factor.unconditionedVariables(): unconditional.add(f) # 如果变量在unconditional里出现过,那么它做乘积后就在unconditional里 conditional=list(filter(lambda x: x not in unconditional, conditional)) variableDomainsDict=allFactors[0].variableDomainsDict() newCPT = Factor(unconditional, conditional, variableDomainsDict) for assignment in newCPT.getAllPossibleAssignmentDicts(): probability = 1 for factor in allFactors: probability = probability * factor.getProbability(assignment) newCPT.setProbability(assignment, probability) return newCPT "*** END YOUR CODE HERE ***"