Probabilistic Graphical Modeling: Study Notes

0. Learning materials

1. Introduction

A great number of the problems we face amount to modeling the real world with some kind of function (the most direct example is estimation and fitting; further, our deep learning and machine learning algorithms mostly fit the real problem with a function).

  • But most of the time, the measurement involves a significant amount of uncertainty (in other words, "error"). As a result, our measurements actually follow a probability distribution. This introduces Probability Theory.
  • The measurements also depend on each other, and sometimes we cannot find the exact expression of these relationships. But we know they exist, and we can know some properties of the relations from prior knowledge. This introduces Graph modeling.

As a result, Probabilistic Graphical Modeling was conceived to solve such kinds of questions.
There are three main elements in a PGM:

  • Representation: How do we specify a model? Normally, we use a Bayesian network for a directed acyclic graph, and a Markov Random Field for an undirected graph representation. (And of course, there are other models.)
  • Inference: How do we ask the model questions? For example, marginal inference gives the probability of a given variable when we sum over every other variable, and maximum a posteriori (MAP) inference gives the most likely assignment of the variables. (See the sketch after this list.)
  • Learning: How do we fit a model to real-world data? Inference and learning have a special link: inference is a key subroutine of learning.
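
For instance, here is a minimal Python sketch (all variable names and numbers hypothetical) of what these two queries mean on an explicit joint table:

```python
# A toy joint p(x1, x2) over two binary variables, stored as a dict.
import itertools

joint = {
    (0, 0): 0.30, (0, 1): 0.10,
    (1, 0): 0.25, (1, 1): 0.35,
}

# Marginal inference: p(x1) = sum over x2 of p(x1, x2)
p_x1 = {v: sum(p for (a, b), p in joint.items() if a == v) for v in (0, 1)}
print("p(x1):", p_x1)                    # {0: 0.4, 1: 0.6} (up to rounding)

# MAP inference: the single most probable joint assignment
map_assignment = max(joint, key=joint.get)
print("MAP assignment:", map_assignment)  # (1, 1)
```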

2. Representation

2.1 Bayesian network

It is a directed acyclic graph, and it can deal with variables that have causal (directed) relationships. By the chain rule, any joint distribution factorizes as:

$$p(x_1, x_2, x_3, \ldots, x_n) = p(x_1)\, p(x_2 \mid x_1) \cdots p(x_n \mid x_{n-1}, \ldots, x_2, x_1)$$

Based on these relationships, a directed graph can be built, and the probability expression can then be simplified: each variable depends only on a subset $A_i$ of its ancestors.

$$p(x_i \mid x_{i-1}, \ldots, x_2, x_1) = p(x_i \mid x_{A_i})$$

Formal definition:
A Bayesian network is a directed acyclic graph $G = (V, E)$ together with:

  • A random variable $x_i$ for each node $i \in V$.
  • One conditional probability distribution (CPD) $p(x_i \mid x_{A_i})$ per node, specifying the probability of $x_i$ conditioned on its parents' values (in other words, its incoming edges).

End of definition
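
A minimal Python sketch of this definition, using a hypothetical rain/sprinkler/wet-grass network (all names and probabilities invented for illustration):

```python
# Each node stores its parents A_i and one CPD p(x_i | x_{A_i}).
import itertools

parents = {"R": (), "S": ("R",), "W": ("R", "S")}

# CPD per node: maps (node value, tuple of parent values) -> probability.
cpd = {
    "R": {(1, ()): 0.2, (0, ()): 0.8},
    "S": {(1, (0,)): 0.4, (0, (0,)): 0.6,
          (1, (1,)): 0.01, (0, (1,)): 0.99},
    "W": {(1, (0, 0)): 0.0, (0, (0, 0)): 1.0,
          (1, (0, 1)): 0.9, (0, (0, 1)): 0.1,
          (1, (1, 0)): 0.8, (0, (1, 0)): 0.2,
          (1, (1, 1)): 0.99, (0, (1, 1)): 0.01},
}

def joint(assign):
    """p(x1..xn) = product over nodes of p(x_i | x_{A_i})."""
    p = 1.0
    for node, pa in parents.items():
        pa_vals = tuple(assign[q] for q in pa)
        p *= cpd[node][(assign[node], pa_vals)]
    return p

# Because G is acyclic and each CPD is normalized, the joint sums to one.
total = sum(joint(dict(zip("RSW", v)))
            for v in itertools.product((0, 1), repeat=3))
print(total)  # 1.0 (up to floating-point rounding)
```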

  • When $G$ contains cycles, its associated probability may not sum to one.
  • Certain variables can be conditionally independent. (This helps build more efficient inference.) The three basic structures are below; see the numeric sketch after this list.
  • Common parent. $A \leftarrow B \to C$: if $B$ is observed, then $A \perp C \mid B$ (that is, $p(A, C \mid B) = p(A \mid B)\, p(C \mid B)$); if $B$ is unobserved, then in general $p(A, C) \ne p(A)\, p(C)$.
  • Cascade. $A \to B \to C$: if $B$ is observed, then $A \perp C \mid B$ ($p(A, C \mid B) = p(A \mid B)\, p(C \mid B)$); if $B$ is unobserved, then in general $p(A, C) \ne p(A)\, p(C)$.
  • V-structure. $A \to B \leftarrow C$: if $B$ is unobserved, then $A \perp C$ ($p(A, C) = p(A)\, p(C)$); if $B$ is observed, then in general $p(A, C \mid B) \ne p(A \mid B)\, p(C \mid B)$.
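
A minimal numeric check of the V-structure case (hypothetical CPDs): marginally $p(A, C) = p(A)\,p(C)$, but conditioning on $B$ couples them ("explaining away"):

```python
# V-structure A -> B <- C with made-up numbers.
pA = {0: 0.5, 1: 0.5}
pC = {0: 0.7, 1: 0.3}
# p(B=1 | A, C): B tends to be "on" if either parent is on.
pB1 = {(0, 0): 0.1, (0, 1): 0.8, (1, 0): 0.8, (1, 1): 0.95}

def joint(a, b, c):
    pb = pB1[(a, c)] if b == 1 else 1.0 - pB1[(a, c)]
    return pA[a] * pC[c] * pb

# Marginally: p(A=1, C=1) equals p(A=1) p(C=1) by construction.
pAC = sum(joint(1, b, 1) for b in (0, 1))
print(pAC, pA[1] * pC[1])            # both 0.15

# Conditioned on B=1: p(A=1, C=1 | B=1) != p(A=1 | B=1) p(C=1 | B=1)
pB = sum(joint(a, 1, c) for a in (0, 1) for c in (0, 1))
pAC_B = joint(1, 1, 1) / pB
pA_B = sum(joint(1, 1, c) for c in (0, 1)) / pB
pC_B = sum(joint(a, 1, 1) for a in (0, 1)) / pB
print(pAC_B, pA_B * pC_B)            # unequal: A and C now interact
```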

2.2 Markov Random Fields

There are cases that a Bayesian network cannot describe but a Markov Random Field (an undirected graph) can. This representation is used more in the computer vision area.

For example, friendship should not be expressed by a directed edge, and it is also hard to express with a conditional probability. It is natural to introduce the undirected edges of a Markov Random Field.

  • An edge becomes more like an interaction that pushes the variables; it is like a force, a potential energy.
  • It requires less prior knowledge about the variables, since we do not know their exact dependence relationship. All we need to know is that an interaction exists.
  • As the factors are more like potential energies than probabilities, we must not forget to add a normalization term to the expression.

Formal definition:
A Markov Random Field (MRF) is a probability distribution $p$ over variables $x_1, x_2, \ldots, x_n$ defined by an undirected graph $G$ in which nodes correspond to variables $x_i$. The probability $p$ has the form:
$$p(x_1, x_2, \ldots, x_n) = \frac{1}{Z} \prod_{c \in C} \phi_c(x_c)$$
where $C$ denotes the set of cliques (fully connected subgraphs) of $G$, and each factor $\phi_c$ is a nonnegative function over the variables in the clique. The partition function
$$Z = \sum_{x_1, x_2, \ldots, x_n} \prod_{c \in C} \phi_c(x_c)$$
is a normalizing constant that ensures that the distribution sums to one.
End of definition
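
A minimal sketch of this definition, using a hypothetical three-node chain with one pairwise potential per edge and $Z$ computed by brute-force enumeration:

```python
# Chain x1 - x2 - x3; the cliques are the two edges.
import itertools

def phi(a, b):
    # Nonnegative "agreement" potential -- a score, not a probability.
    return 2.0 if a == b else 1.0

edges = [(0, 1), (1, 2)]

def unnormalized(x):
    p = 1.0
    for i, j in edges:
        p *= phi(x[i], x[j])
    return p

# Partition function: sum of the unnormalized score over all states.
Z = sum(unnormalized(x) for x in itertools.product((0, 1), repeat=3))

def p(x):
    return unnormalized(x) / Z

print(Z)                                                       # 18.0
print(sum(p(x) for x in itertools.product((0, 1), repeat=3)))  # 1.0
```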

  • An MRF can express more non-directional associations.
  • But it takes more computation than a Bayesian network, especially for the normalization term $Z$.
  • Bayesian networks are computationally easier.

A Bayesian network can always be converted into an undirected network with normalization constant one. The converse is also possible, but may be computationally intractable and may produce a very large (e.g. fully connected) directed graph.
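
A minimal sketch of the forward direction (hypothetical two-node chain $A \to B$): using each CPD itself as a clique potential yields an undirected model whose partition function is already one, so no renormalization is needed:

```python
# CPDs of the directed model A -> B (made-up numbers).
import itertools

pA = {0: 0.6, 1: 0.4}
pB_given_A = {(0, 0): 0.9, (1, 0): 0.1, (0, 1): 0.3, (1, 1): 0.7}

# Potentials: phi_A(a) = p(a), phi_AB(a, b) = p(b | a).
def score(a, b):
    return pA[a] * pB_given_A[(b, a)]

Z = sum(score(a, b) for a, b in itertools.product((0, 1), repeat=2))
print(Z)  # 1.0: the CPD-based potentials already sum to one
```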

2.3 Conditional Random Fields

It is a special case of a Markov Random Field, applied to model a conditional probability distribution: a conditional random field instantiates a new Markov Random Field for each input $x$.

2.4 Factor Graph

A factor graph is a bipartite graph where one group is the variables in the distribution being modeled, and the other group is the factors defined on these variables. Edges go between factors and variables that those factors depend on.
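
A minimal sketch (hypothetical factors $f_1(x_1, x_2)$ and $f_2(x_2, x_3)$) of a factor graph stored as a bipartite edge set:

```python
# Each factor lists the variables it depends on (its scope).
factor_scope = {
    "f1": ("x1", "x2"),
    "f2": ("x2", "x3"),
}

# The bipartite edge set: (factor, variable) pairs.
edges = [(f, v) for f, scope in factor_scope.items() for v in scope]

# Neighbors of a variable = the factors that mention it.
def var_neighbors(v):
    return [f for f, scope in factor_scope.items() if v in scope]

print(edges)               # [('f1','x1'), ('f1','x2'), ('f2','x2'), ('f2','x3')]
print(var_neighbors("x2"))  # ['f1', 'f2'] - x2 sits between both factors
```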
