[paper reading][Proceedings of the IEEE 2016] Taking the Human Out of the Loop: A Review of Bayesian

2021-11-05 12:31:06

1 Introduction
2 Bayesian Optimization with Parametric Models
3 Nonparametric models

Proceedings of the IEEE 2016
https://ieeexplore.ieee.org/abstract/document/7352306
A review of Bayesian optimization (BO), an optimization method typically applied to tuning hyperparameters.

1 Introduction

design, choice, high-dim, hyperparam
- IBM ILOG CPLEX
\(x^* = \arg\max_{x\in \mathcal X}f(x)\)
- compact subset of \(\mathbb R^d\), or ...
- stochastic output \(\mathbb E[y|f(x)]=f(x)\)
- unbiased noisy point-wise observations
data efficient, evaluations are costly
prior, refine
best choice? acquisition function \(\alpha_n: \mathcal X\to \mathbb R\)
- mean, confidence interval
myopic heuristics
- uncertainty is large (exploration), or prediction is high (exploitation)
- acquisition function: easy to find the optimum, analytic?
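
As a rough illustration of the exploration/exploitation trade-off noted above, an upper-confidence-bound style acquisition adds a multiple of the predictive standard deviation to the predictive mean. The function name `ucb`, the weight `kappa`, and the toy numbers are assumptions made for this sketch; they are not taken from the paper.

```python
import numpy as np

def ucb(mean, std, kappa=2.0):
    """Score is high where the prediction is high (exploitation)
    or where the uncertainty is large (exploration)."""
    return np.asarray(mean, dtype=float) + kappa * np.asarray(std, dtype=float)

# Made-up posterior summaries at three candidate points.
mean = np.array([0.5, 0.9, 0.2])
std = np.array([0.05, 0.10, 0.60])

scores = ucb(mean, std)               # kappa = 2 rewards uncertainty
next_index = int(np.argmax(scores))   # candidate to evaluate next
greedy_index = int(np.argmax(ucb(mean, std, kappa=0.0)))  # pure exploitation
```

With these numbers the uncertain third candidate wins under `kappa=2.0`, while `kappa=0.0` falls back to the candidate with the highest mean.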

2 Bayesian Optimization with Parametric Models

parametrized by \(w\)
\(\mathcal D\): data
Bayesian: \(p(w|\mathcal D)=p(\mathcal D|w)p(w)/p(\mathcal D)\)
- beliefs about \(w\) after observing data \(\mathcal D\)
- \(p(\mathcal D)\) is often intractable, but it is only a normalizing constant
prior: conjugacy, so the posterior can be computed analytically
\(K\) drugs, independent
- to optimize \(f\), on \(K\) indices, fully parametrized
- beta, conjugacy
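Thompson sampling (TS): simplest strategy, picks according to the posterior prob of optimality, estimated by MC
- \(a_{n+1}=\arg\max_a f_{\bar w}(a)\), with \(\bar w\) sampled from the posterior over \(w\)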
- no more param other than the prior
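
The \(K\)-drug setting with Beta priors and Thompson sampling can be sketched as follows. This is a minimal illustration, assuming Bernoulli (success/failure) outcomes and a uniform Beta(1, 1) prior per drug; the function names `thompson_step` and `run`, and the success probabilities used in the example, are hypothetical.

```python
import numpy as np

def thompson_step(successes, failures, rng, alpha0=1.0, beta0=1.0):
    """Draw one success probability per arm from its Beta posterior
    and return the index of the arm whose draw is largest."""
    w_sample = rng.beta(alpha0 + successes, beta0 + failures)
    return int(np.argmax(w_sample))

def run(true_probs, n_rounds=2000, seed=0):
    """Simulate Thompson sampling on independent Bernoulli arms."""
    rng = np.random.default_rng(seed)
    k = len(true_probs)
    successes = np.zeros(k)
    failures = np.zeros(k)
    for _ in range(n_rounds):
        a = thompson_step(successes, failures, rng)
        reward = int(rng.random() < true_probs[a])  # 1 = success, 0 = failure
        successes[a] += reward        # conjugate update: just count outcomes
        failures[a] += 1 - reward
    return successes, failures

successes, failures = run([0.2, 0.5, 0.8])
pulls = successes + failures          # how often each arm was chosen
```

Because the Beta prior is conjugate to the Bernoulli likelihood, the posterior update reduces to incrementing two counters, and the only tunable quantity is the prior itself.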
linear model, feature, vector, \(f_w(a)=x_a^T w\)
\(X\): input vectors, \(y\): outputs
nonlinear basis functions
- radial
- Fourier
- learned from data
- feature map, regardless, weights can be computed analytically
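
The remark that the weights can be computed analytically for any fixed feature map can be illustrated with a small sketch. It assumes a known noise variance and a zero-mean Gaussian prior on \(w\), which is a simplification chosen for this example; the names `rbf_features` and `weight_posterior` and the toy data are hypothetical.

```python
import numpy as np

def rbf_features(x, centers, lengthscale=1.0):
    """Radial basis features exp(-(x - c_j)^2 / (2 l^2)) for 1-D inputs."""
    x = np.asarray(x, dtype=float).reshape(-1, 1)
    centers = np.asarray(centers, dtype=float).reshape(1, -1)
    return np.exp(-0.5 * ((x - centers) / lengthscale) ** 2)

def weight_posterior(Phi, y, noise_var, prior_cov):
    """Gaussian posterior over w for y = Phi w + noise, prior w ~ N(0, prior_cov)."""
    precision = np.linalg.inv(prior_cov) + Phi.T @ Phi / noise_var
    cov = np.linalg.inv(precision)
    mean = cov @ Phi.T @ np.asarray(y, dtype=float) / noise_var
    return mean, cov

# Toy 1-D data and three radial basis centers (made-up values).
x = np.array([0.0, 0.5, 1.0, 1.5, 2.0])
y = np.sin(x)
Phi = rbf_features(x, centers=[0.0, 1.0, 2.0])
w_mean, w_cov = weight_posterior(Phi, y, noise_var=0.01, prior_cov=np.eye(3))
prediction = Phi @ w_mean             # posterior-mean fit at the training inputs
```

Swapping `rbf_features` for Fourier or learned features changes only `Phi`; the closed-form update in `weight_posterior` stays the same.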

3 Nonparametric models

start: observation variance \(\sigma^2\), zero-mean Gaussian prior with covariance \(V_0\), which preserves Gaussianity
basis functions, linear regression, symmetric positive-semidefinite, kernel
- intuitive similarity between pairs of points, rather than a feature map \(\Phi\)
- tractable, linear algebra, unnecessary to explicitly define \(\Phi\)
GP, nonparametric model, prior mean, covariance
\(f|X\sim \mathcal N(m,K)\)
\(y|f, \sigma^2\sim \mathcal N(f,\sigma^2 I)\)
posterior: use \(x\) and previous data (not "abstracted by parameters")
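
Given the prior \(f|X\sim \mathcal N(m,K)\) and the Gaussian likelihood above, the posterior at new inputs follows from standard Gaussian conditioning. The sketch below assumes a constant prior mean and a squared-exponential kernel, neither of which is prescribed by these notes; `sq_exp_kernel` and `gp_posterior` are hypothetical names.

```python
import numpy as np

def sq_exp_kernel(a, b, lengthscale=1.0, amplitude=1.0):
    """Squared-exponential kernel matrix between 1-D input arrays a and b."""
    a = np.asarray(a, dtype=float).reshape(-1, 1)
    b = np.asarray(b, dtype=float).reshape(1, -1)
    return amplitude**2 * np.exp(-0.5 * ((a - b) / lengthscale) ** 2)

def gp_posterior(x_train, y_train, x_test, noise_var=1e-2, prior_mean=0.0):
    """Posterior mean and covariance of f at x_test, conditioned on noisy data."""
    y_train = np.asarray(y_train, dtype=float)
    K = sq_exp_kernel(x_train, x_train) + noise_var * np.eye(len(y_train))
    K_s = sq_exp_kernel(x_train, x_test)      # train/test cross-covariance
    K_ss = sq_exp_kernel(x_test, x_test)
    L = np.linalg.cholesky(K)                 # K = L L^T, avoids an explicit inverse
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train - prior_mean))
    mean = prior_mean + K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    cov = K_ss - v.T @ v
    return mean, cov

mean, cov = gp_posterior([0.0, 1.0, 2.0], [0.0, 0.8, 0.9], [0.5, 1.5])
std = np.sqrt(np.diag(cov))                   # pointwise predictive uncertainty
```

Note that prediction uses the training inputs and outputs directly, which is the sense in which the model is nonparametric: the data are not summarized by a fixed-size weight vector.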
kernel, structure, periodic, stationary
- Matérn, diagonal, parametrized
kernel, smoothness and amplitude
prior, possible offset, constant, expert knowledge
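
To make the roles of the length scales (smoothness) and the amplitude concrete, here is a sketch of a Matérn 5/2 kernel with one length scale per input dimension. The choice of the 5/2 member of the family, the per-dimension scaling convention, and the name `matern52` are assumptions of this example.

```python
import numpy as np

def matern52(A, B, lengthscales, amplitude=1.0):
    """Matern 5/2 kernel matrix between the rows of A and the rows of B.

    Each input dimension is divided by its own length scale, which is
    equivalent to a diagonal scaling matrix inside the distance r.
    """
    lengthscales = np.asarray(lengthscales, dtype=float)
    A = np.atleast_2d(np.asarray(A, dtype=float)) / lengthscales
    B = np.atleast_2d(np.asarray(B, dtype=float)) / lengthscales
    sq_dist = np.sum((A[:, None, :] - B[None, :, :]) ** 2, axis=-1)
    s = np.sqrt(5.0) * np.sqrt(sq_dist)
    return amplitude**2 * (1.0 + s + s**2 / 3.0) * np.exp(-s)

# Three 2-D points; the second dimension varies more slowly (longer length scale).
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
K = matern52(X, X, lengthscales=[1.0, 5.0], amplitude=2.0)
```

Here `K[0, 2]` is larger than `K[0, 1]` because a unit step along the second dimension counts as a shorter scaled distance, and the diagonal equals `amplitude**2`.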

Source: https://www.cnblogs.com/minor-second/p/15512655.html
