Andrew Ng's Coursera Machine Learning Specialization, Advanced Learning Algorithms: Week 2 Quizzes

Practice quiz: Neural Network Training

Question 1: Here is some code that you saw in the lecture:

model.compile(loss=BinaryCrossentropy())

For which type of task would you use the binary cross entropy loss function?

A classification task that has 3 or more classes (categories)
【Correct】binary classification (classification with exactly 2 classes)
regression tasks (tasks that predict a number)
BinaryCrossentropy() should not be used for any task.
【Explanation】Yes! Binary cross entropy, which we have also referred to as logistic loss, is used for classifying between two classes (two categories).
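
For reference, here is a minimal NumPy sketch (not course code) of the logistic loss that BinaryCrossentropy computes for a single example, where y is the 0/1 label and f is the model's predicted probability that y = 1:

import numpy as np

def binary_cross_entropy(y, f):
    # Logistic loss: -log(f) when y = 1, -log(1 - f) when y = 0
    return -y * np.log(f) - (1 - y) * np.log(1 - f)

print(binary_cross_entropy(1, 0.9))  # small loss: confident, correct prediction
print(binary_cross_entropy(1, 0.1))  # large loss: confident, wrong prediction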

Question 2: Here is code that you saw in the lecture:

from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.losses import BinaryCrossentropy

# X, y are the training examples and their labels
model = Sequential([
    Dense(units=25, activation='sigmoid'),
    Dense(units=15, activation='sigmoid'),
    Dense(units=1, activation='sigmoid')
])
model.compile(loss=BinaryCrossentropy())
model.fit(X,y,epochs=100)

Which line of code updates the network parameters in order to reduce the cost?

【Correct】model.fit(X,y,epochs=100)
None of the above -- this code does not update the network parameters.
model = Sequential([...])
model.compile(loss=BinaryCrossentropy())
【Explanation】Yes! The third step of model training is to train the model on the data in order to minimize the loss (and the cost).
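
As a rough sketch of what model.fit does internally (illustrative only; a hand-written TensorFlow training loop over the same X, y, and model as above):

import tensorflow as tf

loss_fn = tf.keras.losses.BinaryCrossentropy()
optimizer = tf.keras.optimizers.Adam()

for epoch in range(100):
    with tf.GradientTape() as tape:
        predictions = model(X)          # forward pass
        loss = loss_fn(y, predictions)  # measure the loss
    grads = tape.gradient(loss, model.trainable_variables)
    # This is the step that actually updates the network parameters:
    optimizer.apply_gradients(zip(grads, model.trainable_variables))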

Practice quiz: Activation Functions

Question 1: Which of the following activation functions is the most common choice for the hidden layers of a neural network?

【Correct】ReLU (rectified linear unit)
Sigmoid
Most hidden layers do not use any activation function
Linear
【Explanation】Yes! ReLU is the most common choice because it is faster to train than the sigmoid: ReLU is flat only on one side (the left), whereas the sigmoid goes flat (horizontal, slope approaching zero) on both sides of the curve, which slows gradient descent.
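
For reference, a minimal NumPy sketch (not course code) of the two functions; the flat regions are where gradient descent slows down:

import numpy as np

def relu(z):
    # Flat (zero gradient) only for z < 0; slope 1 for z > 0
    return np.maximum(0, z)

def sigmoid(z):
    # Slope approaches zero on both sides, as z -> -inf and z -> +inf
    return 1 / (1 + np.exp(-z))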

Question 2: For the task of predicting housing prices, which activation functions could you choose for the output layer? Choose the 2 options that apply. (A short code sketch follows the options.)

【Correct】ReLU
【Explanation】Yes! ReLU outputs values that are 0 or greater, and housing prices are positive values.
【Correct】linear
【Explanation】Yes! A linear activation function can be used for a regression task where the output can be both negative and positive, but it is also possible to use it for a task where the output is 0 or greater (as with house prices).
Sigmoid
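
As an illustration (a sketch, not course code), a housing-price model could use either choice in its output layer:

from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense

# A ReLU output guarantees non-negative predictions (sensible for prices);
# a 'linear' output would also work, and additionally allows negative values.
model = Sequential([
    Dense(units=25, activation='relu'),
    Dense(units=1, activation='relu')
])
model.compile(loss='mse')  # mean squared error, the usual regression loss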

Question 3: True/False? A neural network with many layers but no activation function (in the hidden layers) is not effective; that's why we should instead use the linear activation function in every hidden layer.

True
【Correct】False
【Explanation】Yes! A neural network with many layers but no activation function is indeed not effective; however, a linear activation is the same as having no activation function at all, so using it does not fix the problem.
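
A quick NumPy check (with arbitrary, illustrative weights) of why layers with linear activations collapse into a single linear layer:

import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(3, 4)), rng.normal(size=3)
W2, b2 = rng.normal(size=(1, 3)), rng.normal(size=1)
x = rng.normal(size=4)

# Two layers with linear (i.e., no) activation...
two_layers = W2 @ (W1 @ x + b1) + b2
# ...equal one linear layer with W = W2 @ W1 and b = W2 @ b1 + b2:
one_layer = (W2 @ W1) @ x + (W2 @ b1 + b2)
print(np.allclose(two_layers, one_layer))  # True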

Practice quiz: Multiclass Classification

Question 1: For a multiclass classification task that has 4 possible outputs, the sum of all the activations adds up to 1. For a multiclass classification task that has 3 possible outputs, the sum of all the activations should add up to …. (A short demonstration follows the options.)

It will vary, depending on the input x.
【Correct】1
More than 1
Less than 1
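
A small NumPy demonstration (a sketch, not course code) that softmax activations sum to 1 regardless of the number of outputs:

import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))  # subtract the max for numerical stability
    return e / e.sum()

a = softmax(np.array([2.0, 1.0, 0.5]))  # 3 possible outputs
print(a.sum())  # 1.0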

Question 3: For multiclass classification, the cross entropy loss is used for training the model. If there are 4 possible classes for the output, and for a particular training example the true class of the example is class 3 (y=3), then what does the cross entropy loss simplify to? [Hint: this loss should get smaller when a₃ gets larger.]

【Correct】−log(a₃): the cross entropy loss keeps only the term for the true class, so it gets smaller as a₃ approaches 1.

Question 4: For multiclass classification, which activation should the output layer use when the softmax is computed inside the loss function (the recommended, more numerically stable implementation)?

【Correct】a 'linear' activation
a 'softmax' activation
【Explanation】Yes! Set the output to linear, because the loss function then handles the calculation of the softmax with a more numerically stable method.
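
A minimal sketch of this setup in TensorFlow (the layer sizes are illustrative, assuming a 4-class problem):

import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense

model = Sequential([
    Dense(units=25, activation='relu'),
    Dense(units=4, activation='linear')  # output raw logits, not probabilities
])
# from_logits=True tells the loss to compute the softmax internally,
# which avoids round-off error from an explicit softmax output layer.
model.compile(loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True))

# At prediction time, apply softmax to the logits to recover probabilities:
# probs = tf.nn.softmax(model(X_new))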

Practice quiz: Additional Neural Network Concepts

Question 1: Which of the following statements about setting the optimizer for training a neural network is correct?

The call to model.compile() will automatically pick the best optimizer, whether it is gradient descent, Adam, or something else, so there is no need to pick an optimizer manually.
【Correct】When calling model.compile, set optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3).
The call to model.compile() uses the Adam optimizer by default.
The Adam optimizer works only with Softmax outputs, so if a neural network has a Softmax output layer, TensorFlow will automatically pick the Adam optimizer.
【Explanation】Correct. Set the optimizer to Adam explicitly.
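
In code, this looks like the following (the loss shown here is illustrative; use whichever loss fits the task):

import tensorflow as tf

# model is the Sequential model defined earlier
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
    loss=tf.keras.losses.BinaryCrossentropy()
)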

Question 2: The lecture covered a different layer type, where each single neuron of the layer does not look at all the values of the input vector that is fed into that layer. What is the name of the layer type discussed in lecture?

1D layer or 2D layer (depending on the input dimension)
【Correct】convolutional layer
A fully connected layer
Image layer
【Explanation】Correct. In a convolutional layer, each neuron takes as input only a subset of the vector that is fed into that layer.
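
For example (a sketch, not course code), a 1D convolutional layer in Keras slides a small window across its input, so each unit sees only kernel_size values at a time:

from tensorflow.keras.layers import Conv1D

# Each of the 8 filters looks at a window of 3 input values at a time,
# rather than the whole input vector as a Dense layer would.
conv = Conv1D(filters=8, kernel_size=3, activation='relu')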

Source: https://www.cnblogs.com/chuqianyu/p/16439081.html
