Quiz 5¶

For Penn State student, access quiz here

import ipywidgets as widgets

Question 1¶

Consider a DNN layer $f^\ell = W^\ell \sigma (f^{\ell-1}) + b^\ell$ , where $W^\ell \in \mathbb{R}^{n_\ell \times n_{\ell-1}}$ with $n_\ell = n_{\ell-1} = m$. If we apply the Xavier’s initialization for this layer, what is the suggested variance to sample $W_{st}^\ell$ ?

Question 2¶

When training a CNN model with batch normalization (BN) structure, let us consider the time step $t$ with mini-batch $\mathcal B_t$ for the $j$-th channel of $\ell$-th layer (spatial dimension (resolution) for this layer is $n_\ell \times m_\ell $). Then, what is the size for the commonly used mean $[\mu^\ell_{\mathcal B_t}]_j$ and variance $[\sigma^\ell_{\mathcal B_t}]_j$ in BN for CNN models on this layer?

Question 3¶

If we define a convolutional layer with batch normalization as follows

class model(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 10, 5)
        self.bn1 =  nn.BatchNorm2d(N)
    def forward(self,x):
        out = F.relu(self.bn1(self.conv1(x)))

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-2-17b224f18c5e> in <module>
----> 1 class model(nn.Module):
      2     def __init__(self):
      3         super().__init__()
      4         self.conv1 = nn.Conv2d(3, 10, 5)
      5         self.bn1 =  nn.BatchNorm2d(N)

NameError: name 'nn' is not defined

What is the value of N in nn.BatchNorm2d(N)?

Question 4¶

How many kernels/filters are there in the initialization layer self.conv1 of ResNet18?

self.conv1 = nn.Conv2d(3, 64, kernel_size=3, st
ride=1, padding=1, bias=False)

Question 5¶

What is the equivalent code of the following code?

Conv_BN = nn.Sequential(nn.Conv2d(1,3,3),nn.BatchNorm2d(3))
 
x = torch.randn(1, 1, 28, 28)

out = Conv_BN(x)

Question 6¶

In the following code, what is the size of out if the size of x is torch.Size([3, 3, 3, 3])

out = x.view(x.size(0), -1)

Question 7¶

When we define ResNet18 as follows

my_model = ResNet(BasicBlock, [2,2,2,2], num_classes=10)

what does [2,2,2,2] mean?

Question 8¶

Here, let $\sigma(x) = e^x, \quad x \in \mathbb{R}.$ Consider the following 1-hidden layer DNN function with \sigma$ activation function for any $x\in \mathbb{R}^2$

$ f(x;\theta) = W^2 \sigma (W^1 x+ b^1) \in \mathbb{R}, $

where

$\theta = \{ W^1, b^1, W^2\}$ and $W^1 \in \mathbb{R}^{2\times 2}, \quad W^2 \in \mathbb{R}^{1\times 2}, \quad b^1 \in \mathbb{R}^2.$

Calculate $\left. \frac{\partial f(x; \theta)}{\partial W^1_{st}} \right|_{\theta = \theta^*, x = x^*} \quad \text{and} \quad \left. \frac{\partial f(x; \theta)}{\partial x_i} \right|_{\theta = \theta^*, x = x^*},$

for $i = 1,2$ and $s,t = 1,2$, where $\theta = \theta^*, x = x^*$ means

\[\begin{split} W^1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, W^2 = \begin{pmatrix} 1 & 1 \end{pmatrix}, b^1 = \begin{pmatrix} 0 \\ 0 \end{pmatrix} \end{split}\]

and

\[\begin{split} x = \begin{pmatrix} 1 \\0 \end{pmatrix} \end{split}\]

Question 9¶

Consider the convolution for one channel with stride one and zero padding $A\ast: R^{n}\mapsto R^{n}$. $ A\ast u=f, $ where $ A=\frac{1}{h}\begin{pmatrix} -1, &2,&-1 \end{pmatrix}. $

Consider following two iterative methods for the above equation. Given $u^{0}$, for $\ell=1,2,\cdots,2m$

$u^{\ell}=u^{\ell-1}+\frac{h}{4}(f-A\ast u^{\ell-1})$

And Given $\tilde{u}^{0}=u^{0}$, for $\ell=1,2,\cdots,m$

$\tilde{u}^{\ell}=\tilde{u}^{\ell-1}+S_1\ast(f-A\ast\tilde{u}^{\ell-1})$

Determine $S_1$ in the second iterative method such that $u^{2m}=\tilde{u}^{m} \quad\hbox{when}\quad m=1,$, namely $u^{2}=\tilde{u}^{1}$

Question 10¶

Consider the convolution for one channel with stride one and zero padding. Given $f\in \mathbb R^n$, let $u$ be the solution of the following linear system $A\ast u=f$,where $A=(-1,2,-1)$

(a) Show that the solution $u$ satisfies the minimization problem

(b) Write out the gradient descent method to solve the above minimization problem

Deep Learning Web Course