
Machine Learning I Exam WS 14/15

Take the Machine Learning I exam WS 14/15 and compare your solution. From the course Machine Learning at Technische Universität Berlin (TU Berlin).

Section 1: Multiple Choice (15 P)

There is only one correct answer. Incorrect answers receive 0 points, just like no answer.


a
3 P

The Bayes error for classification is:

Select one answer

b
3 P

Independent Component Analysis can be achieved by:

Select one answer

c
3 P

The K-means algorithm:

Select one answer

d
3 P

A biased estimator is sometimes used to:

Select one answer

e
3 P

Which is False? The Restricted Boltzmann machine is:

Select one or multiple answers

Section 2: Models and Datasets (15 P)

a
7 P

Sketch a two-dimensional two-class supervised dataset for which Fisher's linear discriminant and the linear hard-margin SVM would learn separating boundaries with different directions. Your example must be stereotypical of the two classification techniques.

Your answer:
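For comparison with your drawing, a minimal Python sketch (using scikit-learn; the cluster parameters are arbitrary illustrative choices, not part of the exam) of one stereotypical construction: elongated anisotropic clusters whose within-class scatter tilts Fisher's direction away from the maximum-margin direction.

import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Two elongated, well-separated Gaussian clusters: the anisotropic
# within-class scatter tilts Fisher's direction, while the hard-margin
# SVM direction depends only on the few points closest to the gap.
cov = np.array([[4.0, 0.0], [0.0, 0.1]])
X = np.vstack([rng.multivariate_normal([0.0, 0.0], cov, 50),
               rng.multivariate_normal([3.0, 3.0], cov, 50)])
y = np.array([0] * 50 + [1] * 50)

fld = LinearDiscriminantAnalysis().fit(X, y)
svm = SVC(kernel="linear", C=1e6).fit(X, y)  # large C ~ hard margin

print("Fisher direction:", fld.coef_[0] / np.linalg.norm(fld.coef_[0]))
print("SVM direction:   ", svm.coef_[0] / np.linalg.norm(svm.coef_[0]))

The stereotypical difference: Fisher's direction is $S_W^{-1}(m_2 - m_1)$ and therefore reacts to the class covariances, whereas the hard-margin SVM direction is determined solely by the support vectors at the class boundary.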

b
7 P

Sketch a two-dimensional unsupervised dataset for which K-means (K=3) may get stuck at a local optimum. In your drawing, show (using square markers) the positions of the centroids for the local minimum and (using circles) the positions for the global minimum. Your example must be stereotypical in order to show the difference between the local and global minimum.

Your answer:
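Again for comparison, a minimal Python sketch (scikit-learn; the cluster layout and initialization are illustrative assumptions, not the exam's official example): three well-separated clusters with an adversarial initialization that places two centroids in the same cluster.

import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
# Three tight, well-separated clusters on a line.
centers = np.array([[0.0, 0.0], [5.0, 0.0], [10.0, 0.0]])
X = np.vstack([c + 0.3 * rng.standard_normal((30, 2)) for c in centers])

# Adversarial initialization: two centroids start inside the left
# cluster and one between the other two, so K-means converges to a
# local optimum (the "square markers" configuration).
bad_init = np.array([[-0.5, 0.0], [0.5, 0.0], [7.5, 0.0]])
local = KMeans(n_clusters=3, init=bad_init, n_init=1).fit(X)

# Many random restarts reliably find the global optimum here
# (the "circle markers" configuration: one centroid per cluster).
best = KMeans(n_clusters=3, n_init=50, random_state=0).fit(X)

print("local-optimum inertia :", local.inertia_)
print("global-optimum inertia:", best.inertia_)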

Section 3: Kernels and Feature Maps (25 P)

In order for a kernel to be positive semi-definite (PSD), it must satisfy $\sum_{i=1}^{n} \sum_{j=1}^{n} c_i c_j k(x_i, x_j) \ge 0$ for all sequences of data points and coefficients.
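Equivalently, every Gram matrix $K_{ij} = k(x_i, x_j)$ must be positive semi-definite, since $\sum_{i,j} c_i c_j k(x_i, x_j) = c^T K c$. A quick numerical illustration (a sketch, not part of the exam; the RBF kernel and random points are arbitrary choices):

import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((20, 3))          # 20 arbitrary points in R^3

# Gram matrix of the RBF kernel k(x, x') = exp(-||x - x'||^2)
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-sq_dists)

# PSD <=> all eigenvalues of the symmetric Gram matrix are >= 0,
# which means c^T K c >= 0 for every coefficient vector c.
print(np.linalg.eigvalsh(K).min() >= -1e-10)   # True (up to round-off)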


a
10 P

Show that if $k_1$ and $k_2$ are PSD kernels, then $k_3 = \alpha k_1 + \beta k_2$ with $\alpha, \beta \ge 0$ is also a PSD kernel.

Your answer:
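A sketch of the standard argument, for comparison with your own solution: for any points $x_1, \dots, x_n$ and coefficients $c_1, \dots, c_n$,

\sum_{i,j} c_i c_j k_3(x_i, x_j) = \alpha \sum_{i,j} c_i c_j k_1(x_i, x_j) + \beta \sum_{i,j} c_i c_j k_2(x_i, x_j) \ge 0,

since $\alpha, \beta \ge 0$ and both sums are nonnegative because $k_1$ and $k_2$ are PSD.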

b
5 P

Give an example showing that positive semi-definiteness is not guaranteed if $\alpha < 0$ or $\beta < 0$.

Your answer:
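One possible counterexample, for comparison: take the linear kernel $k_1(x, x') = x x'$ on $\mathbb{R}$, with $\alpha = -1$ and $\beta = 0$. For $n = 1$, $c_1 = 1$ and $x_1 = 1$,

\sum_{i,j} c_i c_j k_3(x_i, x_j) = -k_1(1, 1) = -1 < 0,

so $k_3 = -k_1$ is not PSD.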

c
10 P

A feature map associated to a kernel $k$ must satisfy $\langle \phi(x_i), \phi(x_j) \rangle = k(x_i, x_j) \; \forall i, j \in X$. Find the feature map of $k_3 = \alpha k_1 + \beta k_2$ and show that it fulfills the equation above.

Your answer:
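A sketch of one standard construction, assuming $\phi_1$ and $\phi_2$ are feature maps of $k_1$ and $k_2$: concatenate the scaled maps,

\phi_3(x) = (\sqrt{\alpha}\, \phi_1(x), \; \sqrt{\beta}\, \phi_2(x)),

which is well defined because $\alpha, \beta \ge 0$. Then

\langle \phi_3(x_i), \phi_3(x_j) \rangle = \alpha \langle \phi_1(x_i), \phi_1(x_j) \rangle + \beta \langle \phi_2(x_i), \phi_2(x_j) \rangle = \alpha k_1(x_i, x_j) + \beta k_2(x_i, x_j) = k_3(x_i, x_j).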

Section 4: Lagrange Multipliers (20 P)

We consider a discrete probability distribution $p(i)$, $i \in \{1, \dots, 10\}$. We can represent such a probability distribution as a vector $p$ indexed by $i$, subject to the constraints $p_i \ge 0$ and $\sum_{i=1}^{10} p_i = 1$. We would like to find analytically the probability distribution $p$ with maximum entropy. The entropy is given by $H(p) = -\sum_{i=1}^{10} p_i \log p_i$ and is a concave function.


a
5 P

Write down the Lagrangian function associated to this constrained optimization problem.

Your answer:
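One possible form, for comparison (the constraints $p_i \ge 0$ may be left out here because the resulting solution turns out to be strictly positive):

L(p, \lambda) = -\sum_{i=1}^{10} p_i \log p_i + \lambda \left( \sum_{i=1}^{10} p_i - 1 \right).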

b
10 P

Show using the Lagrangian method that the optimal probability distribution is uniform with $p(i) = 0.1$ for all $i$.

Your answer:
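A sketch of the derivation, for comparison: setting the partial derivatives of the Lagrangian to zero,

\frac{\partial L}{\partial p_i} = -\log p_i - 1 + \lambda = 0 \quad \Rightarrow \quad p_i = e^{\lambda - 1},

so all $p_i$ take the same value. The constraint $\sum_{i=1}^{10} p_i = 1$ then gives $p_i = 1/10 = 0.1$, and because $H$ is concave this stationary point is the global maximum.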

c
5 P

Explain briefly why the same Lagrange method cannot be used to find the probability distribution with minimum entropy.

Your answer:
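One possible line of reasoning, for comparison: the Lagrangian method only finds stationary points, and for a concave $H$ the unique interior stationary point is the maximum. The minimum of a concave function over the simplex is attained at its vertices, i.e. distributions with one $p_i = 1$ and the rest $0$; these lie on the boundary where the inequality constraints $p_i \ge 0$ are active, so they are not stationary points of the Lagrangian above.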

Section 5: Quadratic Programming (25 P)

We consider the regularized regression task that is solved by optimizing
$\min_{w} \sum_{k=1}^{N} (y_k - w^T x_k)^2$ subject to $\|w\|_\infty \le 1$,
where $w$ is the solution and $(x, y)_k$ is a dataset of $N$ input-output pairs.


a
10 P

Show that the optimization problem can be rewritten as:
$\min_w w^T X^T X w - 2 y^T X w$ subject to $w_i \le 1$ and $w_i \ge -1$.

Your answer:
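A sketch of the expansion, for comparison, with $X$ the $N \times d$ matrix whose rows are the $x_k^T$ and $y$ the vector of targets:

\sum_{k=1}^{N} (y_k - w^T x_k)^2 = \|y - Xw\|^2 = y^T y - 2 y^T X w + w^T X^T X w.

The constant $y^T y$ does not affect the minimizer and can be dropped, and $\|w\|_\infty \le 1$ holds exactly when $-1 \le w_i \le 1$ for every component $i$.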

b
15 P

You have at your disposal a quadratic solver quadprog(Q, l, A, b) that solves
$\min_v v^T Q v + l^T v$ subject to $A v \le b$.
Write down the code that builds the numpy arrays Q, l, A, b from the data X and y.

Your answer:
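A minimal sketch of one possible answer, assuming the solver signature described above with second argument the linear-term vector $l$ (quadprog itself is given by the exam and not implemented here):

import numpy as np

def build_qp(X, y):
    # Match v^T Q v + l^T v with w^T X^T X w - 2 y^T X w.
    n, d = X.shape
    Q = X.T @ X                  # quadratic term
    l = -2.0 * X.T @ y           # linear term, since l^T w = -2 y^T X w
    # Box constraints ||w||_inf <= 1 written as A w <= b:
    A = np.vstack([np.eye(d),    #   w_i <= 1
                   -np.eye(d)])  #  -w_i <= 1, i.e. w_i >= -1
    b = np.ones(2 * d)
    return Q, l, A, b

# Usage with the exam's solver (hypothetical call):
# Q, l, A, b = build_qp(X, y)
# w = quadprog(Q, l, A, b)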