Machine Learning Theory (CSC 482A/581A)

Instructions:
• You must write up your solutions individually.
• You may have high-level discussions with up to 2 other students registered in the course. If
you discuss problems with others, include at the top of your submission: their names, V#’s,
and the problems discussed.
• Your solutions should be clear and concise (and legible!). You are strongly encouraged to
type up your solutions in LaTeX. For any problems where you only have a partial solution,
be clear about any parts of your solution for which you have low confidence.
• If submitted through conneX, the due date is Friday, February 2nd, 7pm. This is a hard
deadline. If submitted in writing, submit at the beginning of class (12:30pm) Friday, February
2nd. You are thus incentivized to submit online (using LaTeX) as you get more time!
Questions:
1. Let X = ℝ^2 and take C to be the class of concentric circles C = {c_r : r ≥ 0}, where for each nonnegative real number r, we have c_r(x) = 1[‖x‖_2 ≤ r]. Prove that C is PAC learnable. In particular, give a PAC learning algorithm which, given a training sample of size n ≥ log(1/δ)/ε, finds with probability at least 1 − δ a hypothesis f̂ ∈ C for which R(f̂) ≤ ε.
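For concreteness, here is one natural learner for this problem, sketched in Python under our own interface assumptions (the name learn_circle and the NumPy array encoding of the sample are not from the assignment): return the tightest circle consistent with the data, i.e., use as the radius the norm of the farthest positively labeled point.

    import numpy as np

    def learn_circle(X, y):
        """Tightest concentric circle consistent with the sample.

        X: (n, 2) array of points in R^2; y: (n,) array of {0, 1}
        labels with y_i = 1[||x_i||_2 <= r*] for an unknown radius r*.
        """
        norms = np.linalg.norm(X, axis=1)
        pos = norms[y == 1]
        # Radius of the farthest positive example (0 if none observed).
        # This never overshoots r*, so the hypothesis errs only on the
        # annulus between r_hat and r*.
        r_hat = pos.max() if pos.size > 0 else 0.0
        return lambda x: int(np.linalg.norm(x) <= r_hat)

Since the error region is an annulus just inside the target circle, bounding the probability that n i.i.d. draws all miss a region of mass ε is what yields the log(1/δ)/ε sample size.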
2. Devise an efficient mistake-bound learner for the concept class of k-term DNF formulas over X = {0, 1}^d. Both the runtime and the mistake bound of your algorithm should be polynomial in d; you may treat k as a constant.
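One standard route, sketched here as a possibility rather than the prescribed solution: expand the k-term DNF into its equivalent k-CNF (a conjunction of clauses of at most k literals), treat each of the O((2d)^k) possible clauses as a feature, and run the classic elimination algorithm for conjunctions over those features. The class and helper names below are ours.

    from itertools import combinations, product

    def clauses(d, k):
        """All disjunctions of at most k literals over x_0, ..., x_{d-1}.
        A literal is a pair (i, b), satisfied by x iff x[i] == b."""
        for size in range(1, k + 1):
            for idxs in combinations(range(d), size):
                for signs in product((0, 1), repeat=size):
                    yield tuple(zip(idxs, signs))

    def satisfied(clause, x):
        return any(x[i] == b for i, b in clause)

    class KCNFEliminator:
        """Keep a conjunction of candidate clauses; delete those falsified
        by positive examples. Each mistake removes at least one clause, so
        the mistake bound and per-round runtime are O((2d)^k), polynomial
        in d when k is a constant."""

        def __init__(self, d, k):
            self.hyp = set(clauses(d, k))

        def predict(self, x):
            return int(all(satisfied(c, x) for c in self.hyp))

        def update(self, x, y):
            # The hypothesis always retains every clause of the target
            # k-CNF, so it never errs on negative examples; on a false
            # negative, drop every clause the example falsifies.
            if y == 1 and self.predict(x) == 0:
                self.hyp = {c for c in self.hyp if satisfied(c, x)}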
3. Let X = {0, 1}^d and consider PAC learning a finite concept class C. Assume that the inputs are drawn i.i.d. from an unknown distribution P over X, and the labels are generated via the rule Y = c(X) for some c ∈ C.
Let's call this problem the "clean" problem; so, in the clean problem, the training sample consists of random examples of the form (X, Y) for X ∼ P and Y = c(X).
Next, consider the following "corrupted" problem: each time we request a random example (X, Y), with probability α(X) ∈ [0, 1] the value of the label Y is flipped. Call the resulting label Ỹ. Thus,

    Ỹ = −Y with probability α(X),   and   Ỹ = Y with probability 1 − α(X).

In the corrupted problem, the examples are of the form (X, Ỹ), and so the labels are noisy.
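To make the corruption mechanism concrete, here is a tiny simulator for the noisy example oracle, assuming {−1, +1}-valued labels (as the −Y notation suggests); draw_x, c, and alpha are placeholders for the unknown distribution, target concept, and noise-rate function.

    import numpy as np

    rng = np.random.default_rng(0)

    def corrupted_example(draw_x, c, alpha):
        """Draw one noisy example: sample X ~ P, set Y = c(X) in {-1, +1},
        then flip the sign of Y with probability alpha(X)."""
        x = draw_x(rng)
        y = c(x)
        y_tilde = -y if rng.random() < alpha(x) else y
        return x, y_tilde

    # Example: uniform inputs over {0, 1}^3, target "all coordinates 1",
    # constant noise rate alpha = 1/4.
    x, y_tilde = corrupted_example(
        lambda r: r.integers(0, 2, size=3),
        lambda x: 1 if x.all() else -1,
        lambda x: 0.25,
    )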
(a) Using c and α, derive an expression for the Bayes classifier for the corrupted problem.
(b) For the remaining questions, assume that α(x) = 1/4 for all x ∈ X. What is the Bayes classifier for the corrupted problem?
(c) What is the Bayes risk for the corrupted problem?
(d) Let c_ε ∈ C be a hypothesis for which Pr(c_ε(X) ≠ c(X)) = ε > 0. What is the risk (expected zero-one loss) of c_ε for the corrupted problem?
(e) Design an algorithm for PAC learning C given access only to corrupted labeled examples (X_1, Ỹ_1), …, (X_n, Ỹ_n). That is, your algorithm should, with probability at least 1 − δ, output a concept f̂ ∈ C for which E_{X∼P}[1[f̂(X) ≠ c(X)]] ≤ ε. Your algorithm should be statistically efficient (you should state the sample size n required, and n should be polynomial in 1/ε and 1/δ), but it need not be computationally efficient. Please explain why your algorithm is correct.
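For part (e), one statistically efficient candidate (a sketch under the α = 1/4 setting of part (b); the constants come from our own Hoeffding-plus-union-bound estimate and are not from the assignment) is empirical risk minimization on the noisy labels: when the noise rate is a constant below 1/2, the corrupted risk of any h ∈ C is an increasing affine function of Pr(h(X) ≠ c(X)), so minimizing disagreement with the noisy labels also approximately minimizes disagreement with c.

    import math

    def sample_size(num_concepts, eps, delta):
        """Hoeffding + union bound estimate (our constants): enough draws
        that every concept's empirical noisy risk is within eps/4 of its
        true noisy risk simultaneously, with probability >= 1 - delta."""
        return math.ceil((8 / eps**2) * math.log(2 * num_concepts / delta))

    def erm_on_corrupted(sample, concepts):
        """Return the concept with the fewest disagreements with the noisy
        labels; `sample` is a list of (x, y_tilde) pairs and `concepts` an
        iterable of callables."""
        return min(concepts,
                   key=lambda h: sum(1 for x, y in sample if h(x) != y))

With every empirical noisy risk within ε/4 of its true value, the empirical minimizer's true noisy risk is within ε/2 of that of c, which under the affine relation above gives Pr(f̂(X) ≠ c(X)) ≤ ε. Note that this n is polynomial in 1/ε and log(1/δ), which is stronger than the requested dependence on 1/δ.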
