## Quadratic programming and an application to the disputed Federalist Papers


Contents

1 Quadratic Programming
2 Project Description
3 Deliverables
  3.1 Report
  3.2 Code
    3.2.1 MATLAB m-files
4 Grading Scheme
5 Teams
6 Algorithm List

Key concepts: Algorithm implementation, computational considerations, fundamental

engineering trade-offs, algorithm performance analysis, algorithm selection process, basic

machine learning, SVM

1 Quadratic Programming

In this project, you will consider quadratic programming problems of the following form:

    min   (1/2) x^T H x + c^T x        (QP)
    s.t.  Ax ≤ b
          x ≥ 0
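To make the data layout of (QP) concrete, here is a minimal pure-Python sketch (illustrative only; the assignment deliverable is MATLAB code) that evaluates the objective and checks feasibility for a tiny two-variable instance. The names qp_objective and is_feasible are hypothetical, not part of the handout.

```python
# Tiny instance of (QP): minimize 0.5 x'Hx + c'x  s.t.  Ax <= b, x >= 0.
H = [[2.0, 0.0], [0.0, 2.0]]   # positive definite, as the handout requires
c = [-2.0, -5.0]
A = [[1.0, 1.0]]
b = [3.0]

def qp_objective(x):
    """Evaluate 0.5 x'Hx + c'x for this instance."""
    quad = 0.5 * sum(x[i] * H[i][j] * x[j] for i in range(2) for j in range(2))
    lin = sum(c[i] * x[i] for i in range(2))
    return quad + lin

def is_feasible(x, tol=1e-9):
    """Check x >= 0 and Ax <= b up to a small tolerance."""
    return all(xi >= -tol for xi in x) and all(
        sum(A[k][i] * x[i] for i in range(2)) <= b[k] + tol
        for k in range(len(b)))
```

Any algorithm you implement should drive qp_objective down while keeping is_feasible true.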


MIE335 Lab Project Winter 2018

where x is a vector of continuous decision variables and H is a positive definite matrix.

You could just plug this problem into Gurobi and get an answer, but maybe you are

in the real world and your employer or client does not want to pay for a Gurobi license,

or maybe your problem is so big that Gurobi cannot solve or even store it. So, you must

code up an algorithm to solve the problem on your own. The question is, what algorithm

to use?

Usually, it is not obvious which algorithm will work best for a particular optimization

problem. Even with linear programs, both simplex and interior point methods are common

choices, and one may arrive at a solution faster with one method than the other for particular

problem instances. In the world of research, a significant amount of time is devoted to

designing new optimization algorithms, or tweaks to existing algorithms, that consistently

outperform other well-known algorithms across a class of problems. Even if our goal is to

simply use an off-the-shelf algorithm, we have to implement a few different algorithms to

see which performs best for our problem. The definition of “best” is also something that

must be carefully considered.

Your job is to implement any two algorithms listed in Section 6, analyze them to predict which one you think will work best, and then determine which one actually performs best. In order to pick the best algorithm, you will be given a

small number of sample QP instances (available on Blackboard) to test your algorithms’

efficiency on, but you should try creating your own, too.

2 Project Description

Linear and quadratic programming can be used to solve problems in many applications.

In this project, you will apply quadratic programming to a machine learning formulation

to determine the authorship of the disputed Federalist Papers.

The Federalist Papers were written in 1787-1788 by Alexander Hamilton, John Jay,

and James Madison to persuade the citizens of the State of New York to ratify the U.S.

Constitution. As was common in those days, these 77 short essays, about 900 to 3500 words in length, appeared in newspapers signed with a pseudonym, in this instance, Publius. In 1788 these papers were collected along with eight additional articles on the same subject and were published in book form. Since then, the consensus has been that John Jay was the sole author of five of a total of 85 papers, that Hamilton was the sole author of

51, that Madison was the sole author of 14, and that Madison and Hamilton collaborated

on another three. The authorship of the remaining 12 papers has been in dispute; these

papers are usually referred to as the disputed papers. It has been generally agreed that the

disputed papers were written by either Madison or Hamilton, but there was no consensus

about which were written by Hamilton and which by Madison.

The data, obtained from Bosch and Smith (1998),¹ federalData.mat, is available on

¹Bosch, R. A. and Smith, J. A. Separating hyperplanes and the authorship of the disputed Federalist papers. American Mathematical Monthly 105, 7 (August–September 1998), 601–608.


Blackboard. The file contains a data matrix with 118 lines of data, one line per paper. (A

number of other papers with known authorship of either Hamilton or Madison were added

to the Federalist Papers mentioned above, to provide extra data on the vocabulary habits

of the two authors.) The first entry in each line contains the code number of the author

(or a label), 1 for Hamilton (56 papers in total), 2 for Madison (50 papers in total), and

3 for the disputed papers (12 in total). The remaining entries contain 70 floating point

numbers that correspond to the relative frequencies (number of occurrences per 1000 words

of the text) of the 70 function words, which are also available in the data file as an array

of strings.

The idea of the project is to compute a linear support vector machine² (SVM) to

determine if a disputed paper was authored by Hamilton or Madison. To do this, you will

divide the papers with known authors into a training set and a tuning set for the separating

plane. The disputed papers form a testing set. Once we have calculated the separating

plane, we will test to see which side of the plane the disputed papers lie on.

The MATLAB routine getFederalistData.m (also available on Blackboard) parses

the data from federalData.mat to produce a training matrix train (consisting of 86

entries with known authorship), a tuning matrix tune (consisting of the other 20 papers of

known authorship), and a testing matrix test (consisting of the 12 entries with unknown

authorship). Each row of train and tune consists of a label followed by a feature vector in R^70.
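As a rough illustration of the split described above (the exact partition is defined by getFederalistData.m, not by this sketch), the following pure-Python function peels off the disputed papers (label 3) and takes an assumed deterministic 86/20 split of the known-author rows. The name split_federalist is hypothetical.

```python
# Hedged sketch of the train/tune/test split; the real partition comes from
# getFederalistData.m. Here we only assume: label 3 rows form the test set,
# and the 106 known-author rows are split 86 (train) / 20 (tune) in order.
def split_federalist(rows):
    known = [r for r in rows if r[0] in (1, 2)]   # Hamilton or Madison
    test = [r for r in rows if r[0] == 3]         # disputed papers
    train, tune = known[:86], known[86:]
    return train, tune, test
```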

Use the following quadratic programming formulation of the SVM for this problem:

    min   (1/N) Σ_{i=1}^N s_i + (µ/2) w^T w                       (1a)
    s.t.  y_i (w^T x_i + b) ≥ 1 − s_i    ∀i ∈ {1, . . . , N}      (1b)
          s ≥ 0                                                   (1c)

where (w, b) are the decision variables we would like to learn, s is the vector of error variables, and N is the total number of training examples. Set y_i = 1 for Hamilton and y_i = −1 for Madison. Each x_i is an example (feature) vector in R^70. The output (w, b) will parametrize a linear classifier f such that

    f(x_i) = w^T x_i + b

will be mostly greater than zero on the Hamilton documents and mostly less than zero on the Madison documents. µ is the regularization parameter; large values of µ result in small values of ‖w‖₂.
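The classification rule above is just the sign of f(x). A minimal pure-Python sketch (illustrative; your deliverable is MATLAB) that applies a learned (w, b) and counts misclassifications could look as follows; classify and count_misclassified are hypothetical names.

```python
def classify(w, b, x):
    """Return +1 (Hamilton) or -1 (Madison) from the sign of f(x) = w'x + b."""
    f = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if f > 0 else -1

def count_misclassified(w, b, X, y):
    """Count examples whose predicted sign disagrees with the label y_i."""
    return sum(1 for xi, yi in zip(X, y) if classify(w, b, xi) != yi)
```

The same two functions cover both the training/tuning error counts and the authorship predictions for the disputed papers.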

You should use a known solver as a reference point and solve the QP to optimality. You can use CVX (free: http://cvxr.com/cvx/) with a free solver such as SDPT3 (http://www.math.nus.edu.sg/~mattohkc/). Besides CVX, you can also use Gurobi or CPLEX (both have MATLAB interfaces), or MATLAB's quadprog function; the choice is yours.

²A support vector machine (SVM) is a supervised machine learning algorithm that can be used for classification by finding a hyperplane that separates the two classes well.

You should also write MATLAB code and implement any two algorithms listed in

Section 6 to solve the QP (1a)-(1c).

You should answer the following questions:

Q1. Write MATLAB code to solve the QP to find the classifying hyperplane, where the

matrices M and H are extracted from the training matrix train described above.

Solve the QP for these values of µ: 0, 0.001, 0.01, 0.1, 1, 10, and 100. For each value

of µ, calculate and print the following information:

• the optimal QP objective

• the values of b and ‖w‖₂

• the number of misclassified points (those that lie on the wrong side of the classifying

hyperplane) from the training set, which is described in the matrix train

• the number of misclassified points from the tuning set, which is described in the

matrix tune

• the predictions of authorship for the 12 disputed papers. (To calculate this prediction, take each of the rows in test and determine which side of the classifying hyperplane they lie on.) Express your result with one line of output for each of the 12 papers, indicating the paper number, predicted author, and margin (the value of w^T x + b for the x corresponding to this paper).
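Before Q1 can be run, formulation (1a)-(1c) has to be assembled into the standard QP data your solver expects. The following pure-Python sketch (illustrative; build_svm_qp and the z = (w, b, s) ordering are assumptions, not part of the handout) writes (1b) in "≤" form as −y_i x_i^T w − y_i b − s_i ≤ −1:

```python
def build_svm_qp(X, y, mu):
    """Assemble (1a)-(1c) as min 0.5 z'Qz + q'z s.t. Gz <= h, z = (w, b, s).

    The s >= 0 bounds (1c) are left out here; they would be appended as extra
    rows (or passed as variable bounds, if the solver supports them).
    """
    N, n = len(X), len(X[0])
    dim = n + 1 + N
    # Quadratic term: mu on the w block only; b and s do not appear in (mu/2)w'w.
    Q = [[mu if (i == j and i < n) else 0.0 for j in range(dim)]
         for i in range(dim)]
    q = [0.0] * (n + 1) + [1.0 / N] * N          # (1/N) sum_i s_i
    # (1b) rewritten as  -y_i x_i'w - y_i b - s_i <= -1
    G = [[-y[i] * X[i][j] for j in range(n)] + [-y[i]]
         + [-1.0 if k == i else 0.0 for k in range(N)] for i in range(N)]
    h = [-1.0] * N
    return Q, q, G, h
```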

Q2. From the results in Q1, answer the following questions and give your reasons for

your answer. What is the effect of µ on ‖w‖₂ and on the quality of the classifying

hyperplane? What is the best value of µ?

Q3. Fix µ = 0.1 for purposes of this question and the next. Suppose you want to use only

2 of the 70 attributes (word frequencies) for your prediction of authorship. Determine

which pair of attributes is the most effective in determining a correct prediction as

follows. For each of the 2415 possible pairs of attributes, determine a separating hyperplane in R^2. (Note that for these problems, n = 2 in our formulation above in

contrast to n = 70 when we use all features.) For each plane, determine the number of

misclassified cases from the tuning set. Report the best classifier, and the two words

that it uses, along with the number of misclassified tuning points for this classifier.
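The pair count in Q3 is C(70, 2) = 2415, which the standard library can enumerate directly. A small sketch (attribute_pairs is an illustrative name; the inner solve-and-evaluate step is your own code):

```python
from itertools import combinations

def attribute_pairs(n_attrs=70):
    """All unordered pairs of attribute indices searched over in Q3."""
    return list(combinations(range(n_attrs), 2))

# Q3 loop skeleton (the solve/evaluate calls are placeholders for your code):
#   for (i, j) in attribute_pairs():
#       train the SVM on columns (i, j) with mu = 0.1
#       count misclassified tuning points; keep the pair with the fewest
```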

Q4. Use the optimal classifier from Q3 to predict the authorship of each of the twelve

papers. Plot all the data points according to the “best” two attributes on a two-dimensional figure using MATLAB built-in plotting routines. Use ‘o’ for Hamilton papers, ‘+’ for Madison papers, and ‘*’ for disputed papers in the plot. Use MATLAB to draw in the calculated line w^T x = −b. Check to see if the number


of misclassified papers shown in your plot agrees with your calculations. (Note that

some points may coalesce, so you may want to randomly perturb the points by a

small amount to visualize all these points).

3 Deliverables

The project has two deliverables:

• PDF report (no other file type will be accepted)

• Zip file of all your code

3.1 Report

Your written report should contain everything about your algorithm implementation and

comparison process. There is no page minimum or maximum; just write enough to thoroughly

describe your process, and no more. Remember the report is also graded on clarity

and conciseness. At the very least, your report should contain the following information

(the section outline is provided as an example):

Introduction Explain which two algorithms you implemented and why you chose them,

and what criteria you will use to compare their performances and why you are concerned

with those criteria.

Algorithms Provide details about each of the two algorithms you implemented, including

pseudo-code, complexity analysis, and predictions about performance (have a subsection

for each algorithm). Be sure to properly cite your sources! If you saw multiple

variations of an algorithm in reading articles and textbooks, cite those multiple versions

and explain why you picked the variation that you did. Also, explain which

known solver you used as a reference point.

Algorithm Comparison Explain in detail which instances you used to test your algorithms

on, how you generated your own instances (if applicable), how each algorithm

performed in terms of objective function value, computation time, and anything else

you consider relevant. Use graphs and tables whenever appropriate. Make sure the

graphs and tables you used are clear and easy to read.

SVM Application Results Answer in detail the questions Q1-Q4.

Conclusions Make your final recommendation for the best algorithm of the two you

tested, and discuss anything interesting you observed during the process. Also, provide

final comments regarding the SVM application.


3.2 Code

The two algorithms must be coded in MATLAB. For each algorithm, put all of your m-files

in a folder. Required folder names and main m-file names are listed in Section 6. Together

with your code from the solver that you chose to use, zip the folders into a single zip file

and submit that zip file.

3.2.1 MATLAB m-files

For each algorithm, the main m-file that runs the program must have the following heading:

function [z,b,w,p1,p2] = alg_name(M,H,mu)

where

• z is the objective function value. b and w are the solutions found by the algorithm

for the SVM problem, respectively. p1 is the number of misclassified points from the training set, and p2 is the number of misclassified points from the tuning set.

• M and H are the M and H data matrices in the QP model (1a)-(1c), respectively. mu

is the parameter µ.

• alg_name is the main m-file name listed in Section 6.

You may assume that the user will pass in all inputs and that all dimensions will be consistent.

4 Grading Scheme

Table 1: Grading Scheme

Item                                                       Weight
Report                                                     40%
Algorithm code (5% solver + 15% each algorithm)            35%
Correctness of the results                                 15%
Efficiency (5% each × 2)                                   10%
BONUS: Write your report in LaTeX                          +5%
PENALTY: Failure to report team members by Reading Week    −5%

Report You will be assessed on the thoroughness of each of the report sections described

in Section 3.1, as well as on organization, quality and appropriateness of figures and tables.


Algorithm code You will be assessed on correctness and style (e.g., formatting, appropriate use of functions).

Correctness of the results Mainly the correctness of your answers for the questions Q2–Q4 from Section 2 will be graded.

Efficiency In order to be considered for the points for efficiency you must meet the

following optimality and feasibility requirements:

• found an objective function value within 5% of the optimal value

• found a solution no more than 2% from feasibility³

Note: Teams that did not get to within 5% optimality and 2% feasibility automatically get

0 points for that algorithm.
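The two thresholds can be checked mechanically. A hedged sketch (within_optimality and within_feasibility are illustrative names, and reading "2%" as relative to each right-hand side |b_k| is an assumption on our part):

```python
def within_optimality(z, z_opt, tol=0.05):
    """True if objective z (minimization) is within 5% of the optimal z_opt."""
    return z <= z_opt + tol * abs(z_opt)

def within_feasibility(Ax, b, tol=0.02):
    """True if no constraint row is violated by more than 2%.

    Assumption: "2%" is measured relative to |b_k| for each row k.
    """
    return all(lhs <= rhs + tol * abs(rhs) for lhs, rhs in zip(Ax, b))
```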

Of all the teams in the class that met these optimality/feasibility requirements, the team

with the fastest running time gets all 5 points, and everyone else receives points in the range

[1, 5] proportional to computation time. Specifically, if f is the fastest computation time,

computation time t earns the following points:

max{1, 5 × (f/t)}

If you are the only team to implement a particular algorithm, then you get the full 5 points

as long as you meet the optimality and feasibility requirements.
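The scoring rule above can be written as a one-line function (efficiency_points is an illustrative name):

```python
def efficiency_points(f, t):
    """Points per the handout rule max{1, 5*(f/t)}:
    f is the fastest team's running time, t is your own running time."""
    return max(1.0, 5.0 * (f / t))
```

So the fastest team (t = f) earns 5 points, a team twice as slow earns 2.5, and no qualifying team earns less than 1.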

5 Teams

Teams will (mostly) consist of four people and must be finalized by the start of Reading

Week. Depending on how many people are in the class by Reading Week, some three-person teams will be necessary. Teams should not have more than four people. There will

be no difference in grading three-person teams vs four-person teams.

Teams are strongly recommended to have at least one person who has taken MIE365

Operational Research III: Advanced OR.

You must report all your team members to Prof. Bodur by the start of Reading Week,

and indicate which if any team members satisfy the above recommendation.

6 Algorithm List

Select any two of the algorithms listed in Table 2. Instructions to implement the algorithms

will not be provided; you will have to use your extensive experience searching the Internet to

³2% feasibility is defined as no constraint being violated by more than 2%.


find algorithm steps. Search Google Scholar for relevant technical articles and textbooks.

The library will also probably have most textbooks of interest. Pay attention to publication

dates and results, and try to find the latest and greatest variations.

Feel free to tweak any of the algorithms for better performance on your specific model

structure, but be sure to clearly describe any tweaks in your report. A minor tweak does

not count as making up your own method.

Table 2: Algorithm candidates and required naming schemes

    Algorithm                                        Folder name   Main m-file name
 1. Interior point method                            IPM           run_IPM
 2. Active set method                                AS            run_AS
 3. Trust region method                              TR            run_TR
 4. Projected gradient method                        PG            run_PG
 5. Conjugate gradient method                        CG            run_CG
 6. Sequential quadratic programming algorithm       SQP           run_SQP
 7. Hildreth’s quadratic programming algorithm       HQP           run_HQP
 8. Simplex method for quadratic programming         SM            run_SM
 9. Simulated annealing                              SA            run_SA
10. Genetic algorithm                                GA            run_GA
11. Tabu search                                      TS            run_TS
12. Make up your own method!                         novel         run_NOVEL
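Of the candidates above, Hildreth's algorithm (#7) is among the simplest to sketch: it runs coordinate-wise ascent on the dual of the QP. Below is a hedged pure-Python illustration (your deliverable must be MATLAB), restricted for brevity to diagonal positive definite H so that H⁻¹ is trivial; hildreth and its argument names are illustrative, and the x ≥ 0 bounds are assumed to be folded into G as rows of −I.

```python
def hildreth(Hdiag, c, G, h, iters=500):
    """Sketch of Hildreth's method for min 0.5 x'Hx + c'x s.t. Gx <= h,
    with H diagonal positive definite (given as its diagonal Hdiag).
    Assumes no row of G is all zeros (so P[i][i] > 0)."""
    m, n = len(G), len(c)
    Hinv = [1.0 / d for d in Hdiag]
    # Dual data: P = G H^{-1} G',  q = h + G H^{-1} c
    P = [[sum(G[i][k] * Hinv[k] * G[j][k] for k in range(n)) for j in range(m)]
         for i in range(m)]
    q = [h[i] + sum(G[i][k] * Hinv[k] * c[k] for k in range(n)) for i in range(m)]
    lam = [0.0] * m
    for _ in range(iters):
        for i in range(m):
            # Coordinate update: lam_i = max(0, -(q_i + sum_{j!=i} P_ij lam_j)/P_ii)
            s = q[i] + sum(P[i][j] * lam[j] for j in range(m) if j != i)
            lam[i] = max(0.0, -s / P[i][i])
    # Recover the primal solution: x = -H^{-1} (c + G' lam)
    return [-Hinv[k] * (c[k] + sum(G[i][k] * lam[i] for i in range(m)))
            for k in range(n)]
```

On the tiny instance min x₁² + x₂² − 2x₁ − 5x₂ s.t. x₁ + x₂ ≤ 3, x ≥ 0, this recovers the optimum x = (0.75, 2.25).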