Extra Material: Support Vector Machines with Conic Optimization

Leo Migdal

In this notebook we come back to the concept of training support vector machines as we did in the first SVM notebook. The difference is that we will now solve the dual problems related to training the SVMs using conic quadratic optimization, explicitly calling the Mosek solver, which should yield more numerically stable results. The first part of this notebook therefore consists of data imports and other preliminaries that need no further explanation; please move directly to the cell entitled "Conic optimization model" if you already have the data loaded. Point of attention: an important difference with the first notebook is that we eliminate the 'intercept' \(b\) of the SVM to keep our equations simple.

This cell selects and verifies a global SOLVER for the notebook.
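A minimal sketch of what such a solver-selection cell could look like is shown below. This is an illustration rather than the notebook's actual code: the Colab install commands are omitted, and the solver names "ipopt" and "mosek_direct" are assumptions based on the description that follows.

```python
import sys
import pyomo.environ as pyo

if "google.colab" in sys.modules:
    # On Google Colab, Pyomo and Ipopt are assumed to have been installed
    # (e.g. with pip in a preceding cell); fall back to the Ipopt solver.
    SOLVER = pyo.SolverFactory("ipopt")
else:
    # Elsewhere, assume Pyomo and MOSEK are already installed and use MOSEK,
    # which handles conic (quadratic cone) formulations directly.
    SOLVER = pyo.SolverFactory("mosek_direct")

# Verify that the selected solver is actually available before proceeding.
assert SOLVER.available(), f"Solver {SOLVER.name} is not available."
```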

If run on Google Colab, the cell installs Pyomo and ipopt, then sets SOLVER to use the ipopt solver. If run elsewhere, it assumes Pyomo and the Mosek solver have been previously installed and sets SOLVER to use the Mosek solver via the Pyomo SolverFactory. It then verifies that SOLVER is available.

We have so far explored the Naive Bayes classifier and logistic regression for solving classification tasks. In this section, we will learn about another tool called support vector machines (SVMs). While the final algorithm will be similar to logistic regression, SVMs approach the problem from a different perspective.

As we’ll soon discover, both logistic regression and SVMs have their own strengths and weaknesses. Plus, we’ll get to see some fun math along the way!

Consider a binary classification problem with data points \(\mathbf{x}^{(i)}\) and labels \(y^{(i)} \in \{-1, 1\}\). (We previously used \(y^{(i)} \in \{0, 1\}\), but this is just a relabeling for convenience.) We want to find a hyperplane that separates the two classes. For the majority of our discussion, we will assume that the data is linearly separable, meaning there exists a hyperplane that can perfectly separate the two classes. When there is such a separating hyperplane, there are often infinitely many that we can choose from.

The question at the heart of support vector machines is: which hyperplane should we choose? As we’ve seen in the past, it is not too difficult to find a model that perfectly fits our training data; the real challenge is to find a model that generalizes well to unseen data. In the context of classification, we may expect that a model that separates the two classes with the largest ‘margin of error’ will generalize better than one that passes very close to the data points. With this in mind, Vladimir Vapnik and Alexey Chervonenkis proposed the idea of support vector machines, which aim to find the hyperplane that maximizes the margin between the two classes. Let’s consider a particular hyperplane, defined by the normal vector \(\mathbf{w}\) and bias \(b\). The equation of the hyperplane is given by:

\[
\begin{align}
\langle \mathbf{w}, \mathbf{x} \rangle - b = 0.
\end{align}
\]
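To make the margin-maximization idea concrete, the standard (hard-margin) textbook formulation can be written as follows; this is a generic statement of the problem, not necessarily the exact formulation used later in this notebook. Requiring every point to lie on the correct side of the hyperplane with a functional margin of at least one, the geometric margin \(1/\lVert\mathbf{w}\rVert\) is maximized by solving

\[
\begin{align}
\min_{\mathbf{w},\, b} \quad & \tfrac{1}{2}\lVert \mathbf{w} \rVert^2 \\
\text{s.t.} \quad & y^{(i)}\bigl(\langle \mathbf{w}, \mathbf{x}^{(i)} \rangle - b\bigr) \geq 1 \quad \text{for all } i.
\end{align}
\]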

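To connect this with the conic approach described in the introduction: once the intercept \(b\) is eliminated and a soft margin with regularization parameter \(C\) is used, the dual training problem can be written, in one common epigraph form, as

\[
\begin{align}
\max_{\alpha,\, t} \quad & \sum_{i} \alpha_i - t \\
\text{s.t.} \quad & 2t \geq \Bigl\lVert \textstyle\sum_i \alpha_i\, y^{(i)} \mathbf{x}^{(i)} \Bigr\rVert^2, \\
& 0 \leq \alpha_i \leq C \quad \text{for all } i,
\end{align}
\]

where the first constraint is a rotated quadratic cone constraint of exactly the kind Mosek handles natively. This is a sketch of the general idea; the precise model built in the "Conic optimization model" cell may differ in its details.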

The following data set contains data from a collection of known genuine and known counterfeit banknote specimens. The data includes four continuous statistical measures obtained from the wavelet transform of banknote images, named “variance”, “skewness”, “curtosis”, and “entropy”, and a binary variable named “class”, which is 0 if genuine and 1 if counterfeit.
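One possible way to load and prepare this data is sketched below. The file name, the use of pandas, and the relabeling of the classes to \(\{-1, +1\}\) are assumptions made for illustration, not necessarily what the notebook does.

```python
import pandas as pd

# Hypothetical local file name; the notebook may read the data from a URL instead.
columns = ["variance", "skewness", "curtosis", "entropy", "class"]
df = pd.read_csv("data_banknote_authentication.txt", names=columns)

# Features and labels; map class 0/1 (genuine/counterfeit) to -1/+1,
# matching the label convention used in the text.
X = df[["variance", "skewness", "curtosis", "entropy"]].to_numpy()
y = (2 * df["class"] - 1).to_numpy()
```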

In this chapter, there are a number of examples with companion AMPL implementations that explore various modeling and implementation aspects of conic problems:

- Markowitz portfolio optimization problem revisited
- Optimal design of multi-layered building insulation
- Training Support Vector Machines with Conic Programming
- Extra material: Luenberger’s Investment Wheel
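Returning to this notebook's main topic, the sketch below shows one way the conic dual stated earlier could be implemented with Pyomo's kernel interface and handed to the MOSEK solver. Everything here is an illustrative assumption, including the function name, the use of pyomo.kernel, and the way the cone is expressed; it is not the notebook's actual model.

```python
import pyomo.kernel as pmo
from pyomo.core.kernel.conic import rotated_quadratic

def svm_conic_dual(X, y, C=1.0):
    """Soft-margin SVM dual without intercept, written as a conic program (sketch)."""
    n, d = X.shape
    m = pmo.block()

    # Dual variables, one per data point, with box constraints 0 <= alpha_i <= C.
    m.alpha = pmo.variable_list(pmo.variable(lb=0, ub=C) for _ in range(n))

    # Auxiliary variables w_j = sum_i alpha_i * y_i * x_ij (the primal weights).
    m.w = pmo.variable_list(pmo.variable() for _ in range(d))
    m.w_def = pmo.constraint_list(
        pmo.constraint(
            m.w[j] == sum(float(y[i] * X[i, j]) * m.alpha[i] for i in range(n))
        )
        for j in range(d)
    )

    # Epigraph variable t with 2*t*1 >= ||w||^2, expressed as a rotated quadratic cone.
    m.t = pmo.variable(lb=0)
    m.one = pmo.variable(lb=1, ub=1)
    m.cone = rotated_quadratic(r1=m.t, r2=m.one, x=list(m.w))

    # Dual objective: maximize sum(alpha) - t, i.e. sum(alpha) - 0.5*||w||^2 at the optimum.
    m.obj = pmo.objective(sum(m.alpha) - m.t, sense=pmo.maximize)
    return m

# Usage, assuming X and y from the data-loading sketch and SOLVER set to MOSEK:
# model = svm_conic_dual(X, y, C=1.0)
# SOLVER.solve(model)
# w = [v.value for v in model.w]
```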
