Support Vector Machine

The objective of the support vector machine (SVM) algorithm is to find a hyperplane in an N-dimensional space (where N is the number of features) that distinctly classifies the data points.

Many possible hyperplanes could separate the two classes of data points. Our objective is to find the one with the maximum margin, i.e., the greatest distance between the hyperplane and the nearest data points of both classes. Maximizing the margin provides some reinforcement so that future data points can be classified with more confidence.
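To make the idea of a margin concrete, here is a minimal sketch in plain Python. The four points and the two candidate hyperplanes are made up for illustration, and the margin is taken to be the smallest distance from any point to the hyperplane w.x + b = 0.

```python
import math

def distance_to_hyperplane(w, b, x):
    """Geometric distance from point x to the hyperplane w.x + b = 0."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return abs(score) / math.sqrt(sum(wi * wi for wi in w))

def margin(w, b, points):
    """Smallest distance from any point to the hyperplane."""
    return min(distance_to_hyperplane(w, b, p) for p in points)

# Toy data (hypothetical): the first two points belong to one class,
# the last two to the other.
points = [(1.0, 1.0), (2.0, 0.5), (4.0, 4.0), (5.0, 3.5)]

# Two candidate hyperplanes that both separate the classes.
diagonal = ((1.0, 1.0), -5.0)   # x1 + x2 = 5
vertical = ((1.0, 0.0), -2.5)   # x1 = 2.5

margin_diagonal = margin(*diagonal, points)
margin_vertical = margin(*vertical, points)
print(margin_diagonal, margin_vertical)
```

Both lines classify the toy points correctly, but the diagonal one keeps every point farther away, so it is the candidate an SVM would prefer of the two.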

Hyperplanes and Support Vectors

Hyperplanes are decision boundaries that help classify the data points. Data points falling on either side of the hyperplane can be attributed to different classes. Also, the dimension of the hyperplane depends upon the number of features. If the number of input features is 2, then the hyperplane is just a line. If the number of input features is 3, then the hyperplane becomes a two-dimensional plane. It becomes difficult to imagine when the number of features exceeds 3.
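As a small illustration of a hyperplane acting as a decision boundary, the sketch below assigns a class from the sign of w.x + b. The function name and the example weights are assumptions chosen for illustration; the same function works for two features (a line) and for three features (a plane).

```python
def classify(w, b, x):
    """Return +1 or -1 depending on which side of w.x + b = 0 the point lies."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1

# Two features: the hyperplane x1 + x2 - 5 = 0 is a line.
line_w, line_b = (1.0, 1.0), -5.0
label_2d_low = classify(line_w, line_b, (1.0, 1.0))    # below the line
label_2d_high = classify(line_w, line_b, (4.0, 4.0))   # above the line

# Three features: the hyperplane x3 - 2 = 0 is a flat plane at height 2.
plane_w, plane_b = (0.0, 0.0, 1.0), -2.0
label_3d = classify(plane_w, plane_b, (7.0, -3.0, 5.0))

print(label_2d_low, label_2d_high, label_3d)
```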

Support vectors are the data points closest to the hyperplane; they determine its position and orientation. Using these support vectors, we maximize the margin of the classifier. Deleting a support vector will change the position of the hyperplane, whereas deleting a point that lies farther away will not. These are the points that help us build our SVM.
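One way to see which points act as support vectors is to check the functional margin y * (w.x + b), which equals 1 for the support vectors of a hard-margin SVM and is larger for every other point. The sketch below uses a hypothetical six-point dataset and a maximum-margin hyperplane worked out by hand for that data; neither comes from the Iris example.

```python
def functional_margin(w, b, x, y):
    """y * (w.x + b); equals 1 for support vectors of a hard-margin SVM."""
    return y * (sum(wi * xi for wi, xi in zip(w, x)) + b)

# Toy dataset (hypothetical): (point, label) pairs.
data = [
    ((0.0, 0.0), -1), ((1.0, 1.0), -1), ((2.0, 0.5), -1),
    ((4.0, 4.0), 1), ((5.0, 3.5), 1), ((6.0, 6.0), 1),
]

# Maximum-margin hyperplane for this toy data, computed by hand.
w = (16 / 65, 28 / 65)
b = -111 / 65

margins = [functional_margin(w, b, x, y) for x, y in data]

# Support vectors sit exactly on the margin boundary.
support_vectors = [x for (x, y), m in zip(data, margins)
                   if abs(m - 1.0) < 1e-9]
print(support_vectors)
```

Only two of the six points lie on the margin. Removing a far-away point such as (0.0, 0.0) or (6.0, 6.0) leaves the hyperplane unchanged, while removing one of the two support vectors would allow a different, wider margin.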

SVM Implementation from scratch

The dataset we will be using to implement our SVM algorithm is the Iris dataset.

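import pandas as pd

# Load the Iris dataset and drop the row identifier column
df = pd.read_csv('iris.csv')
df = df.drop(['Id'], axis=1)

# Collect the distinct class names
target = df['Species']
s = list(set(target))

# Drop rows 100-149 (the third class), leaving two classes of 50 rows each
rows = list(range(100, 150))
df = df.drop(df.index[rows])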

Since the Iris dataset has three classes, we will remove one of the classes. This leaves us with a binary class classification problem.

import matplotlib.pyplot as plt 

x = df['SepalLengthCm'] 
y = df['PetalLengthCm']

setosa_x = x[:50] 
setosa_y = y[:50]

versicolor_x = x[50:] 
versicolor_y = y[50:]

plt.figure(figsize=(8,6))
plt.scatter(setosa_x,setosa_y,marker='+',color='green')
plt.scatter(versicolor_x,versicolor_y,marker='_',color='red')
plt.show()

Also, there are four features available for us to use. We will be using only two of them, i.e., sepal length and petal length. We plot these two features to visualize the data. From the plot above, you can infer that a straight line can be used to separate the data points.
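The shuffling code that follows refers to X and Y, which are not defined anywhere in this section. One plausible way to build them is sketched here, assuming the column names used above and a -1/+1 label encoding (which matches the predictions made later). It runs on a tiny hand-made frame rather than iris.csv, and the four rows are illustrative only.

```python
import pandas as pd

# Tiny stand-in for the two-class Iris frame (illustrative rows only).
df = pd.DataFrame({
    'SepalLengthCm': [5.1, 4.9, 7.0, 6.4],
    'SepalWidthCm':  [3.5, 3.0, 3.2, 3.2],
    'PetalLengthCm': [1.4, 1.4, 4.7, 4.5],
    'PetalWidthCm':  [0.2, 0.2, 1.4, 1.5],
    'Species': ['Iris-setosa', 'Iris-setosa',
                'Iris-versicolor', 'Iris-versicolor'],
})

# Encode the two classes as -1 and +1 (assumed encoding).
Y = [-1 if species == 'Iris-setosa' else 1 for species in df['Species']]

# Keep only the two features used in the plot.
X = df[['SepalLengthCm', 'PetalLengthCm']].values.tolist()

print(X, Y)
```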

from sklearn.utils import shuffle
from sklearn.model_selection import train_test_split
import numpy as np

# X (feature rows) and Y (class labels) are assumed to have been extracted from df beforehand
# Shuffle and split the data into training and test sets
X, Y = shuffle(X, Y)
x_train, x_test, y_train, y_test = train_test_split(X, Y, train_size=0.9)

x_train = np.array(x_train)
y_train = np.array(y_train)
x_test = np.array(x_test)
y_test = np.array(y_test)

# Reshape the labels into column vectors (90 training rows, 10 test rows)
y_train = y_train.reshape(90, 1)
y_test = y_test.reshape(10, 1)

We extract the required features and split them into training and testing data. 90% of the data is used for training and the remaining 10% is used for testing. Let’s now build our SVM model using the NumPy library.
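The training step itself is not shown in this section. As a rough sketch only, and not necessarily the method that produced the w1 and w2 arrays used below, a linear SVM can be trained by sub-gradient descent on the regularized hinge loss. The function name, hyperparameters, and toy data are assumptions; unlike the code that follows, this version learns one weight per feature plus a bias. The toy data is symmetric about the origin, so the bias stays at zero.

```python
import numpy as np

def train_linear_svm(X, y, lr=0.01, reg=0.01, epochs=1000):
    """Sub-gradient descent on reg * ||w||^2 + mean hinge loss (a sketch)."""
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        violated = margins < 1  # inside the margin or misclassified
        # Sub-gradient of the objective with respect to w and b.
        hinge_w = (y[violated][:, None] * X[violated]).sum(axis=0)
        grad_w = 2 * reg * w - hinge_w / n_samples
        grad_b = -y[violated].sum() / n_samples
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Toy, linearly separable data (hypothetical values).
X = np.array([[1.0, 1.0], [2.0, 0.5], [2.0, 2.0],
              [-1.0, -1.0], [-2.0, -0.5], [-2.0, -2.0]])
y = np.array([1, 1, 1, -1, -1, -1])

w, b = train_linear_svm(X, y)
predictions = np.where(X @ w + b >= 0, 1, -1)
print(w, b, predictions)
```

Points that already satisfy the margin contribute only the regularization term to the update, while points that violate it pull the weights toward classifying them correctly.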

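from sklearn.metrics import accuracy_score

# w1 and w2 are the weight arrays produced by the training step
# Clip the weights to match the 10 test points
index = list(range(10, 90))
w1 = np.delete(w1, index)
w2 = np.delete(w2, index)

w1 = w1.reshape(10, 1)
w2 = w2.reshape(10, 1)

# Extract the test data features (assumes the two feature columns of x_test)
test_f1 = x_test[:, 0].reshape(10, 1)
test_f2 = x_test[:, 1].reshape(10, 1)

# Predict
y_pred = w1 * test_f1 + w2 * test_f2
predictions = []
for val in y_pred:
    if val > 1:
        predictions.append(1)
    else:
        predictions.append(-1)

print(accuracy_score(y_test, predictions))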

We now clip the weights, as the test data contains only 10 data points. We extract the features from the test data and predict the values. We then compare the predictions with the actual values and print the accuracy of our model.

#>Accuracy:1.0

Implementation with scikit-learn

There is a simpler way to implement the SVM algorithm. We can use the scikit-learn library and just call the relevant functions to build the SVM model. The amount of code drops significantly, to just a few lines.

from sklearn.svm import SVC
from sklearn.metrics import accuracy_score 

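# Fit a linear-kernel SVM; ravel() flattens the (90, 1) label column to the 1-D shape scikit-learn expects
clf = SVC(kernel='linear')
clf.fit(x_train, y_train.ravel())
y_pred = clf.predict(x_test)
print(accuracy_score(y_test, y_pred))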
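A fitted linear SVC also exposes the hyperplane and the support vectors discussed earlier. The sketch below uses a hypothetical six-point dataset rather than the Iris split, and a large C value as an assumption to approximate a hard margin.

```python
import numpy as np
from sklearn.svm import SVC

# Toy, linearly separable data (hypothetical values, not the Iris data).
X = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 0.5],
              [4.0, 4.0], [5.0, 3.5], [6.0, 6.0]])
y = np.array([-1, -1, -1, 1, 1, 1])

clf = SVC(kernel='linear', C=1000.0)  # large C approximates a hard margin
clf.fit(X, y)

print(clf.coef_, clf.intercept_)  # w and b of the separating hyperplane
print(clf.support_vectors_)       # training points that define the margin
train_predictions = clf.predict(X)
```

The coef_ attribute is only available for the linear kernel; with other kernels the hyperplane lives in a transformed feature space.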
