Machine Learning Portfolio
1 About
This document is written in Emacs with org-mode, for export to HTML.
The inline code blocks in this document are executed sequentially in a single Python session by org-babel. Usage here is meant to be representative of use during an interactive session. portfolio.visualize.Plot.embed() produces a representation of the plot for embedding in an HTML document. For normal use, use save or show instead.
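For instance, a hypothetical minimal usage (the figure here is purely illustrative):
import matplotlib.pyplot as plt
from portfolio.visualize import Plot
plt.figure()
plt.plot([1, 2, 3], [1, 4, 9])
Plot.embed(plt)  # embeddable representation, as used throughout this document
# In an interactive session, showing or saving the figure would be used instead.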
The source of this document and the project are available on GitHub here. If you are viewing this on GitHub, an exported version with output and graphs is available here.
2 Setup
Install the project dependencies via Poetry
poetry install --no-interaction --ansi
Activate the venv. In Emacs, you can activate a venv located at ./.venv via
(pyvenv-activate ".venv")
Load the iris dataset
import numpy as np
from portfolio.visualize import Plot
from sklearn.datasets import load_iris
iris = load_iris()
Get the data for only types 1 and 2 for the binary classifiers, and label them as -1 or 1
x = iris.data[iris.target > 0]
y = np.where(iris.target[iris.target > 0] == 1, -1, 1)
3 Models
3.1 Perceptron
3.1.1 Example runs from project description
This section contains code equivalent to that in the example run in the project documentation, to show that it meets the specifications.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from portfolio.perceptron import Perceptron, plot_decision_regions, IRIS_OPTIONS
from portfolio.visualize import Plot
Download and parse the dataset
df = pd.read_csv('https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data', **IRIS_OPTIONS)
Extract the first 100 labels (which are the first two types)
y = df.iloc[0:100, 4].values
y
Iris-setosa | Iris-setosa | Iris-setosa | Iris-setosa | Iris-setosa | Iris-setosa | Iris-setosa | Iris-setosa | Iris-setosa | Iris-setosa | Iris-setosa | Iris-setosa | Iris-setosa | Iris-setosa | Iris-setosa | Iris-setosa | Iris-setosa | Iris-setosa | Iris-setosa | Iris-setosa | Iris-setosa | Iris-setosa | Iris-setosa | Iris-setosa | Iris-setosa | Iris-setosa | Iris-setosa | Iris-setosa | Iris-setosa | Iris-setosa | Iris-setosa | Iris-setosa | Iris-setosa | Iris-setosa | Iris-setosa | Iris-setosa | Iris-setosa | Iris-setosa | Iris-setosa | Iris-setosa | Iris-setosa | Iris-setosa | Iris-setosa | Iris-setosa | Iris-setosa | Iris-setosa | Iris-setosa | Iris-setosa | Iris-setosa | Iris-setosa | Iris-versicolor | Iris-versicolor | Iris-versicolor | Iris-versicolor | Iris-versicolor | Iris-versicolor | Iris-versicolor | Iris-versicolor | Iris-versicolor | Iris-versicolor | Iris-versicolor | Iris-versicolor | Iris-versicolor | Iris-versicolor | Iris-versicolor | Iris-versicolor | Iris-versicolor | Iris-versicolor | Iris-versicolor | Iris-versicolor | Iris-versicolor | Iris-versicolor | Iris-versicolor | Iris-versicolor | Iris-versicolor | Iris-versicolor | Iris-versicolor | Iris-versicolor | Iris-versicolor | Iris-versicolor | Iris-versicolor | Iris-versicolor | Iris-versicolor | Iris-versicolor | Iris-versicolor | Iris-versicolor | Iris-versicolor | Iris-versicolor | Iris-versicolor | Iris-versicolor | Iris-versicolor | Iris-versicolor | Iris-versicolor | Iris-versicolor | Iris-versicolor | Iris-versicolor | Iris-versicolor | Iris-versicolor | Iris-versicolor | Iris-versicolor |
Since we only selected the first two, the classes are either Iris-setosa or Iris-versicolor. Label Iris-setosa as -1 and everything else as 1.
y = np.where(y == 'Iris-setosa', -1, 1)
y
-1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
The first and third features are separable, so select only those columns for X
X = df.iloc[0:100, [0, 2]].values
X
sepal length | petal length |
---|---|
5.1 | 1.4 |
4.9 | 1.4 |
4.7 | 1.3 |
4.6 | 1.5 |
5 | 1.4 |
5.4 | 1.7 |
4.6 | 1.4 |
5 | 1.5 |
4.4 | 1.4 |
4.9 | 1.5 |
5.4 | 1.5 |
4.8 | 1.6 |
4.8 | 1.4 |
4.3 | 1.1 |
5.8 | 1.2 |
5.7 | 1.5 |
5.4 | 1.3 |
5.1 | 1.4 |
5.7 | 1.7 |
5.1 | 1.5 |
5.4 | 1.7 |
5.1 | 1.5 |
4.6 | 1 |
5.1 | 1.7 |
4.8 | 1.9 |
5 | 1.6 |
5 | 1.6 |
5.2 | 1.5 |
5.2 | 1.4 |
4.7 | 1.6 |
4.8 | 1.6 |
5.4 | 1.5 |
5.2 | 1.5 |
5.5 | 1.4 |
4.9 | 1.5 |
5 | 1.2 |
5.5 | 1.3 |
4.9 | 1.5 |
4.4 | 1.3 |
5.1 | 1.5 |
5 | 1.3 |
4.5 | 1.3 |
4.4 | 1.3 |
5 | 1.6 |
5.1 | 1.9 |
4.8 | 1.4 |
5.1 | 1.6 |
4.6 | 1.4 |
5.3 | 1.5 |
5 | 1.4 |
7 | 4.7 |
6.4 | 4.5 |
6.9 | 4.9 |
5.5 | 4 |
6.5 | 4.6 |
5.7 | 4.5 |
6.3 | 4.7 |
4.9 | 3.3 |
6.6 | 4.6 |
5.2 | 3.9 |
5 | 3.5 |
5.9 | 4.2 |
6 | 4 |
6.1 | 4.7 |
5.6 | 3.6 |
6.7 | 4.4 |
5.6 | 4.5 |
5.8 | 4.1 |
6.2 | 4.5 |
5.6 | 3.9 |
5.9 | 4.8 |
6.1 | 4 |
6.3 | 4.9 |
6.1 | 4.7 |
6.4 | 4.3 |
6.6 | 4.4 |
6.8 | 4.8 |
6.7 | 5 |
6 | 4.5 |
5.7 | 3.5 |
5.5 | 3.8 |
5.5 | 3.7 |
5.8 | 3.9 |
6 | 5.1 |
5.4 | 4.5 |
6 | 4.5 |
6.7 | 4.7 |
6.3 | 4.4 |
5.6 | 4.1 |
5.5 | 4 |
5.5 | 4.4 |
6.1 | 4.6 |
5.8 | 4 |
5 | 3.3 |
5.6 | 4.2 |
5.7 | 4.2 |
5.7 | 4.2 |
6.2 | 4.3 |
5.1 | 3 |
5.7 | 4.1 |
Plotting the data, we can clearly see that these two features are separable.
plt.figure()
plt.scatter(X[:50, 0], X[:50, 1], color='red', marker='o', label='setosa')
plt.scatter(X[50:100, 0], X[50:100, 1], color='blue', marker='x', label='versicolor')
plt.xlabel('sepal length')
plt.ylabel('petal length')
plt.legend(loc='upper left')
Plot.embed(plt)
Initialize the perceptron with a learning rate of 0.1 and a maximum of 1000 iterations.
pn = Perceptron(0.1, 1000)
fit runs the perceptron algorithm on the given data. The number of errors per iteration is stored in errors. Since this only took 6 iterations to converge, it stopped early instead of running all 1000.
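For reference, here is a minimal sketch of the kind of training loop fit presumably runs (the function name and the exact Rosenblatt-style update are assumptions; the project's implementation may differ in detail):
import numpy as np

def perceptron_fit_sketch(X, y, eta=0.1, max_iter=1000):
    # Weights are stored as [bias, w_1, ..., w_d]; training stops after an error-free pass.
    w = np.zeros(1 + X.shape[1])
    errors_per_pass = []
    for _ in range(max_iter):
        errors = 0
        for xi, target in zip(X, y):
            prediction = 1 if (xi @ w[1:] + w[0]) >= 0.0 else -1
            update = eta * (target - prediction)  # zero when the sample is classified correctly
            w[1:] += update * xi
            w[0] += update
            errors += int(update != 0.0)
        errors_per_pass.append(errors)
        if errors == 0:  # converged: drop out before hitting max_iter
            break
    return w, errors_per_pass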
pn.fit(X, y)
pn.errors
2 | 2 | 3 | 2 | 1 | 0 |
This data can be seen plotted below.
plt.figure()
plt.plot(range(1, len(pn.errors) + 1), pn.errors, marker='o')
plt.xlabel('Iteration')
plt.ylabel('# of misclassifications')
Plot.embed(plt)
net_input gives the raw value of the decision function for each sample, before it is thresholded into a class label:
pn.net_input(X)
-1.32 | -1.184 | -1.23 | -0.798 | -1.252 | -0.978 | -0.98 | -1.07 | -0.844 | -1.002 | -1.342 | -0.752 | -1.116 | -1.322 | -2.16 | -1.546 | -1.706 | -1.32 | -1.182 | -1.138 | -0.978 | -1.138 | -1.708 | -0.774 | -0.206 | -0.888 | -0.888 | -1.206 | -1.388 | -0.684 | -0.752 | -1.342 | -1.206 | -1.592 | -1.002 | -1.616 | -1.774 | -1.002 | -1.026 | -1.138 | -1.434 | -1.094 | -1.026 | -0.888 | -0.41 | -1.116 | -0.956 | -0.98 | -1.274 | -1.252 | 3.394 | 3.438 | 3.826 | 3.14 | 3.552 | 3.914 | 3.87 | 2.274 | 3.484 | 3.162 | 2.57 | 3.232 | 2.8 | 4.006 | 2.344 | 3.052 | 3.982 | 3.118 | 3.574 | 2.89 | 4.324 | 2.732 | 4.234 | 4.006 | 3.074 | 3.12 | 3.712 | 4.144 | 3.71 | 2.094 | 2.776 | 2.594 | 2.754 | 4.802 | 4.118 | 3.71 | 3.598 | 3.324 | 3.254 | 3.14 | 3.868 | 3.824 | 2.936 | 2.206 | 3.436 | 3.368 | 3.368 | 3.21 | 1.592 | 3.186 |
Running it on the original data in order, we can see that it correctly classifies all of the samples.
pn.predict(X)
-1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
The learned weight vector:
pn.weight
-0.4 | -0.68 | 1.82 |
Here is a visualization of the computed decision boundary:
plot_decision_regions(X, y, pn)
Plot.embed(plt)
The other data I tested against:
First, data that is only barely separated. 701 iterations seems like a really big number, but the classes are close together, so I'm assuming it could be right.
X1 = df.iloc[0:100, [0, 1]].values
pn.fit(X1, y)
pn.errors.shape
701 |
plot_decision_regions(X1, y, pn)
Plot.embed(plt)
And data with a very wide separation. Here we can see it dropping out after only 3 iterations.
X2 = df.iloc[0:100, [2, 3]].values
pn.fit(X2, y)
pn.errors
2 | 2 | 0 |
plot_decision_regions(X2, y, pn)
Plot.embed(plt)
Close all the figures we opened:
plt.close('all')
3.2 Linear Regression
For single-dimensional data, linear regression is defined as \(w = A^{-1}b\), where \(A\) is \(xx^\mathsf{T}\) and \(b\) is \(yx\). Since the matrix may not be invertible, we take the pseudoinverse. Since I added a column of ones to the \(A\) matrix, \(w_0\) holds the y-intercept, and since the data is only a single dimension, \(w_1\) holds the slope of the line.
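As a minimal sketch of that closed form (the function name is illustrative and not the project's API; it assumes a 1-D feature array x and a target array y):
import numpy as np

def linear_regression_sketch(x, y):
    # Prepend a column of ones so w[0] is the intercept and w[1] is the slope.
    X = np.column_stack([np.ones_like(x), x])
    A = X.T @ X                   # A = sum of x_i x_i^T over the samples
    b = X.T @ y                   # b = sum of y_i x_i over the samples
    return np.linalg.pinv(A) @ b  # pseudoinverse, in case A is not invertible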
The linear regression code may be run on the iris dataset with the following:
from portfolio.linear_regression import LinearRegression
x_sepal_width = x[:, 1]
regression_plot = Plot(x_sepal_width, y)
regression_plot.add_view(LinearRegression)
regression_plot.embed()
3.3 Decision Stumps
Decision stumps are a type of weak learner. They consist of a decision tree with only a single node. My implementation only accepts a binary split of classes labeled -1 and 1, although in theory a decision stump can split based on threshold into any number of classes.
I couldn't figure out how to translate the pseudocode for the efficient \(O(dm)\) implementation, so this is the naïve \(O(dm^2)\) version. I also have a somewhat fuzzy understanding of the book's explanation, but this is equivalent as far as I can tell.
Since the decision stump needs a threshold \(\theta\) in order to classify everything on each side as a different class, we need to define a set of candidate thresholds. Given that two points may be arbitrarily close, I used the \(x\) values from the data themselves to ensure that if there was a single optimal \(\theta\) it would be selected.
The other part of the decision stump is the dimension of the data against which it will classify. This implementation considers all given dimensions and selects the most suitable one.
The last part is the error function. The error function I used is a count of the differences between the predicted class and the expected class.
For each of the combinations of dimensions and possible thresholds, the decision stump checks the error function. The one with the lowest error is selected as the dimension to classify against and the \(\theta\) to use as a threshold.
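A minimal sketch of that naïve \(O(dm^2)\) search (illustrative only: the function name, the sign handling, and the tie-breaking are assumptions, not the project's implementation):
import numpy as np

def decision_stump_sketch(X, y):
    # Try every (dimension, threshold) pair, using the data values themselves as thresholds.
    best_error, best_dim, best_theta, best_sign = np.inf, 0, 0.0, 1
    for dim in range(X.shape[1]):
        for theta in X[:, dim]:
            for sign in (1, -1):  # allow either class to sit above the threshold
                pred = np.where(X[:, dim] > theta, sign, -sign)
                error = np.count_nonzero(pred != y)  # count of misclassified samples
                if error < best_error:
                    best_error, best_dim, best_theta, best_sign = error, dim, theta, sign
    return best_dim, best_theta, best_sign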
Shown here is the decision boundary, along with the values plotted for the chosen dimension, with the label on the y-axis. As you can see, these are not separable, but the optimal boundary was still found.
from portfolio.decision_stumps import DecisionStumps
x_sepal_width = x[:, 1]
regression_plot = Plot(x, y)
regression_plot.add_view(DecisionStumps)
regression_plot.embed()
3.4 Support Vector Machine
3.4.1 Soft SVM
SVMs are linear classification models. Unlike the perceptron, SVMs optimize for the widest margin between the classes.
In addition, since this is a soft-margin SVM, it uses a hinge loss function. This allows it to accept data that is not linearly separable and still produce a reasonable classification boundary.
Optimization of this loss function is implemented via stochastic gradient descent (SGD).
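A minimal sketch of that kind of hinge-loss SGD (a generic Pegasos-style update; the function name, the regularization parameter lam, and the step-size schedule are assumptions and may differ from the project's svm.Soft):
import numpy as np

def soft_svm_sgd_sketch(X, y, lam=0.01, epochs=1000, seed=0):
    # Minimize lam/2 * ||w||^2 plus the hinge loss max(0, 1 - y_i (w.x_i + b)) by SGD.
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    b = 0.0
    for t in range(1, epochs + 1):
        eta = 1.0 / (lam * t)          # decreasing step size
        i = rng.integers(len(y))       # one randomly chosen sample per step
        if y[i] * (X[i] @ w + b) < 1:  # sample violates the margin: hinge gradient is active
            w = (1 - eta * lam) * w + eta * y[i] * X[i]
            b += eta * y[i]
        else:                          # sample is outside the margin: only the regularizer acts
            w = (1 - eta * lam) * w
    return w, b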
For example, this data from iris is not separable.
from portfolio import svm
y = df.iloc[50:, 4].values
y = np.where(y == 'Iris-versicolor', -1, 1)
X = df.iloc[50:, [0, 3]].values
np.c_[y, X]
label | sepal length | petal width |
---|---|---|
-1 | 7 | 1.4 |
-1 | 6.4 | 1.5 |
-1 | 6.9 | 1.5 |
-1 | 5.5 | 1.3 |
-1 | 6.5 | 1.5 |
-1 | 5.7 | 1.3 |
-1 | 6.3 | 1.6 |
-1 | 4.9 | 1 |
-1 | 6.6 | 1.3 |
-1 | 5.2 | 1.4 |
-1 | 5 | 1 |
-1 | 5.9 | 1.5 |
-1 | 6 | 1 |
-1 | 6.1 | 1.4 |
-1 | 5.6 | 1.3 |
-1 | 6.7 | 1.4 |
-1 | 5.6 | 1.5 |
-1 | 5.8 | 1 |
-1 | 6.2 | 1.5 |
-1 | 5.6 | 1.1 |
-1 | 5.9 | 1.8 |
-1 | 6.1 | 1.3 |
-1 | 6.3 | 1.5 |
-1 | 6.1 | 1.2 |
-1 | 6.4 | 1.3 |
-1 | 6.6 | 1.4 |
-1 | 6.8 | 1.4 |
-1 | 6.7 | 1.7 |
-1 | 6 | 1.5 |
-1 | 5.7 | 1 |
-1 | 5.5 | 1.1 |
-1 | 5.5 | 1 |
-1 | 5.8 | 1.2 |
-1 | 6 | 1.6 |
-1 | 5.4 | 1.5 |
-1 | 6 | 1.6 |
-1 | 6.7 | 1.5 |
-1 | 6.3 | 1.3 |
-1 | 5.6 | 1.3 |
-1 | 5.5 | 1.3 |
-1 | 5.5 | 1.2 |
-1 | 6.1 | 1.4 |
-1 | 5.8 | 1.2 |
-1 | 5 | 1 |
-1 | 5.6 | 1.3 |
-1 | 5.7 | 1.2 |
-1 | 5.7 | 1.3 |
-1 | 6.2 | 1.3 |
-1 | 5.1 | 1.1 |
-1 | 5.7 | 1.3 |
1 | 6.3 | 2.5 |
1 | 5.8 | 1.9 |
1 | 7.1 | 2.1 |
1 | 6.3 | 1.8 |
1 | 6.5 | 2.2 |
1 | 7.6 | 2.1 |
1 | 4.9 | 1.7 |
1 | 7.3 | 1.8 |
1 | 6.7 | 1.8 |
1 | 7.2 | 2.5 |
1 | 6.5 | 2 |
1 | 6.4 | 1.9 |
1 | 6.8 | 2.1 |
1 | 5.7 | 2 |
1 | 5.8 | 2.4 |
1 | 6.4 | 2.3 |
1 | 6.5 | 1.8 |
1 | 7.7 | 2.2 |
1 | 7.7 | 2.3 |
1 | 6 | 1.5 |
1 | 6.9 | 2.3 |
1 | 5.6 | 2 |
1 | 7.7 | 2 |
1 | 6.3 | 1.8 |
1 | 6.7 | 2.1 |
1 | 7.2 | 1.8 |
1 | 6.2 | 1.8 |
1 | 6.1 | 1.8 |
1 | 6.4 | 2.1 |
1 | 7.2 | 1.6 |
1 | 7.4 | 1.9 |
1 | 7.9 | 2 |
1 | 6.4 | 2.2 |
1 | 6.3 | 1.5 |
1 | 6.1 | 1.4 |
1 | 7.7 | 2.3 |
1 | 6.3 | 2.4 |
1 | 6.4 | 1.8 |
1 | 6 | 1.8 |
1 | 6.9 | 2.1 |
1 | 6.7 | 2.4 |
1 | 6.9 | 2.3 |
1 | 5.8 | 1.9 |
1 | 6.8 | 2.3 |
1 | 6.7 | 2.5 |
1 | 6.7 | 2.3 |
1 | 6.3 | 1.9 |
1 | 6.5 | 2 |
1 | 6.2 | 2.3 |
1 | 5.9 | 1.8 |
Running the soft SVM classifier on it, we get a reasonably accurate classification of the data.
plot = Plot(X, y)
plot.add_view(svm.Soft)
plot.embed()
3.5 K-Nearest Neighbor
The k-nearest neighbors classifier simply classifies a sample based on the closest training samples.
At least in this implementation, though also in general to a lesser extent, KNN tends to be very computationally intensive for large datasets, since it must compute the distance to every training point in order to find the nearest ones.
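A minimal sketch of that brute-force approach (Euclidean distance and a majority vote over k neighbors are assumptions here; the project's KNN class may choose k and break ties differently):
import numpy as np

def knn_predict_sketch(X_train, y_train, X_query, k=3):
    # Classify each query point by majority vote among its k nearest training points.
    predictions = []
    for q in np.atleast_2d(X_query):
        distances = np.linalg.norm(X_train - q, axis=1)   # distance to every training point
        nearest = np.argsort(distances)[:k]               # indices of the k closest points
        votes = np.asarray(y_train)[nearest]
        predictions.append(1 if votes.sum() > 0 else -1)  # majority vote for labels in {-1, 1}
    return np.array(predictions)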
from portfolio.knn import KNN
y = df.iloc[50:, 4].values
y = np.where(y == 'Iris-versicolor', -1, 1)
X = df.iloc[50:, [0, 3]].values
np.c_[y, X]
label | sepal length | petal width |
---|---|---|
-1 | 7 | 1.4 |
-1 | 6.4 | 1.5 |
-1 | 6.9 | 1.5 |
-1 | 5.5 | 1.3 |
-1 | 6.5 | 1.5 |
-1 | 5.7 | 1.3 |
-1 | 6.3 | 1.6 |
-1 | 4.9 | 1 |
-1 | 6.6 | 1.3 |
-1 | 5.2 | 1.4 |
-1 | 5 | 1 |
-1 | 5.9 | 1.5 |
-1 | 6 | 1 |
-1 | 6.1 | 1.4 |
-1 | 5.6 | 1.3 |
-1 | 6.7 | 1.4 |
-1 | 5.6 | 1.5 |
-1 | 5.8 | 1 |
-1 | 6.2 | 1.5 |
-1 | 5.6 | 1.1 |
-1 | 5.9 | 1.8 |
-1 | 6.1 | 1.3 |
-1 | 6.3 | 1.5 |
-1 | 6.1 | 1.2 |
-1 | 6.4 | 1.3 |
-1 | 6.6 | 1.4 |
-1 | 6.8 | 1.4 |
-1 | 6.7 | 1.7 |
-1 | 6 | 1.5 |
-1 | 5.7 | 1 |
-1 | 5.5 | 1.1 |
-1 | 5.5 | 1 |
-1 | 5.8 | 1.2 |
-1 | 6 | 1.6 |
-1 | 5.4 | 1.5 |
-1 | 6 | 1.6 |
-1 | 6.7 | 1.5 |
-1 | 6.3 | 1.3 |
-1 | 5.6 | 1.3 |
-1 | 5.5 | 1.3 |
-1 | 5.5 | 1.2 |
-1 | 6.1 | 1.4 |
-1 | 5.8 | 1.2 |
-1 | 5 | 1 |
-1 | 5.6 | 1.3 |
-1 | 5.7 | 1.2 |
-1 | 5.7 | 1.3 |
-1 | 6.2 | 1.3 |
-1 | 5.1 | 1.1 |
-1 | 5.7 | 1.3 |
1 | 6.3 | 2.5 |
1 | 5.8 | 1.9 |
1 | 7.1 | 2.1 |
1 | 6.3 | 1.8 |
1 | 6.5 | 2.2 |
1 | 7.6 | 2.1 |
1 | 4.9 | 1.7 |
1 | 7.3 | 1.8 |
1 | 6.7 | 1.8 |
1 | 7.2 | 2.5 |
1 | 6.5 | 2 |
1 | 6.4 | 1.9 |
1 | 6.8 | 2.1 |
1 | 5.7 | 2 |
1 | 5.8 | 2.4 |
1 | 6.4 | 2.3 |
1 | 6.5 | 1.8 |
1 | 7.7 | 2.2 |
1 | 7.7 | 2.3 |
1 | 6 | 1.5 |
1 | 6.9 | 2.3 |
1 | 5.6 | 2 |
1 | 7.7 | 2 |
1 | 6.3 | 1.8 |
1 | 6.7 | 2.1 |
1 | 7.2 | 1.8 |
1 | 6.2 | 1.8 |
1 | 6.1 | 1.8 |
1 | 6.4 | 2.1 |
1 | 7.2 | 1.6 |
1 | 7.4 | 1.9 |
1 | 7.9 | 2 |
1 | 6.4 | 2.2 |
1 | 6.3 | 1.5 |
1 | 6.1 | 1.4 |
1 | 7.7 | 2.3 |
1 | 6.3 | 2.4 |
1 | 6.4 | 1.8 |
1 | 6 | 1.8 |
1 | 6.9 | 2.1 |
1 | 6.7 | 2.4 |
1 | 6.9 | 2.3 |
1 | 5.8 | 1.9 |
1 | 6.8 | 2.3 |
1 | 6.7 | 2.5 |
1 | 6.7 | 2.3 |
1 | 6.3 | 1.9 |
1 | 6.5 | 2 |
1 | 6.2 | 2.3 |
1 | 5.9 | 1.8 |
Construct the classifier from the training data, then classify a couple of example points:
knn = KNN(X, y)
knn.predict([[0, 0]]).item()
-1 |
knn.predict([[3, 8]]).item()
1 |
Given the same data as the SVM example, we can see that KNN is able to classify this data entirely, as opposed to the lossy linear classifier. This isn't necessarily an argument in its favor, since it doesn't take into account the possibility of overfitting, which KNN is always susceptible to. For certain datasets, including this one, the performance is very good.
from portfolio import visualize
fig = plt.figure()
visualize.plot_decision_regions(fig.add_subplot(), X, y, knn)
Plot.embed(fig)