Rohit chopra

Machine Learning: KNN Model

Introduction to K-Nearest Neighbors (KNN) Algorithm in Machine Learning:

K-Nearest Neighbors (KNN) is one of the simplest and most popular machine learning algorithms for classification and regression tasks. It is a supervised learning algorithm: to predict the output for a new input data point, it finds the K nearest data points in the training data. For classification, the prediction is the majority class among those K neighbors; for regression, it is typically the average of their target values.
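The classification case is worked through later in this post. For the regression case, a minimal sketch is shown here, assuming scikit-learn's KNeighborsRegressor with its default uniform weighting; the data and the names X_reg, y_reg, and reg are made up for illustration.

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

# Made-up 1-D data: the target is simply 10 times the feature value.
X_reg = np.array([[1.0], [2.0], [3.0], [10.0]])
y_reg = np.array([10.0, 20.0, 30.0, 100.0])

# With K=3 and uniform weights, the prediction is the plain average
# of the targets of the 3 nearest training points.
reg = KNeighborsRegressor(n_neighbors=3).fit(X_reg, y_reg)
pred_reg = reg.predict(np.array([[2.2]]))

# The 3 nearest neighbors of 2.2 are 2.0, 3.0 and 1.0,
# so the prediction is (20 + 30 + 10) / 3.
print(pred_reg[0])
```

The distant point at 10.0 does not influence the prediction because it is not among the 3 nearest neighbors.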

The KNN algorithm is used in fields such as image recognition, speech recognition, recommender systems, and natural language processing.

How does KNN Algorithm Work?

The KNN algorithm works by computing the distance between the new input data point and every point in the training data, selecting the K nearest points, and predicting the output based on the majority class of those K nearest neighbors. The distance is calculated using a distance metric such as Euclidean distance, Manhattan distance, or Hamming distance (the last one is typically used for categorical features).
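To make the procedure concrete, here is a minimal from-scratch sketch of the three distance metrics and the majority vote, assuming NumPy arrays as input. The function names (euclidean, manhattan, hamming, knn_predict) and the toy data are hypothetical and chosen only for illustration; this is not how scikit-learn implements the algorithm internally.

```python
from collections import Counter

import numpy as np


def euclidean(a, b):
    # Straight-line distance: square root of the summed squared differences.
    return float(np.sqrt(np.sum((a - b) ** 2)))


def manhattan(a, b):
    # Sum of the absolute differences along each feature.
    return float(np.sum(np.abs(a - b)))


def hamming(a, b):
    # Number of positions at which the two vectors differ.
    return int(np.sum(a != b))


def knn_predict(X_train, y_train, x_new, k=3, distance=euclidean):
    # Distance from the new point to every training point.
    distances = [distance(x, x_new) for x in X_train]
    # Indices of the k smallest distances.
    nearest = np.argsort(distances)[:k]
    # Majority vote among the labels of those k neighbors.
    votes = Counter(int(y_train[i]) for i in nearest)
    return votes.most_common(1)[0][0]


# Tiny made-up dataset: two clusters labeled 0 and 1.
X_toy = np.array([[1.0, 1.0], [1.5, 2.0], [2.0, 1.5],
                  [6.0, 6.0], [6.5, 7.0], [7.0, 6.5]])
y_toy = np.array([0, 0, 0, 1, 1, 1])

pred_toy = knn_predict(X_toy, y_toy, np.array([2.0, 2.0]), k=3)
print(pred_toy)  # 0, because the 3 nearest points all belong to class 0
```

This brute-force version compares the new point against every training point, which is fine for small datasets but slow for large ones.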

The value of K determines the number of neighbors to consider while making a prediction. With a small value of K, the prediction depends on only a few points, so the algorithm is sensitive to noise in the data and tends to overfit. With a large value of K, the prediction is more stable and more generalized, but it can blur the boundaries between classes and miss the local structure of the data.
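The sensitivity to noise can be illustrated with a small sketch, assuming scikit-learn's KNeighborsClassifier and a made-up one-dimensional dataset in which a single point is deliberately mislabeled. The names X_noise, y_noise, and query are hypothetical.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Made-up data: class 0 on the left, class 1 on the right, plus one
# mislabeled ("noisy") class-1 point at 2.1 inside the class-0 region.
X_noise = np.array([[1.0], [1.5], [2.0], [2.5], [3.0],
                    [2.1], [7.0], [7.5], [8.0], [8.5]])
y_noise = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])

# A query point that sits right next to the noisy point.
query = np.array([[2.15]])

predictions_by_k = {}
for k_value in (1, 5):
    clf = KNeighborsClassifier(n_neighbors=k_value).fit(X_noise, y_noise)
    predictions_by_k[k_value] = int(clf.predict(query)[0])
    print(f"K={k_value}: predicted class {predictions_by_k[k_value]}")

# K=1 follows the single noisy neighbor and predicts class 1.
# K=5 is outvoted by the four surrounding class-0 points and predicts class 0.
```

The opposite failure also exists: if K grows toward the size of the training set, every prediction drifts toward the overall majority class regardless of where the query point lies.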

KNN Algorithm in Python:

In this section, we will implement the KNN algorithm in Python using the scikit-learn library. We will use the iris dataset, which is a classic machine learning dataset consisting of 150 observations of iris flowers from three species (50 of each), with 4 features: sepal length, sepal width, petal length, and petal width.

Step 1: Import the Required Libraries

We will start by importing the required libraries, which are NumPy and scikit-learn. NumPy is imported by convention; the steps below only call scikit-learn directly.


import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

Step 2: Load the Iris Dataset

Next, we will load the iris dataset using the load_iris() function from scikit-learn.


iris = load_iris()
X = iris.data
y = iris.target
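Before going further, it can help to look at what was loaded. The following is a small sketch using attributes of the object returned by load_iris(); the variable name iris_check is chosen here only to avoid clashing with the names used in the post.

```python
import numpy as np
from sklearn.datasets import load_iris

iris_check = load_iris()

# Feature matrix: 150 observations, 4 features each.
print(iris_check.data.shape)            # (150, 4)

# Names of the 4 features and of the 3 classes (species).
print(iris_check.feature_names)
print(iris_check.target_names)          # ['setosa' 'versicolor' 'virginica']

# Number of observations per class: the dataset is balanced.
print(np.bincount(iris_check.target))   # [50 50 50]
```

The target values 0, 1, and 2 are integer codes for the species names listed in target_names.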

Step 3: Split the Data into Training and Test Sets

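We will now split the data into training and test sets using the train_test_split() function from scikit-learn. We will use 70% of the data for training and 30% for testing. The random_state argument fixes the random shuffle so that the same split is produced on every run.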


X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=1)
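A quick sanity check on the split is sketched below. The stratify=y variant is not used in this post; it is shown only as an optional assumption for cases where each class should keep the same proportion in the training and test sets.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# Same split as in the post: 30% of 150 observations go to the test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1
)
print(X_train.shape, X_test.shape)  # (105, 4) (45, 4)

# Class counts in the test set; a purely random split need not be balanced.
print(np.bincount(y_test, minlength=3))

# Optional variant (an assumption, not part of the post): stratify=y keeps
# the class proportions of y in both the training and the test set.
_, _, _, y_test_strat = train_test_split(
    X, y, test_size=0.3, random_state=1, stratify=y
)
print(np.bincount(y_test_strat, minlength=3))
```

With a dataset as small as iris, an unbalanced test set can noticeably shift the reported accuracy, which is why stratification is a common default.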

Step 4: Train the KNN Model

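We will now fit the KNN model on the training data using the KNeighborsClassifier class from scikit-learn. We will set the value of K to 3 and use the Euclidean distance metric. Note that KNN has no real training phase: fit() mainly stores the training data, and the distance computations happen when predictions are made.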


k = 3
model = KNeighborsClassifier(n_neighbors=k, metric='euclidean')
model.fit(X_train, y_train)
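K = 3 is a choice rather than something derived from the data. One common way to compare candidate values is cross-validation on the training set, sketched below with scikit-learn's cross_val_score. The candidate list and the names cv_scores and best_k are assumptions made for this example.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1
)

# Mean 5-fold cross-validated accuracy for a handful of K values.
# Only the training data is used, so the test set stays untouched.
cv_scores = {}
for k_candidate in (1, 3, 5, 7, 9, 11, 15, 21):
    candidate = KNeighborsClassifier(n_neighbors=k_candidate, metric='euclidean')
    cv_scores[k_candidate] = cross_val_score(candidate, X_train, y_train, cv=5).mean()

for k_candidate, score in cv_scores.items():
    print(f"K={k_candidate:2d}  mean CV accuracy={score:.3f}")

# The K with the highest mean accuracy in this particular sweep.
best_k = max(cv_scores, key=cv_scores.get)
print("Best K in this sweep:", best_k)
```

On a dataset this small the differences between neighboring K values are often within noise, so the result is a rough guide rather than a definitive answer.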

Step 5: Make Predictions on the Test Data

We will now use the trained KNN model to make predictions on the test data.


y_pred = model.predict(X_test)
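The predictions are integer class codes. The sketch below repeats the steps above so that it runs on its own, then maps the codes back to species names and inspects the neighbor vote with predict_proba(). The names predicted_species and proba are chosen for this example.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=1
)
model = KNeighborsClassifier(n_neighbors=3, metric='euclidean')
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Map the integer codes (0, 1, 2) back to species names.
predicted_species = iris.target_names[y_pred]
actual_species = iris.target_names[y_test]
for predicted, actual in list(zip(predicted_species, actual_species))[:5]:
    print(f"predicted={predicted:<12} actual={actual}")

# With uniform weights, each row is the fraction of the 3 nearest
# neighbors that belong to each class.
proba = model.predict_proba(X_test[:5])
print(proba)
```

A row such as [0, 2/3, 1/3] would mean that two of the three nearest neighbors were of the second class and one of the third.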

Step 6: Evaluate the Model

We will evaluate the performance of the KNN model using the accuracy score, which is the fraction of correctly classified instances out of the total number of instances.


accuracy = accuracy_score(y_test, y_pred)
print('Accuracy:', accuracy)

Output:


Accuracy: 0.9777777777777777
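Accuracy is a single number and does not show which classes were confused with each other. A confusion matrix and a per-class report give that breakdown. The sketch below repeats the earlier steps so that it runs on its own, and uses confusion_matrix() and classification_report() from scikit-learn; the variable name cm is chosen for this example.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=1
)
model = KNeighborsClassifier(n_neighbors=3, metric='euclidean')
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Rows are the true classes, columns are the predicted classes.
# Off-diagonal entries are misclassified test observations.
cm = confusion_matrix(y_test, y_pred, labels=[0, 1, 2])
print(cm)

# Precision, recall and F1 score for each species.
print(classification_report(
    y_test, y_pred, labels=[0, 1, 2], target_names=iris.target_names
))

# The accuracy equals the diagonal sum divided by the total count.
print(cm.trace() / cm.sum(), accuracy_score(y_test, y_pred))
```

With only 45 test observations, a single misclassification changes the accuracy by more than two percentage points, so small differences between runs or settings should be read with care.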