Image Classification with YOLOv8

Siddheshwar Harkal
5 min read · Aug 26, 2023


In January 2023, Ultralytics released YOLOv8 (You Only Look Once), a new state-of-the-art (SOTA) computer vision model.

YOLO is primarily designed for object detection, which involves identifying and localizing objects within an image. YOLOv8 is designed to be fast, accurate, and easy to use, which makes it a good choice for a wide range of tasks: object detection and tracking, instance segmentation, image classification, and pose estimation.

YOLOv8 Layout

The paper has not been published yet, but the creators of YOLOv8 have promised that it will come out soon. YOLOv8 builds on the success of previous versions of the model by introducing new features and improvements for better performance, and it is built and trained with the PyTorch framework. The YOLO family belongs to the one-stage object detection models, which process an entire image in a single forward pass of a convolutional neural network (CNN).

In this tutorial, we look specifically at how to solve image classification problems using YOLOv8. The classification models are pre-trained on the ImageNet dataset at an image resolution of 224×224 pixels.

Image Classification

I took the dataset from Kaggle. The objective of this challenge is to build an image classification model that accurately classifies sports-related images. There are around 8,000 images for training and 2,000 for testing, drawn from seven sports: cricket, wrestling, tennis, badminton, soccer, swimming, and karate.

In the next section, we will cover how to access YOLO via Python.

Install Ultralytics with pip and get up and running in minutes

!pip install ultralytics

Dataset format

Classification datasets in torchvision typically follow the standard folder structure shown below.

Data/
|-- class1/
|   |-- img1.jpg
|   |-- img2.jpg
|   |-- ...
|
|-- class2/
|   |-- img1.jpg
|   |-- img2.jpg
|   |-- ...
|
|-- class3/
|   |-- img1.jpg
|   |-- img2.jpg
|   |-- ...
|
|-- ...

In this folder structure, the Data directory contains one subdirectory for each class in the dataset. Each subdirectory is named after its class and contains all the images for that class. Each image file has a unique name and is typically in a common image format such as JPEG or PNG. We then split the dataset into train, validation, and test sets, each in its own directory with the same per-class layout, as shown below.

Data/
|
|-- train/
|   |-- cricket/
|   |   |-- 10008_cricket.png
|   |   |-- 10009_cricket.png
|   |   |-- ...
|   |
|   |-- wrestling/
|   |   |-- 1000_wrestling.png
|   |   |-- 1001_wrestling.png
|   |   |-- ...
|   |
|   |-- tennis/
|   |   |-- 10014_tennis.png
|   |   |-- 10015_tennis.png
|   |   |-- ...
|   |
|   |-- ...
|
|-- val/
|   |-- cricket/
|   |   |-- 10_cricket.png
|   |   |-- 11_cricket.png
|   |   |-- ...
|   |
|   |-- wrestling/
|   |   |-- 100_wrestling.png
|   |   |-- 101_wrestling.png
|   |   |-- ...
|   |
|   |-- tennis/
|   |   |-- 1000_tennis.png
|   |   |-- 1001_tennis.png
|   |   |-- ...
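The split itself is not shown in this article. As a rough sketch of one way to produce the train/val layout above, the following uses only the Python standard library. The helper name `split_dataset`, the 80/20 ratio, and the fixed seed are assumptions made for illustration, not the procedure actually used for this dataset.

```python
import random
import shutil
from pathlib import Path

IMAGE_EXTS = {".jpg", ".jpeg", ".png"}


def split_dataset(src_dir, dst_dir, val_fraction=0.2, seed=0):
    """Copy a class-per-folder dataset into dst_dir/train and dst_dir/val.

    Hypothetical helper: the ratio and seed are illustrative choices.
    Returns {class_name: (n_train, n_val)}.
    """
    src, dst = Path(src_dir), Path(dst_dir)
    rng = random.Random(seed)
    counts = {}
    for class_dir in sorted(p for p in src.iterdir() if p.is_dir()):
        images = sorted(f for f in class_dir.iterdir()
                        if f.suffix.lower() in IMAGE_EXTS)
        rng.shuffle(images)  # shuffle so the split is not ordered by filename
        n_val = int(len(images) * val_fraction)
        splits = {"val": images[:n_val], "train": images[n_val:]}
        for split_name, files in splits.items():
            out_dir = dst / split_name / class_dir.name
            out_dir.mkdir(parents=True, exist_ok=True)
            for f in files:
                shutil.copy2(f, out_dir / f.name)
        counts[class_dir.name] = (len(images) - n_val, n_val)
    return counts
```

Calling `split_dataset("Data", "/kaggle/working/data")` would write the split to a separate output folder and leave the original images untouched. Keeping the output folder outside the source folder avoids mixing the copies with the originals.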

Train

We trained a CNN model on the sports dataset for 5 epochs and got good accuracy. The model can also be trained with other hyperparameters, such as image size, learning rate, device, batch size, weight decay, and optimizer, depending on whether you train on CPU or GPU.

Before training, we create an account on Weights & Biases to track, monitor, and visualize the model metadata and metrics.

You can use the following code snippet:

# import YOLO model
from ultralytics import YOLO

# Load a model
model = YOLO('yolov8n-cls.pt') # load a pretrained model (recommended for training)

# Train the model
model.train(data='/kaggle/working/data', epochs=5)
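The hyperparameters mentioned earlier can be passed to `model.train()` as keyword arguments. The sketch below collects them in a small helper; the helper names `build_train_args` and `run_training` are hypothetical, and the values are illustrative placeholders rather than the settings used for the results in this article.

```python
def build_train_args(data_dir, epochs=5):
    """Collect keyword arguments for model.train(); values are illustrative."""
    return {
        "data": data_dir,        # root folder containing train/ and val/
        "epochs": epochs,
        "imgsz": 224,            # input image size
        "batch": 32,             # batch size
        "lr0": 0.01,             # initial learning rate
        "weight_decay": 0.0005,
        "optimizer": "SGD",
        "device": "cpu",         # e.g. 0 to use the first GPU instead
    }


def run_training(data_dir="/kaggle/working/data"):
    # Imported here so build_train_args works without ultralytics installed.
    from ultralytics import YOLO

    model = YOLO("yolov8n-cls.pt")
    return model.train(**build_train_args(data_dir))
```

Keeping the arguments in one dictionary makes it easy to log them alongside the run or to change a single value between experiments.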

After training, YOLOv8 creates a runs directory containing model metrics as PNG files, such as the confusion matrix, and a results.csv file with the training and validation loss, accuracy, and learning rate for each epoch.
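To find the best epoch without opening the plots, results.csv can be read with the standard `csv` module. This is a sketch under assumptions: the column name `metrics/accuracy_top1` and the path `runs/classify/train/results.csv` may differ between Ultralytics versions and runs, so check the header of your own file first.

```python
import csv
import io


def best_epoch(csv_text, column="metrics/accuracy_top1"):
    """Return (epoch, value) for the row with the highest value in `column`.

    The column name is an assumption; check the header of your results.csv.
    """
    reader = csv.reader(io.StringIO(csv_text))
    # Header names may be padded with spaces, so strip them before matching.
    header = [name.strip() for name in next(reader)]
    epoch_idx, col_idx = header.index("epoch"), header.index(column)
    rows = [row for row in reader if row]
    best = max(rows, key=lambda r: float(r[col_idx]))
    return int(float(best[epoch_idx])), float(best[col_idx])
```

For example, `best_epoch(open("runs/classify/train/results.csv").read())` would return the epoch number and its top-1 accuracy, assuming the file lives at that path.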

The training results of the model are shown below.

Validate

# Validate the model
metrics = model.val() # no arguments needed, dataset and settings remembered
metrics.top1 # top1 accuracy
metrics.top5 # top5 accuracy

Val mode is used to validate a YOLOv8 model after it has been trained. In this mode, the model is evaluated on a validation set to measure its accuracy and generalization performance. Here we get a top-1 accuracy of 94% and a top-5 accuracy of 99%.
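For readers unfamiliar with the two metrics: top-k accuracy counts a prediction as correct when the true class is among the k highest-scoring classes. The plain-Python sketch below illustrates the definition on made-up scores; it is not how Ultralytics computes the metric internally.

```python
def topk_accuracy(prob_rows, labels, k=1):
    """Fraction of samples whose true label is among the k highest scores."""
    hits = 0
    for probs, label in zip(prob_rows, labels):
        ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Toy scores for three images over four classes (made-up numbers).
scores = [
    [0.7, 0.1, 0.1, 0.1],  # true class 0: top-1 hit
    [0.3, 0.4, 0.2, 0.1],  # true class 0: top-1 miss, top-2 hit
    [0.1, 0.2, 0.3, 0.4],  # true class 0: miss for any k up to 3
]
labels = [0, 0, 0]
print(topk_accuracy(scores, labels, k=1))
print(topk_accuracy(scores, labels, k=2))
```

With only seven classes in this dataset, top-5 accuracy is naturally high, so top-1 is the more informative of the two numbers.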

Inference

results = model.predict("/kaggle/working/data/test/Cricket/704bb73ae1.jpg")
result = results[0] # predict returns a list with one Results object per image
probs = result.probs # Probs object for classification outputs
print(probs.data)

# Output
tensor([1.4597e-05, 9.9976e-01, 9.4075e-07, 9.4564e-05, 4.1166e-06, 9.2344e-05, 3.5393e-05], device='cuda:0')

After making a prediction on the test image, the model returns a list of Results objects, each with boxes, masks, keypoints, and probs attributes. The boxes attribute is used for object detection, while probs is used for image classification.
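The raw tensor is easier to read once each probability is paired with a class name. The sketch below does this in plain Python with the values printed above. It assumes that class indices follow the alphabetical order of the folder names, which is an assumption about this dataset; the helper `top_k` is hypothetical.

```python
# Values copied from the printed tensor above.
probs = [1.4597e-05, 9.9976e-01, 9.4075e-07, 9.4564e-05,
         4.1166e-06, 9.2344e-05, 3.5393e-05]

# Assumption: class indices follow the alphabetical order of the folder names.
class_names = sorted(["cricket", "wrestling", "tennis", "badminton",
                      "soccer", "swimming", "karate"])


def top_k(probs, names, k=3):
    """Return the k most likely (name, probability) pairs, best first."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(names[i], probs[i]) for i in order[:k]]


best_name, best_prob = top_k(probs, class_names, k=1)[0]
print(best_name, round(best_prob, 4))
```

Under that ordering assumption, index 1 maps to cricket, which matches the folder the test image came from. With Ultralytics itself, `result.probs.top1` gives the winning index and `result.names` maps indices to class names, so the mapping does not need to be rebuilt by hand.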

Applying the trained model to a test image yielded remarkably accurate results, considering the limited amount of data used for fine-tuning. The outcome of the model predictions is shown below.

Conclusion

In this article, we gave an overview of the YOLOv8 model and a step-by-step guide to using it for image classification. In addition, Roboflow and Ultralytics provide an excellent platform for building, annotating, and training YOLOv8 models.


If you found this article insightful, follow me on LinkedIn and Medium. Your support and engagement will greatly motivate me to create more valuable content.

Please feel free to comment if you have any questions 🙂
