Vehicle logo recognition using histograms of oriented gradient descriptor and sparsity score

ABSTRACT


INTRODUCTION
How can a brand be identified and visually distinguished from others? Each brand has its own logo, which represents a trademark and carries a meaning that symbolizes the brand and its manufacturer. Creating an impressive logo requires attention to many details, including layout, colors, lines, and angles, and all information must be arranged in a coherent and harmonious way. Traditional vehicle recognition systems identify vehicles through manual human observation of the license plate or the vehicle model. Thus, automatic vehicle identification is a key problem in intelligent transportation systems. Each vehicle has a unique license plate, but it is difficult to track and identify because license plate images are of very low quality in smart surveillance systems.
Various works have been proposed for vehicle logo recognition (VLR) in the past. We briefly review several works in this field. For example, Llorca et al. [1] apply histograms of oriented gradients (HOG) and a support vector machine (SVM) classifier for vehicle manufacturer recognition. A region of interest is applied before extracting HOG features from logo images. Huang et al. [2] propose a system for car logo segmentation and recognition based on an efficient pre-trained convolutional neural network (CNN) model. Huan et al. [3] present an algorithm based on the Hough transform and deep learning for the vehicle logo retrieval task, which combines shape detection and deep belief networks. Pan et al. [4]  images. An enhanced logo recognition system is presented by Psyllos et al. [5] on the Medialab license plate recognition (LPR) dataset. Huang et al. [6] apply the Faster-RCNN model with two different CNNs (VGG-16 and ResNet-50) for vehicle logo recognition. Sotheeswaran and Ramanan [7] present a study that focuses on local features describing the structural characteristics of a car logo using a coarse-to-fine strategy. Nie et al. [8] present a new VLR method based on a foreground-background pixel-pair feature. Different hand-crafted descriptors (HOG, LBP and SIFT) are applied to extract features. Cyganek and Woźniak [9] use ensemble classifiers based on higher-order singular value decomposition to classify vehicle logos. More recently, Zhao and Wang [10] introduce a modified version of the Hu invariant moments to represent car logo images.
Indeed, many machine learning problems in computer vision and several related domains need to deal with very high dimensional data. Many of the features may not be relevant for the final prediction task and degrade the classification performance. Multiple studies have shown that the classification performance can be improved by eliminating these features. These issues can be addressed by dimensionality reduction, which maps the data to a low dimensional space either by feature extraction or by feature selection. Feature extraction refers to methods that create a set of new features from linear or non-linear combinations of the original features. Further analysis is then problematic, since the features in the transformed space have no physical meaning. Examples of feature extraction methods include principal component analysis (PCA) [11] and locality preserving projections (LPP) [12].
In contrast, feature selection methods aim at finding adequate subsets of features by keeping some of the original features, and therefore maintain the physical meaning of the features. Both kinds of methods have the advantage of improving classification performance and increasing computational efficiency. Recently, feature selection has gained increasing interest in the fields of machine learning [13][14][15][16] and data analysis [17][18][19], and has been successfully applied in computer vision tasks such as information retrieval [20][21][22] and visual object tracking [23][24][25]. In this work, we focus on the application of feature selection methods based on the sparsity score to vehicle logo image classification. This paper is organized as follows. Section 2 introduces the feature extraction methods based on three local image descriptors. Sections 2 and 3 present the proposed approach and the experimental results. Finally, the conclusion is discussed in section 4.

THE FEATURE EXTRACTION AND SELECTION

2.1. Histograms of oriented gradient descriptor
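The histograms of oriented gradient (HOG) descriptor has been applied to different problems in machine vision [26][27][28][29][30][31][32]. The HOG feature is extracted by counting the occurrences of gradient orientations, based on the gradient angle and the gradient magnitude, in local patches of an image. The gradient angle and magnitude at each pixel are computed within an 8 × 8 pixel patch. Next, the 64 gradient vectors are distributed over 9 angular bins covering 0-180° (20° each). The gradient magnitude $M$ and angle $\theta$ at each position $(w, h)$ of an image $I$ are computed as follows:

$$G_x(w, h) = I(w + 1, h) - I(w - 1, h) \qquad (1)$$

$$G_y(w, h) = I(w, h + 1) - I(w, h - 1) \qquad (2)$$

$$M(w, h) = \sqrt{G_x(w, h)^2 + G_y(w, h)^2} \qquad (3)$$

$$\theta(w, h) = \arctan\left(\frac{G_y(w, h)}{G_x(w, h)}\right) \qquad (4)$$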
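The per-cell step described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the implementation used in the experiments: the function name `cell_histogram` is hypothetical, gradients come from `np.gradient` (central differences scaled by 1/2, one-sided at the borders), and each pixel casts a hard vote into a single bin, whereas common HOG implementations interpolate votes between neighbouring bins.

```python
import numpy as np

def cell_histogram(cell, n_bins=9):
    """Unsigned-orientation histogram of one grayscale cell (e.g. 8 x 8 pixels)."""
    cell = np.asarray(cell, dtype=float)
    # np.gradient returns derivatives along axis 0 (rows) then axis 1 (columns).
    gy, gx = np.gradient(cell)
    magnitude = np.hypot(gx, gy)
    # Fold angles into [0, 180) so that opposite directions share a bin.
    angle = np.degrees(np.arctan2(gy, gx)) % 180.0
    bin_width = 180.0 / n_bins  # 20 degrees for 9 bins
    # Guard against a rounding artefact that could yield exactly 180.0.
    bins = np.minimum((angle // bin_width).astype(int), n_bins - 1)
    # Each pixel votes for its bin with a weight equal to its gradient magnitude.
    return np.bincount(bins.ravel(), weights=magnitude.ravel(), minlength=n_bins)
```

For a cell whose intensity increases by one per column, every pixel has a unit gradient at 0°, so the first bin receives the whole vote mass of 64 and the other eight bins stay empty.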

Feature selection
Based on the availability of supervised information (i.e., labels), feature selection techniques can be grouped into two broad categories: supervised and unsupervised [33]. Additionally, different feature selection strategies have been proposed according to the evaluation process, such as filter, wrapper, and hybrid methods [34]. Hybrid approaches incorporate both filter and wrapper methods into a single structure to give an effective solution for dimensionality reduction [35].
Liu et al. extend the unsupervised sparsity score to the supervised context by utilizing the class label information [36,37]. After the score has been calculated for each feature, the features are sorted in ascending order of SparseScore to select the relevant ones. In their classification experiments, Liu et al. demonstrated that this score outperforms other methods in most cases, especially for multi-class problems [36].
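The ranking step can be illustrated with a short sketch. The score used below is an assumption: it takes the squared error made when each instance is reconstructed from the other instances of its class through a precomputed sparse similarity matrix, which is one common reconstruction-based form and is not confirmed as the exact definition used in this paper. The names `reconstruction_score`, `select_features`, and `similarity_by_class` are hypothetical, and the construction of the similarity matrices themselves is left outside the sketch.

```python
import numpy as np

def reconstruction_score(X, y, similarity_by_class):
    """Assumed score: squared within-class reconstruction error per feature.

    X: (n_samples, n_features) feature matrix.
    y: (n_samples,) class labels.
    similarity_by_class: dict mapping a label to an (n_c, n_c) similarity
        matrix for the n_c instances of that class (hypothetical input).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    scores = np.zeros(X.shape[1])
    for label, S in similarity_by_class.items():
        Xc = X[y == label]                      # instances of this class
        residual = Xc - np.asarray(S) @ Xc      # instance minus its reconstruction
        scores += (residual ** 2).sum(axis=0)   # accumulate the error per feature
    return scores

def select_features(scores, ratio):
    """Indices of the lowest-scoring features, keeping a fraction `ratio`."""
    n_keep = max(1, int(round(ratio * len(scores))))
    # Ascending sort: smaller scores are treated as more relevant.
    return np.argsort(scores, kind="stable")[:n_keep]
```

Because lower scores are ranked first, a feature that is perfectly reconstructed within every class is always selected before one that varies arbitrarily inside a class.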

Experimental setup
Although the vehicle logo recognition problem has been studied for many years, few datasets are publicly available to the computer vision community. A few datasets are used for logo detection, such as the vehicle logo dataset with 30 classes, namely VLD-30 [38]. Table 1 analyses the existing vehicle logo datasets. The first column indicates the dataset name and its reference. The second column shows the availability of predefined training and testing sets. This information is important for comparing classification results, because other researchers can use this decomposition to train and test their models instead of using cross-validation methods. The next columns give the number of logos, the total number of images, and their resolution. The last column shows the availability of the corresponding dataset.
The drawbacks of these datasets are the lack of a predefined decomposition into training and testing sets and the fact that images are resized to the same resolution. To this end, we collected and organized a large-scale and comprehensive image database called VLR-40. This dataset contains images that were taken by different users in unconstrained conditions. The data was gathered by crawling web pages and cropped semi-automatically. We keep the original resolution of each cropped logo. The dataset can be downloaded publicly at https://data.mendeley.com/datasets/dr233ns3g6/3 and contains a total of 4,000 color images of vehicle logos (see several example images from this dataset in 1). The image sizes differ from one image to another. There is a total of 40 logo classes in this dataset. The data is split in half to be used as training and testing sets for the classification task. The logo images are cropped semi-automatically from photos of cars. They contain both background and foreground, which makes this dataset more challenging than the others. Figure 1 illustrates two vehicle logos cropped from an original image. Figure 2 illustrates cropped images from original images crawled from the web. We can observe that the two cropped images of the Ferrari logo differ visually in resolution, rotation, and color. With current technology, these problems make this dataset more challenging than others. Table 2 presents example images of the 40 different logos from the VLR-40 dataset.
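One way to realise such a half-and-half decomposition is sketched below. The text only states that the data is split in half; splitting within each class, the fixed random seed, and the function name `split_half_per_class` are assumptions made for illustration.

```python
import numpy as np

def split_half_per_class(labels, seed=0):
    """Return sorted (train_idx, test_idx), each holding half of every class."""
    labels = np.asarray(labels)
    rng = np.random.default_rng(seed)  # fixed seed keeps the split reproducible
    train, test = [], []
    for label in np.unique(labels):
        # Shuffle the indices of this class, then cut them in the middle.
        idx = rng.permutation(np.flatnonzero(labels == label))
        half = len(idx) // 2
        train.extend(idx[:half].tolist())
        test.extend(idx[half:].tolist())
    return np.array(sorted(train), dtype=int), np.array(sorted(test), dtype=int)
```

Publishing the resulting index lists together with the images is what lets other researchers reuse the same decomposition instead of resorting to cross-validation.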

Results
Many color spaces have been proposed in the literature for different applications. The HOG descriptor is applied to extract features from each color component of a vehicle logo image. The final feature vector is obtained by fusing the features from the three components. Three color spaces (RGB, HSV, YCbCr) are considered to encode the images, since these spaces are widely used in pattern recognition applications. The training set is used to compute the sparse scores by (5) and (6). These scores are then used to rank the features of the training and testing sets. Here, we use a cut-off ratio of 1% of the number of features to determine the optimum dimension. Table 2 presents the classification results on the VLR-40 dataset. The first column indicates the color space used to encode the vehicle logo images. The second column shows the accuracy achieved in each space and its dimension when no selection method is applied. The number of features is 11,532 × 3 = 34,596. We see that the accuracy varies across color spaces. The third and fourth columns present the classification results obtained with sparse score 1 and sparse score 2, respectively. Sparse score 1 clearly outperforms the other methods, giving the best accuracy (75.25%) with 77% of the features (26,638 features). Sparse score 2 gives an accuracy close to the result obtained when no selection method is applied, but it uses far fewer features than sparse score 1. For example, sparse score 2 uses only 20% of the features in the HSV space while giving better performance. From this table, we see that feature selection gives an accuracy comparable to that obtained without selection while reducing the dimension of the feature space. Additionally, Figure 3 compares the performance of sparse scores 1 and 2 on the three color spaces. The combination of the HSV color space and sparse score 1 gives the worst performance compared with the other combinations. The RGB space with sparse score 2 performs well at an early stage, since it needs fewer than 10% of the features to reach an accuracy above 70%. In contrast, the YCbCr and HSV spaces combined with sparse score 1 achieve a very low accuracy at the beginning, when fewer than 55% of the features are selected. The experimental results therefore show the interest of finding a suitable color space to encode vehicle logo images and an appropriate feature selection method to remove irrelevant features. They also show that sparse score 1 gives the best accuracy on the RGB color space while largely reducing the number of features. This study is now being extended to compare the performance of further color spaces and to find a suitable one for encoding vehicle logo images.
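The procedure behind accuracy-versus-dimension curves of this kind can be sketched as follows: features are ranked by score, and a classifier is evaluated on the top 1%, 2%, and so on, up to 100% of the ranked features. The classifier is not specified in this section, so the nearest-centroid rule below is only a stand-in, and the names `nearest_centroid_accuracy` and `accuracy_curve` are hypothetical.

```python
import numpy as np

def nearest_centroid_accuracy(X_train, y_train, X_test, y_test):
    """Accuracy of a nearest-centroid rule (stand-in for the actual classifier)."""
    classes = np.unique(y_train)
    centroids = np.stack([X_train[y_train == c].mean(axis=0) for c in classes])
    # Squared Euclidean distances via the expansion |a|^2 - 2ab + |b|^2,
    # which avoids building a (n_test, n_classes, n_features) array.
    dists = ((X_test ** 2).sum(axis=1)[:, None]
             - 2.0 * X_test @ centroids.T
             + (centroids ** 2).sum(axis=1)[None, :])
    predicted = classes[dists.argmin(axis=1)]
    return float((predicted == y_test).mean())

def accuracy_curve(X_train, y_train, X_test, y_test, scores, step=0.01):
    """List of (n_selected_features, accuracy) for growing cut-off ratios."""
    order = np.argsort(scores, kind="stable")  # ascending: best features first
    n_features = len(order)
    curve = []
    for k in range(1, int(round(1.0 / step)) + 1):
        n_keep = max(1, int(round(k * step * n_features)))
        keep = order[:n_keep]
        acc = nearest_centroid_accuracy(X_train[:, keep], y_train,
                                        X_test[:, keep], y_test)
        curve.append((n_keep, acc))
    return curve
```

The optimum dimension is then the entry of the curve with the highest accuracy; with the 34,596 features reported above, each 1% step adds roughly 346 features.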

Figure 1. Example of vehicle logo images extracted from a real-life scenario. The original image is on the left, and two logos are cropped from this image

Figure 3. Classification performance of SparseScore1 and SparseScore2 on the VLR-40 dataset for different color spaces

Let $x_{ri}^{c}$ denote the $r$-th feature of the $i$-th instance in class $c$, $\hat{s}_{ij}^{c}$ the element of the sparse similarity matrix $\hat{S}^{c}$ constructed within class $c$, and $e^{c}$ an $N$-dimensional vector with $e^{c}(i) = 1$ if $x_i$ belongs to class $c$ and 0 otherwise. The two proposed supervised sparsity scores of the $r$-th feature, denoted SparseScore, which should be minimized, are defined as follows:

TELKOMNIKA Telecommun Comput El Control. Vehicle logo recognition using histograms of oriented gradient descriptor and… (Kittikhun Meethongjan)

Table 1. Summary of the available logo databases in the literature

Table 2. Classification results on the VLR-40 dataset with two feature selection methods