A principal component analysis-based feature dimensionality reduction scheme for content-based image retrieval system

ABSTRACT


INTRODUCTION
One of the challenges of relevance feedback (RF) in image retrieval is the inherent 'curse of dimensionality' caused by a small sample size combined with a high feature dimension. For RF techniques that train a classifier on feedback examples, the curse of dimensionality can degrade classifier performance and thereby lead to poor retrieval results. To mitigate this problem, a technique that relies on the properties of the feedback examples to select a lower-dimensional feature set that serves as a good representative for classification can be employed. In this way, significant dimensionality reduction can be achieved by removing irrelevant or redundant features, leading to a significant decrease in training time and memory requirements and to better classifier performance [1, 2]. Approaches to feature dimensionality reduction have been grouped into two categories [3]: (a) those that involve a linear or nonlinear mapping from the original feature space to a new one of lower dimensionality, notable among which are linear discriminant analysis [4] and principal component analysis (PCA) [1, 5-7]; and (b) those that directly reduce the number of original features by selecting a subset that still retains sufficient information for classification. In general, approaches in the latter category can be grouped into filter methods and wrapper methods [8].
The filter methods are generally not classifier dependent: they acquire no feedback from the classifier but depend on indirect assessments, such as distance measures, to estimate classification performance. The wrapper methods, on the other hand, are classifier dependent and are known to yield better classification performance [8, 9]. Many feature selection methods for classification have been proposed in the literature [10], with many experimental results in favour of the wrapper methods [8, 11, 12]. However, in spite of their good classification performance, the wrapper methods have limited application because of their high computational complexity, especially when applied to support vector machine (SVM) classifiers.
PCA is a dimensionality reduction technique that transforms the original set of features into a smaller set that accounts for as much of the total variation in the data as possible [13]. It is widely used in pattern recognition, computer vision and signal processing [7]. Several optimality properties of PCA have been identified: the variance of the extracted features is maximized; the extracted features are uncorrelated; it finds the best linear approximation in the mean-square sense; and it maximizes the information contained in the extracted features [14].
These properties of PCA have attracted research on PCA-based variable selection methods [7, 13-18], and PCA has been applied to relevance feedback in both document and image retrieval systems [1, 5, 6]. In [1], a novel PCA-based feature dimensionality reduction scheme was proposed for the RF framework with a view to capturing the subjective class implied in the positive examples. Similarly, the works of Cox et al. [19] and Vasconcelos and Lippman [20] employed Bayesian learning to integrate the user's feedback for updating the image probability distribution and subsequently re-ranking the images in the database.
It was reported that the scheme reduced the average retrieval time and significantly reduced storage space utilization. However, the precision in the top 20 retrieval results after four feedback iterations was 45%. This may be due to the failure of Bayesian classifiers to estimate the class probability distribution from the few image samples gathered over the feedback iterations. Yin, Bhanu, Chang and Dong [21] stated that one of the shortcomings of the Bayesian approach is that it requires more feedback iterations to gather enough samples to effectively estimate the probability distribution of the image samples, and such iterations are not always available in real-time retrieval systems.
In order to address the computational complexity issue, an SVM-based technique for feature selection, termed filtered and supported sequential forward search, was proposed in [3]. The technique integrates the filter and wrapper parts into one scheme by leveraging their respective strengths. Experimental results on both synthetic and real data showed the effectiveness of the method in terms of classification accuracy. However, an average run time of 16.23 seconds was recorded even though the data used to evaluate the system were much smaller than what obtains in a content-based image retrieval (CBIR) system. Such a lengthy run time is not acceptable for a CBIR system with an RF framework.

Feature extraction
Feature extraction is a crucial task in a CBIR application and is the core of any such system [4]. The extraction of suitable features from the images influences to a great extent the choice of the indexing structure and the query processing unit. In view of this, various methods for extracting different types of visual content from images have been developed and are being improved upon over time [22, 23]. In this work, three generic-domain image databases (DB10, DB20 and DB100) were employed, with each image database indexed using two colour models (CM54 and HIST32) and two texture models (GW54 and WM40). Adegbola, Aborisade, Popoola and Atayero [24] present a detailed description of the image databases and feature extraction models.

Feature selection model
In a generic system, it is extremely difficult to know which feature model(s) should be used to uniquely identify certain groups of images. Therefore, a combination of several image feature models is usually employed, on the assumption that at least one will be able to capture the unique identity of the targeted images. This approach poses several challenges. First, because the image features are concatenated into a flat vector, the arrangement may dilute the feature component that uniquely identifies the targeted image group. It may also lead to the curse of dimensionality in CBIR systems that employ machine learning techniques for relevance feedback. The cost of the feature extraction algorithms is another issue, and it may become prohibitive as the number of feature descriptors increases. In view of this, including too many features is not feasible for applications involving human-machine interaction. Since such a system is expected to be fast enough for smooth interaction, selection of the most appropriate features to reduce the computational burden becomes imperative, and to achieve this a procedure that uses principal component analysis is employed in this work.
Assume a binary classification problem with a set of labelled training data $\{(x_i, y_i),\ i = 1, \dots, N \mid y_i \neq 0\}$, where sample $x_i \in \mathbb{R}^d$ and $y_i \in \{-1, 1\}$. This training set contains $N$ training pairs, where $x_{ij}$ is the numerical value of feature $f_j$ for the $i$th training sample. The goal of dimensionality reduction is to find a minimal set of features $F = \{f_1, f_2, \dots, f_k\}$ that represents the input vector $X$ in a lower-dimensional space, where $k < d$, while the classification obtained in the low-dimensional space still yields the desired accuracy.

Principal component analysis
PCA is a statistical procedure for reducing the dimensionality of a high-dimensional feature space. It uses an orthogonal transformation to decorrelate a set of correlated features and to emphasize the directions of principal variation of the dataset [25]. Consider a set of $d$-dimensional vectors $\{x = [x_1, \dots, x_d]^T\}$ with distribution centred at the origin, $E(x) = 0$. The covariance is obtained using (4):
$$c_{ij} = E(x_i x_j) \tag{4}$$
where $E$ is the expectation operator. The parameters $c_{ij}$ can be arranged to form the $d \times d$ covariance matrix
$$C_x = E(x x^T).$$
Assuming $\det(C_x) \neq 0$, then by applying eigenvector decomposition, $C_x$ can be decomposed into the product of three matrices:
$$C_x = W^T \Lambda W$$
where $\Lambda = \mathrm{diag}\{\lambda_1, \dots, \lambda_d\}$ is the eigenvalue matrix and $W = [w_1, \dots, w_d]^T$ is formed from a set of orthonormal basis vectors called eigenvectors. For dimensionality reduction, only the orthonormal basis vectors corresponding to the $k$ largest eigenvalues are retained, which results in a significant reduction in feature dimension. Normally, the $k$ largest eigenvalues that account for 95% of the sum of all eigenvalues are retained. However, this work employed the precision/recall graph to determine the feature dimension to be retained. This is a more objective choice, since the resulting lower-dimensional feature vectors are used for distance (similarity) measurement in an image retrieval system with relevance feedback. Consequently, the number of feature dimensions retained is based on a 5% maximum loss constraint imposed on the precision/recall graph.
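The procedure described above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the implementation used in this work: the function names `pca_fit` and `pca_transform`, the synthetic data and the default 95% threshold (the conventional rule mentioned in the text, not the precision/recall criterion finally adopted) are all assumptions made for the example.

```python
import numpy as np


def pca_fit(X, variance_to_keep=0.95):
    """Fit PCA on X (one sample per row) by eigendecomposition of the covariance.

    Returns the feature mean, the d x k matrix whose columns are the
    eigenvectors of the k largest eigenvalues, and all eigenvalues sorted
    in descending order.
    """
    mean = X.mean(axis=0)
    Xc = X - mean                                  # centre the data so that E(x) = 0
    C = (Xc.T @ Xc) / (X.shape[0] - 1)             # d x d sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(C)           # symmetric input, ascending eigenvalues
    order = np.argsort(eigvals)[::-1]              # re-sort to descending order
    eigvals = np.clip(eigvals[order], 0.0, None)   # guard against tiny negative round-off
    eigvecs = eigvecs[:, order]
    cumulative = np.cumsum(eigvals) / eigvals.sum()
    # Smallest k whose eigenvalues reach the requested share of the total.
    k = min(int(np.searchsorted(cumulative, variance_to_keep)) + 1, len(eigvals))
    return mean, eigvecs[:, :k], eigvals


def pca_transform(X, mean, W_k):
    """Project samples onto the retained eigenvectors."""
    return (X - mean) @ W_k


# Hypothetical data: 200 samples of a 10-dimensional feature driven by 3 latent factors.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3)) @ rng.normal(size=(3, 10)) + 0.01 * rng.normal(size=(200, 10))
mean, W_k, eigvals = pca_fit(X)
Z = pca_transform(X, mean, W_k)
print(W_k.shape[1], Z.shape)
```

Note that `np.linalg.eigh` returns the eigenvectors as columns, so the matrix $W$ of the text corresponds to the transpose of `eigvecs`. The projected features `Z` are uncorrelated, which is one of the optimality properties of PCA.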

RESULTS AND ANALYSIS
Combining visual descriptors increases the dimension of the resulting feature vector. The resulting feature model, which is the concatenation of the individual feature vectors, can have a very high dimension and thus increase the latency of the RF scheme even on a medium-size image database. Hence, in order to mitigate the curse of dimensionality associated with machine-learning-based RF schemes, it may be necessary to reduce the dimension of the feature vectors. In this study, principal component analysis (PCA) is integrated into the developed one-class SVM (OC-SVM) RF scheme for the purpose of feature vector dimensionality reduction.
TELKOMNIKA Telecommun Comput El Control. A principal component analysis-based feature dimensionality reduction scheme… (Oluwole A. Adegbola), 1895
A criterion of 5% maximum degradation in mean precision was used to determine the dimension of the feature vector to keep. The effect of feature vector dimensionality reduction is shown in Figure 1. The maximum mean precision values obtained on DB10, DB20 and DB100 were 0.9067, 0.7266 and 0.7275 respectively for an 80% reduction in feature vector dimension, whereas a reduction in feature dimension of 83% resulted in mean precision values of 0.6933, 0.5093 and 0.3657 on DB10, DB20 and DB100 respectively.
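The selection rule can be illustrated with a short sketch. It assumes that the 5% criterion is a relative loss with respect to the full-dimensional mean precision, and the helper name `select_dimension` is hypothetical. The DB10 precision values are those reported in this section, with 0.9400 being the full 174-dimensional result; mapping the 83% reduction to 30 dimensions is an assumed rounding used only for illustration.

```python
def select_dimension(precision_by_dim, baseline_precision, max_loss=0.05):
    """Return the smallest dimension whose mean precision stays within
    max_loss (relative) of the baseline, or None if no dimension qualifies."""
    floor = baseline_precision * (1.0 - max_loss)
    acceptable = [dim for dim, p in precision_by_dim.items() if p >= floor]
    return min(acceptable) if acceptable else None


# DB10 values from the text; the 30-dimensional entry is an assumed rounding
# of the 83% reduction of the 174-dimensional feature vector.
db10_precision = {35: 0.9067, 30: 0.6933}
chosen = select_dimension(db10_precision, baseline_precision=0.9400)
print(chosen)
```

Under these assumptions the 35-dimensional representation satisfies the constraint on DB10, while the 30-dimensional one falls well below the permitted floor.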
Figure 2 compares the OC-SVM RF that used the whole 174-dimensional feature vector (STD) with the OC-SVM RF with PCA that used 35-dimensional features (PCA). Maximum mean precision values of 0.9400, 0.7600 and 0.7860 were achieved on DB10, DB20 and DB100 respectively for STD. The maximum mean precision values achieved with PCA on DB10, DB20 and DB100 were 0.9067, 0.7266 and 0.7275 respectively. Thus, an 80% reduction in feature dimension yielded tolerable degradations of 3.54%, 4.39% and 7.4% in maximum mean precision on DB10, DB20 and DB100 respectively.
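The percentages quoted above follow from the relative difference between the STD and PCA precision values. The short check below reproduces the arithmetic from the reported figures; the function name `relative_degradation` is introduced here only for illustration.

```python
def relative_degradation(baseline, reduced):
    """Percentage loss of `reduced` relative to `baseline`."""
    return 100.0 * (baseline - reduced) / baseline


std = {"DB10": 0.9400, "DB20": 0.7600, "DB100": 0.7860}   # 174-dimensional features
pca = {"DB10": 0.9067, "DB20": 0.7266, "DB100": 0.7275}   # 35-dimensional features

degradation = {db: round(relative_degradation(std[db], pca[db]), 2) for db in std}
dimension_reduction = round(relative_degradation(174, 35), 1)

print(degradation)
print(dimension_reduction)
```

The same formula applied to the feature dimensions (174 to 35) gives the reduction of roughly 80% stated in the text.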

CONCLUSION
In CBIR systems designed for generic image databases, it is general practice to represent images using a combination of several different image features with a view to capturing extra information that may improve retrieval accuracy. This usually results in high-dimensional visual feature vectors for CBIR systems with a classifier-based relevance feedback scheme. In this paper, the curse of dimensionality is addressed using a PCA-based feature selection approach. The feature selection model was incorporated


ISSN: 1693-6930. TELKOMNIKA Telecommun Comput El Control, Vol. 18, No. 4, August 2020: 1892-1896
Figure: Mean precision result of the OC-SVM RF with PCA at different levels of dimensionality reduction