Comparative Study of Bankruptcy Prediction Models

Early indication of bankruptcy is important for a company. If a company is aware of its potential bankruptcy, it can take preventive action in anticipation. To detect this potential, a company can utilize a bankruptcy prediction model, which can be built using machine learning methods. However, the choice of machine learning method should be made carefully, because the suitability of a model depends on the specific problem. Therefore, in this paper we perform a comparative study of several machine learning methods for bankruptcy prediction. The comparison result is expected to provide insight into the most robust method for further research. Comparing the performance of several models based on machine learning methods (k-NN, fuzzy k-NN, SVM, Bagging Nearest Neighbour SVM, Multilayer Perceptron (MLP), and a hybrid of MLP with Multiple Linear Regression), we conclude that the fuzzy k-NN method achieves the best performance, with an accuracy of 77.5%. This result suggests that further development of bankruptcy prediction models could build on improvements or modifications of fuzzy k-NN.


Introduction
In business, a company faces two possibilities: gaining a profit or suffering a loss. In this highly competitive era, early warning of bankruptcy is important to prevent the worst outcome for the company. To predict bankruptcy, a company can employ relevant data such as total assets, inventory, profit, and financial deficiency. These data give maximum advantage when their pattern is interpretable. To discover the bankruptcy pattern, a machine learning method can be employed. Specifically, the method classifies whether the pattern in the company data supports an indication of bankruptcy or not.
Recently, several machine learning methods have been proposed for bankruptcy prediction. Among them are k-nearest neighbour, neural networks, and support vector machines. These methods come with their own advantages and disadvantages. In many cases, neural networks and support vector machines are superior to other methods. For example, a support vector machine was exploited in the detection of diabetes mellitus [1], and a neural network was employed in the classification of mobile robot navigation [2]. This superiority stems from their generalization capability. However, their models are difficult to interpret. By contrast, a model based on k-nearest neighbour is easier to interpret and computationally simple.
For the bankruptcy prediction problem, Li et al. [6] proposed a fuzzy k-NN model and Wieslaw et al. [3] proposed a statistical model. Still, there is room for improvement toward a better model. The main contribution of this paper is a comparative study to determine the most suitable model for the bankruptcy prediction problem. The comparative result can serve as a consideration for further research on bankruptcy prediction. In this study, prediction models based on k-nearest neighbour, neural networks, and support vector machines are evaluated and compared. In addition, variants of these methods are evaluated as well: fuzzy k-nearest neighbour, bagging nearest neighbour support vector machine, and a hybrid of multilayer perceptron and multiple linear regression. By considering the strengths and drawbacks of each method, this study explores which method is most suitable for a bankruptcy prediction model.
The organization of the paper is as follows: the next section describes the dataset, followed by an explanation of the machine learning methods in the third section. Subsequently, the results of the comparative study are presented in the fourth section. Finally, the last section gives the conclusion and discussion.

Methods
This section describes the methods compared in this study, followed by the dataset.

K-Nearest Neighbour
K-Nearest Neighbour (k-NN) is a non-parametric classification method. Computationally, it is simpler than other methods such as the Support Vector Machine (SVM) and Artificial Neural Network (ANN). To classify a record, k-NN requires three parameters: a dataset, a distance metric, and k (the number of nearest neighbours) [8].
Similarity between a record and its neighbours can be computed using the Euclidean distance. If a record is represented as a vector (x1, x2, ..., xn), then the Euclidean distance between two records is computed as follows [8]:

d(x_i, x_j) = \sqrt{\sum_{r=1}^{n} (x_{ir} - x_{jr})^2}  (1)

The value d(x_i, x_j) represents the distance between a record and its neighbours. The computed distances are sorted in ascending order, and the k smallest distances define the k nearest neighbours. The classes of the records among the k nearest neighbours are then used for class prediction: the majority class in that set is assigned to the query record.
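The distance-sort-vote procedure above can be sketched in a few lines of plain Python. This is a minimal illustration; the function names and toy data are ours, not from the compared implementations:

```python
import math
from collections import Counter

def euclidean(a, b):
    # d(x_i, x_j) = sqrt(sum_r (a_r - b_r)^2), as in equation (1)
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))

def knn_predict(train, labels, query, k=3):
    # Sort training record indices by distance to the query (ascending)
    order = sorted(range(len(train)), key=lambda i: euclidean(train[i], query))
    # The majority class among the k nearest neighbours is the prediction
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]
```

For example, with three neighbours of which two are labelled 'pos', the query is predicted as 'pos'.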

Fuzzy K-Nearest Neighbour
In 1985, Keller proposed a k-NN method based on fuzzy logic, later called Fuzzy k-Nearest Neighbour [4]. Fuzzy logic is exploited to define the membership degree of each data point in each class, as described in the following formula [4]:

u_i(x) = \frac{\sum_{j=1}^{k} u_{ij} \,\lVert x - x_j \rVert^{-2/(m-1)}}{\sum_{j=1}^{k} \lVert x - x_j \rVert^{-2/(m-1)}}  (2)

Here i indexes the classes, j runs over the k neighbours, and m, with a value in (1, ∞), is the fuzzy strength parameter that defines the weight of the membership degree of data point x. The Euclidean distance between x and the j-th neighbour is written \lVert x - x_j \rVert. The membership u_{ij} of neighbour x_j in the i-th class is defined as [4]:

u_{ij} = \begin{cases} 0.51 + (n_j/k)\cdot 0.49, & \text{if } x_j \text{ belongs to class } i \\ (n_j/k)\cdot 0.49, & \text{otherwise} \end{cases}  (3)

where n_j is the number of neighbours belonging to the j-th class. Equation (3) is subject to the constraint [4]:

\sum_{i} u_{ij} = 1  (4)

After a data point is evaluated with these formulas, it is classified into the class with the highest membership degree (in this case, the positive class means bankrupt and the negative class means not bankrupt) [5]:

C(x) = \arg\max \left( u_1(x), u_2(x) \right)  (5)

Support Vector Machine
The Support Vector Machine (SVM) is a method that performs classification by finding the hyperplane with the largest margin [8]. A hyperplane separates one class from the other. The margin is the distance between the hyperplane and the data closest to it; the data points of each class closest to the hyperplane are called support vectors [8].
To generate an SVM model from training data x_i ∈ R^n with class labels y_i ∈ {−1, 1}, the SVM finds the separating hyperplane [8]:

w \cdot x + b = 0  (6)

To maximize the margin, the SVM should satisfy [8]:

y_i (w \cdot x_i + b) \ge 1  (7)

where x_i is a training record, y_i its class label, and w and b are parameters determined in the training process. Equation (7) is adjusted with slack variables \xi_i to handle misclassified cases. The adjusted formula is defined in equation (8) [8]:

y_i (w \cdot x_i + b) \ge 1 - \xi_i, \quad \xi_i \ge 0  (8)

To solve the optimization problem, Lagrange multipliers \alpha_i are introduced as follows:

L_P = \frac{1}{2}\lVert w \rVert^2 - \sum_{i=1}^{N} \alpha_i \left[ y_i (w \cdot x_i + b) - 1 \right]  (9)

Because the vector w may be high dimensional, equation (9) is transformed into its dual form [8]:

L_D = \sum_{i=1}^{N} \alpha_i - \frac{1}{2} \sum_{i=1}^{N}\sum_{j=1}^{N} \alpha_i \alpha_j y_i y_j \,(x_i \cdot x_j)  (10)

The decision function is then defined as follows [8]:

f(x) = \operatorname{sign}\left( \sum_{i=1}^{N} \alpha_i y_i \,(x_i \cdot x) + b \right)  (11)

The value of the parameter b is calculated from any support vector x_s with label y_s [8]:

b = y_s - \sum_{i=1}^{N} \alpha_i y_i \,(x_i \cdot x_s)  (12)
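As a concrete illustration, the fuzzy k-NN membership computation described above can be sketched in plain Python. This is a minimal sketch under our own naming; the toy crisp memberships in the usage example are ours, and `math.dist` requires Python 3.8 or later:

```python
import math

def fuzzy_knn_membership(train, memberships, query, k=3, m=2.0):
    """Compute per-class memberships u_i(x) of the query, fuzzy k-NN style.

    memberships[j][i] is u_ij, the membership of neighbour j in class i
    (each row must sum to 1).  m > 1 is the fuzzy strength parameter.
    """
    # Pick the k nearest neighbours by Euclidean distance
    idx = sorted(range(len(train)), key=lambda j: math.dist(train[j], query))[:k]
    n_classes = len(memberships[0])
    u = []
    for i in range(n_classes):
        num = den = 0.0
        for j in idx:
            d = math.dist(train[j], query)
            # Inverse-distance weight d^(-2/(m-1)); a large constant
            # approximates the limit when the query coincides with a neighbour
            w = d ** (-2.0 / (m - 1.0)) if d > 0 else 1e12
            num += memberships[j][i] * w
            den += w
        u.append(num / den)
    return u  # classify the query as argmax over u
```

With crisp memberships (rows like [1, 0]), the result reduces to distance-weighted voting; the predicted class is the index of the largest entry of `u`.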

Bagging Nearest Neighbour Support Vector Machine (BNNSVM)
To create a BNNSVM model, a Nearest Neighbour Support Vector Machine (NNSVM) model is created first. The procedure is as follows [6]:
1. Divide the training data into a train set (trs) and a test set (ts) using cross validation.
2. Find the k nearest neighbours for each record in ts. These k nearest neighbours are defined as ts_nns_bd.
3. Create a classification model from ts_nns_bd. This model is the NNSVM.
4. Perform prediction on the testing data using the NNSVM model.
Subsequently, the bagging algorithm is integrated with the NNSVM model to form the BNNSVM. The computation of the BNNSVM model is defined in the following steps [6]:
1. Create 10 new base training sets from the trs data by sampling with replacement.
2. From the 10 base training sets of step 1, generate 10 NNSVM models.
3. Perform the prediction task using the 10 NNSVM models from step 2.
4. For each record in the test set, vote on the prediction results of the NNSVM models.
5. The final prediction is the majority class from step 4: if the voting result is 'negative' the record is predicted as 'negative', and vice versa for 'positive'.
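The bagging-and-voting scheme above can be sketched as follows. This is a simplified illustration: a plain 1-nearest-neighbour classifier stands in for the NNSVM base model (the actual NNSVM trains an SVM on each record's nearest neighbours, which requires a full SVM solver), and all names and toy data are ours:

```python
import math
import random
from collections import Counter

def one_nn_predict(train, labels, query):
    # Stand-in base learner: label of the single nearest training record
    j = min(range(len(train)), key=lambda i: math.dist(train[i], query))
    return labels[j]

def bagging_predict(train, labels, query, n_models=10, seed=0):
    rng = random.Random(seed)  # fixed seed for a reproducible sketch
    votes = []
    for _ in range(n_models):
        # Step 1: draw a base training set by sampling with replacement
        idx = [rng.randrange(len(train)) for _ in train]
        boot_x = [train[i] for i in idx]
        boot_y = [labels[i] for i in idx]
        # Steps 2-3: each base model predicts the query record
        votes.append(one_nn_predict(boot_x, boot_y, query))
    # Steps 4-5: the majority-voted class is the final prediction
    return Counter(votes).most_common(1)[0][0]
```

Replacing `one_nn_predict` with an SVM trained on the query's neighbourhood would recover the BNNSVM structure described above.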

Multiple Layer Perceptron (MLP)
The Multilayer Perceptron (MLP) is an ANN method with an architecture of at least three layers: an input layer, a hidden layer, and an output layer. Like other ANN methods, it aims to calculate a weight vector that fits the training data. To update the weight vector, the MLP uses the backpropagation algorithm. The activation function used in this MLP model is the sigmoid function.
In the prediction stage, a company data point x is classified as positive (the company has bankruptcy potential) or negative (the company is in a fine condition) according to equation (13), in which w_i are the weights obtained from the training process, w_0 is the bias, and n is the feature dimension of the data [9]:

o(x) = \sigma\left( w_0 + \sum_{i=1}^{n} w_i x_i \right), \qquad \sigma(z) = \frac{1}{1 + e^{-z}}  (13)
In the training stage, the weight vector is updated in two steps. The first step initializes the weight vectors in both the input layer and the hidden layer. Afterwards, forward propagation is computed to obtain the network output, proceeding from the input layer through the hidden layer to the output layer. Once the value o_k of the output layer and the value o_h of the hidden layer are obtained, the backpropagation procedure calculates the error δ_k in the output layer (equation 14) and the error δ_h in the hidden layer (equation 15). In equation (15), w_kh is the weight of the hidden unit h connected to output unit k, and in equation (14), t_k is the target output [9]:

\delta_k = o_k (1 - o_k)(t_k - o_k)  (14)

\delta_h = o_h (1 - o_h) \sum_{k} w_{kh} \,\delta_k  (15)

According to the error calculation, the weights at the input layer (equation 16) and at the hidden layer (equation 17) are updated with learning rate η, where x_i is the input to the hidden unit. The number of iterations is determined by the epoch parameter [9]:

w_{hi} \leftarrow w_{hi} + \eta \,\delta_h \, x_i  (16)

w_{kh} \leftarrow w_{kh} + \eta \,\delta_k \, o_h  (17)
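One forward/backward pass of the procedure above, covering equations (14)–(17), can be sketched for a single-output network as follows. This is a minimal list-based illustration with our own function names, not the paper's implementation:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_step(x, t, w_hidden, w_out, eta=0.5):
    """One backpropagation step; each weight list carries its bias last."""
    # Forward propagation: input -> hidden -> output (bias input fixed at 1.0)
    o_h = [sigmoid(sum(w * v for w, v in zip(wh, x + [1.0]))) for wh in w_hidden]
    o_k = sigmoid(sum(w * v for w, v in zip(w_out, o_h + [1.0])))
    # Equation (14): output-layer error
    d_k = o_k * (1.0 - o_k) * (t - o_k)
    # Equation (15): hidden-layer error, weighted by w_kh
    d_h = [oh * (1.0 - oh) * w_out[h] * d_k for h, oh in enumerate(o_h)]
    # Equations (16)-(17): gradient updates with learning rate eta
    new_w_out = [w + eta * d_k * v for w, v in zip(w_out, o_h + [1.0])]
    new_w_hidden = [[w + eta * d_h[h] * v for w, v in zip(wh, x + [1.0])]
                    for h, wh in enumerate(w_hidden)]
    return new_w_hidden, new_w_out, o_k
```

Iterating `train_step` over the training data for a fixed number of epochs drives the network output toward the target label.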

 2 .
ISSN: 1693-6930 TELKOMNIKA Vol.11, No. 3, September 2013: 591 -596 594 Find k-nearest neighbours for each record in ts.These k-nearest neighbours is defined as ts_nns_bd.3. Create a classification model from ts_nns_bd.The model is specified as NNSVM.4. Perform prediction to testing data using NNSVM model.Subsequently, bagging algorithm is integrated to NNSVM model to form BNNSVM.The computation of BNNSVM model is defined in the next steps [6]: 1. Create 10 new base training set from trs data.In order to generate base training set, perform sampling with replacement.2. According to 10 base training set from step 1, generate 10 NNSVM model.

The Hybrid of MLP with Multiple Linear Regression (MLP+MLR)
This hybrid classification model is generated in two steps. The first step computes a Multiple Linear Regression (MLR) model, and the output of this model is used as a new input feature for the MLP in the second step.
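The first step of the hybrid (fitting the MLR and appending its estimate as a new feature) can be sketched as follows. The helper names are ours, and the least-squares fit uses plain Gaussian elimination on the normal equations rather than a library solver:

```python
def fit_mlr(X, y):
    """Least-squares coefficients [b0, b1, ..., bn] (b0 is the intercept)."""
    rows = [[1.0] + list(x) for x in X]  # prepend the intercept column
    n = len(rows[0])
    # Normal equations A b = c with A = X^T X and c = X^T y
    A = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    c = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(n)]
    # Gaussian elimination with partial pivoting
    for i in range(n):
        p = max(range(i, n), key=lambda r: abs(A[r][i]))
        A[i], A[p] = A[p], A[i]
        c[i], c[p] = c[p], c[i]
        for r in range(i + 1, n):
            f = A[r][i] / A[i][i]
            for j in range(i, n):
                A[r][j] -= f * A[i][j]
            c[r] -= f * c[i]
    b = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        b[i] = (c[i] - sum(A[i][j] * b[j] for j in range(i + 1, n))) / A[i][i]
    return b

def augment_with_mlr(X, y):
    # Append the MLR estimate of each record as a new input feature
    b = fit_mlr(X, y)
    return [list(x) + [b[0] + sum(bi * xi for bi, xi in zip(b[1:], x))]
            for x in X]
```

The augmented records produced by `augment_with_mlr` would then be fed to the MLP in the second step.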