JP Journal of Heat and Mass Transfer
Special Issue, Advances in ICT-Convergence, Pages 93 - 110
(September 2020) http://dx.doi.org/10.17654/HMSI20093
THE PERFORMANCE COMPARISON OF DIFFERENT FEATURE VECTORS ON RNN NETWORK FOR MALWARE DETECTION
Young-Man Kwon, So-Hee Jun, Jae-Ju An and Myung-Jae Lim
Abstract: In this paper, we investigate which feature vectors to use for malware detection in a setting where the curse of dimensionality does not arise. We compared three feature vectors: the one-hot encoding vector, the random embedding vector, and the Word2Vec embedding vector, which captures similarity. We used a transposed symmetrical neural network to obtain the Word2Vec embedding vector, and a recurrent neural network (RNN) for malware detection. In phase 1, we set the hyper-parameters of the RNN. In phase 2, we measured the performance of the RNN with the three feature vectors, using the hyper-parameters found in phase 1. We repeated the experiment 30 times for each feature vector so that a parametric test, which assumes normality of the data, could be applied. From the experiments, we concluded that, for malware detection on an RNN with a vocabulary dictionary of limited size, one-hot encoding has performance comparable to that of the Word2Vec embedding vector once the time needed to obtain Word2Vec is taken into account. In addition, we concluded that the Word2Vec embedding vector still has better feature extraction capability than the random embedding vector when the vocabulary dictionary is of limited size.
Keywords and phrases: malware detection, text classification, RNN, Word2Vec, word embedding, LSTM, machine learning, deep learning, natural language processing.
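
As an illustration of the feature vectors compared in the abstract, the following minimal sketch builds a one-hot encoding and a random embedding for a small vocabulary. It is not the authors' implementation: the token sequences, the embedding dimension, and all function names are hypothetical and chosen only for illustration. A Word2Vec embedding would be used through the same kind of lookup table, with rows learned from the data rather than drawn at random.

```python
import random


def build_vocab(sequences):
    """Map each distinct token to an integer index, in order of first appearance."""
    vocab = {}
    for seq in sequences:
        for tok in seq:
            vocab.setdefault(tok, len(vocab))
    return vocab


def one_hot(index, vocab_size):
    """Return a vector of length vocab_size with a single 1.0 at position index."""
    vec = [0.0] * vocab_size
    vec[index] = 1.0
    return vec


def random_embedding_table(vocab_size, dim, seed=0):
    """Return one dense row per token, drawn uniformly from [-1, 1]."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)]
            for _ in range(vocab_size)]


def encode(seq, vocab, table=None):
    """Encode a token sequence as one-hot vectors, or as rows of an embedding table."""
    if table is None:
        return [one_hot(vocab[tok], len(vocab)) for tok in seq]
    return [table[vocab[tok]] for tok in seq]


# Hypothetical token sequences (assumed for illustration, not taken from the paper).
samples = [["push", "mov", "call"], ["mov", "xor", "call", "ret"]]
vocab = build_vocab(samples)

# Assumed embedding dimension of 3; the paper's value is not stated here.
table = random_embedding_table(len(vocab), dim=3)

one_hot_seq = encode(samples[0], vocab)        # each vector has len(vocab) entries
dense_seq = encode(samples[0], vocab, table)   # each vector has 3 entries
```

The one-hot vectors grow with the vocabulary size, while the dense vectors keep a fixed dimension. With a vocabulary of limited size, that difference in length stays small.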