Nagarajan, R.
- Improved Feature Set Extraction From Documents Using Modified Bag Of Words
Authors
Affiliations
1 Department of Computer and Information Science, Annamalai University, IN
Source
ICTACT Journal on Soft Computing, Vol 11, No 1 (2020), Pagination: 2213-2217
Abstract
The literature describes several methods of feature selection and extraction, which are also used to reduce dimensionality. Traditional methods are designed to remove redundant and outdated information so that new test cases can be defined more effectively. However, the number of specific words in the Bag of Words (BoW) model must be determined manually, which costs time and effort and limits portability. In addition, the number of codebook vectors in BoW rises as the number of cancer types grows, reducing the efficiency and accuracy of detection. The BoW model is therefore not ideal for multi-operative failure diagnosis. This paper proposes an improved BoW that selects the number of specific words required to extract cancer diagnostic features from different documents. Its overall recognition and accuracy rates are higher than those of existing extraction models. The improved BoW method has been verified to be highly effective under operating conditions that impose real-time requirements.
Keywords
Bag of Words, Cancer Document Retrieval, Codebook, Dimensionality Reduction.
- Improved Feature Extraction on Text Documents using Neural Network Model
Authors
V. Kumaresan 1, R. Nagarajan 1
Affiliations
1 Department of Computer and Information Science, Annamalai University, IN
Source
ICTACT Journal on Soft Computing, Vol 11, No 2 (2021), Pagination: 2279-2282
Abstract
In natural language processing, text clustering plays a major role in reducing text dimensionality. However, the lack of data models exposes clustering algorithms to sparsity problems. Integration with deep learning has addressed the problem of scarce knowledge in text documents, but deeper architectures learn redundant features, which limits the efficiency of the solutions. This paper presents a complete extraction of features from text documents using a neural network model. The model uses a feed-forward mechanism together with a type of unsupervised learning that denoises corrupted input features. The reconstructed features are used to initialize the feed-forward network. This method reduces manual labelling in the screening process. For evaluation, a series of experiments is conducted on text datasets to compare the performance of the method with various conventional algorithms.
Keywords
Text Document, Feature Extraction, Neural Network, Denoising.
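
The second abstract above describes unsupervised learning that denoises corrupted input features and then reuses the result to initialize a feed-forward network. The listing does not give the paper's architecture, so the following is only a minimal sketch of that general idea, assuming a single-hidden-layer denoising autoencoder with tied weights, masking noise, and squared-error loss. The class name, the toy binary bag-of-words vectors, and all hyperparameters are hypothetical and chosen for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def corrupt(vec, drop_prob, rng):
    """Masking noise: set each feature to 0 with probability drop_prob."""
    return [0.0 if rng.random() < drop_prob else v for v in vec]


class DenoisingAutoencoder:
    """One hidden layer, tied weights, squared-error loss (illustrative choices)."""

    def __init__(self, n_in, n_hidden, seed=0):
        self.rng = random.Random(seed)
        self.n_in, self.n_hidden = n_in, n_hidden
        # W[j][i] connects input feature i to hidden unit j.
        self.W = [[self.rng.uniform(-0.1, 0.1) for _ in range(n_in)]
                  for _ in range(n_hidden)]
        self.b_hid = [0.0] * n_hidden
        self.b_out = [0.0] * n_in

    def encode(self, x):
        return [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
                for row, b in zip(self.W, self.b_hid)]

    def decode(self, h):
        # Tied weights: the decoder reuses W transposed.
        return [sigmoid(sum(self.W[j][i] * h[j] for j in range(self.n_hidden))
                        + self.b_out[i])
                for i in range(self.n_in)]

    def train_step(self, x, drop_prob=0.3, lr=0.5):
        """One SGD step: encode a corrupted copy, reconstruct the clean input."""
        x_noisy = corrupt(x, drop_prob, self.rng)
        h = self.encode(x_noisy)
        y = self.decode(h)
        # Gradients of 0.5 * sum((y - x)^2) w.r.t. the pre-activations.
        d_out = [(yi - xi) * yi * (1.0 - yi) for yi, xi in zip(y, x)]
        d_hid = [sum(self.W[j][i] * d_out[i] for i in range(self.n_in))
                 * h[j] * (1.0 - h[j]) for j in range(self.n_hidden)]
        for j in range(self.n_hidden):
            for i in range(self.n_in):
                # Tied weights receive a decoder term and an encoder term.
                grad = d_out[i] * h[j] + d_hid[j] * x_noisy[i]
                self.W[j][i] -= lr * grad
            self.b_hid[j] -= lr * d_hid[j]
        for i in range(self.n_in):
            self.b_out[i] -= lr * d_out[i]
        return 0.5 * sum((yi - xi) ** 2 for yi, xi in zip(y, x))

    def first_layer_init(self):
        """Copy encoder parameters to seed a feed-forward network's first layer."""
        return [row[:] for row in self.W], self.b_hid[:]


# Toy binary bag-of-words vectors (hypothetical data, 6-term vocabulary).
docs = [[1, 1, 0, 0, 0, 0], [1, 1, 1, 0, 0, 0],
        [0, 0, 0, 1, 1, 0], [0, 0, 0, 1, 1, 1]]
dae = DenoisingAutoencoder(n_in=6, n_hidden=2)
for epoch in range(200):
    for doc in docs:
        dae.train_step(doc)
features = [dae.encode(doc) for doc in docs]  # 2-d feature vector per document
W0, b0 = dae.first_layer_init()               # starting weights for a classifier
```

The loss is always computed against the clean input even though the encoder sees the corrupted copy, which is what makes the autoencoder a denoising one. The copied parameters `W0` and `b0` would then serve as the starting point for supervised fine-tuning, a step this sketch leaves out.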