
Fast and Effective Root cause Analysis of Streaming Data using In-Memory Processing Techniques


Affiliations
1 Department of Computer Science (Category-B), Bharathiar University, Coimbatore – 641046, Tamil Nadu, India
2 Department of Computer Science and Engineering, Paavai Engineering College, Namakkal – 637018, Tamil Nadu, India
 

Objectives: Increased data generation demands a highly scalable and powerful processing framework for root cause analysis. The objective is to identify such a framework by analyzing existing processing architectures.
Methods/Analysis: To identify the best processing architecture for root cause analysis, existing architectures are grouped into sequential processing using Python, CPU-based parallelization, Hadoop MapReduce, and Spark-based parallel in-memory processing. Pre-processing the input text was identified as the most processing-intensive component of any text-based processing framework. Hence, this module of the proposed root cause analysis framework is implemented and used for the analysis.
Findings: Performance is measured in terms of scalability, processing time, applicability, and usability, considering the streaming nature of the data. The pre-processing module of the proposed framework is implemented on each of the considered processing architectures, and the throttle points of each technique are documented. The scalability provided by sequential systems was not sufficient to handle voluminous data. Among the parallel approaches, namely CPU parallel, Hadoop MapReduce, and Spark, the CPU parallel approach performs effectively up to a certain level, after which the architecture fails. The Hadoop- and Spark-based techniques exhibit high scalability due to the underlying HDFS structure. However, their pros and cons on the other metrics indicate that the in-memory technique used by Spark works best in terms of both scalability and time complexity. Given the dynamic nature of the data under consideration, Spark was identified as the best architecture for root cause analysis.
Novelty/Improvement: A novel root cause analysis framework is proposed that incorporates pre-processing modules, aspect extraction, and fuzzy-based sentiment identification of aspects, rather than conventional polarity analysis.
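
The comparison of sequential and CPU-parallel pre-processing described in the abstract can be illustrated with a minimal Python sketch. The abstract does not list the pre-processing steps, so the ones used here (lowercasing, tokenization, stop-word removal), the stop-word list, and all function names are assumptions made for illustration, not the authors' implementation.

```python
import re
from concurrent.futures import ProcessPoolExecutor

# Assumed, deliberately tiny stop-word list; the source does not specify one.
STOP_WORDS = frozenset({"a", "an", "and", "is", "of", "the", "to"})
TOKEN_RE = re.compile(r"[a-z0-9]+")


def preprocess(text):
    """Lowercase, tokenize, and drop stop words (assumed pre-processing steps)."""
    return [tok for tok in TOKEN_RE.findall(text.lower()) if tok not in STOP_WORDS]


def preprocess_sequential(docs):
    """Sequential baseline: one document at a time in a single process."""
    return [preprocess(doc) for doc in docs]


def preprocess_cpu_parallel(docs, workers=4, chunksize=256):
    """CPU-parallel variant: the same function mapped over a pool of processes.

    The worker count and chunk size are illustrative defaults, not tuned values.
    """
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(preprocess, docs, chunksize=chunksize))
```

Because `preprocess` is a pure per-document function, the same function could in principle be passed to a distributed map, for example a Spark RDD's `map`, which is what makes the architectures comparable on an identical workload. When run as a script, the process-pool variant should be called under an `if __name__ == "__main__":` guard.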

Keywords

Aspect Extraction, In-Memory Processing, Parallelization, Root Cause Analysis, Sentiment Analysis

Authors

S. Naveen Kumar
Department of Computer Science (Category-B), Bharathiar University, Coimbatore – 641046, Tamil Nadu, India
S. Vijayaragavan
Department of Computer Science and Engineering, Paavai Engineering College, Namakkal – 637018, Tamil Nadu, India


DOI: https://doi.org/10.17485/ijst%2F2017%2Fv10i38%2F167996