Vadivu, G.
- Research Document Search using Elastic Search
Authors
1 Department of Computer Science Engineering, SRM University, Kattankulathur - 603203, Tamil Nadu, IN
2 Department of Information Technology, SRM University, Kattankulathur - 603203, Tamil Nadu, IN
Source
Indian Journal of Science and Technology, Vol 9, No 37 (2016)
Abstract
Objectives: To implement an Elastic Search server to create a search engine that helps in searching, retrieving and downloading research papers stored in a Django framework database and indexed by Elastic Search. Analysis: Elastic Search, a search server based on Lucene, can be used to search all types of documents, aided by its scalability and near-real-time search. Findings: A web application that queries and searches for relevant research papers, allows users to customize their search, and suggests a list of research papers related to the initial query. It displays the papers' referenced authors. Item-based recommendations help users find more similar papers. Applications: Fast, incisive search against large volumes of data.
Keywords
Django, Elastic Search, Haystack, Lucene, REST.
- MapReduce: A Technical Review
Authors
1 Department of Computer Science and Engineering, SRM University, Chennai - 603203, Tamil Nadu, IN
2 Department of Information Technology, SRM University, Chennai - 603203, Tamil Nadu, IN
Source
Indian Journal of Science and Technology, Vol 9, No 1 (2016)
Abstract
MapReduce, a programming model, allows parallel processing of large data sets where various data mining techniques are not well suited. Its Map and Reduce functions can be customized by developers according to their application. This paper introduces MapReduce and its advantages and disadvantages. It also examines how MapReduce is used and how map and reduce computations are customized and implemented in several scenarios: generating medical reports by processing large medical data sets, stream processing and workflow scheduling on multi-core processors, and processing distributed data sets in a distributed environment using pilot abstractions. We also present how MapReduce is used for deduplication of files to save disk space in data centers. MapReduce-based Pre-Post (MRPre-Post), a parallel data mining algorithm, is adapted to the Hadoop platform to achieve scalability. MapReduce is implemented in vHadoop (Virtual Hadoop), a scalable Hadoop virtual cluster, to process machine learning algorithms. The scenarios discussed in this paper help developers and researchers customize and use MapReduce in their applications.
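To make the idea of customizable Map and Reduce functions concrete, the following is a minimal single-process sketch of the file deduplication scenario mentioned in the abstract. It is not the method of the paper or of Hadoop: the names `map_reduce`, `hash_mapper`, `duplicate_reducer` and the sample file contents are hypothetical, and a real deployment would distribute the map, shuffle and reduce phases across a cluster.

```python
import hashlib
from collections import defaultdict


def map_reduce(records, mapper, reducer):
    """Apply mapper to each record, group emitted (key, value) pairs by key,
    then apply reducer to each group. Runs in one process for illustration."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)  # stands in for the shuffle phase
    return {key: reducer(key, values) for key, values in groups.items()}


def hash_mapper(record):
    """Map step (hypothetical): emit (content hash, file name) for one file."""
    name, content = record
    yield hashlib.sha256(content).hexdigest(), name


def duplicate_reducer(key, names):
    """Reduce step (hypothetical): collect all names sharing one content hash."""
    return sorted(names)


# Assumed sample input: (file name, file bytes) pairs.
files = [("a.txt", b"hello"), ("b.txt", b"world"), ("c.txt", b"hello")]

result = map_reduce(files, hash_mapper, duplicate_reducer)

# Files whose content hash is shared by more than one name are duplicates.
duplicates = sorted(names for names in result.values() if len(names) > 1)
print(duplicates)
```

Swapping in a different `mapper` and `reducer` pair is all it takes to change the computation, which is the customization point the abstract refers to.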