Minni, N.
PageRank Using MapReduce: An Open-Source Framework for Processing Large Data Sets
Authors
Affiliations
1 Department of MCA, Bharathiyar College of Engineering and Technology, Karaikal, Puducherry, IN
2 Department of Computer Science, Avvaiyar Government College for Women, Karaikal, Puducherry, IN
3 Department of MCA, Bharathiyar College of Engineering and Technology, Karaikal, Puducherry, IN
Source
Data Mining and Knowledge Engineering, Vol 3, No 10 (2011), Pagination: 644-649

Abstract
MapReduce is a simple data-parallel programming model designed for scalability and fault tolerance in processing and generating large data sets. It was initially created by Google to simplify the development of large-scale web search applications in data centers and has been proposed to form the basis of a 'data center computer'. Many real-world tasks are expressible in this model. In this paper, a PageRank algorithm using the MapReduce technique is introduced for a hyperlink graph, illustrated with a random web surfer. The algorithm computes the PageRank of several web pages distributed in the cloud. In this work, the Hyperlink Graph Page Rank (HGPR) algorithm is developed, with which PageRanks can be computed and the most visited web pages subsequently traced. Programs written in this functional style are automatically parallelized and executed on a large cluster of commodity machines. This allows programmers without any experience with parallel and distributed systems to easily utilize the resources of a large distributed system.
The implementation of MapReduce runs on a large cluster of commodity machines and is highly scalable. A typical MapReduce computation processes many terabytes of data on thousands of machines. Programmers find the system easy to use.
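The abstract's HGPR algorithm itself is not given here; as a minimal sketch of the general technique it describes, one PageRank iteration over a hyperlink graph can be expressed as a map phase (each page distributes its rank across its outgoing links) and a reduce phase (each page sums the shares it received, with a damping factor modeling the random web surfer). The toy graph, the damping value 0.85, and the function names below are illustrative assumptions, not details from the paper:

```python
from collections import defaultdict

# Hypothetical toy hyperlink graph: page -> outgoing links (an assumption,
# standing in for the crawled web graph the paper's system would process).
graph = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
}

DAMPING = 0.85  # probability the random surfer follows a link

def map_phase(ranks):
    """Map: each page emits (target, rank_share) for every outgoing link."""
    for page, links in graph.items():
        share = ranks[page] / len(links)
        for target in links:
            yield target, share

def reduce_phase(emitted, n_pages):
    """Reduce: sum the shares received per page, then apply damping."""
    totals = defaultdict(float)
    for target, share in emitted:
        totals[target] += share
    return {
        page: (1 - DAMPING) / n_pages + DAMPING * totals[page]
        for page in graph
    }

# Start from a uniform distribution and iterate toward the fixed point.
ranks = {page: 1.0 / len(graph) for page in graph}
for _ in range(20):
    ranks = reduce_phase(map_phase(ranks), len(graph))

print(max(ranks, key=ranks.get))  # the page the random surfer visits most
```

In an actual MapReduce deployment, the map and reduce functions above would be distributed by the framework across many machines, with each iteration expressed as one MapReduce job over the partitioned link graph.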