
QoS-Aware Data Replication in Hadoop Distributed File System



Cloud computing provides services using virtualized resources over the Internet on a pay-per-use basis. These services are delivered from millions of data centers that are connected with each other. A cloud system consists of commodity machines on which client data is stored. The probability of hardware failure and data corruption on these low-performance machines is high, so for fault tolerance and to improve the reliability of the cloud system, the data is replicated to multiple machines.

The Hadoop Distributed File System (HDFS) is used for distributed storage in cloud systems. Data is stored in the form of fixed-size blocks, e.g. 64 MB, and each block is replicated on multiple machines to improve the reliability of the cloud system. HDFS uses a block replica placement algorithm to replicate data blocks. In this algorithm, however, no QoS parameter for replicating a data block is specified between the client and the service provider in the form of a service-level agreement.

In this paper, an algorithm, QoS-Aware Data Replication in HDFS, is proposed that takes a QoS parameter into account when replicating a data block. The QoS parameter considered is the expected replication time of the application: a data block is replicated only to remote-rack DataNodes that satisfy the application's replication-time requirement. The algorithm reduces replication cost compared with the existing algorithm, thereby improving the reliability and performance of the system.


Keywords

Cloud Computing, Quality of Service, Data Replication, Hadoop Distributed File System, Replication Cost.

Authors

Sunita Varma
Department of Computer Technology and Application, S. G. S. I. T. S., Indore, (M. P.), India
Gopi Khatri
Department of Computer Engineering, S. G. S. I. T. S., Indore, (M. P.), India
