Big Data Privacy Preservation Using Two Phase Top-Down Specialization Algorithm with Multidimensional Map Reduce Framework on Hadoop

S. Shalin Eliabeth; S. Sarju

Affiliations
1 Department of Computer Science and Engineering, St. Joseph's College of Engineering and Technology, Palai, Kerala, India

Big data privacy preservation is one of the most pressing concerns in industry today. Privacy problems often go unnoticed when input data is published to a cloud environment. Privacy preservation in Hadoop involves hiding sensitive values in an input dataset before it is published to a distributed environment. This paper investigates big data anonymization for privacy preservation from the perspectives of scalability and execution time. Many cloud applications that anonymize big data currently face the same problems. To address them, this paper introduces a data anonymization algorithm, Two-Phase Top-Down Specialization (TPTDS), implemented on Hadoop. The input is the Adult dataset: 45,222 records with 15 attributes. The proposed algorithm is implemented with multidimensional anonymization in the Map Reduce framework, which increases the efficiency of the big data processing system. Experiments with TPTDS on Hadoop in both one-dimensional and multidimensional Map Reduce frameworks show that multidimensional anonymization gives the better result on the Adult dataset. Data sets are generalized in a top-down manner by applying specialization operations on a taxonomy tree, and the multidimensional framework yields better IGPL (information gain per privacy loss) values. The experiments show that the solution improves the IGPL values and the anonymity parameter and decreases the execution time of big data privacy preservation compared with the existing algorithm. These results are promising for applications in distributed environments.

Keywords

Big Data, Cloud Computing, Data Anonymization, Map Reduce, Privacy Preservation, Top Down Specialization.
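
The specialization step described in the abstract can be illustrated with a minimal, single-machine sketch. It is not the authors' implementation: it omits the two-phase partitioning and the Map Reduce jobs entirely, handles only one attribute, and assumes a commonly used form of the score, IGPL = information gain / (privacy loss + 1), where privacy loss is the drop in the size of the smallest group. The taxonomy, the records, and all function names are hypothetical.

```python
import math
from collections import Counter

# Hypothetical taxonomy tree for one attribute (parent -> children).
# Values that never appear as keys are leaves, i.e. raw values.
TAXONOMY = {
    "Any": ["Secondary", "University"],
    "Secondary": ["9th", "12th"],
    "University": ["Bachelors", "Masters"],
}

def leaves(node):
    """Return the set of raw values covered by a taxonomy node."""
    children = TAXONOMY.get(node, [])
    if not children:
        return {node}
    return set().union(*(leaves(child) for child in children))

def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def anonymity(records, cut):
    """Size of the smallest group of records sharing a generalized value."""
    covering = {leaf: node for node in cut for leaf in leaves(node)}
    groups = Counter(covering[value] for value, _ in records)
    return min(groups.values())

def igpl(records, cut, node):
    """Score the specialization of `node` into its children.

    Assumed formula: information gain / (privacy loss + 1).
    Returns the score and the cut that would result.
    """
    children = TAXONOMY[node]
    new_cut = [n for n in cut if n != node] + children
    covered = [(v, label) for v, label in records if v in leaves(node)]
    if not covered:
        return 0.0, new_cut
    before = entropy([label for _, label in covered])
    after = 0.0
    for child in children:
        part = [label for v, label in covered if v in leaves(child)]
        after += len(part) / len(covered) * entropy(part)
    privacy_loss = anonymity(records, cut) - anonymity(records, new_cut)
    return (before - after) / (privacy_loss + 1), new_cut

def top_down_specialize(records, k):
    """Start from the most general value and specialize greedily
    while every group still contains at least k records."""
    cut = ["Any"]
    while True:
        candidates = []
        for node in cut:
            if node not in TAXONOMY:  # leaf: cannot be specialized further
                continue
            score, new_cut = igpl(records, cut, node)
            if anonymity(records, new_cut) >= k:
                candidates.append((score, new_cut))
        if not candidates:
            return cut
        cut = max(candidates, key=lambda c: c[0])[1]

# Toy records: (education, income class). Purely illustrative data.
RECORDS = (
    [("9th", "<=50K")] * 3 + [("12th", "<=50K")] * 2
    + [("Bachelors", ">50K")] * 3 + [("Masters", ">50K")] * 2
)

print(top_down_specialize(RECORDS, k=3))
```

With k = 3 the loop stops at the Secondary/University level, because specializing either node further would create a group of only two records. Lowering k to 2 lets the cut reach the leaf values.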

International Journal of Distributed and Cloud Computing
