Author Details

Prakash, M.

Privacy Preservation of Micro Data Publishing using Fragmentation

Abstract Views: 178 | PDF Views: 0

Authors

V. Arul ¹, C. Vairavel ¹, M. Prakash ², N. V. Kousik ³

Affiliations
1 Department of Computer Science and Engineering, Anna University, Chennai, IN
2 Department of Computer Science and Engineering, SRM Institute of Science and Technology, IN
3 Department of Computing Science and Engineering, Galgotias University, IN

Source

ICTACT Journal on Soft Computing, Vol 9, No 3 (2019), Pagination: 1945-1949

Abstract

Organizations such as hospitals publish detailed data, or micro data, about individuals for research or statistical purposes. Many applications that employ data mining techniques mine data that include private and sensitive information about the subjects. When micro data are released, the sensitive information of individuals must be protected from disclosure. Several existing privacy-preserving approaches focus on anonymization techniques such as generalization and bucketization. Recent work has shown that generalization loses a considerable amount of information for high-dimensional data, while bucketization does not prevent membership disclosure and does not clearly separate quasi-identifying attributes from sensitive attributes. In this work, a novel technique called Fragmentation is proposed for publishing sensitive data while preventing disclosure of individuals' sensitive information. First, vertical Fragmentation is applied to the attributes: the attributes are segmented into columns, each containing a subset of the attributes. Second, horizontal Fragmentation is applied to the tuples: the tuples are segmented into buckets, each containing a subset of the tuples. Finally, experiments on a real dataset show that the Fragmentation technique preserves better utility while protecting against privacy threats and preventing membership disclosure.
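The two partitioning steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `fragment`, the fixed-size bucketing, and the example attribute groups are assumptions made for illustration, and the abstract does not say how attributes are grouped or how tuples are assigned to buckets.

```python
from typing import Dict, List, Sequence, Tuple

Record = Dict[str, object]


def fragment(
    records: Sequence[Record],
    column_groups: Sequence[Tuple[str, ...]],
    bucket_size: int,
) -> List[List[List[tuple]]]:
    """Split records into buckets (horizontal) and attribute groups (vertical).

    Returns buckets[b][g][i]: the values of attribute group g for the
    i-th tuple of bucket b.
    """
    if bucket_size <= 0:
        raise ValueError("bucket_size must be positive")
    buckets = []
    # Horizontal step (assumed policy): consecutive fixed-size runs of tuples.
    for start in range(0, len(records), bucket_size):
        chunk = records[start:start + bucket_size]
        # Vertical step: project each tuple onto every attribute group.
        bucket = [
            [tuple(rec[attr] for attr in group) for rec in chunk]
            for group in column_groups
        ]
        buckets.append(bucket)
    return buckets


# Hypothetical example data, not taken from the paper.
records = [
    {"age": 23, "zip": "600001", "disease": "flu"},
    {"age": 27, "zip": "600002", "disease": "cold"},
    {"age": 41, "zip": "600003", "disease": "asthma"},
    {"age": 45, "zip": "600004", "disease": "flu"},
]
column_groups = [("age", "zip"), ("disease",)]
buckets = fragment(records, column_groups, bucket_size=2)
```

The sketch only shows the data layout that results from the two partitioning steps; any further processing within a bucket that the paper may apply is outside what the abstract states.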

Keywords

Privacy, Privacy Preservation, Data Anonymization, Data Publishing, Data Security.

Deepreply - An Automatic Email Reply System with Unsupervised Cloze Translation and Deep Learning

Abstract Views: 193 | PDF Views: 0

Authors

P. V. Rajaraman ¹, M. Prakash ²

Affiliations
1 Department of Computer Science and Engineering, Rajalakshmi Engineering College, IN
2 Department of Information Technology, Karpagam College of Engineering, IN

Source

ICTACT Journal on Soft Computing, Vol 10, No 3 (2020), Pagination: 2090-2095

Abstract

Electronic mail (e-mail) has been the primary mode of communication for official purposes and remains so in all work environments today. With the growing number of emails, most of which require only trivial replies, more tools are needed to generate replies by reusing past replies. Although there are expert systems that can assist in replying to incoming emails, they produce a generic reply to all of them. The identified requirement is therefore an intelligent system that generates a precise reply to an incoming email, written in the user’s style. This work is divided into two parts. The first translates an incoming email into a cloze representation and extracts entities from it to generate context, question and answer triplets, which are later used to synthesise training data for Extractive Question Answering. The triplets are generated from a corpus of random emails belonging to different contexts, and the answers are extracted by recognising named entities and random noun phrases in these paragraphs. The second part finds the similarity between an incoming email that requires a reply and an old email that contains the reply to it. To address these challenges, we propose a new deep neural network-based approach that relies on coarse-grained sentence modelling using a CNN and an LSTM model. Our experimental results show that the approach outperforms existing state-of-the-art approaches on a cleaner corpus.
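The cloze step described in the abstract can be illustrated with a small Python sketch. It is a toy under stated assumptions, not the authors' pipeline: the function name `cloze_triplets` and the `[MASK]` placeholder are hypothetical, and a capitalized-word heuristic stands in for the named entity recognition the abstract mentions.

```python
import re
from typing import List, Tuple

MASK = "[MASK]"  # hypothetical placeholder token


def cloze_triplets(context: str) -> List[Tuple[str, str, str]]:
    """Build (context, cloze question, answer) triplets from a paragraph."""
    triplets = []
    # Naive sentence split on terminal punctuation followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", context) if s.strip()]
    for sentence in sentences:
        # Toy stand-in for NER (assumption): capitalized words that do not
        # start the sentence are treated as candidate answers.
        for match in re.finditer(r"\b[A-Z][a-z]+\b", sentence):
            if match.start() == 0:
                continue
            answer = match.group()
            # Mask the answer span to form the cloze question.
            cloze = sentence[:match.start()] + MASK + sentence[match.end():]
            triplets.append((context, cloze, answer))
    return triplets


# Hypothetical email text, not taken from the paper's corpus.
email = "The meeting with Priya is on Monday. Please confirm."
triplets = cloze_triplets(email)
```

Each triplet pairs the full email text with one masked sentence and the masked span, which is the shape of training example an extractive question answering model expects.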

Keywords

Deep Learning, E-mail, Unsupervised, Questioning.
