Data-Deduplication in Linux Kernel File-System

Amit Savyanavar; Sachin Katarnaware; Pritam Bankar; Prashant Jadhav; Nikhil Bagde

Affiliations
1 Department of Computer Engineering, MIT's College of Engineering, Kothrud, Pune-38, India


Abstract

Data deduplication is essentially a compression technique that eliminates redundant data from a hard disk or other storage so that the space is used efficiently. In every operating system, storage space is managed by the file system, which stores data on secondary storage. We therefore modify the file system so that it eliminates redundant blocks of data before they are written to secondary storage, an approach known as inline data deduplication. Ext4 is the latest file system used in Linux and has many new features; we modify Ext4 to add data deduplication as one more feature. In our inline method, we create a table that stores a hash key together with the number of the block containing the data for that key. The hash key is generated using the SHA-1 algorithm. Whenever new data arrives, it is passed to SHA-1 before any blocks are allocated for it, and a key is generated. This key is then compared with the keys already stored in the table. If the key is already present, only the counter associated with that key is incremented; this counter tracks the number of pointers that refer to the block on the physical device. If the key is not present, the key is stored and control passes to the superblock, which allocates free blocks from the list it maintains and returns the allocated block numbers to the table, where they are stored against their key and the counter is incremented. This method eliminates redundant allocation of data blocks, which saves space and increases the efficiency of the storage. Enterprises and large organizations, whose data is growing exponentially, can save space in this way. Because elimination is done at the block level, the method also gives a good elimination ratio and good storage savings.
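The write path described in the abstract can be illustrated with a small user-space sketch. This is not the authors' Ext4 kernel code: the class name `DedupStore`, the 4096-byte block size, the in-memory `disk` dictionary, and the simple free list standing in for the superblock's allocator are all assumptions made for illustration. Only the use of SHA-1, the key-to-block table, and the reference counter come from the text.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size; the abstract does not state one


class DedupStore:
    """Toy in-memory model of the hash table described in the abstract."""

    def __init__(self, total_blocks):
        # Hypothetical stand-in for the free-block list kept by the superblock.
        self.free_blocks = list(range(total_blocks))
        # SHA-1 digest -> [block number, reference count]
        self.table = {}
        # block number -> stored bytes (stand-in for the physical device)
        self.disk = {}

    def write_block(self, data):
        """Return the block number holding `data`, allocating only if it is new."""
        # Hash the data before any block is allocated for it.
        key = hashlib.sha1(data).digest()
        entry = self.table.get(key)
        if entry is not None:
            # Key already present: only the counter is incremented.
            entry[1] += 1
            return entry[0]
        if not self.free_blocks:
            raise OSError("no free blocks")
        # Key not present: allocate a free block and record it against the key.
        block_no = self.free_blocks.pop(0)
        self.disk[block_no] = data
        self.table[key] = [block_no, 1]
        return block_no

    def refcount(self, data):
        """Number of pointers currently referring to the block for `data`."""
        entry = self.table.get(hashlib.sha1(data).digest())
        return entry[1] if entry else 0


store = DedupStore(total_blocks=8)
first = store.write_block(b"A" * BLOCK_SIZE)
second = store.write_block(b"B" * BLOCK_SIZE)
repeat = store.write_block(b"A" * BLOCK_SIZE)  # duplicate of the first block
```

In this sketch the third write reuses the block allocated by the first, so only two of the eight blocks are consumed. The sketch trusts the SHA-1 key alone; whether the authors also compare block contents on a key match is not stated in the abstract.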

Data Mining and Knowledge Engineering
