Open Access Open Access  Restricted Access Subscription Access

Searching Gapped Palindromes in DNA Sequences using Dynamic Suffix Array


Affiliations
1 Department of Computer Science and Engineering, Ajay Kumar Garg Engineering College, Ghaziabad - 201009, Uttar Pradesh, India
2 Department of Computer Science, Yobe State University, Damaturu, Nigeria
 

Background/Objectives: In the biological sequences, palindromes can create structures that differ from the common structure of non-palindromic sequences. Unfortunately, mutations introduced by evolutionary methods, make it difficult to search the palindromes in DNA sequences. Methods/Statistical Analysis: The mutations occur in DNA sequences with spacer (i.e. set of characters). One version of such algorithms has been intended to search for palindromes with gaps (i.e. spacer) - gapped palindromes. The concept of Dynamic Suffix Array (DSA) is used to propose algorithms to search two classes of gapped palindromes-length constrained and long armed. DSA modifies the previous built suffix arrays when there is insertion and deletion of a new character, due to which efficiency is improved. DNA datasets obtained from National Centre for Biotechnology Information (NCBI) is taken as input. The execution time, palindrome weight and length of palindrome arms and spacer are analysed. Findings: Our proposed algorithms search maximal length constrained and long armed gapped palindromes in DNA sequences efficiently. Time complexity of our proposed algorithms is O(n), where n is input parameters. Also, we compute palindrome weights in the DNA sequences. For length constrained gapped palindromes, our proposed algorithm is compared with existing one. The existing algorithm uses only suffix array. Experimental Observations reveal that by the use of DSA, execution time of our algorithm on different DNA sequences has been improved by maximum 57.89%. It also shows a decrease in the execution time over existing approach, proving our designed algorithm is space efficient, faster and easy to implement. Applications/Improvement: Our algorithms analyze short DNA sequences easily. These algorithms can be executed and tested on standard DNA and datasets with large number of base pairs.

Keywords

Dynamic Suffix Array, Gapped Palindromes, Length Constrained, Long Armed, Palindromes
User

Abstract Views: 186

PDF Views: 0




  • Searching Gapped Palindromes in DNA Sequences using Dynamic Suffix Array

Abstract Views: 186  |  PDF Views: 0

Authors

Shivika Gupta
Department of Computer Science and Engineering, Ajay Kumar Garg Engineering College, Ghaziabad - 201009, Uttar Pradesh, India
Rajesh Prasad
Department of Computer Science, Yobe State University, Damaturu, Nigeria
Sunita Yadav
Department of Computer Science and Engineering, Ajay Kumar Garg Engineering College, Ghaziabad - 201009, Uttar Pradesh, India

Abstract


Background/Objectives: In the biological sequences, palindromes can create structures that differ from the common structure of non-palindromic sequences. Unfortunately, mutations introduced by evolutionary methods, make it difficult to search the palindromes in DNA sequences. Methods/Statistical Analysis: The mutations occur in DNA sequences with spacer (i.e. set of characters). One version of such algorithms has been intended to search for palindromes with gaps (i.e. spacer) - gapped palindromes. The concept of Dynamic Suffix Array (DSA) is used to propose algorithms to search two classes of gapped palindromes-length constrained and long armed. DSA modifies the previous built suffix arrays when there is insertion and deletion of a new character, due to which efficiency is improved. DNA datasets obtained from National Centre for Biotechnology Information (NCBI) is taken as input. The execution time, palindrome weight and length of palindrome arms and spacer are analysed. Findings: Our proposed algorithms search maximal length constrained and long armed gapped palindromes in DNA sequences efficiently. Time complexity of our proposed algorithms is O(n), where n is input parameters. Also, we compute palindrome weights in the DNA sequences. For length constrained gapped palindromes, our proposed algorithm is compared with existing one. The existing algorithm uses only suffix array. Experimental Observations reveal that by the use of DSA, execution time of our algorithm on different DNA sequences has been improved by maximum 57.89%. It also shows a decrease in the execution time over existing approach, proving our designed algorithm is space efficient, faster and easy to implement. Applications/Improvement: Our algorithms analyze short DNA sequences easily. These algorithms can be executed and tested on standard DNA and datasets with large number of base pairs.

Keywords


Dynamic Suffix Array, Gapped Palindromes, Length Constrained, Long Armed, Palindromes



DOI: https://doi.org/10.17485/ijst%2F2015%2Fv8i23%2F115391