
Statistical Analysis of Differential Expression Level of Genes in Glaciozyma Antarctica PII2


Affiliations
1 School of Mathematical Sciences, Faculty of Science & Technology, Universiti Kebangsaan Malaysia, Bangi, 43600, Selangor, Malaysia
2 School of Biological Sciences, Universiti Sains Malaysia, 11800 Georgetown, Penang, Malaysia
3 Malaysia Genome Institute, Ministry of Science, Technology and Innovation, 43000 Kajang, Selangor, Malaysia
4 School of Bioscience and Biotechnology, Faculty of Science & Technology, Universiti Kebangsaan Malaysia, Bangi, 43600, Selangor, Malaysia
 

Small RNA (sRNA) from the yeast Glaciozyma Antarctica PII2 (G. Antarctica) was obtained using Next-Generation Sequencing (NGS) technology. This study takes a statistical approach to analyzing sRNA from G. Antarctica, with the aim of increasing production of this yeast under biological conditions such as temperature, time and growth medium. To that end, it applies analysis of variance (ANOVA) with the F-test statistic (FANOVA) and the shrinkage F-test (FS) to the NGS data in order to identify the factors that affect the differential expression level of genes in G. Antarctica. The FANOVA statistic is computed on a gene-by-gene basis from the residual sum of squares (SSE), whereas the FS-test shrinks the gene-by-gene variance estimators ($\sigma_g^2$) used in the F-value obtained from ANOVA. The results of the FANOVA-test and the FS-test are then compared to determine which test better identifies significantly differentially expressed genes, based on accuracy and on the area under the Receiver Operating Characteristic (ROC) curve. The test with the higher accuracy and the larger area under the ROC curve is taken to be the better test. Both the FANOVA and FS tests show that the majority of significantly differentially expressed genes are most affected by the main effects of temperature (A) and time (B) and by the interaction between temperature and time (AB). In this study, the FS-test performed better than the FANOVA-test at identifying significantly differentially expressed genes in G. Antarctica.
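
To make the contrast between the two statistics concrete, the following is a minimal sketch and not the authors' implementation. It assumes a simplified one-way layout (a single factor, for example temperature) rather than the factorial design described above, and it replaces the paper's shrinkage estimator with a plain weighted average of each gene's error variance and the mean error variance across genes. The function names `genewise_anova` and `shrunken_f` and the parameter `shrink_weight` are hypothetical names introduced only for this illustration.

```python
import numpy as np


def genewise_anova(expr, groups):
    """Per-gene one-way ANOVA mean squares.

    expr   : array of shape (n_genes, n_samples), expression values
    groups : array of length n_samples, group label of each sample
    Returns (msb, mse): between-group and error mean squares per gene.
    """
    expr = np.asarray(expr, dtype=float)
    groups = np.asarray(groups)
    labels = np.unique(groups)
    n_genes, n_samples = expr.shape

    grand_mean = expr.mean(axis=1)
    ssb = np.zeros(n_genes)  # between-group sum of squares
    sse = np.zeros(n_genes)  # residual (error) sum of squares

    for label in labels:
        block = expr[:, groups == label]
        group_mean = block.mean(axis=1)
        ssb += block.shape[1] * (group_mean - grand_mean) ** 2
        sse += ((block - group_mean[:, None]) ** 2).sum(axis=1)

    df_between = len(labels) - 1
    df_error = n_samples - len(labels)
    return ssb / df_between, sse / df_error


def shrunken_f(msb, mse, shrink_weight=0.5):
    """F-like statistic with a shrunken error variance.

    ASSUMPTION: a fixed-weight average toward the across-gene mean
    variance stands in for the shrinkage estimator used in the paper.
    """
    pooled = mse.mean()
    shrunk = shrink_weight * pooled + (1.0 - shrink_weight) * mse
    return msb / shrunk


# Toy data: 2 genes, 4 samples, 2 groups (illustrative values only).
expr = np.array([[1.0, 2.0, 3.0, 4.0],
                 [1.0, 3.0, 2.0, 4.0]])
groups = np.array([0, 0, 1, 1])

msb, mse = genewise_anova(expr, groups)
f_anova = msb / mse            # ordinary gene-by-gene F statistic
f_shrunk = shrunken_f(msb, mse)
```

With only a few replicates per gene, the gene-by-gene error variance is noisy, so a gene with an unusually small variance can produce a very large ordinary F statistic. Pulling each variance toward a common value damps that effect, which is the motivation usually given for shrinkage-based tests.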

Keywords

ANOVA Model, F Statistics, Shrinkage Estimator, Yeast

Authors

Nurul Nadia Zulkefri
School of Mathematical Sciences, Faculty of Science & Technology, Universiti Kebangsaan Malaysia, Bangi, 43600, Selangor, Malaysia
Nora Muda
School of Mathematical Sciences, Faculty of Science & Technology, Universiti Kebangsaan Malaysia, Bangi, 43600, Selangor, Malaysia
Mohd Nazalan Mohd Najimudin
School of Biological Sciences, Universiti Sains Malaysia, 11800 Georgetown, Penang, Malaysia
Nor Muhammad Mahadi
Malaysia Genome Institute, Ministry of Science, Technology and Innovation, 43000 Kajang, Selangor, Malaysia
Abdul Munir Abdul Murad
School of Bioscience and Biotechnology, Faculty of Science & Technology, Universiti Kebangsaan Malaysia, Bangi, 43600, Selangor, Malaysia
Nursyafiqi Zainuddin
School of Biological Sciences, Universiti Sains Malaysia, 11800 Georgetown, Penang, Malaysia


DOI: https://doi.org/10.17485/ijst%2F2015%2Fv8i12%2F75088