Computational Biology and Bioinformatics

Research Article

Visualizing Biclusters of Gene Expression Data and Their Overlaps Based on a Two-Dimensional Matrix Technique

Biclustering is a data mining technique used to analyze gene expression data. It identifies subgroups of genes that behave similarly under subsets of conditions and may behave independently under the other conditions. These groups of co-expressed genes, called biclusters, can support specific biological goals such as characterizing a particular disease. Many biclustering algorithms have been developed, and they generally output a large number of overlapping biclusters. Visualizing these biclusters remains a non-trivial task. In this paper, we present a new approach for displaying biclustering results from gene expression data on a single screen. It is based on a two-dimensional matrix in which each bicluster is represented as a column and each overlap between a set of biclusters is represented as a row. We illustrate the usefulness of our method with biclustering results from real and synthetic datasets, and we compare it with other techniques that address the problem of bicluster overlaps. The method is implemented in a web-based interactive visualization tool called VisBicluster, available at http://vis.usal.es/~visusal/visbicluster.
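The matrix layout described in the abstract can be illustrated with a small sketch. It is not the VisBicluster implementation. It assumes that each bicluster is given only by its set of gene identifiers (conditions are ignored) and that an overlap row holds the genes shared by exactly one particular set of two or more biclusters. The function name `overlap_matrix` and the sample data are hypothetical.

```python
from collections import defaultdict


def overlap_matrix(biclusters):
    """Build a bicluster-by-overlap matrix.

    biclusters: dict mapping a bicluster name to its set of gene identifiers
    (an assumed input format, chosen for illustration).
    Returns (names, rows): names are the column labels, and each row is a
    pair (flags, genes) where flags[i] is 1 if bicluster names[i] takes part
    in that overlap, and genes are the genes shared by exactly those biclusters.
    """
    names = sorted(biclusters)

    # For every gene, record which biclusters contain it.
    membership = defaultdict(set)
    for name in names:
        for gene in biclusters[name]:
            membership[gene].add(name)

    # Group genes by their exact set of owning biclusters; keep only real
    # overlaps, i.e. genes that belong to at least two biclusters.
    groups = defaultdict(set)
    for gene, owners in membership.items():
        if len(owners) >= 2:
            groups[frozenset(owners)].add(gene)

    # One row per overlap, largest overlaps first, ties broken by names.
    rows = []
    for owners in sorted(groups, key=lambda s: (-len(groups[s]), sorted(s))):
        flags = [1 if name in owners else 0 for name in names]
        rows.append((flags, sorted(groups[owners])))
    return names, rows


# Hypothetical example data: three small biclusters with shared genes.
example = {
    "B1": {"g1", "g2", "g3"},
    "B2": {"g2", "g3", "g4"},
    "B3": {"g3", "g5"},
}
names, rows = overlap_matrix(example)
print(names)
for flags, genes in rows:
    print(flags, genes)
```

Each printed row corresponds to one overlap, and the 0/1 flags mark which bicluster columns participate in it. Genes that belong to a single bicluster produce no row under this assumption.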

Biclustering Visualization, Two-Dimensional Matrix, Filtering, Overlaps, InfoVis

Haithem Aouabed, Mourad Elloumi. (2023). Visualizing Biclusters of Gene Expression Data and Their Overlaps Based on a Two-Dimensional Matrix Technique. Computational Biology and Bioinformatics, 11(2), 19-32. https://doi.org/10.11648/j.cbb.20231102.11

Copyright © 2023 Authors retain the copyright of this article.
This article is an open access article distributed under the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
