RT Journal Article
SR Electronic
T1 NIMAA: an R/CRAN package to accomplish NomInal data Mining AnAlysis
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 2022.01.13.475835
DO 10.1101/2022.01.13.475835
A1 Mohieddin Jafari
A1 Cheng Chen
A1 Mehdi Mirzaie
A1 Jing Tang
YR 2022
UL http://biorxiv.org/content/early/2022/01/18/2022.01.13.475835.abstract
AB Summary: Nominal data are “labeled” data that can be assigned to a number of non-overlapping, unordered groups. The analysis of this type of data is often superficial or trivial because extensive numerical methods cannot be applied to it. Graphs or networks, on the other hand, comprise sets of nodes and edges that can also be considered nominal variables. By integrating graph theory and data mining approaches, we offer the R package NIMAA, which defines a nominal data-mining pipeline for extracting more information from such data. Using the nominal variables in a dataset, NIMAA provides functions for constructing weighted and unweighted bipartite graphs, analysing the similarity of labels in nominal variables, clustering labels or categories into super-labels, validating clustering results, predicting bipartite edges by missing weight imputation, and visualizing the results with a variety of tools. We also demonstrate the application of nominal data mining to a biological dataset rich in nominal variables. Availability: NIMAA’s official release and the beta update are available on CRAN and GitHub, respectively. URLs: https://CRAN.R-project.org/package=NIMAA and https://github.com/jafarilab/NIMAA Contact: mohieddin.jafari{at}helsinki.fi; jing.tang{at}helsinki.fi Contributions: MJ conceived the study and developed the models; MJ and CC adopted and implemented the methods; MM improved the methods; JT provided the funding; MJ, CC, MM and JT wrote the paper. Competing Interest Statement: The authors have declared no competing interest.