Classification and Characterization of Networks
Karl R. B. Schmitt
Arts and Sciences
Mathematics and Statistics
0000-0002-5127-375X, 0000-0002-3887-8113, 0000-0003-3342-2835
Networks are often labeled according to the underlying phenomena that they represent, such as re-tweets, protein interactions, or web page links. It is generally believed that networks from different categories have inherently unique network characteristics. Our research provides conclusive evidence to validate this belief by presenting the results of global network clustering and classification into common categories using machine learning algorithms. Four machine learning techniques (decision trees, random forests, linear support vector classification, and Gaussian Naive Bayes) were applied to a 14-feature 'identifying vector' computed for each graph. During cross-validation, the best technique, random forest, achieved an accuracy of 92%, a precision of 90%, and a recall of 90%. After training, the machine learning algorithm was applied to a collection of initially unlabeled graphs from the Network Repository (www.networkrepository.com). The predicted labels were then manually checked by determining (when possible) the original sources of these graphs. We conclude by examining the accuracy of our results and discussing how future researchers can make use of this process.
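The classifier comparison described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' code: it assumes scikit-learn, and it uses synthetic data from `make_classification` as a stand-in for the 14-feature identifying vectors, since the abstract does not list the features, the categories, or the number of graphs. The sample size, the four classes, the five folds, macro averaging, and the helper `compare_models` are all assumptions made for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_validate
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC
from sklearn.tree import DecisionTreeClassifier

# ASSUMPTION: synthetic stand-in for the per-graph 14-feature vectors.
# Real inputs would be graph statistics with known category labels.
X, y = make_classification(
    n_samples=300, n_features=14, n_informative=8, n_redundant=2,
    n_classes=4, n_clusters_per_class=1, random_state=0,
)

# The four techniques named in the abstract, with illustrative settings.
models = {
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    # Scaling is added here because linear SVMs are sensitive to feature scale.
    "linear_svc": make_pipeline(StandardScaler(), LinearSVC(max_iter=10000)),
    "gaussian_nb": GaussianNB(),
}

# Macro averaging over classes is an assumption; the abstract does not say
# how precision and recall were averaged.
scoring = ["accuracy", "precision_macro", "recall_macro"]


def compare_models(X, y, folds=5):
    """Return mean cross-validated scores for each model (hypothetical helper)."""
    results = {}
    for name, model in models.items():
        cv = cross_validate(model, X, y, cv=folds, scoring=scoring)
        results[name] = {m: float(np.mean(cv[f"test_{m}"])) for m in scoring}
    return results


results = compare_models(X, y)
for name, scores in results.items():
    print(name, {metric: round(value, 3) for metric, value in scores.items()})
```

The scores printed for the synthetic data say nothing about real networks. Reproducing the reported figures would require the actual feature vectors and category labels.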
Ingram, Emma E.; Ortiz Aquino, Adriana M.; and Canning, James P., "Classification and Characterization of Networks" (2017). Summer Interdisciplinary Research Symposium. 16.