Document Type
Thesis
Degree
Master of Science
Major
Computer Science
Date of Defense
4-18-2018
Graduate Advisor
Dr. Sharlee Climer
Committee
Dr. Cezary Janikow
Dr. Sanjiv Bhatia
Dr. Mark Hauschild
Dr. Sharlee Climer
Abstract
The problem of community structure identification has been an extensively investigated area for biology, physics, social sciences, and computer science in recent years for studying the properties of networks representing complex relationships. Most traditional methods, such as K-means and hierarchical clustering, are based on the assumption that communities have spherical configurations. Lately, Genetic Algorithms (GA) are being utilized for efficient community detection without imposing sphericity. GAs are machine learning methods which mimic natural selection and scale with the complexity of the network. However, traditional GA approaches employ a representation method that dramatically increases the solution space to be searched by introducing redundancies. They also utilize a crossover operator which imposes a linear ordering that is not suitable for community detection.
The algorithm presented here is a framework to detect communities for complex biological networks that removes both redundancies and linearity. We also introduce a novel operator, named Gene Repair. This algorithm is unique as it is a flexible community detection technique aimed at maximizing the value of any given mathematical objective for the network. We reduce the memory requirements by representing chromosomes as a 3-dimensional bit array. Furthermore, in order to increase diversity while retaining promising chromosomes, we use natural selection process based on tournament selection with elitism. Additionally, our approach doesn’t require prior information about the number of true communities in the network. We apply our novel algorithm to benchmark datasets and also to a network representing a large cohort of AD cases and controls.
By utilizing this efficient and flexible implementation that is cognizant of characteristics for networks representing complex disease genetics, we sift out communities representing patterns of interacting genetic variants that are associated with this enigmatic disease.
Recommended Citation
Gururaj Rao, Aditya Karnam, "Efficient Reduced BIAS Genetic Algorithm for Generic Community Detection Objectives" (2018). Theses. 331.
https://irl.umsl.edu/thesis/331
Included in
Artificial Intelligence and Robotics Commons, Bioinformatics Commons, Computational Biology Commons, Computational Neuroscience Commons, Databases and Information Systems Commons, Discrete Mathematics and Combinatorics Commons, Genetics Commons, Genomics Commons, Molecular Genetics Commons, Other Genetics and Genomics Commons, Theory and Algorithms Commons