Dec 15, 2006: Computational Discovery in Evolving Complex Networks
Filed in: Ph.D. Thesis
Yongqin Gao, University of Notre Dame
Abstract
A growing number of researchers study evolving complex networks, using a variety of methods. We designed and developed a computational discovery methodology for studying these networks. The methodology is a cyclic procedure involving four processes: data mining, network analysis, computer simulation, and collaboration. The data mining process discovers potential associations and patterns. The network analysis process assesses the discoveries made in the data mining process and analyzes the measures of the network. The computer simulation process generates a "fit" model that simulates the evolution of the complex network, based on the discoveries and measures from the previous processes. Finally, the collaboration process designs and maintains a research collaboratory to host our research and support similar research. To illustrate the methodology, we applied it in a case study of Open Source Software research, specifically a study of the SourceForge.net development community. By applying the methodology, we generated a distribution-based predictor of the "popularity" of a project, gained insight into the structure and evolution of the SourceForge.net community network, generated a "fit" model to simulate the evolution of the community, and implemented and maintained a research collaboratory for Open Source Software research.
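As a rough illustration of what a network analysis step could look like, the sketch below builds a developer network from project memberships and computes its degree distribution. It is not taken from the thesis: the membership pairs, the rule that two developers are linked when they share a project, and the function names `developer_network` and `degree_distribution` are all assumptions made for this example.

```python
from collections import Counter, defaultdict

# Hypothetical (developer, project) membership pairs; not real SourceForge.net data.
MEMBERSHIPS = [
    ("dev_a", "proj_1"),
    ("dev_b", "proj_1"),
    ("dev_b", "proj_2"),
    ("dev_c", "proj_2"),
    ("dev_c", "proj_3"),
    ("dev_d", "proj_3"),
]


def developer_network(memberships):
    """Link two developers when they share at least one project (assumed rule)."""
    members = defaultdict(set)
    for dev, proj in memberships:
        members[proj].add(dev)

    neighbors = defaultdict(set)
    for devs in members.values():
        for dev in devs:
            # Every other member of the same project becomes a neighbor.
            neighbors[dev] |= devs - {dev}
    return dict(neighbors)


def degree_distribution(neighbors):
    """Map each degree to the number of developers having that degree."""
    return Counter(len(adj) for adj in neighbors.values())


network = developer_network(MEMBERSHIPS)
distribution = degree_distribution(network)
print(sorted(distribution.items()))
```

A degree distribution like this is one example of a network measure; comparing such measures between observed data and simulated output is one way a simulation model could be judged a "fit", though the thesis may use different measures and criteria.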