Genetic-like algorithm applied on citation networks for evaluating scientific publications


Conference on Complex Systems (2016).


Citation networks of scientific publications track the evolution of scientific knowledge in the modern age. Grasping the role of a single publication in this complex system is a challenge for scientometrics. We consider a biological analogy between the citation network and the network of evolutionary reproduction. Treating a scientific publication as a biological organism, we define the papers it cites as its “parents” and the publications citing it as its “offspring”, or descendants. Under this analogy the fitness of a paper is exactly the number of citations it receives, which is already the most common measure of evaluation in scientometrics. We know, however, that the most basic units of evolution are not species or individuals but genes, which survive and propagate through living organisms. In the citation setting, we take the corresponding units to be “memes” in Richard Dawkins's sense: ideas or pieces of scientific knowledge. Assuming that every newly published work introduces some new idea (the analogue of mutation or variation in biology), this knowledge propagates down the tree of descendants in a manner similar to genes. The content of a paper represents the “genome” of a node, and it is composed of the paper's own newly introduced memes (knowledge) and the weighted content of the papers it cites. The more papers a given publication cites, the more ideas it synthesizes, so the weight of each inherited idea in the new publication decreases as parenthood is shared among more papers. In our study, we analyze this simple algorithm on different citation networks and show how the fitness defined for new ideas could be a better scientometric indicator than the simple citation count of papers. This model could also provide a framework for tracking individual ideas and innovations within the abundance of intertwined citations, giving additional insight from historical, sociological and economic perspectives.
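
The propagation rule described above can be illustrated with a minimal Python sketch. The abstract does not specify the weighting scheme, so the choices here are assumptions made only for illustration: each paper contributes one unit of its own meme, each of its k cited papers passes on its genome with equal weight 1/k, and the fitness of a meme is the total weight it carries in all other papers. The function names `propagate_memes` and `meme_fitness` are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def propagate_memes(cites):
    """Compute the "genome" of every paper in a citation network.

    cites maps each paper to the list of papers it cites (its "parents").
    Assumed rule (not taken from the source): a paper carries weight 1.0
    of its own meme, plus each parent's genome scaled by 1 / (number of
    parents).
    """
    genomes = {}
    # static_order() yields cited papers before the papers citing them.
    for paper in TopologicalSorter(cites).static_order():
        parents = cites.get(paper, [])
        genome = {paper: 1.0}  # the newly introduced idea
        for parent in parents:
            for meme, weight in genomes[parent].items():
                genome[meme] = genome.get(meme, 0.0) + weight / len(parents)
        genomes[paper] = genome
    return genomes


def meme_fitness(genomes):
    """Assumed fitness: total weight of a paper's meme in all other papers."""
    fitness = {paper: 0.0 for paper in genomes}
    for paper, genome in genomes.items():
        for meme, weight in genome.items():
            if meme != paper:
                fitness[meme] += weight
    return fitness


# Toy network: B and C each cite A; D cites both B and C.
toy_network = {"B": ["A"], "C": ["A"], "D": ["B", "C"]}
toy_genomes = propagate_memes(toy_network)
toy_fitness = meme_fitness(toy_genomes)
```

In this toy network paper A receives two direct citations, but under the assumed rule its meme reaches a fitness of 3.0, because D inherits it through both B and C. B and C each reach 0.5, since D splits its inherited weight between its two parents.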

