Predicting Potential Co-Authorship Using Random Forest: Case of Scientific Publications in Indonesian Institute of Sciences

Rizka Rahmaida, Asep Saefuddin, Bagus Sartono


Research collaboration is a strength in research management because it can raise both the quantity and the quality of research. The co-authorship network is one proxy for evaluating emerging research collaborations. A first co-authorship between a pair of authors plays an important role in the success of their future co-authorship. This research therefore aims to build a model that predicts new co-authorships as potential co-authorships. The data are scientific articles on Indonesian biodiversity research indexed in Scopus and published during 2006-2015. New co-authorships among 4,628 pairs of authors were analyzed in terms of each pair's similarity in co-authorship network, research interest, and community, in order to predict whether a pair of authors will have a new co-authorship in the future. A random forest classifier was used to build the model, after applying random undersampling as a preprocessing step and 10-fold cross-validation over various parameter settings. The results show that similarity in network, community network, and research interest are good features for predicting potential co-authorship between a pair of authors. Furthermore, author pairs that are predicted to co-author and that involve authors from the Indonesian Institute of Sciences are identified as potential partners recommended for the development of research teams.
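
As a rough illustration of two steps named in the abstract, the sketch below computes network-similarity scores for author pairs that have not yet co-authored, then balances the classes by random undersampling. It is a minimal sketch under stated assumptions, not the authors' implementation: the toy network, the choice of common neighbours and the Jaccard coefficient as similarity measures, the labels, and every function name are hypothetical. The abstract does not say which measures were used.

```python
import random
from itertools import combinations

# Hypothetical toy co-authorship network: author -> set of past co-authors.
coauthors = {
    "A": {"B", "C"},
    "B": {"A", "C", "D"},
    "C": {"A", "B", "D"},
    "D": {"B", "C", "E"},
    "E": {"D"},
}

def common_neighbours(u, v):
    """Number of co-authors that u and v share."""
    return len(coauthors[u] & coauthors[v])

def jaccard(u, v):
    """Shared co-authors divided by all co-authors of either author."""
    union = coauthors[u] | coauthors[v]
    return len(coauthors[u] & coauthors[v]) / len(union) if union else 0.0

def candidate_pairs():
    """Author pairs with no co-authorship so far (candidates for a new link)."""
    return [(u, v) for u, v in combinations(sorted(coauthors), 2)
            if v not in coauthors[u]]

def undersample(rows, labels, seed=0):
    """Randomly drop majority-class rows until both classes have equal size."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    keep = sorted(minority + rng.sample(majority, len(minority)))
    return [rows[i] for i in keep], [labels[i] for i in keep]

pairs = candidate_pairs()
features = [(common_neighbours(u, v), jaccard(u, v)) for u, v in pairs]
labels = [1, 0, 0, 0]  # assumed outcome: only the first pair co-authors later
X_balanced, y_balanced = undersample(features, labels)
```

The balanced feature rows and labels would then be passed to a classifier such as a random forest, with its parameters chosen by 10-fold cross-validation. That step is omitted here to keep the sketch free of third-party dependencies.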





Copyright (c) 2019 STI Policy and Management Journal

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Copyright of Journal of STI (Science Technology Innovation) Policy and Management Journal (e-ISSN 2502-5996 p-ISSN 1907-9753).