Master Thesis Defense on Investigating Dependency Code Reuse Using Callgraphs

Nov 30, 2020 09:00 — 11:00
Online (Zoom)

Software development is more and more reliant on external code. This external code is developed, bundled into packages and shared using package repositories like or npm. Reusing shared code bundles greatly improves development speed but without proper care and knowledge of included external code it can cause issues as well. Over-reliance on simple trivial packages that can be easily implemented locally can cause severe consequences like not being able to build if the package is no longer available. Furthermore, including large packages when only a small amount of it is being used can cause software projects become bloated with unnecessary code. This thesis proposes several metrics: leanness index, software composition index and utilization index that quantify how software projects reuse their dependencies or how much of the dependencies are utilized by their dependents. All three of the metrics are based on full function callgraph of the applications. Computation of these metrics was implemented for Rust and applied on every package in Rust package repository General findings showed that dependency leanness was rather low: 21.7% of all packages in used less than 5% of their included dependencies while 95% of all crates used less than 62.4%. Another discovery revealed that packages in tend to consist mostly of code from external dependencies as opposed to local code: on average 91% of lines of code come from external dependencies. Lastly, using utilization index functions that were never used by any other crate in were identified: in 95% of packages, 71% of their callgraphs are never used.