Publications – Analytics of Software, GAmes And Repository Data (ASGAARD) Lab

Hao Li; Cor-Paul Bezemer

Bridging the language gap: an empirical study of bindings for open source machine learning libraries across software package ecosystems Journal Article

Empirical Software Engineering, 30 (6), 2024.

Tags: Library bindings, Machine learning, SE4AI, SE4ML

@article{li_MLbindings,

title = {Bridging the language gap: an empirical study of bindings for open source machine learning libraries across software package ecosystems},

author = {Hao Li and Cor-Paul Bezemer},

year = {2024},

date = {2024-10-18},

urldate = {2024-10-18},

journal = {Empirical Software Engineering},

volume = {30},

number = {6},

abstract = {Open source machine learning (ML) libraries enable developers to integrate advanced ML functionality into their own applications. However, popular ML libraries, such as TensorFlow, are not available natively in all programming languages and software package ecosystems. Hence, developers who wish to use an ML library which is not available in their programming language or ecosystem of choice, may need to resort to using a so-called binding library (or binding). Bindings provide support across programming languages and package ecosystems for reusing a host library. For example, the Keras .NET binding provides support for the Keras library in the NuGet (.NET) ecosystem even though the Keras library was written in Python. In this paper, we collect 2,436 cross-ecosystem bindings for 546 ML libraries across 13 software package ecosystems by using an approach called BindFind, which can automatically identify bindings and link them to their host libraries. Furthermore, we conduct an in-depth study of 133 cross-ecosystem bindings and their development for 40 popular open source ML libraries. Our findings reveal that the majority of ML library bindings are maintained by the community, with npm being the most popular ecosystem for these bindings. Our study also indicates that most bindings cover only a limited range of the host library’s releases, often experience considerable delays in supporting new releases, and have widespread technical lag. Our findings highlight key factors to consider for developers integrating bindings for ML libraries and open avenues for researchers to further investigate bindings in software package ecosystems.},

keywords = {Library bindings, Machine learning, SE4AI, SE4ML},

pubstate = {published},

tppubtype = {article}

}

Hao Li; Gopi Krishnan Rajbahadur; Cor-Paul Bezemer

Studying the Impact of TensorFlow and PyTorch Bindings on Machine Learning Software Quality Journal Article

ACM Transactions on Software Engineering and Methodology, 2024.

Tags: Library bindings, Machine learning, SE4AI, SE4ML, Software quality