University of Minnesota
Software Engineering Center

Ye Yang

Recent Publications

When to Use Data from Other Projects for Effort Estimation

Collecting the data required for quality prediction within a development team is time-consuming and expensive. An alternative is to make predictions using data from other projects, or even other companies. We show that with relevancy filtering, imported data performs the same as local data, while without it, imported data performs worse. Therefore, we recommend applying relevancy filtering whenever generating estimates from another project's data.
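The abstract does not specify the filtering algorithm, but relevancy filtering is commonly done by keeping only the cross-company records nearest to the local ones in feature space. The sketch below illustrates that idea with a simple k-nearest-neighbor filter; the feature names, data values, and `k` are illustrative assumptions, not from the paper.

```python
import math

# Hypothetical toy data: each record is ((KLOC, team_size), effort).
# All values are illustrative, not taken from the publication.
local = [((10.0, 4.0), 12.0), ((12.0, 5.0), 15.0)]
cross = [((11.0, 4.0), 13.0), ((80.0, 30.0), 200.0),
         ((9.0, 5.0), 11.0), ((60.0, 25.0), 150.0)]

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def relevancy_filter(local, cross, k=2):
    """Keep, for each local record, its k nearest cross-company records."""
    kept = set()
    for feats, _ in local:
        ranked = sorted(range(len(cross)),
                        key=lambda i: euclidean(feats, cross[i][0]))
        kept.update(ranked[:k])
    return [cross[i] for i in sorted(kept)]

filtered = relevancy_filter(local, cross, k=2)
# Only the cross-company projects similar to local ones survive;
# the two large, dissimilar projects are discarded.
estimate = sum(effort for _, effort in filtered) / len(filtered)
```

Under this toy setup, the filter discards the two large projects and the mean-effort estimate is driven only by projects resembling the local ones, which is the intuition behind the paper's recommendation.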

Measuring the Heterogeneity of Cross-company Datasets

As a standard practice, general effort estimate models are calibrated from large cross-company datasets. However, many of the records within such datasets are taken from companies that have calibrated the model to match their own local practices. Locally calibrated models are a double-edged sword; they often improve estimate accuracy for that particular organization, but they also encourage the growth of local biases. Such biases remain present when projects from that firm are used in a new cross-company dataset. Over time, such biases compound, and the