University of Minnesota
Software Engineering Center


Ye Yang

Recent Publications

When to Use Data from Other Projects for Effort Estimation

Collecting the data required for quality prediction within a development team is time-consuming and expensive. An alternative is to make predictions using data from other projects or even other companies. We show that with relevancy filtering, imported data performs as well as local data, while without it, imported data performs worse. Therefore, we recommend the use of relevancy filtering whenever generating estimates using data from another project.
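The idea behind relevancy filtering can be sketched as follows: before estimating with cross-company data, keep only the few historical projects most similar to the project being estimated, then predict from those analogues. The sketch below is a minimal illustration under assumed choices (min-max normalization, Euclidean distance, k=3 analogues, median effort); the paper's actual filtering method may differ.

```python
def normalize(rows):
    """Min-max normalize each feature column to [0, 1]."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [
        [(v - l) / (h - l) if h > l else 0.0 for v, l, h in zip(r, lo, hi)]
        for r in rows
    ]

def relevancy_filter(cross_data, target, k=3):
    """Return the k cross-project records nearest to `target`.

    cross_data: list of (feature_vector, effort) pairs from other projects.
    target: feature vector of the project to estimate.
    """
    feats = [f for f, _ in cross_data] + [target]
    norm = normalize(feats)
    norm_cross, norm_target = norm[:-1], norm[-1]
    dist = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    ranked = sorted(
        zip(norm_cross, (e for _, e in cross_data)),
        key=lambda pair: dist(pair[0], norm_target),
    )
    return ranked[:k]

def estimate_effort(cross_data, target, k=3):
    """Median effort of the k most relevant cross-project analogues."""
    efforts = sorted(e for _, e in relevancy_filter(cross_data, target, k))
    return efforts[len(efforts) // 2]
```

With this filter in place, a small cross-company dataset contributes only its records that resemble the local project, which is the mechanism the abstract credits for closing the gap between imported and local data.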

Measuring the Heterogeneity of Cross-Company Datasets

As a standard practice, general effort estimate models are calibrated from large cross-company datasets. However, many of the records within such datasets are taken from companies that have calibrated the model to match their own local practices. Locally calibrated models are a double-edged sword; they often improve estimate accuracy for that particular organization, but they also encourage the growth of local biases. Such biases remain present when projects from that firm are used in a new cross-company dataset. Over time, such biases compound, and the