On the Analysis of Power Law Distribution in Software Component Sizes

Component-based software development (CBSD) is an active area of research. Ascertaining the quality of components is important for overall software quality assurance in CBSD. One of the important metrics for measuring defects, analyzability, effort, and cost in CBSD is component size. The paper presents an analytical model based on maximization of Tsallis entropy to obtain a closed-form expression for the component size distribution (maximum Tsallis entropy component size distribution, MTECSD) in steady state. It is found that the component size distribution follows a power law asymptotically. A procedure based on the generalized Jensen–Shannon measure is developed to estimate model parameters. A detailed analysis of many popular probability distributions, along with MTECSD, is carried out on many diverse real data sets of component-based software systems. The analysis reveals that the lognormal and MTECSD distributions fit component sizes well in many software systems, confirming the presence of power law behavior. Software systems whose component size distributions are described by MTECSD are in equilibrium, implying that new defects in these systems occur only occasionally. Power law behavior in component sizes also implies high variation, which makes software harder to analyze. Precise knowledge of the component size distribution also provides an alternative method for computing effort and cost estimates with a modified COCOMO model.
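
As a rough illustration of why maximizing Tsallis entropy leads to a power-law tail, the sketch below uses the standard textbook definition S_q = (1 - sum(p_i^q)) / (q - 1) and the generic q-exponential form that this maximization yields under a mean constraint. It is not the paper's MTECSD expression, whose exact constraints and parameters are not given in this abstract; the function names and the values of q and beta are assumptions chosen only for illustration.

```python
import math


def tsallis_entropy(p, q):
    """Tsallis entropy S_q = (1 - sum(p_i^q)) / (q - 1).

    Falls back to the Shannon entropy in the limit q -> 1.
    """
    if abs(q - 1.0) < 1e-12:
        return -sum(pi * math.log(pi) for pi in p if pi > 0)
    return (1.0 - sum(pi ** q for pi in p if pi > 0)) / (q - 1.0)


def q_exponential_weights(sizes, q, beta):
    """Unnormalized weights [1 + (q - 1) * beta * x]^(-1 / (q - 1)), for q > 1.

    This is the generic maximum-Tsallis-entropy shape, used here as a
    stand-in; it is not taken from the paper.
    """
    return [(1.0 + (q - 1.0) * beta * x) ** (-1.0 / (q - 1.0)) for x in sizes]


def normalize(weights):
    """Scale non-negative weights so that they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical component sizes 1..10000 and illustrative parameters.
sizes = list(range(1, 10001))
q, beta = 1.5, 0.01
p = normalize(q_exponential_weights(sizes, q, beta))

# Slope of log p(x) against log x between x = 1000 and x = 10000.
# For large x it approaches -1 / (q - 1), i.e. a power-law tail.
tail_slope = (math.log(p[9999]) - math.log(p[999])) / (
    math.log(10000) - math.log(1000)
)
print("log-log tail slope:", tail_slope)
print("asymptotic exponent:", -1.0 / (q - 1.0))
print("Tsallis entropy of p:", tsallis_entropy(p, q))
```

On a log-log scale the tail of such a distribution is close to a straight line, which is the asymptotic power law behavior the abstract refers to.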

This is the peer reviewed version of the following article: [On the analysis of power law distribution in software component sizes. Journal of Software: Evolution and Process (2021)], which has been published in final form at https://doi.org/10.1002/smr.2417. This article may be used for non-commercial purposes in accordance with Wiley Terms and Conditions for Use of Self-Archived Versions: https://authorservices.wiley.com/author-resources/Journal-Authors/licensing/self-archiving.html#3.

Metadata

Work Title: On the Analysis of Power Law Distribution in Software Component Sizes
Access: Open Access
Creators:
  1. Shachi Sharma
  2. Parag C. Pendharkar
Keyword:
  1. Component based software development
  2. COCOMO model
  3. maximum entropy principle
  4. non-linear regression
  5. power law probability distribution
  6. Tsallis entropy
License: In Copyright (Rights Reserved)
Work Type: Article
Publisher: Wiley
Publication Date: December 28, 2021
Publisher Identifier (DOI): 10.1002/smr.2417
Source: Journal of Software: Evolution and Process
Deposited: August 03, 2022

Work History

Version 1
published

  • Created
  • Added SoftwareEvolution-1.pdf
  • Added Creator Shachi Sharma
  • Added Creator Parag C. Pendharkar
  • Published
  • Updated Work Title, Keyword, Description
  • Updated Work Title
  • Updated