
Proceedings of the Fourth International Software Metrics Symposium, 1997

Date: 5-7 Nov. 1997

  • Proceedings Fourth International Software Metrics Symposium

  • Author index

    Page(s): 175
  • Software reuse metrics for an industrial project

    Page(s): 165 - 173

    In 1990 a project was established at AT&T to build applications that manage telephone systems. Since then the project has successfully completed over 20 applications comprising about 500,000 lines of source code. These systems are used daily by hundreds of managers and operators to monitor and provision the AT&T long distance telephone network. The project's success can be attributed directly to an early commitment to making software reuse a major component of its software development process. A critical factor was the establishment of a feedback loop between consumers and producers of reusable software to foster continual improvement and extension of reusable code repositories. Progress in the feedback loop is measured by five different reuse measures. While no one measure is “best”, as each provides a different perspective on reuse, two derived from the consumer/producer model have proven particularly useful: use of reusable library components and reuse growth factor. The latter, developed in the study, helped uncover a new opportunity for reuse that was not obvious from other measures.

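    The two measures named above are not defined in the abstract. As a rough, hypothetical illustration of consumer-side reuse tracking (not the paper's formulas for "use of reusable library components" or "reuse growth factor"), a reuse level and its growth between releases could be computed like this in Python:

    # Hypothetical sketch of consumer-side reuse tracking; the ratio and the
    # numbers here are illustrative and are NOT the paper's reuse measures.

    def reuse_level(reused_loc, total_loc):
        """Fraction of delivered code drawn from the reusable repository."""
        return reused_loc / total_loc if total_loc else 0.0

    # Two successive releases of one application (invented numbers).
    r1 = reuse_level(reused_loc=12_000, total_loc=60_000)   # 0.20
    r2 = reuse_level(reused_loc=21_000, total_loc=70_000)   # 0.30
    print(f"reuse level R1={r1:.2f}, R2={r2:.2f}, growth={r2 / r1:.2f}x")
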
  • The impact of costs of misclassification on software quality modeling

    Page(s): 54 - 62

    A software quality model can make timely predictions of the class of a module, such as not fault prone or fault prone. Such predictions enable one to improve software development processes by targeting reliability improvement techniques more effectively and efficiently. Published software quality classification models generally minimize the number of misclassifications. The contribution of the paper is empirical evidence, supported by theoretical considerations, that such models can significantly benefit from minimizing the expected cost of misclassifications, rather than just the number of misclassifications. This is necessary when misclassification costs for not fault prone modules are quite different from those for fault prone modules. We illustrate the principles with a case study using nonparametric discriminant analysis. The case study examined a large subsystem of the Joint Surveillance Target Attack Radar System, JS-TARS, which is an embedded, real-time, military application. Measures of the process history of each module were the independent variables. Models with equal costs of misclassification were unacceptable, due to high misclassification rates for fault prone modules, but cost-weighted models had acceptable, balanced misclassification rates.

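    A minimal Python sketch of the cost-weighting idea, with invented costs and labels, and a plain score threshold standing in for the paper's nonparametric discriminant analysis: choose the classifier (here, a threshold) that minimizes expected misclassification cost rather than the raw number of misclassifications.

    # Pick a classification threshold on a fault-proneness score by minimizing
    # expected misclassification cost instead of the raw error count. Costs,
    # scores and labels are invented for illustration.

    C_FP = 1.0    # cost of flagging a not-fault-prone module (wasted review)
    C_FN = 10.0   # cost of missing a fault-prone module (faults reach the field)

    # (score, is_fault_prone) pairs for a toy validation set.
    modules = [(0.1, 0), (0.2, 0), (0.3, 1), (0.4, 0),
               (0.5, 0), (0.6, 0), (0.7, 1), (0.9, 1)]

    def totals(threshold):
        errors, cost = 0, 0.0
        for score, fault_prone in modules:
            flagged = score >= threshold
            if flagged and not fault_prone:
                errors, cost = errors + 1, cost + C_FP
            elif not flagged and fault_prone:
                errors, cost = errors + 1, cost + C_FN
        return errors, cost

    thresholds = [t / 10 for t in range(1, 10)]
    by_count = min(thresholds, key=lambda t: totals(t)[0])   # 0.7: fewest errors
    by_cost = min(thresholds, key=lambda t: totals(t)[1])    # 0.3: cheapest errors
    print(by_count, by_cost)  # cost weighting lowers the threshold so that
                              # fault-prone modules are not missed
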
  • A case study of evaluating configuration management practices with goal-oriented measurement

    Page(s): 144 - 151

    The paper describes the application of goal-oriented measurement for evaluating configuration management practices at Società Interbancaria per l'Automazione (SIA). SIA is in charge of running, developing, and maintaining the National Interbank Network of Italy. The results of a CMM-based process assessment indicated that configuration management (CM) practice was one of the most promising areas for improvement. A project was initiated aimed at establishing an improved CM process supported by state-of-the-art tools and incorporating sound practices. It was decided to apply the new process to one of the most important products of SIA, which deals with the development of a new generation of networking software. Goal-oriented measurement following the goal/question/metric (GQM) approach was applied to monitor the establishment of the CM process. The paper describes the establishment and execution of the measurement program and reports on the related product and process modeling. Different techniques for qualitative and quantitative analysis of the experimental data were applied. Selected results and experiences are reported.

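    As a reminder of what a GQM plan looks like, the Python fragment below sketches one illustrative CM-related goal refined into questions and metrics; the goal, questions and metrics are invented and are not SIA's actual measurement plan.

    # Illustrative GQM fragment for a configuration-management (CM) goal.
    # All goal/question/metric names are made up; only the goal -> question
    # -> metric structure reflects the GQM approach named in the abstract.

    gqm_plan = {
        "goal": "Analyze the new CM process in order to evaluate its adoption "
                "from the viewpoint of the development team",
        "questions": {
            "Q1: Are changes handled through the CM process?": [
                "number of change requests opened per week",
                "fraction of changes committed without an approved request",
            ],
            "Q2: Does the CM process slow development down?": [
                "median time from change request to approved baseline",
            ],
        },
    }

    for question, metrics in gqm_plan["questions"].items():
        print(question)
        for metric in metrics:
            print("   metric:", metric)
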
  • Metrics for database systems: an empirical study

    Page(s): 99 - 107

    An important task for any software project manager is to be able to predict and control project size and development effort. Unfortunately, there is comparatively little work, other than function points, that tackles the problem of building prediction systems for software that is dominated by data considerations, in particular systems developed using 4GLs. We describe an empirical investigation of 70 such systems. Various easily obtainable counts were extracted from data models (e.g. number of entities) and from specifications (e.g. number of screens). Using simple regression analysis, a prediction system for implementation size with an accuracy of MMRE = 21% was constructed. This approach offers several advantages. First, there tend to be fewer counting problems than with function points, since the metrics we used were based upon simple counts. Second, the prediction systems were calibrated to specific local environments rather than being based upon industry weights. We believe this enhanced their accuracy. Our work shows that it is possible to develop simple and useful local prediction systems based upon metrics easily derived from functional specifications and data models, without recourse to overly complex metrics or analysis techniques. We conclude that this type of use of metrics can provide valuable support for the management and control of 4GL and database projects.

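    A minimal Python sketch of the kind of prediction system described: fit a one-variable regression of implementation size on an entity count and report MMRE (mean magnitude of relative error). The data points are invented; only the procedure mirrors the abstract.

    # Simple-regression prediction of implementation size from a data-model
    # count, evaluated with MMRE = mean(|actual - predicted| / actual).

    entities = [5, 8, 12, 15, 20, 24, 30, 40]                      # data-model entities
    size_loc = [2100, 3500, 4800, 6400, 8100, 9800, 12500, 16400]  # delivered size

    n = len(entities)
    mean_x = sum(entities) / n
    mean_y = sum(size_loc) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(entities, size_loc))
    sxx = sum((x - mean_x) ** 2 for x in entities)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    predicted = [intercept + slope * x for x in entities]
    mmre = sum(abs(a - p) / a for a, p in zip(size_loc, predicted)) / n
    print(f"size ~ {intercept:.0f} + {slope:.1f} * entities, MMRE = {mmre:.1%}")
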
  • Software metrics model for quality control

    Page(s): 127 - 136

    A model is developed for validating and applying metrics for quality control, using the Space Shuttle flight software as an example. We validate metrics with respect to a quality factor in accordance with the metrics validation methodology previously developed. Boolean discriminant functions (BDFs) are developed for use in the quality control process. These functions make fewer mistakes in classifying low-quality software than is the case when linear vectors of metrics are used, because the BDFs include additional information for discriminating quality: critical values. Critical values are threshold values of metrics that are used to either accept or reject modules when the modules are inspected during the quality control process. A series of nonparametric statistical methods is used to: 1) identify a set of candidate metrics for further analysis; 2) identify the critical values of the metrics; and 3) find the optimal function of metrics and critical values. A marginal analysis should be performed when making a decision about how many metrics to use in a quality control process. Certain metrics are dominant in their effects on classifying quality, and additional metrics are not needed to accurately classify quality. This effect is called dominance. Related to the property of dominance is the property of concordance, which is the degree to which a set of metrics produces the same result in classifying software quality. A high value of concordance implies that additional metrics will not make a significant contribution to accurately classifying quality; hence, these metrics are redundant.

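    A minimal Python sketch of a Boolean discriminant function with critical values, under the simplifying assumption that a module is rejected for inspection when any metric exceeds its critical value; the metrics and thresholds are illustrative, not those validated on the Shuttle software.

    # Sketch of a Boolean discriminant function (BDF) for quality control:
    # reject (inspect) a module if any metric exceeds its critical value.
    # Metric names and critical values are invented for illustration.

    critical_values = {
        "statements": 80,
        "cyclomatic_complexity": 15,
        "unique_operators": 30,
    }

    def bdf_reject(module_metrics):
        """True if the module should be rejected/inspected under the BDF."""
        return any(module_metrics[m] > cv for m, cv in critical_values.items())

    module = {"statements": 95, "cyclomatic_complexity": 9, "unique_operators": 22}
    print(bdf_reject(module))   # True: statement count exceeds its critical value
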
  • Inheritance tree shapes and reuse

    Page(s): 34 - 42

    The shapes of forests of inheritance trees can affect the amount of code reuse in an object-oriented system. Designers can benefit from knowing how structuring decisions affect reuse, so that they can make better decisions. We show that a set of objective measures can classify forests of inheritance trees into a set of five shape classes. These shape classes determine bounds on reuse measures based on the notion of code savings. The reuse measures impart an ordering on the shape classes that demonstrates that some shapes have more capacity to support reuse through inheritance. An initial empirical study shows the application of the measures and demonstrates that real inheritance forests can be objectively and automatically classified into one of the five shape classes.

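    The abstract does not list the shape measures or the five classes; the Python sketch below only computes two typical ingredients of such measures (maximum depth and mean number of children) for a toy inheritance forest.

    # Illustrative tree-shape statistics for an inheritance forest. The
    # paper's five shape classes and exact measures are not reproduced here.

    forest = {                      # class -> list of direct subclasses
        "Shape": ["Polygon", "Ellipse"],
        "Polygon": ["Triangle", "Rectangle"],
        "Rectangle": ["Square"],
        "Ellipse": ["Circle"],
        "Triangle": [], "Square": [], "Circle": [],
    }

    def depth(cls):
        children = forest.get(cls, [])
        return 1 if not children else 1 + max(depth(c) for c in children)

    roots = set(forest) - {c for kids in forest.values() for c in kids}
    max_depth = max(depth(r) for r in roots)
    mean_children = sum(len(kids) for kids in forest.values()) / len(forest)
    print(max_depth, round(mean_children, 2))   # 4 and 0.86 for this toy forest
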
  • Predicting fault detection effectiveness

    Page(s): 82 - 89

    Regression methods are used to model software fault detection effectiveness in terms of several product and testing process measures. The relative importance of these product/process measures for predicting fault detection effectiveness is assessed for a specific data set. A substantial family of models is considered, specifically, the family of quadratic response surface models with two-way interaction. Model selection is based on “leave one out at a time” cross validation using the predicted residual sum of squares (PRESS) criterion. Prediction intervals for fault detection effectiveness are used to generate prediction intervals for the number of residual faults conditioned on the observed number of discovered faults. High levels of assurance about measures like fault detection effectiveness (residual faults) require more than just high (low) predicted values; they also require that the prediction intervals have high lower (low upper) bounds.

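    A minimal Python sketch of the PRESS criterion: refit the model with each observation left out, predict the held-out observation, and sum the squared prediction errors. A one-predictor linear model and invented data stand in for the paper's quadratic response-surface models.

    # PRESS (predicted residual sum of squares) via leave-one-out refitting.
    import numpy as np

    x = np.array([10, 14, 18, 25, 31, 40, 52, 60], dtype=float)      # e.g. test effort
    y = np.array([0.52, 0.58, 0.61, 0.70, 0.74, 0.80, 0.86, 0.88])   # effectiveness

    def fit_predict(x_train, y_train, x_new):
        slope, intercept = np.polyfit(x_train, y_train, 1)
        return slope * x_new + intercept

    press = 0.0
    for i in range(len(x)):
        mask = np.arange(len(x)) != i
        y_hat = fit_predict(x[mask], y[mask], x[i])
        press += (y[i] - y_hat) ** 2
    print(f"PRESS = {press:.5f}")   # compare across candidate models; smaller is better
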
  • A unified framework for cohesion measurement in object-oriented systems

    Page(s): 43 - 53

    The increasing importance being placed on software measurement has led to an increased amount of research in developing new software measures. Given the importance of object oriented development techniques, one specific area where this has occurred is cohesion measurement in object oriented systems. However, despite an interesting body of work, there is little understanding of the motivations and empirical hypotheses behind many of these new measures. It is often difficult to determine how such measures relate to one another and for which applications they can be used. As a consequence, it is very difficult for practitioners and researchers to obtain a clear picture of the state of the art in order to select or define cohesion measures for object oriented systems. To help remedy this situation, a unified framework, based on the issues discovered in a review of object oriented cohesion measures, is presented. The unified framework contributes to an increased understanding of the state of the art as it provides a mechanism for: (i) comparing measures and their potential use, (ii) integrating existing measures which examine the same concepts in different ways, and (iii) facilitating more rigorous decision making regarding the definition of new measures and the selection of existing measures for a specific goal of measurement.

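    As a concrete example of the kind of measure such a framework compares, the Python sketch below computes one simple class-cohesion count (the fraction of method pairs sharing at least one instance attribute) for a toy class; this illustrates the measured concept, not the framework itself.

    # Illustrative class-cohesion count: fraction of method pairs that share
    # at least one instance attribute. One simple measure of the kind the
    # framework discusses; the class and attribute usage are invented.
    from itertools import combinations

    uses = {                         # method -> attributes it references
        "deposit":  {"balance"},
        "withdraw": {"balance", "overdraft_limit"},
        "history":  {"transactions"},
    }

    pairs = list(combinations(uses, 2))
    sharing = sum(1 for a, b in pairs if uses[a] & uses[b])
    print(f"{sharing}/{len(pairs)} method pairs share an attribute")   # 1/3
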
  • A methodology for risk assessment of functional specification of software systems using colored Petri nets

    Page(s): 108 - 117

    The paper presents a methodology for risk assessment in complex real-time software systems at the early stages of development, namely the analysis/design phase. A heuristic risk assessment technique is described based on colored Petri net (CPN) models. The technique uses complexity metrics and severity measures in developing a heuristic risk factor from software functional specifications. The objective of risk assessment is to classify the software components according to their relative importance in terms of such factors as severity and complexity. Both traditional static and dynamic complexity measures are supported. Concurrency complexity is presented as a new dynamic complexity metric. This metric measures the added dynamic complexity due to concurrency in the system. Severity analysis is conducted using failure mode and effect analysis (FMEA). The methodology presented here is applied to a large-scale software system as presented in a companion paper (H. Ammar et al., 1997).

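    The abstract does not give the rule that combines complexity and severity into the heuristic risk factor; the Python sketch below uses one plausible combination (normalized complexity times normalized severity) with invented component data, purely as an illustration.

    # Sketch of a heuristic risk factor per component: normalized complexity
    # times normalized severity. The combination rule and the numbers are
    # illustrative; the paper derives its inputs from CPN models and FMEA.

    components = {
        # name: (dynamic complexity, FMEA severity on a 1-10 scale)
        "sensor_fusion":   (42.0, 9),
        "display_manager": (17.0, 4),
        "self_test":       (11.0, 6),
    }

    max_c = max(c for c, _ in components.values())
    max_s = max(s for _, s in components.values())

    risk = {name: (c / max_c) * (s / max_s) for name, (c, s) in components.items()}
    for name, hrf in sorted(risk.items(), key=lambda kv: -kv[1]):
        print(f"{name:16s} heuristic risk factor = {hrf:.2f}")
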
  • Testability measurements for data flow designs

    Page(s): 91 - 98

    The paper focuses on data flow designs. It presents a testability measurement based on the controllability/observability pair of attributes. A case study provided by AEROSPATIALE illustrates the testability analysis of an embedded data flow design. Applying such an analysis during the specification stage allows detection of weaknesses and appraisal of improvements in terms of testability.

  • Assessing feedback of measurement data: relating Schlumberger RPS practice to learning theory

    Page(s): 152 - 164

    Schlumberger RPS successfully applies software measurement to support its software development projects. It is proposed that the success of these measurement practices is mainly based on the organization of the interpretation process. This interpretation of the measurement data by the project team members is performed in so-called 'feedback sessions'. Many researchers identify the feedback process of measurement data as crucial to the success of a quality improvement program. However, few guidelines exist about the organization of feedback sessions. For instance, with what frequency should feedback sessions be held, how much information should be presented in a single session, and what amount of user involvement is advisable? In Schlumberger RPS's search to improve feedback sessions, the authors explored learning theories to provide guidelines for these types of questions. After all, what is feedback more than learning?

  • Evaluating the interrater agreement of process capability ratings

    Page(s): 2 - 11

    The reliability of process assessments has received some study in the recent past, much of it being conducted within the context of the SPICE (Software Process Improvement and Capability dEtermination) trials. In this paper, we build upon this work by evaluating the reliability of ratings on each of the practices that make up the SPICE capability dimension. The type of reliability that we evaluate is inter-rater agreement: the agreement amongst independent assessors' capability ratings. Inter-rater agreement was found to be generally high. We also identify one particular practice that exhibits low agreement in its ratings.

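    The abstract does not name the agreement statistic used; a standard choice for this kind of inter-rater agreement is Cohen's kappa, which corrects observed agreement for the agreement expected by chance. A minimal Python sketch with invented ratings on a SPICE-style N/P/L/F adequacy scale:

    # Cohen's kappa for two assessors rating the same set of practices.
    # Whether kappa is the statistic used in the paper is an assumption here;
    # the ratings below are invented.
    from collections import Counter

    RATINGS = ["N", "P", "L", "F"]   # not / partially / largely / fully adequate
    rater_a = ["F", "L", "L", "P", "F", "N", "L", "F", "P", "L"]
    rater_b = ["F", "L", "P", "P", "F", "P", "L", "F", "P", "L"]

    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum((count_a[r] / n) * (count_b[r] / n) for r in RATINGS)

    kappa = (observed - expected) / (1 - expected)
    print(f"observed={observed:.2f} expected={expected:.2f} kappa={kappa:.2f}")
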
  • Towards a theoretical framework for measuring software attributes

    Page(s): 119 - 126

    Several attributes (e.g., size, complexity, cohesion, coupling) are commonly used in software engineering to refer to software product properties. A large number of measures have been proposed in the literature to measure these attributes. However, since software attributes are often defined in fuzzy and ambiguous ways, it is sometimes unclear whether the proposed measures are adequate for the software attributes they purport to measure (i.e., their construct validity). In recent years, a few approaches have been proposed to lay theoretical foundations for defining measures for software attributes, but no widespread agreement has been reached on a rigorous, unambiguous definition of software attributes. We first extend previous work carried out on axiomatic approaches for the definition of measures for software attributes (E. Weyuker, 1988; K.B. Lakshmanan et al., 1991). Second, we show how a hierarchical axiomatic framework can be constructed to support the definition of consistent measures for a given attribute at different levels of measurement. The paper shows how axiomatic approaches can be combined with the theory of measurement scales so that, depending on the level of sophistication of our empirical understanding of the attribute, we can select an appropriate level of measurement and a suitable axiomatic framework.

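    For readers unfamiliar with the axiomatic style, the LaTeX fragment below lists illustrative properties for a size measure in the spirit of the axiomatic approaches the paper builds on (Weyuker-style properties); it is not the paper's own axiom set.

    % Illustrative size-measure properties; NOT the axiom set defined in the paper.
    \begin{itemize}
      \item Non-negativity: $\mathrm{Size}(S) \ge 0$ for every system $S$.
      \item Null value: if $S$ contains no elements, then $\mathrm{Size}(S) = 0$.
      \item Module additivity: if $S$ is composed of two disjoint modules
            $m_1$ and $m_2$, then
            $\mathrm{Size}(S) = \mathrm{Size}(m_1) + \mathrm{Size}(m_2)$.
    \end{itemize}
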
  • Metrics and laws of software evolution - the nineties view

    Page(s): 20 - 32

    The process of E-type software development and evolution has proven most difficult to improve, possibly due to the fact that the process is a multi-input, multi-output system involving feedback at many levels. This observation, first recorded in the early 1970s during an extended study of OS/360 evolution, was recently captured in the FEAST (Feedback, Evolution And Software Technology) hypothesis, which is being studied in the on-going two-year project FEAST/1. Preliminary conclusions based on a study of a financial transaction system, Logica's Fastwire (FW), are outlined and compared with those reached during the earlier OS/360 study. The new analysis supports, or at least does not contradict, the laws of software evolution, suggesting that the 1970s approach to metric analysis of software evolution is still relevant today. It is hoped that FEAST/1 will provide a foundation for mastering the feedback aspects of the software evolution process, opening up new paths for process modelling and improvement.

  • Using relative complexity to allocate resources in gray-box testing of object-oriented code

    Page(s): 74 - 81

    Software testing costs would be reduced if managers and testing engineers could gauge which parts of a system were more complex and thus more likely to have faults. Once these areas are identified, testing resources and testing priority can be assigned accordingly. The paper defines a method that uses the relative complexity metric to allocate resources for gray-box testing in an environment where object-oriented code is used and historical data are not available. The proposed method can also be applied to black-box and white-box testing, as well as to software quality assessments such as maintainability and reliability. The work on an industrial C++ software subsystem presented here shows that the rank order of minor test areas of the subsystem by relative test complexity is significantly similar to the rank order obtained from the experts who designed, wrote and tested the code.

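    Relative complexity is typically formed as a weighted combination of standardized complexity metrics; the exact weighting is not given in the abstract. The Python sketch below standardizes a few invented class metrics, combines them with assumed weights, and allocates a test-hour budget in proportion to the result.

    # Allocate test effort in proportion to a relative-complexity score formed
    # as a weighted sum of standardized metrics. Metrics, weights and the
    # budget are illustrative; the paper's exact weighting is not shown here.
    import numpy as np

    classes = ["Parser", "Scheduler", "Logger", "NetIO"]
    metrics = np.array([            # rows: classes; cols: LOC, WMC, coupling
        [1200, 35, 9],
        [2100, 52, 14],
        [400, 10, 3],
        [900, 28, 11],
    ], dtype=float)

    z = (metrics - metrics.mean(axis=0)) / metrics.std(axis=0)   # standardize
    weights = np.array([0.4, 0.4, 0.2])                          # assumed weights
    relative_complexity = z @ weights

    budget_hours = 200
    score = relative_complexity - relative_complexity.min() + 1.0  # keep every class > 0
    share = score / score.sum()
    for name, hours in zip(classes, budget_hours * share):
        print(f"{name:10s} {hours:5.1f} test hours")
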
  • Some misconceptions about lines of code

    Page(s): 137 - 142

    Source lines of code (SLOC) is perhaps the oldest of software metrics, and still a benchmark for evaluating new ones. Despite the extensive experience with the SLOC metric, there are still a number of misconceptions about it. The paper addresses three of them: (1) that the format of SLOC is relevant to how to properly count it (a simple experiment shows that, in fact, it does not matter), (2) that SLOC is most useful as a predictor of software quality (in fact it is most useful as a covariate of other predictors), and (3) that there is an important inverse relationship between defect density and code size (in fact, this is an arithmetic artifact of plotting bugs-per-SLOC against SLOC).

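    The third misconception is easy to reproduce: even when defect counts are generated independently of module size, plotting defects/SLOC against SLOC yields an apparent inverse relationship, because size sits in the denominator of one axis and on the other axis. A small Python simulation with invented data:

    # Demonstration of the arithmetic artifact behind "defect density falls as
    # modules get bigger": defects are generated independently of size, yet
    # defects/SLOC still correlates negatively with SLOC.
    import random

    random.seed(1)
    sloc = [random.randint(50, 2000) for _ in range(500)]
    defects = [random.randint(0, 20) for _ in range(500)]   # independent of size
    density = [d / s for d, s in zip(defects, sloc)]

    def pearson(xs, ys):
        n = len(xs)
        mx, my = sum(xs) / n, sum(ys) / n
        cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        sx = sum((x - mx) ** 2 for x in xs) ** 0.5
        sy = sum((y - my) ** 2 for y in ys) ** 0.5
        return cov / (sx * sy)

    print("corr(defects, SLOC)      =", round(pearson(defects, sloc), 2))   # near 0
    print("corr(defects/SLOC, SLOC) =", round(pearson(density, sloc), 2))   # negative
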
  • An empirical analysis of equivalence partitioning, boundary value analysis and random testing

    Page(s): 64 - 73

    An experiment comparing the effectiveness of equivalence partitioning (EP), boundary value analysis (BVA) and random testing was performed, based on an operational avionics system of approximately 20,000 lines of Ada code. The paper introduces an experimental methodology that considers all possible input values that satisfy a test technique and all possible input values that would cause a module to fail (rather than arbitrarily chosen values from these sets) to determine absolute values for the effectiveness of each test technique. As expected, an implementation of BVA was found to be the most effective, with neither EP nor random testing proving even half as effective. The random testing results were surprising: just 8 test cases per module were required to equal the effectiveness of EP, although somewhere in the region of 50,000 random test cases were required to equal the effectiveness of BVA.

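    To make the three techniques concrete, the Python sketch below generates test inputs for a single integer parameter with a valid range (an invented example, far simpler than the study's avionics modules): EP picks one representative per partition, BVA picks values at and around each boundary, and random testing samples the whole input domain.

    # Test-input generation for one integer parameter valid in [LOW, HIGH],
    # illustrating the three techniques compared in the paper. The parameter
    # and its range are invented for the example.
    import random

    LOW, HIGH = 1, 100                    # valid range of the parameter
    DOMAIN_MIN, DOMAIN_MAX = -32768, 32767

    # Equivalence partitioning: one representative per partition
    # (below range, in range, above range).
    ep_cases = [LOW - 10, (LOW + HIGH) // 2, HIGH + 10]

    # Boundary value analysis: values on and adjacent to each boundary.
    bva_cases = [LOW - 1, LOW, LOW + 1, HIGH - 1, HIGH, HIGH + 1]

    # Random testing: sample uniformly from the whole input domain.
    random.seed(0)
    random_cases = [random.randint(DOMAIN_MIN, DOMAIN_MAX) for _ in range(8)]

    print("EP: ", ep_cases)
    print("BVA:", bva_cases)
    print("RT: ", random_cases)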