IEEE Transactions on Software Engineering

Issue 8 • August 2013

  • Active learning and effort estimation: Finding the essential content of software effort estimation data

    Page(s): 1040 - 1053

    Background: Do we always need complex methods for software effort estimation (SEE)? Aim: To characterize the essential content of SEE data, i.e., the least number of features and instances required to capture the information within SEE data. If the essential content is very small, then 1) the contained information must be very brief and 2) the added value of complex learning schemes must be minimal. Method: Our QUICK method computes the Euclidean distance between rows (instances) and columns (features) of SEE data, then prunes synonyms (similar features) and outliers (distant instances), then assesses the reduced data by comparing predictions from 1) a simple learner using the reduced data and 2) a state-of-the-art learner (CART) using all data. Performance is measured using hold-out experiments and expressed in terms of mean and median MRE, MAR, PRED(25), MBRE, MIBRE, or MMER. Results: For 18 datasets, QUICK pruned 69 to 96 percent of the training data (median = 89 percent). k = 1 nearest neighbor predictions (in the reduced data) performed as well as CART's predictions (using all data). Conclusion: The essential content of some SEE datasets is very small. Complex estimation methods may be overelaborate for such datasets and can be simplified. We offer QUICK as an example of such a simpler SEE method.
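
A minimal sketch of the general idea in Python: prune near-duplicate feature columns by Euclidean distance, prune outlier rows, then predict with a k = 1 nearest neighbor. The thresholds, normalization, and helper names are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def quick_style_reduce(X, y, feature_tol=0.5, keep_rows=0.2):
    """Prune synonym features and outlier rows, QUICK-style (illustrative)."""
    Xn = (X - X.min(axis=0)) / (np.ptp(X, axis=0) + 1e-12)  # min-max normalize
    # Prune "synonyms": drop any column too close to an already kept column.
    kept_cols = []
    for j in range(Xn.shape[1]):
        if all(np.linalg.norm(Xn[:, j] - Xn[:, k]) > feature_tol for k in kept_cols):
            kept_cols.append(j)
    Xr = Xn[:, kept_cols]
    # Prune "outliers": keep only the rows closest to some other row.
    d = np.array([np.sort(np.linalg.norm(Xr - row, axis=1))[1] for row in Xr])
    kept = np.argsort(d)[: max(1, int(keep_rows * len(Xr)))]
    return Xr[kept], y[kept], kept_cols

def knn1_predict(Xr, yr, query):
    """k = 1 nearest-neighbor effort estimate on the reduced data."""
    return yr[np.argmin(np.linalg.norm(Xr - query, axis=1))]
```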

  • Balancing Privacy and Utility in Cross-Company Defect Prediction

    Page(s): 1054 - 1068

    Background: Cross-company defect prediction (CCDP) is a field of study where an organization lacking enough local data can use data from other organizations to build defect predictors. To support CCDP, data must be shared. Such shared data must be privatized, but that privatization could severely damage the utility of the data. Aim: To enable effective defect prediction from shared data while preserving privacy. Method: We explore privatization algorithms that maintain class boundaries in a dataset. CLIFF is an instance pruner that deletes irrelevant examples. MORPH is a data mutator that moves the data a random distance, taking care not to cross class boundaries. CLIFF+MORPH is tested in a CCDP study among 10 defect datasets from the PROMISE data repository. Results: We find: 1) CLIFF+MORPH provides more privacy than the state-of-the-art privacy algorithms; 2) in terms of utility, measured by defect prediction performance, CLIFF+MORPH performs significantly better. Conclusions: For the OO defect data studied here, data can be privatized and shared without a significant degradation in utility. To the best of our knowledge, this is the first published result where privatization does not compromise defect prediction.
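
As one concrete reading of the MORPH step (a sketch of the stated idea, not the authors' implementation), the code below shifts each instance by a random fraction of its offset from its nearest unlike-class neighbor, moving it away from that neighbor so it stays on its own side of the boundary; the bounds lo and hi are illustrative parameters.

```python
import numpy as np

def morph(X, labels, rng, lo=0.15, hi=0.35):
    """Move each instance away from its nearest unlike-class neighbor."""
    X_new = X.astype(float).copy()
    for i, x in enumerate(X):
        other = X[labels != labels[i]]                    # unlike-class rows
        z = other[np.argmin(np.linalg.norm(other - x, axis=1))]
        r = rng.uniform(lo, hi, size=x.shape)             # random step sizes
        X_new[i] = x + r * (x - z)                        # step away from z
    return X_new

rng = np.random.default_rng(0)
X = np.array([[1.0, 2.0], [1.2, 1.9], [4.0, 5.0], [4.2, 5.1]])
labels = np.array([0, 0, 1, 1])
print(morph(X, labels, rng))                              # privatized copy
```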

  • Featured Transition Systems: Foundations for Verifying Variability-Intensive Systems and Their Application to LTL Model Checking

    Page(s): 1069 - 1089

    The premise of variability-intensive systems, specifically in software product line engineering, is the ability to produce a large family of different systems efficiently. Many such systems are critical, so thorough quality assurance techniques are required. Unfortunately, most quality assurance techniques were not designed with variability in mind: they work for single systems and are too costly to apply to the whole system family. In this paper, we propose an efficient automata-based approach to linear time logic (LTL) model checking of variability-intensive systems. We build on earlier work in which we proposed featured transition systems (FTSs), a compact mathematical model for representing the behaviors of a variability-intensive system. The FTS model checking algorithms verify all products of a family at once and pinpoint those that are faulty. This paper complements our earlier work, covering important theoretical aspects such as expressiveness and parallel composition, as well as more practical aspects such as vacuity detection and our logic fLTL (feature LTL). Furthermore, we provide an in-depth treatment of the FTS model checking algorithm. Finally, we present SNIP, a new model checker for variability-intensive systems. The benchmarks conducted with SNIP confirm the speedups reported previously.
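
To make the FTS idea concrete, here is a small hypothetical sketch: each transition carries a guard over the product's feature selection, and a reachability query returns the set of products in which a state is reachable. For clarity this sketch enumerates products explicitly; the point of the FTS algorithms is precisely to avoid that enumeration and verify the whole family at once. The feature names and example system are invented.

```python
from itertools import product

FEATURES = ["Lock", "Retry"]                       # invented feature names
PRODUCTS = [dict(zip(FEATURES, bits))
            for bits in product([False, True], repeat=len(FEATURES))]

# Transitions of an invented system: (source, target, feature guard).
TRANSITIONS = [
    ("idle",  "locked", lambda p: p["Lock"]),
    ("idle",  "error",  lambda p: not p["Lock"]),
    ("error", "idle",   lambda p: p["Retry"]),
]

def reachable_products(init, state):
    """Feature assignments under which `state` is reachable from `init`."""
    result = []
    for p in PRODUCTS:
        seen, frontier = {init}, [init]
        while frontier:
            s = frontier.pop()
            for src, dst, guard in TRANSITIONS:
                if src == s and guard(p) and dst not in seen:
                    seen.add(dst)
                    frontier.append(dst)
        if state in seen:
            result.append(p)
    return result

print(reachable_products("idle", "error"))         # faulty products, pinpointed
```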

  • MADMatch: Many-to-Many Approximate Diagram Matching for Design Comparison

    Page(s): 1090 - 1111

    Matching algorithms play a fundamental role in many important but difficult software engineering activities, especially design evolution analysis and model comparison. We present MADMatch, a fast and scalable many-to-many approximate diagram matching approach based on an error-tolerant graph matching (ETGM) formulation. Diagrams are represented as graphs, costs are assigned to possible differences between two given graphs, and the goal is to retrieve the cheapest matching. We address the resulting optimization problem with a tabu search enhanced by the novel use of lexical and structural information. Through several case studies with different types of diagrams and tasks, we show that our generic approach obtains better results than dedicated state-of-the-art algorithms, such as AURA, PLTSDiff, or UMLDiff, on the exact same datasets used to introduce (and evaluate) these algorithms.
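
The ETGM formulation can be illustrated with a toy cost function: a candidate many-to-many matching pays for every unmatched node and for every matched pair whose labels disagree lexically, and a search procedure (tabu search in MADMatch) then looks for the cheapest matching. The costs, similarity measure, and example below are assumptions for illustration, not the paper's cost model.

```python
from difflib import SequenceMatcher

DEL_COST, INS_COST = 1.0, 1.0

def lexical_sim(a, b):
    """Cheap lexical similarity between two node labels."""
    return SequenceMatcher(None, a, b).ratio()

def matching_cost(nodes1, nodes2, matching):
    """matching: list of (i, j) index pairs between nodes1 and nodes2."""
    matched1 = {i for i, _ in matching}
    matched2 = {j for _, j in matching}
    cost = DEL_COST * (len(nodes1) - len(matched1))   # nodes dropped from graph 1
    cost += INS_COST * (len(nodes2) - len(matched2))  # nodes added in graph 2
    for i, j in matching:
        cost += 1.0 - lexical_sim(nodes1[i], nodes2[j])  # label disagreement
    return cost

old = ["Account.getBalance", "Account.deposit"]
new = ["Account.balance", "Account.deposit", "Account.withdraw"]
print(matching_cost(old, new, [(0, 0), (1, 1)]))      # cost of one candidate
```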

  • Monitor-Based Instant Software Refactoring

    Page(s): 1112 - 1126

    Software refactoring is an effective method for improving software quality while keeping external behavior unchanged. To facilitate software refactoring, a number of tools have been proposed for code smell detection and/or for automatic or semi-automatic refactoring. However, these tools are passive and human driven, making software refactoring dependent on developers' spontaneity. As a result, software engineers with little experience in software refactoring might miss a number of potential refactorings or may conduct refactorings later than expected. Too few refactorings can result in poor software quality, and delayed refactorings may incur higher refactoring costs. To this end, we propose a monitor-based instant refactoring framework to drive inexperienced software engineers to conduct more refactorings promptly. Changes in the source code are instantly analyzed by a monitor running in the background. If these changes have the potential to introduce code smells, i.e., signs of potential problems in the code that might require refactoring, the monitor invokes the corresponding smell detection tools and warns developers to resolve the detected smells promptly. Feedback from developers, i.e., whether detected smells have been acknowledged and resolved, is then used to optimize the smell detection algorithms. The proposed framework has been implemented, evaluated, and compared with traditional human-driven refactoring tools. Evaluation results suggest that the proposed framework can drive inexperienced engineers to resolve more code smells (an increase of 140 percent) promptly. The average lifespan of resolved smells was reduced by 92 percent. Results also suggest that the framework helps developers avoid similar code smells through timely warnings at early stages of software development, reducing the total number of code smells by 51 percent.
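
The monitor can be approximated by a generic polling loop: watch source files for modification-time changes and run a smell detector on each changed file as soon as it is edited. The sketch below is a minimal stand-in with a toy "large file" detector; the real framework invokes dedicated smell detection tools and learns from developer feedback.

```python
import os
import time

def detect_smells(path, max_file_lines=400):
    """Toy detector for a single smell: an overly large file."""
    with open(path, encoding="utf-8", errors="ignore") as f:
        n = sum(1 for _ in f)
    return [f"large file ({n} lines)"] if n > max_file_lines else []

def monitor(src_dir, poll_seconds=2.0):
    """Warn as soon as a changed source file exhibits a smell."""
    mtimes = {}
    while True:
        for root, _, files in os.walk(src_dir):
            for name in files:
                if not name.endswith(".java"):
                    continue
                path = os.path.join(root, name)
                m = os.stat(path).st_mtime
                if path in mtimes and mtimes[path] != m:   # file was edited
                    for smell in detect_smells(path):
                        print(f"[warn] {path}: {smell}; consider refactoring now")
                mtimes[path] = m
        time.sleep(poll_seconds)
```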

  • Name-Based Analysis of Equally Typed Method Arguments

    Page(s): 1127 - 1143

    When calling a method that requires multiple arguments, programmers must pass the arguments in the expected order. For statically typed languages, the compiler helps programmers by checking that the type of each argument matches the type of the formal parameter. Unfortunately, types are of no help for methods with multiple parameters of the same type. How can a programmer check that equally typed arguments are passed in the correct order? This paper presents two simple, yet effective, static program analyses that detect problems related to the order of equally typed arguments. The key idea is to leverage identifier names to infer the semantics of arguments and their intended positions. The analyses reveal problems that affect the correctness, understandability, and maintainability of a program, such as accidentally reversed arguments and misleading parameter names. Most parts of the analyses are language-agnostic. We evaluate the approach with 24 real-world programs written in Java and C. Our results show the analyses to be effective and efficient. One analysis reveals anomalies in the order of equally typed arguments; it finds 54 relevant problems with a precision of 82 percent. The other analysis warns about misleading parameter names and finds 31 naming bugs with a precision of 39 percent.
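
The core heuristic can be sketched in a few lines: at a call site, compare each argument name against its own parameter name and against the other equally typed parameter names; if swapping two arguments would make the names fit markedly better, warn. The similarity measure and threshold below are illustrative choices, not the paper's.

```python
from difflib import SequenceMatcher

def sim(a, b):
    """Name similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def check_call(arg_names, param_names, margin=0.3):
    """Warn about likely swapped argument pairs (all assumed equally typed)."""
    warnings = []
    for i, arg in enumerate(arg_names):
        for j, other in enumerate(arg_names):
            if i < j:
                straight = sim(arg, param_names[i]) + sim(other, param_names[j])
                swapped = sim(arg, param_names[j]) + sim(other, param_names[i])
                if swapped > straight + margin:
                    warnings.append(f"arguments {arg!r} and {other!r} look swapped")
    return warnings

# e.g., a call drawRect(height, width) against drawRect(int width, int height)
print(check_call(["height", "width"], ["width", "height"]))
```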

  • Quantifying the Effect of Code Smells on Maintenance Effort

    Page(s): 1144 - 1156

    Context: Code smells are assumed to indicate bad design that leads to less maintainable code. However, this assumption has not been investigated in controlled studies with professional software developers. Aim: This paper investigates the relationship between code smells and maintenance effort. Method: Six developers were hired to perform three maintenance tasks each on four functionally equivalent Java systems originally implemented by different companies. Each developer spent three to four weeks. In total, they modified 298 Java files in the four systems. An Eclipse IDE plug-in measured the exact amount of time a developer spent maintaining each file. Regression analysis was used to explain the effort using file properties, including the number of smells. Result: None of the 12 investigated smells was significantly associated with increased effort after we adjusted for file size and the number of changes; Refused Bequest was significantly associated with decreased effort. File size and the number of changes explained almost all of the modeled variation in effort. Conclusion: The effects of the 12 smells on maintenance effort were limited. To reduce maintenance effort, a focus on reducing code size and the work practices that limit the number of changes may be more beneficial than refactoring code smells.
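
The analysis design can be illustrated on synthetic data: fit per-file effort on file size and number of changes, then check how much explanatory power a smell count adds. In the sketch below the synthetic effort depends only on size and churn by construction, so it merely shows how the incremental fit would be read; it is not the study's data or result.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 298                                       # files, as in the study
size = rng.lognormal(5, 1, n)                 # lines of code (synthetic)
changes = rng.poisson(3, n) + 1               # number of changes (synthetic)
smells = rng.poisson(1, n)                    # smell count (synthetic)
effort = 0.02 * size + 5.0 * changes + rng.normal(0, 10, n)  # no smell term

def r2(cols, y):
    """R-squared of an ordinary least-squares fit with intercept."""
    X = np.column_stack([np.ones(len(y))] + cols)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1 - resid.var() / y.var()

print("size + changes         :", round(r2([size, changes], effort), 3))
print("size + changes + smells:", round(r2([size, changes, smells], effort), 3))
```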

  • Session Reliability of Web Systems under Heavy-Tailed Workloads: An Approach Based on Design and Analysis of Experiments

    Page(s): 1157 - 1178

    While workload characterization and performance of web systems have been studied extensively, reliability has received much less attention. In this paper, we propose a framework for session reliability modeling that integrates the user view, represented by the session layer, and the system view, represented by the service layer. A unique characteristic of the session layer is that, in addition to the user navigation patterns, it incorporates the session length in number of requests and allows us to account for the heavy-tailed workloads shown to exist in real web systems. The service layer focuses on request reliability as observed at the service provider side. It considers the multitier web server architecture and the way components interact in serving each request. Within this framework, we develop a session reliability model and solve it using simulation. Instead of the traditional one-factor-at-a-time sensitivity analysis, we use statistical design and analysis of experiments, which allows us to identify the factors and interactions that have a statistically significant effect on session reliability. Our findings indicate that session reliability, which accounts for the distribution of failed requests within sessions, provides a better representation of the user-perceived quality than request-based reliability.
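
The contrast between request-based and session-based reliability is easy to demonstrate by simulation: draw heavy-tailed session lengths, fail each request independently, and compare the two measures. The Pareto shape and failure probability below are arbitrary illustrative values, not the paper's parameters.

```python
import numpy as np

rng = np.random.default_rng(2)
p_fail = 0.01                                   # per-request failure probability
lengths = np.ceil(rng.pareto(1.5, 100_000) + 1).astype(int)  # heavy-tailed

failures = rng.binomial(lengths, p_fail)        # failed requests per session
request_reliability = 1 - failures.sum() / lengths.sum()
session_reliability = np.mean(failures == 0)    # sessions with no failed request

print(round(request_reliability, 4))            # ~0.99 by construction
print(round(session_reliability, 4))            # lower: long sessions suffer most
```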

  • Software Reliability Modeling with Software Metrics Data via Gaussian Processes

    Page(s): 1179 - 1186

    In this paper, we describe statistical inference and prediction for software reliability models in the presence of covariate information. Specifically, we develop a semiparametric Bayesian model using Gaussian processes to estimate the numbers of software failures over various time periods, assuming that the software is changed after each time period and that software metrics information is available after each update. Model comparison is also carried out using the deviance information criterion, and predictive inferences on future failures are shown. Real-life examples are presented to illustrate the approach.
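
As background for the modeling approach, the sketch below shows a plain Gaussian-process regression posterior mean with an RBF kernel over software-metrics covariates. The paper's semiparametric Bayesian model is richer than this; the data, kernel choice, and length scale here are invented for illustration.

```python
import numpy as np

def rbf(A, B, ell=50.0, sf=1.0):
    """Squared-exponential kernel; length scale chosen for the toy data."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return sf**2 * np.exp(-0.5 * d2 / ell**2)

def gp_predict(X, y, X_star, noise=1.0):
    """Posterior mean of GP regression at the test inputs X_star."""
    K = rbf(X, X) + noise**2 * np.eye(len(X))
    return rbf(X_star, X) @ np.linalg.solve(K, y)

X = np.array([[120.0, 4.0], [150.0, 6.0], [90.0, 2.0]])   # metrics per period
y = np.array([14.0, 19.0, 7.0])                            # failures observed
print(gp_predict(X, y, np.array([[130.0, 5.0]])))          # next-period estimate
```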


Aims & Scope

The IEEE Transactions on Software Engineering is interested in well-defined theoretical results and empirical studies that have potential impact on the construction, analysis, or management of software. The scope of this Transactions ranges from mechanisms through the development of principles to the application of those principles to specific environments. Specific topic areas include:

  • development and maintenance methods and models, e.g., techniques and principles for the specification, design, and implementation of software systems, including notations and process models;
  • assessment methods, e.g., software tests and validation, reliability models, test and diagnosis procedures, software redundancy and design for error control, and the measurement and evaluation of various aspects of the process and product;
  • software project management, e.g., productivity factors, cost models, schedule and organizational issues, and standards;
  • tools and environments, e.g., specific tools, integrated tool environments including the associated architectures, databases, and parallel and distributed processing issues;
  • system issues, e.g., hardware-software trade-offs; and
  • state-of-the-art surveys that provide a synthesis and comprehensive review of the historical development of one particular area of interest.


Meet Our Editors

Editor-in-Chief
Matthew B. Dwyer
Dept. Computer Science and Engineering
256 Avery Hall
University of Nebraska-Lincoln
Lincoln, NE 68588-0115 USA
tseeicdwyer@computer.org