Code Clone Analysis


Book Description

This is the first book organized around code clone analysis. To cover the broad studies of code clone analysis, this book selects past research results that are important to the progress of the field and updates them with new results and future directions. The first chapter provides an introduction for readers who are inexperienced in the foundation of code clone analysis, defines clones and related terms, and discusses the classification of clones. The chapters that follow are categorized into three main parts to present 1) major tools for code clone analysis, 2) fundamental topics such as evaluation benchmarks, clone visualization, code clone searches, and code similarities, and 3) applications to actual problems. Each chapter includes a valuable reference list that will help readers to achieve a comprehensive understanding of this diverse field and to catch up with the latest research results. Code clone analysis relies heavily on computer science theories such as pattern matching algorithms, computer language, and software metrics. Consequently, code clone analysis can be applied to a variety of real-world tasks in software development and maintenance such as bug finding and program refactoring. This book will also be useful in designing an effective curriculum that combines theory and application of code clone analysis in university software engineering courses.




Your Code as a Crime Scene


Book Description

Jack the Ripper and legacy codebases have more in common than you'd think. Inspired by forensic psychology methods, you'll learn strategies to predict the future of your codebase, assess refactoring direction, and understand how your team influences the design. With its unique blend of forensic psychology and code analysis, this book arms you with the strategies you need, no matter what programming language you use. Software is a living entity that's constantly changing. To understand software systems, we need to know where they came from and how they evolved. By mining commit data and analyzing the history of your code, you can start fixes ahead of time to eliminate broken designs, maintenance issues, and team productivity bottlenecks. In this book, you'll learn forensic psychology techniques to successfully maintain your software. You'll create a geographic profile from your commit data to find hotspots, and apply temporal coupling concepts to uncover hidden relationships between unrelated areas in your code. You'll also measure the effectiveness of your code improvements. You'll learn how to apply these techniques on projects both large and small. For small projects, you'll get new insights into your design and how well the code fits your ideas. For large projects, you'll identify the good and the fragile parts. Large-scale development is also a social activity, and the team's dynamics influence code quality. That's why this book shows you how to uncover social biases when analyzing the evolution of your system. You'll use commit messages as eyewitness accounts to what is really happening in your code. Finally, you'll put it all together by tracking organizational problems in the code and finding out how to fix them. Come join the hunt for better code! What You Need: You need Java 6 and Python 2.7 to run the accompanying analysis tools. You also need Git to follow along with the examples.




Bayesian Methods for Hackers


Book Description

Master Bayesian Inference through Practical Examples and Computation–Without Advanced Mathematical Analysis Bayesian methods of inference are deeply natural and extremely powerful. However, most discussions of Bayesian inference rely on intensely complex mathematical analyses and artificial examples, making it inaccessible to anyone without a strong mathematical background. Now, though, Cameron Davidson-Pilon introduces Bayesian inference from a computational perspective, bridging theory to practice–freeing you to get results using computing power. Bayesian Methods for Hackers illuminates Bayesian inference through probabilistic programming with the powerful PyMC language and the closely related Python tools NumPy, SciPy, and Matplotlib. Using this approach, you can reach effective solutions in small increments, without extensive mathematical intervention. Davidson-Pilon begins by introducing the concepts underlying Bayesian inference, comparing it with other techniques and guiding you through building and training your first Bayesian model. Next, he introduces PyMC through a series of detailed examples and intuitive explanations that have been refined after extensive user feedback. You’ll learn how to use the Markov Chain Monte Carlo algorithm, choose appropriate sample sizes and priors, work with loss functions, and apply Bayesian inference in domains ranging from finance to marketing. Once you’ve mastered these techniques, you’ll constantly turn to this guide for the working PyMC code you need to jumpstart future projects. Coverage includes • Learning the Bayesian “state of mind” and its practical implications • Understanding how computers perform Bayesian inference • Using the PyMC Python library to program Bayesian analyses • Building and debugging models with PyMC • Testing your model’s “goodness of fit” • Opening the “black box” of the Markov Chain Monte Carlo algorithm to see how and why it works • Leveraging the power of the “Law of Large Numbers” • Mastering key concepts, such as clustering, convergence, autocorrelation, and thinning • Using loss functions to measure an estimate’s weaknesses based on your goals and desired outcomes • Selecting appropriate priors and understanding how their influence changes with dataset size • Overcoming the “exploration versus exploitation” dilemma: deciding when “pretty good” is good enough • Using Bayesian inference to improve A/B testing • Solving data science problems when only small amounts of data are available Cameron Davidson-Pilon has worked in many areas of applied mathematics, from the evolutionary dynamics of genes and diseases to stochastic modeling of financial prices. His contributions to the open source community include lifelines, an implementation of survival analysis in Python. Educated at the University of Waterloo and at the Independent University of Moscow, he currently works with the online commerce leader Shopify.




Empirical Research towards a Relevance Assessment of Software Clones


Book Description

Redundancies in program source code - software clones - are a common phenomenon. Although it is often claimed that software clones decrease the maintainability of software systems and need to be managed, research in the last couple of years showed that not all clones can be considered harmful. A sophisticated assessment of the relevance of software clones and a cost-benefit analysis of clone management is needed to gain a better understanding of cloning and whether it is truly a harmful phenomenon. This thesis introduces techniques to model, analyze, and evaluate versatile aspects of software clone evolution within the history of a system. We present a mapping of non-identical clones across multiple versions of a system, that avoids possible ambiguities of previous approaches. Though processing more data to determine the context of each clone to avoid an ambiguous mapping, the approach is shown to be efficient and applicable to large systems for a retrospective analysis of software clone evolution. The approach has been used in several studies to gain insights into the phenomenon of cloning in open-source as well as industrial software systems. Our results show that non-identical clones require more attention regarding clone management compared to identical clones as they are the dominating clone type for the main share of our subject systems. Using the evolution model to investigate costs and benefits of refactorings that remove clones, we conclude that clone removals could not reduce maintenance costs for most systems under study.




HTML and CSS


Book Description

Jon Duckett’s best-selling, full color introduction to HTML and CSS—making complex topics simple, accessible, and fun! Learn HTML and CSS from the book that has inspired hundreds of thousands of beginner-to-intermediate coders. Professional web designers, developers, and programmers as well as new learners are looking to amp up their web design skills at work and expand their personal development—yet finding the right resources online can be overwhelming. Take a confident step in the right direction by choosing the simplicity of HTML & CSS: Design and Build Websites by veteran web developer and programmer Jon Duckett. Widely regarded for setting a new standard for those looking to learn and master web development through his inventive teaching format, Jon Duckett has helped global brands like Philips, Nike, and Xerox create innovative digital solutions, designing and delivering web and mobile projects with impact and the customer at the forefront. In HTML & CSS, Duckett shares his real-world insights in a unique and highly visual style: Introduces HTML and CSS in a way that makes them accessible to everyone―from students to freelancers, and developers, programmers, marketers, social media managers, and more Combines full-color design graphics and engaging photography to explain the topics in an in-depth yet straightforward manner Provides an efficient and user-friendly structure that allows readers to progress through the chapters in a self-paced format Is perfect for anyone looking to update a content management system, run an e-commerce store, or redesign a website using popular web development tools HTML & CSS is well-written and readable, providing organized instruction in ways that other online courses, tutorials, and books have yet to replicate. For readers seeking a comprehensive yet concise guide to HTML and CSS, look no further than this one-of-a-kind guide. HTML & CSS is also available as part of two hardcover and paperback sets depending on your web design and development needs: Web Design with HTML, CSS, JavaScript, and jQuery Set Paperback: 9781118907443 Hardcover: 9781119038634 Front-End Back-End Development with HTML, CSS, JavaScript, jQuery, PHP, and MySQL Set Paperback: 9781119813095 Hardcover: 9781119813088




The Art and Science of Analyzing Software Data


Book Description

The Art and Science of Analyzing Software Data provides valuable information on analysis techniques often used to derive insight from software data. This book shares best practices in the field generated by leading data scientists, collected from their experience training software engineering students and practitioners to master data science. The book covers topics such as the analysis of security data, code reviews, app stores, log files, and user telemetry, among others. It covers a wide variety of techniques such as co-change analysis, text analysis, topic analysis, and concept analysis, as well as advanced topics such as release planning and generation of source code comments. It includes stories from the trenches from expert data scientists illustrating how to apply data analysis in industry and open source, present results to stakeholders, and drive decisions. - Presents best practices, hints, and tips to analyze data and apply tools in data science projects - Presents research methods and case studies that have emerged over the past few years to further understanding of software data - Shares stories from the trenches of successful data science initiatives in industry




Producing Open Source Software


Book Description

The corporate market is now embracing free, "open source" software like never before, as evidenced by the recent success of the technologies underlying LAMP (Linux, Apache, MySQL, and PHP). Each is the result of a publicly collaborative process among numerous developers who volunteer their time and energy to create better software. The truth is, however, that the overwhelming majority of free software projects fail. To help you beat the odds, O'Reilly has put together Producing Open Source Software, a guide that recommends tried and true steps to help free software developers work together toward a common goal. Not just for developers who are considering starting their own free software project, this book will also help those who want to participate in the process at any level. The book tackles this very complex topic by distilling it down into easily understandable parts. Starting with the basics of project management, it details specific tools used in free software projects, including version control, IRC, bug tracking, and Wikis. Author Karl Fogel, known for his work on CVS and Subversion, offers practical advice on how to set up and use a range of tools in combination with open mailing lists and archives. He also provides several chapters on the essentials of recruiting and motivating developers, as well as how to gain much-needed publicity for your project. While managing a team of enthusiastic developers -- most of whom you've never even met -- can be challenging, it can also be fun. Producing Open Source Software takes this into account, too, as it speaks of the sheer pleasure to be had from working with a motivated team of free software developers.




Refactoring for Software Design Smells


Book Description

Awareness of design smells – indicators of common design problems – helps developers or software engineers understand mistakes made while designing, what design principles were overlooked or misapplied, and what principles need to be applied properly to address those smells through refactoring. Developers and software engineers may "know" principles and patterns, but are not aware of the "smells" that exist in their design because of wrong or mis-application of principles or patterns. These smells tend to contribute heavily to technical debt – further time owed to fix projects thought to be complete – and need to be addressed via proper refactoring.Refactoring for Software Design Smells presents 25 structural design smells, their role in identifying design issues, and potential refactoring solutions. Organized across common areas of software design, each smell is presented with diagrams and examples illustrating the poor design practices and the problems that result, creating a catalog of nuggets of readily usable information that developers or engineers can apply in their projects. The authors distill their research and experience as consultants and trainers, providing insights that have been used to improve refactoring and reduce the time and costs of managing software projects. Along the way they recount anecdotes from actual projects on which the relevant smell helped address a design issue. - Contains a comprehensive catalog of 25 structural design smells (organized around four fundamental designprinciples) that contribute to technical debt in software projects - Presents a unique naming scheme for smells that helps understand the cause of a smell as well as pointstoward its potential refactoring - Includes illustrative examples that showcase the poor design practices underlying a smell and the problemsthat result - Covers pragmatic techniques for refactoring design smells to manage technical debt and to create and maintainhigh-quality software in practice - Presents insightful anecdotes and case studies drawn from the trenches of real-world projects




Software Product Quality Control


Book Description

Quality is not a fixed or universal property of software; it depends on the context and goals of its stakeholders. Hence, when you want to develop a high-quality software system, the first step must be a clear and precise specification of quality. Yet even if you get it right and complete, you can be sure that it will become invalid over time. So the only solution is continuous quality control: the steady and explicit evaluation of a product’s properties with respect to its updated quality goals. This book guides you in setting up and running continuous quality control in your environment. Starting with a general introduction on the notion of quality, it elaborates what the differences between process and product quality are and provides definitions for quality-related terms often used without the required level of precision. On this basis, the work then discusses quality models as the foundation of quality control, explaining how to plan desired product qualities and how to ensure they are delivered throughout the entire lifecycle. Next it presents the main concepts and techniques of continuous quality control, discussing the quality control loop and its main techniques such as reviews or testing. In addition to sample scenarios in all chapters, the book is rounded out by a dedicated chapter highlighting several applications of different subsets of the presented quality control techniques in an industrial setting. The book is primarily intended for practitioners working in software engineering or quality assurance, who will benefit by learning how to improve their current processes, how to plan for quality, and how to apply state-of-the-art quality control techniques. Students and lecturers in computer science and specializing in software engineering will also profit from this book, which they can use in practice-oriented courses on software quality, software maintenance and quality assurance.




Bug Patterns in Java


Book Description

Bug Patterns in Java presents a methodology for diagnosing and debugging computer programs. The act of debugging will be presented as an ideal application of the scientific method. Skill in this area is entirely independent of other programming skills, such as designing for extensibility and reuse. Nevertheless, it is seldom taught explicitly. Eric Allen lays out a theory of debugging, and how it relates to the rest of the development cycle. In particular, he stresses the critical role of unit testing in effective debugging. At the same time, he argues that testing and debugging, while often conflated, are properly considered to be distinct tasks. Upon laying this groundwork, Allen then discusses various "bug patterns" (recurring relationships between signaled errors and underlying bugs in a program) that occur frequently in computer programs. For each pattern, the book discusses how to identify them, how to treat them, and how to prevent them. Table of Contents Agile Methods in a Chaotic Environment Bugs, Specifications, and Implementations Debugging and the Development Process Debugging and the Testing Process The Scientific Method of Debugging About the Bug Patterns The Rogue Tile Null Pointers Everywhere! The Dangling Composite The Null Flag The Double Descent The Liar View Saboteur Data The Broken Dispatch The Impostor Type The Split Cleaner The Fictitious Implementation The Orphaned Thread The Run-On Initialization Platform-Dependent Patterns A Diagnostic Checklist Design Patterns for Debugging References