Ethics and Data Science


Book Description

As the impact of data science continues to grow on society there is an increased need to discuss how data is appropriately used and how to address misuse. Yet, ethical principles for working with data have been available for decades. The real issue today is how to put those principles into action. With this report, authors Mike Loukides, Hilary Mason, and DJ Patil examine practical ways for making ethical data standards part of your work every day. To help you consider all of possible ramifications of your work on data projects, this report includes: A sample checklist that you can adapt for your own procedures Five framing guidelines (the Five C’s) for building data products: consent, clarity, consistency, control, and consequences Suggestions for building ethics into your data-driven culture Now is the time to invest in a deliberate practice of data ethics, for better products, better teams, and better outcomes. Get a copy of this report and learn what it takes to do good data science today.




Data Science Ethics


Book Description

Data science ethics is all about what is right and wrong when conducting data science. Data science has so far been primarily used for positive outcomes for businesses and society. However, just as with any technology, data science has also come with some negative consequences: an increase of privacy invasion, data-driven discrimination against sensitive groups, and decision making by complex models without explanations. While data scientists and business managers are not inherently unethical, they are not trained to weigh the ethical considerations that come from their work - Data Science Ethics addresses this increasingly significant gap and highlights different concepts and techniques that aid understanding, ranging from k-anonymity and differential privacy to homomorphic encryption and zero-knowledge proofs to address privacy concerns, techniques to remove discrimination against sensitive groups, and various explainable AI techniques. Real-life cautionary tales further illustrate the importance and potential impact of data science ethics, including tales of racist bots, search censoring, government backdoors, and face recognition. The book is punctuated with structured exercises that provide hypothetical scenarios and ethical dilemmas for reflection that teach readers how to balance the ethical concerns and the utility of data.




97 Things About Ethics Everyone in Data Science Should Know


Book Description

Most of the high-profile cases of real or perceived unethical activity in data science arenâ??t matters of bad intent. Rather, they occur because the ethics simply arenâ??t thought through well enough. Being ethical takes constant diligence, and in many situations identifying the right choice can be difficult. In this in-depth book, contributors from top companies in technology, finance, and other industries share experiences and lessons learned from collecting, managing, and analyzing data ethically. Data science professionals, managers, and tech leaders will gain a better understanding of ethics through powerful, real-world best practices. Articles include: Ethics Is Not a Binary Conceptâ??Tim Wilson How to Approach Ethical Transparencyâ??Rado Kotorov Unbiased ≠ Fairâ??Doug Hague Rules and Rationalityâ??Christof Wolf Brenner The Truth About AI Biasâ??Cassie Kozyrkov Cautionary Ethics Talesâ??Sherrill Hayes Fairness in the Age of Algorithmsâ??Anna Jacobson The Ethical Data Storytellerâ??Brent Dykes Introducing Ethicizeâ?¢, the Fully AI-Driven Cloud-Based Ethics Solution!â??Brian Oâ??Neill Be Careful with "Decisions of the Heart"â??Hugh Watson Understanding Passive Versus Proactive Ethicsâ??Bill Schmarzo




Ethics of Data and Analytics


Book Description

The ethics of data and analytics, in many ways, is no different than any endeavor to find the "right" answer. When a business chooses a supplier, funds a new product, or hires an employee, managers are making decisions with moral implications. The decisions in business, like all decisions, have a moral component in that people can benefit or be harmed, rules are followed or broken, people are treated fairly or not, and rights are enabled or diminished. However, data analytics introduces wrinkles or moral hurdles in how to think about ethics. Questions of accountability, privacy, surveillance, bias, and power stretch standard tools to examine whether a decision is good, ethical, or just. Dealing with these questions requires different frameworks to understand what is wrong and what could be better. Ethics of Data and Analytics: Concepts and Cases does not search for a new, different answer or to ban all technology in favor of human decision-making. The text takes a more skeptical, ironic approach to current answers and concepts while identifying and having solidarity with others. Applying this to the endeavor to understand the ethics of data and analytics, the text emphasizes finding multiple ethical approaches as ways to engage with current problems to find better solutions rather than prioritizing one set of concepts or theories. The book works through cases to understand those marginalized by data analytics programs as well as those empowered by them. Three themes run throughout the book. First, data analytics programs are value-laden in that technologies create moral consequences, reinforce or undercut ethical principles, and enable or diminish rights and dignity. This places an additional focus on the role of developers in their incorporation of values in the design of data analytics programs. Second, design is critical. In the majority of the cases examined, the purpose is to improve the design and development of data analytics programs. Third, data analytics, artificial intelligence, and machine learning are about power. The discussion of power—who has it, who gets to keep it, and who is marginalized—weaves throughout the chapters, theories, and cases. In discussing ethical frameworks, the text focuses on critical theories that question power structures and default assumptions and seek to emancipate the marginalized.




Leveraging Data Science for Global Health


Book Description

This open access book explores ways to leverage information technology and machine learning to combat disease and promote health, especially in resource-constrained settings. It focuses on digital disease surveillance through the application of machine learning to non-traditional data sources. Developing countries are uniquely prone to large-scale emerging infectious disease outbreaks due to disruption of ecosystems, civil unrest, and poor healthcare infrastructure – and without comprehensive surveillance, delays in outbreak identification, resource deployment, and case management can be catastrophic. In combination with context-informed analytics, students will learn how non-traditional digital disease data sources – including news media, social media, Google Trends, and Google Street View – can fill critical knowledge gaps and help inform on-the-ground decision-making when formal surveillance systems are insufficient.




Data Science for Undergraduates


Book Description

Data science is emerging as a field that is revolutionizing science and industries alike. Work across nearly all domains is becoming more data driven, affecting both the jobs that are available and the skills that are required. As more data and ways of analyzing them become available, more aspects of the economy, society, and daily life will become dependent on data. It is imperative that educators, administrators, and students begin today to consider how to best prepare for and keep pace with this data-driven era of tomorrow. Undergraduate teaching, in particular, offers a critical link in offering more data science exposure to students and expanding the supply of data science talent. Data Science for Undergraduates: Opportunities and Options offers a vision for the emerging discipline of data science at the undergraduate level. This report outlines some considerations and approaches for academic institutions and others in the broader data science communities to help guide the ongoing transformation of this field.




Ethical Data Mining Applications for Socio-Economic Development


Book Description

"This book provides an overview of data mining techniques under an ethical lens, investigating developments in research best practices and examining experimental cases to identify potential ethical dilemmas in the information and communications technology sector"--Provided by publisher.




Responsible Data Science


Book Description

Explore the most serious prevalent ethical issues in data science with this insightful new resource The increasing popularity of data science has resulted in numerous well-publicized cases of bias, injustice, and discrimination. The widespread deployment of “Black box” algorithms that are difficult or impossible to understand and explain, even for their developers, is a primary source of these unanticipated harms, making modern techniques and methods for manipulating large data sets seem sinister, even dangerous. When put in the hands of authoritarian governments, these algorithms have enabled suppression of political dissent and persecution of minorities. To prevent these harms, data scientists everywhere must come to understand how the algorithms that they build and deploy may harm certain groups or be unfair. Responsible Data Science delivers a comprehensive, practical treatment of how to implement data science solutions in an even-handed and ethical manner that minimizes the risk of undue harm to vulnerable members of society. Both data science practitioners and managers of analytics teams will learn how to: Improve model transparency, even for black box models Diagnose bias and unfairness within models using multiple metrics Audit projects to ensure fairness and minimize the possibility of unintended harm Perfect for data science practitioners, Responsible Data Science will also earn a spot on the bookshelves of technically inclined managers, software developers, and statisticians.




Modern Data Science with R


Book Description

From a review of the first edition: "Modern Data Science with R... is rich with examples and is guided by a strong narrative voice. What’s more, it presents an organizing framework that makes a convincing argument that data science is a course distinct from applied statistics" (The American Statistician). Modern Data Science with R is a comprehensive data science textbook for undergraduates that incorporates statistical and computational thinking to solve real-world data problems. Rather than focus exclusively on case studies or programming syntax, this book illustrates how statistical programming in the state-of-the-art R/RStudio computing environment can be leveraged to extract meaningful information from a variety of data in the service of addressing compelling questions. The second edition is updated to reflect the growing influence of the tidyverse set of packages. All code in the book has been revised and styled to be more readable and easier to understand. New functionality from packages like sf, purrr, tidymodels, and tidytext is now integrated into the text. All chapters have been revised, and several have been split, re-organized, or re-imagined to meet the shifting landscape of best practice.




Data Feminism


Book Description

A new way of thinking about data science and data ethics that is informed by the ideas of intersectional feminism. Today, data science is a form of power. It has been used to expose injustice, improve health outcomes, and topple governments. But it has also been used to discriminate, police, and surveil. This potential for good, on the one hand, and harm, on the other, makes it essential to ask: Data science by whom? Data science for whom? Data science with whose interests in mind? The narratives around big data and data science are overwhelmingly white, male, and techno-heroic. In Data Feminism, Catherine D'Ignazio and Lauren Klein present a new way of thinking about data science and data ethics—one that is informed by intersectional feminist thought. Illustrating data feminism in action, D'Ignazio and Klein show how challenges to the male/female binary can help challenge other hierarchical (and empirically wrong) classification systems. They explain how, for example, an understanding of emotion can expand our ideas about effective data visualization, and how the concept of invisible labor can expose the significant human efforts required by our automated systems. And they show why the data never, ever “speak for themselves.” Data Feminism offers strategies for data scientists seeking to learn how feminism can help them work toward justice, and for feminists who want to focus their efforts on the growing field of data science. But Data Feminism is about much more than gender. It is about power, about who has it and who doesn't, and about how those differentials of power can be challenged and changed.