Free

# Computational and Inferential Thinking: The Foundations of Data Science

Book Description

Data Science is about drawing useful conclusions from large and diverse data sets through exploration, prediction, and inference. Exploration involves identifying patterns in information. Prediction involves using information we know to make informed guesses about values we wish we knew. Inference involves quantifying our degree of certainty: will those patterns we found also appear in new observations? How accurate are our predictions? Our primary tools for exploration are visualizations and descriptive statistics, for prediction are machine learning and optimization, and for inference are statistical tests and models.

Statistics is a central component of data science because statistics studies how to make robust conclusions with incomplete information. Computing is a central component because programming allows us to apply analysis techniques to the large and diverse data sets that arise in real-world applications: not just numbers, but text, images, videos, and sensor readings. Data science is all of these things, but it is more than the sum of its parts because of the applications. Through understanding a particular domain, data scientists learn to ask appropriate questions about their data and correctly interpret the answers provided by our inferential and computational tools.

This is the textbook for the Foundations of Data Science class at UC Berkeley. Read online at the book's website.

• Introduction
• Data Science
• Introduction
• Computational Tools
• Statistical Techniques
• Why Data Science?
• Plotting the Classics
• Literary Characters
• Another Kind of Character
• Causality and Experiments
• John Snow and the Broad Street Pump
• Snow’s “Grand Experiment”
• Establishing Causality
• Randomization
• Endnote
• Programming in Python
• Expressions
• Numbers
• Names
• Example: Growth Rates
• Call Expressions
• Data Types
• Strings
• String Methods
• Comparisons
• Sequences
• Arrays
• Ranges
• More on Arrays
• Tables
• Sorting Rows
• Selecting Rows
• Example: Population Trends
• Example: Trends in Gender
• Visualization
• Categorical Distributions
• Numerical Distributions
• Overlaid Graphs
• Functions and Tables
• Applying Functions to Columns
• Classifying by One Variable
• Cross-Classifying
• Joining Tables by Columns
• Bike Sharing in the Bay Area
• Randomness
• Conditional Statements
• Iteration
• The Monty Hall Problem
• Finding Probabilities
• Sampling
• Empirical Distributions
• Sampling from a Population
• At the Roulette Table
• Empirical Distribution of a Statistic
• Testing Hypotheses
• Jury Selection
• Terminology of Testing
• Error Probabilities
• Example: Deflategate
• Estimation
• Percentiles
• The Bootstrap
• Confidence Intervals
• Using Confidence Intervals
• Why the Mean Matters
• Properties of the Mean
• Variability
• The SD and the Normal Curve
• The Central Limit Theorem
• The Variability of the Sample Mean
• Choosing a Sample Size
• Prediction
• Correlation
• The Regression Line
• The Method of Least Squares
• Least Squares Regression
• Visual Diagnostics
• Numerical Diagnostics
• Inference for Regression
• A Regression Model
• Inference for the True Slope
• Prediction Intervals
• Classification
• Nearest Neighbors
• Training and Testing
• Rows of Tables
• Implementing the Classifier
• The Accuracy of the Classifier
• Comparing Two Samples
• Two Categorical Distributions
• A/B Testing
• Causality
• Updating Predictions
• A "More Likely Than Not" Binary Classifier
• Making Decisions