Last updated: 2021-06-10 19:13:14
Cover Page
Title Page
About Packt
Why subscribe?
Packt.com
Contributors
About the authors
About the reviewer
Packt is searching for authors like you
Preface
Who this book is for
What this book covers
To get the most out of this book
Download the example code files
Download the color images
Conventions used
Get in touch
Reviews
Getting Started with Data Science and R
Introduction to data science
Key components of data science
Computer science
Predictive analytics (machine learning)
Domain knowledge
Active domains of data science
Finance
Healthcare
Pharmaceuticals
Government
Manufacturing and retail
Web industry
Other industries
Solving problems with data science
Using R for data science
Key features of R
Our first R program
UN development index
Summary
Quiz
Descriptive and Inferential Statistics
Measures of central tendency and dispersion
Measures of central tendency
Calculating mean, median, and mode with base R
Measures of dispersion
Useful functions to draw automated summaries
Statistical hypothesis testing
Running t-tests with R
Decision rule – a brief overview of the p-value approach
Be careful
Running z-tests with R
Elaborating a little longer
A/B testing – a brief introduction and a practical example with R
Data Wrangling with R
Introduction to data wrangling with R
Data types, formats, and sources
Data extraction, transformation, and load
Basic tools of data wrangling
Using base R for data manipulation and analysis
Applying families of functions
Aggregation functions
Merging DataFrames
Using tibble and dplyr for data manipulation
Basic dplyr usage
Using select
Filtering with filter
Using arrange for sorting
Summarise
Sampling data
The tidyr package
Converting wide tables into long tables
Joining tables
dbplyr – databases and dplyr
Using data.table for data manipulation
Grouping operations
Adding a column
Ordering columns
What is the advantage of searching using key by?
Creating new columns in data.table
Deleting a column
Pivots on data.table
The melt functionality
Reading and writing files with data.table
A special note on dates and/or time
Miscellaneous topics
Checking data quality
Reading other file formats – Excel, SAS, and other data sources
On-disk formats
Working with web data
Web APIs
Tutorial – looking at airline flight times data
KDD, Data Mining, and Text Mining
Good practices of KDD and data mining
Stages of KDD
Scraping a dwarf name
Retrieving text from the web