Machine Learning with R Cookbook (Second Edition)

How to do it...

Perform the following steps:

  1. First, load the built-in mtcars dataset, which makes it available as a data frame named mtcars:
        > data(mtcars)  
  1. To obtain the range of a vector, use the range function, which returns its lower and upper bounds:
        > range(mtcars$mpg)
        Output:
    
        [1] 10.4 33.9
  1. Compute the length of the variable:
        > length(mtcars$mpg)
        Output:
    
        [1] 32  
  1. To obtain the mean of mpg:
        > mean(mtcars$mpg)
        Output:
    
        [1] 20.09062  
  1. To obtain the median of the input vector:
        > median(mtcars$mpg)
        Output:
    
        [1] 19.2
  1. To obtain the standard deviation of the input vector:
        > sd(mtcars$mpg)
        Output:
    
        [1] 6.026948  
  1. To obtain the variance of the input vector:
        > var(mtcars$mpg)
        Output:
    
        [1] 36.3241 
  1. The variance can also be computed as the square of the standard deviation:
        > sd(mtcars$mpg) ^ 2
        Output:
    
        [1] 36.3241  
  1. To obtain the Interquartile Range (IQR):
        > IQR(mtcars$mpg)
        Output:
    
        [1] 7.375 
  1. To obtain a quantile of the input vector, pass the desired probability (here, 0.67 for the 67th percentile):
        > quantile(mtcars$mpg,0.67)
        Output:
    
        67% 
        21.4
  1. To obtain the maximum of the input vector:
        > max(mtcars$mpg)
        Output:
    
        [1] 33.9  
  1. To obtain the minimum of the input vector:
        > min(mtcars$mpg)
        Output:
    
        [1] 10.4  
  1. To obtain a vector with elements that are the cumulative maxima:
        > cummax(mtcars$mpg)
        Output:
    
         [1] 21.0 21.0 22.8 22.8 22.8 22.8 22.8 24.4 24.4 24.4 24.4 24.4 24.4 24.4 24.4 24.4
        [17] 24.4 32.4 32.4 33.9 33.9 33.9 33.9 33.9 33.9 33.9 33.9 33.9 33.9 33.9 33.9 33.9
  1. To obtain a vector with elements that are the cumulative minima:
        > cummin(mtcars$mpg)
        Output:
    
         [1] 21.0 21.0 21.0 21.0 18.7 18.1 14.3 14.3 14.3 14.3 14.3 14.3 14.3 14.3 10.4 10.4
        [17] 10.4 10.4 10.4 10.4 10.4 10.4 10.4 10.4 10.4 10.4 10.4 10.4 10.4 10.4 10.4 10.4
  1. To summarize the dataset, you can apply the summary function:
        > summary(mtcars)
  1. To obtain a frequency count of categorical data, use the table function; here, cyl (the number of cylinders) of mtcars serves as an example:
        > table(mtcars$cyl)
        Output:
    
         4  6  8 
        11  7 14 
  1. To outline the distribution of numerical data, you can use the stem function, which produces a stem-and-leaf plot of the given values:
        > stem(mtcars$mpg)
        Output:
    
          The decimal point is at the |
        
          10 | 44
          12 | 3
          14 | 3702258
          16 | 438
          18 | 17227
          20 | 00445
          22 | 88
          24 | 4
          26 | 03
          28 | 
          30 | 44
          32 | 49
  1. You can also use a ggplot2 histogram to plot the same distribution that the stem-and-leaf plot shows:
        > library(ggplot2)
        > qplot(mtcars$mpg, binwidth=2)
Histogram of mpg of mtcars
  1. To show the frequency count of cyl as a pie chart:
        > pie(table(mtcars$cyl), main="Number of Cylinders")
Pie chart of number of cylinders in cars
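
Several of the statistics in the steps above are related to one another, and it can help to confirm those relationships directly. The following is a minimal sketch, not part of the original recipe; the names `mpg`, `checks`, and `bin_counts` are illustrative assumptions. It checks that range matches min and max, that var equals the square of sd, that IQR is the difference between the 75th and 25th percentiles, and that the last cumulative maximum and minimum equal the overall maximum and minimum. It also bins mpg into intervals of width 2 with cut and table, giving a numeric counterpart to the stem-and-leaf rows.

```r
# Sketch: cross-check the descriptive statistics computed in the recipe.
# `mpg`, `checks`, and `bin_counts` are illustrative names, not from the book.
data(mtcars)
mpg <- mtcars$mpg

checks <- c(
  # range() returns the pair (min, max)
  range_is_min_max = identical(range(mpg), c(min(mpg), max(mpg))),
  # the variance is the square of the standard deviation
  var_is_sd_squared = isTRUE(all.equal(var(mpg), sd(mpg)^2)),
  # IQR is the 75th percentile minus the 25th percentile
  iqr_is_q3_minus_q1 = isTRUE(all.equal(
    IQR(mpg),
    unname(quantile(mpg, 0.75) - quantile(mpg, 0.25))
  )),
  # the final cumulative maximum/minimum equals the overall maximum/minimum
  cummax_ends_at_max = tail(cummax(mpg), 1) == max(mpg),
  cummin_ends_at_min = tail(cummin(mpg), 1) == min(mpg),
  # the frequency counts add up to the number of observations
  table_sums_to_n = sum(table(mtcars$cyl)) == length(mpg)
)
print(checks)

# Bin mpg into left-closed intervals of width 2: [10,12), [12,14), ..., [32,34)
bin_counts <- table(cut(mpg, breaks = seq(10, 34, by = 2), right = FALSE))
print(bin_counts)
```

Each bin count should correspond to the number of leaves on the matching row of the stem-and-leaf plot, since stem also groups these values into stems two units wide.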