The **range** as a measure of dispersion is simple to calculate. It is obtained by taking the difference between the largest and the smallest values in a data set.

\[ \text{Range} = \text{Largest value} - \text{Smallest value} \]

Let us consider our `students` data set. We subset the data frame to include numerical data only.

```
students <- read.csv("https://userpage.fu-berlin.de/soga/200/2010_data_sets/students.csv")
quant.vars <- c("age", "nc.score", "height", "weight")
students.quant <- students[quant.vars]
head(students.quant, 10)
```

```
## age nc.score height weight
## 1 19 1.91 160 64.8
## 2 19 1.56 172 73.0
## 3 22 1.24 168 70.6
## 4 19 1.37 183 79.7
## 5 21 1.46 175 71.4
## 6 19 1.34 189 85.8
## 7 21 1.11 156 65.9
## 8 21 2.03 167 65.7
## 9 18 1.29 195 94.4
## 10 18 1.19 165 66.0
```

We use the `range` function, which returns a vector containing the minimum and maximum of all the given arguments, in combination with the `apply` function to calculate the minimum and maximum of each variable, that is, of each column of the data set. The second argument, `2`, tells `apply` to operate column-wise.

`apply(students.quant, 2, range)`

```
## age nc.score height weight
## [1,] 18 1 135 51.4
## [2,] 64 4 206 116.0
```

Now, to calculate the range for each variable, we subtract the first row (the minima) from the second row (the maxima).

```
range.studs <- apply(students.quant, 2, range)
range.studs[2,] - range.studs[1,]
```

```
## age nc.score height weight
## 46.0 3.0 71.0 64.6
```
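The same result can be obtained in a single step. The following is a minimal sketch, assuming every column is numeric and contains no missing values. The data frame `toy` is a hypothetical stand-in built from the first three rows printed earlier, so that the block runs without downloading the file; `col.ranges` is likewise just an illustrative name.

```r
# Hypothetical toy data frame: the first three rows of the output shown earlier
toy <- data.frame(
  age      = c(19, 19, 22),
  nc.score = c(1.91, 1.56, 1.24),
  height   = c(160, 172, 168),
  weight   = c(64.8, 73.0, 70.6)
)

# For a numeric vector x, diff(range(x)) equals max(x) - min(x)
col.ranges <- sapply(toy, function(x) diff(range(x)))
col.ranges
```

Replacing `toy` with `students.quant` should reproduce the ranges computed above. If a column contained missing values, `range(x, na.rm = TRUE)` would be needed instead.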

The range, like the mean, has the disadvantage of being influenced by outliers. Consequently, the range is not a good measure of dispersion to use for a data set that contains outliers. Another disadvantage of using the range as a measure of dispersion is that its calculation is based on two values only: the largest and the smallest. All other values in a data set are ignored when calculating the range. Thus, the range is not a very satisfactory measure of dispersion (Mann 2012).
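Both drawbacks can be illustrated with a small sketch. The vectors below are hypothetical: `heights` reuses the first five height values printed earlier, `heights.outlier` adds one implausible value (for example, a data-entry error), and `heights.alt` keeps the extremes but changes every value in between.

```r
# Hypothetical heights in cm (first five values from the output shown earlier)
heights <- c(160, 172, 168, 183, 175)
range.clean <- diff(range(heights))

# One extreme value is enough to inflate the range
heights.outlier <- c(heights, 1750)
range.outlier <- diff(range(heights.outlier))

# Same minimum and maximum, very different values in between
heights.alt <- c(160, 161, 161, 161, 183)
range.alt <- diff(range(heights.alt))

c(clean = range.clean, outlier = range.outlier, alt = range.alt)
```

Here the single outlier changes the range from 23 to 1590, whereas `heights` and `heights.alt` share a range of 23 even though their values are distributed quite differently.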