that value (refer to “Dealing with missing values,” earlier in this chapter).
It may seem that this NA is translated into TRUE, but that isn’t the case. If you
give NA as a value for the index, R puts NA in that place as well. So, in this
case, R keeps the first and second values of the vector, drops the third, adds
one missing value, and drops the last value of the vector.
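To see this behavior for yourself, here is a minimal sketch. It assumes a vector x with one missing value in the fourth position (the same values used for x later in this section); the names index and kept are made up for the illustration.

```r
# Assumed example vector with one missing value in position 4.
x <- c(3, 6, 2, NA, 1)

# Comparing NA with a number gives NA, not TRUE or FALSE.
index <- x > 2
print(index)   # TRUE TRUE FALSE NA FALSE

# An NA in a logical index produces an NA in the result.
kept <- x[index]
print(kept)    # 3 6 NA
```

The NA in the result comes from the index, not from a value that passed the test.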
Combining logical statements
Life would be boring if you couldn’t combine logical statements. If you want to
test whether a number lies within a certain interval, for example, you want to
check whether it’s greater than the lowest value and less than the top value.
Maybe you want to know the games in which Granny scored the fewest or the most
baskets. For that purpose, R has a set of logical operators that — you guessed it —
are nicely vectorized (refer to Table 4-4).
To illustrate, using the knowledge you have now, try to find out the games in
which Granny scored the fewest baskets and the games in which she scored the
most baskets:
1. Create two logical vectors, as follows:
> min.baskets <- baskets.of.Granny == min(baskets.of.Granny)
> max.baskets <- baskets.of.Granny == max(baskets.of.Granny)
min.baskets tells you whether the value is equal to the minimum, and
max.baskets tells you whether the value is equal to the maximum.
2. Combine both vectors with the OR operator (|), as follows:
> min.baskets | max.baskets
[1]  TRUE FALSE FALSE FALSE FALSE  TRUE
This method actually isn’t the most efficient way to find those values. You see
how to do things like this more efficiently with the
function in Chapter 13.
But this example clearly shows you how vectorization works for logical operators.
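The interval test mentioned at the start of this section works the same way with the AND operator (&). The following is only a sketch: the values of baskets.of.Granny are recovered from the cumulative sums printed later in this section, and the bounds 4 and 10 are arbitrary choices for the illustration.

```r
# Granny's baskets per game (recovered from the cumulative sums shown later).
baskets.of.Granny <- c(12, 4, 5, 6, 9, 3)

# TRUE only where both conditions hold: more than 4 AND fewer than 10.
in.range <- baskets.of.Granny > 4 & baskets.of.Granny < 10
print(in.range)                      # FALSE FALSE TRUE TRUE TRUE FALSE

# Use the combined logical vector as an index.
print(baskets.of.Granny[in.range])   # 5 6 9
```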
The NOT operator (!) is another example of the great power of vectorization.
The NA values in the vector x have caused some trouble already,
so you’d probably like to get rid of them. You know from “Dealing with
undefined outcomes,” earlier in this chapter, that you have to check whether a
value is missing by using the is.na() function. But you need the values that
are not missing values, so invert the logical vector by preceding it with the !
operator. To drop the missing values in the vector x, for example, use the
following code:
> x[!is.na(x)]
[1] 3 6 2 1
When you’re using R, there’s no way to get around vectorization. After you
understand how vectorization works, however, you’ll save considerable
calculation time and lines of code.
Summarizing logical vectors
You can use logical values in arithmetic operations as well. In that case, R
treats TRUE as 1 and FALSE as 0. This allows for some pretty interesting constructs.
Suppose that you’re not really interested in finding out the games in which
Granny scored more than Geraldine did, but you want to know how often that
happened. You can use the numerical translation of a logical vector for that
purpose by passing the comparison to the sum() function, which then counts the
TRUE values.
So, three times, Granny was better than Geraldine. Granny rocks!
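Here is a sketch of that count. Granny’s values are recovered from the cumulative sums printed later in this section; Geraldine’s values are an assumption chosen to be consistent with the outputs shown in this chapter, and the names granny.better and times.better are made up for the illustration.

```r
baskets.of.Granny    <- c(12, 4, 5, 6, 9, 3)
baskets.of.Geraldine <- c(5, 4, 2, 4, 12, 9)   # assumed values

# One TRUE for every game in which Granny outscored Geraldine.
granny.better <- baskets.of.Granny > baskets.of.Geraldine

# sum() treats TRUE as 1 and FALSE as 0, so it counts the TRUE values.
times.better <- sum(granny.better)
print(times.better)   # 3
```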
In addition, you have an easy way to figure out whether any value in a logical
vector is TRUE. Very conveniently, the function that performs that task is called
any(). To ask R whether Granny was better than Geraldine in any game, pass the
comparison baskets.of.Granny > baskets.of.Geraldine to any(); the result is TRUE.
We told you that Granny rocks! Well, okay, this result is a bit unfair for
Geraldine, so you should check whether Granny was better than Geraldine in all the
games. The R function you use for this purpose is called (surprise, surprise)
all(). To find out whether Granny was always better than Geraldine, pass the
comparison baskets.of.Granny > baskets.of.Geraldine to all(); the result is FALSE.
Still, Granny rocks a bit.
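The two checks can be sketched as follows. As before, Geraldine’s values are an assumption consistent with the outputs in this chapter, and the variable names are only for illustration.

```r
baskets.of.Granny    <- c(12, 4, 5, 6, 9, 3)
baskets.of.Geraldine <- c(5, 4, 2, 4, 12, 9)   # assumed values

granny.better <- baskets.of.Granny > baskets.of.Geraldine

# Was Granny better in at least one game?
any.better <- any(granny.better)
print(any.better)   # TRUE

# Was Granny better in every game?
all.better <- all(granny.better)
print(all.better)   # FALSE
```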
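You can use the argument na.rm in the functions all() and any() as well. By
default, a missing value in the vector argument makes the result NA unless the
other values already settle the answer: any() still returns TRUE if it finds a
TRUE, and all() still returns FALSE if it finds a FALSE (see “Dealing with
missing values,” earlier in this chapter).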
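A small sketch of how missing values interact with any() and all(). The vectors here are made up for the illustration.

```r
# A missing value leaves the answer open, so the result is NA.
any.default <- any(c(FALSE, NA, FALSE))                # NA
all.default <- all(c(TRUE, NA))                        # NA

# With na.rm = TRUE, the missing value is dropped first.
any.removed <- any(c(FALSE, NA, FALSE), na.rm = TRUE)  # FALSE
all.removed <- all(c(TRUE, NA), na.rm = TRUE)          # TRUE

# If the other values already decide the outcome, NA doesn't matter.
any.decided <- any(c(TRUE, NA))                        # TRUE
all.decided <- all(c(FALSE, NA))                       # FALSE
```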
Powering Up Your Math with Vector Functions
As we suggest throughout this chapter, vectorization is the Holy Grail for every
R programmer. Most beginners struggle a bit with that concept because
vectorization isn’t one little trick, but a way of coding. Using the indices and
vectorized operators, however, can save you a lot of coding and calculation time —
and then you can call a gang of power functions to work on your data, as we show
you in this section.
Why are power functions so helpful? Maybe you’re like us: We’re lazy and
impatient enough to try to translate our code into “something with vectors” as
often as possible. We don’t like to type too much, and we definitely don’t like to
wait for the results. If you can relate, read on.
Using arithmetic vector operations
A third set of arithmetic functions consists of functions in which the outcome is
dependent on more than one value in the vector. Summing a vector with the
sum() function is such an operation. You find an overview of the most important functions
in Table 4-5.
Table 4-5 Vector Operations
Function      What It Does
sum(x)        Calculates the sum of all values in x
prod(x)       Calculates the product of all values in x
min(x)        Gives the minimum of all values in x
max(x)        Gives the maximum of all values in x
cumsum(x)     Gives the cumulative sum of all values in x
cumprod(x)    Gives the cumulative product of all values in x
cummin(x)     Gives the minimum for all values in x from the start of the vector until the position of that value
cummax(x)     Gives the maximum for all values in x from the start of the vector until the position of that value
diff(x)       Gives for every value the difference between that value and the next value in the vector
Summarizing a vector
You can tell quite a few things about a set of values with one number. If you
want to know the minimum and maximum number of baskets Granny made, for
example, you use the functions min() and max().
To calculate the sum and the product of all values in the vector, use the
functions sum() and prod(), respectively.
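A quick sketch of these four summaries, using Granny’s values as recovered from the cumulative sums printed later in this section:

```r
baskets.of.Granny <- c(12, 4, 5, 6, 9, 3)

fewest  <- min(baskets.of.Granny)    # 3
most    <- max(baskets.of.Granny)    # 12
total   <- sum(baskets.of.Granny)    # 39
product <- prod(baskets.of.Granny)   # 38880
```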
These functions also can take several vectors as arguments. If you want to
calculate the sum of all the baskets made by Granny and Geraldine, you can
pass both vectors in one call, as in sum(baskets.of.Granny, baskets.of.Geraldine).
The same works for prod(), min(), and max().
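The following sketch passes both vectors in a single call. Geraldine’s values are an assumption consistent with the outputs shown in this chapter.

```r
baskets.of.Granny    <- c(12, 4, 5, 6, 9, 3)
baskets.of.Geraldine <- c(5, 4, 2, 4, 12, 9)   # assumed values

# All arguments are pooled before the summary is calculated.
total.both  <- sum(baskets.of.Granny, baskets.of.Geraldine)   # 75
lowest.both <- min(baskets.of.Granny, baskets.of.Geraldine)   # 2
```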
As we discuss in “Dealing with missing values,” earlier in this chapter, a
calculation that involves missing values returns NA as a result. The same is
true for vector operations.
R, however, gives you a way to simply discard the missing values by setting the
argument na.rm to TRUE. Take a look at the following example:
> x <- c(3,6,2,NA,1)
This argument works in sum(), prod(), min(), and max().
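As a sketch of the effect of na.rm on the vector x defined above (the result names are made up for the illustration):

```r
x <- c(3, 6, 2, NA, 1)

# Without na.rm, the missing value makes the whole result NA.
sum.default <- sum(x)                  # NA

# With na.rm = TRUE, the missing value is dropped before calculating.
sum.removed <- sum(x, na.rm = TRUE)    # 12
min.removed <- min(x, na.rm = TRUE)    # 1
max.removed <- max(x, na.rm = TRUE)    # 6
```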
If you have a vector that contains only missing values and you set the
argument na.rm to TRUE, the outcome of these functions is set in such a way
that it doesn’t have any effect on further calculations. The sum of missing
values is 0, the product is 1, the minimum is Inf, and the maximum is -Inf. R
won’t always generate a warning in such a case, though. Only in the case of
min() and max() does R tell you that there were no non-missing arguments.
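You can check these neutral outcomes with a short sketch. The vector all.missing is made up for the illustration, and the warnings from min() and max() are silenced here only to keep the output clean.

```r
# A numeric vector that contains only missing values.
all.missing <- c(NA_real_, NA_real_)

empty.sum  <- sum(all.missing, na.rm = TRUE)    # 0
empty.prod <- prod(all.missing, na.rm = TRUE)   # 1

# min() and max() would also warn that there are no non-missing arguments.
empty.min <- suppressWarnings(min(all.missing, na.rm = TRUE))   # Inf
empty.max <- suppressWarnings(max(all.missing, na.rm = TRUE))   # -Inf
```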
Suppose that after every game, you want to update the total number of
baskets that Granny made during the season. After the second game, that’s the
total of the first two games; after the third game, it’s the total of the first three
games; and so on. You can make this calculation easily by using the cumulative
sum function, cumsum(), as in the following example:
> cumsum(baskets.of.Granny)
[1] 12 16 21 27 36 39
In a similar way, cumprod() gives you the cumulative product. You also can get
the cumulative minimum and maximum with the related functions cummin() and
cummax(). To find the maximum number of baskets Geraldine scored up to any
given game, you can use the following code:
> cummax(baskets.of.Geraldine)
[1]  5  5  5  5 12 12
These functions don’t have an extra argument to remove missing values.
Missing values are propagated through the vector, as shown in the following
example:
> cummin(x)
[1]  3  3  2 NA NA
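The other two cumulative functions behave the same way. The sketch below applies them to Granny’s values and then shows one possible workaround for missing values: dropping them with !is.na() before calling the function. That workaround is a suggestion rather than something the cumulative functions offer themselves, and it shortens the vector, so positions no longer line up with the original games.

```r
baskets.of.Granny <- c(12, 4, 5, 6, 9, 3)

running.product <- cumprod(baskets.of.Granny)   # 12 48 240 1440 12960 38880
running.minimum <- cummin(baskets.of.Granny)    # 12 4 4 4 4 3

# Workaround sketch: remove missing values first, then accumulate.
x <- c(3, 6, 2, NA, 1)
minimum.without.na <- cummin(x[!is.na(x)])      # 3 3 2 1
```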
The last function we’ll discuss in this section calculates differences between
adjacent values in a vector. You can calculate the difference in the number of
baskets between every two games Granny played by using the following code:
> diff(baskets.of.Granny)
[1] -8  1  1  3 -6
You get five numbers back. The first one is the difference between the first
and the second game, the second is the difference between the second and the
third game, and so on.
The vector returned by diff() is always one element shorter than the
original vector you gave as an argument.
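One way to picture what diff() does with its default settings is to subtract the vector without its last value from the vector without its first value. The following sketch makes that comparison; manual.diff is a name made up for the illustration.

```r
baskets.of.Granny <- c(12, 4, 5, 6, 9, 3)
n <- length(baskets.of.Granny)

# Every value except the first, minus every value except the last.
manual.diff <- baskets.of.Granny[-1] - baskets.of.Granny[-n]
print(manual.diff)               # -8 1 1 3 -6

# Same numbers as diff(), and one element shorter than the input.
built.in.diff <- diff(baskets.of.Granny)
print(length(built.in.diff))     # 5
```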
The rule about missing values applies here, too. When your vector contains a
missing value, the result from that calculation will be NA. So, if you calculate the
difference with the vector x, you get the following result:
> diff(x)
[1]  3 -4 NA NA
Because the fourth element of x is NA, the difference between the third and
fourth element and between the fourth and fifth element will be NA as well. Just
like the cumulative functions, the diff() function doesn’t have an argument to
eliminate the missing values.
In Chapter 3 and earlier in this chapter, we mention recycling arguments. Take
a look again at how you calculate the total amount of money Granny and Geraldine
raised (see “Using arithmetic operators,” earlier in this chapter) or how you
combine the first names and last names of three siblings (see Chapter 3). Each
time, you combine a vector with multiple values and one with a single value in a
function. R applies the function, using that single value for every value in the
vector. But recycling goes far beyond these examples.
Any time you give two vectors with unequal lengths to a recycling function,
R repeats the shorter vector as often as necessary to carry out the task you
asked it to perform. In the earlier examples, the shorter vector is only one
value long.
Suppose you split up the number of baskets Granny made into two-pointers
and three-pointers:
> Granny.pointers <- c(10,2,4,0,4,1,4,2,7,2,1,2)
You arrange the numbers in such a way that for every game, first the number
of two-pointers is given, followed by the number of three-pointers.
Now Granny wants to know how many points she’s actually scored this season.
You can calculate that easily with the help of recycling:
> points <- Granny.pointers * c(2,3)
> points
[1] 20  6  8  0  8  3  8  6 14  6  2  6
> sum(points)
[1] 87
Now, what did you do here?
1. You made a vector with the number of points for each basket: c(2,3)
2. You told R to multiply that vector by the vector Granny.pointers.
R multiplied the first number in Granny.pointers by 2, the second by 3, the
third by 2 again, and so on.
3. You put the result in the variable points.
4. You summed all the numbers in points to get the total number of
points.
In fact, you can just leave out Step 3. The nesting of functions allows you to
do this in one line of code:
> sum(Granny.pointers * c(2,3))
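To make the recycling visible, the sketch below writes out the repeated vector by hand with rep() and compares it with the recycled version. The names points.per.basket, explicit, and recycled are made up for the illustration.

```r
Granny.pointers <- c(10, 2, 4, 0, 4, 1, 4, 2, 7, 2, 1, 2)

# What R does behind the scenes: repeat c(2, 3) until the lengths match.
points.per.basket <- rep(c(2, 3), length.out = length(Granny.pointers))
explicit <- Granny.pointers * points.per.basket

# The recycled version gives exactly the same result.
recycled <- Granny.pointers * c(2, 3)
print(identical(explicit, recycled))   # TRUE

total.points <- sum(recycled)
print(total.points)                    # 87
```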
Recycling can be a bit tricky. If the length of the longer vector isn’t exactly a
multiple of the length of the shorter vector, you can get unexpected results.
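Here is a small sketch of what happens in that case. The vector values is made up for the illustration: it has five elements, so c(2, 3) doesn’t fit a whole number of times. R still calculates a result, but it warns that the longer object length is not a multiple of the shorter object length.

```r
values <- c(1, 2, 3, 4, 5)

# The shorter vector is recycled as 2, 3, 2, 3, 2; the warning is silenced here.
mismatch <- suppressWarnings(values * c(2, 3))
print(mismatch)   # 2 6 6 12 10

# Check that the operation really does raise a warning.
warned <- tryCatch({
  values * c(2, 3)
  FALSE
}, warning = function(w) TRUE)
print(warned)     # TRUE
```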
Now Granny wants to know how much she improved every game. Being lazy,
you have a cunning plan. With diff(), you calculate how many more or fewer
baskets Granny made than she made in the game before. Then you use the
vectorized division to divide these differences by the number of baskets in the
game. To top it off, you multiply by 100 and round the whole vector. All these
calculations take one line of code:
> round(diff(baskets.of.Granny) / baskets.of.Granny * 100 )