Vectorizing your functions
Vectorized functions are a very useful feature of R, but programmers who are
used to other languages often have trouble with this concept at first. A vectorized
function works not just on a single value, but on a whole vector of values at the
same time. Your natural reflex as a programmer may be to loop over all values of
the vector and apply the function, but vectorization makes that unnecessary. Trust
us: When you start using vectorization in R, it’ll help simplify your code.
To try vectorized functions, you have to make a vector. You do this by using
the c() function, which stands for concatenate. The actual values are separated by
commas.
Here’s an example: Suppose that Granny plays basketball with her friend
Geraldine, and you keep a score of Granny’s number of baskets in each game. After
six games, you want to know how many baskets Granny has made so far this
season. You can put these numbers in a vector, like this:
> baskets.of.Granny <- c(12,4,4,6,9,3)
> baskets.of.Granny
[1] 12  4  4  6  9  3
To find the total number of baskets Granny made, you just type the following:
> sum(baskets.of.Granny)
[1] 38
You could get the same result by going over the vector number by number,
adding each new number to the sum of the previous numbers, but that method
would require you to write more code and it would take longer to calculate. You
won’t notice it on just six numbers, but the difference will be obvious when you
have to sum a few thousand of them.
In this example of vectorization, a function uses the complete vector to give
you one result. Granted, this example is trivial (you may have guessed that sum()
would accomplish the same goal), but for other functions in R, the vectorization
may be less obvious.
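To make the comparison concrete, here is a minimal sketch that computes Granny's total both ways. The object names total.loop and total.vectorized are made up for this illustration; only baskets.of.Granny comes from the example above.

```r
# Granny's baskets for six games, as in the example above
baskets.of.Granny <- c(12, 4, 4, 6, 9, 3)

# Loop version: add each number to a running total
total.loop <- 0
for (baskets in baskets.of.Granny) {
  total.loop <- total.loop + baskets
}

# Vectorized version: sum() takes the whole vector at once
total.vectorized <- sum(baskets.of.Granny)

# Both approaches give the same total
identical(total.loop, total.vectorized)
```

The loop needs four lines to do what sum() does in one, and R has to interpret every pass through the loop separately.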
A less obvious example of a vectorized function is the paste() function. If you
make a vector with the first names of the members of your family, paste() can add
the last name to all of them with one command, as in the following example:
> firstnames <- c("Joris", "Carolien", "Koen")
> lastname <- "Meys"
> paste(firstnames, lastname)
[1] "Joris Meys"    "Carolien Meys" "Koen Meys"
R takes the vector firstnames and then pastes the lastname into each value.
How cool is that? Actually, R combines two vectors. The second vector (lastname,
in this case) is only one value long. That value gets recycled by the paste()
function as long as necessary (for more on recycling, turn to Chapter 4).
You also can give R two longer vectors, and R will combine them element by
element, like this:
> authors <- c("Andrie","Joris")
> lastnames <- c("de Vries","Meys")
> paste(authors, lastnames)
[1] "Andrie de Vries" "Joris Meys"
No complicated code is needed. All you have to do is make the vectors and put
them in the function. In Chapter 5, we give you more information on the power of
paste().
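If you want to see the recycling for yourself, the following sketch spells it out. It assumes that repeating the last name by hand with rep() is a fair stand-in for what recycling does here; the names full.recycled, full.explicit, and full.pairs are invented for the illustration.

```r
firstnames <- c("Joris", "Carolien", "Koen")
lastname <- "Meys"

# paste() recycles the single last name for every first name
full.recycled <- paste(firstnames, lastname)

# The same thing with the repetition written out by hand
full.explicit <- paste(firstnames, rep(lastname, times = length(firstnames)))

# Recycling and manual repetition give the same vector
identical(full.recycled, full.explicit)

# Two vectors of equal length are combined element by element
authors <- c("Andrie", "Joris")
lastnames <- c("de Vries", "Meys")
full.pairs <- paste(authors, lastnames)
```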
Putting the argument in a function
Most functions in R have arguments that give them more information about
exactly what you want them to do. If you call print() with a text string, you give
the print() function a value: that string. Indeed, the first argument of the
print() function is called x. You can check this yourself by looking at the Help
file of print().
In R, you have two general types of arguments:
Arguments with default values
Arguments without default values
If an argument has no default value, the value may be optional or required. In
general, the first argument is almost always required. Try entering the following:
> print()
R tells you that it needs the argument x specified:
Error in .Internal(print.default(x, digits, quote, na.print, print.gap, :
‘x’ is missing
You can specify an argument like this:
> print(x = "Isn't this fun?")
Sure it is. But wait: when you entered the print() command
in Chapter 2, you didn’t add the name of the argument, and the function worked.
That’s because R knows the names of the arguments and just assumes that you
give them in exactly the same order as they’re shown in the usage line of the Help
page for that function. (For more information on reading the Help pages, turn to
If you type the values for the arguments in Help-page order, you don’t have
to specify the argument names. You can list the arguments in any order you
want, as long as you specify their names.
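As a small illustration of positional versus named arguments, the sketch below calls seq() three ways. The choice of seq() is ours; its first three arguments are from, to, and by. The result names are made up for the example.

```r
# Help-page order, no names needed: from, to, by
by.position <- seq(1, 10, 3)

# Any order works as long as every argument is named
by.name <- seq(by = 3, to = 10, from = 1)

# Mixing is fine too: unnamed values fill the remaining slots in order
mixed <- seq(1, 10, by = 3)

# All three calls produce the same vector
identical(by.position, by.name)
identical(by.position, mixed)
```

Naming the arguments costs a few keystrokes, but it makes the call readable without a trip to the Help page.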
Try entering the following example:
> print(digits=4, x = 11/7)
[1] 1.571
You may wonder where the digits argument comes from, because it’s not
explained in the Help page for print(). That’s because it isn’t an argument of the
print() function itself, but of the function print.default(). Take a look again at
the error you got if you typed print(). R mentions the print.default() function
instead of the print() function.
In fact, print() is called a generic function. It determines the type of the
object that’s given as an argument and then looks for a function that can deal
with this type of object. That function is called the method for the specific
object type. In case there is no specific function, R will call the default method.
This is the function that works on all object types that have no specific
method. In this case, that’s the print.default() function. Keep in mind that a
default method doesn’t always exist. We explain this in more detail in Chapter
8. For now, just remember that arguments for a function can be shown on the
Help pages of different methods.
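To show how a generic function picks a method, here is a minimal sketch of the same mechanism. Note that describe() and its methods are hypothetical functions written for this illustration; they are not part of R.

```r
# describe() is a made-up generic; UseMethod() does the dispatching
describe <- function(x, ...) {
  UseMethod("describe")
}

# Method for character vectors
describe.character <- function(x, ...) {
  paste("text with", length(x), "value(s)")
}

# Default method: used when no specific method exists for the object type
describe.default <- function(x, ...) {
  paste("object of class", class(x)[1])
}

describe(c("Joris", "Koen"))  # handled by describe.character()
describe(11/7)                # no numeric method, so describe.default() runs
```

This mirrors the way print() hands its work to print.default() when it finds no more specific method.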
If you forgot which arguments you can use, you can find that information in
the Help files. Don’t forget to look at the arguments of specific methods as
well. You often find a link to those specific methods at the bottom of the Help
page.
By default, R keeps track of all the commands you use in a session. This
tracking can come in handy if you need to reuse a command you used earlier or
want to keep track of the work you did before. These previously used commands
are kept in the history.
You can browse the history from the command line by pressing the up-arrow
and down-arrow keys. When you press the up-arrow key, you get the commands
you typed earlier at the command line. You can press Enter at any time to run the
command that is currently displayed.
Saving the history is done using the savehistory() function. By default, R
saves the history in a file called .Rhistory in your current working directory. This
file is automatically loaded again the next time you start R, so you have the history
of your previous session available.
If you want to use another filename, use the argument file like this:
> savehistory(file = "Chapter3.Rhistory")
Be sure to add the quotation marks around the filename.
You can open an Explorer window and take a look at the history by opening
the file in a normal text editor, like Notepad.
You don’t need to use the file extension .Rhistory; R doesn’t care about
extensions that much. But using .Rhistory as a file extension will make it easier to
recognize as a history file.
If you want to load a history file you saved earlier, you can use the
loadhistory() function. This will replace the history with the one saved in the
.Rhistory file in the current working directory. If you want to load the history from
a specific file, you use the file argument again, like this:
> loadhistory(file = "Chapter3.Rhistory")
Keeping Your Code Readable
You may wonder why you should bother about reading code. You wrote the
code yourself, so you should know what it does, right? You do now, but will you be
able to remember what you did if you have to redo that analysis six months from
now on new data? Besides, you may have to share your scripts with other people,
and what seems obvious to you may be far less obvious for them.
Some of the rules you’re about to see aren’t that strict. In fact, you can get
away with almost anything in R, but that doesn’t mean it’s a good idea. In this
section, we explain why you should avoid some constructs even though they aren’t
strictly wrong.
Following naming conventions
R is very liberal when it comes to names for objects and functions. This
freedom is a great blessing and a great burden at the same time. Nobody is
obliged to follow strict rules, so everybody who programs something in R can
basically do as he or she pleases.
Choosing a correct name
Although almost anything is allowed when giving names to objects, there are
still a few rules in R that you can’t ignore:
Names must start with a letter or a dot. If you start a name with a dot, the
second character can’t be a digit.
Names should contain only letters, numbers, underscore characters (_),
and dots (.). Although you can force R to accept other characters in names, you
shouldn’t, because these characters often have a special meaning in R.
You can’t use the following special keywords as names: break, else, FALSE, for,
function, if, in, Inf, NA, NaN, next, NULL, repeat, TRUE, and while.
R is case sensitive, which means that, for R, names such as lastname and Lastname
refer to two different objects. If R tells you it can’t find an object or function
and you’re sure
it should be there, check to make sure you used the right case.
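You can check these rules from within R. The sketch below relies on the base function make.names(), which rewrites invalid names into valid ones, so a name that comes back unchanged passes the rules. The helper isValidName() is a hypothetical function written for this illustration, not something R provides.

```r
# isValidName() is a made-up helper: a name is fine if make.names()
# has no reason to change it
isValidName <- function(name) {
  make.names(name) == name
}

isValidName("baskets.of.Granny")  # letters and dots only
isValidName(".hidden")            # a dot followed by a letter
isValidName("2fast")              # starts with a digit
isValidName(".2fast")             # dot followed by a digit
isValidName("while")              # reserved keyword

# Case matters: these are two separate objects
lastname <- "Meys"
Lastname <- "de Vries"
identical(lastname, Lastname)
```

The first two names pass the check and the last three fail it.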
Choosing a clear name
When Joris was young, his parents bought a cute little lamb that needed a
name. After much contemplation, he decided to call it Blacky. Never mind that the
lamb was actually white and its name made everybody else believe that it was a
dog; Joris thought it was a perfect name.
Likewise, calling the result of a long script Blacky may be a bit confusing for
the person who has to read your code later on, even if it makes all kinds of sense
to you. Remember: You could be the one who, in three months, is trying to figure
out exactly what you were trying to achieve. Using descriptive names will allow you
to keep your code readable.
Although you can name an object whatever you want, some names will cause
less trouble than others. You may have noticed that none of the functions we’ve
used until now are mentioned as being off-limits (see the preceding section). That’s
right: If you want to call an object paste, you’re free to do so:
> paste <- paste("This gets","confusing")
> paste
[1] "This gets confusing"
> paste("Don't","you","think?")
[1] "Don't you think?"
R will always know perfectly well when you want the vector paste and when
you need the function paste(). That doesn’t mean it’s a good idea to use the same
name for both items, though. If you can avoid giving the name of a function to an
object, you should.
One situation in which you can really get into trouble is when you use T or F
as an object name. You can do it, but you’re likely to break code
at some point. Although it’s a very bad idea, T and F are too often used as
abbreviations for TRUE and FALSE, respectively. But T and F are not reserved
keywords; they are ordinary objects whose default values are TRUE and FALSE.
So, if you assign something else to T, R finds your object first and never gets
to the built-in value. Any code that still expects T to mean TRUE
will fail from this point on. Never use T or F, not as an object name and not as
an abbreviation.
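The following sketch shows how quietly this goes wrong. All object names other than T are invented for the illustration, and the example removes its own T at the end so that it leaves your session as it found it.

```r
T <- 0               # allowed, because T is an ordinary name
short.form <- T      # picks up the new object, so this is 0
long.form <- TRUE    # TRUE is a keyword and keeps its meaning

rm(T)                # remove the masking object again
restored <- T        # back to the built-in T

# Assigning to a real keyword is refused; tryCatch() turns the
# resulting error into the string "error" so the script keeps running
keyword.result <- tryCatch(
  eval(parse(text = "TRUE <- 0")),
  error = function(e) "error"
)
```

After the first line, anything written as T silently means 0 until the object is removed, which is why spelling out TRUE and FALSE is the safer habit.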
Choosing a naming style
If you have experience in programming, you’ve probably heard of camel case
before. It’s a way of giving longer names to objects and functions. You capitalize
every first letter of a word that is part of the name to improve the readability. So,
you can have a name like veryLongVariableName and still be able to read it.
Contrary to many other languages, R doesn’t use the dot (.) as an operator,
so the dot can be used in names for objects as well. This style is called dotted
style, where you write everything in lowercase and separate words or terms in a
name with a dot. In fact, in R, many function names use dotted style. You’ve met a
function like this earlier in the chapter: print.default(). Some package authors
also use an underscore instead of a dot.
print.default() is the default method for the print() function. Information
on the arguments is given on the Help page for print.default().
You would expect the function that saves the history to be called save.history(),
but it’s called savehistory() without the dot. Likewise, you would expect a
function R.version(), but instead it’s R.Version(). This function will give you all the
information on the version of R you’re running, including the platform you’re
running it on. Sometimes, the people writing R use camel case: If you want
to get only the version number of R, you have to use the function
getRversion(). Some package authors choose to use underscores (_) instead
of dots for separation of the words; this style is used often within some
packages we discuss later in this book.
You’re not obligated to use dotted style; you can use whatever style you want.
We use dotted style throughout this book for objects and camel case for functions.
R uses dotted style for many base functions and objects, but because some parts of
the internal mechanisms of R rely on that dot, you’re safer to use camel case for
functions. Whenever you see a dot, though, you don’t have to wonder what it does
— it’s just part of the name.
The whole naming issue reveals one of the downsides of using open-source
software: It’s written by very intelligent and unselfish people with very strong
opinions.