Here is another warning that pops up regularly and may point to a semantic or
logic error in your code:
> x <- 4
> sqrt(x - 5)
[1] NaN
Warning message:
In sqrt(x - 5) : NaNs produced
Because
x - 5
is negative when
x
is 4, R cannot calculate the square root and
warns you that the square root of a negative number is not a number (
NaN
).
If you’re a mathematician, you may point out that the square root of –1 is
0+1i
. R can, in fact, do calculations on complex numbers, but then you have to
define your variables as complex numbers. You can check, for example, the
Help file
?complex
for more information.
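As a small illustration of the point about complex numbers, the following sketch contrasts the real and the complex square root. It uses only base R functions; the variable names are chosen here purely for illustration.

```r
x <- 4

# On a real (double) input, a negative argument yields NaN plus a warning;
# suppressWarnings() silences the warning so that only the value remains.
real_root <- suppressWarnings(sqrt(x - 5))
is.nan(real_root)

# Converting the input to a complex number first gives a complex result.
complex_root <- sqrt(as.complex(x - 5))
complex_root
```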
Although most warnings result from either semantic or logic errors in your
code, even a simple syntax error can generate a warning instead of an error. If you
want to plot some points in R, you use the
plot()
function, as shown in Chapter
16. It takes an argument
col
to specify the color of the points, but you could
mistakenly try to color the points using the following:
> plot(1:10, 10:1, color='green')
If you try this, you get six warning messages at once, all telling you that
color
is probably not the argument name you were looking for:
Warning messages:
1: In plot.window(...) : "color" is not a graphical parameter
2: In plot.xy(xy, type, ...) : "color" is not a graphical parameter
....
Notice that the warning messages don’t point toward the code you typed at
the command line; instead, they point to functions you never used before, like
plot.window()
and
plot.xy()
. Remember: You can pass arguments from one
function to another using the dots argument (see Chapter 8). That’s exactly what
plot()
does here. So,
plot()
itself doesn’t generate a warning, but every function
that
plot()
passes the
color
argument to does.
If you get warning or error messages, a thorough look at the Help pages of
the function(s) that generated the error can help in determining what the
reason is for the message you got. For example, at the Help page of
?plot.xy
,
you find that the correct name for the argument is
col
.
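One way to see this for yourself is to collect the warnings instead of letting them scroll past. The following is a minimal sketch in base R, not something taken from the chapter: the helper name collect_warnings() is made up for this example, and the null PDF device is used only so that no plot window or file is needed.

```r
# Hypothetical helper: evaluate an expression and return the warning
# messages it raised, in the order they occurred.
collect_warnings <- function(expr) {
  msgs <- character(0)
  withCallingHandlers(
    expr,
    warning = function(w) {
      msgs <<- c(msgs, conditionMessage(w))
      invokeRestart("muffleWarning")   # carry on after recording the warning
    }
  )
  msgs
}

pdf(NULL)   # graphics device that writes no file
wrong <- collect_warnings(plot(1:10, 10:1, color = 'green'))
right <- collect_warnings(plot(1:10, 10:1, col = 'green'))
dev.off()

wrong   # one message per function that rejected the color argument
right   # no messages when the argument is spelled col
```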
So, to summarize, most warnings point to one of the following problems:
The function gave you a result, but for some reason that result may not be
correct.
The function generated an atypical outcome, like
NA
or
NaN
values.
The function couldn’t deal with some of the arguments and ignored them.
Only the last one tells you there’s a problem with your syntax. For the other
ones, you have to examine your code a bit more.
Going Bug Hunting
Although the error message always tells you which line of code generates the
error, it may not be the line of code where things started going wrong. This makes
bug hunting a complex business, but some simple strategies can help you track
down these pesky creatures.
Calculating the logit
To illustrate some bug-hunting strategies in R, we use a simple example. Say,
for example, your colleague wrote two functions to calculate the logit from both
proportions and percentages, but he can’t get them to work. So, he asks you to
help find the bugs. Here’s the code he sends you:
# checks input and does logit calculation
logit <- function(x){
  x <- ifelse(x < 0 | x > 1, "NA", x)
  log(x / (1 - x) )
}
# transforms percentage to number and calls logit
logitpercent <- function(x){
  x <- gsub("%", "", x)
  logit(as.numeric(x))
}
Copy and paste this code into the editor, and save the file using, for example,
logitfunc.R
as its name. After that, source the file in R from the editor using either
the
source()
function or the source button or command from the editor of your
choice. Now the function code is loaded in R, and you’re ready to start hunting.
The logit is nothing else but the logarithm of the odds, calculated as
log(x
/ (1-x))
if
x
is the probability of some event taking place. Statisticians use
this when modeling binary data using generalized linear models. If you ever
need to calculate a logit yourself, you can use the function
qlogis()
for that.
To calculate probabilities from logit values, you use the
plogis()
function.
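The relationship between the hand-written formula and the two built-in functions can be checked with a short sketch. The probabilities in p are arbitrary example values.

```r
p <- c(0.1, 0.25, 0.5, 0.9)

# The logit computed by hand: the logarithm of the odds.
by_hand <- log(p / (1 - p))

# The same values from the quantile function of the logistic distribution.
built_in <- qlogis(p)
all.equal(by_hand, built_in)

# plogis() is the inverse: it maps logit values back to probabilities.
all.equal(plogis(built_in), p)
```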
Knowing where an error comes from
Your colleague complained that he got an error when trying the following
code:
> logitpercent('50%')
Error in 1 - x : non-numeric argument to binary operator
Sure enough, but you don’t find the code
1 - x
in the body of
logitpercent()
. So, the error comes from somewhere else. To know from where, you
can use the
traceback()
function immediately after the error occurred, like this:
> traceback()
2: logit(as.numeric(x)) at logitfunc.R#9
1: logitpercent("50%")
This
traceback()
function prints what is called the call stack that led to
the last error. This call stack represents the sequence of function calls, but in
reverse order. The function at the top is the function in which the actual error
is generated.
In this example, R called the
logitpercent()
function, and that function, in
turn, called
logit()
. The traceback tells you that the error occurred inside the
logit()
function. What’s more, it tells you that the call leading to the error
sits on line 9 of the
logitfunc.R
code file, as indicated by
logitfunc.R#9
in the
traceback()
output.
The call stack gives you a whole lot of information — sometimes too much.
It may point to some obscure internal function as the one that threw the error.
If that function doesn’t ring a bell, check higher in the call stack for a function
you recognize and start debugging from there.
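The traceback() function is meant for interactive use right after an error. As a rough sketch of how similar information could be gathered inside a script, the code below records the call stack at the moment the error is signaled. This is an alternative technique, not the approach described in the text; the names stack, err, and calls are illustrative only.

```r
# The colleague's functions, bugs included.
logit <- function(x) {
  x <- ifelse(x < 0 | x > 1, "NA", x)
  log(x / (1 - x))
}
logitpercent <- function(x) {
  x <- gsub("%", "", x)
  logit(as.numeric(x))
}

stack <- NULL
err <- tryCatch(
  withCallingHandlers(
    logitpercent("50%"),
    # A calling handler runs while the failing calls are still on the stack.
    error = function(e) stack <<- as.list(sys.calls())
  ),
  error = function(e) e
)

# The expression that raised the error.
deparse(conditionCall(err))

# Keep only the calls to the two functions under investigation.
calls <- vapply(stack, function(cl) paste(deparse(cl), collapse = " "),
                character(1))
calls[grepl("^logit", calls)]
```

Unlike traceback(), this lists the calls from the outermost to the innermost.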
Looking inside a function
Now that you know where the error came from, you can try to find out how the
error came about. If you look at the code, you expect that the
as.numeric()
function in
logitpercent()
sends a numeric value to the
logit()
function. So, you
want to check what’s going on in there.
In ancient times, programmers debugged a function simply by letting it print
out the value of variables they were interested in. You can do the same by
inserting a few
print()
statements in the
logit()
function. This way, you can’t
examine the object though, and you have to add and delete print statements at
every point where you want to peek inside the function. Luckily, we’ve passed the
age of the dinosaurs; R gives you better methods to check what’s going on.
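For comparison, this is roughly what the old-fashioned approach looks like. The sketch adds one temporary trace line to a copy of the colleague's function; the wording of the trace message is made up for this example.

```r
# A copy of the colleague's logit() with a temporary trace line added.
logit <- function(x) {
  cat("logit() received: ")
  str(x)                     # temporary: remove once the bug is found
  x <- ifelse(x < 0 | x > 1, "NA", x)
  log(x / (1 - x))
}

# try() keeps the script running even though the call still fails.
result <- try(logit(50), silent = TRUE)
inherits(result, "try-error")
```

Every further variable you want to inspect needs another trace line and another run, which is the drawback the text describes.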
Telling R which function to debug
You can step through a function after you tell R you want to debug it using the
debug()
function, like this:
> debug(logit)
From now on, R will switch to the browser mode every time that function is
called from anywhere in R, until you tell R explicitly to stop debugging or until you
overwrite the function by sourcing it again. To stop debugging a function, you
simply use
undebug(logit)
.
If you want to step through a function only once, you can use the function
debugonce()
instead of
debug()
. R will go to browser mode the next time the
function is called, and only that time — so you don’t need to use
undebug()
to
stop debugging.
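Whether a function is currently flagged for debugging can be checked with isdebugged(). The sketch below only sets and clears the flag and never calls the function, so it does not enter browser mode; the one-line logit() is a stand-in used for illustration, and the flag_* names are made up.

```r
logit <- function(x) log(x / (1 - x))   # stand-in for the function under study

debug(logit)
flag_after_debug <- isdebugged(logit)       # the debugging flag is now set

undebug(logit)
flag_after_undebug <- isdebugged(logit)     # and cleared again

# Redefining the function (which is what sourcing the file does) also drops
# the flag, because the new function object was never passed to debug().
debug(logit)
logit <- function(x) log(x / (1 - x))
flag_after_redefine <- isdebugged(logit)

c(flag_after_debug, flag_after_undebug, flag_after_redefine)
```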
If you try the function
logitpercent()
again after running the code
debug(logit)
, you see the following:
> logitpercent('50%')
debugging in: logit(as.numeric(x))
debug at D:/RForDummies/Ch10/logitfunc.R#2: {
    x <- ifelse(x < 0 | x > 1, "NA", x)
    log(x/(1 - x))
}
Browse[2]>
You see that the prompt changed. It now says
Browse[2]
. This prompt tells
you that you’re browsing inside a function.
The number indicates the level of the call stack at which you’re currently
browsing. Remember from the output of the
traceback()
function that the
logit()
function occurred as the second function on the call stack. That’s the
number
2
in the output above.
The additional text above the changed prompt gives you the following
information:
The line from where you called the function — in this case, the line
logit(as.numeric(x))
from the
logitpercent()
function
The file or function that you debug — in this case, the file
logitfunc.R
, starting
at the second line
Part of the code you’re about to browse through
Stepping through the function
When you’re in browser mode, you can use any R code you want in order to
check the state of different objects. You can browse through the function now with
the following commands:
To run the next line of code, type n and press Enter. R enters the step-through
mode. To run the subsequent lines of code line by line, you don’t have to type n
anymore (although you still can). Just pressing Enter suffices.
To run the remaining part of the code, type c and press Enter.
To exit browser mode, type Q and press Enter.
If you want to look at an object that’s named like any of the special browse
commands, you have to specifically print it out, using either
print(n)
or
str(n)
.
You can try it out yourself now. Just type n in the console, press Enter, and
you see the following:
Browse[2]> n
debug at D:/RForDummies/Ch10/logitfunc.R#3: x <- ifelse(x < 0 | x > 1,
    "NA", x)
R now tells you what line it will run next. Because this is the first line in your
code,
x
still has the value that was passed by the
logitpercent()
function.
It’s always smart to check whether that value is what you expect it to be.
The
logitpercent()
function should pass the value
0.50
to
logit()
, because
this is the translation of 50 percent into a proportion. However, if you look at
the value of
x
, you see the following:
Browse[2]> str(x)
num 50
Okay, it is a number, but it’s 100 times larger than it should be. So, in the
logitpercent()
function, your colleague made a logical error and forgot to divide
by 100. If you correct that in the editor window and then save and source the file
again, the test command gives the correct answer:
> logitpercent('50%')
[1] 0
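The corrected file might look like the following sketch. Only logitpercent() changes; logit() is deliberately left as the colleague wrote it, because its remaining problem has not been dealt with yet.

```r
# logit() is left exactly as the colleague wrote it.
logit <- function(x) {
  x <- ifelse(x < 0 | x > 1, "NA", x)
  log(x / (1 - x))
}

# Corrected: strip the percent sign, then turn the percentage into a proportion.
logitpercent <- function(x) {
  x <- gsub("%", "", x)
  logit(as.numeric(x) / 100)
}

logitpercent('50%')
```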
Start browsing from within the function
This still doesn’t explain the error. Your colleague intended to return
NA
if the
number wasn’t between 0 and 1, but the function doesn’t do that. It’s the
ifelse()
line in the code where the number is checked, so you know where the problem lies.
You can easily browse through the
logit()
function until you reach that point,
but when your function is larger, that task can become tedious. R allows you to
start the browser at a specific point in your code if you insert a
browser()
statement at that point. For example, to start the browser mode right after the
ifelse()
line, you change the body of the
logit()
function, as in the following
code, and source it again:
logit <- function(x){
  x <- ifelse(x < 0 | x > 1, "NA", x)
  browser()
  log(x / (1 - x) )
}
By sourcing the same function again, you implicitly stop debugging the
function. That’s why you don’t have to un-debug the function explicitly using
undebug(logit)
.
If you now try to run this function again, you see the following:
> logit(50)
Called from: logit(50)
Browse[1]>
You get less information than you do when you use
debug()
, but you can use
the browser mode in exactly the same way as with
debug()
.
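What you would find when inspecting x at this point can also be reproduced outside the browser. The sketch below first shows what the ifelse() line returns for an out-of-range value, and then one plausible correction. The function name logit_fixed() and the choice to replace the quoted "NA" with the missing-value constant NA are assumptions made here, not the colleague's actual fix.

```r
# With the quoted "NA", an out-of-range value turns into a character string,
# which is why the arithmetic on the next line of logit() fails.
str(ifelse(50 < 0 | 50 > 1, "NA", 50))

# Assumed correction: use the unquoted missing-value constant instead.
logit_fixed <- function(x) {
  x <- ifelse(x < 0 | x > 1, NA, x)
  log(x / (1 - x))
}

# A valid proportion gives its logit; an out-of-range value gives NA.
logit_fixed(c(0.5, 50))
```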
You can put a
browser()
statement inside a loop as well. If you use the
command c to run the rest of the code, in this case, R will carry out the