Another function is scan(), which allows us to read data in several formats. Usually we use
scan() to parse R scripts, but we can also use it to import text (characters).
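As a minimal sketch of importing text with scan(), we can pass a string through the text argument instead of a file. The input string below is a made-up example, not data from this chapter.

```r
# Hypothetical input string (an assumption for illustration)
txt <- "the quick brown fox"

# what = "character" asks scan() for character tokens;
# by default tokens are split on whitespace
words <- scan(text = txt, what = "character", quiet = TRUE)

words
length(words)
```

Here quiet = TRUE only suppresses the message reporting how many items were read.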
2.3.1 Reading tables
If the data we want to import is in some tabular format, we can use the set of functions to
read tables, such as read.table() and its sister functions (e.g. read.csv(), read.delim(),
read.fwf()). These functions read a file in table format and create a data frame from it,
with rows corresponding to cases and columns corresponding to fields in the file.
Functions to read files in tabular format

Function        Description
read.table()    main function to read a file in table format
read.csv()      reads csv files separated by a comma ","
read.csv2()     reads csv files separated by a semicolon ";"
read.delim()    reads files separated by tabs "\t"
read.delim2()   similar to read.delim()
read.fwf()      reads fixed width format files
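To make the differences between these functions concrete, here is a small sketch that reads the same made-up data in several layouts. All the input is invented for illustration, and the temporary file exists only so that read.fwf() has something to read. Note that read.csv2() (like read.delim2()) also expects a comma as the decimal mark.

```r
# All input below is made-up sample data (an assumption for illustration)

# read.csv(): comma-separated fields, "." as decimal mark
d1 <- read.csv(text = "name,score\nana,1.5\nbob,2.5")

# read.csv2(): semicolon-separated fields, "," as decimal mark
d2 <- read.csv2(text = "name;score\nana;1,5\nbob;2,5")

# read.delim(): tab-separated fields
d3 <- read.delim(text = "name\tscore\nana\t1.5\nbob\t2.5")

# read.fwf(): fixed-width fields; here two fields of 3 characters each
tmp <- tempfile()
writeLines(c("ana1.5", "bob2.5"), tmp)
d4 <- read.fwf(tmp, widths = c(3, 3), col.names = c("name", "score"))
unlink(tmp)

# Compare the results of the first three calls
all.equal(d1, d2)
all.equal(d1, d3)
d4
```

The sister functions are essentially read.table() with different default values for arguments such as sep, dec, and header.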
Let’s see a simple example reading a file from the Australian radio broadcaster ABC
(http://www.abc.net.au/radio/). In particular, we’ll read a csv file that contains data from
ABC’s radio stations. The file is located at:
http://www.abc.net.au/local/data/public/stations/abc-local-radio.csv
To import the file abc-local-radio.csv, we can use either read.table() or read.csv()
(just choose the right parameters). Here’s the code to read the file with read.table():
# abc radio stations data URL
abc = "http://www.abc.net.au/local/data/public/stations/abc-local-radio.csv"
# read data from URL
radio = read.table(abc, header = TRUE, sep = ",", stringsAsFactors = FALSE)
In this case, the location of the file is defined in the object abc, which is the first argument
passed to read.table(). Then we choose other arguments such as header = TRUE, sep =
",", and stringsAsFactors = FALSE. The argument header = TRUE indicates that the first
row of the file contains the names of the columns. The separator (a comma) is specified by
sep = ",". Finally, to keep the character strings in the file as "character" in the data
frame, we use stringsAsFactors = FALSE.
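The effect of these arguments can be checked without a network connection. The sketch below uses a small made-up csv string (an assumption for illustration, not the ABC data) and compares the column class obtained with each setting of stringsAsFactors.

```r
# Made-up sample data standing in for a csv file (an assumption for illustration)
csv_text <- "station,state\nStation A,NSW\nStation B,NT"

# header = TRUE: the first row gives the column names
# sep = ",": fields are separated by commas
as_char <- read.table(text = csv_text, header = TRUE, sep = ",",
                      stringsAsFactors = FALSE)
as_fact <- read.table(text = csv_text, header = TRUE, sep = ",",
                      stringsAsFactors = TRUE)

names(as_char)
class(as_char$station)   # character
class(as_fact$station)   # factor
```

Keeping strings as "character" is usually what we want when the goal is to manipulate the text itself rather than treat it as categories.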
If everything went fine during the file reading operation, the next thing to do is to check the
size of the created data frame using dim():
CC BY-NC-SA 3.0 Gaston Sanchez
Handling and Processing Strings in R