Introduction

The following examples and exercises should give you a first look at what R does and how it works.

R is a command-line program, which just means that commands are entered line by line at the prompt. Like any programming language, it is very finicky: everything has to be entered exactly right, including upper and lower case (R is case-sensitive).

There are two ways of entering commands. The first is to type them carefully into the “Console Window” (the lower-left window in RStudio) and hit Enter. The second is to write and edit lines in the script window (the upper-left window in RStudio) and “pass” the code to the console by hitting Ctrl-Enter.

For the most part, we will only be running single commands, and the console is sufficient. But in general, it is smarter to do all of your coding in a script window and then save the raw code file as a text document, which you can revisit and re-run at any point later.

R is a calculator

1+2
## [1] 3
3^6
## [1] 729
sqrt((20-19)^2 + (19-19)^2 + (19-18)^2)/2
## [1] 0.7071068
12345*54312
## [1] 670481640

and so on.

Assigning variable names

The assignment operator is <-. It’s supposed to look like an arrow pointing left.

X <- 5      # sets X equal to 5

Using the assignment operator sets the value of X but doesn’t print any output. To see what X is, you need to type

X
## [1] 5

Note also that X now appears in the upper-right panel of RStudio, letting you know that there is now an object in memory called X.

Now you can use X as if it were a number

X*2
## [1] 10
X^X
## [1] 3125

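Note that you can name a variable almost ANYTHING, as long as the name starts with a letter and contains only letters, digits, periods, and underscores. Remember that names are case-sensitive.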

Fred <- 5
Nancy <- Fred*2
Fred + Nancy
## [1] 15
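
As a minimal sketch of the naming rules, here are a few names that R accepts. The names shoe_size, shoe.size2 and fred are made up for illustration. The last lines show that R treats upper and lower case as different letters.

```r
shoe_size  <- 9.5     # underscores are allowed (hypothetical name)
shoe.size2 <- 10      # so are periods and digits, after the first letter

fred <- 1             # lower-case 'fred' ...
Fred <- 5             # ... is a different object from 'Fred'
Fred + fred
## [1] 6
```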

Vectors

Of course, X can be much more than a single number. The most important kind of object in R is a “vector”, which is an ordered series of values, most often numbers (and therefore resembles “data”). We create one as follows:

X<-c(3,4,5)     # sets X equal to the vector (3,4,5)
X
## [1] 3 4 5

c() is a function - a very very useful function that creates “vectors”. In all functions, arguments are passed within parentheses.

Now, let’s do some arithmetic with this vector:

X + 1
## [1] 4 5 6
X*2
## [1]  6  8 10
X^2
## [1]  9 16 25
((X+X^2/2)/X)^2
## [1]  6.25  9.00 12.25

Note that in all of these cases, the arithmetic operations are performed on a term-by-term basis.

You can also make a vector of character strings:

Names <- c("Alice", "Boris", "Chaozhi", "Diego", "Eliza")
Names
## [1] "Alice"   "Boris"   "Chaozhi" "Diego"   "Eliza"

How R deals with vectors

The usual operators ‘+’, ‘-’, ‘*’, ‘/’ and ‘^’ operate on vectors element by element, so we can try:

X+X
X-X
X*X
X^2
X/X

Some ‘semi-intuitive’ things that R does:

X+1
## [1] 4 5 6
Y <- 1
X+Y
## [1] 4 5 6

but R does not like this (though it will swallow it reluctantly):

Y <- c(1,2)
X+Y
## Warning in X + Y: longer object length is not a multiple of shorter object
## length
## [1] 4 6 6
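
What R is doing here is usually called “recycling”: the shorter vector is repeated until it is as long as the longer one. The sketch below spells that out by hand with rep(). The name Y_recycled is made up for illustration.

```r
X <- c(3, 4, 5)
Y <- c(1, 2)

# Repeat Y until it is as long as X
Y_recycled <- rep(Y, length.out = length(X))
Y_recycled
## [1] 1 2 1

# Same values as X + Y, but without the warning
X + Y_recycled
## [1] 4 6 6

# When the longer length IS a multiple of the shorter one, R recycles silently
c(1, 2, 3, 4) + c(10, 20)
## [1] 11 22 13 24
```

This is also why X + 1 works: the single number 1 is recycled to (1, 1, 1).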

Making Something Worth Plotting

x <- seq(0,4*pi,length.out=1000)   # 1000 evenly spaced values from 0 to 4*pi
y <- sin(x)
plot(x,y)
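
The seq() function used above builds a vector of evenly spaced numbers. Here is a small sketch of how it behaves, using shorter sequences so the output fits on a line. The colon operator is a shortcut for counting in steps of 1.

```r
# Ask for a given number of values between two end points
seq(0, 1, length.out = 5)
## [1] 0.00 0.25 0.50 0.75 1.00

# Or ask for a given step size
seq(0, 10, by = 2)
## [1]  0  2  4  6  8 10

# The colon operator counts in steps of 1
1:5
## [1] 1 2 3 4 5

# The x used for plotting has 1000 values
x <- seq(0, 4*pi, length.out = 1000)
length(x)
## [1] 1000
```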

plot is a function with lots and lots of options (see ?plot). Try the following:

plot(x,y, type="l")
plot(x,y, type="l", col=2, main="My plot of sin(x)")

# points adds points
x2 <- pi*(1:16)/4
points(x2, sin(x2))
points(x2, cos(x2), pch=19, col=2)

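The “asp” argument fixes the ratio between the scales of the x and y axes. With asp=1, one unit on the x axis is drawn the same length as one unit on the y axis: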

plot(x,cos(x), type="l", main="My plot of cos(x) and sin(x)", asp=1)
lines(x,sin(x), col=2)

Exercise: Plot a cosine curve on top of the sine curve using the lines() command.

The curve() function

A shortcut for plotting a function is the “curve” function:

curve(cos(x), xlim=c(0,12))
curve(sin(x), col=2, add=TRUE)    # add=TRUE draws on top of the existing plot
curve(cos(x-pi/4), col="darkblue", lty=2, add=TRUE)

Exercise: Make a vector called Data that contains, in order, the number of people living in your household, the number of pets you have ever owned, your age, and your shoe size. Make another vector called Names, which contains a short name for each of these numbers.

Vectors and Functions

Now that we have some data, we can study them with simple functions:

Data
## [1]  3.0  8.0 40.0  9.5
Names
## [1] "Household" "Pets"      "Age"       "Shoesize"
sum(Data)
## [1] 60.5
length(Data)
## [1] 4

Note that if you try to do something mathematical with characters, you get an error.

sum(Names)
## Error in sum(Names): invalid 'type' (character) of argument
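
If you are not sure whether a vector holds numbers or characters, you can ask R before doing any arithmetic. This is a small sketch using the example values printed above (your own Data will differ).

```r
Data  <- c(3, 8, 40, 9.5)
Names <- c("Household", "Pets", "Age", "Shoesize")

class(Data)
## [1] "numeric"
class(Names)
## [1] "character"

# is.numeric() answers TRUE or FALSE
is.numeric(Data)
## [1] TRUE
is.numeric(Names)
## [1] FALSE

# Functions that do not need numbers still work on characters
length(Names)
## [1] 4
```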

You can learn more about how functions work by using ?, e.g. ?sum, ?length.

A help window will appear (in the lower-right panel in RStudio) with all sorts of information (only occasionally useful) about the functions. Often, there are some useful examples at the bottom of the help window.
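
A few other ways of getting at the same information from the console, shown here as a sketch. All three are standard R functions. The name sum_args is made up for illustration.

```r
help("sum")      # does the same thing as ?sum

example(sum)     # runs the examples from the bottom of the help page

# args() shows the arguments a function accepts
sum_args <- args(sum)
sum_args
```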

Exercise: Create an object called N which is the length of your Data, and use that object with the sum() function to calculate the arithmetic mean of your data.

Follow-up: Try applying the mean() function to Data. Does it agree with your calculation above?

Plotting of basic data

It is easy (but not very interesting) to plot the Data:

plot(Data)

Notice that the figure appears in the lower right panel of the RStudio screen.

A “barplot” might be a little nicer:

barplot(Data)

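And we can give labels to the barplot by adding a names “argument” (note that “argument” refers to any information that you give to a function). The full name of this argument is names.arg, but R accepts an unambiguous abbreviation such as names.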

barplot(Data, names=Names)

Note that you can save the image to a file using the Export button in the RStudio plotting panel.
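
You can also save a plot from code instead of clicking, which is handy in a script. This is a minimal sketch using the pdf() device. The file name my_barplot.pdf is made up, the file is written to your current working directory, and the Data values are the example values printed earlier. The png() function works the same way if you prefer an image file.

```r
Data  <- c(3, 8, 40, 9.5)
Names <- c("Household", "Pets", "Age", "Shoesize")

out_file <- "my_barplot.pdf"       # hypothetical file name

pdf(out_file)                      # open a PDF file as the plotting device
barplot(Data, names.arg = Names)   # this plot goes to the file, not the screen
dev.off()                          # close the file; nothing is saved until this runs
```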

Here’s a pie chart:

pie(Data, labels=Names)

Note: in barplot() the labeling argument is names.arg (which R lets us shorten to names), while in pie() it is labels! WHY?! and HOW CAN WE KNOW?!

We have to check the help files!

?pie
?barplot

As for “why”: it's because, for all of its advantages, R is extremely inconsistent.
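
Besides reading the help files, you can list a function's argument names directly. The sketch below assumes the standard graphics package that ships with R. barplot() is a “generic” function, so its arguments live in its default method, which getS3method() retrieves. The names pie_args and barplot_args are made up for illustration.

```r
# Argument names of pie()
pie_args <- names(formals(pie))
pie_args

# Argument names of the default barplot() method
barplot_args <- names(formals(getS3method("barplot", "default")))
barplot_args

# Check which labeling argument each function uses
"labels" %in% pie_args
## [1] TRUE
"names.arg" %in% barplot_args
## [1] TRUE
```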

Exercise: The International Rhino Federation estimates that there are 17,800 rhinoceroses (Rhinocerotidae spp.) living in the wild in Africa and Asia. Here is a breakdown of their numbers:

   Species Population
1    Black       3610
2    White      11330
3 Sumatran        300
4    Javan         60
5   Indian       2500

Enter these data into R objects and make a barplot and pie chart of these data.