## Thursday, November 29, 2007

### Convert factors to numbers

If you have a vector of factors it is easy to get the factor level; however, I always forget how to extract the factor value. I ran into the answer here.
`> x<-factor(c(round(rnorm(10),2),"A","B",NA))> x [1] 1.61  1.12  1.26  0.09  -0.13 0.16  -0.03 -0.1  0.09  -0.47  [11] A     B    Levels: -0.03 0.09 -0.1 -0.13 0.16 -0.47 1.12 1.26 1.61 A B> as.numeric(x) [1]  9  7  8  2  4  5  1  3  2  6 10 11 NA> as.numeric(levels(x)[x]) [1]  1.61  1.12  1.26  0.09 -0.13  0.16 -0.03 -0.10  0.09 -0.47     [11]   NA    NA   NAWarning message:NAs introduced by coercion `

**UPDATE**
And another method mentioned in the comments (that I like better):

> as.numeric(as.character(x))

## Thursday, November 15, 2007

### Preparing plots for publication

The plotting capabilities of R are excellent; however, when I am preparing a figure for publication, I often need to combine multiple plots or add objects (e.g., arrows or text) to an existing plot. While this can be accomplished in R, my patience for tweaking layout parameters tends to run out quickly. I searched around and found a nice solution using open source software (an example with Matlab plots is described here).

1) Create the desired figure in R and export as an eps file:

`library(ggplot2)postscript("myfig.eps",horizontal=FALSE, paper="special",height=10,width=10)qplot(x,y,data=mydat)dev.off()`

2) Install the necessary software on your computer, and then convert the image file to one more easily edited.

`#install softwaresudo aptitude install pstoedit tgif#clean up the eps file so it is more easily readeps2eps myfig.eps myfig2.eps#convert the eps file to a tgif objectpstoedit -f tgif myfig2.eps myfig2.obj`

3) Edit the new file myfig2.obj in tgif.

## Thursday, October 18, 2007

### Approximate sunrise and sunset times

This function is not perfect, but it does a reasonable job estimating sunrise and sunset times for my field site. If more accurate data are required, try here. Note, the command to calculate day of year is: strptime(x, "%m/%d/%Y")\$yday+1
``````suncalc<-function(d,Lat=48.1442,Long=-122.7551){
## d is the day of year
## Lat is latitude in decimal degrees
## Long is longitude in decimal degrees (negative == West)

##This method is copied from:
##Teets, D.A. 2003. Predicting sunrise and sunset times.
##  The College Mathematics Journal 34(4):317-321.

## At the default location the estimates of sunrise and sunset are within
## seven minutes of the correct times (http://aa.usno.navy.mil/data/docs/RS_OneYear.php)
## with a mean of 2.4 minutes error.

## Function to convert degrees to radians

R=6378

##Radians between the xy-plane and the ecliptic plane

## Calculate offset of sunrise based on longitude (min)
## If Long is negative, then the mod represents degrees West of
## a standard time meridian, so timing of sunrise and sunset should
timezone = -4*(abs(Long)%%15)*sign(Long)

## The earth's mean distance from the sun (km)
r = 149598000

theta = 2*pi/365.25*(d-80)

z.s = r*sin(theta)*sin(epsilon)
r.p = sqrt(r^2-z.s^2)

t0 = 1440/(2*pi)*acos((R-z.s*sin(L))/(r.p*cos(L)))

that = t0+5

## Adjust "noon" for the fact that the earth's orbit is not circular:
n = 720-10*sin(4*pi*(d-80)/365.25)+8*sin(2*pi*d/365.25)

## now sunrise and sunset are:
sunrise = (n-that+timezone)/60
sunset = (n+that+timezone)/60

return(list("sunrise" = sunrise,"sunset" = sunset))
}

``````

## Sunday, October 14, 2007

### Convert polar coordinates to Cartesian

When I want to calculate the coordinates of a location (e.g., a nest or burrow) based on distance and bearing from a grid point, this function helps me avoid writing down SOH-CAH-TOA every time. Just note that the bearing in this case is from the grid point (known location) to the unknown location.
`polar2cart<-function(x,y,dist,bearing,as.deg=FALSE){  ## Translate Polar coordinates into Cartesian coordinates  ## based on starting location, distance, and bearing  ## as.deg indicates if the bearing is in degrees (T) or radians (F)    if(as.deg){    ##if bearing is in degrees, convert to radians    bearing=bearing*pi/180  }    newx<-x+dist*sin(bearing)  ##X  newy<-y+dist*cos(bearing)  ##Y  return(list("x"=newx,"y"=newy))}##Exampleoldloc=c(0,5) bearing=200 #degreesdist=5newloc<-polar2cart(oldloc[1],oldloc[2],dist,bearing,TRUE)plot(oldloc[1],oldloc[2],xlim=c(-10,10),ylim=c(-10,10))points(newloc\$x,newloc\$y,col="red")`

## Wednesday, October 3, 2007

### Reorder factor levels

Very often, especially when plotting data, I need to reorder the levels of a factor because the default order is alphabetical. There must be many ways of reordering the levels; however, I always forget which package to look in. A direct way of reordering, using standard syntax is as follows:
`## generate datax = factor(sample(letters[1:5],100, replace=TRUE))print(levels(x))  ## This will show the levels of x are "Levels: a b c d e"## To reorder the levels:## note, if x is not a factor use levels(factor(x))x = factor(x,levels(x)[c(4,5,1:3)])print(levels(x))  ## Now "Levels: d e a b c"`
In this case, the level order could be set in the first line; however, if there are many levels (and you don't want to type out all of the levels explicitly), the above method is preferred. Again, if there is a better way to do this, please let me know in the comments.

## Wednesday, September 5, 2007

### Extract objects from a list

When using Rmpi to send processes to many nodes, it is convenient to create a list of tasks that are assigned to nodes as they become available. In my case, I was working through a large factorial set of simulations and needed to use a unique set of variable values for each task. Rather than assign these variables individually within the slave function, I decided to extract the variables directly from the list using a variation of the following code:
`mylist=(list(a=1,b=2,c="string1",d=list("r"=2,"z"="string2")))for(i in 1:length(mylist)){  ##first extract the object value  tempobj=mylist[[i]]  ##now create a new variable with the original name of the list item  eval(parse(text=paste(names(mylist)[[i]],"= tempobj")))}> print(a)[1] 1> print(b)[1] 2> print(c)[1] "string1"> print(d)\$r[1] 2\$z[1] "string2"`

If there is a more elegant way to do this, please let me know in the comments.

### Parallel computing in R

Lately I have been running R in parallel on a Beowulf cluster using the Rmpi package. Instrumental in my ability to do this were two excellent tutorials:

The Division of Biostatistics at UCSF provides some of the background required to get things running on the Linux side of things. (Note: sometimes the url for this site comes up with an extra "biostat" in the address -- just delete that and the page should load.

The Acadia Centre for Mathematical Modelling and Computation has a variety of sample scripts to point you in the right direction. Once you read through the code, it becomes very easy to set up a template you can re-use in the future.

## Thursday, August 23, 2007

### Execute system commands within an R Script

To execute a system command from within an R script, just use system(). This is handy for zipping large output files on the fly.
`write.csv(mydat, "mydat.csv")system("gzip mydat.csv", wait=FALSE)`

## Friday, August 17, 2007

### Offset in glm ()

To add an offset to the linear predictor of a generalized linear model (or models from the survival package such as coxph and clogit), use offset(x) in the formula. This will add an offset to the linear predictor with known coefficient 1.

## Thursday, August 16, 2007

### Including arguments in R CMD BATCH mode

When you have multiple computers or processors at your disposal and wish to run the same script with different arguments, use the following at the command line (here described for Linux; remove the linebreak, it is just there for display purposes):
`\$ R CMD BATCH --no-save --no-restore '--args a=1 b=c(2,5,6)'test.R test.out &`

Where test.R is the R script file you wish to run and test.out is a text file to include the screen output of the R terminal. A key point here is that each argument must have no spaces because --args is space delimited.

To include the variables listed in --args, adapt the following code from test.R:
`##First read in the arguments listed at the command lineargs=(commandArgs(TRUE))##args is now a list of character vectors## First check to see if arguments are passed.## Then cycle through each element of the list and evaluate the expressions.if(length(args)==0){    print("No arguments supplied.")    ##supply default values    a = 1    b = c(1,1,1)}else{    for(i in 1:length(args)){         eval(parse(text=args[[i]]))    }}print(a*2)print(b*3)`
This produces the following in test.out:
`> print(a*2)[1] 2> print(b*3)[1]  6 15 18`

## Friday, July 20, 2007

### Variance-Covariance Matrix in glm

This is a small function Venables and Ripley provide in their MASS book. You don't need it anymore because vcov() has a method for the glm class. However, it is useful to see how to extract bits from a fitted model object.

`vcov.glm<-function(obj){  #return the variance-covariance matrix of a glm object  #from p. 188 in Venables and Ripley. 2002.   #Modern Applied Statistics With S. Springer. New York.  so <- summary(obj,corr=F)  so\$dispersion * so\$cov.unscaled}`

## Wednesday, July 18, 2007

### Visualizing Data

Sometimes it is helpful to see the plots created from the example
files within R libraries. Of course you can paste them directly
into R, but this web page allows you to view the images within the
context of the help files:

R Graphical Manuals

Also, check out this link for the R Graph Gallery.

## Thursday, May 17, 2007

### Random number generation

Luc Devroye from McGill University's School of Computer Science has provided pdf copies of his book, "Non-Uniform Random Variate Generation," on-line.

http://cg.scs.carleton.ca/~luc/rnbookindex.html

Most common distributions are already built into R, but sometimes you need to build your own -- Luc provides the algorithms to do it.

### Repeat elements of a vector

Two methods. The first produces a vector, the second a one column matrix.

x=1:10

# Method 1
rep(x,each=3)

# Method 2
matrix(t(matrix(x,length(x),3)))

### Convert image to matrix in R

This is how to use the Pixmap library to read in an image as a matrix.

`> library(pixmap)# the next command may only work on Linux> system("convert foo.tiff foo.ppm")> img <- read.pnm("foo.ppm")`

To get info on your new object:
`> str(img)`

Although included in the previous output, the size of the image can be extracted by:
`>img@size`

Then to extract the red channel from the image for the first ten rows:
`> myextract <- img@red[1:10,]`

Or to extract the entire red channel to an actual matrix:
`> red.mat<-matrix(NA,img@size[1],img@size[2])> red.mat<-img@red  `

## Wednesday, May 16, 2007

### Calculate turning angles and step lengths from location data

`anglefun <- function(xx,yy,bearing=TRUE,as.deg=FALSE){  ## calculates the compass bearing of the line between two points  ## xx and yy are the differences in x and y coordinates between two points  ## Options:  ## bearing = FALSE returns +/- pi instead of 0:2*pi  ## as.deg = TRUE returns degrees instead of radians  c = 1  if (as.deg){    c = 180/pi  }    b<-sign(xx)  b[b==0]<-1  #corrects for the fact that sign(0) == 0  tempangle = b*(yy<0)*pi+atan(xx/yy)  if(bearing){    #return a compass bearing 0 to 2pi    #if bearing==FALSE then a heading (+/- pi) is returned    tempangle[tempangle<0]<-tempangle[tempangle<0]+2*pi  }  return(tempangle*c)}bearing.ta <- function(loc1,loc2,loc3,as.deg=FALSE){  ## calculates the bearing and length of the two lines  ##    formed by three points  ## the turning angle from the first bearing to the  ##    second bearing is also calculated  ## locations are assumed to be in (X,Y) format.  ## Options:  ## as.deg = TRUE returns degrees instead of radians  if (length(loc1) != 2 | length(loc2) != 2 | length(loc3) !=2){    print("Locations must consist of either three vectors, length == 2,or three two-column dataframes")    return(NaN)  }  c = 1  if (as.deg){    c = 180/pi  }    locdiff1<-loc2-loc1  locdiff2<-loc3-loc2  bearing1<-anglefun(locdiff1[1],locdiff1[2],bearing=F)  bearing2<-anglefun(locdiff2[1],locdiff2[2],bearing=F)  if(is.data.frame(locdiff1)){    dist1<-sqrt(rowSums(locdiff1^2))    dist2<-sqrt(rowSums(locdiff2^2))  }else{    dist1<-sqrt(sum(locdiff1^2))    dist2<-sqrt(sum(locdiff2^2))  }    ta=(bearing2-bearing1)    ta[ta < -pi] = ta[ta < -pi] + 2*pi  ta[ta > pi] = ta[ta > pi] - 2*pi  return(list(bearing1=unlist(bearing1*c),bearing2=unlist(bearing2*c),  ta=unlist(ta*c),dist1=unlist(dist1),dist2=unlist(dist2)))}`