Monday, October 26, 2009

plots boxplot, stripchart and mean

## CAN BE REPLACED BY qplot in ggplot2 package

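qplot(x,y,geom=c("boxplot","jitter"))

## plots boxplot, stripchart and mean
## note: the original called a helper mean.na that is not defined in these notes;
## mean(..., na.rm=TRUE) is used here on the assumption that it did the same thing
boxjitter = function(x,y=NULL,border='gray',jitter=.1,vertical=TRUE,...)
{
  if(!is.null(y))
  {
    ## y grouped by x: one box per level of x, points jittered on top
    boxplot(y~x,border=border,ylim = range(y,na.rm=TRUE),outline=FALSE,...)
    stripchart(y~x,vertical=vertical,method="jitter",jitter=jitter,add=TRUE)
    ## group means, marked with a blue "M"
    meanvec = tapply(y,x,mean,na.rm=TRUE)
    points(1:length(meanvec),meanvec,pch="M",col='blue')
  }
  else
  {
    ## single vector: drop NAs, then one box with its mean
    x = x[!is.na(x)]
    boxplot(x,border=border,ylim = range(x),outline=FALSE,...)
    stripchart(x,vertical=vertical,method="jitter",jitter=jitter,add=TRUE)
    points(1,mean(x),pch="M",col='blue')
  }
}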

Thursday, October 22, 2009

calculate number of minor alleles from A G format

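num.minal = function(a1,a2)
{
  ## a1, a2: the two alleles of each genotype, e.g. "A" and "G"
  aall = c(a1,a2)
  tab = table(aall)
  ## minor allele = least frequent one; which.min returns a single name even on ties
  miname = names(tab)[which.min(tab)]
  ## copies of the minor allele per individual (0, 1 or 2)
  apply(cbind(a1==miname,a2==miname),1,sum)
}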
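num.minal counts every code it sees as an allele, so a missing-genotype code such as PLINK's "0" would be tabulated too. A minimal sketch of a variant that treats such a code as missing; the name num.minal.na and the missing argument are hypothetical, not part of the original function:

```r
## Hypothetical variant of num.minal: alleles equal to `missing` become NA
num.minal.na = function(a1, a2, missing = "0")
{
  a1[a1 == missing] = NA
  a2[a2 == missing] = NA
  tab = table(c(a1, a2))                # table() leaves NA out by default
  miname = names(tab)[which.min(tab)]   # least frequent allele
  (a1 == miname) + (a2 == miname)       # NA wherever an allele is missing
}

a1 = c("A", "A", "G", "0", "A")
a2 = c("A", "G", "G", "0", "A")
counts = num.minal.na(a1, a2)
print(counts)
```

Here G is the minor allele (3 copies against 5 of A), so the fourth, missing genotype comes back as NA instead of being counted.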

plink coding --recodeAD


plink --file data --recodeAD

which, assuming C is the minor allele, will recode genotypes as follows:

     SNP       SNP_A ,  SNP_HET
     ---       -----    -----
     A A   ->    0   ,   0
     A C   ->    1   ,   1
     C C   ->    2   ,   0
     0 0   ->   NA   ,  NA
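The same coding can be reproduced in R to check a recoded file against the raw alleles. This is only a sketch of the table above, not PLINK's own code; the function name recodeAD.sketch and its arguments are made up for illustration, and the minor allele has to be supplied by hand:

```r
## Sketch only: mimics the SNP_A / SNP_HET table above
recodeAD.sketch = function(a1, a2, minor, missing = "0")
{
  miss = a1 == missing | a2 == missing
  snp_a = (a1 == minor) + (a2 == minor)   # additive: copies of the minor allele
  snp_het = as.integer(a1 != a2)          # 1 for heterozygotes, 0 otherwise
  snp_a[miss] = NA
  snp_het[miss] = NA
  data.frame(SNP_A = snp_a, SNP_HET = snp_het)
}

geno = recodeAD.sketch(a1 = c("A", "A", "C", "0"),
                       a2 = c("A", "C", "C", "0"),
                       minor = "C")
print(geno)
```

The four rows correspond to A A, A C, C C and 0 0 in the table.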


R table, percentages and chi square test like in stata



table.all = function(...)
{
  args = list(...)
  dnn = names(args)          # argument names become the dimension labels
  x = args[[1]]
  y = args[[2]]
  tab = table(x,y,dnn=dnn)   # build the table once and reuse it
  print(tab)
  print("percentage by row")
  print( round( prop.table(tab,1) * 100 ) )
  print("percentage by col")
  print( round( prop.table(tab,2) * 100 ) )
  print(chisq.test(tab))
}
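The pieces table.all prints can be tried one at a time. A small sketch on made-up counts (the data and the labels "group" and "answer" are invented for illustration):

```r
## Made-up counts, only for illustration
x = rep(c("a", "b"), times = c(30, 50))
y = c(rep(c("u", "v"), times = c(10, 20)),
      rep(c("u", "v"), times = c(30, 20)))

tab = table(x, y, dnn = c("group", "answer"))
row.pct = round(prop.table(tab, 1) * 100)   # margin 1: percentages within each row
col.pct = round(prop.table(tab, 2) * 100)   # margin 2: percentages within each column
print(tab)
print(row.pct)
print(col.pct)
print(chisq.test(tab))
```

For a 2x2 table chisq.test applies Yates' continuity correction by default; chisq.test(tab, correct = FALSE) gives the uncorrected Pearson statistic.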


Friday, October 2, 2009

order data frames in R

Say data has read and prog as columns.

data[order(read, prog), ]

This orders first by read, then by prog. (read and prog have to be visible as variables, e.g. after attach(data).)

data[order(-read), ]

The minus sign reverses the order (descending).

** There can be conflicts if read is also a variable outside of the data frame. Use data[order(data$read), ], or data[order(data[,1]), ] assuming read is column 1.

from: http://www.ats.ucla.edu/stat/R/faq/sort.htm
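A short sketch with the columns spelled out as data$read and data$prog, which avoids the conflict with outside variables mentioned above. The values are made up:

```r
## Made-up scores, only for illustration
data = data.frame(read = c(52, 47, 52, 63),
                  prog = c(2, 1, 1, 3))

asc = data[order(data$read, data$prog), ]   # by read, ties broken by prog
desc = data[order(-data$read), ]            # descending by read
print(asc)
print(desc)
```

with(data, data[order(read, prog), ]) is another way to keep the lookup inside the data frame.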

Thursday, October 1, 2009

list names from data that contain charname

## list names from data frames that contain charname

grepnames = function(charname,data)
{
  ## charname is used as a regular expression by grep
  names(data)[grep(charname,names(data))]
}
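The same lookup can be written with grep's value = TRUE argument, which returns the matching names directly instead of their positions. A small sketch on a made-up data frame:

```r
## Made-up data frame, only for illustration
data = data.frame(read1 = 1:2, read2 = 3:4, prog = 5:6)

hits = grep("read", names(data), value = TRUE)
print(hits)
```

Since the pattern is a regular expression, characters such as "." match more than themselves; fixed = TRUE makes grep treat the pattern as a literal string.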
