Tuesday, December 15, 2009

pvalue plot vs uniform; FDR

qqunif  = function(p,BH=T,...)
{
  nn = length(p)
  xx =  -log10((1:nn)/(nn+1))
  plot( xx,  -sort(log10(p)),
     xlab=expression(Expected~~-log[10](italic(p))),
        ylab=expression(Observed~~-log[10](italic(p))),
       ... )
  abline(0,1,col='gray')
  if(BH)
    {
      lines( xx, -log10(0.05*(1:nn)/(nn+1)), col='red')
      lines( xx, -log10(0.10*(1:nn)/(nn+1)), col='orange')
      lines( xx, -log10(0.25*(1:nn)/(nn+1)), col='yellow')
      legend('bottomright', c("FDR = 0.05","FDR = 0.10","FDR = 0.25"),
             col=c('red','orange','yellow'),lty=1, cex=1.4)
    }
}


Tuesday, December 8, 2009

imagesc in CDAT

The imagesc function in MATLAB creates plots that have color bars scaled to a user-specified range. This allows you to simply create a whole series of plots that can be easily intercompared.

The following function imitates imagesc:

    import vcs
    def imagesc(x, mini, maxi):
        t = vcs.init()
        a = t.getboxfill('quick')
        a.level_1 = mini
        a.level_2 = maxi
        t.plot(x, a)

x is the input array and mini and maxi define the range in values to plot. The function establishes a VCS canvas, imports the "quick" boxfill graphics method, sets the level range to mini and maxi, and plots x using this customized graphics method. 


other tips for CDAT/Python check http://www.johnny-lin.com/cdat_tips/

Thursday, December 3, 2009

likelihood ratio test

> logLik.test
function(fit0,fit)
{
  l0 = logLik(fit0)
  l1 = logLik(fit)
  lratio = abs(as.numeric(l0)-as.numeric(l1))
  df = abs( attr(l1,'df') - attr(l0,'df') )
  pval = 1-pchisq( 2*lratio, df)
  return(pval)
}

log likelihood from lm, glm, polr


> fit = lm(y ~ x)
> logLik(fit)
'log Lik.' -36.4 (df=4)
> attr(logLik(fit),'df')
[1] 4

Tuesday, November 17, 2009

how to embed a pdf or ppt on a blog

get the code from

https://docs.google.com/viewer

Monday, October 26, 2009

plots boxplot, stripchart and mean

## CAN BE REPLACED BY qplot in ggplot2 package

qplot(x,y,geom=c("boxplot","jitter")) ## plots boxplot, stripchart and mean boxjitter = function(x,y=NULL,border='gray',jitter=.1,vertical=T,...) { if(!is.null(y)) { boxplot(y~x,border=border,ylim = range(y,na.rm=T),outline=F,...) stripchart(y~x,vertical=vertical,method="jitter",jitter=jitter,add=T) meanvec = tapply(y,x,mean.na) points(1:length(meanvec),meanvec,pch="M",col='blue') } else { x = x[!is.na(x)] boxplot(x,border=border,ylim = range(x),outline=F,...) stripchart(x,vertical=vertical,method="jitter",jitter=jitter,add=T) points(1,mean.na(x),pch="M",col='blue') } }

Thursday, October 22, 2009

calculate number of minor alleles from A G format

num.minal = function(a1,a2)
{
  aall = c(a1,a2)
  tab = table(aall)
  miname = names(tab)[tab==min(tab)]
  apply(cbind(a1==miname,a2==miname),1,sum)  
}

plink coding --recodeAD


plink --file data --recodeAD

which, assuming C is the minor allele, will recode genotypes as follows:

     SNP       SNP_A ,  SNP_HET
     ---       -----    -----
     A A   ->    0   ,   0
     A C   ->    1   ,   1
     C C   ->    2   ,   0
     0 0   ->   NA   ,  NA


R table, percentages and chi square test like in stata



table.all = function(...)
{
  args = list(...)
  dnn = names(args)
  x = args[[1]]
  y = args[[2]]
  print(table(x,y,dnn=dnn))
  print("percentage by row")
  print( round( prop.table(table(x,y,dnn=dnn),1) * 100 ) )
  print("percentage by col")
  print( round( prop.table(table(x,y,dnn=dnn),2) * 100 ) )
  print(chisq.test(table(x,y,dnn=dnn)) )
}


Friday, October 2, 2009

order data frames in R

say data has read and prog as colums

data[order(read, prog), ]

this will order first by read then by prog

data[order(-read), ]

- reverses the order

** there can be conflicts if read is a variable outside of the data frame. Use data[order(data[,1]),] assuming read is column 1

from: http://www.ats.ucla.edu/stat/R/faq/sort.htm

Thursday, October 1, 2009

list names from data that contain charname

## list names from data frames that contain charname

grepnames = function(charname,data)
  {
    names(data)[grep(charname,names(data))]
  }

Friday, September 25, 2009

power calculation in SAS

proc power; 
  twosamplefreq test=pchi 
  groupproportions = (.5034  .6459) 
  power = .
  groupns = ( 1494 373)
  alpha = 0.00005;
run;

proc power; 
      twosamplesurvival test=logrank 
         gexphs = 0.6733 | 0.4155
         accrualtime = .00001
   follouptime = 1 
         groupns = 200 | 200 
   grouplossexphazards = (0 0)
         power = .; 
proc power; 
      twosamplesurvival test=logrank 
         curve("Control")   = (1):(.45) 
         curve("Treatment") = (1):(.66) 
         groupsurvival = "Control" | "Treatment" 
   GROUPLOSSEXPHAZARDS = (.2231 .2231)
   /*grouplossexphazards = (0,0)*/
         accrualtime = .00001
         follouptime = 1 
         groupns = 120 | 120 
         power = .; 
run;

proc power; 
      twosamplesurvival test=logrank 
         curve("Control")   = (1):(.51) 
         curve("Treatment") = (1):(.66) 
         groupsurvival = "Control" | "Treatment" 
   GROUPLOSSEXPHAZARDS = (.2231 .2231)
   /*grouplossexphazards = (0,0)*/
         accrualtime = .00001
         follouptime = 1 
         groupns = 100 | 100 
         power = .; 
run;

proc power; 
      twosamplesurvival test=logrank 
         curve("Control")   = (1):(.45) 
         curve("Treatment") = (1):(.70) 
         groupsurvival = "Control" | "Treatment" 
   GROUPLOSSEXPHAZARDS = (.2231 .2231)
   /*grouplossexphazards = (0,0)*/
         accrualtime = .00001
         follouptime = 1 
         groupns = 200 | 200 
         power = .; 
run;
 


how to get rid of emac's automatic switching of _ into ->

Solution by Paul R. (thanks!)

M-x
set-variable 
ess-S-assign 
"-"  

That's it.


Monday, April 13, 2009



File Management in Python
( Page 1 of 5 )

File management is a basic function, and an important part of many applications. Python makes file management surprisingly easy, especially when compared to other languages. Peyton McCullough explains the basics.

Introduction

The game you played yesterday uses files to store game saves. The order you placed yesterday was saved in a file. That report you typed up this morning was, obviously, stored in a file as well.

File management is an important part of many applications written in nearly every language. Python is no exception to this. In this article, we will explore the task of manipulating files using several modules. We'll read, write to, append and do other strange things to files. Let's get started.

Reading and Writing

The most basic tasks involved in file manipulation are reading data from files and writing data to files. This is a very simple task to learn. Let's open a file for writing:

fileHandle = open ( 'test.txt', 'w' )

The "w" indicates that we will be writing to the file, and the rest is pretty simple to understand. The next step is to write data to the file:

fileHandle.write ( 'This is a test.\nReally, it is.' )

This will write the string "This is a test." to the file's first line and "Really, it is." to the file's second line. Finally, we need to clean up after ourselves and close the file:

fileHandle.close()

As you can see, it's very easy, especially with Python's object orientation. Note that when you use the "w" mode to write to the file again, all its contents will be deleted. To get past this, use the "a" mode to append data to a file, adding data to the bottom:

fileHandle = open ( 'test.txt', 'a' )
fileHandle.write ( '\n\n\nBottom line.' )
fileHandle.close()

Now let's read our file and display the contents:

fileHandle = open ( 'test.txt' )
print fileHandle.read()
fileHandle.close()

This will read the entire file and print the data within it. We can also read a single line in the file:

fileHandle = open ( 'test.txt' )
print fileHandle.readline() # "This is a test."
fileHandle.close()

It is also possible to store the lines of a file into a list:

fileHandle = open ( 'test.txt' )
fileList = fileHandle.readlines()
for fileLine in fileList:
print '>>', fileLine
fileHandle.close()

When reading a file, Python's place in the file will be remembered, illustrated in this example:

fileHandle = open ( 'test.txt' )
garbage = fileHandle.readline()
fileHandle.readline() # "Really, it is."
fileHandle.close()

Only the second line is displayed. We can, however, get past this by telling Python to resume reading from a different position:

fileHandle = open ( 'test.txt' )
garbage = fileHandle.readline()
fileHandle.seek ( 0 )
print fileHandle.readline() # "This is a test."
fileHandle.close()

In the above example, we tell Python to continue reading from the first byte in the file. Thus, the first line is printed. We can also request Python's place within the file:

fileHandle = open ( 'test.txt' )
print fileHandle.readline() # "This is a test."
print fileHandle.tell() # "17"
print fileHandle.readline() # "Really, it is."

It is also possible to read the file a few bytes at a time:

fileHandle = open ( 'test.txt' )
print fileHandle.read ( 1 ) # "T"
fileHandle.seek ( 4 )
print FileHandle.read ( 1 ) # "T"

When working with Windows and Macintosh, sometimes you are required to read and write files in binary mode, such as images or executional files. To do this, simply append "b" to the file mode:

fileHandle = open ( 'testBinary.txt', 'wb' )
fileHandle.write ( 'There is no spoon.' )
fileHandle.close()

fileHandle = open ( 'testBinary.txt', 'rb' )
print fileHandle.read()
fileHandle.close()


#write to file
f = open('c:/temp/workfile','w')
f.write('hallo')
f.close

#open from file
f = open('file.txt')

#read linewise
for i in f:
print i

#read characterwise
for i in f.read():
print i

Contributors

google