Wednesday, November 12, 2014

Human chr6 DAF peak contains PRIM2 gene

Continuing the discussion of derived allele frequencies (DAFs), we look at chr6, specifically the region containing the PRIM2 gene, which is thought to be under balancing selection.
 chr <- "chr6"
 # Write both panels (mean DAF on top, structural variant counts below) to one JPEG
 jpeg(paste("DAF.",chr,".jpeg",sep=""),width=1420)
 par(mfrow=c(2,1))
 # Per-window mean DAF: V2 = window start, V3 = window end, V4 = mean DAF
 M <- read.table(file=paste("h.",chr,".mean.bed",sep=""),header=FALSE,stringsAsFactors=FALSE)
 plot(as.numeric(M$V2),as.numeric(M$V4),xlab="Position along chromosome",ylab="Mean derived allele frequency",main=chr)
 # Centromere
 lines(c(58830166,61830166),c(0.2,0.2),col="red",lwd=5)
 text(63830166, 0.25,labels="Centromere",col="red")
 # Telomeres at both ends of the chromosome
 lines(c(171105067,171115067),c(0.3,0.3),col="blue",lwd=5)
 text(171105067, 0.35,labels="Telomere",col="blue")
 lines(c(0,10000),c(0.3,0.3),col="blue",lwd=5)
 text(0, 0.35,labels="Telomere",col="blue")
 # PRIM2 gene span
 lines(c(57182422,57513376),c(0.25,0.25),col="brown",lwd=5)
 text(57182422, 0.3,labels="PRIM2 gene",col="brown")
 # Highlight the windows lying entirely within PRIM2
 N <- M[M$V2>57182422 & M$V3<57513376,]
 points(N$V2,N$V4,pch=13,col="blue")
 # Per-window count of known structural variants (V4)
 C <- read.table(file=paste("h.",chr,".countdgv.bed",sep=""),header=FALSE)
 plot(C$V2,C$V4,xlab="Position along chromosome",ylab="Count of known structural variants",main=chr)
 # Spearman correlation between mean DAF and structural variant count per window
 print(cor.test(as.numeric(M$V4),C$V4,method="spearman"))
 dev.off()
The Spearman correlation coefficient of 0.2559672 between the mean derived allele frequency and the number of CNVs is in line with the results from chr2. The PRIM2 gene, which has been shown to have high values of diversity and Tajima's D, is located near the centromere and has high mean DAF values. The windows within the PRIM2 gene are marked by blue circled crosses.
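
The two steps behind these numbers, selecting the windows that fall inside a gene interval and computing a Spearman correlation between two per-window tracks, can be tried on toy data. The sketch below is only an illustration: the data frames are invented stand-ins for the h.chr6.mean.bed and h.chr6.countdgv.bed tables, the helper name windows_in_region is hypothetical, and it uses inclusive bounds where the code above uses strict inequalities.

```r
# Toy stand-ins for the two BED-like tables (all values are made up).
# Column layout follows the post: V2 = window start, V3 = window end, V4 = value.
M <- data.frame(V1 = "chr6",
                V2 = c(0, 100, 200, 300, 400),
                V3 = c(100, 200, 300, 400, 500),
                V4 = c(0.10, 0.30, 0.25, 0.12, 0.08))   # mean DAF per window
C <- data.frame(V1 = "chr6", V2 = M$V2, V3 = M$V3,
                V4 = c(1, 5, 4, 0, 2))                  # variant count per window

# Hypothetical helper: keep windows lying entirely inside [start, end].
windows_in_region <- function(bed, start, end) {
  bed[bed$V2 >= start & bed$V3 <= end, ]
}

# Windows inside a made-up "gene" spanning positions 100 to 300
N <- windows_in_region(M, 100, 300)

# cor.test pairs values by row, so both tables must list the same windows
# in the same order; check that before correlating.
stopifnot(nrow(M) == nrow(C), all(M$V2 == C$V2))

res <- cor.test(M$V4, C$V4, method = "spearman")
rho <- unname(res$estimate)   # Spearman's rho; 0.6 for this toy data
p_value <- res$p.value
```

The alignment check is worth keeping when working with the real files, because a correlation over two tables with mismatched or reordered windows would still run but would pair unrelated positions.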

