statistics bootstrap - Bootstrapping multiple columns with R
I'm relatively new to R, and I'm trying to build a function that loops through the columns of an imported table and produces output consisting of the means and 95% confidence intervals. Ideally it should be possible to bootstrap columns with different sample sizes, but first I'd like to get the iteration working. I have something that sort of works, but I can't get it all the way there. This is what the code looks like, with sample data and output included:
    #cdata <- read.csv(file.choose(), header=TRUE)  # read data from selected file; works, commented out because data is provided below
    #cdata  # check imported data

    #sample data
    #   wall nrpk cisc whsc lkwh ylpr
    #1    21    8    1    2    2    5
    #2    57    9    3    1    0    1
    #3    45    6    9    1    2    0
    #4    17   10    2    0    3    0
    #5    33    2    4    0    0    0
    #6    41    4   13    1    0    0
    #7    21    4    7    1    0    0
    #8    32    7    1    7    6    0
    #9     9    7    0    5    1    0
    #10    9    4    1    0    0    0

    x <- cdata[, c("wall","nrpk","lkwh","ylpr")]  # only select relevant species
    i <- nrow(x)  # count number of rows for bootstrapping
    g <- ncol(x)  # count number of columns for iteration

    # build bootstrapping function; works for the first column but doesn't iterate
    bootfun <- function(bootdata, reps) {
      boot <- function(bootdata) {
        s1 <- sample(bootdata, size=i, replace=TRUE)
        ms1 <- mean(s1)
        return(ms1)
      }  # a single bootstrap
      bootrep <- replicate(n=reps, boot(bootdata))
      return(bootrep)
    }  # replicates the bootstrap of "bootdata" "reps" number of times and outputs a vector of results

    cvr1 <- bootfun(x$wall, 50000)  # have unsuccessfully tried iterating the location in various ways (i.e. x[i])
    cvrquantile <- quantile(cvr1, c(0.025, 0.975))
    cvrmean <- mean(cvr1)
    vec <- c(cvrmean, cvrquantile)  # puts results into a suitable form for output
    vecr <- sapply(vec, round, 1)   # rounds results
    vecr
    #       2.5% 97.5%
    # 28.5  19.4  38.1

    #apply(x[1:g], 2, bootfun)  ## doesn't work in this case

    #desired output:
    #species  mean  lowerci  upperci
    #wall     28.5     19.4     38.1
    #nrpk      6.1      4.6      7.6
    #ylpr      0.6      0.0      1.6
I've also tried using the boot package. It works beautifully to iterate through the means, but I can't get it to do the same with the confidence intervals. The "ordinary" code above also has the advantage that you can retrieve the bootstrapping results, which might then be used for other calculations. For the sake of completeness, here is the boot code:
    # bootstrapping using the boot package
    library(boot)
    #data <- read.csv(file.choose(), header=TRUE)  # read data from selected file
    #x <- data[, c("wall","nrpk","lkwh","ylpr")]   # only select relevant columns
    #x  # check data

    #sample data
    #   wall nrpk lkwh ylpr
    #1    21    8    2    5
    #2    57    9    0    1
    #3    45    6    2    0
    #4    17   10    3    0
    #5    33    2    0    0
    #6    41    4    0    0
    #7    21    4    0    0
    #8    32    7    6    0
    #9     9    7    1    0
    #10    9    4    0    0

    i <- nrow(x)  # count number of rows for resampling
    g <- ncol(x)  # count number of columns to step through with bootstrapping

    boot.mean <- function(x, i) { boot.mean <- mean(x[i]) }  # bootstrapping function to get the mean
    z <- boot(x, boot.mean, R=50000)  # bootstrapping function, uses mean and number of reps
    boot.ci(z, type="perc")           # derive 95% confidence intervals
    apply(x[1:g], 2, boot.mean)       # bootstrap all columns

    #output:
    #wall nrpk lkwh ylpr
    #28.5  6.1  1.4  0.6
I've gone through all of the resources I can find and can't seem to get things working. What I'd like as output is the bootstrapped means with the associated confidence intervals for each column. Thanks!
Note:
    apply(x[1:g], 2, boot.mean)  # bootstrap columns
doesn't bootstrap. It is just calculating the mean of each column.
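A quick way to see this, as a minimal sketch using only the "wall" column from the question's sample data: apply() hands each column to boot.mean without an index argument, so x[i] is just the whole column and the result is the ordinary mean. The names no_index and plain are introduced here purely for illustration.

```r
# Same helper as in the question.
boot.mean <- function(x, i) { boot.mean <- mean(x[i]) }

# "wall" column from the question's sample data.
wall <- c(21, 57, 45, 17, 33, 41, 21, 32, 9, 9)

# Called the way apply() calls it: no index vector is supplied,
# so x[i] returns every element and no resampling takes place.
no_index <- boot.mean(wall)

# The plain column mean, for comparison.
plain <- mean(wall)

print(c(no_index = no_index, plain = plain))
```

Both values are the same, which is why the apply() call reproduces the column means rather than a bootstrap.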
For the bootstrapped mean and confidence interval of each column, try this:
    apply(x, 2, function(y) {
      b <- boot(y, boot.mean, R=50000)
      c(mean(b$t), boot.ci(b, type="perc", conf=0.95)$percent[4:5])
    })
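If you also want to keep the bootstrap replicates themselves, as the "ordinary" code in the question does, a base R version along the following lines may work. This is only a sketch: the names boot_means, all_reps and summary_tab are made up for illustration, the seed and the reduced number of replicates are assumptions chosen to keep it quick and reproducible, and dropping NA values is one assumed way of handling columns with different sample sizes.

```r
# Sample data from the question (relevant species only).
x <- data.frame(
  wall = c(21, 57, 45, 17, 33, 41, 21, 32, 9, 9),
  nrpk = c(8, 9, 6, 10, 2, 4, 4, 7, 7, 4),
  lkwh = c(2, 0, 2, 3, 0, 0, 0, 6, 1, 0),
  ylpr = c(5, 1, 0, 0, 0, 0, 0, 0, 0, 0)
)

set.seed(1)   # assumption: fixed seed so the run is reproducible
reps <- 2000  # assumption: far fewer than 50000, just to run quickly

# Bootstrap means for one column: drop NAs, then resample with
# replacement at the column's own length, `reps` times.
boot_means <- function(v, reps) {
  v <- v[!is.na(v)]
  replicate(reps, mean(sample(v, size = length(v), replace = TRUE)))
}

# One column of replicates per species (a reps x ncol(x) matrix),
# kept around so it can be reused for other calculations.
all_reps <- sapply(x, boot_means, reps = reps)

# Mean and 95% percentile interval for each species.
summary_tab <- t(apply(all_reps, 2, function(b) {
  c(mean    = mean(b),
    lowerci = unname(quantile(b, 0.025)),
    upperci = unname(quantile(b, 0.975)))
}))

print(round(summary_tab, 1))
```

The result has one row per species with mean, lowerci and upperci columns, which is the layout asked for in the question, and all_reps still holds the raw replicates.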