r - How do I split data into 4 groups with even amounts of time?
This data frame is the result of a video analysis of 11 subjects under 2 separate conditions: vigil vs. video. I have start and stop columns with time measured in seconds. I grouped the data frame by subject (sbj) and cognitive load (condition), and found the length of each video by subtracting the first start time from the last stop time for each subject in each condition. I then divided the overall time of each video by 4 to see how long each quartile is (in seconds). Here is an example of what the data looks like, although the actual data is a bit more complex:
    library(dplyr)

    start <- c(35, 44, 53, 62, 71, 80)
    stop <- c(42, 50, 59, 70, 77, 85)
    condition <- c('video', 'vigil', 'video', 'vigil', 'video', 'vigil')
    sbj <- c(1, 1, 2, 2, 3, 3)
    df <- data.frame(start, stop, condition, sbj)

    df1 <- group_by(df, sbj, condition)
    df2 <- summarize(df1, time = last(stop) - first(start))
    hd2 <- transform(df2, quartile = time / 4)
    hd3 <- inner_join(df1, hd2)
    hd3

      start stop condition sbj time quartile
    1    35   42     video   1    7     1.75
    2    44   50     vigil   1    6     1.50
    3    53   59     video   2    6     1.50
    4    62   70     vigil   2    8     2.00
    5    71   77     video   3    6     1.50
    6    80   85     vigil   3    5     1.25
I would like to split the data into 4 groups, where each group spans 1/4 of the overall video time (the quartile). Since each video does not start at 0 seconds (for subject 1 you can see it starts at 35 seconds), I need to add the first start value for each subject under each condition to the quartile value to get the time that marks 1/4 of the overall video. I tried an ifelse statement, but the result only splits the quartiles roughly.
    attach(hd3)
    fx <- first(start) + quartile
    hd3$q <- with(hd3,
                  ifelse(start <= fx, 1,
                  ifelse(start <= fx * 2, 2,
                  ifelse(start <= fx * 3, 3,
                  ifelse(start <= fx * 4, 4)))))
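One likely reason the split above is only rough: `fx * 2` doubles the start offset as well as the quartile length, so the later cut points drift far past the end of the video. The boundaries described in the text are `first start + k * quartile`. A minimal base-R sketch with made-up numbers (a hypothetical video running from 35 s to 75 s; `first_start`, `last_stop` and `starts` are illustrative names, not part of the original data):

```r
# Hypothetical single video: first event starts at 35 s, last event stops at 75 s
first_start <- 35
last_stop   <- 75
quartile    <- (last_stop - first_start) / 4    # 10 seconds per quarter

# What the nested ifelse effectively uses as cut points
wrong_breaks <- (first_start + quartile) * (1:4) # 45 90 135 180

# Cut points as described in the text: offset once, then add k quarters
breaks <- first_start + (0:4) * quartile         # 35 45 55 65 75

# Assign example start times to quarters 1..4
starts <- c(35, 44, 53, 62, 71)
q <- findInterval(starts, breaks, rightmost.closed = TRUE)
q                                                # 1 1 2 3 4
```

Note also that after `attach(hd3)`, `first(start)` is taken over the whole column rather than per subject and condition, so the offset would be the same for every group.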
I'm hoping someone can suggest a way to split the quartiles more elegantly and correctly. Thanks in advance!
Okay, I've edited my answer and am providing tested code now.
The data you provide should have multiple rows per condition and sbj for the results to be interesting.
    library(dplyr)

    start <- c(35, 44, 53, 62, 71, 80, 87, 90)
    stop <- c(42, 50, 59, 70, 77, 85, 89, 95)
    condition <- c('video', 'vigil', 'video', 'vigil', 'video', 'vigil', 'video', 'vigil')
    sbj <- c(1, 1, 1, 1, 2, 2, 2, 2)
    df <- data.frame(start, stop, condition, sbj)

    df1 <- group_by(df, sbj, condition)
    # midpoint of each event
    df1$med <- with(df1, (start + stop) / 2)
    # per-group start time and total video length
    df4 <- summarize(df1, min = first(start), range = last(stop) - first(start))
    hd4 <- inner_join(df1, df4)
    # position of each midpoint as a fraction of the video length
    hd4$quant <- with(hd4, (med - min) / range)
    # bin the fractions into 4 equal-width groups
    hd4$group <- cut(hd4$quant, breaks = seq(0, 1, length = 5), include.lowest = TRUE, labels = FALSE)
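The same idea can probably be written as a single grouped pipeline, which avoids the intermediate summary and join. This is only a sketch under the same assumptions as the code above (each event is assigned by its midpoint, and rows within a group are already ordered by time); the result name `res` is made up for illustration.

```r
library(dplyr)

df <- data.frame(
  start     = c(35, 44, 53, 62, 71, 80, 87, 90),
  stop      = c(42, 50, 59, 70, 77, 85, 89, 95),
  condition = rep(c("video", "vigil"), times = 4),
  sbj       = rep(c(1, 2), each = 4)
)

res <- df %>%
  group_by(sbj, condition) %>%
  mutate(
    # midpoint of each event
    med   = (start + stop) / 2,
    # midpoint position as a fraction of this group's video length
    quant = (med - first(start)) / (last(stop) - first(start)),
    # four equal-width bins over [0, 1]
    group = cut(quant, breaks = seq(0, 1, length.out = 5),
                include.lowest = TRUE, labels = FALSE)
  ) %>%
  ungroup()

res$group
```

With only two events per subject and condition in this toy data, every event lands in either the first or the last quarter; real data with more rows per group should fill the middle quarters as well.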