regex - Avoiding this use of sapply in R data.table
I've written a function to remove everything from the first parenthesis onwards in a string:
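until_parentheses <- function(string) {
  # Split on the first "(" and keep the text before it (first row only)
  one <- stringr::str_split_fixed(string, "\\(", 2)[1, 1]
  # Strip surrounding whitespace
  res <- stringr::str_trim(one)
  return(res)
}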
and I have a data.table with a column that looks (something) like this:
library(data.table)
messy <- paste(letters[1:10], paste0(c(" (", letters[1:2], ")"), collapse = ""))
dt <- data.table(messy)
When I try to use until_parentheses() on the messy column like so:
dt[, ":=" (clean = until_parentheses(messy))]
the function is applied only to the first element of messy, and that single result is repeated 10 times in the clean column.
In order to have the clean column come out how I want, I'm using sapply:
dt[, ":=" (clean_2 = sapply(messy, until_parentheses))]
This gives the result I want, but it takes a long time to run when dt is long.
I feel there are problems with both the until_parentheses() function and the data.table method. Does anyone have a solution that makes the use of sapply redundant in this instance?
Thanks!
You can use gsub, which is vectorized:
dt[, clean_3 := gsub(' +[(].*', '', messy)]  ## remove everything from the spaces before the first ( onwards
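As for why the original function returned a single value: str_split_fixed() returns a character matrix with one row per input string, and the [1, 1] index keeps only the first row, so data.table recycles that one value down the column. A minimal sketch of a vectorized rewrite follows. It swaps stringr for base R (sub and trimws), and the name until_parentheses_vec is hypothetical, not something from the question.

```r
# Hypothetical vectorized rewrite of until_parentheses(); base R only.
until_parentheses_vec <- function(string) {
  # Drop everything from the first "(" onwards, then trim whitespace.
  # sub() and trimws() both operate element-wise on character vectors.
  trimws(sub("[(].*", "", string))
}

# Same sample data as in the question.
messy <- paste(letters[1:10], paste0(c(" (", letters[1:2], ")"), collapse = ""))

clean <- until_parentheses_vec(messy)
length(clean)  # one result per input element
```

With data.table loaded, this could be used without sapply as dt[, clean_4 := until_parentheses_vec(messy)], where clean_4 is just an illustrative column name.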