regex - Avoiding this use of sapply in R data.table


I've written a function to remove everything from the first parenthesis onwards in a string:

until_parentheses <- function(string) {
  one <- stringr::str_split_fixed(string, "\\(", 2)[1, 1]
  res <- stringr::str_trim(one)
  return(res)
}

and I have a data.table column that looks (something) like this:

library(data.table)
messy <- paste(letters[1:10], paste0(c(" (", letters[1:2], ")"), collapse = ""))
dt <- data.table(messy)

When I try to use until_parentheses() on the messy column like so:

dt[, ":=" (clean = until_parentheses(messy))] 

the function is applied only to the first element of messy, and the clean column is that single result repeated 10 times.

To get the clean column to come out how I want, I'm using sapply:

dt[, ":=" (clean_2 = sapply(messy, until_parentheses))] 

This gives the result I want, but it takes a long time to run when dt is long.

I feel there are problems with both my until_parentheses() function and my data.table method. Does anyone have a solution that makes the use of sapply redundant in this instance?

Thanks!

You can use gsub, which is vectorized:

dt[, clean_3 := gsub(' +[(].*', '', messy)]  ## drop the spaces before the first ( and everything after it
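
If you would rather keep a named helper, here is a minimal base R sketch of a vectorized version. The name until_parentheses_vec is made up for illustration and is not from the question. It assumes the goal is "everything before the first ( with surrounding whitespace trimmed", and it uses sub() and trimws() instead of stringr:

```r
# Hypothetical vectorized helper; the name is illustrative only.
until_parentheses_vec <- function(string) {
  # sub() works on every element of the vector at once:
  # remove the first "(" and everything that follows it
  before <- sub("\\(.*", "", string)
  # trim leading and trailing whitespace from each element
  trimws(before)
}

# Same example data as in the question
messy <- paste(letters[1:10], paste0(c(" (", letters[1:2], ")"), collapse = ""))
clean <- until_parentheses_vec(messy)
```

Because the helper returns one value per input, it should be usable directly in the data.table call, e.g. dt[, clean_4 := until_parentheses_vec(messy)] (clean_4 is just a placeholder column name). The original function returned a single value because [1, 1] picks only the first row of the matrix from str_split_fixed(). Indexing with [, 1] instead keeps the first column for every input.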
