How to remove strings from in between brackets with regex in Python
I need to pull out a single string containing the words extracted from these fields:
[[cat]][[dog]][[mouse]][[apple]][[banana]][[pear]][[plum]][[pool]]
So I need: cat dog mouse apple banana pear plum pool
I've been trying for 2 hours to make a regular expression for this.
The best I have is (?<=[[]\s)(.*)(?=]])
which gets me:
cat]][[dog]][[mouse]][[apple]][[banana]][[pear]][[plum]][[pool
Any ideas? Thanks!
Here's a solution using re.finditer. Let your string be s. This assumes there can be anything in between [[ and ]]. Otherwise, the comment by @noob applies.
>>> [x.group(1) for x in re.finditer(r'\[\[(.*?)\]\]', s)]
['cat', 'dog', 'mouse', 'apple', 'banana', 'pear', 'plum', 'pool']
Alternatively, use lookarounds with re.findall:
>>> re.findall(r'(?<=\[\[).*?(?=\]\])', s)
['cat', 'dog', 'mouse', 'apple', 'banana', 'pear', 'plum', 'pool']
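For anyone who wants to paste and run this, here is a self-contained sketch of both approaches, with the matches joined into the single space-separated string the question asks for. The variable names and the final `' '.join` step are my own additions for illustration.

```python
import re

s = "[[cat]][[dog]][[mouse]][[apple]][[banana]][[pear]][[plum]][[pool]]"

# Capture-group version: the non-greedy .*? stops at the first ]] it meets,
# so each [[...]] pair yields its own match instead of one long one.
words_finditer = [m.group(1) for m in re.finditer(r'\[\[(.*?)\]\]', s)]

# Lookaround version: the brackets are asserted but not consumed,
# so the match itself is only the text between [[ and ]].
words_findall = re.findall(r'(?<=\[\[).*?(?=\]\])', s)

# Join the extracted words into one string (illustrative final step).
result = ' '.join(words_finditer)
print(result)
```

The key change from the pattern in the question is `.*?` in place of `.*`: the greedy form runs to the last `]]` in the string, which is why everything between the first `[[` and the final `]]` came back as one match.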
For large strings, the finditer version seemed faster when I timed the alternatives.
In [5]: s = s*1000

In [6]: timeit [x.group(1) for x in re.finditer(r'\[\[(.*?)\]\]', s)]
100 loops, best of 3: 3.61 ms per loop

In [7]: timeit re.findall(r'(?<=\[\[).*?(?=\]\])', s)
100 loops, best of 3: 5.93 ms per loop
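If you are not in IPython, roughly the same comparison can be made with the standard-library `timeit` module. This is only a sketch: the function names, the precompiled patterns, and the loop counts are my own choices, and the numbers will vary by machine and Python version.

```python
import re
import timeit

s = "[[cat]][[dog]][[mouse]][[apple]][[banana]][[pear]][[plum]][[pool]]" * 1000

# Precompiled patterns (an assumption of this sketch, not part of the timing above).
pattern_group = re.compile(r'\[\[(.*?)\]\]')
pattern_look = re.compile(r'(?<=\[\[).*?(?=\]\])')

def with_finditer(text):
    """Capture-group version using finditer."""
    return [m.group(1) for m in pattern_group.finditer(text)]

def with_findall(text):
    """Lookaround version using findall."""
    return pattern_look.findall(text)

for fn in (with_finditer, with_findall):
    # Best of 3 runs of 20 calls each, reported as time per call.
    best = min(timeit.repeat(lambda: fn(s), number=20, repeat=3)) / 20
    print(f"{fn.__name__}: {best * 1000:.2f} ms per call")
```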