Using ^ to match beginning of line in Python regex
I'm trying to extract publication years from ISI-style data exported from the Thomson Reuters Web of Science. The line for "Publication Year" looks like this (at the very beginning of a line):
py 2015
For the script I'm writing, I have defined the following regex function:
    import re

    f = open('savedrecs.txt')
    wosrecords = f.read()

    def findyears():
        result = re.findall(r'py (\d\d\d\d)', wosrecords)
        print(result)

    findyears()
This, however, gives false positives because the pattern may appear elsewhere in the data.
So I want to match the pattern only at the beginning of a line. Normally I would use ^ for this purpose, but r'^py (\d\d\d\d)' fails to match any of my results. On the other hand, using \n seems to do what I want, but that might lead to further complications for me.
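By default, ^ matches only at the very start of the string. With the re.MULTILINE flag it also matches at the start of every line:
    re.findall(r'^py (\d\d\d\d)', wosrecords, flags=re.MULTILINE)
This should work; let me know if it doesn't. I don't have your data.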
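Since I can't test against the real export, here is a small sketch of how the variants differ. The sample text is made up for illustration and only stands in for the contents of savedrecs.txt; the tags other than the year line are assumptions, not taken from your file.

```python
import re

# Hypothetical sample standing in for savedrecs.txt (invented for illustration).
sample = (
    "au smith, j\n"
    "ti a happy 2014 retrospective\n"
    "py 2015\n"
    "ab a copy 1999 edition\n"
    "py 2013\n"
)

# No anchor: also matches "py NNNN" inside words such as "happy" or "copy".
unanchored = re.findall(r'py (\d\d\d\d)', sample)

# ^ without flags: anchors only at the very start of the whole string,
# so nothing matches unless the file itself begins with the year line.
start_only = re.findall(r'^py (\d\d\d\d)', sample)

# ^ with re.MULTILINE: anchors at the start of every line.
per_line = re.findall(r'^py (\d\d\d\d)', sample, flags=re.MULTILINE)

# \n prefix: needs a preceding newline, so a year line that is the
# very first line of the data is skipped.
newline_prefix = re.findall(r'\npy (\d\d\d\d)', "py 2015\npy 2013\n")

print(unanchored)      # includes the false positives
print(start_only)      # empty for this sample
print(per_line)        # only the real year lines
print(newline_prefix)  # misses the first line
```

The last case is probably the kind of complication you were worried about with \n: it works until a record happens to sit on the first line of the file.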