Using ^ to match beginning of line in Python regex
I'm trying to extract publication years from ISI-style data from the Thomson-Reuters Web of Science. The line for "publication year" looks like this (at the very beginning of a line):

    py 2015

For the script I'm writing, I have defined the following regex function:
    import re

    # Read the whole export into a single string
    with open('savedrecs.txt') as f:
        wosrecords = f.read()

    def findyears():
        result = re.findall(r'py (\d\d\d\d)', wosrecords)
        print(result)

    findyears()

This, however, gives false positive results, because the pattern may appear elsewhere in the data.
So I want to match the pattern only at the beginning of a line. Normally I would use ^ for this purpose, but r'^py (\d\d\d\d)' fails to match any results. On the other hand, using \n seems to do what I want, but that might lead to further complications for me.
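By default, ^ matches only at the very start of the string. With the re.MULTILINE flag it also matches immediately after each newline, so re.findall(r'^py (\d\d\d\d)', wosrecords, flags=re.MULTILINE) should work. Let me know if it doesn't; I don't have your data.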
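Since I can't test against the real file, here is a minimal sketch of the difference on made-up input. The `sample` string is a hypothetical stand-in for the contents of savedrecs.txt, not real Web of Science data; only the `re` calls come from the standard library.

```python
import re

# Hypothetical sample text standing in for the contents of savedrecs.txt.
# The first line is not a "py" line, as would be the case for a file header.
sample = (
    "fn sample export\n"
    "py 2015\n"
    "ti a copy 2014 edition\n"
    "py 2013\n"
)

# No anchor: also picks up "py 2014" inside "copy 2014" (a false positive).
unanchored = re.findall(r'py (\d\d\d\d)', sample)

# ^ without a flag: only the very start of the whole string is tried,
# and the string does not start with "py", so nothing matches.
anchored_default = re.findall(r'^py (\d\d\d\d)', sample)

# ^ with re.MULTILINE: the start of every line is tried.
anchored_multiline = re.findall(r'^py (\d\d\d\d)', sample, flags=re.MULTILINE)

print(unanchored)
print(anchored_default)
print(anchored_multiline)
```

The same flag can also be written inline by starting the pattern with (?m), as in r'(?m)^py (\d\d\d\d)', which avoids the \n workaround entirely.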