python - Selecting specific lists in a file
I've been working on a Python dictionary to replace md5 values with COG/NOG identifiers. This is what I have done so far...
#!/usr/bin/python
import sys

fil = sys.argv[1]

# Load the md5 -> COG dictionary
with open(fil) as fin:
    rows = (line.strip().split('\t') for line in fin)
    d = {row[0]: row[1] for row in rows}

# Open the BLAST output and replace each md5 with its COG
# by looking the md5 up in the dictionary
blasted = open(sys.argv[2])
for line in blasted:
    linearr = line.split()
    if linearr[2] > '90.00':
        line.split()
        needed = linearr[0:2]
        md5 = linearr[1]
        ret = []
        for md5 in needed:
            ret.append(d.get(md5, md5))
            "".join(ret)
            print ret
This has brought me the following output: lists of various sizes and content...
['fig|357276.26.peg.4486']
['fig|357276.26.peg.4486', 'f3e68ef307f962ba6b836a94ff0e2216']
['fig|357276.26.peg.4486']
['fig|357276.26.peg.4486', 'cog0860']
['fig|357276.26.peg.4486']
['fig|357276.26.peg.4486', '05e94199eef6fbaf225618f9deaf847c']
So the single-item lists need to be tossed, along with the lists that still retain an md5 value. I need to select only the lists that have a COG/NOG as the second element, as in the 4th list above.
I can't select the second item of the lists to filter these results, because not all of the lists have a second item. Can anyone suggest a method for this?
Update: I was able to remove the lists with 1 item. The lists now look like this...
['fig|357276.26.peg.4485', 'nog73961']
['fig|357276.26.peg.4485', '19c060b530e8fa9598de068387bc3225']
['fig|357276.26.peg.4486', '8daa25fe83eb1a204c51861cb77945f5']
['fig|357276.26.peg.4486', '5c253078a0a6c51eca320dfd92991a70']
['fig|357276.26.peg.4486', '8707bd7fa7489ff69233ce735c1c6cbf']
['fig|357276.26.peg.4486', 'f3e68ef307f962ba6b836a94ff0e2216']
['fig|357276.26.peg.4486', 'cog0860']
['fig|357276.26.peg.4486', '05e94199eef6fbaf225618f9deaf847c']
Now I need to select the lists whose second item starts with NOG or COG... any advice?
Let's say you have a list of lists:

values = [ [1], [1,2], [3,4] ]
First, remove the single-item lists with the filter function:

values1 = filter(lambda x: len(x) > 1, values)
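One thing to watch if you run this on Python 3 rather than the Python 2 the question appears to use: filter() there returns a lazy iterator instead of a list. A minimal sketch of the same first step, assuming Python 3 (the name values1_alt is only for illustration):

```python
# Python 3: filter() yields items lazily, so wrap it in list() to get a list back.
values = [[1], [1, 2], [3, 4]]

values1 = list(filter(lambda x: len(x) > 1, values))

# An equivalent list comprehension that keeps the same items.
values1_alt = [x for x in values if len(x) > 1]

print(values1)  # [[1, 2], [3, 4]]
```

Either form gives a real list that you can index or loop over more than once.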
Now you need to filter based on COG/NOG. Since all the lists have 2 elements now, you can directly pick the second element:
filter(lambda x: "nog" in x[1] or "cog" in x[1], values1)
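Since the question asks for second items that start with NOG or COG, a prefix test may match the intent a little more closely than a substring test. A rough sketch of this second step, assuming Python 3 and using a few rows copied from the question's output (PREFIXES and selected are illustrative names):

```python
# Sample rows copied from the question's output.
values1 = [
    ['fig|357276.26.peg.4485', 'nog73961'],
    ['fig|357276.26.peg.4485', '19c060b530e8fa9598de068387bc3225'],
    ['fig|357276.26.peg.4486', 'cog0860'],
    ['fig|357276.26.peg.4486', '05e94199eef6fbaf225618f9deaf847c'],
]

# str.startswith accepts a tuple of prefixes; lower() makes the test case-insensitive.
PREFIXES = ('cog', 'nog')
selected = [row for row in values1 if row[1].lower().startswith(PREFIXES)]

for row in selected:
    print(row)
```

For md5 strings both tests behave the same, because a hex digest can never contain the letters "o" or "g".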
To reduce the whole thing down, you can merge both steps:
def check_cog_nog(x):
    if len(x) > 1:
        y = x[1].lower()
        if "nog" in y or "cog" in y:
            return True
    return False

filter(check_cog_nog, values)
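If you would rather not build the unwanted lists in the first place, the selection could also happen while the BLAST output is read. The following is only a sketch under several assumptions: Python 3, whitespace-separated rows with the query, the md5 and the percent identity in the first three columns (as the question's code implies), and a mapping whose values are all COG/NOG identifiers. The function name select_cog_nog_hits, its parameters and the sample data are made up for illustration.

```python
def select_cog_nog_hits(blast_lines, md5_to_cog, min_identity=90.0):
    """Return [query, COG/NOG] pairs for hits above min_identity.

    Hypothetical helper: assumes the query is in column 1, the md5 subject
    in column 2 and the percent identity in column 3.
    """
    selected = []
    for line in blast_lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed rows
        query, md5, identity = fields[0], fields[1], fields[2]
        # Compare identity as a number; as strings, '100.00' sorts below '90.00'.
        if float(identity) <= min_identity:
            continue
        # Keep the row only when the md5 has a mapping, so no later filter is needed.
        cog = md5_to_cog.get(md5)
        if cog is not None:
            selected.append([query, cog])
    return selected


# Made-up mapping and rows, for illustration only.
mapping = {'f3e6': 'cog0860'}
rows = [
    'fig|1.peg.1\tf3e6\t95.50',
    'fig|1.peg.1\t05e9\t99.00',   # no mapping: dropped
    'fig|1.peg.2\tf3e6\t80.00',   # identity too low: dropped
    'fig|1.peg.3\tf3e6\t100.00',
]
print(select_cog_nog_hits(rows, mapping))
```

Using d.get(md5) without a default means an unmapped md5 comes back as None, which is what lets the loop skip it instead of keeping the raw md5.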