dataset - Sorting and making "genes" in output bitstrings from a genetic algorithm
I was wondering if anyone had suggestions on how to analyze the output bitstrings being permuted by a genetic algorithm. In particular, it would be nice if I could identify patterns of bits (I'm calling them genes here) that seem to yield a desirable CV score. The difficulty comes in trying to examine these datasets, because there are a lot of them (I have 30 million bitstrings that are 140 bits long, and I'll hit 100 million pretty quickly). Even after I sort out the desirable data there are still a lot of potential datasets, so doing similarity comparisons by eye is out of the question. My questions are:
How should I compare similarity between these bitstrings?
How can I identify "genes" in these bitstrings in an algorithmic (i.e. programmable) way?
As you want to extract common gene patterns, I would look at the intersection of two strings. If you have
set1 = 11011101110011...
set2 = 11001100000110...
# apply bitwise '==' (XNOR: 1 wherever the two bits agree)
set1 == set2 -> 11101110001010...
the result shows which genes are the same, and it can be used in further analysis.
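The bitwise equality described above could be sketched in Python roughly as follows. This is only a minimal illustration, not a definitive implementation: the helper names `common_mask` and `agreement_runs` are hypothetical, and treating a "gene" as a run of at least `min_len` consecutive agreeing bits is an assumption made for the example. Note that the mask marks positions where both strings share a 0 as well as positions where both share a 1.

```python
def common_mask(a: str, b: str) -> str:
    """Return a bitstring with '1' wherever a and b hold the same bit (XNOR)."""
    if len(a) != len(b):
        raise ValueError("bitstrings must have equal length")
    width = len(a)
    full = (1 << width) - 1  # all-ones mask, keeps the result to `width` bits
    xnor = ~(int(a, 2) ^ int(b, 2)) & full
    return format(xnor, f"0{width}b")


def agreement_runs(mask: str, min_len: int = 2):
    """Yield (start, length) for each run of '1's at least min_len long.

    Assumption for this sketch: a run of agreeing bits is a candidate "gene".
    """
    start = None
    for i, bit in enumerate(mask + "0"):  # trailing sentinel closes a final run
        if bit == "1" and start is None:
            start = i
        elif bit != "1" and start is not None:
            if i - start >= min_len:
                yield (start, i - start)
            start = None


set1 = "11011101110011"
set2 = "11001100000110"
mask = common_mask(set1, set2)
print(mask)                        # 11101110001010
print(list(agreement_runs(mask)))  # [(0, 3), (4, 3)]
print(mask.count("1"))             # 8 agreeing positions out of 14
```

Counting the ones in the mask gives the number of agreeing positions, which is one simple similarity measure between two strings. For tens of millions of 140-bit strings a pure-Python loop over all pairs would likely be too slow, so in practice the same XNOR idea would probably need a vectorized or sampled approach.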