mapreduce - A node with massive degree in a graph causes trouble when taking distinct() of edges


I have a graph in which around 75% of the connectivity comes from one node.

E.g. if the sum of the degrees of all nodes is 100, this node's degree is 75.

After some manipulations, a massive number of duplicate edges exist involving this node.

Assume node 1 is this kind of node:

1,2
1,2
1,2
1,2
1,2
1,2
1,3
1,3
1,3

However, there are too many duplicate keys when taking distinct() of the edges. I have tried repartitioning before taking distinct(), but it still doesn't work because of the many duplicate keys. Writing the data to disk and then taking distinct() does solve the problem, though.

Is there a better way to handle this kind of extremely skewed problem?
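
To make the skew concrete, here is a minimal sketch in plain Python (a simulation, not code for any particular engine). It uses the sample edges above and shows two things: partitioning by source node puts every edge of node 1 into one partition, while deduplicating inside each input partition first and then shuffling by the whole edge moves only a few records. The helper names `partition_by` and `two_stage_distinct`, the partition counts, and the round-robin input split are all assumptions made for illustration. Whether this helps in practice depends on how the engine already implements distinct().

```python
def partition_by(edges, num_partitions, key):
    """Assign each edge to a partition by hashing key(edge)."""
    parts = [[] for _ in range(num_partitions)]
    for edge in edges:
        parts[hash(key(edge)) % num_partitions].append(edge)
    return parts


def two_stage_distinct(partitions, num_partitions):
    """Dedupe locally first, then shuffle the survivors by the full edge."""
    # Stage 1: remove duplicates inside each input partition (no data movement).
    local = [set(part) for part in partitions]

    # Stage 2: route each surviving edge by the hash of the whole (src, dst)
    # pair, so the hot node's edges are spread by destination as well.
    shuffled = [set() for _ in range(num_partitions)]
    for survivors in local:
        for edge in survivors:
            shuffled[hash(edge) % num_partitions].add(edge)
    return shuffled


# Sample data from the question: node 1 dominates the edge list.
edges = [(1, 2)] * 6 + [(1, 3)] * 3

# Keying by source node alone sends all nine edges to the same partition.
by_source = partition_by(edges, 4, key=lambda e: e[0])
largest_partition = max(len(part) for part in by_source)

# Hypothetical input layout: three partitions filled round-robin.
input_parts = [edges[i::3] for i in range(3)]
records_before_shuffle = sum(len(set(part)) for part in input_parts)

result = two_stage_distinct(input_parts, 4)
distinct_edges = sorted(set().union(*result))
```

Here `largest_partition` counts how many records land together when only the source node is the key, and `records_before_shuffle` counts how many records remain to be moved after the local pass.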

