pyspark - spark error "It appears that you are attempting to reference SparkContext from a broadcast variable"
Here are the functions defined in the class:
    def labeling(self, value, labelmap, dtype='string'):
        # labelmap and dtype are broadcast variables, so read them through .value
        if dtype.value == 'string':
            result = [i for v, i in labelmap.value if value == v][0]
            return result
        else:
            result = [i for v, i in labelmap.value if value < v][0]
            return result

    def labelbyvalue(self, labelmap, dtype='string'):
        labeling = self.labeling
        labelmap = self.sc.broadcast(labelmap)
        dtype = self.sc.broadcast(dtype)
        self.rdd = self.rdd.map(lambda x: labeling(x, labelmap, dtype))
But when I call the function below in "main", it reports an error like: "It appears that you are attempting to reference SparkContext from a broadcast variable, action, or transformation".
    class.rdd.labelbyvalue((('a', 1), ('b', 2), ('c', 3)))
I could not figure it out myself, so I came here. Thanks in advance.
I have fixed the error myself.
The problem is that the user-defined function should be defined in the global (module-level) environment, not inside the class. Because self.labeling is a bound method, shipping it to the workers pickles the whole object, including self.sc (the SparkContext), which is what triggers the error; see the minimal sketch below.
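A minimal sketch of the failing pattern (the class name Labeler and the other names here are hypothetical, not from the original code): passing a bound method to map() makes PySpark pickle the whole instance, and pickling the stored SparkContext is what raises the error.

    from pyspark import SparkContext

    class Labeler(object):
        def __init__(self, sc, rdd):
            self.sc = sc      # SparkContext kept on the instance
            self.rdd = rdd

        def labeling(self, value):
            return value

        def run(self):
            # self.labeling is a bound method, so its closure contains self,
            # and with it self.sc; serializing the closure for the workers fails
            # with "It appears that you are attempting to reference SparkContext
            # from a broadcast variable, action, or transformation".
            return self.rdd.map(self.labeling).collect()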
So labeling should look like this:
    def labeling(value, labelmap, dtype='string'):
        if dtype.value == 'string':
            result = [i for v, i in labelmap.value if value == v][0]
            return result
        else:
            result = [i for v, i in labelmap.value if value < v][0]
            return result
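For completeness, here is a minimal, self-contained sketch of how labelbyvalue can then use the module-level labeling (the class name LabeledRDD and the example data are my assumptions, not from the original post). The lambda passed to map only captures the broadcast handles, so the SparkContext stays on the driver:

    from pyspark import SparkContext

    def labeling(value, labelmap, dtype='string'):
        # labelmap and dtype are Broadcast objects, so unwrap them with .value
        if dtype.value == 'string':
            return [i for v, i in labelmap.value if value == v][0]
        return [i for v, i in labelmap.value if value < v][0]

    class LabeledRDD(object):
        def __init__(self, sc, rdd):
            self.sc = sc
            self.rdd = rdd

        def labelbyvalue(self, labelmap, dtype='string'):
            labelmap = self.sc.broadcast(labelmap)
            dtype = self.sc.broadcast(dtype)
            # The lambda closes over the broadcast handles only, not over self,
            # so nothing holding the SparkContext is shipped to the workers.
            self.rdd = self.rdd.map(lambda x: labeling(x, labelmap, dtype))

    if __name__ == '__main__':
        sc = SparkContext('local', 'label-example')
        data = LabeledRDD(sc, sc.parallelize(['a', 'b', 'c']))
        data.labelbyvalue((('a', 1), ('b', 2), ('c', 3)))
        print(data.rdd.collect())   # [1, 2, 3]

Run locally, this prints [1, 2, 3]: each letter is looked up in the broadcast labelmap on the executors, without ever referencing the SparkContext there.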