scala - How do I filter rows based on whether a column value is in a Set of Strings in a Spark DataFrame


Is there a more elegant way of filtering based on values in a Set of Strings?

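import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.udf

def myfilter(actions: Set[String], mydf: DataFrame): DataFrame = {
  // UDF that checks whether a row's action is in the set
  val containsaction = udf((action: String) => actions.contains(action))
  // mydf("action") avoids needing the implicits for the 'action symbol syntax
  mydf.filter(containsaction(mydf("action")))
}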

In SQL I can do:

select * from mytable where action in ('action1', 'action2', 'action3')

How about this:

mydf.filter("action in (1,2)") 

or

import org.apache.spark.sql.functions.lit
mydf.where($"action".in(Seq(1, 2).map(lit(_)): _*))

or

import org.apache.spark.sql.functions.lit
mydf.where($"action".in(Seq(lit(1), lit(2)): _*))

Additional support was added in Spark 1.5 to make this cleaner.
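As a rough sketch of what the cleaner form can look like with a Set of Strings: from Spark 1.5 on, `Column` has an `isin` method that takes varargs, so the set can be expanded directly without a UDF or `lit`. The example below assumes Spark 2.x or later (it uses `SparkSession` for the local demo); the object name `IsinFilterSketch` and the sample data are made up for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

// Hypothetical wrapper object, only here to make the sketch runnable
object IsinFilterSketch {
  // Keep rows whose "action" column value is one of the given strings
  def myfilter(actions: Set[String], mydf: DataFrame): DataFrame =
    mydf.filter(col("action").isin(actions.toSeq: _*))

  def main(args: Array[String]): Unit = {
    // Local session for demonstration only (assumes Spark 2.x+)
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("isin-sketch")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample data with a single "action" column
    val mydf = Seq("action1", "action2", "other").toDF("action")

    myfilter(Set("action1", "action2"), mydf).show()
    spark.stop()
  }
}
```

Because the filter is expressed as a built-in column expression rather than a UDF, Spark's optimizer can see inside it, which is generally preferable when the set is small enough to inline.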

