scala - How do I filter rows based on whether a column value is in a Set of Strings in a Spark DataFrame


Is there a more elegant way of filtering based on values in a Set of Strings?

import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.udf

def myFilter(actions: Set[String], myDF: DataFrame): DataFrame = {
  // wrap the Set membership test in a UDF and filter on it
  val containsAction = udf((action: String) => actions.contains(action))
  // the 'action symbol syntax needs import spark.implicits._ in scope
  myDF.filter(containsAction('action))
}
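For illustration, a minimal sketch of how this UDF-based filter would be applied; the sample data and the SparkSession named spark are assumptions, not part of the question:

// assumes a Spark 2.x+ SparkSession called spark; the sample rows are hypothetical
import spark.implicits._

val myDF = Seq("action1", "action2", "action4").toDF("action")
myFilter(Set("action1", "action2", "action3"), myDF).show()
// only rows whose action value is in the Set remain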

In SQL I can do:

SELECT * FROM mytable WHERE action IN ('action1', 'action2', 'action3')
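For reference, that query can be run against the DataFrame itself by registering it as a temporary view; this sketch assumes a Spark 2.x+ SparkSession named spark (older versions use registerTempTable and sqlContext.sql instead):

// register the DataFrame under a table name so it can be queried with SQL
myDF.createOrReplaceTempView("mytable")
val filtered = spark.sql(
  "SELECT * FROM mytable WHERE action IN ('action1', 'action2', 'action3')")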

How about this:

myDF.filter("action in (1,2)")

or

import org.apache.spark.sql.functions.lit

myDF.where($"action".in(Seq(1, 2).map(lit(_)): _*))

or

import org.apache.spark.sql.functions.lit

myDF.where($"action".in(Seq(lit(1), lit(2)): _*))

Additional support was added in Spark 1.5 to make this cleaner.
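Presumably this refers to Column.isin, available since Spark 1.5, which accepts plain values directly, so the Set of strings from the question can be splatted in without wrapping each value in lit (a minimal sketch, reusing myDF and the actions Set from above):

import org.apache.spark.sql.functions.col

// isin takes Any*, so the values do not need to be wrapped in lit()
val actions = Set("action1", "action2", "action3")
myDF.where(col("action").isin(actions.toSeq: _*))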

