python - Applying a condition (once) to multiple columns of a pandas DataFrame
I can apply a condition to one column of a pandas DataFrame and it works, but it doesn't work when I apply it to all columns. For example, I have the following DataFrame called temps:
    temps
                  t  t_0  t_1
    2012-07-16  260  250  210
    2012-07-17  230  251  212
    2012-07-18  265  220  250
    2012-07-19  270  260  210

It works when I specify one column:
    df_new = temps['t'][(temps['t'].values > (temps['t'].mean() - 1.0*temps['t'].std())) &
                        (temps['t'].values < (temps['t'].mean() + 1.0*temps['t'].std()))]

    df_new
                  t
    2012-07-16  260
    2012-07-18  265
    2012-07-19  270

I would appreciate it if you could guide me on how to do this for all columns at once. The following doesn't work, of course:
    df_new = temps[(temps.values > (temps.mean() - 1.0*temps.std())) &
                   (temps.values < (temps.mean() + 1.0*temps.std()))]

The expected output is:
    df_new
                  t  t_0  t_1
    2012-07-16  260  250  210
    2012-07-19  270  260  210

or
    df_new
                  t  t_0  t_1
    2012-07-16  260  250  210
    2012-07-17  NaN  251  212
    2012-07-18  265  NaN  NaN
    2012-07-19  270  260  210

Thanks in advance.
Here's a lazy way to make your code work with the least change possible:
    temps2 = temps.copy()
    for col in temps2.columns:
        temps2[col] = temps[col][(temps[col].values > (temps[col].mean() - 1.0*temps[col].std())) &
                                 (temps[col].values < (temps[col].mean() + 1.0*temps[col].std()))]

                  t  t_0  t_1
    2012-07-16  260  250  210
    2012-07-17  NaN  251  212
    2012-07-18  265  NaN  NaN
    2012-07-19  270  260  210

And here's a more elegant way:
    zscore = temps.apply(lambda x: (x - x.mean()) / x.std())

                       t       t_0       t_1
    2012-07-16  0.208683  0.272618 -0.533286
    2012-07-17 -1.460778  0.330011 -0.431708
    2012-07-18  0.486926 -1.449180  1.498279
    2012-07-19  0.765169  0.846551 -0.533286

    temps[abs(zscore) < 1]

                  t  t_0  t_1
    2012-07-16  260  250  210
    2012-07-17  NaN  251  212
    2012-07-18  265  NaN  NaN
    2012-07-19  270  260  210
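The z-score approach above yields the second expected output from the question (NaN where the condition fails). As a self-contained sketch, the same mask can probably also be used to get the first expected output, keeping only the rows where every column passes. The names within, masked, and strict are made up for illustration, and the frame is rebuilt here on the assumption that the index holds dates:

```python
import pandas as pd

# Rebuild the example frame from the question (dates assumed to be the index).
temps = pd.DataFrame(
    {
        "t": [260, 230, 265, 270],
        "t_0": [250, 251, 220, 260],
        "t_1": [210, 212, 250, 210],
    },
    index=pd.to_datetime(["2012-07-16", "2012-07-17", "2012-07-18", "2012-07-19"]),
)

# Column-wise z-scores without apply: mean() and std() return one value per
# column, and the arithmetic aligns them on the column labels.
zscore = (temps - temps.mean()) / temps.std()

# Boolean mask: True where a value lies within one standard deviation.
within = zscore.abs() < 1

# Second expected output: same shape, NaN where the condition fails.
masked = temps.where(within)

# First expected output: only the rows where every column passes.
strict = temps[within.all(axis=1)]

print(masked)
print(strict)
```

Here temps.where(within) should behave the same as temps[within]; it is used only to make the masking explicit.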