python - Populate a new column based on an existing column checked against a regex in pandas
I have a data frame in pandas as below:

    df = pd.DataFrame({'firstname': ['vishal', 'nishant', 'indira', 'jagdish', 'tamnna'],
                       'actual age': [25, 33, 58, 58, 30]})

      firstname  actual age
    0    vishal          25
    1   nishant          33
    2    indira          58
    3   jagdish          58
    4    tamnna          30

and a regex:

    \w+ish\w*

What I can't seem to figure out is how to produce the result below:

      firstname  actual age copydown
    0    vishal          25   vishal
    1   nishant          33  nishant
    2    indira          58  nishant
    3   jagdish          58  jagdish
    4    tamnna          30  jagdish

So I want to go through the firstname column and, if a value matches the given regex, copy it into a new column and keep copying that value down until the next match is found, and keep doing this until the end.
Any ideas? I've been stuck on this for days. The copydown feature I want to implement might be useful in denormalised datasets (using dates and stuff).

Thanks in advance.
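Here is one way to do it. First, identify whether each row is a match. Then group the rows using the cumsum trick: every match increments a running total, so a matching row and the non-matching rows after it share the same group number. Finally, populate each subgroup using its first value.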
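To make the cumsum trick concrete, here is a minimal sketch on the names from the question. The variable names `names`, `matched`, and `group_id` are illustrative only, and `Series.str.match` is used here as an assumption in place of a compiled pattern; like `re.match`, it anchors at the start of the string.

```python
import pandas as pd

names = pd.Series(['vishal', 'nishant', 'indira', 'jagdish', 'tamnna'])

# True where the name matches the regex from the question.
matched = names.str.match(r'\w+ish\w*')

# Each True bumps the running total, so every matching row starts a new
# group id and the non-matching rows that follow inherit it.
group_id = matched.astype(int).cumsum()

print(matched.tolist())   # [True, True, False, True, False]
print(group_id.tolist())  # [1, 2, 2, 3, 3]
```

Rows 1 and 2 share group 2, and rows 3 and 4 share group 3, which is exactly the grouping needed to copy 'nishant' and 'jagdish' down.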

    import pandas as pd
    import re

    # data
    # =============================
    print(df)

      firstname  actual age
    0    vishal          25
    1   nishant          33
    2    indira          58
    3   jagdish          58
    4    tamnna          30

    # processing
    # =============================
    pattern = re.compile(r'\w+ish\w*')
    # True where firstname matches the pattern (re.match anchors at the start of the string)
    df['matched'] = [(pattern.match(x) is not None) for x in df.firstname.values]
    # every match starts a new group id
    df['diff_names'] = df.matched.astype(int).cumsum()

    def func(group):
        # fill the whole group with its first value, i.e. the matched name
        group['copydown'] = group['firstname'].values[0]
        return group.drop(['matched', 'diff_names'], axis=1)

    df.groupby('diff_names').apply(func)

      firstname  actual age copydown
    0    vishal          25   vishal
    1   nishant          33  nishant
    2    indira          58  nishant
    3   jagdish          58  jagdish
    4    tamnna          30  jagdish
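As a possible alternative to the groupby approach, the same copydown can be sketched with a mask and a forward fill. This is not the method from the answer above, only a shorter variant under the assumption that `Series.str.match`, `Series.where`, and `Series.ffill` are acceptable here; the name `matched` is illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'firstname': ['vishal', 'nishant', 'indira', 'jagdish', 'tamnna'],
    'actual age': [25, 33, 58, 58, 30],
})

# Boolean mask: True where firstname matches the regex from the question.
matched = df['firstname'].str.match(r'\w+ish\w*')

# Keep the name only on matching rows (NaN elsewhere), then carry the
# last seen match forward into the gaps.
df['copydown'] = df['firstname'].where(matched).ffill()

print(df)
```

One difference to be aware of: if the first rows do not match the regex, this variant leaves their copydown as NaN, whereas the groupby version would fill them with the first row's name.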