Find matches by condition between 2 datasets in SAS -


i'm trying improve processing time used via existing for-loop in *.jsl file classmates , using in our programming course using sas. question: there proc or sequence of statements exist sas offers can replicate search , match condition? or way go through unsorted files without going line line looking matching condition(s)?

our current scrip file below:

if( roadnumber_fuel[n]==roadnumber_to[m] &      fueldate[n]>=tripstart[m] & fueldate[n]<=tripend[m],                 newtripid[n] = tripid[m];             ); 

i have 2 sets of data simplified below.

data1: id1     date1 1      may 1, 2012 2      jun 4, 2013 3      aug 5, 2013 .. . 

&

data2: id2    date2         date3       trip_id 1      jan 1 2012    feb 1 2012  9876  2      sep 5 2013    nov 3 2013  931  1      dec 1 2012    dec 3 2012  236 3      mar 9 2013    may 3 2013  390 2      jun 1 2013    jun 9 2013  811  1      apr 1 2012    may 5 2012  76  ... .. . 

i need check lot of iterations goal have code check:

data1.id1 = data2.id2 , (date1 >date2 , date1 < date3) 

my desired output dataset woudld be

id1     date1         trip_id 1      may 1, 2012    76 2      jun 4, 2013    811 

thanks insight!

you can range matches in 2 ways. first off, can match using proc sql if you're familiar sql:

proc sql;  create tablec   select * table    left join table b      on a.id=b.id , a.date > b.date1 , a.date < b.date2   ; quit; 

second, can create format. faster option if it's possible this. tricky when have ids, can it.

first, create new variable, id+date. dates numbers around 18,000-20,000, multiply id 100,000 , you're safe.

second, create dataset range dataset start=lower date plus id*100,000, end=higher date + id*100,000, fmtname=some string become format name (must start a-z or _ , have a-z, _, digits only). label value want retrieve (trip_id in above example).

data b_fmts;   set b;  start=id*100000+date1;  end  =id*100000+date2;  label=value_you_want_out;  fmtname='mydatef'; run; 

then use proc format cntlin=` option import formats.

proc format cntlin=b_fmts; quit; 

make sure date ranges don't overlap - if fail.

then can use easily:

data a_match;  set a;  trip_id=put(id*100000+date,mydatef.); run; 

Comments

Popular posts from this blog

javascript - Using jquery append to add option values into a select element not working -

Android soft keyboard reverts to default keyboard on orientation change -

Rendering JButton to get the JCheckBox behavior in a JTable by using images does not update my table -