parsing - SQL Split Multiple Multivalue Columns into Rows -
i have data sent me, , need normalize it. data in sql table, each row has multiple multi value columns. example following:
id  fname   lname       projects           projdates 1   john    doe         proja;projb;projc  20150701;20150801;20150901 2   jane    smith       projd;;projc       20150701;;20150902 3   lisa    anderson    projb;projc        20150801;20150903 4   nancy   johnson     projb;projc;proje  20150601;20150822;20150904 5   chris   edwards     proja              20150905 needs this:
id  fname   lname      projects projdates 1   john    doe          proja  20150701 1   john    doe          projb  20150801 1   john    doe          projc  20150901 2   jane    smith        projd  20150701 2   jane    smith        projc  20150902 3   lisa    anderson     projb  20150801 3   lisa    anderson     projc  20150903 4   nancy   johnson      projb  20150601 4   nancy   johnson      projc  20150822 4   nancy   johnson      proje  20150904 5   chris   edwards      proja  20150905 i need split rows id, fname, lname, , parsing projects , proddates separate records. have found many posts split functions , can work 1 column, not 2. when 2 columns permeates split. ie john doe, gives me records proja 3 times, once each of proddates. need coorelate each multivalue project record it's respective projdate , not others.
any thoughts?
thanks!
if use jeff moden's "delimitedsplit8k" splitter (which have renamed here "fdelimitedsplit8k")  (ref. figure 21: the final "new" splitter code, ready testing) 
 heavy lifting splits, rest becomes straightforward, using cross apply , proper joining.
if object_id (n'tempdb..#tinputdata') not null     drop table #tinputdata  create table #tinputdata (      id        int          primary key clustered  -- add identity if id needs set @ insert time    , fname     varchar (30)    , lname     varchar (30)    , projects  varchar (4000)    , projdates varchar (4000) )  insert #tinputdata          ( id, fname, lname, projects, projdates ) values    ( 1, 'john',  'doe'      , 'proja;projb;projc' , '20150701;20150801;20150901'),    ( 2, 'jane',  'smith'    , 'projd;;projc'      , '20150701;;20150902'),    ( 3, 'lisa',  'anderson' , 'projb;projc'       , '20150801;20150903'),    ( 4, 'nancy', 'johnson'  , 'projb;projc;proje' , '20150601;20150822;20150904'),    ( 5, 'chris', 'edwards'  , 'proja'             , '20150905')  select * #tinputdata  -- take @ insert results  ; resultset   (    select          indata.id       , indata.fname       , indata.lname       , projectlist.itemnumber projectid       , projectlist.item project       , datelist.itemnumber dateid       , datelist.item projdate    #tinputdata indata    cross apply dbo.fdelimitedsplit8k(indata.projects,';') projectlist    cross apply dbo.fdelimitedsplit8k(indata.projdates,';') datelist    datelist.itemnumber = projectlist.itemnumber  -- links projects , dates in left-to-r1ght order    , (projectlist.item <> '' , datelist.item <> '') -- ignore input lines when both projects , projdates have no value; note these aren't nulls. ) select        id     , fname     , lname     , project     , projdate  resultset order id, project results in
id  fname  lname     project  projdate   --  -----  --------  -------  --------    1  john   doe       proja    20150701    1  john   doe       projb    20150801    1  john   doe       projc    20150901    2  jane   smith     projc    20150902    2  jane   smith     projd    20150701    3  lisa   anderson  projb    20150801    3  lisa   anderson  projc    20150903    4  nancy  johnson   projb    20150601    4  nancy  johnson   projc    20150822    4  nancy  johnson   proje    20150904    5  chris  edwards   proja    20150905   this algorithm handles project , date lists of equal length. should 1 list shorter other given row, special attention needed apply null in proper place.
-- cleanup drop table #tinputdata 
Comments
Post a Comment