parsing - SQL Split Multiple Multivalue Columns into Rows -
i have data sent me, , need normalize it. data in sql table, each row has multiple multi value columns. example following:
id fname lname projects projdates 1 john doe proja;projb;projc 20150701;20150801;20150901 2 jane smith projd;;projc 20150701;;20150902 3 lisa anderson projb;projc 20150801;20150903 4 nancy johnson projb;projc;proje 20150601;20150822;20150904 5 chris edwards proja 20150905
needs this:
id fname lname projects projdates 1 john doe proja 20150701 1 john doe projb 20150801 1 john doe projc 20150901 2 jane smith projd 20150701 2 jane smith projc 20150902 3 lisa anderson projb 20150801 3 lisa anderson projc 20150903 4 nancy johnson projb 20150601 4 nancy johnson projc 20150822 4 nancy johnson proje 20150904 5 chris edwards proja 20150905
i need split rows id, fname, lname, , parsing projects , proddates separate records. have found many posts split functions , can work 1 column, not 2. when 2 columns permeates split. ie john doe, gives me records proja 3 times, once each of proddates. need coorelate each multivalue project record it's respective projdate , not others.
any thoughts?
thanks!
if use jeff moden's "delimitedsplit8k" splitter (which have renamed here "fdelimitedsplit8k") (ref. figure 21: the final "new" splitter code, ready testing)
heavy lifting splits, rest becomes straightforward, using cross apply , proper joining.
if object_id (n'tempdb..#tinputdata') not null drop table #tinputdata create table #tinputdata ( id int primary key clustered -- add identity if id needs set @ insert time , fname varchar (30) , lname varchar (30) , projects varchar (4000) , projdates varchar (4000) ) insert #tinputdata ( id, fname, lname, projects, projdates ) values ( 1, 'john', 'doe' , 'proja;projb;projc' , '20150701;20150801;20150901'), ( 2, 'jane', 'smith' , 'projd;;projc' , '20150701;;20150902'), ( 3, 'lisa', 'anderson' , 'projb;projc' , '20150801;20150903'), ( 4, 'nancy', 'johnson' , 'projb;projc;proje' , '20150601;20150822;20150904'), ( 5, 'chris', 'edwards' , 'proja' , '20150905') select * #tinputdata -- take @ insert results ; resultset ( select indata.id , indata.fname , indata.lname , projectlist.itemnumber projectid , projectlist.item project , datelist.itemnumber dateid , datelist.item projdate #tinputdata indata cross apply dbo.fdelimitedsplit8k(indata.projects,';') projectlist cross apply dbo.fdelimitedsplit8k(indata.projdates,';') datelist datelist.itemnumber = projectlist.itemnumber -- links projects , dates in left-to-r1ght order , (projectlist.item <> '' , datelist.item <> '') -- ignore input lines when both projects , projdates have no value; note these aren't nulls. ) select id , fname , lname , project , projdate resultset order id, project
results in
id fname lname project projdate -- ----- -------- ------- -------- 1 john doe proja 20150701 1 john doe projb 20150801 1 john doe projc 20150901 2 jane smith projc 20150902 2 jane smith projd 20150701 3 lisa anderson projb 20150801 3 lisa anderson projc 20150903 4 nancy johnson projb 20150601 4 nancy johnson projc 20150822 4 nancy johnson proje 20150904 5 chris edwards proja 20150905
this algorithm handles project , date lists of equal length. should 1 list shorter other given row, special attention needed apply null in proper place.
-- cleanup drop table #tinputdata
Comments
Post a Comment