Besides standard unpivot tasks (where n rows are created from the values of n columns), the following unpivot task also appears: one record with comma-separated values in a column should produce several rows, one for each part of the comma-separated field.
Let's show it on the following example of an intra-column unpivot task:
create table tt (a integer,b varchar2(500));
insert into tt values(1,'John,Steve,Jim,Anne');
insert into tt values(2,'Karen,Betty,Rebecca,Tom,Henry');
commit;
Query the created sample data:
SQL> Select a,b from tt
2 /
A B
---------- --------------------------------------------------
1 John,Steve,Jim,Anne
2 Karen,Betty,Rebecca,Tom,Henry
The following query returns the unpivoted set of data:
Select tt.a,regexp_substr(tt.b,'(^|[,])([^,]*)',1,upiv.lvl,'ni',2) b
from tt
join (select level as lvl from dual connect by level<=100) upiv on upiv.lvl<=regexp_count(tt.b,'[,]')+1
/
A B
---------- --------------------------------------------------
1 John
1 Steve
1 Jim
1 Anne
2 Karen
2 Betty
2 Rebecca
2 Tom
2 Henry
Note: The presented example does not handle more complex cases, such as commas inside enclosed fields; a more complex regular expression would be necessary to solve them.
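As a rough sketch (not part of the original example), a pattern similar to the one used in the CSV decomposition skeleton below tolerates double-quoted fields containing commas, provided the quote character itself does not appear inside the fields; counting the fields would also need a more careful expression, so a fixed count is used here:
Select trim('"' from regexp_substr('A,"B,C",D','(^|[,])(("[^"]*")|([^,]*))',1,level,'ni',2)) b
from dual connect by level<=3
/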
For other separator characters you just have to replace the [,] and [^,] strings accordingly. The following example shows values delimited by a pipe sign [|].
Select tt.a,regexp_substr(tt.b,'(^|[|])([^|]*)',1,upiv.lvl,'ni',2) b
from tt
join (select level as lvl from dual connect by level<=100) upiv on upiv.lvl<=regexp_count(tt.b,'[|]')+1
/
In case we have a constant, a variable, or a parameter and we want to tokenize it with a SQL query, we can use the following simplification:
Select regexp_substr(p_wanted_patterns,'(^|[,])([^,]*)',1,level,'ni',2) b
from dual connect by level<=regexp_count(p_wanted_patterns,'[,]')+1
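For example, with an arbitrary literal in place of p_wanted_patterns (just a quick illustration, not from the original text), the query returns one row per token (red, green, blue):
Select regexp_substr('red,green,blue','(^|[,])([^,]*)',1,level,'ni',2) b
from dual connect by level<=regexp_count('red,green,blue','[,]')+1
/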
Inverse task - Pivot:
Let's prepare an unpivoted sample table using a variant of the query above:
Create table tta as
Select tt.a,regexp_replace(tt.b,'(^([^,]*,){'||to_char(upiv.lvl-1)||'})([^,]*)(,.*|$)','\3') b
from tt
join (select level as lvl from dual connect by level<=100) upiv on upiv.lvl<=length(regexp_replace(tt.b,'[^,]'))+1
/
SQL> select a,b from tta;
A B
---------- ---------------
1 John
2 Karen
1 Steve
2 Betty
1 Jim
2 Rebecca
1 Anne
2 Tom
2 Henry
The following query reverts the data back into the intra-column pivoted form:
select a
,listagg(tta.b,',')within group(order by b) as b
from tta
group by a;
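With the sample data above, this returns Anne,Jim,John,Steve for a=1 and Betty,Henry,Karen,Rebecca,Tom for a=2; note that within group(order by b) orders the tokens alphabetically, not in their original order.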
For versions older than Oracle 11g, the following workaround can be used:
select a
,rtrim(xmlagg(xmlelement(c,tta.b||',')).extract('//text()'),',') as b
from tta
group by a;
Returns:
SQL> select a,rtrim(xmlagg(xmlelement(c,tta.b||',')).extract('//text()'),',') as b from tta group by a;
A B
---------- --------------------------------------------------
1 John,Steve,Jim,Anne
2 Karen,Henry,Betty,Rebecca,Tom
Note: The last mentioned method - pivot using xmlagg - works well for small amounts of data. In Oracle 10g I ran into some difficulties when the data exceeded 5000 records.
Note: From Oracle 11g on, the listagg function covers the needs of text aggregation well.
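If the aggregated string can grow beyond the VARCHAR2 limit, one possible variation (my assumption, not part of the original examples) is to aggregate into a CLOB via getClobVal():
select a
,rtrim(xmlagg(xmlelement(c,tta.b||',')).extract('//text()').getClobVal(),',') as b --CLOB result, avoids the 4000 byte limit
from tta
group by a;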
CSV decomposition skeleton
The following example shows the decomposition of a CSV file with a fixed number of columns into individual fields.
It also supports field enclosers.
Declare
c_delimiter Varchar2(1 CHAR):='|';
c_encloser Varchar2(1 CHAR):='"';
c_linefeed Varchar2(1 CHAR):=chr(10);
c_carriage Varchar2(1 CHAR):=chr(13);
c_fields Integer:=4;
v_SRC CLOB; ---the delimited CLOB
Begin
for r1 in(
Select regexp_substr(v_SRC,'(^|['||c_linefeed||'])([^'||c_carriage||c_linefeed||']*)('||c_carriage||'?|\s*)',1,level,'ni',2) line --split into lines; CR is excluded from the captured line
from dual connect by level<=regexp_count(v_SRC,'['||c_linefeed||']')+1
)LOOP
------------------------------------
DBMS_OUTPUT.PUT_LINE('>'||r1.line||'<');
------------------------------------
for r2 in(
Select regexp_substr(r1.line,'(^|['||c_delimiter||'])((['||c_encloser||'][^'||c_encloser||']*)|([^'||c_delimiter||']*))',1,level,'ni',2) field
from dual connect by level<=c_fields
)LOOP
r2.field:=ltrim(r2.field,c_encloser);
-------------------------------------
DBMS_OUTPUT.PUT_LINE('['||r2.field||']');
-------------------------------------
end LOOP;
end LOOP;
End;
/
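The skeleton leaves v_SRC unassigned; to try it, fill it with some delimited content first, e.g. (made-up sample data, assigned before the outer loop):
v_SRC := '"John"|"Smith"|Prague|100'||chr(13)||chr(10)
||'"Karen"|"Jones"|Brno|200';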
The same in a single query using WITH clauses and PIVOT to turn the fields into columns:
with CONST as(
Select
'|' as c_delimiter
,'"' as c_encloser
,chr(10) as c_linefeed
,chr(13) as c_carriage
,4 as c_fields
from DUAL
)
,SRC as(
Select
CL_VAL as v_SRC --the delimited CLOB
,CONST.*
from CONST
cross join ZZZ_BJA_CLOB --source table with CLOB
)
,LINES as(
Select
regexp_substr(v_SRC,'(^|['||c_linefeed||'])([^'||c_carriage||c_linefeed||']*)('||c_carriage||'?|\s*)',1,level,'ni',2) line
,level as line_nr
,c_delimiter
,c_encloser
,c_fields
from SRC connect by level<=regexp_count(v_SRC,'['||c_linefeed||']')+1
)
,FIELDS as(
Select cast(ltrim(regexp_substr(line,'(^|['||c_delimiter||'])((['||c_encloser||'][^'||c_encloser||']*)|([^'||c_delimiter||']*))',1,field_nr,'ni',2),c_encloser) as VARCHAR2(4000)) field
,field_nr
,line_nr
from LINES cross join (select level as field_nr from CONST connect by level<=c_fields)
)
Select * from FIELDS
pivot(max(FIELD) for FIELD_NR in(1 as "F1",2 as "F2", 3 as "F3", 4 as "F4"))
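A minimal way to try the query, assuming the source table does not exist yet (the table and column names ZZZ_BJA_CLOB and CL_VAL are taken from the query above, the data is made up):
create table ZZZ_BJA_CLOB (CL_VAL CLOB);
insert into ZZZ_BJA_CLOB values('"John"|"Smith"|Prague|100'||chr(10)||'"Karen"|"Jones"|Brno|200');
commit;
The query then returns one row per line with the fields placed in columns F1 to F4.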