HomeArticlesOthersLogin
NewsCustom SearchContact
Intra-field deduplication of list by regular expression
    by Ludek Bob Jankovsky, 23-May-2018 (ORACLE STUFFS)
Sometimes you need to deduplicate repeating patterns from a text string. Our case has been inspired by list of values based on LISTAGG operation on source with duplicities. Of course, the best practice is to deduplicate the source before the LISTAGG operation, but what if we have just the result with mentioned duplicities?

For example, we have a following string in one database field:

'01,02,02,03,03,03,03,04,05'

There are repeating values in the list and lets assume we want to get rid of them.
First we have to identify a pattern of a single value. In our case it is:

,\d{2}

The delimiter of particular fields (if any) should be incuded.
To deduplicate the string we have to:

  • Add the delimiter at the start of our string.
  • Use regexp_replace function to remove duplicities.
  • Remove the delimiter from the start of the result string.

  • All could be cone by following simple expression:

    ltrim(regexp_replace(','||OUR_STRING,'(,\d{2})(\1)+','\1'),',') 

    Let's use following test query to check how that works:

    With L1 as (
      Select '01,02,02,03,03,03,03,04,05' as STRING from DUAL
    )
    Select 
       L1.STRING
      ,ltrim(regexp_replace(','||L1.STRING,'(,\d{2})(\1)+','\1'),',')  as RESULT
    from L1;

    SQL> With L1 as ( 2 Select '01,02,02,03,03,03,03,04,05' as STRING from DUAL 3 ) 4 Select 5 L1.STRING 6 ,ltrim(regexp_replace(','||L1.STRING,'(,\d{2})(\1)+','\1'),',') as RESULT 7 from L1; STRING -------------------------- RESULT -------------------------- 01,02,02,03,03,03,03,04,05 01,02,03,04,05 Elapsed: 00:00:00.07

    Ludek Bob Jankovsky
    All Right Reserved © 2007, Designed by Bob Jankovsky