Friday, July 10, 2015

Concatenate distinct row values of multiple columns effectively

Converting columns to rows, rows to columns, can sometimes be a bit of a headache for database developers. Recently run into a case where I needed to modify a dataset in order to have all the distinct row values from two separate columns for one ID concatenated in one row in the same separate columns. Here's a visual example of a small test dataset:
What I needed to get was:
ID                            A                             B
1                        Juliet/Krisjanis          Kilo/Mike
2                        Tango                      Mike
3                        Juliet                        Bravo/Charley

 From the first look might seem easy to do, but actually isn't at all. You might to select the values from each column in a separate query and then join them by ID or have several nested queries or...

Here's a very nice and compact way of doing it without a need of several subqueries or similar struggles. Just use regepxp_replace and listagg functions together with a windowing function like this:

regexp_replace( listagg(trim(your_column), '/') within GROUP (
    ORDER BY your_column) OVER (PARTITION BY your_id) ,'([^/]+)(/\1)+', '\1')

So, here's how it looks on the previous dataset:
Thanks for your attention!

No comments:

Post a Comment