3

Using PostgreSQL 8.4, I have successfully been able to use array_agg() to take a case of multiple orders and make a single row per customers:

From this:

order_id|customer_id|order_date  |order_desc
1       |1          |"2010-01-01"|"Tom's First" 
2       |1          |"2010-04-01"|"Tom's Second" 
7       |1          |"2010-04-13"|"Tom's Third" 
8       |1          |"2011-04-13"|"Tom's Last" 
5       |1          |"2011-06-20"|"Tom's Really Last." 
3       |2          |"2010-07-07"|"Dick's First" 
6       |2          |"2011-07-07"|"Dick's Other" 
4       |3          |"2011-04-04"|"Harry's Only"

Using this:

select cu.customer, array_agg(ord.order_id) as orders from test_order ord
inner join test_customer cu
on ord.customer_id = cu.customer_id
group by cu.customer

results in:

customer   |orders  
"Tom"      |"{1,2,7,8,5}"  
"Dick"     |"{3,6}"  
"Harry"    |"{4}"  

And I can grab pieces of the array to create new columns if I hard code each iteration:

select cu.customer,   
(array_agg(ord.order_id))[1] as order_1,  
(array_agg(ord.order_id))[2] as order_2,  
(array_agg(ord.order_id))[3] as order_3,  
(array_agg(ord.order_id))[4] as order_4,  
(array_agg(ord.order_id))[5] as order_5   
from test_order ord  
inner join test_customer cu  
on ord.customer_id = cu.customer_id  
group by cu.customer  

results in:

customer|order_1|order_2|order_3|order_4|order_5  
"Dick"  |3      |6      |       |       |  
"Harry" |4      |       |       |       |   
"Tom"   |8      |1      |5      |2      |7  

However, what I'd like to do, in two steps:

  1. For Loop my way through the records so that I don't have to create every iteration of the field. The good news is that the structure above doesn't error and just passes NULL, but if I ever get to some insane number of records, I don't have to have to keep creating order_55, order_56 etc manually in my statement.

  2. Even better would be eventually to not pass it a specific field and have it iterate through all the fields (barring the customer_id) and give me iterations of each field, to the effect of:

    customer|order_id1|order_date1|order_desc1|order_id2|order_date2|order_desc2| ... 
    

    etc etc. Basically joining the parent table (customer) to the child (order), but having multiple child records go across a single row instead of creating multiples.

    (Yes, I understand that this goes against the basic concept of why you do parent/child tables in the first place. However, I pass this on to a client and this would make the process infinitely easier.)

UPDATE: I've gotten closer with the cannibalized function ... the first time I call it, it creates the columns and populates one of the customers. But for some reason, my IF NOT EXISTS works when ran separately, but not within the function: I get a "column order_id1 exists" error. I'd also like to eventually modify this so the specific fields aren't hard coded; instead of the customer_id I'd like to do something like pass the parent table, child table and joining ID, and have it fully append the child table in this crosstab-ish manner.

CREATE FUNCTION loop_test(integer) RETURNS integer AS $$

DECLARE
rOrder RECORD;
loop_counter INT := 1;
target_customer_id ALIAS FOR $1;        
BEGIN

FOR rOrder IN SELECT * 
    FROM vdad_data.test_order 
    WHERE customer_id = target_customer_id 
    ORDER BY order_id LOOP

    IF NOT EXISTS
        (
        SELECT * FROM information_schema.COLUMNS
        WHERE COLUMN_NAME= 'order_id' || loop_counter
        AND TABLE_NAME='test_customer'
        AND TABLE_SCHEMA='vdad_data'
        )
        THEN

        EXECUTE 'ALTER TABLE vdad_data.test_customer
        ADD COLUMN order_id' || loop_counter || ' integer';
    END IF;

    IF NOT EXISTS
        (
        SELECT * FROM information_schema.COLUMNS
        WHERE COLUMN_NAME= 'order_date' || loop_counter
        AND TABLE_NAME='test_customer'
        AND TABLE_SCHEMA='vdad_data'
        )
        THEN

        EXECUTE 'ALTER TABLE vdad_data.test_customer
        ADD COLUMN order_date' || loop_counter || ' date';
    END IF;


    IF NOT EXISTS
        (
        SELECT * FROM information_schema.COLUMNS
        WHERE COLUMN_NAME= 'order_desc' || loop_counter
        AND TABLE_NAME='test_customer'
        AND TABLE_SCHEMA='vdad_data'
        )
        THEN

        EXECUTE 'ALTER TABLE vdad_data.test_customer
        ADD COLUMN order_desc' || loop_counter || ' character varying';
    END IF;

EXECUTE 'UPDATE vdad_data.test_customer 
      SET order_id' || loop_counter || ' = ' || rOrder.order_id ||',
      order_date' || loop_counter || ' = ' || quote_literal(to_char(rOrder.order_date,'yyyy-mm-dd')) ||',         
      order_desc' || loop_counter  || ' = ' ||  quote_literal(rOrder.order_desc) ||'
      WHERE customer_id = ' ||rOrder.customer_id;

loop_counter = loop_counter + 1;
END LOOP;

RETURN 1;
END;
$$ LANGUAGE plpgsql;

I apologize for being all over the map, as I've been trying to tackle several things about this at once that I can't quite get. Any help is appreciated, thanks!

3
  • 1
    Are you trying to do something like this crosstab function? Commented Jul 5, 2011 at 22:58
  • Hrm. Sort of. I'm guessing that's a 9.0 function, as I don't have it in 8.4. But it appears that still requires a manual hard code of each field? And the header of each data field isn't determined by a third column (the "cat" value in that example), it's just iterations of the existing column header: order_id becomes order_id1, order_id2, etc. Commented Jul 6, 2011 at 12:52
  • 1
    @MitchO: it's also in 8.4. You need to install the contrib. Commented Jul 17, 2011 at 15:37

2 Answers 2

1

Are you trying to get the ordinal number of each of the customer's orders? You can do that with the row_number() function in Postgres 8.4. Creating separate columns for each order number is not sustainable or efficient in SQL.

Something like:

select cu.customer,
       row_number() OVER(PARTITION BY cu.customer ORDER BY ord.order_date)
from test_order ord inner join test_customer cu  
  on ord.customer_id = cu.customer_id  
group by cu.customer  
Sign up to request clarification or add additional context in comments.

3 Comments

No, this isn't what i'm trying to do. Unfortunately, this is an oversimplified version of my true data. You may feel that seperate columns is not efficient, but I have a situation where I get data with a multi table format, and I need to deliver it as a single table. Therefore, If i just did a standard join, I'd get an inordinate amount of increased rows. Imagine one main table of customers, another of orders, another of addresses, another of relatives ... and doing the join on that per customer.
I get the right data, but I can turn one customer into hundreds, if not thousands of records, and it'd be served much better with order1, order2, address1, address2, addresss3, relative1 .. etc.
Sorry, I guess it's not entirely clear what your application is trying to do. If you have multiple one to many (one customer to many addresses or orders) join then one way to approach the solution is to write two queries. One that lists all customer addresses and one that lists all orders. If you try to join one customer to both addresses and orders you get the cross join situation you're describing (I think).
0

This:

select array_agg(row(order_id,order_date,order_desc))
from (
select 1 order_id,1 customer_id,'2010-01-01' order_date,'Tom''s First' order_desc union 
select 2 order_id,1 customer_id,'2010-04-01' order_date,'Tom''s Second' order_desc union 
select 7 order_id,1 customer_id,'2010-04-13' order_date,'Tom''s Third' order_desc union 
select 8 order_id,1 customer_id,'2011-04-13' order_date,'Tom''s Last' order_desc union 
select 5 order_id,1 customer_id,'2011-06-20' order_date,'Tom''s Really Last.' order_desc union 
select 3 order_id,2 customer_id,'2010-07-07' order_date,'Dick''s First' order_desc union 
select 6 order_id,2 customer_id,'2011-07-07' order_date,'Dick''s Other' order_desc union 
select 4 order_id,3 customer_id,'2011-04-04' order_date,'Harry''s Only' order_desc
) orders
group by orders.customer_id

gives you three rows:

"{"(2,2010-04-01,\"Tom's Second\")","(1,2010-01-01,\"Tom's First\")","(7,2010-04-13,\"Tom's Third\")","(5,2011-06-20,\"Tom's Really Last.\")","(8,2011-04-13,\"Tom's Last\")"}"

"{"(3,2010-07-07,\"Dick's First\")","(6,2011-07-07,\"Dick's Other\")"}"

"{"(4,2011-04-04,\"Harry's Only\")"}"

This looks very close to what you said would be "even better":

customer|order_id1|order_date1|order_desc1|order_id2|order_date2|order_desc2| ...

The only difference is: Everything is contained in a single column. Of course that single column is an array and each element is a composite type and if you flatten that structure you got exactly what you asked for. If course it depends on whether you have the means to do the flattening.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.