mardi 21 juin 2016

updating rows in HIVE with new incoming data

I tried to do the similar problem from https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.0/bk_dataintegration/content/incrementally-updating-hive-table-with-sqoop-and-ext-table.html . But it doesn't work for my problem.

history_data is my base data and incremental_data is new incoming data

Order_number,Order-Status,Entry_Date,Invoice_Amount

history_data (will come only once)

100,1,'2016-05-01',10.50
300,2,'2016-05-01',20.00
700,1,'2016-05-02',30.00
800,9,'2016-05-05',20.00
900,5,'2016-05-09',50.00

incremental_data (will come every day)

100,4,'2016-05-01',10.50       Order 100 has new status now - this should update order 100 in history data and result should have status 4         
950,1,'2016-05-10',10.00       New order - this should be added to history data
975,9,'2016-05-15',20.00       New order - this should be added to history data
700,7,'2016-05-02',30.00       Order 700 has new status now - this should update order 700 in history data and result should have status 7


The result should look like below

100,4,'2016-05-01',10.50
300,2,'2016-05-01',20.00
700,7,'2016-05-02',30.00
800,9,'2016-05-05',20.00
900,5,'2016-05-09',50.00
950,1,'2016-05-10',10.00       
975,9,'2016-05-15',20.00

Following is my code:

CREATE VIEW reconcile_view AS
SELECT t1.* FROM
    (SELECT * FROM history_data
     UNION ALL
     SELECT * from incremental_data) t1
JOIN
    (SELECT order_number FROM
        (SELECT * FROM history_data
         UNION ALL
         SELECT * from incremental_data )
     GROUP BY order_number) t2
ON t1.order_number = t2.order_number;

This code also gives me error saying: FAILED: ParseException line 4:42 Failed to recognize predicate 'GROUP'. Failed rule: 'identifier' in subquery source

The I tried this code:

create view trit_view as SELECT * FROM history_data UNION ALL SELECT * from incremental_table;

CREATE VIEW groupedit as SELECT order_number from trit_view GROUP BY order_number;

CREATE VIEW reconcile_view AS SELECT t1.* from trit_view t1 join groupedit t2 on t1.order_number = t2.order_number;

But it does not give me required result.

Aucun commentaire:

Enregistrer un commentaire