Friday, March 30, 2012

How do I manage this mess? Thanks!

Say for example I have the following 2 database tables, the first one contains the old employee data, and has the fields shown below:

oldEmployeeID

FirstName

LastName

DateOfBirth

HiringDate

TerminationDate

and another one containing the new employee data with similar fields but instead of oldEmployeeID, it is showing the newEmployeeID.

During the conversion process, something were messed up and instead of putting in the original hiring date of the workers into the new employee database, the conversion date was put in, which, depending on the mood of HR ladies, could be any date, and at the same time, of course, new employee join the company, and we assume their hiring dates were entered correctly. On top of that, there are some employee who were terminated before the conversion took place but we still need to keep a record of that.

And I created a third table, say, emplyeeAll with similar fields to the employee data tables.

So here is what I need to do: if the firstName, lastName and DateOfBirth in the old employee data table and the new employee data table matches, I would assume they are the same employee, hence I would put the information for the employee obtained from the new employee data table to the employeeAll table, with the Hiring Date changed to the Hiring Date of the old employee data table (and do not copy the record from the old employee table to prevent duplicates), otherwise, I would simply copy and paste the data in new and old employee table to my employeeAll table.

I know this is really confusing, but...well...hope you know what I am saying...

Is it possible to have a SQL statement for all these? If so, how should the statement looks like?

Thanks a lot!

Regards,

Anyi

Hi,

If I understand this correctly, (I think I do....perhaps) ... if you have data in two tables (and I know you have three, the last would be the destinationtable employeeAll) and you wish to collect the difference between table 1 and 2 (employeeOld, employeeNew) then the left outer join is your answer...

--If you wish to collect differential rows from old table

select employeeOld.*

from employeeOld

left outer join employeeNew on employeeOld.FirstName = employeeNew.FirstName and employeeOld.DateOfBirth = employeeNew.DateOfBirth

where employeeNew.DateOfBirth is null

Similarly you can mix 'n match the above to collect differetial data from the other two tables...note the 'select' is table where the additional data lies and the where clause (is null) is the comparison table.

Hope it helps

|||

I thinks that something like this is what you want:


Code Snippet


-- This collects the data
INSERT INTO EmployeeAll
SELECT
n.EmployeeID,
n.LastName,
n.FirstName,
n.DateOfBirth,
coalesce( o.HiringDate, n.HiringDate ),
coalesce( o.TerminationDate, n.TerminationDate ),
n.{RemainingColumns}
FROM NewEmployees n
LEFT JOIN OldEmployees o
ON ( n.LastName = o.LastName
AND n.FirstName = o.FirstName
AND n.DateOfBirth = o.DateOfBirth
)

|||

But would this actually copy the rest of the record into EmployeeAll, i.e., the records that only showed up in the old employee table (the employees terminated before the conversion took place) or the records that only showed up on the new employee table (i.e., the employees hired after the conversion was completed)?

Thanks!

Regards,

Anyi

Arnie Rowland wrote:

I thinks that something like this is what you want:


Code Snippet


-- This collects the data
INSERT INTO EmployeeAll
SELECT
n.EmployeeID,
n.LastName,
n.FirstName,
n.DateOfBirth,
coalesce( o.HiringDate, n.HiringDate ),
coalesce( o.TerminationDate, n.TerminationDate ),
n.{RemainingColumns}
FROM NewEmployees n
LEFT JOIN OldEmployees o
ON ( n.LastName = o.LastName
AND n.FirstName = o.FirstName
AND n.DateOfBirth = o.DateOfBirth
)

|||

If you want both the records in the old table that don't exists in the new table, the records in the new table that don't exists in the old table, and the records that are in both (with the corrected HiringDate), then change the JOIN from a LEFT JOIN to a FULL JOIN. (This works in SQL 2005 -NOT SQL 2000.)

If you are using SQL 2000, you will need three queries to accomplish the same task.

No comments:

Post a Comment