Friday, February 17, 2012

Doing a SELECT on a different field than the field in the GROUP BY

I'm using SQL Server 2005.

I've got table tblSaleAddresses which looks something like this:

AddressID SaleID Address1 Address2 ..... DateEffective
- -
1 100 Unit 20 100 Easy St 1 Feb 04
2 100 10 Any St 1 Oct 05
3 101 85 Main St 1 Mar 04
4 101 20 Lovers Ln 15 Sep 05

I want to select the current address for each SaleID - that is the address for each SaleID WHERE Max(DateEffective) and WHERE DateEffective <= GETDATE().

In the case above, I should get AddressID 1 (AddressID 2 isn't yet effective because it's still in the future) for SaleID 100 and AddressID 4 for SaleID 101.

I'm having trouble because I want to SELECT AddressID but GROUP BY SaleID.

I know you can do a GROUP BY subquery and then join on the SaleID and DateEffective, but it doesn't seem right to me you have to join on a date field (which theorectically isn't very unique). I expected that there would by a statement that would allow you to select the specific AddressID for the MAX(DateEffective)?

Or have I organised my data in the wrong way? And if so, what should I have done differently?

Here's the SELECT statement that joins on the date field:

SELECT t.*

FROM tblSaleAddresses t

INNER JOIN(

SELECT SaleID, MAX(DateEffective) AS MaxDateEffective

FROM tblSaleAddresses

WHERE DateEffective <= GETDATE()

GROUP BY SaleID

) d On t.SaleID = d.SaleID And t.DateEffective = d.MaxDateEffective

Thanks!

I posted few queries that show how to do this for a similar problem. Please see the thread titled "Get latest purchase of each vender". Here are few queries:

-- Works from SQL70 onwards:
select t1.*
from tblSaleAddresses as t1
where t1.AddressId = (select top 1 t2.AddressId
from tblSaleAddresses as t2
where t2.SaleID = t1.SaleID
and t2.DateEffective <= CURRENT_TIMESTAMP
order by t2.DateEffective DESC)
and t1.DateEffective <= CURRENT_TIMESTAMP;

-- SQL2005 specific:

select t2.*

from (

select t1.*
, ROW_NUMBER() OVER(PARTITION BY t1.SaleID ORDER BY t1.DateEffective DESC, t1.AddressID DESC) as AddrSeq
from tblSaleAddresses as t1
where t1.DateEffective <= CURRENT_TIMESTAMP
) as t2
where t2.AddrSeq = 1;

-- ANSI SQL query
select t1.*
from tblSaleAddresses as t1
join (
-- I concatenate date & id as binary data and take MAX on that first
-- then you can parse the address id of the max value
-- this works for positive values of AddressId and Date
-- you can character values that sorts correctly also.
select cast(right(max(cast(convert(char(8), t2.DateEffective, 112) as binary(8))
+ cast(t2.AddressID as binary(4))), 4) as int)
from tblSaleAddresses as t2
where t2.DateEffective <= CURRENT_TIMESTAMP
group by t2.SaleID
) as t3(AddressId)
on t3.AddressId = t1.AddressId;

|||Thanks! I chose the SQL 2005 option and it worked perfectly.

I was already starting down the path of the new ranking functions including ROW_NUMBER, but I was still trying to figure it out.

Thanks for your help!

Regards,

No comments:

Post a Comment