Saturday, February 25, 2012

Newbie:UNION with SELECT TOP

I have been trying to query the Top 100 selling items for each category to b
e
one result set...all day. I thought that this should work...
SELECT TOP 100 productid, categoryid, sum(sales) as [Sales]
FROM dbo.Sales
WHERE categoryid ='01'
GROUP BY productid
ORDER BY sum(sales) DESC
UNION
SELECT TOP 100 productid, categoryid, sum(sales) as [Sales]
FROM dbo.Sales
WHERE categoryid ='02'
GROUP BY productid
ORDER BY sum(sales) DESC
UNION...
...and so on for our 14 different categories.
This is the first time that I have tried to use that UNION command, and I'm
wondering if it is the right way to go about this? Obviously I am muddling m
y
way through SQL Queries and appreciate any help.Chuck wrote:
> I have been trying to query the Top 100 selling items for each
> category to be one result set...all day. I thought that this should
> work...
> SELECT TOP 100 productid, categoryid, sum(sales) as [Sales]
> FROM dbo.Sales
> WHERE categoryid ='01'
> GROUP BY productid
> ORDER BY sum(sales) DESC
> UNION
> SELECT TOP 100 productid, categoryid, sum(sales) as [Sales]
> FROM dbo.Sales
> WHERE categoryid ='02'
> GROUP BY productid
> ORDER BY sum(sales) DESC
> UNION...
> ...and so on for our 14 different categories.
> This is the first time that I have tried to use that UNION command,
> and I'm wondering if it is the right way to go about this? Obviously
> I am muddling my way through SQL Queries and appreciate any help.
Apart from suggesting the use of UNION ALL instead of UNION*, I can make no
suggestions here. Is it not doing what you want it to do?
Bob Barrows
*Using the ALL keyword prevents the query engine from wasting time trying to
eliminate duplicate rows from the result. You know there will be no
duplicates, so use UNION ALL
--
Microsoft MVP -- ASP/ASP.NET
Please reply to the newsgroup. The email account listed in my From
header is my spam trap, so I don't check it very often. You will get a
quicker response by posting to the newsgroup.|||> Apart from suggesting the use of UNION ALL instead of UNION*, I can make
> no
> suggestions here. Is it not doing what you want it to do?
Well, you can't have ORDER BY within each UNION, you can only order the
entire result.
Though I am not sure why this query all has to be in one resultset. My
three suggestions would be:
(a) get the top 100 across all categories:
SELECT TOP 100
productID, categoryID, SUM(sales)
FROM dbo.Sales
GROUP BY productID, categoryID
ORDER BY 3 DESC;
(b) an easy way to bypass the ORDER/UNION quandary is to use a #temp table
or @.table variable and insert each result there, then you can order on the
whole instead of its parts.
(c) have 14 results, or take category as a parameter and call the stored
procedure however many times is relevant (my guess is that, in general, we
are either looking at sales overall (in which case (a) is more prudent) or
we are looking at a specific category).
A|||create table #t(productid int, categoryid int, sumsales float)
insert into #t values(1,1,12.3)
insert into #t values(2,1,11.0)
insert into #t values(3,1,15.3)
insert into #t values(1,2,12.3)
insert into #t values(2,2,1.3)
insert into #t values(3,2,2.3)
insert into #t values(4,2,7.3)
insert into #t values(5,2,12.1)
select * from #t t1 where
(select count(*) from #t t2 where t1.categoryid = t2.categoryid
and t2.sumsales>t1.sumsales)<2
order by categoryid, sumsales desc
(1 row(s) affected)
(1 row(s) affected)
(1 row(s) affected)
(1 row(s) affected)
(1 row(s) affected)
(1 row(s) affected)
(1 row(s) affected)
(1 row(s) affected)
productid categoryid sumsales
-- --
----
3 1 15.300000000000001
1 1 12.300000000000001
1 2 12.300000000000001
5 2 12.1
(4 row(s) affected)
drop table #t|||oops I hit Post button too soon.
the query I posted will give you top 2 in each category, *assuming
there are no ties*
I added a tie:
insert into #t values(4,1,12.3)
and this works:
select * from #t t1 where sumsales in
(select top 2 with ties sumsales
from #t t2 where t1.categoryid = t2.categoryid
order by sumsales desc)
order by categoryid, sumsales desc
productid categoryid sumsales
-- --
3 1 15.300000000000001
4 1 12.300000000000001
1 1 12.300000000000001
1 2 12.300000000000001
5 2 12.1
(5 row(s) affected)|||Aaron Bertrand [SQL Server MVP] wrote:
> Well, you can't have ORDER BY within each UNION, you can only order
> the entire result.
Thanks, I missed that (obviously ... )
--
Microsoft MVP -- ASP/ASP.NET
Please reply to the newsgroup. The email account listed in my From
header is my spam trap, so I don't check it very often. You will get a
quicker response by posting to the newsgroup.|||you can do it with derived tables and subqueries as well, and not have
to hard-code categoryid's
e.g.
select productid, categoryid, sales
from (
select productid, categoryid, sum(sales) as sales
from sales
group by productid, categoryid
) SumAll
where productid in (
select top 100 productid
from (
select productid, sum(sales) as sales
from sales
where categoryid=SumAll.categoryid
group by productid
) SumCategory
order by sales desc
)
order by categoryid asc, sales desc
Chuck wrote:
> I have been trying to query the Top 100 selling items for each category to
be
> one result set...all day. I thought that this should work...
> SELECT TOP 100 productid, categoryid, sum(sales) as [Sales]
> FROM dbo.Sales
> WHERE categoryid ='01'
> GROUP BY productid
> ORDER BY sum(sales) DESC
> UNION
> SELECT TOP 100 productid, categoryid, sum(sales) as [Sales]
> FROM dbo.Sales
> WHERE categoryid ='02'
> GROUP BY productid
> ORDER BY sum(sales) DESC
> UNION...
> ...and so on for our 14 different categories.
> This is the first time that I have tried to use that UNION command, and I'
m
> wondering if it is the right way to go about this? Obviously I am muddling
my
> way through SQL Queries and appreciate any help.|||
"Alexander Kuznetsov" wrote:

> oops I hit Post button too soon.
> the query I posted will give you top 2 in each category, *assuming
> there are no ties*
> I added a tie:
> insert into #t values(4,1,12.3)
> and this works:
> select * from #t t1 where sumsales in
> (select top 2 with ties sumsales
> from #t t2 where t1.categoryid = t2.categoryid
> order by sumsales desc)
> order by categoryid, sumsales desc
>
> productid categoryid sumsales
> -- --
> 3 1 15.300000000000001
> 4 1 12.300000000000001
> 1 1 12.300000000000001
> 1 2 12.300000000000001
> 5 2 12.1
> (5 row(s) affected)
>
Bob, Aaron, and Alexander,
Thank you for your quick replies. In answer to your questions...I have
thought of creating individual result sets, which would solve part of the
issue. The ultimate goal here is to create a VIEW which will join with an
inventory table to produce the percentage of Top 100 selling merchandise
"in-stock" by category. As a result, at some point it will be necessary to
rejoin the individual result sets, which I am trying to avoid.
However, I want to break down the project into bite sized pieces, focussing
for the moment on this problem.
I have tried Alexander's suggestion, but the sumsales column in it is not an
aggregate, while my query it is. That produced the error that I could not
use an aggregate as the value for an IN statement. I have altered it to
this, which has produced a timeout error.
SELECT ProductID, CategoryID, SUM(Sales) AS Expr1, SUM(QTY) AS Expr2
FROM cghastin.jvsProd_Performance t1
WHERE (ProductID IN
(SELECT TOP 2 productid
FROM cghastin.jvsprod_performance t2
WHERE t1.categoryid = t2.categoryid
GROUP BY productid
ORDER BY SUM(sales) DESC))
GROUP BY CategoryID, ProductID|||an index on (productID, category_id, sales, qty)
might speed it up
Also you may try this:
select productid, categoryid, sumsales, sumqty
from (
select productid, categoryid, sum(sales) as sumsales, sum(qty) as
sumqty
from sales
group by productid, categoryid
) t
where (select count(*) from (
select productid, categoryid, sum(sales) as sumsales, sum(qty) as
sumqty
from sales
group by productid, categoryid ) t1 where t.productid =
t1.productid
and t.categoryid = t1.categoryid and t.sumsales < t1.sumsales)<100
that is with ties|||an index on (productID, category_id, sales, qty)
might speed it up
Also you may try this:
select productid, categoryid, sumsales, sumqty
from (
select productid, categoryid, sum(sales) as sumsales, sum(qty) as
sumqty
from sales
group by productid, categoryid
) t
where (select count(*) from (
select productid, categoryid, sum(sales) as sumsales, sum(qty) as
sumqty
from sales
group by productid, categoryid ) t1 where t.productid =
t1.productid
and t.categoryid = t1.categoryid and t.sumsales < t1.sumsales)<100
that is with ties

No comments:

Post a Comment