Zdobądź 10 najlepszych produktów z każdej kategorii

Prawdopodobnie istnieją powody, aby nie używać funkcji analitycznych, ale korzystać z funkcji analitycznych samo :

select am, rf, rfm, rownum_rf2, rownum_rfm
from
(
    -- the 3nd level takes the subproduct ranks, and for each equally ranked
    -- subproduct, it produces the product ranking
    select am, rf, rfm, rownum_rfm,
      row_number() over (partition by rownum_rfm order by rownum_rf) rownum_rf2
    from
    (
        -- the 2nd level ranks (without ties) the products within
        -- categories, and subproducts within products simultaneosly
        select am, rf, rfm,
          row_number() over (partition by am order by count_rf desc) rownum_rf,
          row_number() over (partition by am, rf order by count_rfm desc) rownum_rfm
        from
        (
            -- inner most query counts the records by subproduct
            -- using regular group-by. at the same time, it uses
            -- the analytical sum() over to get the counts by product
            select tg.am, ttc.rf, ttc.rfm,
              count(*) count_rfm,
              sum(count(*)) over (partition by tg.am, ttc.rf) count_rf
            from tg inner join ttc on tg.value = ttc.value
            group by tg.am, ttc.rf, ttc.rfm
        ) X
    ) Y
    -- at level 3, we drop all but the top 5 subproducts per product
    where rownum_rfm <= 5   -- top  5 subproducts
) Z
-- the filter on the final query retains only the top 10 products
where rownum_rf2 <= 10  -- top 10 products
order by am, rownum_rf2, rownum_rfm;

Użyłem rownum zamiast rangi, więc nigdy nie otrzymasz remisów, czyli innymi słowy, remisy będą ustalane losowo. To również nie działa, jeśli dane nie są wystarczająco gęste (mniej niż 5 podproduktów w dowolnym z 10 najlepszych produktów — zamiast tego może wyświetlać podprodukty z niektórych innych produktów). Ale jeśli dane są gęste (duża baza danych), zapytanie powinno działać poprawnie.

Poniższy zapis wykonuje dwa przejścia danych, ale w każdym przypadku zwraca poprawne wyniki. Ponownie, jest to zapytanie bez powiązań.

select am, rf, rfm, count_rf, count_rfm, rownum_rf, rownum_rfm
from
(
    -- next join the top 10 products to the data again to get
    -- the subproduct counts
    select tg.am, tg.rf, ttc.rfm, tg.count_rf, tg.rownum_rf, count(*) count_rfm,
        ROW_NUMBER() over (partition by tg.am, tg.rf order by 1 desc) rownum_rfm
    from (
        -- first rank all the products
        select tg.am, tg.value, ttc.rf, count(*) count_rf,
            ROW_NUMBER() over (order by 1 desc) rownum_rf
        from tg
        inner join ttc on tg.value = ttc.value
        group by tg.am, tg.value, ttc.rf
        order by count_rf desc
        ) tg
    inner join ttc on tg.value = ttc.value and tg.rf = ttc.rf
    -- filter the inner query for the top 10 products only
    where rownum_rf <= 10
    group by tg.am, tg.rf, ttc.rfm, tg.count_rf, tg.rownum_rf
) X
-- filter where the subproduct rank is in top 5
where rownum_rfm <= 5
order by am, rownum_rf, rownum_rfm;

kolumny:

count_rf : count of sales by product
count_rfm : count of sales by subproduct
rownum_rf : product rank within category (rownumber - without ties)
rownum_rfm : subproduct rank within product (without ties)