How to avoid "Using temporary" in many-to-many queries?

Here's a simplified example I put together a while ago for a similar performance-related question. It takes advantage of InnoDB's clustered primary key indexes (obviously only available with InnoDB!).

You have 3 tables: category, product and product_category, defined as follows:

drop table if exists product;
create table product
(
prod_id int unsigned not null auto_increment primary key,
name varchar(255) not null unique
)
engine = innodb; 

drop table if exists category;
create table category
(
cat_id mediumint unsigned not null auto_increment primary key,
name varchar(255) not null unique
)
engine = innodb; 

drop table if exists product_category;
create table product_category
(
cat_id mediumint unsigned not null,
prod_id int unsigned not null,
primary key (cat_id, prod_id) -- note the clustered composite index !!
)
engine = innodb;

The most important thing is the column order of the clustered composite primary key on product_category, because typical queries for this scenario always lead with cat_id = x or cat_id in (x,y,z...).
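As a rough sketch of those two query shapes against the product_category table above (the category ids are made up for illustration), both of the following can be answered from the primary key alone, since it leads with cat_id and also contains prod_id:

```sql
-- hypothetical category ids, for illustration only

-- shape 1: cat_id = x
select pc.prod_id
from product_category pc
where pc.cat_id = 4104;

-- shape 2: cat_id in (x,y,z...)
-- distinct because a product may belong to several of the listed categories
select distinct pc.prod_id
from product_category pc
where pc.cat_id in (4104, 4105, 4106);
```

Had the key been declared as (prod_id, cat_id) instead, neither query could use its leading column.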

We have 500K categories, 1 million products and about 125 million product categories.

select count(*) from category;
+----------+
| count(*) |
+----------+
|   500000 |
+----------+

select count(*) from product;
+----------+
| count(*) |
+----------+
|  1000000 |
+----------+

select count(*) from product_category;
+-----------+
| count(*)  |
+-----------+
| 125611877 |
+-----------+

So let's see how this schema performs for a query similar to yours. All queries are run cold (after a MySQL restart) with empty buffers and no query caching.

select
 p.*
from
 product p
inner join product_category pc on 
    pc.cat_id = 4104 and pc.prod_id = p.prod_id
order by
 p.prod_id desc -- sorry, no date field in this sample table; it won't make any difference though
limit 20;

+---------+----------------+
| prod_id | name           |
+---------+----------------+
|  993561 | Product 993561 |
|  991215 | Product 991215 |
|  989222 | Product 989222 |
|  986589 | Product 986589 |
|  983593 | Product 983593 |
|  982507 | Product 982507 |
|  981505 | Product 981505 |
|  981320 | Product 981320 |
|  978576 | Product 978576 |
|  973428 | Product 973428 |
|  959384 | Product 959384 |
|  954829 | Product 954829 |
|  953369 | Product 953369 |
|  951891 | Product 951891 |
|  949413 | Product 949413 |
|  947855 | Product 947855 |
|  947080 | Product 947080 |
|  945115 | Product 945115 |
|  943833 | Product 943833 |
|  942309 | Product 942309 |
+---------+----------------+
20 rows in set (0.70 sec) 

explain
select
 p.*
from
 product p
inner join product_category pc on 
    pc.cat_id = 4104 and pc.prod_id = p.prod_id
order by
 p.prod_id desc -- sorry, no date field in this sample table; it won't make any difference though
limit 20;

+----+-------------+-------+--------+---------------+---------+---------+------------------+------+----------------------------------------------+
| id | select_type | table | type   | possible_keys | key     | key_len | ref              | rows | Extra                                        |
+----+-------------+-------+--------+---------------+---------+---------+------------------+------+----------------------------------------------+
|  1 | SIMPLE      | pc    | ref    | PRIMARY       | PRIMARY | 3       | const            |  499 | Using index; Using temporary; Using filesort |
|  1 | SIMPLE      | p     | eq_ref | PRIMARY       | PRIMARY | 4       | vl_db.pc.prod_id |    1 |                                              |
+----+-------------+-------+--------+---------------+---------+---------+------------------+------+----------------------------------------------+
2 rows in set (0.00 sec)

So that's 0.70 seconds cold. Ouch.
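The plan still reports Using temporary and Using filesort on pc, although only over the roughly 499 index entries estimated for that category. One variation that may be worth trying is to sort on pc.prod_id instead of p.prod_id. This was not part of the timings above and is only an assumption. The two columns are equal under the join condition, and with cat_id fixed the clustered key (cat_id, prod_id) is already ordered by prod_id, so the optimizer may be able to read it in reverse and skip the sort. Whether it actually does depends on your server version, so check the plan yourself:

```sql
-- untested sketch: same join as above, but sorted on the junction table's column
explain
select
 p.*
from
 product_category pc
inner join product p on p.prod_id = pc.prod_id
where
 pc.cat_id = 4104
order by
 pc.prod_id desc -- equal to p.prod_id by the join condition
limit 20;
```

If the Extra column for pc no longer lists Using temporary or Using filesort, the sort is being satisfied by the index order.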

Hope this helps :)

EDIT

Having read your reply to my comment above, it seems you have one of two choices to make:

create table articles_to_categories
(
article_id int unsigned not null,
category_id mediumint unsigned not null,
primary key(article_id, category_id), -- good for queries that lead with article_id = x
key (category_id)
)
engine=innodb;

or

create table categories_to_articles
(
article_id int unsigned not null,
category_id mediumint unsigned not null,
primary key(category_id, article_id), -- good for queries that lead with category_id = x
key (article_id)
)
engine=innodb;

How you define your clustered PK depends on your typical queries.
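To make the choice concrete, here is a minimal sketch of the kind of query each definition favours. The ids 42 and 7 are hypothetical, and you would only create one of the two tables:

```sql
-- hypothetical ids (42 and 7), for illustration only

-- articles_to_categories, primary key (article_id, category_id):
-- all categories of one article are stored next to each other
select category_id
from articles_to_categories
where article_id = 42;

-- categories_to_articles, primary key (category_id, article_id):
-- all articles of one category are stored next to each other
select article_id
from categories_to_articles
where category_id = 7
order by article_id desc
limit 20;
```

Whichever of these you run most often should decide the column order of the primary key. The secondary key on the other column covers the less frequent direction.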