I have a fairly large SQL query that is being slowed by the GROUP BY and ORDER BY statements. What is the best way to optimize?

What is the best way to optimize this query as its taking up to 12 second to execute?

Apologies for the size of the query, it executes almost instantly without the group by and order by statements.

I’m fairly new to SQL optimization, I’ve only been coding for around a year. I didn’t write this query but I need to fix it.

SELECT `products`.*, `product_alternative`.`alternative_product_code`, `product_alternative`.`id` as `product_alternative_id`, `vat_rate`.`rate` AS `vat_rate`, `product_images`.`filename` AS `product_image_file`, `product_docs`.`filename` AS `product_doc_file`, `suppliers`.`supplier_code`, `suppliers`.`name` AS `supplier_name`, `commission`.`commission` AS `supplier_commission`, `categories`.`name` AS `category_name`, `sub_categories`.`name` AS `subcategory_name`, IF(`products`.`product_doc_id` = 0,  NULL,  CONCAT( "/product-docs/", `products`.`id` )) AS `product_doc_url`,IF(`products`.`product_image_id` = 0, NULL, CONCAT( "/product-images/", `products`.`id`, "/original/" )) AS `original_image_url`,IF(`products`.`product_image_id` = 0, NULL, CONCAT( "/product-images/", `products`.`id`, "/medium/" )) AS `medium_image_url`,IF(`products`.`product_image_id` = 0, NULL, CONCAT( "/product-images/", `products`.`id`, "/thumb/" )) AS `thumb_image_url`,CASE WHEN     `group_favourite`.`id` IS NOT NULL     THEN 1 ELSE 0 END AS `group_favourite` ,CASE WHEN     `product_favourite`.`org_id` != 0     AND `product_favourite`.`org_id` IS NOT NULL     THEN 1 ELSE 0 END AS `favourite` ,CASE WHEN     `product_favourite`.`org_id` != 0     AND `product_favourite`.`org_id` IS NOT NULL     THEN 1 ELSE 0 END AS `favourite`, `product_favourite`.`needs_authorisation`,   IF(`customer_personal_favourite`.`id` IS NOT NULL,1,0) AS `is_personal_favourite`  FROM `products`   LEFT JOIN `vat_rate` ON `products`.`vat_rate_id` = `vat_rate`.`id` LEFT JOIN `product_images` ON `products`.`product_image_id` = `product_images`.`id` LEFT JOIN `product_docs` ON `products`.`product_doc_id` = `product_docs`.`id`  LEFT JOIN `orgs` AS `suppliers` ON `products`.`supplier_id` = `suppliers`.`id`  LEFT JOIN `commission` ON     `suppliers`.`id` = `commission`.`org_id`     AND `commission`.`status` = 1     AND DATE(`commission`.`start_date`) <= DATE('2020-07-15')     AND (         DATE(`commission`.`end_date`) >= DATE('2020-07-15')         OR `commission`.`end_date` = '0000-00-00'     )  LEFT JOIN `categories` AS `categories` ON     `products`.`category_id` = `categories`.`id`   LEFT JOIN `categories` AS `sub_categories` ON     `products`.`sub_category_id` = `sub_categories`.`id`  LEFT JOIN `product_attribute_product` ON     `product_attribute_product`.`product_id` = `products`.`id` LEFT JOIN `product_alternative` ON     `product_alternative`.`product_original_id` = `products`.`original_id` JOIN `orgs` AS `customer` ON `customer`.`id` IN (320)  LEFT JOIN `product_supplier_account` ON `product_supplier_account`.`product_id` = `products`.`id`  JOIN `supplier_account` AS `default_supplier_account` ON `default_supplier_account`.`supplier_id` = `products`.`supplier_id` AND `default_supplier_account`.`is_default` = 1 AND `default_supplier_account`.`is_deleted` = 0  JOIN `customer_to_supplier` ON `customer_to_supplier`.`supplier_id` = `products`.`supplier_id` AND `customer_to_supplier`.`customer_id` IN (`customer`.`id`) AND `customer_to_supplier`.`allow_access` = 1 AND `customer_to_supplier`.`status` = 1  JOIN `customer_to_supplier_account` ON     `customer_to_supplier_account`.`customer_to_supplier_id` = `customer_to_supplier`.`id`     AND (         `customer_to_supplier_account`.`supplier_account_id` =                                    `product_supplier_account`.`supplier_account_id`         OR `customer_to_supplier_account`.`supplier_account_id` = `default_supplier_account`.`id`     )     AND `customer_to_supplier_account`.`allow_access` = 1     AND `customer_to_supplier_account`.`is_deleted` = 0 LEFT JOIN `statement_products_categories` ON `statement_products_categories`.`product_id` = `products`.`original_id` LEFT JOIN `product_favourite` ON     `product_favourite`.`product_id` = `products`.`original_id`     AND `product_favourite`.`org_id` IN (`customer`.`id`)     LEFT JOIN `group_favourite` AS `group_favourite` ON `products`.`original_id` = `group_favourite`.`product_id` AND `group_favourite`.`group_id` IN (37) LEFT JOIN `budget` AS `category_budget` ON     `category_budget`.`org_id` IN (`customer`.`id`)     AND `category_budget`.`budget_type_id` = 1     AND `category_budget`.`category_id` = `products`.`category_id`     AND         `category_budget`.`start_date` <= CURDATE()         AND (             `category_budget`.`final_date` >= CURDATE()             OR `category_budget`.`final_date` IS NULL         ) LEFT JOIN `budget` AS `supplier_budget` ON     `supplier_budget`.`org_id` IN (`customer`.`id`)     AND `supplier_budget`.`budget_type_id` = 3     AND `supplier_budget`.`supplier_id` = `products`.`supplier_id`     AND         `supplier_budget`.`start_date` <= CURDATE()         AND (             `supplier_budget`.`final_date` >= CURDATE()             OR `supplier_budget`.`final_date` IS NULL         ) LEFT JOIN `budget` AS `product_budget` ON     `product_budget`.`org_id` IN (`customer`.`id`)     AND `product_budget`.`budget_type_id` = 2     AND         `product_budget`.`start_date` <= CURDATE()         AND (             `product_budget`.`final_date` >= CURDATE()             OR `product_budget`.`final_date` IS NULL         ) LEFT  JOIN `customer_personal_favourite` ON `customer_personal_favourite`.`org_id` = `customer`.`id` AND `customer_personal_favourite`.`product_original_id` = `products`.`original_id`  WHERE 0 OR (1  AND `products`.`status` = 1  AND ((     `products`.`parent_id` = 0     OR `products`.`parent_id` IS NULL )   OR (     `product_favourite`.`org_id` IS NOT NULL     AND `product_favourite`.`org_id` != 0 )  )  AND `products`.`id` = `products`.`current_id` ) GROUP BY `products`.`id` ORDER BY `favourite` DESC,  `active_date` ASC,     `products`.`code` ASC,     `products`.`name` ASC,     `products`.`parent_id` ASC   LIMIT 0, 20  
Add Comment
2 Answer(s)

I had once faced this issue and I decided at that time to write an article on this and I have compiled from different sources and optimized my query if someone wishes to see how I achieved it. Please have a look at this. https://junaidtechblog.wordpress.com/2019/09/04/optimize-sql-query-groupby-having/

Add Comment
      AND  DATE(`commission`.`start_date`) <= DATE('2020-07-15')       AND  ( DATE(`commission`.`end_date`) >= DATE('2020-07-15')               OR  `commission`.`end_date` = '0000-00-00'  

has several problems:

  • OR is hard to optimize. Find some way to avoid it — such as by removing such rows, or using an end date that is far in the future, not past.
  • Don’t hide a column in a function call.
  • Let’s see SHOW CREATE TABLE.
  • You don’t need DATE() around the literal ‘2020-07-15’.

This, on the other hand, is OK: product_budget.start_date <= CURDATE()

ORDER BY `favourite` DESC,          `active_date` ASC, 

Unless you have MySQL 8.0, no index can handle an ORDER BY that mixes DESC and ASC. (It may not be possible to use an index for other reasons.)

WHERE 0 OR (1  AND ... 

Dynamically build the query, rather than using kludges like these.

LEFT JOIN ... ON ... foo = 1 

ON should state how the tables are related. foo = 1 feels more like something that should be in the WHERE clause (for filtering). Moving that between ON and WHERE will probably change the set of rows produced. But I am questioning whether that should be JOIN (instead of LEFT JOIN) or the test should be in WHERE, or whether it is actually valid the way you wrote it.

Please provide EXPLAIN SELECT ... The lines with the largest "rows" are likely to be the tables to focus on.

GROUP BY  `products`.`id` 

is likely to lead to mangled output. Read about "only_full_group_by".

That leads to a potential speedup.

  1. Do the minimal amount of work to get a list of id to satisfy the ORDER BY and LIMIT.

  2. Then JOIN to all the other tables, including back to products, to get the other columns. This is likely to speed things up and get rid of the above bug.

  3. Where practical change LEFT JOIN into a subquery:

     SELECT ...      `vat_rate`.`rate` AS `vat_rate`,      ...  LEFT JOIN  `vat_rate`  ON `products`.`vat_rate_id` = `vat_rate`.`id`  ... 

–>

    SELECT ...         ( SELECT `rate` FROM vat_rate WHERE id = `products`.`vat_rate_id`         )  AS `vat_rate`,         ...     ... 

Get most of that done, then come back for more suggestions.

Add Comment

Your Answer

By posting your answer, you agree to the privacy policy and terms of service.