Opened 10 years ago

Closed 10 years ago

#1936 closed enhancement (fixed)

Bad performance when filtering on subqueries

Reported by: Nicklas Nordborg Owned by: everyone
Priority: major Milestone: BASE 3.5
Component: core Version:
Keywords: Cc:

Description (last modified by Nicklas Nordborg)

I noticed this when working with #1325. When filtering items that are not members of a specified item list the performance can get very bad.

In this special case I had an item list containing 25330 samples which is all except 72. Using the list to filter out the 25530 that are members is quick, but selecting the 'not' option to filter out the 72 non-members takes almost a minute.

The SQL that is executed is something like (not full statement):

select ... from "BioMaterials"  
where id <>ALL
(select members from "ItemLists" ... where id=754572)

This query took about 40 seconds to execute. According to the documentation for PostgreSQL (http://www.postgresql.org/docs/9.4/static/functions-subquery.html) <>ALL is equivalent to NOT IN. Trying this out I get the same result but the query takes less than a second to execute.

We are using <>ALL in some other places as well, but I think we should switch to NOT IN.

Ps. When filtering out items that are members of a list we use =ANY which (according to the documentation) is the same as IN but here we see no difference in performance (both are good).

The possibility to use <>ALL and =ANY was introduced by #1156.

Change History (3)

comment:1 by Nicklas Nordborg, 10 years ago

Description: modified (diff)

comment:2 by Nicklas Nordborg, 10 years ago

(In [6844]) References #1936: Bad performance when filtering on subqueries

Changed all =ANY to IN and <>ALL to NOT IN. Tested with PostgreSQL so far. Hopes it works with MySQL also.

comment:3 by Nicklas Nordborg, 10 years ago

Resolution: fixed
Status: newclosed

No problem with MySQL.

Note: See TracTickets for help on using tickets.