Opened 10 years ago

Last modified 10 years ago

#1936 closed enhancement

Bad performance when filtering on subqueries — at Version 1

Reported by: Nicklas Nordborg Owned by: everyone
Priority: major Milestone: BASE 3.5
Component: core Version:
Keywords: Cc:

Description (last modified by Nicklas Nordborg)

I noticed this when working with #1325. When filtering items that are not members of a specified item list the performance can get very bad.

In this special case I had an item list containing 25330 samples which is all except 72. Using the list to filter out the 25530 that are members is quick, but selecting the 'not' option to filter out the 72 non-members takes almost a minute.

The SQL that is executed is something like (not full statement):

select ... from "BioMaterials"  
where id <>ALL
(select members from "ItemLists" ... where id=754572)

This query took about 40 seconds to execute. According to the documentation for PostgreSQL (http://www.postgresql.org/docs/9.4/static/functions-subquery.html) <>ALL is equivalent to NOT IN. Trying this out I get the same result but the query takes less than a second to execute.

We are using <>ALL in some other places as well, but I think we should switch to NOT IN.

Ps. When filtering out items that are members of a list we use =ANY which (according to the documentation) is the same as IN but here we see no difference in performance (both are good).

The possibility to use <>ALL and =ANY was introduced by #1156.

Change History (1)

comment:1 by Nicklas Nordborg, 10 years ago

Description: modified (diff)
Note: See TracTickets for help on using tickets.