Opened 18 years ago

Closed 18 years ago

#248 closed defect (fixed)

Postgres hangs when trying to load data from a filtered bioassayset

Reported by: Nicklas Nordborg Owned by: Nicklas Nordborg
Priority: critical Milestone: BASE 2.0
Component: core Version:
Keywords: Cc:

Description

A non-filter bioassay set is displaying ok. The SQL for loading the filtered bioassay set is:

SELECT COUNT(*)  FROM "base2"."dynamic"."D2624Spot" "spt" 
INNER JOIN "base2"."dynamic"."D2624Filter" AS flt 
ON flt."cube" = "spt"."cube" AND flt."column" = "spt"."column" 
AND flt."position" = "spt"."position"  
WHERE "spt"."cube" = 2 AND "spt"."layer" = 1 AND "flt"."filter" = 1 
AND "spt"."column" = 1

The SQL for loading the unfiltered bioassay set is the same without the INNER JOIN part:

SELECT COUNT(*)  FROM "base2"."dynamic"."D2624Spot" "spt" 
WHERE "spt"."cube" = 2 AND "spt"."layer" = 1 AND "spt"."column" = 1

Change History (5)

comment:1 by Nicklas Nordborg, 18 years ago

Milestone: BASE 2.0
Status: newassigned

Hmm... it might have been a problem with running out of disk space. Both queries are executing fine right now and I had a disk problem the day this error was reported. So, if nobody else is having the same error I think this ticket can be closed.

comment:2 by Nicklas Nordborg, 18 years ago

NO!! The hanging is back, but postgres isn't really hung. It is the query takes a very very long tim to execute.

I think the problem is that postgres doesn't update it's indexes automatically. It always hangs after inserting new data. In the test case a 100000+ rows were inserted into the D2624Spot and D2624Filter tables. I think that postgres can't use the indexes in the first query since the new data isn't in there. Running a VACUUM ANALYZE on the tables seems to update the indexes. After that the query runs on a few seconds.

So the question is. How do we automate this? In a typical use case someone filter's the data and then wants to view the result immediately. We can't wait for a sceduled task to do the VACUUM command.

Postgres has a tool, pg_autovacuum, that might be useful. More info at http://www.postgresql.org/docs/8.1/static/maintenance.html#AUTOVACUUM

comment:3 by Nicklas Nordborg, 18 years ago

I have done some more tests and found another problem, but maybe also a solution. The problem is that after a spotbatcher has inserted it's spots (which may a 100000 new spots) it does a select to find the number of unique reporters. This query takes a long time since the new spots are not in the index. The AUTOVACUUM above can't solve this problem, since it only executed at certain time points. However, issuing an ANALYZE <table> command just before closing the spotbatcher updates the index and the select runs quickly. VACUUM can't be used. It comaplains about not beeing able to run within a transaction.

The only concern is what happens if several transactions are inserting data into the same table, and multiple ANALYZE commands are run at the same time. This is going to happen in a production environment and must be investigated.

comment:4 by Nicklas Nordborg, 18 years ago

Priority: majorcritical

comment:5 by Nicklas Nordborg, 18 years ago

Resolution: fixed
Status: assignedclosed

(In [2380]) Fixes #248: Postgres hangs when trying to load data from a filtered bioassayset

Added methods to DbEngine that allows Batcher:s to issue SQL commands after a batch operation has been completed. Improved logging of batchers.

Note: See TracTickets for help on using tickets.