Opened 19 years ago
Closed 19 years ago
#248 closed defect (fixed)
Postgres hangs when trying to load data from a filtered bioassayset
Reported by: | Nicklas Nordborg | Owned by: | Nicklas Nordborg |
---|---|---|---|
Priority: | critical | Milestone: | BASE 2.0 |
Component: | core | Version: | |
Keywords: | Cc: |
Description
An unfiltered bioassay set displays fine. The SQL for loading the filtered bioassay set is:
SELECT COUNT(*) FROM "base2"."dynamic"."D2624Spot" "spt" INNER JOIN "base2"."dynamic"."D2624Filter" AS flt ON flt."cube" = "spt"."cube" AND flt."column" = "spt"."column" AND flt."position" = "spt"."position" WHERE "spt"."cube" = 2 AND "spt"."layer" = 1 AND "flt"."filter" = 1 AND "spt"."column" = 1
The SQL for loading the unfiltered bioassay set is the same without the INNER JOIN part:
SELECT COUNT(*) FROM "base2"."dynamic"."D2624Spot" "spt" WHERE "spt"."cube" = 2 AND "spt"."layer" = 1 AND "spt"."column" = 1
Change History (5)
comment:1 by , 19 years ago
Milestone: | → BASE 2.0 |
---|---|
Status: | new → assigned |
comment:2 by , 19 years ago
NO!! The hanging is back, but postgres isn't really hung. It is just that the query takes a very, very long time to execute.
I think the problem is that postgres doesn't update its planner statistics automatically. It always hangs after inserting new data. In the test case, 100,000+ rows were inserted into the D2624Spot and D2624Filter tables. I think that postgres can't make use of the indexes in the first query since the statistics don't cover the new data. Running a VACUUM ANALYZE on the tables updates the statistics. After that the query runs in a few seconds.
So the question is: how do we automate this? In a typical use case someone filters the data and then wants to view the result immediately. We can't wait for a scheduled task to run the VACUUM command.
Postgres has a tool, pg_autovacuum, that might be useful. More info at http://www.postgresql.org/docs/8.1/static/maintenance.html#AUTOVACUUM
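In PostgreSQL 8.1 the autovacuum daemon is integrated into the server and is enabled in postgresql.conf. A minimal sketch of the relevant settings, assuming the 8.1 parameter names from the docs linked above (the naptime value is an illustrative choice, not a recommendation from this ticket):

```
# postgresql.conf (PostgreSQL 8.1) -- illustrative sketch
stats_start_collector = on   # autovacuum requires the statistics collector
stats_row_level = on         # ...with row-level statistics enabled
autovacuum = on              # run the integrated autovacuum daemon
autovacuum_naptime = 60      # seconds between autovacuum runs (example value)
```

Note that even with this enabled, autovacuum only wakes up periodically, so it cannot guarantee fresh statistics immediately after a bulk insert.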
comment:3 by , 19 years ago
I have done some more tests and found another problem, but maybe also a solution. The problem is that after a spotbatcher has inserted its spots (which may be 100,000+ new spots) it does a select to find the number of unique reporters. This query takes a long time since the planner statistics don't cover the new spots. The AUTOVACUUM above can't solve this problem, since it is only executed at certain intervals. However, issuing an ANALYZE &lt;table&gt; command just before closing the spotbatcher updates the statistics and the select runs quickly. VACUUM can't be used: it complains about not being able to run within a transaction.
The only concern is what happens if several transactions are inserting data into the same table, and multiple ANALYZE commands are run at the same time. This is going to happen in a production environment and must be investigated.
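The workaround described in this comment can be sketched as plain SQL. The table name D2624Spot is from the ticket; the column list and the `reporter` column are hypothetical placeholders for whatever the spotbatcher actually writes:

```sql
BEGIN;

-- spotbatcher inserts its 100,000+ rows here (schematic)
-- INSERT INTO "dynamic"."D2624Spot" (...) VALUES (...);

-- Unlike VACUUM, ANALYZE may run inside a transaction block.
-- It refreshes the planner statistics so the next query can
-- pick an index scan instead of a sequential scan.
ANALYZE "dynamic"."D2624Spot";

-- the "unique reporters" query now runs quickly
SELECT COUNT(DISTINCT reporter) FROM "dynamic"."D2624Spot";

COMMIT;

-- VACUUM, by contrast, would fail at the same point with:
--   ERROR: VACUUM cannot run inside a transaction block
```

Concurrent ANALYZE commands on the same table from several transactions should be safe in principle (ANALYZE only rewrites statistics), but as noted above this needs to be verified under production load.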
comment:4 by , 19 years ago
Priority: | major → critical |
---|
comment:5 by , 19 years ago
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
Hmm... it might have been a problem with running out of disk space. Both queries are executing fine right now, and I had a disk problem the day this error was reported. So, if nobody else is seeing the same error, I think this ticket can be closed.