Opened 18 years ago

Last modified 17 years ago

#294 closed enhancement

Performance testing: Plugin-in execution time — at Version 16

Reported by: Johan Enell
Owned by: Nicklas Nordborg
Priority: blocker
Milestone: BASE 2.5
Component: coreplugins
Version: trunk
Keywords:
Cc:

Description (last modified by Nicklas Nordborg)

The core plug-ins are slow. Just by running plug-ins on the demo server I've noticed that they are a lot slower than the plug-ins from BASE 1.

I've tested the intensity calculator and Lowess, but I think this affects every plug-in. More tests should be done to find out whether there is a bottleneck, and it would be nice if we could create a best-practice document.

See also #796 (Don't create indexes...) and #797 (Enhance performance for LOWESS...)

Change History (20)

comment:1 by Jari Häkkinen, 18 years ago

Milestone: BASE 2.x+ → BASE 2.1

comment:2 by Martin Svensson, 18 years ago

Owner: changed from base to Johan Enell

comment:3 by Jari Häkkinen, 18 years ago

Milestone: BASE 2.1 → BASE 2.2
Summary: Plugin-in execution time → Performance testing: Plugin-in execution time

comment:4 by Jari Häkkinen, 18 years ago

Milestone: BASE 2.2 → BASE 2.x+

comment:5 by Jari Häkkinen, 18 years ago

Milestone: BASE 2.x+ → BASE 2.4

comment:6 by Jari Häkkinen, 18 years ago

Performance tuning for the whole application is also needed.

comment:7 by Jari Häkkinen, 17 years ago

Milestone: BASE 2.4 → BASE 2.3

Milestone BASE 2.4 deleted

comment:8 by Nicklas Nordborg, 17 years ago

Priority: major → critical

comment:9 by Nicklas Nordborg, 17 years ago

Owner: changed from Johan Enell to Nicklas Nordborg
Status: new → assigned

I will start to create a BASE 2 test framework that can be used to:

  • Import reporters
  • Create one array design
  • Import a given number of raw bioassays from one or two data files
  • Create an experiment and add the raw bioassays to it
  • Create a root bioassayset from all raw bioassays
  • Create a filtered bioassayset from the root bioassayset
  • Normalize the filtered bioassayset with Lowess

Some parameters may be given on the command line that starts the test (for example the number of raw bioassays to create and which subtests to run). Some parameters may be given in a configuration file (for example parameters to the filter and lowess plug-ins).
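A minimal sketch of how such a test driver could be organized, written in Python for brevity rather than as actual BASE 2 code. The subtest names, the `--bioassays` and `--subtests` options, and the `steps` callables are all hypothetical placeholders; nothing here is part of the BASE API.

```python
import argparse
import time

# Hypothetical subtest names mirroring the steps listed above, in run order.
SUBTESTS = ["reporters", "arraydesign", "rawbioassays", "experiment",
            "root", "filter", "lowess"]


def parse_args(argv):
    """Parse the (assumed) command-line options of the test driver."""
    parser = argparse.ArgumentParser(description="Plug-in performance test (sketch)")
    parser.add_argument("--bioassays", type=int, default=10,
                        help="number of raw bioassays to create")
    parser.add_argument("--subtests", default=",".join(SUBTESTS),
                        help="comma-separated list of subtests to run")
    args = parser.parse_args(argv)
    selected = args.subtests.split(",")
    unknown = [name for name in selected if name not in SUBTESTS]
    if unknown:
        parser.error("unknown subtests: " + ", ".join(unknown))
    return args.bioassays, selected


def run_subtests(selected, steps, num_bioassays):
    """Run the selected steps in the fixed SUBTESTS order.

    `steps` maps a subtest name to a callable taking the number of bioassays.
    Returns a dict of wall-clock seconds per executed subtest.
    """
    timings = {}
    for name in SUBTESTS:
        if name not in selected:
            continue
        start = time.perf_counter()
        steps[name](num_bioassays)
        timings[name] = time.perf_counter() - start
    return timings
```

The steps always run in the listed order regardless of how they are given on the command line, since each one depends on data created by an earlier one. Plug-in parameters (filter expression, Lowess settings) would come from the configuration file and are left out of the sketch.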

I expect someone with more knowledge about BASE 1 and microarray analysis to help me with selecting proper parameter values, setting up a BASE 1 server for comparison, etc.

comment:10 by Nicklas Nordborg, 17 years ago

(In [3659]) References #294: Performance testing

Added framework for testing BASE 2 plug-ins.

comment:11 by Nicklas Nordborg, 17 years ago

Milestone: BASE 2.4 → BASE 2.5

comment:12 by Nicklas Nordborg, 17 years ago

Owner: Nicklas Nordborg removed
Status: assigned → new

comment:13 by Jari Häkkinen, 17 years ago

Milestone: BASE 2.5 → BASE 2.x+

comment:14 by Nicklas Nordborg, 17 years ago

Description: modified (diff)
Milestone: BASE 2.x+ → BASE 2.5
Owner: set to Nicklas Nordborg
Priority: critical → blocker
Status: new → assigned

Johan V-C has done some interesting tests with the following plug-ins:

  • Create root bioassayset. The same 55K raw data set has been used for 10-40 bioassays. Median FG - Median BG was used for intensity values
  • Jep filter plugin: ch(1) > 0 && ch(2) > 0 && raw('flag') == 0
  • Lowess plugin: step=0.1, window size=0.33, iterations=4, blockgroup size=1

Corresponding tests were done on the BASE 1 server. The results show that BASE 1 execution time is linear with respect to the number of bioassays. With BASE 2 only the filter plug-in is linear, but it takes about 3 times longer. The other plug-ins seem to have a random execution time that is 20-50 times longer than in BASE 1.
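One way to put a number on "linear with respect to the number of bioassays" is a least-squares line fit and its R² value. The sketch below is only an illustration: the timing values are invented placeholders, not the measurements from these tests, which are in the attached plots.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit of ys = intercept + slope * xs.

    Returns (slope, intercept, r_squared). An r_squared close to 1 means
    the execution time grows linearly with the number of bioassays.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1 - ss_res / ss_tot


# Placeholder data (NOT measured values): seconds per run for 10-40 bioassays.
bioassays = [10, 20, 30, 40]
steady_times = [12.0, 24.0, 36.0, 48.0]        # grows in proportion to the input
erratic_times = [300.0, 1100.0, 700.0, 2000.0]  # no clear trend

steady = linear_fit(bioassays, steady_times)
erratic = linear_fit(bioassays, erratic_times)
```

Comparing the slopes of two linear series (for example the filter plug-in in BASE 1 and BASE 2) gives the per-bioassay cost ratio directly, while a low R² flags a series where the cost is dominated by something other than the input size.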

by Nicklas Nordborg, 17 years ago

BASE 1 performance with respect to number of bioassays

by Nicklas Nordborg, 17 years ago

BASE 2 performance with respect to number of bioassays

comment:15 by Nicklas Nordborg, 17 years ago

I ran some tests at home and saw the same pattern, but I also noticed that BASE 2 creates "extra" indexes on the same columns that make up the primary key for all tables in the dynamic database. When I removed the indexes, the execution time for the filter plug-in dropped to about half, and the time for the root bioassayset creation dropped to the same level as the filter plug-in. The Lowess plug-in is faster, but not by much. I suspect the problem there is that the number of SELECT statements issued depends a lot on the 'blockgroup size' parameter. Decreasing the number of SELECT statements would make it a lot faster.
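The redundant-index situation can be reproduced in miniature. The sketch below uses SQLite from the Python standard library as a stand-in for the real database server, and the `spots` table, its columns and the `spots_pk_copy` index are hypothetical names, not the actual dynamic database schema. It finds explicitly created indexes that cover exactly the primary key columns and drops them.

```python
import sqlite3


def index_columns(conn, index_name):
    """Return the column names of an index, in index order."""
    rows = conn.execute(f'PRAGMA index_info("{index_name}")').fetchall()
    return [row[2] for row in rows]


def redundant_indexes(conn, table):
    """List explicitly created indexes that duplicate the primary key index."""
    indexes = conn.execute(f'PRAGMA index_list("{table}")').fetchall()
    # Column 1 is the index name; column 3 is its origin:
    # 'pk' = primary key, 'c' = CREATE INDEX.
    pk_columns = [index_columns(conn, row[1]) for row in indexes if row[3] == "pk"]
    return [row[1] for row in indexes
            if row[3] == "c" and index_columns(conn, row[1]) in pk_columns]


conn = sqlite3.connect(":memory:")
# Hypothetical table with a composite primary key.
conn.execute(
    "CREATE TABLE spots (cube INTEGER, layer INTEGER, position INTEGER, "
    "ch1 REAL, ch2 REAL, PRIMARY KEY (cube, layer, position))"
)
# The "extra" index: same columns, same order as the primary key.
conn.execute("CREATE INDEX spots_pk_copy ON spots (cube, layer, position)")

found = redundant_indexes(conn, "spots")
for name in found:
    conn.execute(f'DROP INDEX "{name}"')
remaining = redundant_indexes(conn, "spots")
```

Every insert into such a table has to update both the primary key index and its copy, while lookups gain nothing from the second index, which is consistent with the write-heavy plug-ins speeding up once it is removed.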

by Nicklas Nordborg, 17 years ago

Attachment: perftest.txt added

Test results when running with/without index on primary key columns

comment:16 by Nicklas Nordborg, 17 years ago

Description: modified (diff)

by Nicklas Nordborg, 17 years ago

Attachment: normalize.txt added

Normalization results after #797 has been fixed
