Opened 12 years ago

Closed 11 years ago

#294 closed enhancement (fixed)

Performance testing: Plugin-in execution time

Reported by: enell
Owned by: nicklas
Priority: blocker
Milestone: BASE 2.5
Component: coreplugins
Version: trunk
Keywords:
Cc:

Description (last modified by nicklas)

The core plug-ins are slow. Just by running plug-ins on the demo server I've noticed that they are a lot slower than the plug-ins from BASE 1.

I've tested the intensity calculator and Lowess, but I think this affects every plug-in. Some more tests should be done to find out whether there is a bottleneck, and it would be nice if we could create a best-practice document.

See also #796 (Don't create indexes...) and #797 (Enhance performance for LOWESS...)

Attachments (4)

BASE1_prestanda_skalning.png (47.2 KB) - added by nicklas 11 years ago.
BASE 1 performance with respect to number of bioassays
BASE2_prestanda_skalning.png (50.6 KB) - added by nicklas 11 years ago.
BASE 2 performance with respect to number of bioassays
perftest.txt (819 bytes) - added by nicklas 11 years ago.
Test results when running with/without index on primary key columns
normalize.txt (740 bytes) - added by nicklas 11 years ago.
Normalization results after #797 has been fixed

Change History (21)

comment:1 Changed 12 years ago by jari

  • Milestone changed from BASE 2.x+ to BASE 2.1

comment:2 Changed 12 years ago by martin

  • Owner changed from base to enell

comment:3 Changed 12 years ago by jari

  • Milestone changed from BASE 2.1 to BASE 2.2
  • Summary changed from Plugin-in execution time to Performance testing: Plugin-in execution time

comment:4 Changed 12 years ago by jari

  • Milestone changed from BASE 2.2 to BASE 2.x+

comment:5 Changed 12 years ago by jari

  • Milestone changed from BASE 2.x+ to BASE 2.4

comment:6 Changed 12 years ago by jari

Performance tuning for the whole application is also needed.

comment:7 Changed 11 years ago by jari

  • Milestone changed from BASE 2.4 to BASE 2.3

Milestone BASE 2.4 deleted

comment:8 Changed 11 years ago by nicklas

  • Priority changed from major to critical

comment:9 Changed 11 years ago by nicklas

  • Owner changed from enell to nicklas
  • Status changed from new to assigned

I will start to create a BASE 2 test framework that can be used to:

  • Import reporters
  • Create one array design
  • Import a given number of raw bioassays from one or two data files
  • Create an experiment and add the raw bioassays to it
  • Create a root bioassayset from all raw bioassays
  • Create a filtered bioassayset from the root bioassayset
  • Normalize the filtered bioassayset with Lowess

Some parameters may be given on the command line that starts the test (for example the number of raw bioassays to create and which subtests to run). Some parameters may be given in a configuration file (for example parameters to the filter and lowess plug-ins).
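A minimal sketch of how such a test driver could be structured, written in Python only to keep it short. This is not the framework added for this ticket: the option names (`--bioassays`, `--subtests`), the subtest names, and the configuration layout are all assumptions made for illustration.

```python
import argparse
import configparser
import time

# Hypothetical subtest names, one per step in the list above.
SUBTESTS = ["reporters", "arraydesign", "rawbioassays", "experiment",
            "root", "filter", "lowess"]


def parse_args(argv):
    """Read the number of raw bioassays and the subtests to run."""
    parser = argparse.ArgumentParser(description="Plug-in timing driver (sketch)")
    parser.add_argument("--bioassays", type=int, default=10,
                        help="number of raw bioassays to create")
    parser.add_argument("--subtests", nargs="+", choices=SUBTESTS,
                        default=SUBTESTS, help="which subtests to run")
    return parser.parse_args(argv)


def load_plugin_params(text):
    """Parse plug-in parameters from INI-style configuration text."""
    cfg = configparser.ConfigParser()
    cfg.read_string(text)
    return {section: dict(cfg[section]) for section in cfg.sections()}


def run_subtests(names, steps, bioassays):
    """Run each named step once and return {name: elapsed seconds}."""
    timings = {}
    for name in names:
        start = time.perf_counter()
        steps[name](bioassays)  # each step is a callable taking the bioassay count
        timings[name] = time.perf_counter() - start
    return timings
```

Here `steps` would map each subtest name to a function that performs the real work against the server; the sketch only measures wall-clock time around each call.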

I expect someone with more knowledge about BASE 1 and microarray analysis to help me with selecting proper parameter values, setting up a BASE 1 server for comparison, etc.

comment:10 Changed 11 years ago by nicklas

(In [3659]) References #294: Performance testing

Added framework for testing BASE 2 plug-ins.

comment:11 Changed 11 years ago by nicklas

  • Milestone changed from BASE 2.4 to BASE 2.5

comment:12 Changed 11 years ago by nicklas

  • Owner nicklas deleted
  • Status changed from assigned to new

comment:13 Changed 11 years ago by jari

  • Milestone changed from BASE 2.5 to BASE 2.x+

comment:14 Changed 11 years ago by nicklas

  • Description modified (diff)
  • Milestone changed from BASE 2.x+ to BASE 2.5
  • Owner set to nicklas
  • Priority changed from critical to blocker
  • Status changed from new to assigned

Johan V-C has done some interesting tests with the following plug-ins:

  • Create root bioassayset. The same 55K raw data set has been used for 10-40 bioassays. Median FG - Median BG was used for intensity values
  • Jep filter plugin: ch(1) > 0 && ch(2) > 0 && raw('flag') == 0
  • Lowess plugin: step=0.1, window size=0.33, iterations=4, blockgroup size=1
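To make the filter condition above concrete, here is a small Python sketch of the same test applied to a few spots. It assumes that ch(1) and ch(2) are the two channel intensities computed as Median FG - Median BG and that raw('flag') is the flag value from the raw data. The function names and the sample values are hypothetical and are not taken from the test data set.

```python
def intensity(median_fg, median_bg):
    """Background-corrected intensity: Median FG - Median BG."""
    return median_fg - median_bg


def passes_filter(ch1, ch2, flag):
    """Python equivalent of: ch(1) > 0 && ch(2) > 0 && raw('flag') == 0."""
    return ch1 > 0 and ch2 > 0 and flag == 0


# Hypothetical spots: (ch1 median FG, ch1 median BG, ch2 median FG, ch2 median BG, flag)
spots = [
    (1200, 300, 900, 250, 0),  # both channels positive, not flagged
    (280, 300, 900, 250, 0),   # channel 1 falls below background
    (1200, 300, 900, 250, 1),  # flagged in the raw data
]

kept = [s for s in spots
        if passes_filter(intensity(s[0], s[1]), intensity(s[2], s[3]), s[4])]
```

Only the first spot is kept: the second has a non-positive channel 1 intensity after background subtraction and the third is flagged.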

Corresponding tests were done on the BASE 1 server. The results show that BASE 1 execution time is linear with respect to the number of bioassays. With BASE 2 only the filter plug-in is linear, but it takes about 3 times longer. The other plug-ins seem to have a random execution time that is 20-50 times longer than in BASE 1.
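One way to put a number on "linear" versus "random" is to fit a straight line to execution time against the number of bioassays and look at the coefficient of determination. The sketch below is plain Python and is only an assumption about how such a check could be done; it is not how the attached plots were produced, and the timings in it are made up for illustration.

```python
def linear_fit(xs, ys):
    """Least-squares line through (xs, ys); returns (slope, intercept, r_squared)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Hypothetical timings in seconds for 10, 20, 30 and 40 bioassays.
bioassays = [10, 20, 30, 40]
steady = [5.0, 10.0, 15.0, 20.0]      # grows in proportion to the input
erratic = [50.0, 400.0, 120.0, 900.0]  # no clear relation to the input

steady_fit = linear_fit(bioassays, steady)
erratic_fit = linear_fit(bioassays, erratic)
```

An r_squared close to 1 indicates that a straight line describes the timings well; a clearly lower value matches the "random" behaviour described above.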

Changed 11 years ago by nicklas

BASE 1 performance with respect to number of bioassays

Changed 11 years ago by nicklas

BASE 2 performance with respect to number of bioassays

comment:15 Changed 11 years ago by nicklas

I made some tests at home and discovered the same pattern, but I also noticed that BASE 2 creates "extra" indexes on the same columns that make up the primary key for all tables in the dynamic database. When I removed the indexes, the execution time for the filter plug-in dropped to about half, and for the root bioassayset creation the time dropped to be equal to the time for the filter plug-in. The Lowess plug-in is faster, but not by much. I suspect that the problem here is that the number of issued SELECT SQL statements depends a lot on the 'blockgroup size' parameter. Decreasing the number of SELECT statements would make it a lot faster.
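The kind of duplicate index described above can be detected from the database catalog. The following Python sketch uses an in-memory SQLite database purely so that it is self-contained; the ticket does not say which database the tests ran on, and the table and index names (`spots`, `idx_spots_pk`, `idx_spots_ch1`) are hypothetical, not the real dynamic database schema.

```python
import sqlite3


def primary_key_columns(conn, table):
    """Return the primary key column names of `table`, in key order."""
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    # Each row is (cid, name, type, notnull, dflt_value, pk); pk is the
    # 1-based position within the primary key, or 0 if not part of it.
    key_rows = sorted((r for r in rows if r[5] > 0), key=lambda r: r[5])
    return [r[1] for r in key_rows]


def redundant_indexes(conn, table):
    """Return explicitly created indexes that cover exactly the primary key columns."""
    pk_cols = primary_key_columns(conn, table)
    redundant = []
    for row in conn.execute(f"PRAGMA index_list({table})").fetchall():
        name, origin = row[1], row[3]
        if origin != "c":  # skip indexes SQLite made itself for PRIMARY KEY/UNIQUE
            continue
        cols = [r[2] for r in conn.execute(f"PRAGMA index_info({name})")]
        if cols == pk_cols:
            redundant.append(name)
    return redundant


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE spots (cube INTEGER, layer INTEGER, position INTEGER, "
    "ch1 REAL, ch2 REAL, PRIMARY KEY (cube, layer, position))"
)
# An extra index on the same columns as the primary key, as described above.
conn.execute("CREATE INDEX idx_spots_pk ON spots (cube, layer, position)")
# An index on another column, which is not redundant.
conn.execute("CREATE INDEX idx_spots_ch1 ON spots (ch1)")

print(redundant_indexes(conn, "spots"))
```

Every such index has to be updated on each insert in addition to the primary key itself, which is consistent with the drop in execution time seen when the indexes were removed.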

Changed 11 years ago by nicklas

Test results when running with/without index on primary key columns

comment:16 Changed 11 years ago by nicklas

  • Description modified (diff)

Changed 11 years ago by nicklas

Normalization results after #797 has been fixed

comment:17 Changed 11 years ago by nicklas

  • Resolution set to fixed
  • Status changed from assigned to closed