Opened 13 years ago
Closed 11 years ago
#294 closed enhancement (fixed)
Performance testing: Plugin-in execution time
Reported by: | Johan Enell | Owned by: | Nicklas Nordborg |
---|---|---|---|
Priority: | blocker | Milestone: | BASE 2.5 |
Component: | coreplugins | Version: | trunk |
Keywords: | Cc: |
Description (last modified by )
The core plug-ins are slow. Just by running plug-ins on the demo server I've noticed that they are alot slower then the plug-ins from base1.
I've tested the intenisty calculator and Lowess but I think this affects every plug-in. Maybe some more tests should be done to locate if there is a bottleneck and it would be nice if we could create a best practice document.
See also #796 (Don't create indexes...) and #797 (Enhance performance for LOWESS...)
Attachments (4)
Change History (21)
comment:1 Changed 13 years ago by
Milestone: | BASE 2.x+ → BASE 2.1 |
---|
comment:2 Changed 13 years ago by
Owner: | changed from base to Johan Enell |
---|
comment:3 Changed 13 years ago by
Milestone: | BASE 2.1 → BASE 2.2 |
---|---|
Summary: | Plugin-in execution time → Performance testing: Plugin-in execution time |
comment:4 Changed 12 years ago by
Milestone: | BASE 2.2 → BASE 2.x+ |
---|
comment:5 Changed 12 years ago by
Milestone: | BASE 2.x+ → BASE 2.4 |
---|
comment:6 Changed 12 years ago by
comment:8 Changed 12 years ago by
Priority: | major → critical |
---|
comment:9 Changed 12 years ago by
Owner: | changed from Johan Enell to Nicklas Nordborg |
---|---|
Status: | new → assigned |
I will start to create a BASE 2 test framework that can be used to:
- Import reporters
- Create one array design
- Import a given number of raw bioassays from one or two data files
- Create an experiment and add the raw bioassays to it
- Create a root bioassayset from all raw bioassays
- Create a filtered bioassayset from the root bioassayset
- Normalize the filtered bioassayset with Lowess
Some parameters may be given on the command line that starts the test (for example the number of raw bioassays to create and which subtests to run). Some parameters may be given in a configuration file (for example parameters to the filter and lowess plug-ins).
I expect someone with more knowledge about BASE 1 and microarray analysis to help me with selecing proper parameter values, setting up a BASE 1 server for comparison, etc.
comment:10 Changed 12 years ago by
comment:11 Changed 12 years ago by
Milestone: | BASE 2.4 → BASE 2.5 |
---|
comment:12 Changed 12 years ago by
Owner: | Nicklas Nordborg deleted |
---|---|
Status: | assigned → new |
comment:13 Changed 12 years ago by
Milestone: | BASE 2.5 → BASE 2.x+ |
---|
comment:14 Changed 11 years ago by
Description: | modified (diff) |
---|---|
Milestone: | BASE 2.x+ → BASE 2.5 |
Owner: | set to Nicklas Nordborg |
Priority: | critical → blocker |
Status: | new → assigned |
Johan V-C has done some interesting test with the following plug-ins:
- Create root bioassayset. The same 55K raw data set has been used for 10-40 bioassays. Median FG - Median BG was used for intensity values
- Jep filter plugin: ch(1) > 0 && ch(2) > 0 && raw('flag') == 0
- Lowess plugin: step=0.1, window size=0.33, iterations=4, blockgroup size=1
Corresponding tests were done on the BASE 1 server. The results show that BASE 1 execution time is linear with respect to the number of bioassays. With BASE 2 only the filter plug-in is linear but takes about 3 times longer. The other plug-ins seems to have a random execution time that is 20-50 times longer than in BASE 1.
Changed 11 years ago by
Attachment: | BASE1_prestanda_skalning.png added |
---|
BASE 1 performance with respect to number of bioassays
Changed 11 years ago by
Attachment: | BASE2_prestanda_skalning.png added |
---|
BASE 2 performance with respect to number of bioassays
comment:15 Changed 11 years ago by
I made some tests at home and discovered the same pattern, but I also noticed that BASE 2 creates "extra" indexes on the same columns that make up the primary key for all tables in the dynamic database. When I removed the indexes the execution time for the filter plug-in dropped to about half and for the root bioassayset creation the time dropped to be equal to the time for the filter plug-in. The Lowess plug-in is faster, but not by much. I suspect that the problem here is that the number of issued SELECT SQL statements depends a lot on the 'blockgroup size' parameter. Descreasing the number of SELECT statements would make it a lot faster
Changed 11 years ago by
Attachment: | perftest.txt added |
---|
Test results when running with/without index on primary key columns
comment:16 Changed 11 years ago by
Description: | modified (diff) |
---|
Changed 11 years ago by
Attachment: | normalize.txt added |
---|
Normalization results after #797 has been fixed
comment:17 Changed 11 years ago by
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
Performance tuning for the whole application is also needed.