Opened 16 years ago

Closed 16 years ago

#939 closed defect (fixed)

BioAssaySetExporter doesn't export BASEfiles for multiple array designs correctly

Reported by: base
Owned by: everyone
Priority: minor
Milestone: BASE 2.6.1
Component: coreplugins
Version:
Keywords:
Cc:

Description (last modified by Jari Häkkinen)

I am finding problems with BASEfile export, averaged over reporter or not. It's something to do with the select "_position" in the spot data query I think. But I do not have a patch for this problem.

I will try to attach a gzipped BASE file (averaged over reporter, made by the base1pluginexecutor for hierarchical clustering) where you can see the main chunk of 19,388 averaged reporters (I think the data is correct) and then all the other reporters again (several times). The file should only have around 20k lines.

When I filter the bioassayset to include just one array design, it works fine.

Analysis of this multi-array design expt has been working fine in BASE. We have a 4k array and a 20k array (all 4k reporters are on the 20k array). I'll assign it minor priority because we only have one expt like this!

More details gladly supplied.

Bob MacCallum.

Attachments (2)

new.bioassayset.export.txt (702.5 KB ) - added by base 16 years ago.
NOT averaged on reporter
new.bioassayset.export.reporteraveraged.txt (661.2 KB ) - added by base 16 years ago.
AVERAGED on reporter

Change History (12)

comment:1 by base, 16 years ago

oh dear, it was just over the 4MB limit...

email me and i'll send it (see recent mailing list posts for address)

cheers, Bob

comment:2 by Nicklas Nordborg, 16 years ago

From your description I don't understand what the problem is. Do you think that you can create a smaller experiment and a smaller file? For example, with a single 20k array and a single 4k array.

I checked the code, and I think there is a problem when merging on reporters that is related to the 'position' numbers. The query still selects the 'position' values, but when merging, the position is no longer well defined. I think MySQL just selects a random value among the reporters that were merged. Users running Postgres will probably get an error message. I remember a similar problem in the experiment explorer code, a long time ago, when the 'average' option was selected. Anyway, this may not be the same problem that you see, since I think it could be a problem even if only one array design has been used.
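
To illustrate the ambiguity described above, here is a minimal sketch using Python's built-in sqlite3 module. The `spots` table, its three columns, and the sample values are assumptions made up for this example; they are not the BASE schema or the exporter's actual query. Once rows are grouped by reporter, a bare `position` column has no single value, so a well-defined query either drops it or aggregates it.

```python
import sqlite3

# Hypothetical table for illustration only (not the BASE schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE spots (reporter TEXT, position INTEGER, intensity REAL)")
conn.executemany(
    "INSERT INTO spots VALUES (?, ?, ?)",
    [("R1", 1, 10.0), ("R1", 501, 14.0), ("R2", 2, 7.0)],
)

# Ambiguous form: 'position' is neither grouped nor aggregated, so R1 could
# report 1 or 501. Lenient databases pick some value; strict ones reject it.
#   SELECT reporter, position, AVG(intensity) FROM spots GROUP BY reporter

# Well-defined form: select only grouped or aggregated columns.
well_defined = (
    "SELECT reporter, AVG(intensity) FROM spots "
    "GROUP BY reporter ORDER BY reporter"
)
merged = conn.execute(well_defined).fetchall()
print(merged)  # [('R1', 12.0), ('R2', 7.0)]
```

If a position is still needed in the output, an explicit aggregate such as `MIN(position)` would at least make the choice deterministic.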

comment:3 by base, 16 years ago

I'll make a smaller demo tomorrow (good suggestion) and get back to you. Thanks.

by base, 16 years ago

Attachment: new.bioassayset.export.txt added

NOT averaged on reporter

by base, 16 years ago

Attachment: new.bioassayset.export.reporteraveraged.txt added

AVERAGED on reporter

comment:4 by base, 16 years ago

Just added two exported BASEfiles.

The first, you could argue, is correct, since it has the same number of spot data lines as specified in the "count" header.

The second has definitely not averaged over reporter correctly.
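
One quick way to check an averaged export for this symptom (the same reporter appearing on more than one spot data line) is sketched below. It assumes the spot rows have already been parsed into tuples with the reporter ID in the first column; parsing the BASEfile itself is not shown, and `duplicated_reporters` is a hypothetical helper name.

```python
from collections import Counter

def duplicated_reporters(rows, reporter_column=0):
    """Return {reporter: occurrences} for reporters seen more than once."""
    counts = Counter(row[reporter_column] for row in rows)
    return {rep: n for rep, n in counts.items() if n > 1}

# Made-up rows: after averaging on reporter, each ID should appear once.
rows = [("R1", 12.0), ("R2", 7.0), ("R1", 12.5)]
dupes = duplicated_reporters(rows)
print(dupes)  # {'R1': 2}
```

An empty result means every reporter occurs exactly once, which is what a correctly averaged export should give.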

comment:5 by Nicklas Nordborg, 16 years ago

I think that this issue may be related to #941. I am not very familiar with the BASEfile format and don't know what to look for. The "merge on reporters" case, where the number of spots in the header doesn't match the actual number of rows, can be explained by #941. If there are problems other than that, there are probably other bugs in the export as well.

comment:6 by base, 16 years ago

I agree that addressing #941 will almost certainly fix this problem.


comment:7 by Nicklas Nordborg, 16 years ago

(In [4167]) Fixes #941: SQLException when exporting "merged on reporters" with BioAssaySet exporter on Postgres
References #939: BioAssaySetExporter doesn't export BASEfiles for multiple array designs correctly

comment:8 by base, 16 years ago

I will test this soon. Thanks for the fix.


comment:9 by base, 16 years ago

I have just done a quick test again on an experiment with multiple array designs. Exporting to Normal BASEfile.

With Average on reporters, it seems to work fine, and indeed I have been using the base1 hierarchical clustering plugin several times with no problems.

With no average on reporters, it also seems to work. The data doesn't look pretty but that may be due to some issue with my array designs (different 'positions' or something).


comment:10 by Jari Häkkinen, 16 years ago

Description: modified (diff)
Milestone: BASE 2.6.1
Resolution: fixed
Status: new → closed