Opened 17 years ago

Closed 17 years ago

#894 closed enhancement (fixed)

Support for array designs with other feature identification methods than coordinates

Reported by: Nicklas Nordborg
Owned by: Nicklas Nordborg
Priority: blocker
Milestone: BASE 2.6
Component: core
Version:
Keywords:
Cc:

Description

This ticket replaces #867. That ticket had a slightly different angle, and its proposed solution didn't reflect reality.

We now have more information about the Illumina platform. As it turns out, each array has a unique ID for each feature on it. The uniqueness is probably global, but the ID is not a good candidate for the reporter ID, because different feature IDs may point to the same reporter ID. For example:

FeatId1 --> RepId1
FeatId2 --> RepId2
FeatId3 --> RepId1

It has not been verified whether the same feature ID on another array design can point to a different reporter, but it doesn't matter with the proposed solution.
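One way to picture why this doesn't matter: if the feature ID is only ever looked up within a single array design, the same ID can map to different reporters on different designs without conflict. The sketch below is only an illustration of that scoping. The class and method names are hypothetical and not part of BASE.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration, not BASE code: feature IDs are scoped per array design.
final class FeatureToReporterSketch {
    // Outer key: array design name. Inner map: feature ID -> reporter ID.
    private final Map<String, Map<String, String>> byDesign = new HashMap<>();

    // Register that a feature on the given design points to a reporter.
    void map(String design, String featureId, String reporterId) {
        byDesign.computeIfAbsent(design, d -> new HashMap<>()).put(featureId, reporterId);
    }

    // Look up the reporter for a feature ID within one design; null if unknown.
    String reporterOf(String design, String featureId) {
        Map<String, String> features = byDesign.get(design);
        return features == null ? null : features.get(featureId);
    }
}
```

Several feature IDs on one design may share a reporter (as in the example above), and the same feature ID on two designs may resolve to different reporters.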

So, we need an extra column in the Features table that can hold a string value. (The feature ID is actually an integer, but using strings is more future-proof.) This value should replace the row and column coordinates as the identifier for the feature. Changes that we need to make:

  • Add the ArrayDesign.getFeatureIdentifier() method, which returns an enum: COORDINATE, EXTERNAL_ID. Maybe we can also add an option to use POSITION.
  • Remove the unique index on block, row and column in the Features table, since they no longer need to be unique. The FeatureBatcher must instead verify that each feature is unique with respect to the identification method used. It should still be possible to assign values to the columns that are used by the other identification methods, but the uniqueness of those columns will not be checked.
  • The RawDataBatcher must be made aware of the identification variants and be able to prompt for column mappings for each type. We should also think about compatibility issues. Users may want to mix data from different platforms using the same array design. The use case is that one scanner generates block, row, column coordinates and another scanner generates only the position number.
  • We need to think about what to do with the current IlluminaRawDataImporter. It can probably use the TargetID as the feature ID, but what about backwards compatibility with existing data? It probably still needs to support the fake coordinate generation.
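The uniqueness check described for the FeatureBatcher could look roughly like the sketch below. It is a minimal illustration under assumptions, not the actual implementation: the enum name FeatureIdentifier, the FeatureRow class, and the UniquenessChecker class are all hypothetical, and only the enum constants come from the list above.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical enum; the constants follow the proposal in this ticket.
enum FeatureIdentifier { COORDINATE, POSITION, EXTERNAL_ID }

// Hypothetical stand-in for one row in the Features table.
final class FeatureRow {
    final int block, row, column;
    final Integer position;   // may be null if the scanner gives no position
    final String externalId;  // may be null if the platform has no feature ID

    FeatureRow(int block, int row, int column, Integer position, String externalId) {
        this.block = block;
        this.row = row;
        this.column = column;
        this.position = position;
        this.externalId = externalId;
    }
}

// Checks uniqueness only for the identification method of the array design.
final class UniquenessChecker {
    private final FeatureIdentifier method;
    private final Set<String> seen = new HashSet<>();

    UniquenessChecker(FeatureIdentifier method) {
        this.method = method;
    }

    // Builds the key from the columns of the active method; other columns are ignored.
    String keyOf(FeatureRow f) {
        switch (method) {
            case COORDINATE:
                return f.block + ":" + f.row + ":" + f.column;
            case POSITION:
                if (f.position == null) throw new IllegalArgumentException("position is required");
                return String.valueOf(f.position);
            case EXTERNAL_ID:
                if (f.externalId == null) throw new IllegalArgumentException("external ID is required");
                return f.externalId;
            default:
                throw new IllegalStateException("unknown method: " + method);
        }
    }

    // Returns true if the feature was accepted, false if it duplicates an earlier one.
    boolean add(FeatureRow f) {
        return seen.add(keyOf(f));
    }
}
```

With EXTERNAL_ID as the method, two features may share block, row and column values as long as their external IDs differ. With COORDINATE, the external ID is stored but plays no part in the check.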

Change History (10)

comment:1 by Nicklas Nordborg, 17 years ago

Owner: changed from everyone to Nicklas Nordborg
Status: new → assigned

comment:2 by Nicklas Nordborg, 17 years ago

(In [4080]) References #894: Support for array designs with other feature identification methods than coordinates

The functionality is now implemented. Test programs run fine with the COORDINATES method. Needs testing with the other methods. RawBioAssay.updateArrayDesign needs to be recoded.

comment:3 by Nicklas Nordborg, 17 years ago

(In [4083]) References #894: Support for array designs with other feature identification methods than coordinates

Everything should now be fixed. May need some more testing before closing this ticket.

comment:4 by Nicklas Nordborg, 17 years ago

Type: defect → enhancement

comment:5 by Nicklas Nordborg, 17 years ago

(In [4106]) References #894. Fixes #908.

comment:6 by Nicklas Nordborg, 17 years ago

(In [4108]) References #894: Support for array designs with other feature identification methods than coordinates

Do not verify the reporter if null is passed as external ID. Pick the reporter automatically from the array design instead.

comment:7 by Martin Svensson, 17 years ago

(In [4109]) References #894 Allow 0 as a block number in BlockInfo

comment:8 by Martin Svensson, 17 years ago

(In [4110]) References #894 Removed an unnecessary value check. The check has already been done when parameter is created.

comment:9 by markus, 17 years ago

Regarding

"It is not verified if a feature ID on another array design with the same ID can point to different reporters, but it doesn't matter with the proposed solution."

quoted from the description of this ticket:

At least Illumina uses the same feature IDs for different reporters across different beadarray platforms. Sentrix expression beadarrays and Infinium SNP beadarrays use identical featureIDs for reporters that are different for the two platforms.

comment:10 by Nicklas Nordborg, 17 years ago

Resolution: fixed
Status: assigned → closed

Everything seems to be ok. Test programs run fine.
