Opened 17 years ago
Closed 17 years ago
#894 closed enhancement (fixed)
Support for array designs with other feature identification methods than coordinates
Reported by: | Nicklas Nordborg | Owned by: | Nicklas Nordborg |
---|---|---|---|
Priority: | blocker | Milestone: | BASE 2.6 |
Component: | core | Version: | |
Keywords: | Cc: |
Description
This ticket replaces #867. It had a slightly different angle, and the proposed solution didn't reflect reality.
We now have more information about the Illumina platform. As it turns out, each array has a unique ID for each feature on it. The uniqueness is probably global, but the ID is not a good candidate for the reporter ID, because different feature IDs may point to the same reporter ID. For example:
FeatId1 --> RepId1
FeatId2 --> RepId2
FeatId3 --> RepId1
It is not verified if a feature ID on another array design with the same ID can point to different reporters, but it doesn't matter with the proposed solution.
So, we need an extra column in the Features table that can hold a string value. (The feature ID is actually an integer, but using strings will be more future-proof.) This value should replace the row and column coordinates as the identifier for the feature. Changes that we need to make:
- Add an ArrayDesign.getFeatureIdentifier() method that returns an enum value: COORDINATE or EXTERNAL_ID. Maybe we can also add a POSITION option.
- Get rid of the unique index on block, row and column on the Features table, since they no longer need to be unique. The FeatureBatcher must instead verify that each feature is unique with respect to the identification method used. It should still be possible to assign values to columns that are used by the other identification methods. Their uniqueness will not be checked.
- The RawDataBatcher must be made aware of the identification variants and be able to prompt for column mappings for each type. We should also think about compatibility issues. Users may want to mix data from different platforms using the same array design. The use case is that one scanner generates block,row,column coordinates and another scanner only the position number.
- We need to think about what to do with the current IlluminaRawDataImporter. It can probably use the TargetID as the feature ID, but what about backwards compatibility with existing data? It probably still has to support the fake coordinate generation.
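The enum and the per-method uniqueness check described in the list above could look roughly like the sketch below. It is only an illustration under assumptions: `FeatureIdentifier`, `FeatureRow` and `FeatureUniquenessChecker` are hypothetical names, not the actual BASE classes, and the real FeatureBatcher would work against the database rather than an in-memory set.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical enum mirroring the values proposed in this ticket.
enum FeatureIdentifier { COORDINATE, POSITION, EXTERNAL_ID }

// Hypothetical stand-in for one row in the Features table.
final class FeatureRow {
    final int block;
    final int row;
    final int column;
    final Integer position;   // may be null if the scanner does not report it
    final String externalId;  // may be null for coordinate-based designs

    FeatureRow(int block, int row, int column, Integer position, String externalId) {
        this.block = block;
        this.row = row;
        this.column = column;
        this.position = position;
        this.externalId = externalId;
    }
}

// Checks uniqueness only for the columns used by the active identification method.
// Columns belonging to the other methods may hold any value, including duplicates.
final class FeatureUniquenessChecker {
    private final FeatureIdentifier method;
    private final Set<String> seen = new HashSet<>();

    FeatureUniquenessChecker(FeatureIdentifier method) {
        this.method = method;
    }

    // Builds the identifying key for a feature under the active method.
    String keyOf(FeatureRow f) {
        switch (method) {
            case COORDINATE:
                return f.block + ":" + f.row + ":" + f.column;
            case POSITION:
                if (f.position == null) {
                    throw new IllegalArgumentException("Position is required");
                }
                return String.valueOf(f.position);
            case EXTERNAL_ID:
                if (f.externalId == null) {
                    throw new IllegalArgumentException("External ID is required");
                }
                return f.externalId;
            default:
                throw new IllegalStateException("Unknown method: " + method);
        }
    }

    // Registers a feature, rejecting it if its key was already used.
    void add(FeatureRow f) {
        String key = keyOf(f);
        if (!seen.add(key)) {
            throw new IllegalArgumentException(
                "Duplicate feature for " + method + ": " + key);
        }
    }
}
```

With `EXTERNAL_ID` as the active method, two features may share block, row and column as long as their external IDs differ, which matches the requirement that the unused columns are not checked.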
Change History (10)
comment:1 by , 17 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
comment:2 by , 17 years ago
comment:3 by , 17 years ago
comment:4 by , 17 years ago
Type: | defect → enhancement |
---|
comment:6 by , 17 years ago
comment:8 by , 17 years ago
comment:9 by , 17 years ago
Regarding
"It is not verified if a feature ID on another array design with the same ID can point to different reporters, but it doesn't matter with the proposed solution."
quoted from the description of this ticket:
At least Illumina uses the same feature IDs for different reporters across different beadarray platforms. Sentrix expression beadarrays and Infinium SNP beadarrays use identical feature IDs for reporters that are different for the two platforms.
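This observation means a feature ID is only meaningful within one array design. A minimal sketch of a lookup scoped that way follows; `FeatureLookup` and the design names and IDs in the usage note are hypothetical and only illustrate the idea, not how BASE stores features.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical lookup that scopes feature IDs by array design, so the same
// feature ID can map to different reporters on different designs.
final class FeatureLookup {
    // array design name -> (feature ID -> reporter ID)
    private final Map<String, Map<String, String>> byDesign = new HashMap<>();

    void register(String arrayDesign, String featureId, String reporterId) {
        byDesign.computeIfAbsent(arrayDesign, d -> new HashMap<>())
                .put(featureId, reporterId);
    }

    // Returns the reporter ID, or null if the design or feature is unknown.
    String reporterFor(String arrayDesign, String featureId) {
        Map<String, String> features = byDesign.get(arrayDesign);
        return features == null ? null : features.get(featureId);
    }
}
```

For example, registering feature ID "1001" under a design named "expression" with reporter "RepA" and under a design named "snp" with reporter "RepB" keeps both mappings, and each lookup returns the reporter for its own design.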
comment:10 by , 17 years ago
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
Everything seems to be ok. Test programs run fine.
(In [4080]) References #894: Support for array designs with other feature identification methods than coordinates
The functionality is now implemented. Test programs run fine with the COORDINATES method. Needs testing with the other methods. RawBioAssay.updateArrayDesign needs to be recoded.