Illumina array slides are special in that they contain multiple "sections" on a single array chip. When hybridizing a slide, each section is hybridized with a different sample. This can already be represented in the database by selecting multiple labeled extracts and connecting them to the same hybridization. But this is actually a "misuse", since the assumption is that all labeled extracts are used on the entire slide and that they all have different labels. Since Illumina is a one-channel platform, the experiment validator will complain that the number of labeled extracts is not 1 and that they don't have different labels. Another problem is that it is not possible to know which raw bioassay is related to which labeled extract. In the real world there is a one-to-one relationship, but in BASE this is lost and it seems like all labeled extracts are associated with all raw bioassays. This makes the experiment overview hard to use and confusing, since all labeled extracts can be reached from all raw bioassays.
So, we need to add information about the number of sub-arrays on a slide (possibly already at the array design level). We need to be able to associate a specific labeled extract with a specific sub-array, and we also need to associate the related raw bioassay with the same sub-array.
If we could do this from scratch without having to worry about backwards compatibility it would probably be easy. I think the hard part is to maintain the backwards compatibility.
Updated implementation idea
- ArrayDesign and Hybridization each get an additional field: numArrays.
- When selecting Labeled extracts for a Hyb, the user can also specify which array(s) the labeled extract should go on. We will use the already existing 'dummy' column in the BioMaterialEventSources table for this. This exists because of a Hibernate quirk that would delete the Hyb->LE links if the usedQuantity was set to null. Luckily, we can now use this column to instead store the index number of the array the LE is linked to.
- As before, a single LE can only be linked once. This may mean that the combination of a 2-channel platform, a common reference and a multi-array hyb can't be represented as is. The solution in this case is to first split the common reference into separate LE:s, one for each array. This can be done with the 'pooling' function. It's not a perfect solution, but this is the only case that needs extra work.
- The RawBioassay gets an extra column, arrayNum, that holds the index number of the array it is connected to. This number will be used to connect the raw bioassay with the labeled extracts that have the same number in the Hyb->LE link described in the item on selecting labeled extracts above.
- The above information is in principle everything that is needed to connect a RBA with the correct LE:s. A remaining problem is that Hybridization.getAnnotatableParents() doesn't know whether we are interested in just the LE:s on one array or on the entire hybridization. Since we can't break the API, this method must still return all LE:s. We should add a new method, Hybridization.getAnnotatableParents(arrayNum), that returns only the LE:s on a specific array. The problem then becomes updating the client/plug-in code to call the right one of the two methods. In the BASE code there are two such places: inheriting annotations and the experiment overview, both discussed below.
- Inheriting annotations: This is in the web client code that displays the form where it is possible to select annotations to inherit. There is some generic code that needs to be complemented with a special case:
if (root instanceof RawBioAssay && current instanceof Hybridization)
{
   // Only load the LE:s on the same array as the raw bioassay
   ((Hybridization)current).getAnnotatableParents(((RawBioAssay)root).getArrayNum());
}
- Experiment overview: This has a cache that makes sure that getAnnotatableParents() is only called once for every item. This has to be changed since we may need to do this differently for multi-array hybridizations. Will be fixed in #921.
- Update from 2.5: The three new columns on ArrayDesign, Hyb and RawBioassay all get a default value of 1.
- Experiment validator: Needs to be updated to be aware of the array number on the RBA and Hyb->LE links and only load the appropriate subset of LE:s for each RBA. When that is done everything else should work as before. New validation rule: check that arrayNum values don't fall outside the range set by Hyb and/or ArrayDesign. Will be fixed in #921.
- Backwards compatibility: Older tools/plug-ins may incorrectly link a RBA with LE:s from other RBA:s since they are not aware of the array number and can't filter on it when loading LE:s from a Hyb.
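To make the linking and the new validation rule concrete, here is a minimal, self-contained sketch. None of these classes are the real BASE classes: MultiArraySketch, ExtractLink, Hyb, allExtracts, extractsOnArray and isValidArrayNum are hypothetical names made up for illustration. It also assumes array numbers are 1-based, which the default value of 1 suggests but the ticket doesn't state.

```java
import java.util.List;

// Hypothetical stand-ins for illustration only; NOT the real BASE classes.
class MultiArraySketch {

    // One Hyb->LE link. arrayNum plays the role of the value stored in
    // the former 'dummy' column (assumed to be 1-based).
    record ExtractLink(String labeledExtract, int arrayNum) {}

    // Simplified hybridization: the number of arrays plus its LE links.
    record Hyb(int numArrays, List<ExtractLink> links) {

        // Counterpart of the existing getAnnotatableParents():
        // returns every LE, regardless of array.
        List<String> allExtracts() {
            return links.stream().map(ExtractLink::labeledExtract).toList();
        }

        // Counterpart of the proposed getAnnotatableParents(arrayNum):
        // returns only the LE:s linked to the given array.
        List<String> extractsOnArray(int arrayNum) {
            return links.stream()
                    .filter(link -> link.arrayNum() == arrayNum)
                    .map(ExtractLink::labeledExtract)
                    .toList();
        }
    }

    // Proposed validation rule: an array number must lie within the
    // range given by numArrays (assumed here to be 1..numArrays).
    static boolean isValidArrayNum(int arrayNum, int numArrays) {
        return arrayNum >= 1 && arrayNum <= numArrays;
    }

    public static void main(String[] args) {
        Hyb hyb = new Hyb(2, List.of(
                new ExtractLink("LE-A", 1),
                new ExtractLink("LE-B", 2)));

        int rawBioAssayArrayNum = 2; // the RBA's arrayNum column

        // An array-unaware tool sees both extracts for every RBA.
        System.out.println(hyb.allExtracts());
        // An array-aware tool sees only the extract on the RBA's array.
        System.out.println(hyb.extractsOnArray(rawBioAssayArrayNum));
        // The validator would flag an RBA pointing at array 3 of 2.
        System.out.println(isValidArrayNum(3, hyb.numArrays()));
    }
}
```

The difference between allExtracts and extractsOnArray is also what the backwards compatibility item describes: a tool that only knows the first method will keep linking every RBA to every LE on the hybridization.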
Was just browsing the tickets and saw this. Just for info: Agilent are also making many arrays on one slide. I've seen one where four different samples can be hybed at once.
Bob.