Summary of changes and additions for supporting sequencing
==========================================================

1. New type of [MeasuredBioMaterial]: [Library]
This is a new Java class, but it is saved to the same database
table as the other biomaterial types (discriminator-value=5).
The parent type is [Extract].

A new [ProtocolType] is created by the installation:
'Library preparation'

A [Barcode] is needed when mixing multiple biomaterials on the
same lane. I see three options:

a) A new entity class: [Barcode]
b) Re-use [Label]
c) Use an annotation on [Library]

Which one should we choose? I guess it depends on what we need to
store. Do we need more than what can currently be stored for a label
(name and description)? An annotation can only store a single value.
If we re-use label, should we rename it to something more generic?
For example, [Tag], which may have a [TagType] attribute (e.g. 'Label',
'Barcode', etc.).

Hmmm... could we take this to the 'extreme' and rename [LabeledExtract]
instead of creating [Library]? For example, [TaggedExtract] has a [Tag]
which has a [TagType] that tells us what it is. Combined with the other
ticket (#1597: Subtypes of biomaterial items) this may be a more
future-proof solution...

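To make the [Tag]/[TagType] idea easier to discuss, here is a minimal
sketch of what it could look like. Nothing here is existing BASE code:
the class layout, the constructor arguments and the isLibrary() helper
are assumptions for illustration only, and persistence mapping is left
out entirely.

```java
import java.util.Objects;

// Hypothetical: tells us what kind of tag we are looking at.
enum TagType {
    LABEL,    // what [Label] is used for today (microarray)
    BARCODE   // needed when mixing multiple biomaterials on one lane
}

// Hypothetical generic replacement for [Label]: same name/description
// properties as a label, plus a type.
final class Tag {
    private final String name;
    private final String description;
    private final TagType type;

    Tag(String name, String description, TagType type) {
        this.name = Objects.requireNonNull(name, "name");
        this.description = description; // optional, may be null
        this.type = Objects.requireNonNull(type, "type");
    }

    String getName() { return name; }
    String getDescription() { return description; }
    TagType getType() { return type; }
}

// Hypothetical rename of [LabeledExtract]; a 'library' is then simply
// a tagged extract whose tag is a barcode.
final class TaggedExtract {
    private final String name;
    private final Tag tag;

    TaggedExtract(String name, Tag tag) {
        this.name = Objects.requireNonNull(name, "name");
        this.tag = Objects.requireNonNull(tag, "tag");
    }

    String getName() { return name; }
    Tag getTag() { return tag; }

    boolean isLibrary() { return tag.getType() == TagType.BARCODE; }
    boolean isLabeledExtract() { return tag.getType() == TagType.LABEL; }
}
```

With this layout no new biomaterial class is needed for sequencing,
but every place that currently assumes "tag = label" would have to
check the tag type instead.
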
Do we need some kind of 'mode' setting in the GUI so that it uses
the correct terminology as much as possible? The 'mode' setting
could also control parts of the 'Validate' functionality in the
'Item overview', which may need to work differently. This applies
particularly to the rules for 'number of channels', which are only
needed for microarray experiments.

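One way the 'mode' setting could drive both terminology and
validation is sketched below. This is only an assumption about how it
might be structured: the enum names, the display terms and the
MISSING_PROTOCOL rule are made up for the example, and only the
'number of channels' rule comes from the notes above.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical GUI mode; the display terms are placeholders.
enum ExperimentMode {
    MICROARRAY("Hybridization"),
    SEQUENCING("Sequencing run");

    private final String assayTerm;

    ExperimentMode(String assayTerm) { this.assayTerm = assayTerm; }

    // Term the GUI would show for a physical bioassay in this mode.
    String getAssayTerm() { return assayTerm; }
}

// Hypothetical validation rules, each knowing the modes it applies to.
enum ValidationRule {
    MISSING_PROTOCOL(EnumSet.allOf(ExperimentMode.class)),
    NUMBER_OF_CHANNELS(EnumSet.of(ExperimentMode.MICROARRAY));

    private final Set<ExperimentMode> modes;

    ValidationRule(Set<ExperimentMode> modes) { this.modes = modes; }

    boolean appliesTo(ExperimentMode mode) { return modes.contains(mode); }

    // All rules that 'Validate' should run in the given mode.
    static Set<ValidationRule> rulesFor(ExperimentMode mode) {
        Set<ValidationRule> active = EnumSet.noneOf(ValidationRule.class);
        for (ValidationRule rule : values()) {
            if (rule.appliesTo(mode)) {
                active.add(rule);
            }
        }
        return active;
    }
}
```

In sequencing mode, rulesFor() leaves out the channel rule, so the
'Item overview' would not report channel problems for libraries.
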
2. Changes in [BioMaterialEvent]

A new entity class [BioMaterialEventParticipant] is introduced
instead of the "anonymous" link-table 'BioMaterialEventSources'.
This should make it possible to get rid of the [UsedQuantity]
"ugly hack" that was used to support multi-array slides.
Existing information can easily be moved to the new tables.

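The point of turning the link table into an entity is that each
source/event pair can carry its own properties. A rough sketch,
assuming the participant stores a used quantity and a position (the
'position' field, the string stand-in for the source biomaterial and
all method names are assumptions, not the actual design):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Objects;

// One row of the former link table, now with room for its own data.
final class BioMaterialEventParticipant {
    private final String sourceName;  // stand-in for a biomaterial reference
    private final Float usedQuantity; // null = not recorded
    private final int position;       // hypothetical, e.g. array on a slide

    BioMaterialEventParticipant(String sourceName, Float usedQuantity, int position) {
        this.sourceName = Objects.requireNonNull(sourceName, "sourceName");
        this.usedQuantity = usedQuantity;
        this.position = position;
    }

    String getSourceName() { return sourceName; }
    Float getUsedQuantity() { return usedQuantity; }
    int getPosition() { return position; }
}

// Simplified event that owns its participants.
final class BioMaterialEvent {
    private final List<BioMaterialEventParticipant> participants = new ArrayList<>();

    BioMaterialEventParticipant addSource(String sourceName, Float usedQuantity, int position) {
        BioMaterialEventParticipant p =
            new BioMaterialEventParticipant(sourceName, usedQuantity, position);
        participants.add(p);
        return p;
    }

    List<BioMaterialEventParticipant> getParticipants() {
        return Collections.unmodifiableList(participants);
    }

    // Sum of all recorded quantities; unrecorded (null) values are skipped.
    float totalUsedQuantity() {
        float total = 0f;
        for (BioMaterialEventParticipant p : participants) {
            if (p.getUsedQuantity() != null) {
                total += p.getUsedQuantity();
            }
        }
        return total;
    }
}
```
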
We may need a new [BioMaterialEventType] (e.g. 'Sequencing').
Or is it better to change the name of the 'Hybridization' event
to, for example, 'BioAssayCreation'?

3. New entity class [PhysicalBioAssay] that replaces [Hybridization]

All hybridization data is moved to the new database table. The
information it can hold is more or less the same as before, but
it can be linked with any type of [MeasuredBioMaterial] and not
just [LabeledExtract].

A [PhysicalBioAssay] should be classified by a type ('Hybridization',
'Sequencing', etc.). This could be an entity class of its own
or a property/enum, or we could derive it from the 'creationEvent'
type.

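As an illustration of the last alternative (deriving the type from
the creation event), the sketch below keeps no type property on the
bioassay at all. The enum values and the constructor are assumptions;
a real implementation would hold a reference to the creation event
rather than just its type.

```java
import java.util.Objects;

// Hypothetical event types, including the proposed 'Sequencing'.
enum BioMaterialEventType { HYBRIDIZATION, SEQUENCING, OTHER }

// Hypothetical classification of a physical bioassay.
enum PhysicalBioAssayType { HYBRIDIZATION, SEQUENCING, UNKNOWN }

final class PhysicalBioAssay {
    private final String name;
    private final BioMaterialEventType creationEventType;

    PhysicalBioAssay(String name, BioMaterialEventType creationEventType) {
        this.name = Objects.requireNonNull(name, "name");
        this.creationEventType = Objects.requireNonNull(creationEventType, "creationEventType");
    }

    String getName() { return name; }

    // The type is not stored; it follows from how the bioassay was created.
    PhysicalBioAssayType getType() {
        switch (creationEventType) {
            case HYBRIDIZATION:
                return PhysicalBioAssayType.HYBRIDIZATION;
            case SEQUENCING:
                return PhysicalBioAssayType.SEQUENCING;
            default:
                return PhysicalBioAssayType.UNKNOWN;
        }
    }
}
```

The drawback is that every new kind of bioassay needs a matching
event type, which ties into the question above about renaming the
'Hybridization' event.
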
The [PhysicalBioAssay] should implement [FileStoreEnabled] so
that we can link files to it.

New [ProtocolType]: 'Sequencing'
New [HardwareType]: 'Sequencing station'
New [Hardware]: 'HiSeq 2000'

4. Changes for [FileSet] and related classes

[FileSetMember] is made into an [Annotatable] item so that
we can add annotations on files.

[FileSet] is modified so that it becomes possible to add more
than one file for each [DataFileType]. This is controlled
by a flag ('allowMultiple').

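A small sketch of how the 'allowMultiple' flag could be enforced when
adding files. It assumes the flag lives on the data file type and
that a second file for a single-file type is rejected; replacing the
existing file instead would be an equally valid policy. Files are
represented by plain path strings, and all method names are made up
for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified data file type carrying the proposed flag.
final class DataFileType {
    private final String id;
    private final boolean allowMultiple;

    DataFileType(String id, boolean allowMultiple) {
        this.id = id;
        this.allowMultiple = allowMultiple;
    }

    String getId() { return id; }
    boolean isAllowMultiple() { return allowMultiple; }
}

// Simplified file set: files are grouped per data file type id.
final class FileSet {
    private final Map<String, List<String>> members = new HashMap<>();

    void addMember(DataFileType type, String path) {
        List<String> files = members.computeIfAbsent(type.getId(), k -> new ArrayList<>());
        // Assumed policy: reject a second file unless the type allows it.
        if (!type.isAllowMultiple() && !files.isEmpty()) {
            throw new IllegalStateException(
                "Only one file is allowed for data file type: " + type.getId());
        }
        files.add(path);
    }

    List<String> getMembers(DataFileType type) {
        List<String> files = members.get(type.getId());
        return files == null
            ? Collections.<String>emptyList()
            : Collections.unmodifiableList(files);
    }
}
```
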
5. New entity classes [BioAssayEvent], [DerivedBioAssaySet] and
[DerivedBioAssay] that replace [Scan] and [Image]

The [BioAssayEvent] and [DerivedBioAssaySet] make up a loop
that starts from a [PhysicalBioAssay] and ends with a
[RawBioAssay]. This loop is similar to the loop with [Transformation]
and [BioAssaySet] in the existing analysis section.

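The loop can be pictured as a chain where each event takes the
previous derived bioassay set as input and produces a new one. The
sketch below models only that chain; the [PhysicalBioAssay] at the
start and the [RawBioAssay] at the end are left out, and the method
names (createSet, newEvent, eventHistory) are invented for the
example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Objects;

// An event that produces a derived bioassay set.
final class BioAssayEvent {
    private final String name;
    // null for the first iteration, which would start from the
    // physical bioassay (not modelled here).
    private final DerivedBioAssaySet input;

    BioAssayEvent(String name, DerivedBioAssaySet input) {
        this.name = Objects.requireNonNull(name, "name");
        this.input = input;
    }

    String getName() { return name; }
    DerivedBioAssaySet getInput() { return input; }

    // Completes one iteration of the loop.
    DerivedBioAssaySet createSet(String setName) {
        return new DerivedBioAssaySet(setName, this);
    }
}

final class DerivedBioAssaySet {
    private final String name;
    private final BioAssayEvent createdBy;

    DerivedBioAssaySet(String name, BioAssayEvent createdBy) {
        this.name = Objects.requireNonNull(name, "name");
        this.createdBy = Objects.requireNonNull(createdBy, "createdBy");
    }

    String getName() { return name; }
    BioAssayEvent getCreatedBy() { return createdBy; }

    // Starts the next iteration from this set.
    BioAssayEvent newEvent(String eventName) {
        return new BioAssayEvent(eventName, this);
    }

    // Names of the events that led to this set, oldest first.
    List<String> eventHistory() {
        List<String> names = new ArrayList<>();
        for (DerivedBioAssaySet s = this; s != null; s = s.createdBy.getInput()) {
            names.add(s.createdBy.getName());
        }
        Collections.reverse(names);
        return names;
    }
}
```

A sequencing run might go through several iterations this way (for
example base calling followed by alignment), while a migrated scan
would be a chain of length one.
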
Existing [Scan] and [Image] data is moved into a single "iteration"
of that loop. The scan data is split between the bioassay event
(protocol, scanner, date) and the derived bioassay set
(name, description, owner). One or more derived bioassays are also
created with links back to the biomaterial they are related to.
Image data is moved to the file set of the derived bioassay set
with

The new classes can be used to represent the multiple steps
that are required before sequenced data can be boiled down to
something that is similar to expression data.

A bioassay event can be linked with [Job], [Protocol],
[Hardware] and [Software]. It should be possible to
create iterations both manually and with plug-ins.

Do we need a new [PluginType] or can 'Analysis' be re-used?
We already implement context-checking so there shouldn't be any risk
of mixing things up.

New [Software]: BCL Converter, Casava, Bowtie, Myrna, Tophat,
Cufflinks, and more???

New [DataFileType]: bcl, cif, fastq, qseq, bam, sam, and more???
Which ones do we really want to store/reference through BASE?

New [FileType]: ????

New [Platform]/[PlatformVariant]: ???

Or should most of this be in an extensions package similar to
the existing Illumina package?