Summary of changes and additions for supporting sequencing
==========================================================

1. New type of [MeasuredBioMaterial]: [Library]
This is a new Java class, but it is saved to the same database
table as the other biomaterial types (discriminator-value=5).
The parent type is [Extract].

A new [ProtocolType] is created by the installation:
'Library preparation'

A [Barcode] is needed when mixing multiple biomaterials on the
same lane. I see three options:

a) A new entity class: [Barcode]
b) Re-use [Label]
c) Use annotation on [Library]

Which one should we choose? I guess it depends on what we need to
store. Do we need more than what can currently be stored for a label
(name and description)? An annotation can only store a single value.
If we re-use label, should we rename it to something more generic?
For example, [Tag], which may have a [TagType] attribute (e.g. 'Label',
'Barcode', etc.).

Hmmm... could we take this to the 'extreme' and rename [LabeledExtract]
instead of creating [Library]? For example, [TaggedExtract] has a [Tag]
which has a [TagType] that tells us what it is. Combined with the other
ticket (#1597: Subtypes of biomaterial items) this may be a more
future-proof solution...

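To make the [Tag]/[TagType] alternative a bit more concrete, here is a minimal Java sketch. Only the names Tag, TagType and TaggedExtract come from the notes above; the fields, the enum constants and the isLibrary() helper are assumptions made for illustration, and persistence mapping is left out entirely.

```java
import java.util.Objects;

// Hypothetical tag kinds; the notes only mention 'Label' and 'Barcode'.
enum TagType { LABEL, BARCODE }

// A generic tag that carries what a label stores today (name and
// description) plus a type that says what kind of tag it is.
final class Tag {
    private final String name;
    private final String description;
    private final TagType type;

    Tag(String name, String description, TagType type) {
        this.name = Objects.requireNonNull(name, "name");
        this.description = description; // optional
        this.type = Objects.requireNonNull(type, "type");
    }

    String getName() { return name; }
    String getDescription() { return description; }
    TagType getType() { return type; }
}

// One class would then cover both today's labeled extracts and the
// proposed libraries; the tag type tells them apart.
final class TaggedExtract {
    private final String name;
    private final Tag tag;

    TaggedExtract(String name, Tag tag) {
        this.name = Objects.requireNonNull(name, "name");
        this.tag = Objects.requireNonNull(tag, "tag");
    }

    String getName() { return name; }
    Tag getTag() { return tag; }

    // Assumption: an extract tagged with a barcode plays the role of a library.
    boolean isLibrary() { return tag.getType() == TagType.BARCODE; }
}
```

With this shape, a GUI 'mode' could pick its terminology from the tag type instead of from the Java class of the item.
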
Do we need some kind of 'mode' setting in the GUI so that it uses
the correct terminology as much as possible? The 'mode' setting
could also control parts of the 'Validate' functionality in the
'Item overview', which may need to work differently. This applies
particularly to the rules for 'number of channels', which are only
needed for microarray experiments.


2. Changes in [BioMaterialEvent]

A new entity class [BioMaterialEventParticipant] is introduced
instead of the "anonymous" link-table 'BioMaterialEventSources'.
This should make it possible to get rid of the [UsedQuantity]
"fulhack" (ugly hack) that was used to support multi-array slides.
Existing information can easily be moved to the new tables.

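As a rough illustration of why a real entity helps here, the following Java sketch gives each participant its own fields instead of squeezing extra meaning into the used quantity. The 'position' field (for example a sub-array or lane number) is purely an assumption, as are the String stand-in for the biomaterial reference and all method names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// One row per biomaterial taking part in an event. With an entity of its
// own, extra per-participant data gets a proper field.
final class BioMaterialEventParticipant {
    private final String bioMaterial;  // stand-in for a reference to the biomaterial item
    private final Float usedQuantity;  // null means "not recorded"
    private final int position;        // hypothetical: sub-array or lane, 1-based

    BioMaterialEventParticipant(String bioMaterial, Float usedQuantity, int position) {
        if (position < 1) {
            throw new IllegalArgumentException("position must be >= 1");
        }
        this.bioMaterial = bioMaterial;
        this.usedQuantity = usedQuantity;
        this.position = position;
    }

    String getBioMaterial() { return bioMaterial; }
    Float getUsedQuantity() { return usedQuantity; }
    int getPosition() { return position; }
}

// The event owns its participants instead of an anonymous link table.
final class BioMaterialEvent {
    private final List<BioMaterialEventParticipant> participants = new ArrayList<>();

    BioMaterialEventParticipant addParticipant(String bioMaterial, Float usedQuantity, int position) {
        BioMaterialEventParticipant p =
            new BioMaterialEventParticipant(bioMaterial, usedQuantity, position);
        participants.add(p);
        return p;
    }

    List<BioMaterialEventParticipant> getParticipants() {
        return Collections.unmodifiableList(participants);
    }
}
```
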
We may need a new [BioMaterialEventType] (e.g. 'Sequencing').
Or is it better to change the name of the 'Hybridization' event
to, for example, 'BioAssayCreation'?

3. New entity class [PhysicalBioAssay] that replaces [Hybridization]

All hybridization data is moved to the new database table. The
information it can hold is more or less the same as before, but
it can be linked with any type of [MeasuredBioMaterial] and not
just [LabeledExtract].

A [PhysicalBioAssay] should be classified by a type ('Hybridization',
'Sequencing', etc.). This could be an entity class of its own
or a property/enum, or we can know this from the 'creationEvent'
type.

The [PhysicalBioAssay] should implement [FileStoreEnabled] so
that we can link files to it.

New [ProtocolType]: 'Sequencing'
New [HardwareType]: 'Sequencing station'
New [Hardware]: 'HiSeq 2000'

4. Changes for [FileSet] and related classes

[FileSetMember] is made into an [Annotatable] item so that
we can add annotations on files.

[FileSet] is modified so that it becomes possible to add more
than one file for each [DataFileType]. This is controlled
by a flag ('allowMultiple').

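The 'allowMultiple' rule could look roughly like the Java sketch below. Where the flag lives is not decided above; placing it on the data file type is an assumption, and the simplified classes, the String file paths and the method names are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified data file type; assumption: the flag is stored per type.
final class DataFileType {
    private final String id;
    private final boolean allowMultiple;

    DataFileType(String id, boolean allowMultiple) {
        this.id = id;
        this.allowMultiple = allowMultiple;
    }

    String getId() { return id; }
    boolean allowsMultiple() { return allowMultiple; }
}

// Simplified file set: files are grouped by the id of their data file type.
final class FileSet {
    private final Map<String, List<String>> members = new HashMap<>();

    // Adds a file; a second file of the same type is rejected unless the
    // type allows multiple files.
    void addMember(DataFileType type, String path) {
        List<String> files = members.computeIfAbsent(type.getId(), k -> new ArrayList<>());
        if (!files.isEmpty() && !type.allowsMultiple()) {
            throw new IllegalStateException(
                "Only one file is allowed for data file type: " + type.getId());
        }
        files.add(path);
    }

    // Returns a read-only copy of the files registered for the given type.
    List<String> getMembers(DataFileType type) {
        List<String> files = members.get(type.getId());
        if (files == null) {
            return Collections.emptyList();
        }
        return Collections.unmodifiableList(new ArrayList<>(files));
    }
}
```

Existing single-file types would keep the flag off, so their behaviour stays unchanged.
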
5. New entity classes [BioAssayEvent], [DerivedBioAssaySet] and
   [DerivedBioAssay] that replace [Scan] and [Image]

The [BioAssayEvent] and [DerivedBioAssaySet] make up a loop
that is started from a [PhysicalBioAssay] and ends with a
[RawBioAssay]. This loop is similar to the loop with [Transformation]
and [BioAssaySet] in the existing analysis section.

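The loop can be pictured with a small Java sketch: an event produces a derived bioassay set, and that set can be the source of the next event. This only models the chaining; the physical bioassay is represented by a null source, and the constructor arguments, method names and the iteration() helper are assumptions, not a proposed API.

```java
// An event in the chain. Its source is the derived bioassay set it works
// on, or null when it starts directly from the physical bioassay.
final class BioAssayEvent {
    private final String description;
    private final DerivedBioAssaySet source;

    BioAssayEvent(String description, DerivedBioAssaySet source) {
        this.description = description;
        this.source = source;
    }

    String getDescription() { return description; }
    DerivedBioAssaySet getSource() { return source; }

    // Each event results in a new derived bioassay set.
    DerivedBioAssaySet createSet(String name) {
        return new DerivedBioAssaySet(name, this);
    }
}

// The product of one event; it can feed the next event in the loop.
final class DerivedBioAssaySet {
    private final String name;
    private final BioAssayEvent createdBy;

    DerivedBioAssaySet(String name, BioAssayEvent createdBy) {
        this.name = name;
        this.createdBy = createdBy;
    }

    String getName() { return name; }
    BioAssayEvent getCreatedBy() { return createdBy; }

    BioAssayEvent newEvent(String description) {
        return new BioAssayEvent(description, this);
    }

    // Number of event/set iterations between the physical bioassay and this set.
    int iteration() {
        int n = 1;
        DerivedBioAssaySet s = createdBy.getSource();
        while (s != null) {
            n++;
            s = s.createdBy.getSource();
        }
        return n;
    }
}
```
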
Existing [Scan] and [Image] data is moved into a single "iteration"
of that loop. The scan data is split between the bioassay event
(protocol, scanner, date) and the derived bioassay set
(name, description, owner). One or more derived bioassays are also
created with links back to the biomaterial they are related to.
Image data is moved to the file set of the derived bioassay set
with

The new classes can be used to represent the multiple steps
that are required before sequenced data can be boiled down to
something that is similar to expression data.

A bioassay event can be linked with [Job], [Protocol],
[Hardware] and [Software]. It should be possible to
create iterations both manually and with plug-ins.

Do we need a new [PluginType] or can 'Analysis' be re-used?
We already implement context-checking so there shouldn't be any risk
of mixing things up.

New [Software]: BCL Converter, Casava, Bowtie, Myrna, Tophat,
Cufflinks, and more???

New [DataFileType]: bcl, cif, fastq, qseq, bam, sam, and more???
Which ones do we really want to store/reference through BASE?

New [FileType]: ????

New [Platform]/[PlatformVariant]: ???

Or should most of this be in an extensions package similar to
the existing Illumina package?