Ticket #1153: sequencing-draft-1.txt

File sequencing-draft-1.txt, 5.0 KB (added by Nicklas Nordborg, 13 years ago)

More details and thoughts about the diagram

Summary of changes and additions for supporting sequencing

1. New type of [MeasuredBioMaterial]: [Library]
This is a new Java class, but it is saved to the same database
table as the other biomaterial types (discriminator-value=5).
The parent type is [Extract].
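
A minimal sketch of how [Library] could sit next to the other
biomaterial types. Only the discriminator value 5 and the [Extract]
parent come from this draft; the class shapes, constructors and getters
are assumptions made for illustration, not the actual BASE classes.

```java
// Hypothetical sketch; class shapes are illustrative, not the BASE API.
class LibrarySketch {
    /** Common base for biomaterial types that share one database table. */
    static abstract class MeasuredBioMaterial {
        private final String name;
        private final MeasuredBioMaterial parent;

        MeasuredBioMaterial(String name, MeasuredBioMaterial parent) {
            this.name = name;
            this.parent = parent;
        }

        String getName() { return name; }
        MeasuredBioMaterial getParent() { return parent; }
    }

    /** An extract; its own parent is left out of this sketch. */
    static class Extract extends MeasuredBioMaterial {
        Extract(String name) { super(name, null); }
    }

    /** The new type: a library whose parent is always an extract. */
    static class Library extends MeasuredBioMaterial {
        /** Discriminator value taken from the draft. */
        static final int DISCRIMINATOR = 5;

        Library(String name, Extract parent) { super(name, parent); }
    }

    public static void main(String[] args) {
        Extract extract = new Extract("Extract A");
        Library library = new Library("Library A", extract);
        System.out.println(library.getName() + " <- " + library.getParent().getName()
            + " (discriminator=" + Library.DISCRIMINATOR + ")");
    }
}
```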

A new [ProtocolType] is created by the installation:
'Library preparation'

A [Barcode] is needed when mixing multiple biomaterials on the
same lane. I see three options:

a) A new entity class: [Barcode]
b) Re-use [Label]
c) Use an annotation on [Library]

Which one should we choose? I guess it depends on what we need to
store. Do we need more than what can currently be stored for a label
(name and description)? An annotation can only store a single value.
If we re-use label, should we rename it to something more generic?
For example, [Tag], which may have a [TagType] attribute (e.g. 'Label',
'Barcode', etc.).

Hmmm... could we take this to the 'extreme' and rename [LabeledExtract]
instead of creating [Library]? For example, [TaggedExtract] has a [Tag]
which has a [TagType] that tells us what it is. Combined with the other
ticket (#1597: Subtypes of biomaterial items) this may be a more
future-proof solution...
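
To make the [Tag]/[TagType] idea a bit more concrete, here is a rough
sketch. The names [TaggedExtract], [Tag] and [TagType] come from the
text above; the fields, the getKind() method and the example values are
assumptions made for illustration only.

```java
// Hypothetical sketch of the "rename instead of add" option.
class TagSketch {
    /** What kind of tag this is; the values follow the examples in the draft. */
    enum TagType { LABEL, BARCODE }

    /** Generic replacement for a label: name, description and a type. */
    static class Tag {
        final String name;
        final String description;
        final TagType type;

        Tag(String name, String description, TagType type) {
            this.name = name;
            this.description = description;
            this.type = type;
        }
    }

    /** Generic replacement for a labeled extract. */
    static class TaggedExtract {
        final String name;
        final Tag tag;

        TaggedExtract(String name, Tag tag) {
            this.name = name;
            this.tag = tag;
        }

        /** The tag type tells us what kind of item this is. */
        TagType getKind() { return tag.type; }
    }

    public static void main(String[] args) {
        // Example values are made up.
        Tag dye = new Tag("Dye 1", "Example label", TagType.LABEL);
        Tag barcode = new Tag("Index 1", "Example barcode", TagType.BARCODE);
        System.out.println(new TaggedExtract("Labeled extract A", dye).getKind());
        System.out.println(new TaggedExtract("Library A", barcode).getKind());
    }
}
```

With this layout the GUI could pick its terminology from getKind()
instead of from the Java class of the item.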

Do we need some kind of 'mode' setting in the GUI so that it uses
the correct terminology as much as possible? The 'mode' setting
could also control parts of the 'Validate' functionality in the
'Item overview', which may need to work differently, particularly
all rules for 'number of channels', which are only needed for microarray
experiments.

2. Changes in [BioMaterialEvent]
A new entity class [BioMaterialEventParticipant] is introduced
instead of the "anonymous" link-table 'BioMaterialEventSources'.
This should make it possible to get rid of the [UsedQuantity]
"fulhack" (ugly hack) that was used to support multi-array slides.
Existing information can easily be moved to the new tables.
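
A rough sketch of what a first-class participant could look like. The
draft only names the class; the fields (biomaterial name, used
quantity) and the helper methods below are assumptions, chosen to show
that each link can carry its own attributes once it is a real entity.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; field and method names are not from BASE.
class ParticipantSketch {
    /** One source biomaterial taking part in an event. */
    static class BioMaterialEventParticipant {
        final String bioMaterial;  // name of the source biomaterial
        final Float usedQuantity;  // null when the quantity is unknown

        BioMaterialEventParticipant(String bioMaterial, Float usedQuantity) {
            this.bioMaterial = bioMaterial;
            this.usedQuantity = usedQuantity;
        }
    }

    /** An event that owns its participants instead of a bare link table. */
    static class BioMaterialEvent {
        private final List<BioMaterialEventParticipant> participants = new ArrayList<>();

        BioMaterialEventParticipant addParticipant(String bioMaterial, Float usedQuantity) {
            BioMaterialEventParticipant p =
                new BioMaterialEventParticipant(bioMaterial, usedQuantity);
            participants.add(p);
            return p;
        }

        /** Sum of the known quantities used from one biomaterial. */
        float totalUsed(String bioMaterial) {
            float total = 0f;
            for (BioMaterialEventParticipant p : participants) {
                if (p.bioMaterial.equals(bioMaterial) && p.usedQuantity != null) {
                    total += p.usedQuantity;
                }
            }
            return total;
        }

        int participantCount() { return participants.size(); }
    }

    public static void main(String[] args) {
        BioMaterialEvent event = new BioMaterialEvent();
        event.addParticipant("Library A", 1.5f);
        event.addParticipant("Library B", null);
        System.out.println(event.participantCount() + " participants, "
            + event.totalUsed("Library A") + " used from Library A");
    }
}
```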

We may need a new [BioMaterialEventType] (e.g. 'Sequencing').
Or is it better to change the name of the 'Hybridization' event
to, for example, 'BioAssayCreation'?

3. New entity class [PhysicalBioAssay] that replaces [Hybridization]
All hybridization data is moved to the new database table. The
information it can hold is more or less the same as before, but
it can be linked with any type of [MeasuredBioMaterial] and not
just [LabeledExtract].

A [PhysicalBioAssay] should be classified by a type ('Hybridization',
'Sequencing', etc.). This could be an entity class of its own
or a property/enum, or we can know this from the 'creationEvent'
type.
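
For comparison, the property/enum alternative could look something
like the sketch below. The two type values are the ones mentioned
above; the enum name, the fields and the use of plain strings for the
source biomaterials are assumptions made to keep the example short.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the enum option; not the BASE API.
class PhysicalBioAssaySketch {
    /** Classification of a physical bioassay. */
    enum AssayType { HYBRIDIZATION, SEQUENCING }

    static class PhysicalBioAssay {
        final String name;
        final AssayType type;
        // Names of the measured biomaterials used; any type is allowed,
        // not only labeled extracts.
        private final List<String> sources = new ArrayList<>();

        PhysicalBioAssay(String name, AssayType type) {
            this.name = name;
            this.type = type;
        }

        void addSource(String bioMaterialName) { sources.add(bioMaterialName); }
        int sourceCount() { return sources.size(); }
    }

    public static void main(String[] args) {
        PhysicalBioAssay run = new PhysicalBioAssay("Flow cell 1", AssayType.SEQUENCING);
        run.addSource("Library A");
        run.addSource("Library B");
        System.out.println(run.name + ": " + run.type + ", " + run.sourceCount() + " sources");
    }
}
```

An enum is cheap but needs a code change for every new type; a separate
entity class or the 'creationEvent' type would let an installation add
types without one.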

The [PhysicalBioAssay] should implement [FileStoreEnabled] so
that we can link files to it.

New [ProtocolType]: 'Sequencing'
New [HardwareType]: 'Sequencing station'
New [Hardware]: 'HiSeq 2000'

4. Changes for [FileSet] and related classes
[FileSetMember] is made into an [Annotatable] item so that
we can add annotations on files.

[FileSet] is modified so that it becomes possible to add more
than one file for each [DataFileType], but only when this is enabled
by a flag ('allowMultiple').
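
A small sketch of the 'allowMultiple' rule. The draft does not say
where the flag lives; placing it on the [DataFileType] is an assumption
here, as are the method names and the use of plain strings for files.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; not the BASE FileSet API.
class FileSetSketch {
    static class DataFileType {
        final String id;
        final boolean allowMultiple; // assumption: the flag is stored per file type

        DataFileType(String id, boolean allowMultiple) {
            this.id = id;
            this.allowMultiple = allowMultiple;
        }
    }

    static class FileSet {
        // File paths grouped by data file type id.
        private final Map<String, List<String>> members = new LinkedHashMap<>();

        /** Adds a file; a second file is rejected unless the type allows it. */
        void addFile(DataFileType type, String path) {
            List<String> files = members.computeIfAbsent(type.id, k -> new ArrayList<>());
            if (!files.isEmpty() && !type.allowMultiple) {
                throw new IllegalStateException(
                    "Only one file allowed for data file type: " + type.id);
            }
            files.add(path);
        }

        List<String> getFiles(DataFileType type) {
            return members.getOrDefault(type.id, Collections.emptyList());
        }
    }

    public static void main(String[] args) {
        DataFileType fastq = new DataFileType("fastq", true);
        FileSet set = new FileSet();
        set.addFile(fastq, "lane1_read1.fastq");
        set.addFile(fastq, "lane1_read2.fastq");
        System.out.println(set.getFiles(fastq));
    }
}
```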

5. New entity classes [BioAssayEvent], [DerivedBioAssaySet] and
[DerivedBioAssay] that replace [Scan] and [Image]
The [BioAssayEvent] and [DerivedBioAssaySet] make up a loop
that is started from a [PhysicalBioAssay] and ends with a
[RawBioAssay]. This loop is similar to the loop with [Transformation]
and [BioAssaySet] in the existing analysis section.

Existing [Scan] and [Image] data is moved into a single "iteration"
of that loop. The scan data is split between the bioassay event
(protocol, scanner, date) and the derived bioassay set
(name, description, owner). One or more derived bioassays are also
created with links back to the biomaterial they are related to.
Image data is moved to the file set of the derived bioassay set
with

The new classes can be used to represent the multiple steps
that are required before sequenced data can be boiled down to
something that is similar to expression data.
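
The event/set loop can be sketched as a simple chain. The class names
come from this draft; the fields, the two constructors and the
iterations()/root() helpers are assumptions, and the software names
used in main() are only illustrative labels, not a recommended
pipeline.

```java
// Hypothetical sketch of the BioAssayEvent / DerivedBioAssaySet loop.
class BioAssayLoopSketch {
    static class PhysicalBioAssay {
        final String name;
        PhysicalBioAssay(String name) { this.name = name; }
    }

    /** One step; its source is the physical bioassay or an earlier set. */
    static class BioAssayEvent {
        final String description;
        final PhysicalBioAssay physicalSource; // set for the first iteration only
        final DerivedBioAssaySet parentSet;    // set for later iterations only

        BioAssayEvent(String description, PhysicalBioAssay source) {
            this.description = description;
            this.physicalSource = source;
            this.parentSet = null;
        }

        BioAssayEvent(String description, DerivedBioAssaySet parent) {
            this.description = description;
            this.physicalSource = null;
            this.parentSet = parent;
        }
    }

    /** The result of one event. */
    static class DerivedBioAssaySet {
        final String name;
        final BioAssayEvent createdBy;

        DerivedBioAssaySet(String name, BioAssayEvent createdBy) {
            this.name = name;
            this.createdBy = createdBy;
        }

        /** Number of loop iterations needed to reach this set. */
        int iterations() {
            int n = 0;
            for (DerivedBioAssaySet s = this; s != null; s = s.createdBy.parentSet) {
                n++;
            }
            return n;
        }

        /** Walks back to the physical bioassay that started the loop. */
        PhysicalBioAssay root() {
            DerivedBioAssaySet s = this;
            while (s.createdBy.parentSet != null) {
                s = s.createdBy.parentSet;
            }
            return s.createdBy.physicalSource;
        }
    }

    public static void main(String[] args) {
        PhysicalBioAssay run = new PhysicalBioAssay("Flow cell 1");
        DerivedBioAssaySet first =
            new DerivedBioAssaySet("Step 1 output", new BioAssayEvent("Casava", run));
        DerivedBioAssaySet second =
            new DerivedBioAssaySet("Step 2 output", new BioAssayEvent("Bowtie", first));
        System.out.println(second.iterations() + " iterations from " + second.root().name);
    }
}
```

Migrated [Scan] data would correspond to a chain with exactly one
iteration.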

A bioassay event can be linked with [Job], [Protocol],
[Hardware] and [Software]. It should be possible to
create iterations both manually and with plug-ins.

Do we need a new [PluginType] or can 'Analysis' be re-used?
We already implement context-checking so there shouldn't be any risk
of mixing things up.

New [Software]: BCL Converter, Casava, Bowtie, Myrna, Tophat,
Cufflinks, and more???

New [DataFileType]: bcl, cif, fastq, qseq, bam, sam, and more???
Which ones do we really want to store/reference through BASE?

New [FileType]: ????

New [Platform]/[PlatformVariant]: ???

Or should most of this be in an extensions package similar to
the existing Illumina package?