Ticket #1153: sequencing-draft-1.txt

File sequencing-draft-1.txt, 5.0 KB (added by Nicklas Nordborg, 12 years ago)

More details and thoughts about the diagram

Summary of changes and additions for supporting sequencing
==========================================================

1. New type of [MeasuredBioMaterial]: [Library]
   This is a new Java class, but it is saved to the same database
   table as the other biomaterial types (discriminator-value=5).
   The parent type is [Extract].

   A new [ProtocolType] is created by the installation:
   'Library preparation'

   A [Barcode] is needed when mixing multiple biomaterials on the
   same lane. I see three options:

   a) A new entity class: [Barcode]
   b) Re-use [Label]
   c) Use an annotation on [Library]

   Which one should we choose? I guess it depends on what we need to
   store. Do we need more than what can currently be stored for a label
   (name and description)? An annotation can only store a single value.
   If we re-use label, should we rename it to something more generic?
   For example, [Tag], which may have a [TagType] attribute (e.g. 'Label',
   'Barcode', etc.).

   Hmmm... could we take this to the 'extreme' and rename [LabeledExtract]
   instead of creating [Library]? For example, [TaggedExtract] has a [Tag]
   which has a [TagType] that tells us what it is. Combined with the other
   ticket (#1597: Subtypes of biomaterial items) this may be a more
   future-proof solution...
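
   To make the [Tag]/[TagType] idea easier to discuss, here is a minimal
   sketch of how it could look. Everything in it is an assumption for
   illustration: the enum values, the field names and the usesBarcode()
   helper are hypothetical, and no Hibernate mapping is shown.

```java
import java.util.Objects;

// Hypothetical sketch: these classes do not exist in BASE in this form.
enum TagType { LABEL, BARCODE }

// A generic replacement for [Label]: same name/description, plus a type.
class Tag {
    private final String name;
    private final String description;
    private final TagType type;

    Tag(String name, String description, TagType type) {
        this.name = Objects.requireNonNull(name);
        this.description = description;
        this.type = Objects.requireNonNull(type);
    }

    String getName() { return name; }
    String getDescription() { return description; }
    TagType getType() { return type; }
}

// A renamed [LabeledExtract]: the tag tells us what kind of item it is.
class TaggedExtract {
    private final String name;
    private final Tag tag;

    TaggedExtract(String name, Tag tag) {
        this.name = Objects.requireNonNull(name);
        this.tag = Objects.requireNonNull(tag);
    }

    String getName() { return name; }
    Tag getTag() { return tag; }

    // True when the extract is tagged with a barcode rather than a label.
    boolean usesBarcode() { return tag.getType() == TagType.BARCODE; }
}

class TagSketch {
    public static void main(String[] args) {
        // The names below are made-up example values.
        Tag label = new Tag("label-A", "example label", TagType.LABEL);
        Tag barcode = new Tag("barcode-1", "example barcode", TagType.BARCODE);
        System.out.println(new TaggedExtract("extract-1", label).usesBarcode());
        System.out.println(new TaggedExtract("library-1", barcode).usesBarcode());
    }
}
```

   With this layout the GUI could pick its terminology ('Label' or
   'Barcode') from the tag type instead of from the Java class.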

   Do we need some kind of 'mode' setting in the GUI so that it uses
   the correct terminology as much as possible? The 'mode' setting
   could also control parts of the 'Validate' functionality in the
   'Item overview', which may need to work differently, particularly
   the rules for 'number of channels', which are only needed for
   microarray experiments.

2. Changes in [BioMaterialEvent]

   A new entity class [BioMaterialEventParticipant] is introduced
   instead of the "anonymous" link-table 'BioMaterialEventSources'.
   This should make it possible to get rid of the [UsedQuantity]
   "ugly hack" that was used to support multi-array slides.
   Existing information can easily be moved to the new tables.
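
   A rough sketch of what a first-class participant could look like is
   given here. It assumes that the used quantity is stored directly on
   the participant; the field names, the addSource() method and the use
   of a plain string in place of a real biomaterial link are all
   hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: one row per source biomaterial taking part in an event.
class BioMaterialEventParticipant {
    private final String sourceName;   // stands in for a link to a biomaterial
    private final Float usedQuantity;  // null means "not recorded"

    BioMaterialEventParticipant(String sourceName, Float usedQuantity) {
        this.sourceName = sourceName;
        this.usedQuantity = usedQuantity;
    }

    String getSourceName() { return sourceName; }
    Float getUsedQuantity() { return usedQuantity; }
}

class BioMaterialEvent {
    private final List<BioMaterialEventParticipant> participants = new ArrayList<>();

    // Registers a source and the quantity taken from it.
    BioMaterialEventParticipant addSource(String sourceName, Float usedQuantity) {
        BioMaterialEventParticipant p =
            new BioMaterialEventParticipant(sourceName, usedQuantity);
        participants.add(p);
        return p;
    }

    List<BioMaterialEventParticipant> getParticipants() {
        return Collections.unmodifiableList(participants);
    }

    // Sums the recorded quantities; participants without a quantity are skipped.
    float getTotalUsedQuantity() {
        float total = 0f;
        for (BioMaterialEventParticipant p : participants) {
            if (p.getUsedQuantity() != null) total += p.getUsedQuantity();
        }
        return total;
    }
}
```

   Because each participant is its own entity, extra per-source
   information can be added later without another special case.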

   We may need a new [BioMaterialEventType] (e.g. 'Sequencing').
   Or is it better to change the name of the 'Hybridization' event
   to, for example, 'BioAssayCreation'?

3. New entity class [PhysicalBioAssay] that replaces [Hybridization]

   All hybridization data is moved to the new database table. The
   information it can hold is more or less the same as before, but
   it can be linked with any type of [MeasuredBioMaterial] and not
   just [LabeledExtract].

   A [PhysicalBioAssay] should be classified by a type ('Hybridization',
   'Sequencing', etc.). This could be an entity class of its own
   or a property/enum, or we can derive it from the 'creationEvent'
   type.

   The [PhysicalBioAssay] should implement [FileStoreEnabled] so
   that we can link files to it.

   New [ProtocolType]: 'Sequencing'
   New [HardwareType]: 'Sequencing station'
   New [Hardware]: 'HiSeq 2000'

4. Changes for [FileSet] and related classes

   [FileSetMember] is made into an [Annotatable] item so that
   we can add annotations on files.

   [FileSet] is modified so that it becomes possible to add more
   than one file for each [DataFileType]. This is controlled
   by a flag ('allowMultiple').
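
   The intended behaviour of the flag can be sketched as follows. This
   assumes that 'allowMultiple' lives on the [DataFileType]; it could
   just as well be placed elsewhere. The method names and the use of
   plain strings for files are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: a file type that knows if several files are allowed.
class DataFileType {
    private final String name;
    private final boolean allowMultiple;

    DataFileType(String name, boolean allowMultiple) {
        this.name = name;
        this.allowMultiple = allowMultiple;
    }

    String getName() { return name; }
    boolean isAllowMultiple() { return allowMultiple; }
}

class FileSet {
    // Files per data file type name; strings stand in for real file items.
    private final Map<String, List<String>> members = new HashMap<>();

    // Adds a file; a second file is rejected unless the type allows multiple.
    void addMember(DataFileType type, String file) {
        List<String> files =
            members.computeIfAbsent(type.getName(), k -> new ArrayList<>());
        if (!files.isEmpty() && !type.isAllowMultiple()) {
            throw new IllegalStateException(
                "Only one file allowed for type: " + type.getName());
        }
        files.add(file);
    }

    List<String> getMembers(DataFileType type) {
        List<String> files = members.get(type.getName());
        return files == null
            ? Collections.<String>emptyList()
            : Collections.unmodifiableList(files);
    }
}
```

   Existing file types would keep allowMultiple=false, so the current
   one-file-per-type behaviour stays the default.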

5. New entity classes [BioAssayEvent], [DerivedBioAssaySet] and
   [DerivedBioAssay] that replace [Scan] and [Image]

   The [BioAssayEvent] and [DerivedBioAssaySet] make up a loop
   that starts from a [PhysicalBioAssay] and ends with a
   [RawBioAssay]. This loop is similar to the loop with [Transformation]
   and [BioAssaySet] in the existing analysis section.
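
   The loop can be pictured with a small sketch. It only models the
   event/set chain; the [PhysicalBioAssay] at the start is represented
   by an event without a source set, and the [RawBioAssay] at the end
   is left out. The derive() and history() methods and the step names
   in the comments are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Hypothetical sketch: one step in the loop.
class BioAssayEvent {
    private final String description;
    private final DerivedBioAssaySet source; // null: starts from the physical bioassay

    BioAssayEvent(String description, DerivedBioAssaySet source) {
        this.description = Objects.requireNonNull(description);
        this.source = source;
    }

    String getDescription() { return description; }
    DerivedBioAssaySet getSource() { return source; }
}

// Hypothetical sketch: the result of one step, usable as input to the next.
class DerivedBioAssaySet {
    private final String name;
    private final BioAssayEvent creationEvent;

    DerivedBioAssaySet(String name, BioAssayEvent creationEvent) {
        this.name = Objects.requireNonNull(name);
        this.creationEvent = Objects.requireNonNull(creationEvent);
    }

    String getName() { return name; }
    BioAssayEvent getCreationEvent() { return creationEvent; }

    // Runs one more iteration of the loop with this set as the source.
    DerivedBioAssaySet derive(String eventDescription, String newName) {
        return new DerivedBioAssaySet(newName,
            new BioAssayEvent(eventDescription, this));
    }

    // Walks back to the first event and returns the steps, oldest first.
    List<String> history() {
        List<String> steps = new ArrayList<>();
        for (DerivedBioAssaySet s = this; s != null;
                s = s.creationEvent.getSource()) {
            steps.add(0, s.creationEvent.getDescription());
        }
        return steps;
    }
}
```

   For example, a first set created by a 'base calling' event could be
   passed to derive("alignment", ...) to get the next set, and so on
   until the data is ready to become a [RawBioAssay].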

   Existing [Scan] and [Image] data is moved into a single "iteration"
   of that loop. The scan data is split between the bioassay event
   (protocol, scanner, date) and the derived bioassay set
   (name, description, owner). One or more derived bioassays are also
   created with links back to the biomaterial they are related to.
   Image data is moved to the file set of the derived bioassay set
   with

   The new classes can be used to represent the multiple steps
   that are required before sequenced data can be boiled down to
   something that is similar to expression data.

   A bioassay event can be linked with [Job], [Protocol],
   [Hardware] and [Software]. It should be possible to
   create iterations both manually and with plug-ins.

   Do we need a new [PluginType] or can 'Analysis' be re-used?
   We already implement context-checking so there shouldn't be any risk
   of mixing things up.

   New [Software]: BCL Converter, Casava, Bowtie, Myrna, Tophat,
   Cufflinks, and more???

   New [DataFileType]: bcl, cif, fastq, qseq, bam, sam, and more???
   Which ones do we really want to store/reference through BASE?

   New [FileType]: ????

   New [Platform]/[PlatformVariant]: ???

   Or should most of this be in an extensions package similar to
   the existing Illumina package?