Opened 12 years ago
Closed 12 years ago
#1717 closed defect (fixed)
Annotation importer create annotation set even if there are no annotations for an item
Reported by: | Nicklas Nordborg | Owned by: | everyone |
---|---|---|---|
Priority: | major | Milestone: | BASE 3.2 |
Component: | coreplugins | Version: | 3.1 |
Keywords: | Cc: |
Description
Don't know how long this has been around. Discovered it while testing the BASE 3.2 release and have verified that the issue also exists in 3.1.
How to reproduce:
- Create 2 samples with different names and one annotation type. It doesn't matter what kind of annotation.
- Create a text file with two columns: sample name and annotation value. Add one line for each sample and enter the names in the name column. For one of the samples, enter a value in the annotation column, but leave the other one empty.
- Import the annotation with the annotation importer. IT should report that 1 sample has been annotated with 1 value.
- Check the
BioMaterials
table in the database. Both samples have a value in theannotationset_id
column, but only one of them actually have a real entry in theAnnotations
table.
The annotation importer should not create an annotation set if there is no annotation value.
Change History (7)
comment:1 by , 12 years ago
comment:2 by , 12 years ago
Just a note about inherited annotations. Some of the 632K annotation sets that doesn't have any primary annotations have inherited annotations from other items. So before we can delete any annotation sets, we must also check the InheritedAnnotations
and InheritedAnnotationSets
tables. There are about 20K such annotation sets on the production server.
comment:3 by , 12 years ago
mysql> select count(at.id) from `AnnotationSets` at left join `Annotations` a on a.annotationset_id=at.id left join `InheritedAnnotations` ia on ia.annotationset_id=at.id left join `InheritedAnnotationSets` ias on ias.annotationset_id=at.id where a.id is null and ia.annotation_id is null and ias.inherited_id is null; +--------------+ | count(at.id) | +--------------+ | 623186 | +--------------+ 1 row in set (6.42 sec)
comment:4 by , 12 years ago
comment:5 by , 12 years ago
Made another count to see which items that have all those bogus annotation sets:
mysql> select count(*), at.item_type, count(a.id) from `AnnotationSets` at left join `Annotations` a on a.annotationset_id=at.id group by item_type; +----------+-----------+-------------+ | count(*) | item_type | count(a.id) | +----------+-----------+-------------+ | 3981 | 201 | 3976 | | 141145 | 202 | 141022 | | 27194 | 203 | 26771 | | 655604 | 226 | 32980 | | 289 | 242 | 289 | | 1 | 243 | 0 | | 13305 | 264 | 3869 | | 1296 | 267 | 1228 | | 50 | 268 | 0 | +----------+-----------+-------------+ 9 rows in set (3.48 sec)
item_type: 201=BIOSOURCE, 202=SAMPLE, 203=EXTRACT, 226=WELL, 242=ARRAYBATCH, 243=ARRAYSLIDE, 264=RAWBIOASSAY, 267=PHYSICALBIOASSAY, 268=DERIVEDBIOASSAY
First column is the number of annotation sets for a given item_type (second column). The third column is the number of actual annotations.
The big difference is for item_type=226 which are WELL items. This is a bit strange since the annotation importer doesn't work with WELL items. This means that the empty annotation sets have been created by something else. I suspect that it is the BASE 1->BASE 2 migration, so this has been around for a very long time, and the fix should probably have a separate ticket.
comment:7 by , 12 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
Hmmm.... this could potentially be a big performance issue when handling annotations... I checked the production server:
It has 210K annotations divided among 50K annotation sets, eg. about 4 annotations per item. But...
Total number of annotation sets is 683K...
...but 632K of them have no annotation values at all! So it seems like we are wasting a lot of rows in the
AnnotationSets
table. I think the database update should remove all extra entries.