BASE: Ticket #1385: Plot function in the annotation summary table in experiment explorer
https://base.thep.lu.se/ticket/1385
<p>
Add a plot button to the annotation summary table. The plot should be a box plot of user-selected values grouped by annotation, i.e., one box for each annotation value. Replaces 3 from <a class="closed ticket" href="https://base.thep.lu.se/ticket/1375" title="#1375: enhancement: Improve experiment explorer display (closed: duplicate)">#1375</a>. This should be synchronized with <a class="closed ticket" href="https://base.thep.lu.se/ticket/1386" title="#1386: enhancement: Plot function in the bioassay table in experiment explorer (closed: fixed)">#1386</a>.
</p>
Nicklas Nordborg (Mon, 28 Sep 2009 06:44:38 GMT)
https://base.thep.lu.se/ticket/1385#comment:1
<p>
I don't understand what kind of plot you would like. Please specify and give an example.
</p>
Johan Vallon-Christersson (Mon, 28 Sep 2009 14:09:16 GMT)
https://base.thep.lu.se/ticket/1385#comment:2
<p>
See the attached PDF file for an example.
</p>
Johan Vallon-Christersson (Mon, 28 Sep 2009 14:09:36 GMT): attachment set
https://base.thep.lu.se/ticket/1385
<ul>
<li><strong>attachment</strong>
set to <em>plotfunction_EE_box.pdf</em>
</li>
</ul>
Nicklas Nordborg (Wed, 14 Oct 2009 12:16:12 GMT): status, owner, description changed; milestone set
https://base.thep.lu.se/ticket/1385#comment:3
<ul>
<li><strong>status</strong>
changed from <em>new</em> to <em>assigned</em>
</li>
<li><strong>owner</strong>
changed from <em>everyone</em> to <em>Nicklas Nordborg</em>
</li>
<li><strong>description</strong>
modified (<a href="/ticket/1385?action=diff&version=3">diff</a>)
</li>
<li><strong>milestone</strong>
set to <em>BASE 2.14</em>
</li>
</ul>
Nicklas Nordborg (Thu, 15 Oct 2009 11:00:20 GMT)
https://base.thep.lu.se/ticket/1385#comment:4
<p>
I have investigated what kind of help we can get from the <code>JFreeChart</code> plot package that we are using. It has a box-and-whisker chart type that is relatively easy to use, but I don't know whether all calculations are made exactly as in the PDF that Johan submitted. From reading the <code>JFreeChart</code> source code, here is what I think it does:
</p>
<ul><li>It calculates the mean and median as usual. The plot can show one or both values: the median as a line and the mean as a circle.
</li><li>The 1st (Q1) and 3rd (Q3) quartiles are calculated as the median of the lower/upper half of the (sorted) list of values. The two values define the bottom and top of the box. E.g., if we have 10 sorted data values, then Q1 = median of values 1-5 and Q3 = median of values 6-10.
</li><li>Then, upper and lower threshold (TU/TL) values are calculated as:
<pre class="wiki"> TU = Q3 + (Q3-Q1)*1.5
TL = Q1 - (Q3-Q1)*1.5
</pre></li><li>The highest data value that is less than or equal to TU defines the upper whisker and the lowest data value that is greater than or equal to TL defines the lower whisker.
</li></ul><p>
So my question is whether this algorithm is what you want. If not, it would be nice if someone could post an alternative algorithm for calculating the values.
</p>
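<p>
The steps listed above can be sketched in plain Java as follows. This is only an illustration of the description in this comment, not <code>JFreeChart</code>'s actual code or API: the class and method names are made up, and the handling of an odd number of values (the middle value is left out of both halves) is an assumption that the comment does not settle.
</p>

```java
import java.util.Arrays;

// Hypothetical helper; names are illustrative and not part of JFreeChart or BASE.
class BoxStatsSketch {
    /** Median of the sorted slice sorted[from..to), with 'to' exclusive. */
    static double median(double[] sorted, int from, int to) {
        int n = to - from;
        int mid = from + n / 2;
        return (n % 2 == 1) ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }

    /** Returns {lowerWhisker, Q1, median, Q3, upperWhisker}. */
    static double[] boxStats(double[] values) {
        if (values.length < 2) {
            throw new IllegalArgumentException("need at least two values");
        }
        double[] s = values.clone();
        Arrays.sort(s);
        int n = s.length;
        int half = n / 2; // assumption: for odd n the middle value is in neither half

        // Quartiles as medians of the lower and upper halves.
        double q1 = median(s, 0, half);
        double q3 = median(s, n - half, n);

        // Thresholds as given in the comment: TU = Q3 + (Q3-Q1)*1.5, TL = Q1 - (Q3-Q1)*1.5
        double iqr = q3 - q1;
        double tu = q3 + iqr * 1.5;
        double tl = q1 - iqr * 1.5;

        // Lower whisker: lowest data value >= TL.
        double lowerWhisker = s[0];
        for (double v : s) {
            if (v >= tl) { lowerWhisker = v; break; }
        }
        // Upper whisker: highest data value <= TU.
        double upperWhisker = s[n - 1];
        for (int i = n - 1; i >= 0; i--) {
            if (s[i] <= tu) { upperWhisker = s[i]; break; }
        }
        return new double[] { lowerWhisker, q1, median(s, 0, n), q3, upperWhisker };
    }
}
```

<p>
For the ten values 1 to 10 this gives Q1 = 3 and Q3 = 8, so TU = 15.5 and TL = -4.5, and the whiskers end at 1 and 10. If the last value is replaced by 100, it lies above TU and the upper whisker stops at 9.
</p>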
Jari Häkkinen (Thu, 15 Oct 2009 13:04:56 GMT)
https://base.thep.lu.se/ticket/1385#comment:5
<p>
We would prefer that the TU and TL are calculated differently. The lower value should be the 5th percentile and the upper value should be the 95th percentile. If you have all the values in a sorted vector simply use the value at index 0.05*vector_size and 0.95*vector_size, respectively. Ties should be solved by taking the arithmetic average of the two neighbouring values.
</p>
<p>
(Q1 and Q3 are calculated similarly with factors 0.25 and 0.75, but the way outlined above works also.)
</p>
Nicklas Nordborg (Thu, 15 Oct 2009 13:30:15 GMT)
https://base.thep.lu.se/ticket/1385#comment:6
<blockquote class="citation">
<p>
Ties should be solved by taking the arithmetic average of the two neighbouring values.
</p>
</blockquote>
<p>
What exactly does this mean?
</p>
Jari Häkkinen (Thu, 15 Oct 2009 14:04:37 GMT)
https://base.thep.lu.se/ticket/1385#comment:7
<p>
Replying to <a class="ticket" href="https://base.thep.lu.se/ticket/1385#comment:6" title="Comment 6">nicklas</a>:
</p>
<blockquote class="citation">
<blockquote class="citation">
<p>
Ties should be solved by taking the arithmetic average of the two neighbouring values.
</p>
</blockquote>
<p>
What exactly does this mean?
</p>
</blockquote>
<p>
The counting for the 20th percentile may end up between two elements in the vector. Say you have a vector with 6 elements:
</p>
<pre class="wiki">1 4 12 53 100 126
</pre><p>
the 20th percentile is between index 1 and 2 ... the value should be (1+4)/2=2.5.
</p>
Nicklas Nordborg (Thu, 15 Oct 2009 15:08:59 GMT)
https://base.thep.lu.se/ticket/1385#comment:8
<p>
Hmmm... so if we use <code>factor * vector_size</code> we get the index for that percentile...
It doesn't seem to work for the median, which I guess is the same as the 50th percentile. And what about boundaries when we are close to the first and last element in the list?
</p>
<ul><li>25th percentile: 6 * 0.25 = 1.5 --> average of element 1+2
</li><li>median: 6 * 0.5 = 3 --> but the median should be the average of element 3+4
</li><li>5th percentile: 6 * 0.05 = 0.3 --> value of element 1?
</li><li>95th percentile: 6 * 0.95 = 5.7 --> average of element 5+6... but this is not symmetric with the 5th percentile??
</li></ul><p>
What if we have 7 elements?
</p>
<ul><li>25th percentile: 7 * 0.25 = 1.75 --> average of element 1+2
</li><li>median: 7 * 0.5 = 3.5 --> but the median should be the value of element 4
</li></ul><p>
What am I missing?
</p>
Nicklas Nordborg (Thu, 15 Oct 2009 15:14:22 GMT)
https://base.thep.lu.se/ticket/1385#comment:9
<p>
I found this: <a class="ext-link" href="http://www.koders.com/java/fid867FA235DAF49EE794B20334EF719CE6C69E17E5.aspx"><span class="icon"></span>http://www.koders.com/java/fid867FA235DAF49EE794B20334EF719CE6C69E17E5.aspx</a>
</p>
<p>
Does that algorithm make sense?
</p>
Jari Häkkinen (Fri, 16 Oct 2009 06:37:44 GMT)
https://base.thep.lu.se/ticket/1385#comment:10
<p>
The index determination should use (vector.length+1), and then you will get the proper index.
</p>
<p>
The code seems okay, the difference lies in the calculation of ties. I suggested a non-weighted average whereas the code interpolates between the values in the two neighbouring elements. Either will do, just document the choice made.
</p>
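<p>
As a rough sketch of the two options discussed here, the following Java code computes a percentile from the 1-based position <code>factor * (vector.length + 1)</code> and resolves a position between two elements either by the non-weighted average or by linear interpolation. The class and method names are hypothetical, and clamping positions outside the vector to the first or last element is an assumption that nobody in this thread has confirmed; it is not necessarily what the linked code or the final changeset does.
</p>

```java
import java.util.Arrays;

// Hypothetical helper; names are illustrative and not taken from BASE or the linked code.
class PercentileSketch {
    /**
     * @param p           percentile as a fraction, e.g. 0.25 for the 25th percentile
     * @param interpolate true: interpolate linearly between the two neighbours;
     *                    false: use their non-weighted average
     */
    static double percentile(double[] values, double p, boolean interpolate) {
        if (values.length == 0) {
            throw new IllegalArgumentException("no values");
        }
        double[] s = values.clone();
        Arrays.sort(s);
        int n = s.length;

        double pos = p * (n + 1);      // 1-based position in the sorted vector
        if (pos <= 1) return s[0];     // assumption: clamp to the first element
        if (pos >= n) return s[n - 1]; // assumption: clamp to the last element

        int lower = (int) Math.floor(pos); // 1-based index of the lower neighbour
        double frac = pos - lower;
        double a = s[lower - 1];
        if (frac == 0.0) return a;         // position falls exactly on an element
        double b = s[lower];
        return interpolate ? a + frac * (b - a) : (a + b) / 2.0;
    }
}
```

<p>
With the six-element vector 1 4 12 53 100 126, the median position is 0.5 * 7 = 3.5, which gives (12+53)/2 = 32.5 with either option. The 25th percentile position is 1.75, which gives 2.5 with the non-weighted average and 3.25 with interpolation. With seven elements the median position is exactly 4, so the value of element 4 is used.
</p>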
Nicklas Nordborg (Fri, 16 Oct 2009 12:27:01 GMT)
https://base.thep.lu.se/ticket/1385#comment:11
<p>
(In <a class="changeset" href="https://base.thep.lu.se/changeset/5138" title="References #1385 and #1386. Plot functions in experiment explorer
...">[5138]</a>) References <a class="closed ticket" href="https://base.thep.lu.se/ticket/1385" title="#1385: enhancement: Plot function in the annotation summary table in experiment explorer (closed: fixed)">#1385</a> and <a class="closed ticket" href="https://base.thep.lu.se/ticket/1386" title="#1386: enhancement: Plot function in the bioassay table in experiment explorer (closed: fixed)">#1386</a>. Plot functions in experiment explorer
</p>
<p>
Both types of plots can now be generated and I think the percentile values are correctly calculated.
</p>
Nicklas Nordborg (Mon, 19 Oct 2009 07:11:18 GMT)
https://base.thep.lu.se/ticket/1385#comment:12
<p>
(In <a class="changeset" href="https://base.thep.lu.se/changeset/5142" title="References #1385 and #1386. Plot functions in experiment explorer
The ...">[5142]</a>) References <a class="closed ticket" href="https://base.thep.lu.se/ticket/1385" title="#1385: enhancement: Plot function in the annotation summary table in experiment explorer (closed: fixed)">#1385</a> and <a class="closed ticket" href="https://base.thep.lu.se/ticket/1386" title="#1386: enhancement: Plot function in the bioassay table in experiment explorer (closed: fixed)">#1386</a>. Plot functions in experiment explorer
</p>
<p>
The current reporter name is used as a default subtitle.
</p>
Nicklas Nordborg (Mon, 19 Oct 2009 09:34:45 GMT): status changed; resolution set
https://base.thep.lu.se/ticket/1385#comment:13
<ul>
<li><strong>status</strong>
changed from <em>assigned</em> to <em>closed</em>
</li>
<li><strong>resolution</strong>
set to <em>fixed</em>
</li>
</ul>
<p>
Everything seems to be ok now.
</p>