Opened 16 years ago

Last modified 15 years ago

#1120 closed enhancement

The dynamic part of BASE should keep track whether intensity data is in log space or not — at Version 1

Reported by: Jari Häkkinen
Owned by: everyone
Priority: major
Milestone: BASE 2.12
Component: core
Version:
Keywords:
Cc:

Description (last modified by Jari Häkkinen)

I know the tradition is to store data non-logged, but there have been many cases where this informal rule was not followed, which created confusion. In many cases the numbers are only ever used in logged form, and now the data has to be converted back and forth.

One could ask why log space should be treated differently from other spaces. That is a valid point; this ticket actually questions why we think the Euclidean space should have a special position in the database. If simply adding support for log/Euclidean space tracking is not satisfactory, one can imagine an enumeration of pre-defined spaces that is extended as further transforms need to be supported ... just a thought.
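To make the enumeration idea concrete, here is a minimal sketch in Python (not BASE's own code). The `IntensitySpace` type, its members and the `convert` helper are hypothetical names invented for illustration; `LINEAR` stands for what the ticket calls Euclidean space.

```python
import math
from enum import Enum


class IntensitySpace(Enum):
    """Hypothetical set of spaces in which intensities may be stored."""

    LINEAR = "linear"  # non-logged values, the traditional storage form
    LOG2 = "log2"
    LOG10 = "log10"

    def from_linear(self, value: float) -> float:
        """Convert a non-logged intensity into this space."""
        if self is IntensitySpace.LOG2:
            return math.log2(value)
        if self is IntensitySpace.LOG10:
            return math.log10(value)
        return value

    def to_linear(self, value: float) -> float:
        """Convert an intensity stored in this space back to non-logged form."""
        if self is IntensitySpace.LOG2:
            return 2.0 ** value
        if self is IntensitySpace.LOG10:
            return 10.0 ** value
        return value


def convert(value: float, source: IntensitySpace, target: IntensitySpace) -> float:
    """Convert between any two spaces by going through the linear form."""
    return target.from_linear(source.to_linear(value))
```

Supporting a further transform would then mean adding one member and its two conversions, while code that only asks which space the data is in stays unchanged.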

Change History (1)

comment:1 by Jari Häkkinen, 15 years ago

Description: modified (diff)
Summary: changed from "The dynamic part of BASE should keep track whether data is stored logged or not" to "The dynamic part of BASE should keep track whether intensity data is in log space or not"

More arguments for storing information in BASE about how data is stored.

If a user runs the Jep Intensity Transformer plug-in and transforms all intensities to logged intensities, the values are stored as logged values in BASE. From this point on the user must remember that all work on this branch of the analysis tree is in log space (at least until the data is transformed again). All that is fine, but what if the user chooses to normalize with 'Average Normalization' from the net.sf.basedb.normalizers package? This normalization always uses the geometric mean as the averager. The geometric mean makes sense for non-logged data, but the arithmetic mean makes sense for logged data.
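The mismatch can be illustrated with a few made-up numbers. The sketch below uses Python's standard `statistics` module rather than any BASE code, and the intensity values are invented. It relies on the identity that the arithmetic mean of logged values equals the log of the geometric mean of the raw values.

```python
import math
from statistics import fmean, geometric_mean

# Invented example intensities in non-logged form.
raw = [100.0, 400.0, 1600.0]
# The same intensities after a log2 transform.
logged = [math.log2(x) for x in raw]

# Geometric mean of the raw values: the sensible average for non-logged data.
geo_raw = geometric_mean(raw)

# Arithmetic mean of the logged values: the sensible average for logged data.
# Transforming it back with 2**x recovers the geometric mean of the raw values.
arith_logged = fmean(logged)

# Geometric mean of the logged values: what a fixed geometric averager would
# compute on this branch. It does not map back to the raw geometric mean.
geo_logged = geometric_mean(logged)

print(geo_raw, 2 ** arith_logged, 2 ** geo_logged)
```

The first two printed values agree up to rounding, while the third differs. The geometric mean is also undefined once any logged value is negative, which happens for raw intensities below 1.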

There are two solutions to this: i) if BASE knows whether data is in log space, the plug-in can use this information to automatically select the appropriate averaging method; ii) add 'select average method' functionality to the normalization plug-in. In the latter case the user needs to be informed about how to choose the averaging method, whereas if BASE stores log space information the plug-in selects the appropriate averaging method automatically. Of course, i) and ii) can co-exist, with the plug-in using the log space information to set an appropriate default.
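A combined version of i) and ii) could look roughly like the following Python sketch. It is not the plug-in's actual code: `choose_averager`, `average`, the `is_log_space` flag and the `user_choice` parameter are hypothetical names, and the flag is assumed to be supplied by BASE.

```python
from statistics import fmean, geometric_mean
from typing import Callable, Optional, Sequence

# Averaging methods a user could pick explicitly (solution ii).
AVERAGERS = {
    "arithmetic": fmean,
    "geometric": geometric_mean,
}


def choose_averager(
    is_log_space: bool, user_choice: Optional[str] = None
) -> Callable[[Sequence[float]], float]:
    """Return the averaging function to use.

    An explicit user choice wins; otherwise the default follows the
    (assumed) log space flag stored with the data (solution i).
    """
    if user_choice is not None:
        if user_choice not in AVERAGERS:
            raise ValueError(f"unknown averaging method: {user_choice!r}")
        return AVERAGERS[user_choice]
    return fmean if is_log_space else geometric_mean


def average(
    intensities: Sequence[float],
    is_log_space: bool,
    user_choice: Optional[str] = None,
) -> float:
    """Average intensities with the method selected by choose_averager."""
    return choose_averager(is_log_space, user_choice)(intensities)
```

With this shape a user who never touches the setting gets the arithmetic mean on logged data and the geometric mean on non-logged data, and the override remains available when needed.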
