2.17.2: 2011-06-17

net.sf.basedb.plugins
Class IlluminaRawDataImporter

java.lang.Object
  extended by net.sf.basedb.core.plugin.AbstractPlugin
      extended by net.sf.basedb.plugins.AbstractFlatFileImporter
          extended by net.sf.basedb.plugins.IlluminaRawDataImporter
All Implemented Interfaces:
AutoDetectingImporter, InteractivePlugin, Plugin, SignalTarget

public class IlluminaRawDataImporter
extends AbstractFlatFileImporter
implements InteractivePlugin

An importer plug-in for Illumina raw data. The plug-in will create one or more raw bioassays from the data in the file. Optionally, it will add the raw bioassays to an experiment.

Since the data files don't have any coordinate information, the importer will use one of the following methods.

NOTE! Since the methods do not conflict with each other, this plug-in does not actually check which method to use. It simply sets all values as specified above and lets the BASE core handle the identification.

Version:
2.4
Author:
nicklas
Last modified
$Date: 2010-11-08 14:49:35 +0100 (Mon, 08 Nov 2010) $

Nested Class Summary
private static class IlluminaRawDataImporter.BatchAndMapHolder
          Holds a RawBioAssay and the RawDataBatcher to use for import and the Mapper for all its data columns.
 
Nested classes/interfaces inherited from interface net.sf.basedb.core.plugin.Plugin
Plugin.MainType
 
Field Summary
private static About about
           
private static PluginParameter<String> associationsSection
          Section definition for grouping associations to other items: scan, software, protocol and experiment
private  RequestInformation configureJob
           
private  DbControl dc
           
private  ArrayDesign design
           
private  Experiment experiment
           
private static PluginParameter<String> featureIdentificationParameter
           
private static PluginParameter<String> featureMismatchErrorParameter
           
private  FlatFileParser ffp
           
private  FeatureIdentificationMethod fiMethod
           
private static Set<GuiContext> guiContexts
           
private  List<FlatFileParser.Line> headerLines
           
private  List<IlluminaRawDataImporter.BatchAndMapHolder> holders
           
private  RawDataType illumina
           
private static PluginParameter<String> invalidColumnsErrorParameter
           
private static PluginParameter<String> missingReporterErrorParameter
           
private  boolean nullIfException
           
private  boolean nullIfMissingReporter
           
private  NumberFormat numberFormat
           
private  int numInserted
           
private  int numRawBioAssays
           
private  Protocol protocol
           
private  List<RawBioAssay> rawBioAssays
           
private  Mapper reporterMapper
           
private  Scan scan
           
private  Software software
           
private static Pattern splitHeader
          A column header must be like: extendedpropertyname-arrayname
private  boolean useSmartFeatureMismatchHandling
           
private  boolean verifyColumns
           
 
Fields inherited from class net.sf.basedb.plugins.AbstractFlatFileImporter
CHARSET, charsetType, complexMappings, dataFooterRegexpParameter, dataHeaderRegexpParameter, dataSplitterRegexpParameter, DECIMAL_SEPARATOR, decimalSeparatorType, defaultErrorParameter, errorSection, fileParameter, fileType, headerRegexpParameter, ignoreRegexpParameter, invalidUseOfNullErrorParameter, mappingSection, maxDataColumnsParameter, minDataColumnsParameter, numberFormatErrorParameter, numberOutOfRangeErrorParameter, numDataColumnsType, optionalRegexpType, parserSection, requiredRegexpType, sectionRegexpParameter, stringTooLongErrorParameter, trimQuotesParameter
 
Fields inherited from class net.sf.basedb.core.plugin.AbstractPlugin
annotationSection, configuration, COPY_ANNOTATIONS, job, OVERWRITE_ANNOTATIONS, sc
 
Constructor Summary
IlluminaRawDataImporter()
           
 
Method Summary
protected  void begin(FlatFileParser ffp)
          Called just before parsing of the file begins.
protected  void beginData()
          Check column headers and map them to raw bioassays.
 void configure(GuiContext context, Request request, Response response)
          Configure the plugin.
protected  void end(boolean success)
          If successful: close batchers, associate with experiment if there is one, commit. If not successful: delete raw bioassays that have been created, rollback.
private  List<RawBioAssay> extractAndCreateRawBioAssays(DbControl dc, List<String> headers, boolean verifyColumns, File rawDataFile)
          Extract array names and raw data property names from the column headers.
 About getAbout()
          Get information about the plugin, such as name, version, authors, etc.
private static String getColumnName(RawDataProperty rdp, RawBioAssay rba)
          Convert a raw data property to a column name.
private  RequestInformation getConfigureJobParameters(GuiContext context)
           
protected  String getDecimalSeparator()
          Get the decimal separator used by numbers in the file.
 Set<GuiContext> getGuiContexts()
          Get a set containing all items that the plugin handles.
protected  FlatFileParser getInitializedFlatFileParser()
          Create a FlatFileParser that can parse Illumina data files: Data splitter: (,|\t); Header regexp: (.+)=(.*?)
private
<T extends BasicItem>
List<T>
getItems(DbControl dc, ItemQuery<T> query, Restriction... restrictions)
          Sort the items by name and add USE permission filter to the query.
 RequestInformation getRequestInformation(GuiContext context, String command)
          This method will return the RequestInformation for a given command, i.e.
protected  String getSuccessMessage(int skippedLines)
          Called if the parsing was successful to let the subclass generate a simple message that is sent back to the core and user interface.
protected  void handleData(FlatFileParser.Data data)
          Called by the parser for every line in the file that is a data line.
protected  void handleHeader(FlatFileParser.Line line)
          Copy headers so that we can set them on the raw bioassays in beginData()
protected  boolean isImportable(FlatFileParser ffp)
          Check that the first line contains the text "Illumina"
 String isInContext(GuiContext context, Object item)
          If used from an experiment context, verify that the experiment is an 'illumina' experiment and that the logged-in user has permission to write.
 boolean requiresConfiguration()
          Return TRUE, since the implementation requires it for finding the regular expressions used by the FlatFileParser.
 boolean supportsConfigurations()
          Returns TRUE, since that is how the plugins used to work before this method was introduced.
 
Methods inherited from class net.sf.basedb.plugins.AbstractFlatFileImporter
addErrorHandler, checkColumnMapping, checkColumnMapping, continueWithNextFileAfterError, doImport, finish, getCharset, getCharset, getCharsetParameter, getDecimalSeparatorParameter, getErrorHandler, getErrorOption, getFileIterator, getMainType, getMapper, getNumberFormat, getNumBytes, getPrimaryLocationFilter, getProgress, getSignalHandler, getTotalFileSize, handleSection, isImportable, log, log, log, log, run, setUpErrorHandling, start, wrapInputStream
 
Methods inherited from class net.sf.basedb.core.plugin.AbstractPlugin
checkInterrupted, cloneParameterWithDefaultValue, closeLogFile, createLogFile, done, getCopyAnnotationsParmeter, getCurrentConfiguration, getCurrentJob, getJobOrConfigurationValue, getOverwriteAnnotationsParameters, getPermissions, init, log, log, storeValue, storeValue, storeValues, validateRequestParameters
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface net.sf.basedb.core.plugin.Plugin
done, getMainType, getPermissions, init, run
 

Field Detail

about

private static final About about

guiContexts

private static final Set<GuiContext> guiContexts

featureIdentificationParameter

private static final PluginParameter<String> featureIdentificationParameter

invalidColumnsErrorParameter

private static final PluginParameter<String> invalidColumnsErrorParameter

missingReporterErrorParameter

private static final PluginParameter<String> missingReporterErrorParameter

featureMismatchErrorParameter

private static final PluginParameter<String> featureMismatchErrorParameter

associationsSection

private static final PluginParameter<String> associationsSection
Section definition for grouping associations to other items: scan, software, protocol and experiment


configureJob

private RequestInformation configureJob

illumina

private RawDataType illumina

ffp

private FlatFileParser ffp

dc

private DbControl dc

experiment

private Experiment experiment

design

private ArrayDesign design

scan

private Scan scan

software

private Software software

protocol

private Protocol protocol

fiMethod

private FeatureIdentificationMethod fiMethod

rawBioAssays

private List<RawBioAssay> rawBioAssays

holders

private List<IlluminaRawDataImporter.BatchAndMapHolder> holders

headerLines

private List<FlatFileParser.Line> headerLines

reporterMapper

private Mapper reporterMapper

numInserted

private int numInserted

numRawBioAssays

private int numRawBioAssays

numberFormat

private NumberFormat numberFormat

nullIfException

private boolean nullIfException

verifyColumns

private boolean verifyColumns

nullIfMissingReporter

private boolean nullIfMissingReporter

useSmartFeatureMismatchHandling

private boolean useSmartFeatureMismatchHandling

splitHeader

private static final Pattern splitHeader
A column header must be like: extendedpropertyname-arrayname
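
As an illustration of the header convention above, the following sketch splits a column header into a property name and an array name. The actual pattern held in splitHeader is not shown in this documentation; the regular expression below (which splits at the last hyphen), the class name HeaderSplitSketch, and the sample header are assumptions made for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HeaderSplitSketch {
    // Hypothetical pattern, not taken from the plug-in source.
    // The first group is greedy, so the split happens at the LAST hyphen.
    private static final Pattern SPLIT_HEADER = Pattern.compile("(.+)-(.+)");

    /** Returns {propertyName, arrayName}, or null if the header does not match. */
    static String[] split(String header) {
        Matcher m = SPLIT_HEADER.matcher(header);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        // Illustrative header only; real files may use other names.
        String[] parts = split("AVG_Signal-1234567_A");
        System.out.println("property=" + parts[0] + ", array=" + parts[1]);
    }
}
```

Headers without a hyphen, such as an ID column, return null in this sketch and would have to be handled separately.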

Constructor Detail

IlluminaRawDataImporter

public IlluminaRawDataImporter()
Method Detail

getAbout

public About getAbout()
Description copied from interface: Plugin
Get information about the plugin, such as name, version, authors, etc.

Specified by:
getAbout in interface Plugin
Returns:
An About object

supportsConfigurations

public boolean supportsConfigurations()
Description copied from class: AbstractPlugin
Returns TRUE, since that is how the plugins used to work before this method was introduced.

Specified by:
supportsConfigurations in interface Plugin
Overrides:
supportsConfigurations in class AbstractPlugin
Returns:
TRUE or FALSE

requiresConfiguration

public boolean requiresConfiguration()
Description copied from class: AbstractFlatFileImporter
Return TRUE, since the implementation requires it for finding the regular expressions used by the FlatFileParser. If this method is overridden and returns FALSE, the subclass must also override the AbstractFlatFileImporter.getInitializedFlatFileParser() method and provide a parser with all regular expressions and other options set.

Specified by:
requiresConfiguration in interface Plugin
Overrides:
requiresConfiguration in class AbstractFlatFileImporter
Returns:
TRUE or FALSE

getGuiContexts

public Set<GuiContext> getGuiContexts()
Description copied from interface: InteractivePlugin
Get a set containing all items that the plugin handles. I.e., if the plugin imports reporters, return a set containing Item.REPORTER. This information is used by client applications to put the plugin in the proper place in the user interface.

Specified by:
getGuiContexts in interface InteractivePlugin
Returns:
A Set containing Item:s, or null if the plugin is not concerned about items

isInContext

public String isInContext(GuiContext context,
                          Object item)
If used from an experiment context, verify that the experiment is an 'illumina' experiment and that the logged-in user has permission to write.

Specified by:
isInContext in interface InteractivePlugin
Parameters:
context - The current context of the client application, it is one of the values found in set returned by InteractivePlugin.getGuiContexts()
item - The currently active item, its type should match the GuiContext.getItem() type, or null if the context is a list context
Returns:
Null if the plugin can use that item, or a warning-level message explaining why the plugin can't be used

getRequestInformation

public RequestInformation getRequestInformation(GuiContext context,
                                                String command)
                                         throws BaseException
Description copied from interface: InteractivePlugin
This method will return the RequestInformation for a given command, i.e. the list of parameters and some nice help text.

Specified by:
getRequestInformation in interface InteractivePlugin
Parameters:
context - The current context of the client application, it is one of the values found in set returned by InteractivePlugin.getGuiContexts()
command - The command
Returns:
The RequestInformation for the command
Throws:
BaseException - if there is an error

configure

public void configure(GuiContext context,
                      Request request,
                      Response response)
Description copied from interface: InteractivePlugin
Configure the plugin. Hopefully the client is supplying values for the parameters specified by InteractivePlugin.getRequestInformation(GuiContext, String).

Specified by:
configure in interface InteractivePlugin
Parameters:
context - The current context of the client application, it is one of the values found in set returned by InteractivePlugin.getGuiContexts()
request - Request object with the command and parameters
response - Response object for the plugin to respond through

getInitializedFlatFileParser

protected FlatFileParser getInitializedFlatFileParser()
                                               throws BaseException
Create a FlatFileParser that can parse Illumina data files: Data splitter: (,|\t); Header regexp: (.+)=(.*?). NOTE! To begin with, we support both comma and tab as column splitter, but later on (in isImportable(FlatFileParser)), when we know which one is actually used, we change this in the parser. We need to do this since numbers may use comma as decimal separator.

Overrides:
getInitializedFlatFileParser in class AbstractFlatFileImporter
Returns:
An initialised flat file parser
Throws:
BaseException
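
The reason for narrowing the (,|\t) data splitter to a single character can be seen in a small stand-alone sketch. It uses only java.util.regex and does not call the FlatFileParser API. How the plug-in actually decides which splitter is in use is not described here; the detect method below (prefer tab if the line contains one) is a hypothetical heuristic, and the sample line is made up.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class SplitterSketch {
    // Initial splitter as given in the method summary: comma or tab.
    static final Pattern EITHER = Pattern.compile("(,|\\t)");

    /** Hypothetical heuristic: use tab if the line contains one, otherwise comma. */
    static Pattern detect(String line) {
        return line.indexOf('\t') >= 0 ? Pattern.compile("\\t") : Pattern.compile(",");
    }

    public static void main(String[] args) {
        // Tab-separated line where numbers use comma as decimal separator.
        String line = "ILMN_1\t12,5\t0,01";
        // The combined splitter also cuts the numbers apart at the comma.
        System.out.println(Arrays.toString(EITHER.split(line)));
        // The narrowed splitter keeps each number in one column.
        System.out.println(Arrays.toString(detect(line).split(line)));
    }
}
```

With the combined splitter the sample line yields five columns instead of three, which is the problem the note above describes.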

getDecimalSeparator

protected String getDecimalSeparator()
Description copied from class: AbstractFlatFileImporter
Get the decimal separator used by numbers in the file. This method first checks the job parameters for a value, then the configuration parameters. If not found null is returned.

Overrides:
getDecimalSeparator in class AbstractFlatFileImporter
Returns:
As specified by job parameter or "dot" if not
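
A minimal sketch of how a separator setting could be turned into number parsing, using only java.text. The value "dot" comes from the description above; the value "comma", the class name, and the parse method are assumptions for illustration and are not the plug-in's own code.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.text.ParseException;
import java.util.Locale;

class DecimalSeparatorSketch {
    /**
     * Parses text using the given separator setting.
     * "comma" (assumed value) selects ','; anything else, including null, selects '.'.
     */
    static double parse(String text, String separator) {
        DecimalFormatSymbols symbols = new DecimalFormatSymbols(Locale.ROOT);
        symbols.setDecimalSeparator("comma".equals(separator) ? ',' : '.');
        DecimalFormat format = new DecimalFormat("0.#", symbols);
        format.setGroupingUsed(false); // never treat ',' or '.' as a thousands separator
        try {
            return format.parse(text).doubleValue();
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not a number: " + text, e);
        }
    }
}
```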

isImportable

protected boolean isImportable(FlatFileParser ffp)
Check that the first line contains the text "Illumina"

Overrides:
isImportable in class AbstractFlatFileImporter
Parameters:
ffp - The FlatFileParser object used to parse the file
Returns:
TRUE if the first line contains "Illumina", FALSE otherwise
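
The check described above amounts to a test on the first line of the file. The sketch below works on a plain string rather than a FlatFileParser; the class and method names are hypothetical, and treating the match as case-sensitive is an assumption.

```java
class FirstLineCheckSketch {
    /** Returns true if the first line of the content contains "Illumina" (case-sensitive). */
    static boolean firstLineContainsIllumina(String content) {
        int eol = content.indexOf('\n');
        String firstLine = eol >= 0 ? content.substring(0, eol) : content;
        return firstLine.contains("Illumina");
    }
}
```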

begin

protected void begin(FlatFileParser ffp)
              throws BaseException
Description copied from class: AbstractFlatFileImporter
Called just before parsing of the file begins. A subclass may override this method if it needs to initialise some resources before the parsing starts. Note that this method is called once for each file returned by AbstractFlatFileImporter.getFileIterator().

Overrides:
begin in class AbstractFlatFileImporter
Throws:
BaseException
See Also:
AbstractFlatFileImporter.end(boolean)

handleHeader

protected void handleHeader(FlatFileParser.Line line)
                     throws BaseException
Copy headers so that we can set them on the raw bioassays in beginData()

Overrides:
handleHeader in class AbstractFlatFileImporter
Throws:
BaseException

beginData

protected void beginData()
Check column headers and map them to raw bioassays. Create raw bioassays. Initialise column Mapper:s.

Overrides:
beginData in class AbstractFlatFileImporter

handleData

protected void handleData(FlatFileParser.Data data)
                   throws BaseException
Description copied from class: AbstractFlatFileImporter
Called by the parser for every line in the file that is a data line.

Specified by:
handleData in class AbstractFlatFileImporter
Throws:
BaseException

end

protected void end(boolean success)
            throws BaseException
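If successful: close batchers, associate the raw bioassays with the experiment if there is one, and commit. If not successful: delete the raw bioassays that have been created and roll back.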

Overrides:
end in class AbstractFlatFileImporter
Parameters:
success - TRUE if the file was parsed successfully, FALSE otherwise
Throws:
BaseException
See Also:
AbstractFlatFileImporter.begin(FlatFileParser)

getSuccessMessage

protected String getSuccessMessage(int skippedLines)
Description copied from class: AbstractFlatFileImporter
Called if the parsing was successful to let the subclass generate a simple message that is sent back to the core and user interface. An example message might be: 178 reporters imported successfully. The default implementation always returns null. Note that this method is called once for every file returned by AbstractFlatFileImporter.getFileIterator().

Overrides:
getSuccessMessage in class AbstractFlatFileImporter
Parameters:
skippedLines - The number of data lines that were skipped due to errors

extractAndCreateRawBioAssays

private List<RawBioAssay> extractAndCreateRawBioAssays(DbControl dc,
                                                       List<String> headers,
                                                       boolean verifyColumns,
                                                       File rawDataFile)
Extract array names and raw data property names from the column headers. Create a raw bioassay for each array and set headers, scan, software and protocol. Optionally, verify that all arrays have the same data columns. Duplicate columns are always reported as an error.
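
The grouping and verification steps described above can be sketched with plain collections. This is not the plug-in's implementation: it creates no RawBioAssay items, the split at the last hyphen is an assumption, and the class name, method name, and exception type are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class ColumnGroupingSketch {
    /**
     * Groups "property-array" headers by array name.
     * Duplicate columns always fail; differing column sets fail only if verifyColumns is true.
     */
    static Map<String, Set<String>> group(List<String> headers, boolean verifyColumns) {
        Map<String, Set<String>> byArray = new LinkedHashMap<>();
        for (String header : headers) {
            int dash = header.lastIndexOf('-'); // assumption: split at the last hyphen
            if (dash <= 0 || dash == header.length() - 1) {
                continue; // not a per-array data column, e.g. an ID column
            }
            String property = header.substring(0, dash);
            String array = header.substring(dash + 1);
            Set<String> properties = byArray.computeIfAbsent(array, k -> new LinkedHashSet<>());
            if (!properties.add(property)) {
                throw new IllegalArgumentException("Duplicate column: " + header);
            }
        }
        if (verifyColumns) {
            Set<String> expected = null;
            for (Map.Entry<String, Set<String>> entry : byArray.entrySet()) {
                if (expected == null) {
                    expected = entry.getValue(); // first array defines the expected columns
                } else if (!expected.equals(entry.getValue())) {
                    throw new IllegalArgumentException(
                        "Array " + entry.getKey() + " has different columns: " + entry.getValue());
                }
            }
        }
        return byArray;
    }
}
```

In the plug-in, each key of the resulting map would correspond to one raw bioassay; here the map only shows which properties were found for each array.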


getConfigureJobParameters

private RequestInformation getConfigureJobParameters(GuiContext context)

getColumnName

private static String getColumnName(RawDataProperty rdp,
                                    RawBioAssay rba)
Convert a raw data property to a column name.


getItems

private <T extends BasicItem> List<T> getItems(DbControl dc,
                                               ItemQuery<T> query,
                                               Restriction... restrictions)
Sort the items by name and add USE permission filter to the query.

