Class BfsImporter

java.lang.Object
net.sf.basedb.util.importer.spotdata.BfsImporter

public class BfsImporter
extends Object
Imports spot data from a serial or matrix BFS. Before the importer can be used, various configuration properties must be set.

The import is started with doImport(), which begins by parsing the metadata file. Additional files that are referenced in the [files] section are opened via the InputStreamLocator. File entries that start with 'x-' are treated as extra files that should be attached to the child bioassay set, but only if an ExtraFileImporter has been specified.

The section [settings] is used to control some aspects of the parsing. If the data is going into the same datacube the section is not needed. If a new datacube is needed the importer needs to know how to map the spot data to positions/reporters and to assays. We support three cases:

  1. The data uses the same datacube as the parent
  2. The data needs a new datacube but there is a one-to-one relation with the parent assays
  3. The data needs a new datacube and there may be more than one parent assay for each child assay

A new datacube is created by including a new-data-cube entry in the settings section. Support for multiple parents per assay is enabled by including a multi-assay-parents entry.
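
The mapping from settings entries to the three cases can be sketched as follows. This is an illustration only: BfsImportCase and its Case enum are hypothetical names that are not part of BASE, and the handling of a multi-assay-parents entry without new-data-cube is an assumption, since the description above does not cover that combination.

```java
import java.util.Set;

// Hypothetical helper, not part of BASE: classifies the keys found in the
// [settings] section into one of the three supported import cases.
class BfsImportCase {
    enum Case { SAME_CUBE, NEW_CUBE_ONE_TO_ONE, NEW_CUBE_MULTI_PARENT }

    static Case fromSettings(Set<String> settingsKeys) {
        if (!settingsKeys.contains("new-data-cube")) {
            // Assumption: without new-data-cube the parent's datacube is reused
            // and a stray multi-assay-parents entry is ignored.
            return Case.SAME_CUBE;
        }
        return settingsKeys.contains("multi-assay-parents")
            ? Case.NEW_CUBE_MULTI_PARENT
            : Case.NEW_CUBE_ONE_TO_ONE;
    }

    public static void main(String[] args) {
        System.out.println(fromSettings(Set.of()));
        System.out.println(fromSettings(Set.of("new-data-cube")));
        System.out.println(fromSettings(Set.of("new-data-cube", "multi-assay-parents")));
    }
}
```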

The child bioassay set is by default assumed to use the same IntensityTransform as the parent bioassay set. If the imported data uses a different transform this should be specified by the transform setting. Allowed values are those defined by the IntensityTransform enumeration (NONE, LOG2 or LOG10).
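
A minimal sketch of the fallback rule for the transform setting is shown below. The Transform enum here is a local stand-in that mirrors the three values named above, not the real IntensityTransform class, and TransformSetting is a hypothetical name. Whether the importer accepts lower-case values is not stated, so this sketch assumes exact, upper-case names.

```java
// Hypothetical helper, not part of BASE.
class TransformSetting {
    // Local stand-in for the IntensityTransform enumeration described above.
    enum Transform { NONE, LOG2, LOG10 }

    // Returns the parent's transform when the setting is absent or blank,
    // otherwise the value named by the setting.
    static Transform resolve(String setting, Transform parent) {
        if (setting == null || setting.trim().isEmpty()) {
            return parent;
        }
        // Assumption: names must match exactly; valueOf throws
        // IllegalArgumentException for anything else.
        return Transform.valueOf(setting.trim());
    }

    public static void main(String[] args) {
        System.out.println(resolve(null, Transform.LOG2));
        System.out.println(resolve("LOG10", Transform.LOG2));
    }
}
```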

The importer needs 'rdata' and 'pdata' files in all cases. The ID column in the rdata file should be the position number. If the data is using the same datacube (case 1) no more columns are needed. The order of the positions doesn't matter as long as it matches the spot data files. All positions must be unique. If the data needs a new datacube (cases 2 and 3) the rdata file also needs either an 'External ID' or an 'Internal ID' column, which is used to map positions to reporters.
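
The uniqueness requirement on positions can be illustrated with a small check. RdataPositions is a hypothetical name, and the input is assumed to be the values of the rdata ID column already extracted as strings; how the importer itself reports a duplicate is not specified here.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of BASE.
class RdataPositions {
    // Parses the ID column into position numbers, keeping file order
    // and rejecting any position that appears more than once.
    static Set<Integer> parseUnique(List<String> idColumn) {
        Set<Integer> positions = new LinkedHashSet<>();
        for (String id : idColumn) {
            int position = Integer.parseInt(id.trim());
            if (!positions.add(position)) {
                throw new IllegalArgumentException("Duplicate position in rdata: " + position);
            }
        }
        return positions;
    }

    public static void main(String[] args) {
        // Order does not matter, only uniqueness.
        System.out.println(parseUnique(List.of("3", "1", "2")));
    }
}
```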

The meaning of the ID column in the pdata file depends on the case. In cases 1 and 2 the ID is the internal ID of the parent assay, and no more columns are needed. In case 3 the ID column is just a unique positive integer, and the pdata file also needs a second column, 'Parent ID', which should be a comma-separated list of the internal assay ID values of the parent assays.
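
For case 3, splitting a 'Parent ID' value into individual assay IDs might look like the sketch below. ParentIds is a hypothetical name, and tolerating whitespace and empty items around the commas is an assumption, not documented importer behaviour.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of BASE.
class ParentIds {
    // Splits a comma-separated 'Parent ID' value into internal assay IDs.
    static List<Integer> parse(String value) {
        List<Integer> ids = new ArrayList<>();
        for (String part : value.split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                // Assumption: blank items are skipped rather than rejected.
                ids.add(Integer.parseInt(trimmed));
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        System.out.println(parse("12, 15,20"));
    }
}
```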

The [sdata] section lists all spot data columns. The importer looks for entries of the form 'Ch N' where N is 1, 2, and so on. Exactly one entry for each channel is required.
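
The "exactly one entry per channel" rule can be sketched as a lookup over the [sdata] entries. ChannelColumns is a hypothetical name, the channel count is passed in as a parameter because its origin is not described here, and ignoring entries that are not of the form 'Ch N' or that fall outside the channel range is an assumption.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of BASE.
class ChannelColumns {
    private static final Pattern CHANNEL = Pattern.compile("Ch (\\d+)");

    // Maps each channel number (1..channels) to the index of its entry
    // in the [sdata] list; fails if a channel is missing or duplicated.
    static Map<Integer, Integer> locate(List<String> sdataEntries, int channels) {
        Map<Integer, Integer> found = new TreeMap<>();
        for (int i = 0; i < sdataEntries.size(); i++) {
            Matcher m = CHANNEL.matcher(sdataEntries.get(i).trim());
            if (!m.matches()) {
                continue; // assumption: other entries are not this sketch's concern
            }
            int channel = Integer.parseInt(m.group(1));
            if (channel < 1 || channel > channels) {
                continue; // assumption: out-of-range channels are ignored
            }
            if (found.put(channel, i) != null) {
                throw new IllegalArgumentException("Duplicate entry for channel " + channel);
            }
        }
        for (int c = 1; c <= channels; c++) {
            if (!found.containsKey(c)) {
                throw new IllegalArgumentException("Missing entry for channel " + c);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(locate(List.of("Ch 1", "Flag", "Ch 2"), 2));
    }
}
```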

Version:
2.15
Author:
Nicklas
Last modified:
$Date: 2015-04-20 11:08:18 +0200 (Mon, 20 Apr 2015) $
  • Constructor Details

    • BfsImporter

      public BfsImporter()
      Create a new importer object.
  • Method Details

    • setAutoCloseParsers

      public void setAutoCloseParsers(boolean autoClose)
      If this option is set, all parsers are automatically closed when all data has been read from them. This setting is enabled by default.
    • setDbControl

      public void setDbControl(DbControl dc)
      Sets the DbControl that should be used to get data from the database.
    • getDbControl

      public DbControl getDbControl()
      Get the current DbControl.
    • setProgressReporter

      public void setProgressReporter(ProgressReporter progress)
      Set the progress reporter that is used to report progress. Call setProgress(int, String) to update the current status.
    • getProgressReporter

      public ProgressReporter getProgressReporter()
      Get the progress reporter.
    • setTransformation

      public void setTransformation(Transformation transformation)
      Set the destination transformation. A child bioassay set is created.
      Parameters:
      transformation - The transformation
    • getTransformation

      public Transformation getTransformation()
      Get the destination transformation.
    • setMetadataParser

      public void setMetadataParser(MetadataParser parser)
      Set the metadata file parser. The metadata file should contain information about other files and their contents.
    • getMetadataParser

      public MetadataParser getMetadataParser()
    • setInputStreamLocator

      public void setInputStreamLocator(InputStreamLocator locator)
      Set the input stream locator that is responsible for opening files that are referenced from the metadata file.
    • setExtraFileImporter

      public void setExtraFileImporter(ExtraFileImporter extraFileImporter)
    • setProgress

      protected void setProgress(int percent, String message)
      Update the progress of the import.
      See Also:
      ProgressReporter
    • doImport

      public BioAssaySet doImport() throws IOException
      Start the import.
      Returns:
      The created bioassay set or null if the BASEfile didn't contain spot data
      Throws:
      IOException