Class BaseFileParser

java.lang.Object
net.sf.basedb.util.basefile.BaseFileParser

public class BaseFileParser
extends Object
Parser for serial and matrix BASEfiles. This class sets up and performs the line-by-line parsing, but it does not handle the data. Data is handled section by section by registering BaseFileSectionParser instances with setSectionParser(String, BaseFileSectionParser). Sections that do not have a registered parser are skipped.
Version:
2.14
Author:
Nicklas
Last modified
2011-03-16 12:48:47 +0100 (Wed, 16 Mar 2011)
  • Field Details

  • Constructor Details

    • BaseFileParser

      public BaseFileParser()
      Creates a new parser object.
  • Method Details

    • setSectionParser

      public void setSectionParser(String section, BaseFileSectionParser parser)
      Adds a section parser. If a parser already exists for the given section it is replaced.
      Parameters:
      section - The name of the section the parser handles
      parser - The parser, or null to remove a previously registered parser
    • getSectionParser

      public BaseFileSectionParser getSectionParser(String section)
      Get the parser that is currently registered for a section.
      Parameters:
      section - The name of the section
      Returns:
      A section parser, or null if no parser is registered
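The registration rules described for setSectionParser and getSectionParser (replace an existing parser, remove with null, return null when nothing is registered) can be modelled with a plain map. The sketch below is illustrative only and is not the BaseFileParser implementation: SectionHandler is a hypothetical stand-in for BaseFileSectionParser, SectionRegistry is a hypothetical class, and the section name "spots" is just an example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for BaseFileSectionParser; the real interface is not shown on this page.
interface SectionHandler {
    void handle(String line);
}

// Illustrative registry only; an assumption about how such a lookup could work.
class SectionRegistry {
    private final Map<String, SectionHandler> handlers = new HashMap<>();

    // Registers or replaces the handler for a section; a null handler removes the registration.
    void setSectionParser(String section, SectionHandler handler) {
        if (handler == null) {
            handlers.remove(section);
        } else {
            handlers.put(section, handler);
        }
    }

    // Returns the registered handler, or null if no handler is registered for the section.
    SectionHandler getSectionParser(String section) {
        return handlers.get(section);
    }
}
```

In this model, a section whose lookup returns null would simply be skipped by the line-by-line loop.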
    • setRedefinedColumnName

      public void setRedefinedColumnName(String section, String defaultName, String redefinedName)
      Redefine a column name for a specified section. This method is useful for configuring parsers that normally look for, for example, the column 'reporter' to instead look for 'Reporter ID'.
      Parameters:
      section - The section the redefined name is valid for
      defaultName - The default name (e.g. 'reporter')
      redefinedName - The redefined name (e.g. 'Reporter ID')
    • getRedefinedColumnName

      public String getRedefinedColumnName(String section, String defaultName)
      Get the redefined column name for a section. If the name has not been redefined, the default name is returned.
      Parameters:
      section - The section the redefined name is valid for
      defaultName - The default name
      Returns:
      The redefined name, or the default name if it has not been redefined
    • copyRedefinedColumnNames

      public void copyRedefinedColumnNames(BaseFileParser parser)
      Copy redefined column names from the given parser into this parser. Column names that have already been redefined in this parser will be overwritten only if they have also been redefined in the other parser.
      Parameters:
      parser - The parser to copy redefined column names from
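The three column-name methods above share one lookup rule: a redefined name is stored per section, and the default name is returned when nothing has been redefined. A minimal sketch of that behaviour, assuming a nested map as storage, follows. ColumnNames is a hypothetical class written for illustration and is not taken from the BASE source.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative model of per-section column name redefinition; not the BaseFileParser code.
class ColumnNames {
    // section name -> (default column name -> redefined column name)
    private final Map<String, Map<String, String>> redefined = new HashMap<>();

    void setRedefinedColumnName(String section, String defaultName, String redefinedName) {
        redefined.computeIfAbsent(section, s -> new HashMap<>()).put(defaultName, redefinedName);
    }

    // Falls back to the default name when no redefinition exists for the section.
    String getRedefinedColumnName(String section, String defaultName) {
        Map<String, String> names = redefined.get(section);
        if (names == null) {
            return defaultName;
        }
        return names.getOrDefault(defaultName, defaultName);
    }

    // Entries from 'other' overwrite entries here only where 'other' has a redefinition.
    void copyRedefinedColumnNames(ColumnNames other) {
        for (Map.Entry<String, Map<String, String>> e : other.redefined.entrySet()) {
            redefined.computeIfAbsent(e.getKey(), s -> new HashMap<>()).putAll(e.getValue());
        }
    }
}
```

With this model, copying from a parser that redefines only 'reporter' leaves every other redefinition in the target untouched, which matches the overwrite rule stated for copyRedefinedColumnNames.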
    • setProgressReporter

      public void setProgressReporter(AbsoluteProgressReporter progress)
      Set a progress reporter that will be used to report how the parsing is progressing.
      Parameters:
      progress - A progress reporter, or null to not report progress
    • parse

      public FlatFileParser parse(InputStream in, String charset) throws IOException
      Parse the given input stream.
      Parameters:
      in - The data stream to parse
      charset - The character set used by the data, or null to use the configured default
      Returns:
      The FlatFileParser that was used to parse the BASEfile
      Throws:
      IOException - If there is any problem with the parsing
    • getSectionCount

      public int getSectionCount(String section)
      The number of times we have seen a certain section in the file.
      Parameters:
      section - The name of the section
    • increaseSectionCount

      protected void increaseSectionCount(String section, int count)
      Increases the count of how many times a section has been seen in the file.
      Parameters:
      section - The name of the section
      count - The count to add to the current count
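The pairing of getSectionCount and increaseSectionCount amounts to a per-section tally. The sketch below shows one way such a tally could be kept. SectionCounter is a hypothetical class, and returning 0 for a section that has never been seen is an assumption, since the documentation above does not state the return value for that case.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative per-section tally; not the BaseFileParser implementation.
class SectionCounter {
    private final Map<String, Integer> counts = new HashMap<>();

    // Assumption: a section that has not been seen yet reports a count of 0.
    int getSectionCount(String section) {
        return counts.getOrDefault(section, 0);
    }

    // Adds 'count' to the current tally for the section.
    void increaseSectionCount(String section, int count) {
        counts.merge(section, count, Integer::sum);
    }
}
```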
    • setProgress

      public void setProgress(long completed, String message)
      Update the progress of the parsing.
      See Also:
      ProgressReporter
    • getInitializedFlatFileParser

      protected FlatFileParser getInitializedFlatFileParser(InputStream stream, String charset)
      Creates a FlatFileParser for parsing a BASEfile.
      Parameters:
      stream - The stream that the parser should read from.
      Returns:
      The initialised parser
    • getRequiredHeader

      public String getRequiredHeader(FlatFileParser ffp, String header, String section, String filename)
      Get the value of a required header. This method calls FlatFileParser.getHeader(String) to get the value of the header. If the header is missing or has an empty value an exception is thrown.
      Parameters:
      ffp - The flat file parser
      header - The name of the header
      section - The name of the section that is being parsed
      filename - The name of the file that is being parsed (used to create the error message in case the header is missing)
      Returns:
      The value of the header
    • getRequiredHeader

      public List<String> getRequiredHeader(FlatFileParser ffp, String header, String split, String section, String filename)
      Get the value of a header as a list of sub-values. If the header is missing or empty an exception is thrown. Otherwise the 'split' regular expression is used to split the value into a list of values.
      Parameters:
      ffp - The flat file parser
      header - The name of the header
      split - A regular expression used to split the value
      section - The name of the section that is being parsed
      filename - The name of the file that is being parsed (used to create the error message in case the header is missing)
      Returns:
      A list with the values of the header
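The two getRequiredHeader variants follow a "fetch, validate, then optionally split" pattern. The sketch below imitates that pattern with a plain Map in place of FlatFileParser. RequiredHeaders is a hypothetical helper, the header name and file name used in examples are made up, and IllegalArgumentException is an assumption because the exception type is not named above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Illustrative helper; a Map stands in for the headers held by a FlatFileParser.
class RequiredHeaders {
    // Returns the header value, or throws if it is missing or empty.
    static String getRequiredHeader(Map<String, String> headers, String header,
            String section, String filename) {
        String value = headers.get(header);
        if (value == null || value.isEmpty()) {
            // Assumed exception type; the real method may throw something else.
            throw new IllegalArgumentException("Missing header '" + header
                + "' in section '" + section + "' of file '" + filename + "'");
        }
        return value;
    }

    // Splits the validated header value with the given regular expression.
    static List<String> getRequiredHeader(Map<String, String> headers, String header,
            String split, String section, String filename) {
        String value = getRequiredHeader(headers, header, section, filename);
        return Arrays.asList(value.split(split));
    }
}
```

For a tab-separated header value, the split expression would be "\\t" in Java source. Note that String.split drops trailing empty strings, which may or may not match the real method.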
    • getRequiredIndex

      public int getRequiredIndex(List<String> values, String value, String header, String section, int line, String filename)
      Get the index of value in a list of values or throw an exception if the value is not found. The lookup is case-insensitive.
      Parameters:
      values - The list of values to look in
      value - The value to look for
      header - The name of the header the values are from
      section - The name of the section that is being parsed
      line - The current line number in the parsed file
      filename - The name of the file that is being parsed (used to create the error message in case the header is missing)
      Returns:
      The index of the value
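A case-insensitive lookup like the one described for getRequiredIndex can be sketched as a linear scan with String.equalsIgnoreCase. RequiredIndex is a hypothetical helper, and the exception type and message format are assumptions, since only "throw an exception" is documented above.

```java
import java.util.List;

// Illustrative helper; not the BaseFileParser implementation.
class RequiredIndex {
    // Returns the position of 'value' in 'values', ignoring case, or throws if absent.
    static int getRequiredIndex(List<String> values, String value, String header,
            String section, int line, String filename) {
        for (int i = 0; i < values.size(); i++) {
            if (values.get(i).equalsIgnoreCase(value)) {
                return i;
            }
        }
        // Assumed exception type and message; the real method may differ.
        throw new IllegalArgumentException("Value '" + value + "' not found in header '"
            + header + "' in section '" + section + "' at line " + line
            + " of file '" + filename + "'");
    }
}
```

Combined with a redefined column name, a caller could look up 'Reporter ID' in the column list and still match a file that writes it as 'reporter id'.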