2.17.2: 2011-06-17

net.sf.basedb.util.basefile
Class BaseFileParser

java.lang.Object
  extended by net.sf.basedb.util.basefile.BaseFileParser

public class BaseFileParser
extends Object

Parser for serial and matrix BASEfiles. This class sets up and performs the line-by-line parsing, but it doesn't handle the data. Data is handled section-wise by registering BaseFileSectionParser instances with setSectionParser(String, BaseFileSectionParser). Sections that don't have a registered parser are skipped.

Version:
2.14
Author:
Nicklas
Last modified:
$Date: 2010-09-10 13:09:05 +0200 (Fri, 10 Sep 2010) $

Field Summary
private  Map<String,BaseFileSectionParser> parsers
           
private  AbsoluteProgressReporter progress
           
private  Map<String,String> redefinedColumnNames
           
private  Map<String,Integer> sectionCount
           
 
Constructor Summary
BaseFileParser()
          Creates a new parser object.
 
Method Summary
 void checkInterrupted()
          Deprecated. In 2.16, use ThreadSignalHandler.checkInterrupted() instead
 void copyRedefinedColumnNames(BaseFileParser parser)
          Copy redefined column names from the given parser into this parser.
protected  FlatFileParser getInitializedFlatFileParser(InputStream stream, String charset)
          Creates a FlatFileParser for parsing a BASEfile.
 String getRedefinedColumnName(String section, String defaultName)
          Get the redefined column name for a section.
 String getRequiredHeader(FlatFileParser ffp, String header, String section, String filename)
          Get the value of a required header.
 List<String> getRequiredHeader(FlatFileParser ffp, String header, String split, String section, String filename)
          Get the value of a header as a list of sub-values.
 int getRequiredIndex(List<String> values, String value, String header, String section, int line, String filename)
          Get the index of value in a list of values or throw an exception if the value is not found.
 int getSectionCount(String section)
          The number of times we have seen a certain section in the file.
 BaseFileSectionParser getSectionParser(String section)
          Get the parser that is currently registered for a section.
protected  void increaseSectionCount(String section, int count)
          Counts the number of times a section has been seen in the file.
 FlatFileParser parse(InputStream in, String charset)
          Parse the given input stream.
 void setProgress(long completed, String message)
          Update the progress of the parsing.
 void setProgressReporter(AbsoluteProgressReporter progress)
          Set a progress reporter that will be used to report how the parsing is progressing.
 void setRedefinedColumnName(String section, String defaultName, String redefinedName)
          Redefine a column name for a specified section.
 void setSectionParser(String section, BaseFileSectionParser parser)
          Adds a section parser.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

parsers

private final Map<String,BaseFileSectionParser> parsers

progress

private AbsoluteProgressReporter progress

sectionCount

private Map<String,Integer> sectionCount

redefinedColumnNames

private Map<String,String> redefinedColumnNames

Constructor Detail

BaseFileParser

public BaseFileParser()
Creates a new parser object.

Method Detail

setSectionParser

public void setSectionParser(String section,
                             BaseFileSectionParser parser)
Adds a section parser. If a parser already exists for the given section it is replaced.

Parameters:
section - The name of the section the parser handles
parser - The parser, or null to remove a previously registered parser

getSectionParser

public BaseFileSectionParser getSectionParser(String section)
Get the parser that is currently registered for a section.

Parameters:
section - The name of the section
Returns:
A section parser, or null if no parser is registered
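
The register, replace, remove and lookup rules described for setSectionParser and getSectionParser can be sketched without the BASE classes. This is a minimal stand-in, not the actual implementation: SectionRegistrySketch and its nested SectionHandler interface are hypothetical names, and SectionHandler only takes the place of BaseFileSectionParser, whose real methods are not shown on this page.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of a per-section parser registry (not the BASE implementation).
class SectionRegistrySketch {

    // Hypothetical stand-in for BaseFileSectionParser.
    interface SectionHandler {
        void handle(String sectionName);
    }

    private final Map<String, SectionHandler> parsers = new HashMap<>();

    // Registers a handler and replaces any existing one.
    // Passing null removes a previously registered handler.
    public void setSectionParser(String section, SectionHandler parser) {
        if (parser == null) {
            parsers.remove(section);
        } else {
            parsers.put(section, parser);
        }
    }

    // Returns the registered handler, or null if none is registered.
    public SectionHandler getSectionParser(String section) {
        return parsers.get(section);
    }

    public static void main(String[] args) {
        SectionRegistrySketch registry = new SectionRegistrySketch();
        SectionHandler first = name -> System.out.println("first handler: " + name);
        SectionHandler second = name -> System.out.println("second handler: " + name);

        registry.setSectionParser("spots", first);
        registry.setSectionParser("spots", second); // replaces the first handler
        registry.getSectionParser("spots").handle("spots"); // prints "second handler: spots"

        registry.setSectionParser("spots", null); // removes the registration
        System.out.println(registry.getSectionParser("spots")); // prints "null"
    }
}
```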

setRedefinedColumnName

public void setRedefinedColumnName(String section,
                                   String defaultName,
                                   String redefinedName)
Redefine a column name for a specified section. This method is useful for configuring parsers that normally look for, for example, the column 'reporter' to instead look for 'Reporter ID'.

Parameters:
section - The section the redefined name is valid for
defaultName - The default name (e.g. 'reporter')
redefinedName - The redefined name (e.g. 'Reporter ID')

getRedefinedColumnName

public String getRedefinedColumnName(String section,
                                     String defaultName)
Get the redefined column name for a section. If the name has not been redefined, the default name is returned.

Parameters:
section - The section the redefined name is valid for
defaultName - The default name
Returns:
The redefined name, or the default name if it has not been redefined
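
The fallback rule (return the redefined name if one exists, otherwise the default name) can be illustrated with a small self-contained sketch. RedefinedNamesSketch is a hypothetical class, and combining section and default name into a single map key is an assumption made for illustration. The page only shows that the names are held in a Map<String,String>.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of per-section column name redefinition (not the BASE implementation).
class RedefinedNamesSketch {

    private final Map<String, String> redefined = new HashMap<>();

    // Assumption: section and default name are joined into one lookup key.
    private static String key(String section, String defaultName) {
        return section + ":" + defaultName;
    }

    public void setRedefinedColumnName(String section, String defaultName, String redefinedName) {
        redefined.put(key(section, defaultName), redefinedName);
    }

    // Returns the redefined name, or the default name if nothing was redefined.
    public String getRedefinedColumnName(String section, String defaultName) {
        return redefined.getOrDefault(key(section, defaultName), defaultName);
    }

    public static void main(String[] args) {
        RedefinedNamesSketch names = new RedefinedNamesSketch();
        names.setRedefinedColumnName("spots", "reporter", "Reporter ID");

        System.out.println(names.getRedefinedColumnName("spots", "reporter")); // prints "Reporter ID"
        System.out.println(names.getRedefinedColumnName("spots", "position")); // prints "position"
    }
}
```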

copyRedefinedColumnNames

public void copyRedefinedColumnNames(BaseFileParser parser)
Copy redefined column names from the given parser into this parser. Column names that have already been redefined in this parser will be overwritten only if they have also been redefined in the other parser.

Parameters:
parser - The parser to copy redefined column names from
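
The overwrite rule described above matches the behaviour of Map.putAll: entries present in the source replace entries in the target, and entries that exist only in the target are left alone. The sketch below shows this with plain maps. CopyRedefinedNamesSketch and copyInto are hypothetical names, and whether the real method uses putAll internally is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of copying redefined names between two parsers (not the BASE implementation).
class CopyRedefinedNamesSketch {

    // Copies every entry of source into target, overwriting matching keys only.
    public static void copyInto(Map<String, String> target, Map<String, String> source) {
        target.putAll(source);
    }

    public static void main(String[] args) {
        Map<String, String> thisParser = new HashMap<>();
        thisParser.put("spots:reporter", "Reporter ID");
        thisParser.put("spots:position", "Pos");

        Map<String, String> otherParser = new HashMap<>();
        otherParser.put("spots:reporter", "Probe ID");
        otherParser.put("assays:id", "Assay ID");

        copyInto(thisParser, otherParser);

        System.out.println(thisParser.get("spots:reporter")); // prints "Probe ID" (overwritten)
        System.out.println(thisParser.get("spots:position")); // prints "Pos" (kept)
        System.out.println(thisParser.get("assays:id"));      // prints "Assay ID" (added)
    }
}
```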

setProgressReporter

public void setProgressReporter(AbsoluteProgressReporter progress)
Set a progress reporter that will be used to report how the parsing is progressing.

Parameters:
progress - A progress reporter, or null to not report progress

parse

public FlatFileParser parse(InputStream in,
                            String charset)
                     throws IOException
Parse the given input stream.

Parameters:
in - The data stream to parse
charset - The character set used by the data, or null to use the configured default
Returns:
The FlatFileParser that was used to parse the BASEfile
Throws:
IOException - If there is any problem with the parsing

getSectionCount

public int getSectionCount(String section)
The number of times we have seen a certain section in the file.

Parameters:
section - The name of the section

increaseSectionCount

protected void increaseSectionCount(String section,
                                    int count)
Counts the number of times a section has been seen in the file.

Parameters:
section - The name of the section
count - The count to add to the current count
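
The section counting in getSectionCount and increaseSectionCount can be sketched with a Map<String,Integer>, the same type as the sectionCount field. SectionCountSketch is a hypothetical class. Returning 0 for a section that has not been seen is an assumption, since the page does not document the return value.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of per-section counters (not the BASE implementation).
class SectionCountSketch {

    private final Map<String, Integer> sectionCount = new HashMap<>();

    // Adds count to the current count for the section.
    public void increaseSectionCount(String section, int count) {
        sectionCount.merge(section, count, Integer::sum);
    }

    // Assumption: a section that has never been seen has a count of 0.
    public int getSectionCount(String section) {
        return sectionCount.getOrDefault(section, 0);
    }

    public static void main(String[] args) {
        SectionCountSketch counter = new SectionCountSketch();
        counter.increaseSectionCount("spots", 1);
        counter.increaseSectionCount("spots", 1);

        System.out.println(counter.getSectionCount("spots"));  // prints "2"
        System.out.println(counter.getSectionCount("assays")); // prints "0"
    }
}
```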

checkInterrupted

@Deprecated
public void checkInterrupted()
Deprecated. In 2.16, use ThreadSignalHandler.checkInterrupted() instead

Checks if the currently executing thread has been interrupted and throws a SignalException if it has.
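
A check of this kind can be sketched with the standard Thread API. InterruptCheckSketch and InterruptedSignal are hypothetical names, with InterruptedSignal standing in for SignalException. Clearing the interrupt flag by calling Thread.interrupted() is an assumption, because the page does not say whether the real method clears it.

```java
// Sketch of an interrupt check that throws an unchecked exception (not the BASE implementation).
class InterruptCheckSketch {

    // Hypothetical stand-in for SignalException.
    static class InterruptedSignal extends RuntimeException {
        InterruptedSignal(String message) {
            super(message);
        }
    }

    // Throws if the current thread has been interrupted.
    // Assumption: Thread.interrupted() is used, which also clears the interrupt flag.
    public static void checkInterrupted() {
        if (Thread.interrupted()) {
            throw new InterruptedSignal("Thread was interrupted");
        }
    }

    public static void main(String[] args) {
        checkInterrupted(); // no interrupt pending, returns normally

        Thread.currentThread().interrupt();
        try {
            checkInterrupted();
        } catch (InterruptedSignal e) {
            System.out.println("Caught: " + e.getMessage()); // prints "Caught: Thread was interrupted"
        }
    }
}
```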


setProgress

public void setProgress(long completed,
                        String message)
Update the progress of the parsing.

See Also:
ProgressReporter

getInitializedFlatFileParser

protected FlatFileParser getInitializedFlatFileParser(InputStream stream,
                                                      String charset)
Creates a FlatFileParser for parsing a BASEfile.

Parameters:
stream - The stream that the parser should read from.
Returns:
The initialised parser

getRequiredHeader

public String getRequiredHeader(FlatFileParser ffp,
                                String header,
                                String section,
                                String filename)
Get the value of a required header. This method calls FlatFileParser.getHeader(String) to get the value of the header. If the header is missing or has an empty value an exception is thrown.

Parameters:
ffp - The flat file parser
header - The name of the header
section - The name of the section that is being parsed
filename - The name of the file that is being parsed (used to create the error message in case the header is missing)
Returns:
The value of the header

getRequiredHeader

public List<String> getRequiredHeader(FlatFileParser ffp,
                                      String header,
                                      String split,
                                      String section,
                                      String filename)
Get the value of a header as a list of sub-values. If the header is missing or empty an exception is thrown. Otherwise the 'split' regular expression is used to split the value into a list of values.

Parameters:
ffp - The flat file parser
header - The name of the header
split - A regular expression used to split the value
section - The name of the section that is being parsed
filename - The name of the file that is being parsed (used to create the error message in case the header is missing)
Returns:
A list with the values of the header
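
The check-then-split behaviour can be sketched as follows. RequiredHeaderSketch is a hypothetical class, a plain Map stands in for the header lookup done through FlatFileParser.getHeader(String), and IllegalArgumentException is an assumption, because the page does not name the exception type that is thrown.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of reading a required header as a list of sub-values (not the BASE implementation).
class RequiredHeaderSketch {

    // The headers map stands in for FlatFileParser.getHeader(String).
    public static List<String> getRequiredHeader(Map<String, String> headers, String header,
            String split, String section, String filename) {
        String value = headers.get(header);
        if (value == null || value.isEmpty()) {
            // Assumption: the real exception type is not documented on this page.
            throw new IllegalArgumentException("Missing required header '" + header
                + "' in section '" + section + "' of file '" + filename + "'");
        }
        // The split argument is treated as a regular expression.
        return Arrays.asList(value.split(split));
    }

    public static void main(String[] args) {
        Map<String, String> headers = new HashMap<>();
        headers.put("columns", "position\treporter\tintensity");

        List<String> columns = getRequiredHeader(headers, "columns", "\\t", "spots", "example.base");
        System.out.println(columns); // prints "[position, reporter, intensity]"
    }
}
```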

getRequiredIndex

public int getRequiredIndex(List<String> values,
                            String value,
                            String header,
                            String section,
                            int line,
                            String filename)
Get the index of value in a list of values or throw an exception if the value is not found. The lookup is case-insensitive.

Parameters:
values - The list of values to look in
value - The value to look for
header - The name of the header the values are from
section - The name of the section that is being parsed
line - The current line number in the parsed file
filename - The name of the file that is being parsed (used to create the error message in case the value is not found)
Returns:
The index of the value
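
The case-insensitive lookup can be sketched as a linear scan using String.equalsIgnoreCase. RequiredIndexSketch is a hypothetical class, and IllegalArgumentException is an assumption, because the page does not name the exception type that is thrown.

```java
import java.util.Arrays;
import java.util.List;

// Sketch of a case-insensitive required index lookup (not the BASE implementation).
class RequiredIndexSketch {

    public static int getRequiredIndex(List<String> values, String value, String header,
            String section, int line, String filename) {
        for (int i = 0; i < values.size(); i++) {
            if (value.equalsIgnoreCase(values.get(i))) {
                return i;
            }
        }
        // Assumption: the real exception type is not documented on this page.
        throw new IllegalArgumentException("Value '" + value + "' not found in header '" + header
            + "' in section '" + section + "' at line " + line + " of file '" + filename + "'");
    }

    public static void main(String[] args) {
        List<String> columns = Arrays.asList("position", "Reporter", "intensity");
        int index = getRequiredIndex(columns, "reporter", "columns", "spots", 12, "example.base");
        System.out.println(index); // prints "1"
    }
}
```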
