Class SimpleStringDetector

  • All Implemented Interfaces:
    StringDetector

    public class SimpleStringDetector
    extends Object
    implements StringDetector
    A simple string detector implementation that works with two strings. It is designed to detect the encoding of tabular data files that use one of the ISO-8859-N encodings, or similar encodings that cannot be told apart by technical means alone. The data file is expected to contain a header line in which at least one column name contains non-ASCII characters.

    For each line of data, the detector first checks whether the 'ifFound' string can be found. If not, it returns false to request more data. If the 'ifFound' string is found, it goes on to check whether the 'thenMatch' string is also present. If so, true is returned to indicate a successful encoding match; otherwise an IOException is thrown to indicate an incorrect encoding.

    Note that the two strings need to be selected wisely. The 'ifFound' string should typically be an ASCII-only string and 'thenMatch' a string with one or more non-ASCII characters. For example, if the file header is: Namn{tab}Ålder, we could use 'ifFound=Namn' and 'thenMatch=Ålder'. If the entire file is parsed without finding the 'ifFound' string, the eof(int) method reports a failed match by throwing an IOException.
    Since:
    3.15
    Author:
    nicklas
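
    The two-string rule from the class description can be illustrated with a small, self-contained sketch. It is not the BASE implementation: the TwoStringRule class, its Outcome enum and the classify and tryCharset methods are hypothetical names introduced here, and the use of String.contains is an assumption about how "can be found" is evaluated. The sketch stores the example header as ISO-8859-1 bytes and decodes it twice, once with the right charset and once with a wrong guess (UTF-8, chosen only because every Java runtime is guaranteed to provide it).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of the two-string rule; not the BASE source code.
class TwoStringRule {

    // The three situations the class description distinguishes.
    enum Outcome { NEED_MORE_DATA, MATCH, MISMATCH }

    // Example strings from the description. \u00C5 is the letter A with ring
    // above, written as an escape so the source file encoding does not matter.
    static final String IF_FOUND = "Namn";
    static final String THEN_MATCH = "\u00C5lder";

    // Assumption: "found" means a plain substring match.
    static Outcome classify(String line, String ifFound, String thenMatch) {
        if (!line.contains(ifFound)) {
            return Outcome.NEED_MORE_DATA; // marker absent, look at the next line
        }
        return line.contains(thenMatch) ? Outcome.MATCH : Outcome.MISMATCH;
    }

    // Decode the raw header bytes with a candidate charset and classify the result.
    static Outcome tryCharset(byte[] rawHeader, Charset candidate) {
        String decoded = new String(rawHeader, candidate);
        return classify(decoded, IF_FOUND, THEN_MATCH);
    }

    public static void main(String[] args) {
        // The example header, stored as ISO-8859-1 bytes.
        byte[] raw = (IF_FOUND + "\t" + THEN_MATCH).getBytes(StandardCharsets.ISO_8859_1);

        System.out.println(tryCharset(raw, StandardCharsets.ISO_8859_1)); // MATCH
        System.out.println(tryCharset(raw, StandardCharsets.UTF_8));      // MISMATCH
        System.out.println(classify("# comment line", IF_FOUND, THEN_MATCH)); // NEED_MORE_DATA
    }
}
```

    With the wrong charset the ASCII-only 'ifFound' string still decodes correctly, while the non-ASCII character in 'thenMatch' does not, which is why the description recommends choosing the two strings that way.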
    • Field Detail

      • ifFound

        private final String ifFound
      • thenMatch

        private final String thenMatch
    • Constructor Detail

      • SimpleStringDetector

        public SimpleStringDetector(String ifFound,
                                    String thenMatch)
    • Method Detail

      • checkLine

        public boolean checkLine(int lineNo,
                                 String line)
                          throws IOException
        Description copied from interface: StringDetector
        Check the given line. The detector should return true if it can be certain that the file has been decoded correctly. If it can be sure that the file has been decoded incorrectly, it should throw an IOException. If the detector cannot be sure without more data, it should return false.
        Specified by:
        checkLine in interface StringDetector
        Throws:
        IOException
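
        The checkLine contract above (true when certain, false when undecided, IOException when certainly wrong) could be satisfied by a detector along the following lines. This is a minimal sketch under stated assumptions, not the BASE source: LineDetector is a hypothetical stand-in for the StringDetector interface, reduced to the two methods shown on this page, TwoStringLineDetector is a hypothetical class name, and the substring matching, the exception messages and the behaviour of eof are assumptions.

```java
import java.io.IOException;

// Hypothetical stand-in for the StringDetector interface documented here.
interface LineDetector {
    boolean checkLine(int lineNo, String line) throws IOException;
    void eof(int parsedLines) throws IOException;
}

// Hypothetical detector following the documented contract; not the BASE source.
class TwoStringLineDetector implements LineDetector {
    private final String ifFound;
    private final String thenMatch;

    TwoStringLineDetector(String ifFound, String thenMatch) {
        this.ifFound = ifFound;
        this.thenMatch = thenMatch;
    }

    @Override
    public boolean checkLine(int lineNo, String line) throws IOException {
        if (!line.contains(ifFound)) {
            return false; // undecided, the caller should supply more data
        }
        if (!line.contains(thenMatch)) {
            // Certain mismatch: the contract asks for an IOException.
            throw new IOException("Line " + lineNo + " contains '" + ifFound
                + "' but not '" + thenMatch + "'");
        }
        return true; // certain that the decoding is correct
    }

    @Override
    public void eof(int parsedLines) throws IOException {
        // Assumption: never seeing 'ifFound' counts as an incorrect decoding.
        throw new IOException("'" + ifFound + "' not found in "
            + parsedLines + " lines");
    }
}
```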
      • eof

        public void eof(int parsedLines)
                 throws IOException
        Description copied from interface: StringDetector
        This is called when the end of the file has been reached and the checkLine method has returned false for all lines. If this is considered to be an incorrect decoding condition, the detector should throw an IOException; otherwise it should simply return. Note that this method is not called if true is returned from the checkLine method.
        Specified by:
        eof in interface StringDetector
        Throws:
        IOException
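
        The calling sequence implied by the two method descriptions (checkLine for each line, eof only when no line returned true) can be sketched as a small driver loop. This is an illustration only, not the code BASE uses to drive a detector: EofAwareDetector is a hypothetical stand-in for the StringDetector interface, DetectorDriver and run are hypothetical names, and numbering lines from 1 is an assumption.

```java
import java.io.BufferedReader;
import java.io.IOException;

// Hypothetical stand-in for the StringDetector interface documented here.
interface EofAwareDetector {
    boolean checkLine(int lineNo, String line) throws IOException;
    void eof(int parsedLines) throws IOException;
}

// Hypothetical driver; shows one way a caller might honour the contract.
class DetectorDriver {

    // Feeds lines to the detector until it is certain or the input ends.
    // Returns true if some line confirmed the decoding, and false if the
    // end of input was reached and eof() returned without throwing.
    // An IOException from the detector propagates to the caller.
    static boolean run(BufferedReader reader, EofAwareDetector detector)
            throws IOException {
        int lineNo = 0;
        String line;
        while ((line = reader.readLine()) != null) {
            lineNo++; // assumption: line numbers start at 1
            if (detector.checkLine(lineNo, line)) {
                return true; // confirmed, so eof() is not called
            }
        }
        detector.eof(lineNo); // may throw to signal an incorrect decoding
        return false;
    }
}
```

        A caller trying several candidate charsets could run this loop once per charset and treat an IOException as "try the next one"; that strategy is a design assumption of this sketch rather than something stated on this page.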