Package net.sf.basedb.util.charset
Class SimpleStringDetector
java.lang.Object
net.sf.basedb.util.charset.SimpleStringDetector
- All Implemented Interfaces:
StringDetector
A simple string detector implementation that works with two strings.
It is designed to detect the encoding of tabular data files
that use one of the ISO-8859-N encodings, or similar encodings that cannot be
told apart technically.
The data file is expected to contain a header line where at least one header
column has a name with non-ASCII characters.
For each line of data, it first checks whether the 'ifFound' string
can be found. If not, it returns null to request more data.
If the 'ifFound' string is found, it then checks whether the 'thenMatch'
string is also present. If so, TRUE is returned to indicate a successful
encoding match; otherwise FALSE is returned to indicate an incorrect encoding.
Note that the two strings need to be chosen carefully. The 'ifFound' string should
typically be an ASCII-only string and 'thenMatch' a string with one or more non-ASCII
characters.
For example, if the file header is:
Namn{tab}Ålder
we could use 'ifFound=Namn' and 'thenMatch=Ålder'.
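The three-state check described above can be illustrated with a minimal, standalone sketch. It is not the BASE implementation: the class DetectorSketch and the helper check(...) are hypothetical names, and the choice of ISO-8859-2 as the "wrong" charset is an assumption made only to show a mismatch. The header from the example is encoded as ISO-8859-1 and then decoded twice, once correctly and once incorrectly.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class DetectorSketch {

    // Hypothetical helper (not part of BASE) that mirrors the described logic:
    // null = 'ifFound' not seen, more data is needed;
    // TRUE = both strings found; FALSE = 'ifFound' found but 'thenMatch' missing.
    static Boolean check(String line, String ifFound, String thenMatch) {
        if (!line.contains(ifFound)) {
            return null;
        }
        return line.contains(thenMatch);
    }

    public static void main(String[] args) {
        // "Namn{tab}Ålder", written with a Unicode escape so the sketch does
        // not depend on the encoding of this source file.
        String header = "Namn\t\u00C5lder";
        byte[] raw = header.getBytes(StandardCharsets.ISO_8859_1);

        // Decode the same bytes with the correct and with a different charset.
        String right = new String(raw, StandardCharsets.ISO_8859_1);
        String wrong = new String(raw, Charset.forName("ISO-8859-2"));

        System.out.println(check(right, "Namn", "\u00C5lder"));
        System.out.println(check(wrong, "Namn", "\u00C5lder"));
        System.out.println(check("1\t42", "Namn", "\u00C5lder"));
    }
}
```

The ASCII-only 'ifFound' string survives both decodings, so it reliably locates the header line, while the non-ASCII 'thenMatch' string only matches when the bytes were decoded with a charset that maps them to the expected characters.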
If the entire file is parsed without finding the 'ifFound' string, the eof(int) method
will return false.
- Since:
- 3.15
- Author:
- nicklas
Field Details
-
ifFound
-
thenMatch
-
-
Constructor Details
-
SimpleStringDetector
-
-
Method Details
-
checkLine
Description copied from interface: StringDetector
Check the given line. The detector should return TRUE if it can be certain that the file has been decoded correctly. If it can be sure that the file has been decoded incorrectly, it should throw an IOException. If the detector is not sure without more data, it should return false.
- Specified by:
checkLine
in interface StringDetector
- Throws:
IOException
-
eof
Description copied from interface: StringDetector
This is called when the end of file has been reached and the checkLine method has returned false for all lines. If this is considered to be an incorrect decoding condition, the detector should throw an IOException, otherwise it should simply return. Note that this method is not called if TRUE is returned from the checkLine method.
- Specified by:
eof
in interface StringDetector
- Throws:
IOException
-