Package com.univocity.parsers.csv
Class CsvFormatDetector
- java.lang.Object
-
- com.univocity.parsers.csv.CsvFormatDetector
-
- All Implemented Interfaces:
InputAnalysisProcess
public abstract class CsvFormatDetector extends java.lang.Object implements InputAnalysisProcess
An InputAnalysisProcess to detect column delimiters, quotes and quote escapes in a CSV input.
-
-
Field Summary
Modifier and Type    Field
private char[]       allowedDelimiters
private char         comment
private char[]       delimiterPreference
private int          MAX_ROW_SAMPLES
private char         normalizedNewLine
private char         suggestedDelimiter
private int          whitespaceRangeStart
-
Constructor Summary
CsvFormatDetector(int maxRowSamples, CsvParserSettings settings, int whitespaceRangeStart)
Builds a new CsvFormatDetector.
-
Method Summary
(package private) abstract void apply(char delimiter, char quote, char quoteEscape)
Applies the discovered CSV format elements to the CsvParser.
-
private java.util.Map<java.lang.Character,java.lang.Integer> calculateTotals(java.util.List<java.util.Map<java.lang.Character,java.lang.Integer>> symbolsPerRow)
-
void execute(char[] characters, int length)
A sequence of characters of the input buffer to be analyzed.
-
private char getChar(java.util.Map<java.lang.Character,java.lang.Integer> map, java.util.Map<java.lang.Character,java.lang.Integer> totals, char defaultChar, boolean min)
Returns the character with the highest or lowest associated number.
-
private static void increment(java.util.Map<java.lang.Character,java.lang.Integer> map, char symbol)
Increments the number associated with a character in a map by 1.
-
private static void increment(java.util.Map<java.lang.Character,java.lang.Integer> map, char symbol, int incrementSize)
Increments the number associated with a character in a map.
-
private boolean isAllowedDelimiter(char ch)
-
private boolean isSymbol(char ch)
-
private char max(java.util.Map<java.lang.Character,java.lang.Integer> map, java.util.Map<java.lang.Character,java.lang.Integer> totals, char defaultChar)
Returns the character with the highest associated number.
-
private char min(java.util.Map<java.lang.Character,java.lang.Integer> map, java.util.Map<java.lang.Character,java.lang.Integer> totals, char defaultChar)
Returns the character with the lowest associated number.
-
-
-
Field Detail
-
MAX_ROW_SAMPLES
private final int MAX_ROW_SAMPLES
-
comment
private final char comment
-
suggestedDelimiter
private final char suggestedDelimiter
-
normalizedNewLine
private final char normalizedNewLine
-
whitespaceRangeStart
private final int whitespaceRangeStart
-
allowedDelimiters
private char[] allowedDelimiters
-
delimiterPreference
private char[] delimiterPreference
-
-
Constructor Detail
-
CsvFormatDetector
CsvFormatDetector(int maxRowSamples, CsvParserSettings settings, int whitespaceRangeStart)
Builds a new CsvFormatDetector.
- Parameters:
maxRowSamples - the number of row samples to collect before analyzing the statistics
settings - the configuration provided by the user, with potential defaults to use in case the detection is unable to discover the proper column delimiter or quote character
whitespaceRangeStart - starting range of characters considered to be whitespace
-
-
Method Detail
-
calculateTotals
private java.util.Map<java.lang.Character,java.lang.Integer> calculateTotals(java.util.List<java.util.Map<java.lang.Character,java.lang.Integer>> symbolsPerRow)
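This method is undocumented. Judging only by its signature, one plausible reading is that it adds up the per-row symbol counts into a single map of totals. The sketch below illustrates that reading; the class name TotalsSketch and the method body are assumptions, not the library's implementation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: not the library's code.
final class TotalsSketch {

    // Sums each symbol's count over every sampled row.
    static Map<Character, Integer> calculateTotals(List<Map<Character, Integer>> symbolsPerRow) {
        Map<Character, Integer> totals = new HashMap<>();
        for (Map<Character, Integer> row : symbolsPerRow) {
            for (Map.Entry<Character, Integer> entry : row.entrySet()) {
                totals.merge(entry.getKey(), entry.getValue(), Integer::sum);
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        Map<Character, Integer> first = new HashMap<>();
        first.put(',', 2);
        first.put(';', 1);
        Map<Character, Integer> second = new HashMap<>();
        second.put(',', 3);

        List<Map<Character, Integer>> rows = new ArrayList<>();
        rows.add(first);
        rows.add(second);

        Map<Character, Integer> totals = calculateTotals(rows);
        System.out.println(totals.get(',')); // 5
        System.out.println(totals.get(';')); // 1
    }
}
```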
-
execute
public void execute(char[] characters, int length)
Description copied from interface: InputAnalysisProcess
A sequence of characters of the input buffer to be analyzed.
- Specified by:
execute in interface InputAnalysisProcess
- Parameters:
characters - the input buffer
length - the last character position loaded into the buffer
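The page does not describe how the buffer is analyzed. As a rough illustration of the sampling idea suggested by maxRowSamples and symbolsPerRow, the sketch below counts symbol characters per row in the first length characters of a buffer. Everything in it is an assumption: the class name RowSamplingSketch, the definition of a symbol (anything that is not a letter, digit or whitespace), and the decision to ignore quoted values and comments.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: not the library's code.
final class RowSamplingSketch {

    // Assumption: a symbol is any character that is not a letter, digit or whitespace.
    static boolean isSymbol(char ch) {
        return !Character.isLetterOrDigit(ch) && !Character.isWhitespace(ch);
    }

    // Collects one symbol-count map per row, stopping after maxRowSamples rows.
    static List<Map<Character, Integer>> sample(char[] characters, int length, int maxRowSamples) {
        List<Map<Character, Integer>> rows = new ArrayList<>();
        Map<Character, Integer> current = new HashMap<>();
        for (int i = 0; i < length && rows.size() < maxRowSamples; i++) {
            char ch = characters[i];
            if (ch == '\n') {
                rows.add(current);
                current = new HashMap<>();
            } else if (isSymbol(ch)) {
                current.merge(ch, 1, Integer::sum);
            }
        }
        // Keep a trailing row that was not terminated by a newline.
        if (!current.isEmpty() && rows.size() < maxRowSamples) {
            rows.add(current);
        }
        return rows;
    }

    public static void main(String[] args) {
        char[] buffer = "a,b,c\n1;2,3\nx,y".toCharArray();
        List<Map<Character, Integer>> rows = sample(buffer, buffer.length, 2);
        System.out.println(rows.size());          // 2
        System.out.println(rows.get(0).get(',')); // 2
        System.out.println(rows.get(1).get(';')); // 1
    }
}
```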
-
increment
private static void increment(java.util.Map<java.lang.Character,java.lang.Integer> map, char symbol)
Increments the number associated with a character in a map by 1.
- Parameters:
map - the map of characters and their numbers
symbol - the character whose number should be incremented
-
increment
private static void increment(java.util.Map<java.lang.Character,java.lang.Integer> map, char symbol, int incrementSize)
Increments the number associated with a character in a map.
- Parameters:
map - the map of characters and their numbers
symbol - the character whose number should be incremented
incrementSize - the size of the increment
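A minimal sketch of the two increment overloads described above, assuming that a character missing from the map starts counting from zero. The class name IncrementSketch and the use of Map.merge are illustrative choices; the actual private methods may be written differently.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: not the library's code.
final class IncrementSketch {

    // Increments the number associated with the symbol by 1.
    static void increment(Map<Character, Integer> map, char symbol) {
        increment(map, symbol, 1);
    }

    // Increments the number associated with the symbol by incrementSize.
    // Assumption: a symbol not yet in the map is treated as having a count of 0.
    static void increment(Map<Character, Integer> map, char symbol, int incrementSize) {
        map.merge(symbol, incrementSize, Integer::sum);
    }

    public static void main(String[] args) {
        Map<Character, Integer> counts = new HashMap<>();
        increment(counts, ',');
        increment(counts, ',');
        increment(counts, ';', 3);
        System.out.println(counts.get(',')); // 2
        System.out.println(counts.get(';')); // 3
    }
}
```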
-
min
private char min(java.util.Map<java.lang.Character,java.lang.Integer> map, java.util.Map<java.lang.Character,java.lang.Integer> totals, char defaultChar)
Returns the character with the lowest associated number.
- Parameters:
map - the map of characters and their numbers
defaultChar - the default character to return in case the map is empty
- Returns:
the character with the lowest number associated.
-
max
private char max(java.util.Map<java.lang.Character,java.lang.Integer> map, java.util.Map<java.lang.Character,java.lang.Integer> totals, char defaultChar)
Returns the character with the highest associated number.
- Parameters:
map - the map of characters and their numbers
defaultChar - the default character to return in case the map is empty
- Returns:
the character with the highest number associated.
-
getChar
private char getChar(java.util.Map<java.lang.Character,java.lang.Integer> map, java.util.Map<java.lang.Character,java.lang.Integer> totals, char defaultChar, boolean min)
Returns the character with the highest or lowest associated number.
- Parameters:
map - the map of characters and their numbers
defaultChar - the default character to return in case the map is empty
min - a flag indicating whether to return the character associated with the lowest number in the map. If false then the character associated with the highest number found will be returned.
- Returns:
the character with the highest/lowest number associated.
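The documented behavior of min, max and getChar can be sketched as a single selection loop with a flag. The totals parameter is not documented on this page, so the sketch leaves it out; the class name MinMaxSketch is hypothetical, and when two characters share the same count the result depends on map iteration order.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: not the library's code. The undocumented totals parameter is omitted.
final class MinMaxSketch {

    // Returns the character with the lowest (min == true) or highest (min == false) count,
    // or defaultChar when the map is empty.
    static char getChar(Map<Character, Integer> map, char defaultChar, boolean min) {
        char best = defaultChar;
        Integer bestCount = null;
        for (Map.Entry<Character, Integer> entry : map.entrySet()) {
            int count = entry.getValue();
            boolean better = bestCount == null || (min ? count < bestCount : count > bestCount);
            if (better) {
                best = entry.getKey();
                bestCount = count;
            }
        }
        return best;
    }

    static char min(Map<Character, Integer> map, char defaultChar) {
        return getChar(map, defaultChar, true);
    }

    static char max(Map<Character, Integer> map, char defaultChar) {
        return getChar(map, defaultChar, false);
    }

    public static void main(String[] args) {
        Map<Character, Integer> counts = new HashMap<>();
        counts.put(',', 10);
        counts.put(';', 2);
        counts.put('|', 5);
        System.out.println(max(counts, '?'));           // ,
        System.out.println(min(counts, '?'));           // ;
        System.out.println(max(new HashMap<>(), '?'));  // ?
    }
}
```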
-
isSymbol
private boolean isSymbol(char ch)
-
isAllowedDelimiter
private boolean isAllowedDelimiter(char ch)
-
apply
abstract void apply(char delimiter, char quote, char quoteEscape)
Applies the discovered CSV format elements to the CsvParser.
- Parameters:
delimiter - the discovered delimiter character
quote - the discovered quote character
quoteEscape - the discovered quote escape character
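Because apply is abstract, a caller receives the detection result by subclassing and overriding it. The sketch below shows that callback pattern with stand-in types, since CsvFormatDetector itself is package-private in its constructor and apply method. The names ApplySketch, Detector and Format are hypothetical, and the detection logic is a deliberately trivial placeholder (it compares comma and semicolon counts and always reports a double quote).

```java
// Hypothetical sketch of the callback pattern: not the library's code.
final class ApplySketch {

    // Stand-in for a detector that reports its findings through an abstract apply method.
    abstract static class Detector {
        void execute(char[] characters, int length) {
            int commas = 0;
            int semicolons = 0;
            for (int i = 0; i < length; i++) {
                if (characters[i] == ',') {
                    commas++;
                } else if (characters[i] == ';') {
                    semicolons++;
                }
            }
            // Placeholder decision: prefer ';' only when it is strictly more frequent.
            char delimiter = semicolons > commas ? ';' : ',';
            apply(delimiter, '"', '"');
        }

        abstract void apply(char delimiter, char quote, char quoteEscape);
    }

    // Stand-in for the format object that a parser would update.
    static final class Format {
        char delimiter;
        char quote;
        char quoteEscape;
    }

    static Format detect(char[] buffer, int length) {
        final Format format = new Format();
        new Detector() {
            @Override
            void apply(char delimiter, char quote, char quoteEscape) {
                format.delimiter = delimiter;
                format.quote = quote;
                format.quoteEscape = quoteEscape;
            }
        }.execute(buffer, length);
        return format;
    }

    public static void main(String[] args) {
        char[] buffer = "a;b;c\n1;2;3".toCharArray();
        Format format = detect(buffer, buffer.length);
        System.out.println(format.delimiter); // ;
    }
}
```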
-
-