Class UTF8DataInputJsonParser

  • All Implemented Interfaces:
    Versioned, java.io.Closeable, java.lang.AutoCloseable

    public class UTF8DataInputJsonParser
    extends ParserBase
    This is a concrete implementation of JsonParser, which is based on a DataInput as the input source.

    Because look-ahead is essentially unavailable, and because content is read mostly byte-by-byte, there are some minor differences from regular streaming parsing. Specifically:

    • Input location offsets are not tracked, because they would have to be updated at every read site throughout the parser. A caller that needs this information has to track it on the DataInput side. This also affects the column number, so the only location information available is the row (line) number, and even that is approximate for two-byte linefeeds; it should work with a single CR or LF.
    • No white space validation: checks are simplified NOT to check for control characters.
    Since:
    2.8
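
    Since offsets are left to the caller, one way to recover them is to count bytes underneath the DataInput. The following is a minimal sketch, not part of Jackson: CountingInputStream and bytesConsumed() are hypothetical names, and the two readUnsignedByte() calls in main merely stand in for a parser consuming input.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: counts the bytes pulled through it by any reader. */
class CountingInputStream extends FilterInputStream {
    private long count;

    CountingInputStream(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b >= 0) {
            count++; // one byte delivered
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0) {
            count += n; // n bytes delivered
        }
        return n;
    }

    @Override
    public long skip(long n) throws IOException {
        long skipped = super.skip(n);
        count += skipped;
        return skipped;
    }

    @Override
    public boolean markSupported() {
        return false; // mark/reset would invalidate the count
    }

    /** Number of bytes consumed from the underlying stream so far. */
    long bytesConsumed() {
        return count;
    }

    public static void main(String[] args) throws IOException {
        byte[] json = "{\"a\":1}".getBytes(StandardCharsets.UTF_8);
        CountingInputStream counting = new CountingInputStream(new ByteArrayInputStream(json));
        DataInput in = new DataInputStream(counting);
        in.readUnsignedByte(); // stand-in for the parser reading a byte
        in.readUnsignedByte();
        System.out.println("bytes consumed: " + counting.bytesConsumed());
    }
}
```

    The count reflects bytes read from the stream, which may run slightly ahead of the parser's logical position whenever the parser is holding a byte it has read but not yet processed.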
    • Field Detail

      • FEAT_MASK_TRAILING_COMMA

        private static final int FEAT_MASK_TRAILING_COMMA
      • FEAT_MASK_LEADING_ZEROS

        private static final int FEAT_MASK_LEADING_ZEROS
      • FEAT_MASK_NON_NUM_NUMBERS

        private static final int FEAT_MASK_NON_NUM_NUMBERS
      • FEAT_MASK_ALLOW_MISSING

        private static final int FEAT_MASK_ALLOW_MISSING
      • FEAT_MASK_ALLOW_SINGLE_QUOTES

        private static final int FEAT_MASK_ALLOW_SINGLE_QUOTES
      • FEAT_MASK_ALLOW_UNQUOTED_NAMES

        private static final int FEAT_MASK_ALLOW_UNQUOTED_NAMES
      • FEAT_MASK_ALLOW_JAVA_COMMENTS

        private static final int FEAT_MASK_ALLOW_JAVA_COMMENTS
      • FEAT_MASK_ALLOW_YAML_COMMENTS

        private static final int FEAT_MASK_ALLOW_YAML_COMMENTS
      • _icUTF8

        private static final int[] _icUTF8
      • _icLatin1

        protected static final int[] _icLatin1
      • _objectCodec

        protected ObjectCodec _objectCodec
        Codec used for data binding when (if) requested; typically a full ObjectMapper, but that abstraction is not part of the core package.
      • _symbols

        protected final ByteQuadsCanonicalizer _symbols
        Symbol table that contains field names encountered so far
      • _quadBuffer

        protected int[] _quadBuffer
        Temporary buffer used for name parsing.
      • _tokenIncomplete

        protected boolean _tokenIncomplete
        Flag that indicates that the current token has not yet been fully processed, and needs to be finished for some access (or skipped to obtain the next token)
      • _quad1

        private int _quad1
        Temporary storage for partially parsed name bytes.
      • _inputData

        protected java.io.DataInput _inputData
      • _nextByte

        protected int _nextByte
        Sometimes we need buffering for just a single byte that we have read but have to "push back".
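
        The single-byte push-back idea can be sketched independently of the parser. This is an illustration under assumptions, not the class's actual code: OneBytePushback, read() and pushBack() are hypothetical names, and -1 is assumed as the "empty slot" marker.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.IOException;

/** Hypothetical reader with a one-byte push-back slot in front of a DataInput. */
class OneBytePushback {
    private final DataInput in;
    private int nextByte = -1; // -1 means nothing is pushed back (assumed convention)

    OneBytePushback(DataInput in) {
        this.in = in;
    }

    /** Returns the pushed-back byte if present, otherwise reads a fresh one. */
    int read() throws IOException {
        int b = nextByte;
        if (b >= 0) {
            nextByte = -1;
            return b;
        }
        return in.readUnsignedByte();
    }

    /** Stores a byte so that the next read() returns it again. */
    void pushBack(int b) {
        if (nextByte >= 0) {
            throw new IllegalStateException("only one byte can be pushed back");
        }
        nextByte = b & 0xFF;
    }

    public static void main(String[] args) throws IOException {
        DataInput in = new DataInputStream(new ByteArrayInputStream(new byte[] {'1', ','}));
        OneBytePushback reader = new OneBytePushback(in);
        reader.read();               // the digit
        int terminator = reader.read();
        reader.pushBack(terminator); // the number ended; keep the comma for the next token
        System.out.println((char) reader.read());
    }
}
```

        A number token is the typical case: its end is only known once the byte after it has been read, and that byte still belongs to the next token.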
    • Method Detail

      • releaseBuffered

        public int releaseBuffered​(java.io.OutputStream out)
                            throws java.io.IOException
        Description copied from class: JsonParser
        Method that can be called to push back any content that has been read but not consumed by the parser. This is usually done after reading all content of interest using the parser. Content is released by writing it to the given stream if possible; if the underlying input is byte-based it can be released, if not (char-based) it cannot.
        Overrides:
        releaseBuffered in class JsonParser
        Returns:
        -1 if the underlying content source is not byte-based (that is, input cannot be sent to OutputStream); otherwise the number of bytes released (0 if there was nothing to release)
        Throws:
        java.io.IOException - if write to stream threw exception
      • getInputSource

        public java.lang.Object getInputSource()
        Description copied from class: JsonParser
        Method that can be used to get access to the object that is used to access the input being parsed; this is usually either InputStream or Reader, depending on what the parser was constructed with. Note that the returned value may be null in some cases, including the case where the parser implementation does not want to expose the raw source to the caller. In cases where input has been decorated, the object returned here is the decorated version; this allows some level of interaction between users of the parser and the decorator object.

        In general use of this accessor should be considered as "last effort", i.e. only used if no other mechanism is applicable.

        Overrides:
        getInputSource in class JsonParser
      • _closeInput

        protected void _closeInput()
                            throws java.io.IOException
        Specified by:
        _closeInput in class ParserBase
        Throws:
        java.io.IOException
      • _releaseBuffers

        protected void _releaseBuffers()
                                throws java.io.IOException
        Method called to release internal buffers owned by the base reader. This may be called along with _closeInput() (for example, when explicitly closing this reader instance), or separately (if need be).
        Overrides:
        _releaseBuffers in class ParserBase
        Throws:
        java.io.IOException
      • getText

        public java.lang.String getText()
                                 throws java.io.IOException
        Description copied from class: JsonParser
        Method for accessing textual representation of the current token; if no current token (before first call to JsonParser.nextToken(), or after encountering end-of-input), returns null. Method can be called for any token type.
        Specified by:
        getText in class ParserMinimalBase
        Throws:
        java.io.IOException
      • getText

        public int getText​(java.io.Writer writer)
                    throws java.io.IOException
        Description copied from class: JsonParser
        Method to read the textual representation of the current token in chunks and pass it to the given Writer. Conceptually same as calling:
          writer.write(parser.getText());
        
        but should typically be more efficient, as longer content does not need to be combined into a single String to return, and the write can occur directly from intermediate buffers Jackson uses.
        Overrides:
        getText in class JsonParser
        Returns:
        The number of characters written to the Writer
        Throws:
        java.io.IOException
      • getValueAsString

        public java.lang.String getValueAsString()
                                          throws java.io.IOException
        Description copied from class: JsonParser
        Method that will try to convert value of current token to a String. JSON Strings map naturally; scalar values get converted to their textual representation. If representation can not be converted to a String value (including structured types like Objects and Arrays and null token), default value of null will be returned; no exceptions are thrown.
        Overrides:
        getValueAsString in class ParserMinimalBase
        Throws:
        java.io.IOException
      • getValueAsString

        public java.lang.String getValueAsString​(java.lang.String defValue)
                                          throws java.io.IOException
        Description copied from class: JsonParser
        Method that will try to convert value of current token to a String. JSON Strings map naturally; scalar values get converted to their textual representation. If representation can not be converted to a String value (including structured types like Objects and Arrays and null token), specified default value will be returned; no exceptions are thrown.
        Overrides:
        getValueAsString in class ParserMinimalBase
        Throws:
        java.io.IOException
      • getValueAsInt

        public int getValueAsInt()
                          throws java.io.IOException
        Description copied from class: JsonParser
        Method that will try to convert value of current token to an int. Numbers are coerced using default Java rules; booleans convert to 0 (false) and 1 (true), and Strings are parsed using default Java language integer parsing rules.

        If representation can not be converted to an int (including structured type markers like start/end Object/Array) default value of 0 will be returned; no exceptions are thrown.

        Overrides:
        getValueAsInt in class ParserMinimalBase
        Throws:
        java.io.IOException
      • getValueAsInt

        public int getValueAsInt​(int defValue)
                          throws java.io.IOException
        Description copied from class: JsonParser
        Method that will try to convert value of current token to an int. Numbers are coerced using default Java rules; booleans convert to 0 (false) and 1 (true), and Strings are parsed using default Java language integer parsing rules.

        If representation can not be converted to an int (including structured type markers like start/end Object/Array), the specified defValue will be returned; no exceptions are thrown.

        Overrides:
        getValueAsInt in class ParserMinimalBase
        Throws:
        java.io.IOException
      • _getText2

        protected final java.lang.String _getText2​(JsonToken t)
      • getTextCharacters

        public char[] getTextCharacters()
                                 throws java.io.IOException
        Description copied from class: JsonParser
        Method similar to JsonParser.getText(), but that will return underlying (unmodifiable) character array that contains textual value, instead of constructing a String object to contain this information. Note, however, that:
        • Textual contents are not guaranteed to start at index 0 (rather, call JsonParser.getTextOffset() to know the actual offset)
        • Length of textual contents may be less than the length of returned buffer: call JsonParser.getTextLength() for actual length of returned content.

        Note that caller MUST NOT modify the returned character array in any way -- doing so may corrupt current parser state and render parser instance useless.

        The only reason to call this method (over JsonParser.getText()) is to avoid construction of a String object (which will make a copy of contents).

        Specified by:
        getTextCharacters in class ParserMinimalBase
        Throws:
        java.io.IOException
      • getBinaryValue

        public byte[] getBinaryValue​(Base64Variant b64variant)
                              throws java.io.IOException
        Description copied from class: JsonParser
        Method that can be used to read (and consume -- results may not be accessible using other methods after the call) base64-encoded binary data included in the current textual JSON value. It works similarly to getting the String value via JsonParser.getText() and decoding the result (except for the decoding part), but should be significantly more performant.

        Note that non-decoded textual contents of the current token are not guaranteed to be accessible after this method is called. Current implementation, for example, clears up textual content during decoding. Decoded binary content, however, will be retained until parser is advanced to the next event.

        Overrides:
        getBinaryValue in class ParserBase
        Parameters:
        b64variant - Expected variant of base64 encoded content (see Base64Variants for definitions of "standard" variants).
        Returns:
        Decoded binary data
        Throws:
        java.io.IOException
      • readBinaryValue

        public int readBinaryValue​(Base64Variant b64variant,
                                   java.io.OutputStream out)
                            throws java.io.IOException
        Description copied from class: JsonParser
        Similar to JsonParser.readBinaryValue(OutputStream) but allows explicitly specifying base64 variant to use.
        Overrides:
        readBinaryValue in class JsonParser
        Parameters:
        b64variant - base64 variant to use
        out - Output stream to use for passing decoded binary data
        Returns:
        Number of bytes that were decoded and written via OutputStream
        Throws:
        java.io.IOException
      • _readBinary

        protected int _readBinary​(Base64Variant b64variant,
                                  java.io.OutputStream out,
                                  byte[] buffer)
                           throws java.io.IOException
        Throws:
        java.io.IOException
      • nextToken

        public JsonToken nextToken()
                            throws java.io.IOException
        Description copied from class: JsonParser
        Main iteration method, which will advance stream enough to determine type of the next token, if any. If none remaining (stream has no content other than possible white space before ending), null will be returned.
        Specified by:
        nextToken in class ParserMinimalBase
        Returns:
        Next token from the stream, if any found, or null to indicate end-of-input
        Throws:
        java.io.IOException
      • _nextTokenNotInObject

        private final JsonToken _nextTokenNotInObject​(int i)
                                               throws java.io.IOException
        Throws:
        java.io.IOException
      • _nextAfterName

        private final JsonToken _nextAfterName()
      • finishToken

        public void finishToken()
                         throws java.io.IOException
        Description copied from class: JsonParser
        Method that may be used to force full handling of the current token so that even if lazy processing is enabled, the whole contents are read for possible retrieval. This is usually used to ensure that the token end location is available, as well as token contents (similar to what calling, say JsonParser.getTextCharacters(), would achieve).

        Note that for many dataformat implementations this method will not do anything; this is the default implementation unless overridden by sub-classes.

        Overrides:
        finishToken in class JsonParser
        Throws:
        java.io.IOException
      • nextTextValue

        public java.lang.String nextTextValue()
                                       throws java.io.IOException
        Description copied from class: JsonParser
        Method that fetches next token (as if calling JsonParser.nextToken()) and if it is JsonToken.VALUE_STRING returns contained String value; otherwise returns null. It is functionally equivalent to:
          return (nextToken() == JsonToken.VALUE_STRING) ? getText() : null;
        
        but may be faster for parser to process, and can therefore be used if caller expects to get a String value next from input.
        Overrides:
        nextTextValue in class JsonParser
        Throws:
        java.io.IOException
      • nextIntValue

        public int nextIntValue​(int defaultValue)
                         throws java.io.IOException
        Description copied from class: JsonParser
        Method that fetches next token (as if calling JsonParser.nextToken()) and if it is JsonToken.VALUE_NUMBER_INT returns 32-bit int value; otherwise returns specified default value. It is functionally equivalent to:
          return (nextToken() == JsonToken.VALUE_NUMBER_INT) ? getIntValue() : defaultValue;
        
        but may be faster for parser to process, and can therefore be used if caller expects to get an int value next from input.
        Overrides:
        nextIntValue in class JsonParser
        Throws:
        java.io.IOException
      • nextLongValue

        public long nextLongValue​(long defaultValue)
                           throws java.io.IOException
        Description copied from class: JsonParser
        Method that fetches next token (as if calling JsonParser.nextToken()) and if it is JsonToken.VALUE_NUMBER_INT returns 64-bit long value; otherwise returns specified default value. It is functionally equivalent to:
          return (nextToken() == JsonToken.VALUE_NUMBER_INT) ? getLongValue() : defaultValue;
        
        but may be faster for parser to process, and can therefore be used if caller expects to get a long value next from input.
        Overrides:
        nextLongValue in class JsonParser
        Throws:
        java.io.IOException
      • nextBooleanValue

        public java.lang.Boolean nextBooleanValue()
                                           throws java.io.IOException
        Description copied from class: JsonParser
        Method that fetches next token (as if calling JsonParser.nextToken()) and if it is JsonToken.VALUE_TRUE or JsonToken.VALUE_FALSE returns matching Boolean value; otherwise returns null. It is functionally equivalent to:
          JsonToken t = nextToken();
          if (t == JsonToken.VALUE_TRUE) return Boolean.TRUE;
          if (t == JsonToken.VALUE_FALSE) return Boolean.FALSE;
          return null;
        
        but may be faster for parser to process, and can therefore be used if caller expects to get a Boolean value next from input.
        Overrides:
        nextBooleanValue in class JsonParser
        Throws:
        java.io.IOException
      • _parseFloatThatStartsWithPeriod

        protected final JsonToken _parseFloatThatStartsWithPeriod()
                                                           throws java.io.IOException
        Throws:
        java.io.IOException
      • _parsePosNumber

        protected JsonToken _parsePosNumber​(int c)
                                     throws java.io.IOException
        Initial parsing method for number values. It needs to be able to parse enough input to be able to determine whether the value is to be considered a simple integer value, or a more generic decimal value: latter of which needs to be expressed as a floating point number. The basic rule is that if the number has no fractional or exponential part, it is an integer; otherwise a floating point number.

        Because much of input has to be processed in any case, no partial parsing is done: all input text will be stored for further processing. However, actual numeric value conversion will be deferred, since it is usually the most complicated and costliest part of processing.

        Throws:
        java.io.IOException
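
        The integer-versus-floating-point rule and the deferred conversion described for _parsePosNumber can be illustrated with a small stand-alone sketch. NumberShape, Kind and LazyNumber are hypothetical names; the sketch assumes the text is already a syntactically valid JSON number and is not the parser's own logic.

```java
/** Hypothetical illustration: classify number text first, convert only on demand. */
class NumberShape {
    enum Kind { INTEGER, FLOAT }

    /** A fraction or exponent marker makes the value floating point. */
    static Kind classify(String text) {
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '.' || c == 'e' || c == 'E') {
                return Kind.FLOAT;
            }
        }
        return Kind.INTEGER;
    }

    /** Keeps the raw text and performs the costly conversion at most once. */
    static class LazyNumber {
        private final String text;
        private Number value; // stays null until value() is first called

        LazyNumber(String text) {
            this.text = text;
        }

        Kind kind() {
            return classify(text);
        }

        Number value() {
            if (value == null) {
                if (kind() == Kind.INTEGER) {
                    value = Long.valueOf(text);
                } else {
                    value = Double.valueOf(text);
                }
            }
            return value;
        }
    }

    public static void main(String[] args) {
        LazyNumber n = new LazyNumber("1e3");
        System.out.println(n.kind());  // classification needs no numeric conversion
        System.out.println(n.value()); // conversion happens here
    }
}
```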
      • _parseNegNumber

        protected JsonToken _parseNegNumber()
                                     throws java.io.IOException
        Throws:
        java.io.IOException
      • _handleLeadingZeroes

        private final int _handleLeadingZeroes()
                                        throws java.io.IOException
        Method called when we have seen one zero, and want to ensure it is not followed by another or, if leading zeroes are allowed, to skip the redundant ones.
        Returns:
        Character immediately following zeroes
        Throws:
        java.io.IOException
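
        The behavior described for _handleLeadingZeroes can be sketched against a plain DataInput. This is a sketch under assumptions: LeadingZeroes and the allowLeadingZeroes flag are hypothetical stand-ins for the parser's feature check, and a generic IOException stands in for its error reporting.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

/** Hypothetical illustration of leading-zero handling over a DataInput. */
class LeadingZeroes {
    /**
     * Call after one '0' has already been consumed.
     * Returns the first byte that follows the run of zeroes.
     */
    static int handleLeadingZeroes(DataInput in, boolean allowLeadingZeroes) throws IOException {
        int ch = in.readUnsignedByte();
        if (ch < '0' || ch > '9') {
            return ch; // a lone zero, as in "0," or "0.5"
        }
        if (!allowLeadingZeroes) {
            throw new IOException("Leading zeroes not allowed");
        }
        while (ch == '0') { // skip the redundant zeroes
            ch = in.readUnsignedByte();
        }
        return ch;
    }

    public static void main(String[] args) throws IOException {
        // The first '0' of "0012," is assumed to be consumed already.
        byte[] rest = "012,".getBytes(StandardCharsets.US_ASCII);
        DataInput in = new DataInputStream(new ByteArrayInputStream(rest));
        System.out.println((char) handleLeadingZeroes(in, true));
    }
}
```

        Because a DataInput has no end-of-input signal other than EOFException, a zero that is the very last byte of the input surfaces as that exception in this sketch.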
      • _parseFloat

        private final JsonToken _parseFloat​(char[] outBuf,
                                            int outPtr,
                                            int c,
                                            boolean negative,
                                            int integerPartLength)
                                     throws java.io.IOException
        Throws:
        java.io.IOException
      • _verifyRootSpace

        private final void _verifyRootSpace()
                                     throws java.io.IOException
        Method called to ensure that a root-value is followed by a space token, if possible.

        NOTE: with DataInput source, not really feasible, up-front. If we did want, we could rearrange things to require space before next read, but initially let's just do nothing.

        Throws:
        java.io.IOException
      • _parseName

        protected final java.lang.String _parseName​(int i)
                                             throws java.io.IOException
        Throws:
        java.io.IOException
      • _parseMediumName

        private final java.lang.String _parseMediumName​(int q2)
                                                 throws java.io.IOException
        Throws:
        java.io.IOException
      • _parseMediumName2

        private final java.lang.String _parseMediumName2​(int q3,
                                                         int q2)
                                                  throws java.io.IOException
        Throws:
        java.io.IOException
      • _parseLongName

        private final java.lang.String _parseLongName​(int q,
                                                      int q2,
                                                      int q3)
                                               throws java.io.IOException
        Throws:
        java.io.IOException
      • parseName

        private final java.lang.String parseName​(int q1,
                                                 int ch,
                                                 int lastQuadBytes)
                                          throws java.io.IOException
        Throws:
        java.io.IOException
      • parseName

        private final java.lang.String parseName​(int q1,
                                                 int q2,
                                                 int ch,
                                                 int lastQuadBytes)
                                          throws java.io.IOException
        Throws:
        java.io.IOException
      • parseName

        private final java.lang.String parseName​(int q1,
                                                 int q2,
                                                 int q3,
                                                 int ch,
                                                 int lastQuadBytes)
                                          throws java.io.IOException
        Throws:
        java.io.IOException
      • parseEscapedName

        protected final java.lang.String parseEscapedName​(int[] quads,
                                                          int qlen,
                                                          int currQuad,
                                                          int ch,
                                                          int currQuadBytes)
                                                   throws java.io.IOException
        Slower parsing method that is generally branched to when an escape sequence is detected (or alternatively for long names, one crossing an input buffer boundary). It needs to handle more exceptional cases, is therefore slower, and hence is moved out into a separate method.
        Throws:
        java.io.IOException
      • _handleOddName

        protected java.lang.String _handleOddName​(int ch)
                                           throws java.io.IOException
        Method called when we see a non-white space character other than a double quote when expecting a field name. In standard mode it will just throw an exception, but in non-standard modes it may be able to parse the name.
        Throws:
        java.io.IOException
      • _parseAposName

        protected java.lang.String _parseAposName()
                                           throws java.io.IOException
        Throws:
        java.io.IOException
      • addName

        private final java.lang.String addName​(int[] quads,
                                               int qlen,
                                               int lastQuadBytes)
                                        throws JsonParseException
        This is the main workhorse method used when we take a symbol table miss. It needs to demultiplex individual bytes, decode multi-byte chars (if any), and then construct Name instance and add it to the symbol table.
        Throws:
        JsonParseException
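
        To make "demultiplex individual bytes" concrete, here is a simplified sketch of packing name bytes into int quads and unpacking them again. QuadNames and its methods are hypothetical names. The layout assumed here (earlier bytes in higher-order positions, a partial last quad right-aligned) is an assumption for illustration, and UTF-8 decoding is delegated to the String constructor rather than done by hand.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical illustration of packing name bytes into int "quads" and back. */
class QuadNames {
    /** Number of bytes held by the last quad (1 to 4), or 0 for an empty name. */
    static int lastQuadBytes(int byteLength) {
        return (byteLength == 0) ? 0 : ((byteLength - 1) & 3) + 1;
    }

    /** Packs four bytes per int; a partial last quad ends up right-aligned. */
    static int[] pack(byte[] utf8) {
        int[] quads = new int[(utf8.length + 3) / 4];
        for (int i = 0; i < utf8.length; i++) {
            quads[i >> 2] = (quads[i >> 2] << 8) | (utf8[i] & 0xFF);
        }
        return quads;
    }

    /** Demultiplexes the bytes out of the quads and decodes them as UTF-8. */
    static String unpack(int[] quads, int qlen, int lastQuadBytes) {
        if (qlen == 0) {
            return "";
        }
        int byteLen = (qlen << 2) - 4 + lastQuadBytes;
        byte[] out = new byte[byteLen];
        for (int i = 0; i < byteLen; i++) {
            int q = quads[i >> 2];
            // Every quad holds 4 bytes except possibly the last one.
            int held = ((i >> 2) == qlen - 1) ? lastQuadBytes : 4;
            int shift = (held - 1 - (i & 3)) << 3;
            out[i] = (byte) (q >> shift);
        }
        return new String(out, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] name = "abcde".getBytes(StandardCharsets.UTF_8);
        int[] quads = pack(name);
        System.out.println(Integer.toHexString(quads[0]) + " " + Integer.toHexString(quads[1]));
        System.out.println(unpack(quads, quads.length, lastQuadBytes(name.length)));
    }
}
```

        Comparing a few ints is cheaper than comparing decoded characters, which is presumably why full decoding is reserved for the symbol table miss path described above.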
      • _finishString

        protected void _finishString()
                              throws java.io.IOException
        Overrides:
        _finishString in class ParserBase
        Throws:
        java.io.IOException
      • _finishAndReturnString

        private java.lang.String _finishAndReturnString()
                                                 throws java.io.IOException
        Throws:
        java.io.IOException
      • _finishString2

        private final void _finishString2​(char[] outBuf,
                                          int outPtr,
                                          int c)
                                   throws java.io.IOException
        Throws:
        java.io.IOException
      • _skipString

        protected void _skipString()
                            throws java.io.IOException
        Method called to skim through the rest of an unparsed String value, if it is not needed. This can be done a bit faster if contents need not be stored for future access.
        Throws:
        java.io.IOException
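
        Skipping can be faster than parsing because nothing needs to be decoded or stored. The sketch below shows the idea on raw UTF-8 bytes; StringSkipper is a hypothetical name, and unlike a real parser this version performs no validation of UTF-8 sequences or escape contents.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

/** Hypothetical illustration: skip the rest of a JSON string without storing it. */
class StringSkipper {
    /** Call after the opening quote was consumed; returns after the closing quote. */
    static void skipString(DataInput in) throws IOException {
        while (true) {
            int c = in.readUnsignedByte();
            if (c == '"') {
                return; // unescaped closing quote
            }
            if (c == '\\') {
                // Skip the escaped character so that \" does not end the string.
                // The hex digits of \\uXXXX are plain ASCII and pass through the loop.
                in.readUnsignedByte();
            }
            // Bytes of multi-byte UTF-8 sequences are all >= 0x80, so they can
            // never be mistaken for a quote or a backslash.
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] rest = "ab\\\"c\" tail".getBytes(StandardCharsets.UTF_8);
        DataInput in = new DataInputStream(new ByteArrayInputStream(rest));
        skipString(in);
        System.out.println("next byte after string: " + in.readUnsignedByte());
    }
}
```

        An unterminated string ends in EOFException here, since the DataInput is read past its end.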
      • _handleUnexpectedValue

        protected JsonToken _handleUnexpectedValue​(int c)
                                            throws java.io.IOException
        Method for handling cases where first non-space character of an expected value token is not legal for standard JSON content.
        Throws:
        java.io.IOException
      • _handleApos

        protected JsonToken _handleApos()
                                 throws java.io.IOException
        Throws:
        java.io.IOException
      • _handleInvalidNumberStart

        protected JsonToken _handleInvalidNumberStart​(int ch,
                                                      boolean neg)
                                               throws java.io.IOException
        Method called if an expected numeric value (due to a leading sign) does not look like a number.
        Throws:
        java.io.IOException
      • _matchToken

        protected final void _matchToken​(java.lang.String matchStr,
                                         int i)
                                  throws java.io.IOException
        Throws:
        java.io.IOException
      • _checkMatchEnd

        private final void _checkMatchEnd​(java.lang.String matchStr,
                                          int i,
                                          int ch)
                                   throws java.io.IOException
        Throws:
        java.io.IOException
      • _skipWS

        private final int _skipWS()
                           throws java.io.IOException
        Throws:
        java.io.IOException
      • _skipWSOrEnd

        private final int _skipWSOrEnd()
                                throws java.io.IOException
        Alternative to _skipWS() that handles possible EOFException caused by trying to read past the end of InputData.
        Throws:
        java.io.IOException
        Since:
        2.9
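
        With a DataInput, end of input is only signalled by EOFException, so a whitespace skipper that may legitimately hit the end has to translate that exception. A minimal sketch follows, with WhitespaceSkipper as a hypothetical name and -1 assumed as the end-of-input marker; line counting is omitted.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

/** Hypothetical illustration: skip whitespace, mapping EOFException to -1. */
class WhitespaceSkipper {
    /** Returns the first byte above the space character, or -1 at end of input. */
    static int skipWsOrEnd(DataInput in) throws IOException {
        try {
            while (true) {
                int c = in.readUnsignedByte();
                if (c > ' ') {
                    return c;
                }
                // Everything at or below 0x20 is skipped without checking that it
                // is legal JSON whitespace (simplified, unvalidated handling).
            }
        } catch (EOFException e) {
            return -1; // clean end of input between tokens
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] doc = "  \n\t{".getBytes(StandardCharsets.UTF_8);
        DataInput in = new DataInputStream(new ByteArrayInputStream(doc));
        System.out.println((char) skipWsOrEnd(in));
        System.out.println(skipWsOrEnd(in)); // nothing left
    }
}
```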
      • _skipWSComment

        private final int _skipWSComment​(int i)
                                  throws java.io.IOException
        Throws:
        java.io.IOException
      • _skipColon

        private final int _skipColon()
                              throws java.io.IOException
        Throws:
        java.io.IOException
      • _skipColon2

        private final int _skipColon2​(int i,
                                      boolean gotColon)
                               throws java.io.IOException
        Throws:
        java.io.IOException
      • _skipComment

        private final void _skipComment()
                                 throws java.io.IOException
        Throws:
        java.io.IOException
      • _skipCComment

        private final void _skipCComment()
                                  throws java.io.IOException
        Throws:
        java.io.IOException
      • _skipYAMLComment

        private final boolean _skipYAMLComment()
                                        throws java.io.IOException
        Throws:
        java.io.IOException
      • _skipLine

        private final void _skipLine()
                              throws java.io.IOException
        Method for skipping contents of an input line; usually for CPP and YAML style comments.
        Throws:
        java.io.IOException
      • _decodeEscaped

        protected char _decodeEscaped()
                               throws java.io.IOException
        Description copied from class: ParserBase
        Method that sub-classes must implement to support escaped sequences in base64-encoded sections. Sub-classes that do not need base64 support can leave this as is.
        Overrides:
        _decodeEscaped in class ParserBase
        Throws:
        java.io.IOException
      • _decodeCharForError

        protected int _decodeCharForError​(int firstByte)
                                   throws java.io.IOException
        Throws:
        java.io.IOException
      • _decodeUtf8_2

        private final int _decodeUtf8_2​(int c)
                                 throws java.io.IOException
        Throws:
        java.io.IOException
      • _decodeUtf8_3

        private final int _decodeUtf8_3​(int c1)
                                 throws java.io.IOException
        Throws:
        java.io.IOException
      • _decodeUtf8_4

        private final int _decodeUtf8_4​(int c)
                                 throws java.io.IOException
        Returns:
        Character value minus 0x10000; this so that caller can readily expand it to actual surrogates
        Throws:
        java.io.IOException
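
        Returning the code point minus 0x10000 is convenient because that 20-bit value splits directly into the two UTF-16 surrogates. The sketch below shows the arithmetic; Utf8FourByte and its methods are hypothetical names, and the continuation bytes are not validated.

```java
/** Hypothetical illustration of 4-byte UTF-8 decoding and surrogate expansion. */
class Utf8FourByte {
    /** Decodes four UTF-8 bytes and returns the code point minus 0x10000. */
    static int decodeMinus10000(int b1, int b2, int b3, int b4) {
        int codePoint = ((b1 & 0x07) << 18)
                | ((b2 & 0x3F) << 12)
                | ((b3 & 0x3F) << 6)
                | (b4 & 0x3F);
        return codePoint - 0x10000;
    }

    /** Upper 10 bits of the offset value go into the high surrogate. */
    static char highSurrogate(int v) {
        return (char) (0xD800 | (v >> 10));
    }

    /** Lower 10 bits of the offset value go into the low surrogate. */
    static char lowSurrogate(int v) {
        return (char) (0xDC00 | (v & 0x3FF));
    }

    public static void main(String[] args) {
        // U+1F600 is encoded in UTF-8 as F0 9F 98 80.
        int v = decodeMinus10000(0xF0, 0x9F, 0x98, 0x80);
        char[] pair = { highSurrogate(v), lowSurrogate(v) };
        System.out.println(Integer.toHexString(pair[0]) + " " + Integer.toHexString(pair[1]));
        System.out.println(new String(pair).codePointAt(0) == 0x1F600);
    }
}
```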
      • _skipUtf8_2

        private final void _skipUtf8_2()
                                throws java.io.IOException
        Throws:
        java.io.IOException
      • _skipUtf8_3

        private final void _skipUtf8_3()
                                throws java.io.IOException
        Throws:
        java.io.IOException
      • _skipUtf8_4

        private final void _skipUtf8_4()
                                throws java.io.IOException
        Throws:
        java.io.IOException
      • _reportInvalidToken

        protected void _reportInvalidToken​(int ch,
                                           java.lang.String matchedPart)
                                    throws java.io.IOException
        Throws:
        java.io.IOException
      • _reportInvalidToken

        protected void _reportInvalidToken​(int ch,
                                           java.lang.String matchedPart,
                                           java.lang.String msg)
                                    throws java.io.IOException
        Throws:
        java.io.IOException
      • _growArrayBy

        private static int[] _growArrayBy​(int[] arr,
                                          int more)
      • _decodeBase64

        protected final byte[] _decodeBase64​(Base64Variant b64variant)
                                      throws java.io.IOException
        Efficient handling for incremental parsing of base64-encoded textual content.
        Throws:
        java.io.IOException
      • getTokenLocation

        public JsonLocation getTokenLocation()
        Description copied from class: ParserBase
        Method that returns the starting location of the current token; that is, the position of the first character from input that starts the current token.
        Overrides:
        getTokenLocation in class ParserBase
      • getCurrentLocation

        public JsonLocation getCurrentLocation()
        Description copied from class: ParserBase
        Method that returns the location of the last processed character; usually for error reporting purposes.
        Overrides:
        getCurrentLocation in class ParserBase
      • pad

        private static final int pad​(int q,
                                     int bytes)
        Helper method needed to fix [Issue#148], masking of 0x00 character
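
        One plausible reading of the 0x00 masking problem: when a partial quad is right-aligned, a one-byte name consisting of 0x00 and a two-byte name of two 0x00 bytes both pack to the int value 0. The sketch below assumes the fix is to fill the unused high-order bytes with 1-bits; PadSketch is a hypothetical name and this is an illustration of the idea, not a copy of the library's code.

```java
/** Hypothetical illustration: make partial quads distinguishable by length. */
class PadSketch {
    /**
     * Fills the unused high-order bytes of a partial quad with 1-bits.
     * A full quad is returned unchanged; the explicit check is needed because
     * Java masks int shift counts to 5 bits, so (-1 << 32) would equal -1.
     */
    static int pad(int q, int bytes) {
        return (bytes == 4) ? q : (q | (-1 << (bytes << 3)));
    }

    public static void main(String[] args) {
        // Without padding both quads below would be 0 and therefore identical.
        System.out.println(Integer.toHexString(pad(0x00, 1)));
        System.out.println(Integer.toHexString(pad(0x0000, 2)));
    }
}
```

        The byte 0xFF never occurs in valid UTF-8, so the padding cannot collide with real name content.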