Class UnicodeEncoding

All Implemented Interfaces:
Cloneable
Direct Known Subclasses:
BaseUTF8Encoding, FixedWidthUnicodeEncoding, UTF16BEEncoding, UTF16LEEncoding

public abstract class UnicodeEncoding extends MultiByteEncoding
  • Constructor Details

    • UnicodeEncoding

      protected UnicodeEncoding(String name, int minLength, int maxLength, int[] EncLen, int[][] Trans)
    • UnicodeEncoding

      protected UnicodeEncoding(String name, int minLength, int maxLength, int[] EncLen)
  • Method Details

    • getCharsetName

      public String getCharsetName()
      Description copied from class: Encoding
      The name of the equivalent Java Charset for this encoding. Defaults to the name of the encoding. Subclasses can override this to provide a different name.
      Overrides:
      getCharsetName in class Encoding
      Returns:
      the name of the equivalent Java Charset for this encoding
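      The returned name is meant to identify a Java Charset, so it can be handed to the standard java.nio.charset API. The sketch below is a minimal illustration using only the JDK; the class CharsetNameSketch and the constant ASSUMED_CHARSET_NAME are hypothetical, and "UTF-8" is an assumed example value rather than something stated on this page.

```java
import java.nio.charset.Charset;

class CharsetNameSketch {
    // Hypothetical stand-in for the string returned by getCharsetName();
    // "UTF-8" is an assumed example, not a value documented above.
    static final String ASSUMED_CHARSET_NAME = "UTF-8";

    // Decode raw bytes with the Java Charset identified by the given name.
    static String decode(byte[] bytes, String charsetName) {
        Charset charset = Charset.forName(charsetName);
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // 0xC3 0xA9 is the UTF-8 encoding of U+00E9.
        byte[] raw = {(byte) 0xC3, (byte) 0xA9};
        String text = decode(raw, ASSUMED_CHARSET_NAME);
        System.out.println(text.length());
    }
}
```

      Charset.forName throws UnsupportedCharsetException for names the JVM does not know, so a caller may want to guard the lookup with Charset.isSupported.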
    • isCodeCType

      public boolean isCodeCType(int code, int ctype)
      Description copied from class: Encoding
      Checks whether the given code point belongs to the given character type (used, for example, by isWord(someByte) and similar methods).
      Specified by:
      isCodeCType in class Encoding
      Parameters:
      code - a code point of a character
      ctype - a character type to check against
      Oniguruma equivalent: is_code_ctype
    • isInCodeRange

      public static boolean isInCodeRange(UnicodeCodeRange range, int code)
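      The signature suggests a membership test of a code point against a table of ranges. The sketch below shows one common way to do such a test, a binary search over sorted, non-overlapping, inclusive [from, to] pairs. The flat int[] layout and the class RangeLookupSketch are assumptions made for illustration; they are not the library's UnicodeCodeRange representation.

```java
class RangeLookupSketch {
    // Assumed layout: {from0, to0, from1, to1, ...}, sorted ascending,
    // with inclusive bounds and no overlapping pairs.
    static boolean isInRanges(int[] ranges, int code) {
        int low = 0;
        int high = ranges.length / 2 - 1;
        while (low <= high) {
            int mid = (low + high) >>> 1;
            int from = ranges[2 * mid];
            int to = ranges[2 * mid + 1];
            if (code < from) {
                high = mid - 1;      // code lies before this pair
            } else if (code > to) {
                low = mid + 1;       // code lies after this pair
            } else {
                return true;         // from <= code <= to
            }
        }
        return false;
    }

    public static void main(String[] args) {
        int[] asciiLetters = {0x41, 0x5A, 0x61, 0x7A}; // A-Z, a-z
        System.out.println(isInRanges(asciiLetters, 'q'));
        System.out.println(isInRanges(asciiLetters, '5'));
    }
}
```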
    • ctypeCodeRange

      protected final int[] ctypeCodeRange(int ctype)
    • propertyNameToCType

      public int propertyNameToCType(byte[] name, int p, int end)
      Description copied from class: AbstractEncoding
      In AbstractEncoding this corresponds to Oniguruma's onigenc_minimum_property_name_to_ctype; Unicode encodings notably override it.
      Overrides:
      propertyNameToCType in class AbstractEncoding
    • mbcCaseFold

      public int mbcCaseFold(int flag, byte[] bytes, IntHolder pp, int end, byte[] fold)
      Description copied from class: AbstractEncoding
      In AbstractEncoding this corresponds to Oniguruma's onigenc_ascii_mbc_case_fold.
      Overrides:
      mbcCaseFold in class AbstractEncoding
      Parameters:
      flag - case fold flag
      pp - an IntHolder that points at the character head
      fold - a buffer that receives the case-folded character
      Oniguruma equivalent: mbc_case_fold
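      The parameters describe an in/out calling convention: the position holder marks the character head in the input, and the folded bytes go into a caller-supplied buffer. The sketch below illustrates that convention for ASCII input only. It is not the Unicode folding performed by this class; IntBox is a hypothetical stand-in for IntHolder, and returning the number of bytes written while advancing the position past the consumed character are assumptions of the sketch.

```java
class CaseFoldSketch {
    // Hypothetical stand-in for the library's IntHolder: a mutable int box.
    static final class IntBox {
        int value;
    }

    // ASCII-only fold: lowercases the byte at pp.value into fold[0],
    // advances pp by one byte, and returns the number of bytes written.
    // Returns 0 without touching anything if pp is already at or past end.
    static int asciiCaseFold(byte[] bytes, IntBox pp, int end, byte[] fold) {
        if (pp.value >= end) {
            return 0;
        }
        byte b = bytes[pp.value];
        if (b >= 'A' && b <= 'Z') {
            b = (byte) (b + ('a' - 'A'));
        }
        fold[0] = b;
        pp.value++;
        return 1;
    }

    public static void main(String[] args) {
        byte[] input = {'H', 'i', '!'};
        byte[] fold = new byte[1];
        IntBox pp = new IntBox();
        StringBuilder out = new StringBuilder();
        // Fold one character at a time until the input is exhausted.
        while (asciiCaseFold(input, pp, input.length, fold) > 0) {
            out.append((char) fold[0]);
        }
        System.out.println(out);
    }
}
```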
    • applyAllCaseFold

      public void applyAllCaseFold(int flag, ApplyAllCaseFoldFunction fun, Object arg)
      Description copied from class: AbstractEncoding
      In AbstractEncoding this corresponds to Oniguruma's onigenc_ascii_apply_all_case_fold; that implementation is also used by multibyte encodings.
      Overrides:
      applyAllCaseFold in class AbstractEncoding
      Parameters:
      flag - case fold flag
      fun - case folding functor (see ApplyCaseFold)
      arg - case folding functor argument (see ApplyCaseFoldArg)
      Oniguruma equivalent: apply_all_case_fold
    • caseFoldCodesByString

      public CaseFoldCodeItem[] caseFoldCodesByString(int flag, byte[] bytes, int p, int end)
      Description copied from class: AbstractEncoding
      In AbstractEncoding this corresponds to Oniguruma's onigenc_ascii_get_case_fold_codes_by_str; that implementation is also used by multibyte encodings.
      Overrides:
      caseFoldCodesByString in class AbstractEncoding
    • caseMap

      public final int caseMap(IntHolder flagP, byte[] bytes, IntHolder pp, int end, byte[] to, int toP, int toEnd)
      Description copied from class: Encoding
      Oniguruma equivalent: case_map
      Overrides:
      caseMap in class MultiByteEncoding
    • readFoldN

      private static Object[] readFoldN(int fromSize, String table)
    • extractLength

      private static int extractLength(int packed)
    • extractCode

      private static int extractCode(int packed)