Class OffsetFactory


  • public final class OffsetFactory
    extends Object
    • Method Detail

      • createOffset

        public static AOffset createOffset​(int[] indexes)
        Main factory method for creating Offsets. Note that this creator is unsafe: it assumes that the input index list contains only sorted, strictly increasing values with no duplicates.
        Parameters:
        indexes - List of indexes, which is assumed to be sorted and to contain no duplicates
        Returns:
        AOffset object containing offsets to the next value.
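        Because the factory does not validate its input, a caller may want to check the precondition before calling it. The following is a minimal sketch of such a check; the class OffsetInputCheck and its method isStrictlyIncreasing are hypothetical helpers and not part of OffsetFactory.

```java
/** Hypothetical helper, not part of the documented API. */
final class OffsetInputCheck {

    /**
     * Returns true if every element is strictly larger than the one before it,
     * which implies the array is sorted and contains no duplicates.
     */
    static boolean isStrictlyIncreasing(int[] indexes) {
        for (int i = 1; i < indexes.length; i++) {
            if (indexes[i] <= indexes[i - 1]) {
                return false; // out of order or duplicate
            }
        }
        return true; // empty and single-element arrays pass trivially
    }

    public static void main(String[] args) {
        System.out.println(isStrictlyIncreasing(new int[] {1, 4, 9})); // valid input
        System.out.println(isStrictlyIncreasing(new int[] {1, 4, 4})); // duplicate
        System.out.println(isStrictlyIncreasing(new int[] {5, 2, 8})); // unsorted
    }
}
```

        A check like this costs one linear pass, so it is probably best kept to tests or debug builds when the indexes are already known to come from a sorted source.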
      • createOffset

        public static AOffset createOffset​(IntArrayList indexes)
        Create the offsets from the primitive IntArrayList. Note that this creator is unsafe: it assumes that the input index list contains only sorted, strictly increasing values with no duplicates.
        Parameters:
        indexes - The list of indexes, which is assumed to be sorted and to contain no duplicates
        Returns:
        AOffset object containing offsets to the next value.
      • createOffset

        public static AOffset createOffset​(int[] indexes,
                                           OffsetFactory.OFF_TYPE type)
        Try to create a specific type of offset.
        Parameters:
        indexes - The list of indexes, which is assumed to be sorted and to contain no duplicates
        type - The type requested.
        Returns:
        The resulting offset
      • createOffset

        public static AOffset createOffset​(int[] indexes,
                                           int apos,
                                           int alen)
        Create an Offset based on a subset of the given indexes. This is useful when the input is created from a CSR matrix, since it avoids reallocating indexes[] and instead uses the shared indexes from the entire CSR representation. Note that this creator is unsafe: it assumes that the input index list contains only sorted, strictly increasing values with no duplicates.
        Parameters:
        indexes - The indexes from which to take the offsets.
        apos - The position to start looking from in the indexes.
        alen - The position at which to stop looking in the indexes.
        Returns:
        A new Offset.
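        To illustrate the CSR use case, the sketch below reads one row's range out of a shared column-index array and computes the gaps between consecutive indexes without copying the range first. The names CsrRangeSketch, rowPtr, colIdx and deltas are hypothetical, and treating alen as an exclusive end position is an assumption that this page does not confirm.

```java
import java.util.Arrays;

/** Hypothetical illustration, not part of the documented API. */
final class CsrRangeSketch {

    /**
     * Gaps between consecutive indexes in indexes[apos .. alen).
     * Assumption: alen is an exclusive end position.
     */
    static int[] deltas(int[] indexes, int apos, int alen) {
        int n = alen - apos;
        if (n <= 1) {
            return new int[0]; // zero or one index: no gaps to report
        }
        int[] out = new int[n - 1];
        for (int i = 1; i < n; i++) {
            out[i - 1] = indexes[apos + i] - indexes[apos + i - 1];
        }
        return out;
    }

    public static void main(String[] args) {
        // CSR layout for two rows: row 0 has columns {0, 2, 7}, row 1 has {1, 4}.
        int[] rowPtr = {0, 3, 5};
        int[] colIdx = {0, 2, 7, 1, 4};

        for (int row = 0; row < rowPtr.length - 1; row++) {
            // The shared colIdx array is passed as-is; only the range changes.
            int[] gaps = deltas(colIdx, rowPtr[row], rowPtr[row + 1]);
            System.out.println("row " + row + ": " + Arrays.toString(gaps));
        }
    }
}
```

        Note that the sortedness assumption only needs to hold within each row's range, since column indexes in a CSR array restart at every row boundary.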
      • readIn

        public static AOffset readIn​(DataInput in)
                              throws IOException
        Read in AOffset from the DataInput.
        Parameters:
        in - DataInput to read from
        Returns:
        The AOffset data instance
        Throws:
        IOException - If the DataInput fails reading in the variables
      • estimateInMemorySize

        public static long estimateInMemorySize​(int size,
                                                int nRows)
        Estimate the in-memory size of an offset from the average difference between elements. The average difference only works under the assumption that the offsets follow a normal distribution. For example, with 1000 rows and 100 offsets, the average distance between elements is assumed to be 10. A possible future improvement (TODO) is to add some amount to size when the average distance is almost the same as the maximum value of the OffsetLists. This would increase the estimated size, approximate the real compressed size better, and handle edge cases better.
        Parameters:
        size - The estimated number of offsets
        nRows - The number of rows.
        Returns:
        The estimated size of an offset given the number of offsets and rows.
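        The averaging step described above is plain integer arithmetic and can be sketched as follows. The class AvgDistanceSketch is hypothetical, and the rule that picks a one-byte or two-byte encoding from the average distance is an illustrative assumption, not the documented behavior of estimateInMemorySize.

```java
/** Hypothetical illustration, not part of the documented API. */
final class AvgDistanceSketch {

    /** Average distance between offsets, e.g. 1000 rows / 100 offsets = 10. */
    static int averageDistance(int size, int nRows) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        return nRows / size;
    }

    /**
     * Assumption for illustration only: if the average gap fits in an
     * unsigned byte (below 256), one byte per offset is enough;
     * otherwise two bytes are used.
     */
    static int assumedBytesPerOffset(int size, int nRows) {
        return averageDistance(size, nRows) < 256 ? 1 : 2;
    }

    public static void main(String[] args) {
        System.out.println(averageDistance(100, 1000));        // the example from the text
        System.out.println(assumedBytesPerOffset(100, 1000));  // small gaps
        System.out.println(assumedBytesPerOffset(10, 100000)); // large gaps
    }
}
```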
      • correctionByte

        public static int correctionByte​(int nRows,
                                         int size)
      • correctionChar

        public static int correctionChar​(int nRows,
                                         int size)