Interface ICLAScheme

  • All Superinterfaces:
    Cloneable
    All Known Implementing Classes:
    ACLAScheme, ConstScheme, DDCScheme, DDCSchemeMC, DDCSchemeSC, EmptyScheme, RLEScheme, SDCScheme, SDCSchemeMC, SDCSchemeSC, UncompressedScheme

    public interface ICLAScheme
    extends Cloneable
    Interface for a scheme instance. Instances of this interface encode the minimum values required to reproduce a compression scheme and apply it to unseen data. The reproduced compression scheme should be applicable to unseen data and to multiple instances of data under the same compression plan, which in turn makes it possible to continuously extend already compressed data representations. A single scheme is only responsible for encoding a single column group type.
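    The idea of a scheme that stores only what is needed to re-apply a compression to unseen data can be sketched with a toy dictionary encoder. This is a minimal illustration under stated assumptions, not the SystemDS implementation: the class ToyDictScheme and its methods are hypothetical, and plain double[] and int[] arrays stand in for MatrixBlock and AColGroup.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical toy scheme (not part of SystemDS): it keeps only the distinct
// values of one column, which is all that is needed to re-apply a dictionary
// encoding to data the scheme has not seen before.
final class ToyDictScheme {
    private final Map<Double, Integer> codes = new LinkedHashMap<>();

    // Extend the scheme so that every value in the column becomes encodable.
    ToyDictScheme update(double[] column) {
        for (double v : column)
            codes.putIfAbsent(v, codes.size());
        return this;
    }

    // Map each value to its dictionary code using the scheme as it currently is.
    int[] encode(double[] column) {
        int[] out = new int[column.length];
        for (int i = 0; i < column.length; i++) {
            Integer c = codes.get(column[i]);
            if (c == null)
                throw new IllegalStateException("value not covered by scheme: " + column[i]);
            out[i] = c;
        }
        return out;
    }

    // Number of distinct values currently covered by the scheme.
    int size() {
        return codes.size();
    }
}
```

    Because earlier codes never change when the toy scheme is extended, blocks encoded before an update stay valid afterwards, which mirrors the goal of continuously extending already compressed data.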
    • Field Detail

      • LOG

        static final org.apache.commons.logging.Log LOG
        Logging access for the CLA Scheme encoders
    • Method Detail

      • encode

        AColGroup encode​(MatrixBlock data)
        Encode the given matrix block into the scheme provided in the instance. The method is unsafe in the sense that if the encoding scheme does not fit, there is no guarantee that an error is thrown. To guarantee that the encoding scheme fits, first call update on the matrix block and use the returned scheme to ensure consistency.
        Parameters:
        data - The data to encode
        Returns:
        A compressed column group forced to use the scheme provided.
        Throws:
        IllegalArgumentException - If the number of columns does not correspond to the scheme's list of columns.
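        The difference between calling encode directly and calling update first can be illustrated with a toy encoder. This is only a sketch under assumptions: UnsafeToyScheme is a hypothetical class, arrays stand in for MatrixBlock and AColGroup, and the silent -1 code is one invented way in which a non-fitting scheme might fail without raising an error.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy scheme (not part of SystemDS) whose encode performs no checks.
final class UnsafeToyScheme {
    private final List<Double> dict = new ArrayList<>();

    // Return a new scheme that also covers the given values; this instance is unchanged.
    UnsafeToyScheme update(double[] column) {
        UnsafeToyScheme next = new UnsafeToyScheme();
        next.dict.addAll(dict);
        for (double v : column)
            if (!next.dict.contains(v))
                next.dict.add(v);
        return next;
    }

    // Unsafe: a value missing from the dictionary silently becomes code -1
    // instead of raising an error.
    int[] encode(double[] column) {
        int[] out = new int[column.length];
        for (int i = 0; i < column.length; i++)
            out[i] = dict.indexOf(column[i]);
        return out;
    }
}
```

        In this sketch, encoding {7.0, 9.0} with a scheme built from {5.0, 7.0} yields the codes {1, -1} without any exception, whereas calling update on the same data first and encoding with the returned scheme yields {1, 2}.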
      • encodeT

        AColGroup encodeT​(MatrixBlock data)
        Encode the given matrix block into the scheme provided in the instance. The input data is transposed. The method is unsafe in the sense that if the encoding scheme does not fit, there is no guarantee that an error is thrown. To guarantee that the encoding scheme fits, first call update on the matrix block and use the returned scheme to ensure consistency.
        Parameters:
        data - The transposed data to encode
        Returns:
        A compressed column group forced to use the scheme provided.
        Throws:
        IllegalArgumentException - If the number of columns does not correspond to the scheme's list of columns.
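        What a transposed input means for column access can be sketched as follows. The helper below is hypothetical and uses a double[][] in place of MatrixBlock; it assumes that in a transposed block each logical column is stored as one row.

```java
// Hypothetical helper (not part of SystemDS) that reads one logical column
// from a matrix stored either normally or transposed.
final class ToyTransposedAccess {
    // Normal layout:     data[row][col]
    // Transposed layout: data[col][row], so a logical column is one stored row.
    static double[] column(double[][] data, int col, boolean transposed) {
        if (transposed)
            return data[col].clone();
        double[] out = new double[data.length];
        for (int r = 0; r < data.length; r++)
            out[r] = data[r][col];
        return out;
    }
}
```

        Both layouts return the same values for the same logical column, so under this assumption an encoder for transposed data only differs in how it walks the input.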
      • encode

        AColGroup encode​(MatrixBlock data,
                         IColIndex columns)
        Encode a given matrix block into the scheme provided in the instance, but override which columns to use. The method is unsafe in the sense that if the encoding scheme does not fit, there is no guarantee that an error is thrown. To guarantee that the encoding scheme fits, first call update on the matrix block and use the returned scheme to ensure consistency.
        Parameters:
        data - The data to encode
        columns - The columns to apply the scheme to; the number of columns must equal that of the encoded scheme
        Returns:
        A compressed column group forced to use the scheme provided.
        Throws:
        IllegalArgumentException - If the number of columns in the columns argument does not correspond to the scheme's list of columns.
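        The column count requirement and the resulting IllegalArgumentException can be sketched with a toy class. ToyColumnScheme is hypothetical, an int[] stands in for IColIndex, and a plain column projection stands in for the actual encoding step.

```java
// Hypothetical toy scheme (not part of SystemDS) that was built for a fixed
// number of columns and validates any overriding column selection.
final class ToyColumnScheme {
    private final int nCols;

    ToyColumnScheme(int nCols) {
        this.nCols = nCols;
    }

    // Apply the scheme to the given columns of a row-major matrix. The
    // projection below is only a stand-in for a real encoding step.
    double[][] encode(double[][] data, int[] columns) {
        if (columns.length != nCols)
            throw new IllegalArgumentException(
                "scheme covers " + nCols + " columns but " + columns.length + " were given");
        double[][] out = new double[data.length][columns.length];
        for (int r = 0; r < data.length; r++)
            for (int c = 0; c < columns.length; c++)
                out[r][c] = data[r][columns[c]];
        return out;
    }
}
```

        A scheme created for two columns accepts any selection of two columns, such as {2, 0}, and rejects a selection of one column with an IllegalArgumentException.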
      • encodeT

        AColGroup encodeT​(MatrixBlock data,
                          IColIndex columns)
        Encode a given matrix block into the scheme provided in the instance, but override which columns to use. The input data is transposed. The method is unsafe in the sense that if the encoding scheme does not fit, there is no guarantee that an error is thrown. To guarantee that the encoding scheme fits, first call update on the matrix block and use the returned scheme to ensure consistency.
        Parameters:
        data - The transposed data to encode
        columns - The columns to apply the scheme to; the number of columns must equal that of the encoded scheme
        Returns:
        A compressed column group forced to use the scheme provided.
        Throws:
        IllegalArgumentException - If the number of columns in the columns argument does not correspond to the scheme's list of columns.
      • update

        ICLAScheme update​(MatrixBlock data)
        Update the encoding scheme to enable compression of the given data.
        Parameters:
        data - The data to update into the scheme
        Returns:
        An updated scheme
      • updateT

        ICLAScheme updateT​(MatrixBlock data)
        Update the encoding scheme to enable compression of the given data.
        Parameters:
        data - The transposed data to update into the scheme
        Returns:
        An updated scheme
      • update

        ICLAScheme update​(MatrixBlock data,
                          IColIndex columns)
        Update the encoding scheme to enable compression of the given data.
        Parameters:
        data - The data to update into the scheme
        columns - The columns to extract the data from
        Returns:
        An updated scheme
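        Updating a scheme from a selection of columns can be sketched with a toy class that collects distinct row tuples. ToyTupleScheme is hypothetical, a double[][] and an int[] stand in for MatrixBlock and IColIndex, and treating each row of the selected columns as one dictionary tuple is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy scheme (not part of SystemDS) that records the distinct
// tuples found in the selected columns of a row-major matrix.
final class ToyTupleScheme {
    private final Map<List<Double>, Integer> tuples = new LinkedHashMap<>();

    // Read only the given columns of each row and register unseen tuples.
    // This toy mutates and returns itself; the interface only states that an
    // updated scheme is returned.
    ToyTupleScheme update(double[][] data, int[] columns) {
        for (double[] row : data) {
            List<Double> key = new ArrayList<>(columns.length);
            for (int c : columns)
                key.add(row[c]);
            tuples.putIfAbsent(key, tuples.size());
        }
        return this;
    }

    // Number of distinct tuples currently covered by the scheme.
    int distinctTuples() {
        return tuples.size();
    }
}
```

        Callers should continue with the returned scheme rather than the original reference, since the interface does not say whether an implementation updates in place or returns a new instance.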
      • updateT

        ICLAScheme updateT​(MatrixBlock data,
                           IColIndex columns)
        Update the encoding scheme to enable compression of the given data.
        Parameters:
        data - The transposed data to update into the scheme
        columns - The columns to extract the data from
        Returns:
        An updated scheme
      • updateAndEncode

        Pair<ICLAScheme,​AColGroup> updateAndEncode​(MatrixBlock data)
        Update and encode the given block in a single pass. The single pass can fail when the dictionary size grows beyond the mapping sizes supported by individual encodings. The implementation should always work, falling back to a normal two-pass algorithm if the single pass fails.
        Parameters:
        data - The block to encode
        Returns:
        The updated scheme and an encoded column group
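        The single-pass attempt with a two-pass fallback can be sketched as below. Everything here is hypothetical: ToySinglePass, its Result holder, and the maxCodes parameter (a stand-in for the largest dictionary that a given mapping size can address) are invented for illustration, and arrays replace MatrixBlock and AColGroup.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch (not part of SystemDS) of update-and-encode in one pass
// with a fallback to two passes.
final class ToySinglePass {
    // Holder for the updated dictionary, the codes, and which path was taken.
    static final class Result {
        final Map<Double, Integer> dict;
        final int[] codes;
        final boolean singlePass;

        Result(Map<Double, Integer> dict, int[] codes, boolean singlePass) {
            this.dict = dict;
            this.codes = codes;
            this.singlePass = singlePass;
        }
    }

    static Result updateAndEncode(Map<Double, Integer> dict, double[] column, int maxCodes) {
        // Single pass: extend a copy of the dictionary while writing codes.
        Map<Double, Integer> d = new LinkedHashMap<>(dict);
        int[] codes = new int[column.length];
        boolean fits = true;
        for (int i = 0; i < column.length && fits; i++) {
            Integer c = d.get(column[i]);
            if (c == null) {
                if (d.size() >= maxCodes) {
                    fits = false; // dictionary outgrew the assumed mapping size
                    continue;
                }
                c = d.size();
                d.put(column[i], c);
            }
            codes[i] = c;
        }
        if (fits)
            return new Result(d, codes, true);

        // Fallback, pass one: update the dictionary with every value.
        Map<Double, Integer> full = new LinkedHashMap<>(dict);
        for (double v : column)
            full.putIfAbsent(v, full.size());
        // Fallback, pass two: encode with the complete dictionary.
        int[] wide = new int[column.length];
        for (int i = 0; i < column.length; i++)
            wide[i] = full.get(column[i]);
        return new Result(full, wide, false);
    }
}
```

        With maxCodes set to 2, the column {4.0, 4.0, 6.0} is handled in a single pass, while {4.0, 5.0, 6.0} exceeds the limit on its third value and is handled by the two-pass fallback; both calls still return a complete result.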
      • updateAndEncodeT

        Pair<ICLAScheme,​AColGroup> updateAndEncodeT​(MatrixBlock data)
        Update and encode the given block in a single pass. The single pass can fail when the dictionary size grows beyond the mapping sizes supported by individual encodings. The implementation should always work, falling back to a normal two-pass algorithm if the single pass fails.
        Parameters:
        data - The transposed block to encode
        Returns:
        The updated scheme and an encoded column group
      • updateAndEncode

        Pair<ICLAScheme,​AColGroup> updateAndEncode​(MatrixBlock data,
                                                         IColIndex columns)
        Try to update and encode in a single pass over the data. The single pass can fail when the dictionary size grows beyond the mapping sizes supported by individual encodings. The implementation should always work, falling back to a normal two-pass algorithm if the single pass fails.
        Parameters:
        data - The block to encode
        columns - The columns to encode
        Returns:
        The updated scheme and an encoded column group
      • updateAndEncodeT

        Pair<ICLAScheme,​AColGroup> updateAndEncodeT​(MatrixBlock data,
                                                          IColIndex columns)
        Try to update and encode in a single pass over the data. The single pass can fail when the dictionary size grows beyond the mapping sizes supported by individual encodings. The implementation should always work, falling back to a normal two-pass algorithm if the single pass fails.
        Parameters:
        data - The transposed block to encode
        columns - The columns to encode
        Returns:
        The updated scheme and an encoded column group