Class LibMatrixCuDNN
java.lang.Object
  org.apache.sysds.runtime.matrix.data.LibMatrixCUDA
    org.apache.sysds.runtime.matrix.data.LibMatrixCuDNN
public class LibMatrixCuDNN extends LibMatrixCUDA
This class contains methods that invoke cuDNN operations.
-
-
Field Summary
-
Fields inherited from class org.apache.sysds.runtime.matrix.data.LibMatrixCUDA
cudaSupportFunctions, customKernelSuffix, sizeOfDataType
-
-
Constructor Summary
Constructors
LibMatrixCuDNN()
-
Method Summary
All Methods  Static Methods  Concrete Methods

static void batchNormalizationBackward(GPUContext gCtx, String instName, MatrixObject image, MatrixObject dout, MatrixObject scale, MatrixObject dX, MatrixObject dScale, MatrixObject dBias, double epsilon, MatrixObject resultSaveMean, MatrixObject resultSaveInvVariance)
  Computes the backpropagation errors for image, scale, and bias of the batch normalization layer.

static void batchNormalizationForwardInference(GPUContext gCtx, String instName, MatrixObject image, MatrixObject scale, MatrixObject bias, MatrixObject runningMean, MatrixObject runningVar, MatrixObject ret, double epsilon)
  Performs the forward batch normalization computation for inference.

static void batchNormalizationForwardTraining(GPUContext gCtx, String instName, MatrixObject image, MatrixObject scale, MatrixObject bias, MatrixObject runningMean, MatrixObject runningVar, MatrixObject ret, MatrixObject retRunningMean, MatrixObject retRunningVar, double epsilon, double exponentialAverageFactor, MatrixObject resultSaveMean, MatrixObject resultSaveInvVariance)
  Performs the forward batch normalization computation for training.

static void conv2d(GPUContext gCtx, String instName, MatrixObject image, MatrixObject filter, MatrixObject outputBlock, int N, int C, int H, int W, int K, int R, int S, int pad_h, int pad_w, int stride_h, int stride_w, int P, int Q, double intermediateMemoryBudget)
  Performs a 2D convolution.

static void conv2dBackwardData(GPUContext gCtx, String instName, MatrixObject filter, MatrixObject dout, MatrixObject output, int N, int C, int H, int W, int K, int R, int S, int pad_h, int pad_w, int stride_h, int stride_w, int P, int Q, double intermediateMemoryBudget)
  Computes the backpropagation errors for the previous layer of the convolution operation.

static void conv2dBackwardFilter(GPUContext gCtx, String instName, MatrixObject image, MatrixObject dout, MatrixObject outputBlock, int N, int C, int H, int W, int K, int R, int S, int pad_h, int pad_w, int stride_h, int stride_w, int P, int Q, double intermediateMemoryBudget)
  Computes the backpropagation errors for the filter of the convolution operation.

static void conv2dBiasAdd(GPUContext gCtx, String instName, MatrixObject image, MatrixObject bias, MatrixObject filter, MatrixObject output, int N, int C, int H, int W, int K, int R, int S, int pad_h, int pad_w, int stride_h, int stride_w, int P, int Q, double intermediateMemoryBudget)
  Performs a 2D convolution followed by a bias_add.

static jcuda.Pointer getDensePointerForCuDNN(GPUContext gCtx, MatrixObject image, String instName, int numRows, int numCols)
  Convenience method to get the jcuda dense matrix pointer.

static void lstm(ExecutionContext ec, GPUContext gCtx, String instName, jcuda.Pointer X, jcuda.Pointer wPointer, jcuda.Pointer out0, jcuda.Pointer c0, boolean return_sequences, String outputName, String cyName, int N, int M, int D, int T)
  Computes the forward pass for an LSTM layer with M neurons.

static void lstmBackward(ExecutionContext ec, GPUContext gCtx, String instName, jcuda.Pointer x, jcuda.Pointer hx, jcuda.Pointer cx, jcuda.Pointer wPointer, String doutName, String dcyName, String dxName, String dwName, String dbName, String dhxName, String dcxName, boolean return_sequences, int N, int M, int D, int T)

static void pooling(GPUContext gCtx, String instName, MatrixObject image, MatrixObject outputBlock, int N, int C, int H, int W, int K, int R, int S, int pad_h, int pad_w, int stride_h, int stride_w, int P, int Q, LibMatrixDNN.PoolingType poolingType, double intermediateMemoryBudget)
  Performs pooling on the GPU via cudnnPoolingForward(...).

static void poolingBackward(GPUContext gCtx, String instName, MatrixObject image, MatrixObject dout, MatrixObject maxpoolOutput, MatrixObject outputBlock, int N, int C, int H, int W, int K, int R, int S, int pad_h, int pad_w, int stride_h, int stride_w, int P, int Q, LibMatrixDNN.PoolingType poolingType, double intermediateMemoryBudget)
  Performs pooling backward on the GPU via cudnnPoolingBackward(...); computes the backpropagation errors for the previous layer of the pooling operation.

static void relu(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in, String outputName)
  Performs the relu operation on the GPU.

static void softmax(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
  Performs a softmax operation on a matrix on the GPU.
Methods inherited from class org.apache.sysds.runtime.matrix.data.LibMatrixCUDA
abs, acos, asin, atan, axpy, biasAdd, biasMultiply, cbind, ceil, channelSums, computeNNZ, cos, cosh, cumulativeScan, cumulativeSumProduct, denseTranspose, deviceCopy, double2float, exp, float2double, floor, getCudaKernels, getDenseMatrixOutputForGPUInstruction, getDenseMatrixOutputForGPUInstruction, getDensePointer, getNnz, isInSparseFormat, log, matmultTSMM, matrixMatrixArithmetic, matrixMatrixRelational, matrixScalarArithmetic, matrixScalarOp, matrixScalarRelational, one, rbind, reluBackward, resetFloatingPointPrecision, round, sigmoid, sign, sin, sinh, sliceOperations, solve, sqrt, tan, tanh, toInt, transpose, unaryAggregate, zero
-
Method Detail
-
conv2dBiasAdd
public static void conv2dBiasAdd(GPUContext gCtx, String instName, MatrixObject image, MatrixObject bias, MatrixObject filter, MatrixObject output, int N, int C, int H, int W, int K, int R, int S, int pad_h, int pad_w, int stride_h, int stride_w, int P, int Q, double intermediateMemoryBudget)
Performs a 2D convolution followed by a bias_add.
Parameters:
gCtx - a valid GPUContext
instName - the invoking instruction's name, for recording statistics
image - input image matrix object
bias - bias matrix object
filter - filter matrix object
output - output matrix object
N - number of input images
C - number of channels
H - height of each image
W - width of each image
K - number of output "channels"
R - height of filter
S - width of filter
pad_h - padding height
pad_w - padding width
stride_h - stride height
stride_w - stride width
P - output height
Q - output width
intermediateMemoryBudget - intermediate memory budget
-
conv2d
public static void conv2d(GPUContext gCtx, String instName, MatrixObject image, MatrixObject filter, MatrixObject outputBlock, int N, int C, int H, int W, int K, int R, int S, int pad_h, int pad_w, int stride_h, int stride_w, int P, int Q, double intermediateMemoryBudget)
Performs a 2D convolution.
Parameters:
gCtx - a valid GPUContext
instName - the invoking instruction's name, for recording statistics
image - input matrix object
filter - filter matrix object
outputBlock - output matrix object
N - number of input images
C - number of channels
H - height of each image
W - width of each image
K - number of output "channels"
R - height of filter
S - width of filter
pad_h - padding height
pad_w - padding width
stride_h - stride height
stride_w - stride width
P - output height
Q - output width
intermediateMemoryBudget - intermediate memory budget
-
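As a quick sanity check on the P and Q arguments above, here is a minimal sketch of the standard 2D-convolution output-dimension arithmetic, out = (in + 2*pad - filter)/stride + 1. The helper class and method names are hypothetical, not part of SystemDS:

```java
// Hypothetical helper: computes one output spatial dimension of a 2D convolution.
// Standard formula: out = (in + 2*pad - filter) / stride + 1 (integer division).
public class ConvDims {
    public static int outDim(int in, int filter, int pad, int stride) {
        return (in + 2 * pad - filter) / stride + 1;
    }

    public static void main(String[] args) {
        // H = W = 224, R = S = 3, pad_h = pad_w = 1, stride_h = stride_w = 1
        int P = outDim(224, 3, 1, 1); // output height
        int Q = outDim(224, 3, 1, 1); // output width
        System.out.println(P + " x " + Q); // 224 x 224: "same" padding preserves size
    }
}
```

With stride 2 and the same padding, the same helper gives 112 x 112, which is how the P and Q values passed to conv2d are typically derived.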
softmax
public static void softmax(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
Performs a softmax operation on a matrix on the GPU.
Parameters:
ec - execution context
gCtx - a valid GPUContext
instName - the invoking instruction's name, for recording statistics
in1 - input matrix
outputName - output matrix name
-
conv2dBackwardFilter
public static void conv2dBackwardFilter(GPUContext gCtx, String instName, MatrixObject image, MatrixObject dout, MatrixObject outputBlock, int N, int C, int H, int W, int K, int R, int S, int pad_h, int pad_w, int stride_h, int stride_w, int P, int Q, double intermediateMemoryBudget)
Computes the backpropagation errors for the filter of the convolution operation.
Parameters:
gCtx - a valid GPUContext
instName - the invoking instruction's name, for recording statistics
image - input image
dout - errors from next layer
outputBlock - output errors
N - number of images
C - number of channels
H - height
W - width
K - number of filters
R - filter height
S - filter width
pad_h - pad height
pad_w - pad width
stride_h - stride height
stride_w - stride width
P - output activation height
Q - output activation width
intermediateMemoryBudget - intermediate memory budget
-
conv2dBackwardData
public static void conv2dBackwardData(GPUContext gCtx, String instName, MatrixObject filter, MatrixObject dout, MatrixObject output, int N, int C, int H, int W, int K, int R, int S, int pad_h, int pad_w, int stride_h, int stride_w, int P, int Q, double intermediateMemoryBudget)
Computes the backpropagation errors for the previous layer of the convolution operation.
Parameters:
gCtx - a valid GPUContext
instName - the invoking instruction's name, for recording statistics
filter - filter used in conv2d
dout - errors from next layer
output - output errors
N - number of images
C - number of channels
H - height
W - width
K - number of filters
R - filter height
S - filter width
pad_h - pad height
pad_w - pad width
stride_h - stride height
stride_w - stride width
P - output activation height
Q - output activation width
intermediateMemoryBudget - intermediate memory budget
-
pooling
public static void pooling(GPUContext gCtx, String instName, MatrixObject image, MatrixObject outputBlock, int N, int C, int H, int W, int K, int R, int S, int pad_h, int pad_w, int stride_h, int stride_w, int P, int Q, LibMatrixDNN.PoolingType poolingType, double intermediateMemoryBudget)
Performs pooling on the GPU via cudnnPoolingForward(...).
Parameters:
gCtx - a valid GPUContext
instName - the invoking instruction's name, for recording statistics
image - image as matrix object
outputBlock - output matrix
N - batch size
C - number of channels
H - height of image
W - width of image
K - number of filters
R - height of filter
S - width of filter
pad_h - vertical padding
pad_w - horizontal padding
stride_h - vertical stride
stride_w - horizontal stride
P - (H - R + 1 + 2*pad_h)/stride_h
Q - (W - S + 1 + 2*pad_w)/stride_w
poolingType - type of pooling
intermediateMemoryBudget - intermediate memory budget
-
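The pooling Javadoc above defines P and Q by explicit formulas rather than names. A small sketch of that arithmetic, exactly as stated there (the helper class is hypothetical, not part of SystemDS):

```java
// Computes pooling output dimensions exactly as the Javadoc above states:
//   P = (H - R + 1 + 2*pad_h) / stride_h
//   Q = (W - S + 1 + 2*pad_w) / stride_w
// (integer division throughout).
public class PoolDims {
    public static int outDim(int in, int filter, int pad, int stride) {
        return (in - filter + 1 + 2 * pad) / stride;
    }

    public static void main(String[] args) {
        // 2x2 pooling, stride 1, no padding, on a 28x28 image
        System.out.println(outDim(28, 2, 0, 1)); // 27
        // 2x2 pooling, stride 2, no padding, on a 27x27 image
        System.out.println(outDim(27, 2, 0, 2)); // 13
    }
}
```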
poolingBackward
public static void poolingBackward(GPUContext gCtx, String instName, MatrixObject image, MatrixObject dout, MatrixObject maxpoolOutput, MatrixObject outputBlock, int N, int C, int H, int W, int K, int R, int S, int pad_h, int pad_w, int stride_h, int stride_w, int P, int Q, LibMatrixDNN.PoolingType poolingType, double intermediateMemoryBudget)
Performs pooling backward on the GPU via cudnnPoolingBackward(...); computes the backpropagation errors for the previous layer of the pooling operation.
Parameters:
gCtx - a valid GPUContext
instName - the invoking instruction's name, for recording statistics
image - image as matrix object
dout - delta matrix, output of previous layer
maxpoolOutput - (optional, may be null) output of the maxpool forward function
outputBlock - output matrix
N - batch size
C - number of channels
H - height of image
W - width of image
K - number of filters
R - height of filter
S - width of filter
pad_h - vertical padding
pad_w - horizontal padding
stride_h - vertical stride
stride_w - horizontal stride
P - (H - R + 1 + 2*pad_h)/stride_h
Q - (W - S + 1 + 2*pad_w)/stride_w
poolingType - type of pooling
intermediateMemoryBudget - intermediate memory budget
-
relu
public static void relu(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in, String outputName)
Performs the relu operation on the GPU.
Parameters:
ec - currently active ExecutionContext
gCtx - a valid GPUContext
instName - the invoking instruction's name, for recording statistics
in - input matrix
outputName - name of the output matrix
-
lstm
public static void lstm(ExecutionContext ec, GPUContext gCtx, String instName, jcuda.Pointer X, jcuda.Pointer wPointer, jcuda.Pointer out0, jcuda.Pointer c0, boolean return_sequences, String outputName, String cyName, int N, int M, int D, int T) throws DMLRuntimeException
Computes the forward pass for an LSTM layer with M neurons. The input data has N sequences of T examples, each with D features.
Parameters:
ec - execution context
gCtx - gpu context
instName - name of the instruction
X - input matrix pointer
wPointer - weight matrix pointer
out0 - outputs from the previous timestep
c0 - initial cell state
return_sequences - whether to return `out` at all timesteps, or just for the final timestep
outputName - name of the out variable; if `return_sequences` is true, holds outputs for all timesteps
cyName - name of the output cell state, i.e., the cell state for the final timestep
N - minibatch size
M - hidden size
D - number of features
T - sequence length
Throws:
DMLRuntimeException - if an error occurs
-
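A sketch of the shape bookkeeping implied by the lstm parameters above. The [N, T*M] row-major flattening for the return_sequences case is an assumption about how SystemDS lays out multi-timestep outputs, not something this Javadoc states; the helper class is hypothetical:

```java
// Hypothetical shape helper for the lstm forward pass described above:
// N sequences of T steps, D input features, M hidden units.
public class LstmShapes {
    // Shape of the `out` variable: all timesteps flattened per row if
    // return_sequences, otherwise only the final timestep's hidden state.
    // ASSUMPTION: [N, T*M] flattening is not stated in the Javadoc.
    public static int[] outShape(int N, int T, int M, boolean returnSequences) {
        return returnSequences ? new int[]{N, T * M} : new int[]{N, M};
    }

    // Shape of the output cell state (cyName): one M-vector per sequence.
    public static int[] cyShape(int N, int M) {
        return new int[]{N, M};
    }
}
```

Under these assumptions, a batch of N=8 sequences of length T=10 with M=16 hidden units yields out of shape [8, 160] with return_sequences, and [8, 16] without; the cell state is [8, 16] either way.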
lstmBackward
public static void lstmBackward(ExecutionContext ec, GPUContext gCtx, String instName, jcuda.Pointer x, jcuda.Pointer hx, jcuda.Pointer cx, jcuda.Pointer wPointer, String doutName, String dcyName, String dxName, String dwName, String dbName, String dhxName, String dcxName, boolean return_sequences, int N, int M, int D, int T) throws DMLRuntimeException
Throws:
DMLRuntimeException - if an error occurs
-
batchNormalizationForwardTraining
public static void batchNormalizationForwardTraining(GPUContext gCtx, String instName, MatrixObject image, MatrixObject scale, MatrixObject bias, MatrixObject runningMean, MatrixObject runningVar, MatrixObject ret, MatrixObject retRunningMean, MatrixObject retRunningVar, double epsilon, double exponentialAverageFactor, MatrixObject resultSaveMean, MatrixObject resultSaveInvVariance) throws DMLRuntimeException
Performs the forward batch normalization computation for training.
Parameters:
gCtx - a valid GPUContext
instName - name of the instruction
image - input image
scale - scale (as per cuDNN), i.e., gamma in the original paper: shape [1, C, 1, 1]
bias - bias (as per cuDNN), i.e., beta in the original paper: shape [1, C, 1, 1]
runningMean - running mean accumulated during the training phase: shape [1, C, 1, 1]
runningVar - running variance accumulated during the training phase: shape [1, C, 1, 1]
ret - (output) normalized input
retRunningMean - (output) running mean accumulated during the training phase: shape [1, C, 1, 1]
retRunningVar - (output) running variance accumulated during the training phase: shape [1, C, 1, 1]
epsilon - epsilon value used in the batch normalization formula
exponentialAverageFactor - factor used in the moving average computation
resultSaveMean - (output) running mean accumulated during the training phase: shape [1, C, 1, 1]
resultSaveInvVariance - (output) running variance accumulated during the training phase: shape [1, C, 1, 1]
Throws:
DMLRuntimeException - if an error occurs
-
batchNormalizationForwardInference
public static void batchNormalizationForwardInference(GPUContext gCtx, String instName, MatrixObject image, MatrixObject scale, MatrixObject bias, MatrixObject runningMean, MatrixObject runningVar, MatrixObject ret, double epsilon) throws DMLRuntimeException
Performs the forward batch normalization computation for inference.
Parameters:
gCtx - a valid GPUContext
instName - name of the instruction
image - input image
scale - scale (as per cuDNN), i.e., gamma in the original paper: shape [1, C, 1, 1]
bias - bias (as per cuDNN), i.e., beta in the original paper: shape [1, C, 1, 1]
runningMean - running mean accumulated during the training phase: shape [1, C, 1, 1]
runningVar - running variance accumulated during the training phase: shape [1, C, 1, 1]
ret - normalized input
epsilon - epsilon value used in the batch normalization formula
Throws:
DMLRuntimeException - if an error occurs
-
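For reference, the per-element computation that the inference-time parameters above feed is the standard batch normalization formula, y = scale * (x - runningMean) / sqrt(runningVar + epsilon) + bias. A minimal scalar sketch (the helper class is hypothetical, not SystemDS code):

```java
// Standard batch normalization inference formula for one element of channel c:
//   y = scale * (x - runningMean) / sqrt(runningVar + epsilon) + bias
// where scale/bias/runningMean/runningVar are the per-channel [1, C, 1, 1] values.
public class BatchNormInference {
    public static double normalize(double x, double mean, double var,
                                   double scale, double bias, double epsilon) {
        return scale * (x - mean) / Math.sqrt(var + epsilon) + bias;
    }

    public static void main(String[] args) {
        // x=3, mean=1, var=4, scale=1, bias=0, epsilon=0 -> (3 - 1) / 2 = 1.0
        System.out.println(normalize(3.0, 1.0, 4.0, 1.0, 0.0, 0.0)); // 1.0
    }
}
```

epsilon only guards against division by zero when the variance is tiny, which is why both the training and inference entry points take it as an explicit argument.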
batchNormalizationBackward
public static void batchNormalizationBackward(GPUContext gCtx, String instName, MatrixObject image, MatrixObject dout, MatrixObject scale, MatrixObject dX, MatrixObject dScale, MatrixObject dBias, double epsilon, MatrixObject resultSaveMean, MatrixObject resultSaveInvVariance) throws DMLRuntimeException
Computes the backpropagation errors for image, scale, and bias of the batch normalization layer.
Parameters:
gCtx - a valid GPUContext
instName - name of the instruction
image - input image
dout - input errors of shape C, H, W
scale - scale (as per cuDNN), i.e., gamma in the original paper: shape [1, C, 1, 1]
dX - (output) backpropagation errors for the previous layer
dScale - backpropagation error for scale
dBias - backpropagation error for bias
epsilon - epsilon value used in the batch normalization formula
resultSaveMean - (input) running mean accumulated during the training phase: shape [1, C, 1, 1]
resultSaveInvVariance - (input) running variance accumulated during the training phase: shape [1, C, 1, 1]
Throws:
DMLRuntimeException - if an error occurs
-
getDensePointerForCuDNN
public static jcuda.Pointer getDensePointerForCuDNN(GPUContext gCtx, MatrixObject image, String instName, int numRows, int numCols) throws DMLRuntimeException
Convenience method to get the jcuda dense matrix pointer. This method explicitly converts sparse to dense format, so use it judiciously.
Parameters:
gCtx - a valid GPUContext
image - input matrix object
instName - name of the instruction
numRows - expected number of rows
numCols - expected number of columns
Returns:
jcuda pointer
Throws:
DMLRuntimeException - if an error occurs during sparse-to-dense conversion