Apache SINGA
A distributed deep learning platform.
namespace to support SSE2 vectorization
Classes | |
struct | FVec |
float vector real type, used for vectorization | |
struct | FVec< float > |
vector real type for float | |
struct | FVec< double > |
vector real type for double | |
struct | SSEOp |
SSE2 operator type for a given operator | |
struct | SSEOp< op::plus > |
struct | SSEOp< op::minus > |
struct | SSEOp< op::mul > |
struct | SSEOp< op::div > |
struct | SSEOp< op::identity > |
struct | Saver |
struct | Saver< sv::saveto, TFloat > |
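The class list above suggests a pattern in which an operator tag selects a packed SSE instruction through template specialization. The following is a minimal sketch of that idea, not the library's actual code: `PlusTag`, `MulTag`, `PackedOp`, and `MapPacked` are hypothetical names invented for illustration, and the sketch assumes an x86 or x86-64 target where `<emmintrin.h>` is available.

```cpp
#include <cstddef>
#include <emmintrin.h>  // SSE/SSE2 intrinsics; x86 and x86-64 only

// Hypothetical operator tags, standing in for tags such as op::plus and op::mul.
struct PlusTag {};
struct MulTag {};

// Hypothetical analogue of SSEOp: maps an operator tag to a packed
// instruction, with a scalar overload for leftover elements.
template <typename Op> struct PackedOp;

template <> struct PackedOp<PlusTag> {
  static __m128 Map(__m128 a, __m128 b) { return _mm_add_ps(a, b); }
  static float Map(float a, float b) { return a + b; }
};

template <> struct PackedOp<MulTag> {
  static __m128 Map(__m128 a, __m128 b) { return _mm_mul_ps(a, b); }
  static float Map(float a, float b) { return a * b; }
};

// Computes dst[i] = Op(lhs[i], rhs[i]) for i in [0, n).
// Unaligned loads and stores are used so the caller's buffers
// do not have to be 16-byte aligned in this sketch.
template <typename Op>
void MapPacked(float *dst, const float *lhs, const float *rhs, std::size_t n) {
  std::size_t i = 0;
  for (; i + 4 <= n; i += 4) {  // four floats per 128-bit register
    __m128 a = _mm_loadu_ps(lhs + i);
    __m128 b = _mm_loadu_ps(rhs + i);
    _mm_storeu_ps(dst + i, PackedOp<Op>::Map(a, b));
  }
  for (; i < n; ++i) {  // scalar tail for the remaining elements
    dst[i] = PackedOp<Op>::Map(lhs[i], rhs[i]);
  }
}
```

With aligned buffers, the unaligned loads could be swapped for `_mm_load_ps` and `_mm_store_ps`, which is presumably what the alignment helpers listed below are meant to enable.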
Functions | |
void * | AlignedMallocPitch (size_t &pitch, size_t lspace, size_t num_line) |
analogous to cudaMallocPitch; allocates an aligned space with num_line * lspace cells | |
void | AlignedFree (void *ptr) |
free aligned space | |
bool | CheckAlign (size_t pitch) |
check if a pitch is aligned | |
bool | CheckAlign (void *ptr) |
check if a pointer is aligned | |
index_t | UpperAlign (index_t size, size_t fsize) |
get upper bound of aligned index of size | |
index_t | LowerAlign (index_t size, size_t fsize) |
get lower bound of aligned index of size | |
void AlignedFree (void *ptr) | inline |
free aligned space
ptr | pointer to space to be freed |
void * AlignedMallocPitch (size_t &pitch, size_t lspace, size_t num_line) | inline |
analogous to cudaMallocPitch; allocates an aligned space with num_line * lspace cells
pitch | output parameter, the actual space allocated for each line |
lspace | number of cells required for each line |
num_line | number of lines to be allocated |
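A pitched, aligned allocation of this kind can be sketched in portable C++ as follows. This is an illustration under stated assumptions, not the library's implementation: the names `PitchAllocSketch`, `PitchFreeSketch`, and `IsAlignedSketch` are hypothetical, cells are treated as bytes, and the alignment is assumed to be 16 bytes (the width of an SSE2 register).

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Assumption: 16-byte alignment, matching 128-bit SSE2 registers.
const std::size_t kAlignBytes = 16;

// Hypothetical stand-in for AlignedMallocPitch.
// pitch receives the padded size of one line, in bytes.
inline void *PitchAllocSketch(std::size_t &pitch, std::size_t lspace,
                              std::size_t num_line) {
  // Round each line up to a multiple of kAlignBytes so every line
  // starts on an aligned address.
  pitch = (lspace + kAlignBytes - 1) / kAlignBytes * kAlignBytes;
  std::size_t total = pitch * num_line;
  // Over-allocate: room for alignment padding plus the raw pointer,
  // which is stored just before the aligned block.
  void *raw = std::malloc(total + kAlignBytes + sizeof(void *));
  if (raw == nullptr) return nullptr;
  std::uintptr_t start = reinterpret_cast<std::uintptr_t>(raw) + sizeof(void *);
  std::uintptr_t aligned = (start + kAlignBytes - 1) / kAlignBytes * kAlignBytes;
  reinterpret_cast<void **>(aligned)[-1] = raw;
  return reinterpret_cast<void *>(aligned);
}

// Hypothetical stand-in for AlignedFree: recovers the raw pointer
// stored by PitchAllocSketch and releases it.
inline void PitchFreeSketch(void *ptr) {
  if (ptr != nullptr) std::free(static_cast<void **>(ptr)[-1]);
}

// Hypothetical stand-in for CheckAlign (void *ptr).
inline bool IsAlignedSketch(const void *ptr) {
  return reinterpret_cast<std::uintptr_t>(ptr) % kAlignBytes == 0;
}
```

In this sketch, line `i` starts at `static_cast<char *>(base) + i * pitch`, so a request for 10 bytes per line yields a pitch of 16 and every line begins on an aligned address.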
index_t LowerAlign (index_t size, size_t fsize) | inline |
get lower bound of aligned index of size
size | size of the array |
fsize | size of float |
index_t UpperAlign (index_t size, size_t fsize) | inline |
get upper bound of aligned index of size
size | size of the array |
fsize | size of float |
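One plausible reading of the two bounds is sketched below: round the element count down or up to a whole number of 16-byte packs. The names `LowerAlignSketch` and `UpperAlignSketch`, the `index_t` typedef, the 16-byte pack size, and the interpretation of `fsize` as a byte size are all assumptions made for illustration; the library's own definitions may differ.

```cpp
#include <cassert>
#include <cstddef>

typedef unsigned index_t;  // assumption: index_t is an unsigned integer type

// Assumption: one pack is 16 bytes, the width of an SSE2 register.
const std::size_t kAlignBytes = 16;

// Largest multiple of the pack length that does not exceed size.
inline index_t LowerAlignSketch(index_t size, std::size_t fsize) {
  assert(fsize != 0 && kAlignBytes % fsize == 0);
  std::size_t per_pack = kAlignBytes / fsize;  // elements per 16-byte pack
  return static_cast<index_t>(size / per_pack * per_pack);
}

// Smallest multiple of the pack length that is at least size.
inline index_t UpperAlignSketch(index_t size, std::size_t fsize) {
  assert(fsize != 0 && kAlignBytes % fsize == 0);
  std::size_t per_pack = kAlignBytes / fsize;
  return static_cast<index_t>((size + per_pack - 1) / per_pack * per_pack);
}
```

Under these assumptions, an array of 10 floats (4 bytes each) has a lower bound of 8 and an upper bound of 12: indices 0 to 7 can be processed in full packs, and a buffer padded to 12 elements would let the last pack be processed as a whole.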