1. Programming Model

As in Java, the application code is divided in 3 parts: the Task definition interface, the main code and task implementations. These files must have the following notation,: <app_ame>.idl, for the interface file, <app_name>.cc for the main code and <app_name>-functions.cc for task implementations. Next paragraphs provide an example of how to define this files for matrix multiplication parallelised by blocks.

Task Definition Interface

As in Java the user has to provide a task selection by means of an interface. In this case the interface file has the same name as the main application file plus the suffix “idl”, i.e. Matmul.idl, where the main file is called Matmul.cc.

Code 59 Matmul.idl
interface Matmul
{
      // C functions
      void initMatrix(inout Matrix matrix,
                      in int mSize,
                      in int nSize,
                      in double val);

      void multiplyBlocks(inout Block block1,
                          inout Block block2,
                          inout Block block3);
};

The syntax of the interface file is shown in the previous code. Tasks can be declared as classic C function prototypes, this allow to keep the compatibility with standard C applications. In the example, initMatrix and multiplyBlocks are functions declared using its prototype, like in a C header file, but this code is C++ as they have objects as parameters (objects of type Matrix, or Block).

The grammar for the interface file is:

["static"] return-type task-name ( parameter {, parameter }* );

return-type = "void" | type

ask-name = <qualified name of the function or method>

parameter = direction type parameter-name

direction = "in" | "out" | "inout"

type = "char" | "int" | "short" | "long" | "float" | "double" | "boolean" |
       "char[<size>]" | "int[<size>]" | "short[<size>]" | "long[<size>]" |
       "float[<size>]" | "double[<size>]" | "string" | "File" | class-name

class-name = <qualified name of the class>

Main Program

The following code shows an example of matrix multiplication written in C++.

Code 60 Matrix multiplication
#include "Matmul.h"
#include "Matrix.h"
#include "Block.h"
int N; //MSIZE
int M; //BSIZE
double val;
int main(int argc, char **argv)
{
      Matrix A;
      Matrix B;
      Matrix C;

      N = atoi(argv[1]);
      M = atoi(argv[2]);
      val = atof(argv[3]);

      compss_on();

      A = Matrix::init(N,M,val);

      initMatrix(&B,N,M,val);
      initMatrix(&C,N,M,0.0);

      cout << "Waiting for initialization...\n";

      compss_wait_on(B);
      compss_wait_on(C);

      cout << "Initialization ends...\n";

      C.multiply(A, B);

      compss_off();
      return 0;
}

The developer has to take into account the following rules:

  1. A header file with the same name as the main file must be included, in this case Matmul.h. This header file is automatically generated by the binding and it contains other includes and type-definitions that are required.
  2. A call to the compss_on binding function is required to turn on the COMPSs runtime.
  3. As in C language, out or inout parameters should be passed by reference by means of the “&” operator before the parameter name.
  4. Synchronization on a parameter can be done calling the compss_wait_on binding function. The argument of this function must be the variable or object we want to synchronize.
  5. There is an implicit synchronization in the init method of Matrix. It is not possible to know the address of “A” before exiting the method call and due to this it is necessary to synchronize before for the copy of the returned value into “A” for it to be correct.
  6. A call to the compss_off binding function is required to turn off the COMPSs runtime.

Functions file

The implementation of the tasks in a C or C++ program has to be provided in a functions file. Its name must be the same as the main file followed by the suffix “-functions”. In our case Matmul-functions.cc.

#include "Matmul.h"
#include "Matrix.h"
#include "Block.h"

void initMatrix(Matrix *matrix,int mSize,int nSize,double val){
     *matrix = Matrix::init(mSize, nSize, val);
}

void multiplyBlocks(Block *block1,Block *block2,Block *block3){
     block1->multiply(*block2, *block3);
}

In the previous code, class methods have been encapsulated inside a function. This is useful when the class method returns an object or a value and we want to avoid the explicit synchronization when returning from the method.

Additional source files

Other source files needed by the user application must be placed under the directory “src”. In this directory the programmer must provide a Makefile that compiles such source files in the proper way. When the binding compiles the whole application it will enter into the src directory and execute the Makefile.

It generates two libraries, one for the master application and another for the worker application. The directive COMPSS_MASTER or COMPSS_WORKER must be used in order to compile the source files for each type of library. Both libraries will be copied into the lib directory where the binding will look for them when generating the master and worker applications.

The following sections provide a more detailed view of the C++ Binding. It will include the available API calls, how to deal with objects and having tasks as method objects as well as how to define constraints and task versions.

1.1. Binding API

Besides the aforementioned compss_on, compss_off and compss_wait_on functions, the C/C++ main program can make use of a variety of other API calls to better manage the synchronization of data generated by tasks. These calls are as follows:

void compss_ifstream(char * filename, ifstream* & * ifs)
Given an uninitialized input stream ifs and a file filename, this function will synchronize the content of the file and initialize ifs to read from it.
void compss_ofstream(char * filename, ofstream* & * ofs)
Behaves the same way as compss_ifstream, but in this case the opened stream is an output stream, meaning it will be used to write to the file.
FILE* compss_fopen(char * file_name, char * mode)
Similar to the C/C++ fopen call. Synchronizes with the last version of file file_name and returns the FILE* pointer to further reference it. As the mode parameter it takes the same that can be used in fopen (r, w, a, r+, w+ and a+).
void compss_wait_on(T** & * obj) or T compss_wait_on(T* & * obj)
Synchronizes for the last version of object obj, meaning that the execution will stop until the value of obj up to that point of the code is received (and thus all tasks that can modify it have ended).
void compss_delete_file(char * file_name)
Makes an asynchronous delete of file filename. When all previous tasks have finished updating the file, it is deleted.
void compss_delete_object(T** & * obj)
Makes an asynchronous delete of an object. When all previous tasks have finished updating the object, it is deleted.
void compss_barrier()
Similarly to the Python binding, performs an explicit synchronization without a return. When a compss_barrier is encountered, the execution will not continue until all the tasks submitted before the compss_barrier have finished.

1.2. Functions file

The implementation of the tasks in a C or C++ program has to be provided in a functions file. Its name must be the same as the main file followed by the suffix “-functions”. In our case Matmul-functions.cc.

#include "Matmul.h"
#include "Matrix.h"
#include "Block.h"

void initMatrix(Matrix *matrix,int mSize,int nSize,double val){
     *matrix = Matrix::init(mSize, nSize, val);
}

void multiplyBlocks(Block *block1,Block *block2,Block *block3){
     block1->multiply(*block2, *block3);
}

In the previous code, class methods have been encapsulated inside a function. This is useful when the class method returns an object or a value and we want to avoid the explicit synchronization when returning from the method.

1.3. Additional source files

Other source files needed by the user application must be placed under the directory “src”. In this directory the programmer must provide a Makefile that compiles such source files in the proper way. When the binding compiles the whole application it will enter into the src directory and execute the Makefile.

It generates two libraries, one for the master application and another for the worker application. The directive COMPSS_MASTER or COMPSS_WORKER must be used in order to compile the source files for each type of library. Both libraries will be copied into the lib directory where the binding will look for them when generating the master and worker applications.

1.4. Class Serialization

In case of using an object as method parameter, as callee or as return of a call to a function, the object has to be serialized. The serialization method has to be provided inline in the header file of the object’s class by means of the “boost” library. The next listing contains an example of serialization for two objects of the Block class.

#ifndef BLOCK_H
#define BLOCK_H

#include    <vector>
#include    <boost/archive/text_iarchive.hpp>
#include    <boost/archive/text_oarchive.hpp>
#include    <boost/serialization/serialization.hpp>
#include    <boost/serialization/access.hpp>
#include    <boost/serialization/vector.hpp>

using namespace std;
using namespace boost;
using namespace serialization;

class Block {
public:
    Block(){};
    Block(int bSize);
    static Block *init(int bSize, double initVal);
    void multiply(Block block1, Block block2);
    void print();

private:
    int M;
    std::vector< std::vector< double > > data;

    friend class::serialization::access;
    template<class Archive>
    void serialize(Archive & ar, const unsigned int version) {
        ar & M;
        ar & data;
    }
};
#endif

For more information about serialization using “boost” visit the related documentation at www.boost.org <www.boost.org>.

1.5. Method - Task

A task can be a C++ class method. A method can return a value, modify the this object, or modify a parameter.

If the method has a return value there will be an implicit synchronization before exit the method, but for the this object and parameters the synchronization can be done later after the method has finished.

This is because the this object and the parameters can be accessed inside and outside the method, but for the variable where the returned value is copied to, it can’t be known inside the method.

#include "Block.h"

Block::Block(int bSize) {
       M = bSize;
       data.resize(M);
       for (int i=0; i<M; i++) {
              data[i].resize(M);
       }
}

Block *Block::init(int bSize, double initVal) {
       Block *block = new Block(bSize);
       for (int i=0; i<bSize; i++) {
              for (int j=0; j<bSize; j++) {
                     block->data[i][j] = initVal;
              }
       }
       return block;
}

#ifdef COMPSS_WORKER

void Block::multiply(Block block1, Block block2) {
       for (int i=0; i<M; i++) {
              for (int j=0; j<M; j++) {
                     for (int k=0; k<M; k++) {
                            data[i][j] += block1.data[i][k] * block2.data[k][j];
                     }
              }
       }
       this->print();
}

#endif

void Block::print() {
       for (int i=0; i<M; i++) {
              for (int j=0; j<M; j++) {
                     cout << data[i][j] << " ";
              }
              cout << "\r\n";
       }
}

1.6. Task Constraints

The C/C++ binding also supports the definition of task constraints. The task definition specified in the IDL file must be decorated/annotated with the @Constraints. Below, you can find and example of how to define a task with a constraint of using 4 cores. The list of constraints which can be defined for a task can be found in Section [sec:Constraints]

interface Matmul
{
      @Constraints(ComputingUnits = 4)
      void multiplyBlocks(inout Block block1,
                          in Block block2,
                          in Block block3);

};

1.7. Task Versions

Another COMPSs functionality supported in the C/C++ binding is the definition of different versions for a tasks. The following code shows an IDL file where a function has two implementations, with their corresponding constraints. It show an example where the multiplyBlocks_GPU is defined as a implementation of multiplyBlocks using the annotation/decoration @Implements. It also shows how to set a processor constraint which requires a GPU processor and a CPU core for managing the offloading of the computation to the GPU.

interface Matmul
{
        @Constraints(ComputingUnits=4);
        void multiplyBlocks(inout Block block1,
                            in Block block2,
                            in Block block3);

        // GPU implementation
        @Constraints(processors={
               @Processor(ProcessorType=CPU, ComputingUnits=1)});
               @Processor(ProcessorType=GPU, ComputingUnits=1)});
        @Implements(multiplyBlocks);
        void multiplyBlocks_GPU(inout Block block1,
                                in Block block2,
                                in Block block3);

};