Executing COMPSs applications
Prerequisites
Prerequisites vary depending on the application’s code language: for Java applications the users need to have a jar archive containing all the application classes, for Python applications there are no requirements and for C/C++ applications the code must have been previously compiled by using the compss_build_app command.
For further information about how to develop COMPSs applications please refer to Application development.
Runcompss command
COMPSs applications are executed using the runcompss command:
compss@bsc:~$ runcompss [options] application_name [application_arguments]
The application name must be the fully qualified name of the application in Java, the path to the .py file containing the main program in Python and the path to the master binary in C/C++.
The application arguments are the ones passed as command line to main application. This parameter can be empty.
The runcompss
command allows the users to customize a COMPSs
execution by specifying different options. For clarity purposes,
parameters are grouped in Runtime configuration, Tools enablers and
Advanced options.
compss@bsc:~$ runcompss -h
Usage: /opt/COMPSs/Runtime/scripts/user/runcompss [options] application_name application_arguments
* Options:
General:
--help, -h Print this help message
--opts Show available options
--version, -v Print COMPSs version
Tools enablers:
--graph=<bool>, --graph, -g Generation of the complete graph (true/false)
When no value is provided it is set to true
Default: false
--tracing=<bool>, --tracing, -t Set generation of traces.
Default: false
--monitoring=<int>, --monitoring, -m Period between monitoring samples (milliseconds)
When no value is provided it is set to 2000
Default: 0
--external_debugger=<int>,
--external_debugger Enables external debugger connection on the specified port (or 9999 if empty)
Default: false
--jmx_port=<int> Enable JVM profiling on specified port
Runtime configuration options:
--task_execution=<compss|storage> Task execution under COMPSs or Storage.
Default: compss
--storage_impl=<string> Path to an storage implementation. Shortcut to setting pypath and classpath. See Runtime/storage in your installation folder.
--storage_conf=<path> Path to the storage configuration file
Default: null
--project=<path> Path to the project XML file
Default: /opt/COMPSs//Runtime/configuration/xml/projects/default_project.xml
--resources=<path> Path to the resources XML file
Default: /opt/COMPSs//Runtime/configuration/xml/resources/default_resources.xml
--lang=<name> Language of the application (java/c/python)
Default: Inferred is possible. Otherwise: java
--summary Displays a task execution summary at the end of the application execution
Default: false
--log_level=<level>, --debug, -d Set the debug level: off | info | api | debug | trace
Warning: Off level compiles with -O2 option disabling asserts and __debug__
Default: off
Advanced options:
--extrae_config_file=<path> Sets a custom extrae config file. Must be in a shared disk between all COMPSs workers.
Default: /opt/COMPSs//Runtime/configuration/xml/tracing/extrae_basic.xml
--extrae_config_file_python=<path> Sets a custom extrae config file for python. Must be in a shared disk between all COMPSs workers.
Default: null
--trace_label=<string> Add a label in the generated trace file. Only used in the case of tracing is activated.
Default: Applicacion name
--tracing_task_dependencies=<bool> Adds communication lines for the task dependencies (true/false)
Default: false
--generate_trace=<bool> Converts the events register into a trace file. Only used in the case of activated tracing.
Default: true
--delete_trace_packages=<bool> If true, deletes the tracing packages created by the run.
Default: true. Automatically, disabled if the trace is not generated.
--custom_threads=<bool> Threads in the trace file are re-ordered and customized to indicate the function of the thread.
Only used when the tracing is activated and a trace file generated.
Default: true
--comm=<ClassName> Class that implements the adaptor for communications
Supported adaptors:
├── es.bsc.compss.nio.master.NIOAdaptor
└── es.bsc.compss.gat.master.GATAdaptor
Default: es.bsc.compss.nio.master.NIOAdaptor
--conn=<className> Class that implements the runtime connector for the cloud
Supported connectors:
├── es.bsc.compss.connectors.DefaultSSHConnector
└── es.bsc.compss.connectors.DefaultNoSSHConnector
Default: es.bsc.compss.connectors.DefaultSSHConnector
--streaming=<type> Enable the streaming mode for the given type.
Supported types: FILES, OBJECTS, PSCOS, ALL, NONE
Default: NONE
--streaming_master_name=<str> Use an specific streaming master node name.
Default: Empty
--streaming_master_port=<int> Use an specific port for the streaming master.
Default: Empty
--scheduler=<className> Class that implements the Scheduler for COMPSs
Supported schedulers:
├── es.bsc.compss.components.impl.TaskScheduler
├── es.bsc.compss.scheduler.orderstrict.fifo.FifoTS
├── es.bsc.compss.scheduler.lookahead.fifo.FifoTS
├── es.bsc.compss.scheduler.lookahead.lifo.LifoTS
├── es.bsc.compss.scheduler.lookahead.locality.LocalityTS
├── es.bsc.compss.scheduler.lookahead.successors.constraintsfifo.ConstraintsFifoTS
├── es.bsc.compss.scheduler.lookahead.mt.successors.constraintsfifo.ConstraintsFifoTS
├── es.bsc.compss.scheduler.lookahead.successors.fifo.FifoTS
├── es.bsc.compss.scheduler.lookahead.mt.successors.fifo.FifoTS
├── es.bsc.compss.scheduler.lookahead.successors.lifo.LifoTS
├── es.bsc.compss.scheduler.lookahead.mt.successors.lifo.LifoTS
├── es.bsc.compss.scheduler.lookahead.successors.locality.LocalityTS
└── es.bsc.compss.scheduler.lookahead.mt.successors.locality.LocalityTS
Default: es.bsc.compss.scheduler.lookahead.locality.LocalityTS
--scheduler_config_file=<path> Path to the file which contains the scheduler configuration.
Default: Empty
--checkpoint=<className> Class that implements the Checkpoint Management policy
Supported checkpoint policies:
├── es.bsc.compss.checkpoint.policies.CheckpointPolicyInstantiatedGroup
├── es.bsc.compss.checkpoint.policies.CheckpointPolicyPeriodicTime
├── es.bsc.compss.checkpoint.policies.CheckpointPolicyFinishedTasks
└── es.bsc.compss.checkpoint.policies.NoCheckpoint
Default: es.bsc.compss.checkpoint.policies.NoCheckpoint
--checkpoint_params=<string> Checkpoint configuration parameter.
Default: Empty
--checkpoint_folder=<path> Checkpoint folder.
Default: Mandatory parameter
--library_path=<path> Non-standard directories to search for libraries (e.g. Java JVM library, Python library, C binding library)
Default: Working Directory
--classpath=<path> Path for the application classes / modules
Default: Working Directory
--appdir=<path> Path for the application class folder.
Default: /home/user
--pythonpath=<path> Additional folders or paths to add to the PYTHONPATH
Default: /home/user
--env_script=<path> Path to the script file where the application environment variables are defined.
COMPSs sources this script before running the application.
Default: Empty
--log_dir=<path> Directory to store COMPSs log files (a .COMPSs/ folder will be created inside this location)
Default: User home
--master_working_dir=<path> Use a specific directory to store COMPSs temporary files in master
Default: <log_dir>/.COMPSs/<app_name>/tmpFiles
--uuid=<int> Preset an application UUID
Default: Automatic random generation
--master_name=<string> Hostname of the node to run the COMPSs master
Default: Empty
--master_port=<int> Port to run the COMPSs master communications.
Only for NIO adaptor
Default: [43000,44000]
--jvm_master_opts="<string>" Extra options for the COMPSs Master JVM. Each option separed by "," and without blank spaces (Notice the quotes)
Default: Empty
--jvm_workers_opts="<string>" Extra options for the COMPSs Workers JVMs. Each option separed by "," and without blank spaces (Notice the quotes)
Default: -Xms256m,-Xmx1024m,-Xmn100m
--cpu_affinity="<string>" Sets the CPU affinity for the workers
Supported options: disabled, automatic, dlb or user defined map of the form "0-8/9,10,11/12-14,15,16"
Default: automatic
--gpu_affinity="<string>" Sets the GPU affinity for the workers
Supported options: disabled, automatic, user defined map of the form "0-8/9,10,11/12-14,15,16"
Default: automatic
--fpga_affinity="<string>" Sets the FPGA affinity for the workers
Supported options: disabled, automatic, user defined map of the form "0-8/9,10,11/12-14,15,16"
Default: automatic
--fpga_reprogram="<string>" Specify the full command that needs to be executed to reprogram the FPGA with the desired bitstream. The location must be an absolute path.
Default: Empty
--io_executors=<int> IO Executors per worker
Default: 0
--task_count=<int> Only for C/Python Bindings. Maximum number of different functions/methods, invoked from the application, that have been selected as tasks
Default: 50
--input_profile=<path> Path to the file which stores the input application profile
Default: Empty
--output_profile=<path> Path to the file to store the application profile at the end of the execution
Default: Empty
--PyObject_serialize=<bool> Only for Python Binding. Enable the object serialization to string when possible (true/false).
Default: false
--persistent_worker_c=<bool> Only for C Binding. Enable the persistent worker in c (true/false).
Default: false
--enable_external_adaptation=<bool> Enable external adaptation. This option will disable the Resource Optimizer.
Default: false
--gen_coredump Enable master coredump generation
Default: false
--keep_workingdir Do not remove the worker working directory after the execution
Default: false
--python_interpreter=<string> Python interpreter to use (python/python3).
Default: python3 Version:
--python_propagate_virtual_environment=<bool> Propagate the master virtual environment to the workers (true/false).
Default: true
--python_mpi_worker=<bool> Use MPI to run the python worker instead of multiprocessing. (true/false).
Default: false
--python_memory_profile Generate a memory profile of the master.
Default: false
--python_worker_cache=<string> Python worker cache (true/size/false).
Only for NIO without mpi worker and python >= 3.8.
Default: false
--python_cache_profiler=<bool> Python cache profiler (true/false).
Only for NIO without mpi worker and python >= 3.8.
Default: false
--wall_clock_limit=<int> Maximum duration of the application (in seconds).
Default: 0
--shutdown_in_node_failure=<bool> Stop the whole execution in case of Node Failure.
Default: false
--provenance, -p Generate COMPSs workflow provenance data in RO-Crate format from YAML file. Automatically activates -graph and -output_profile.
Default: false
* Application name:
For Java applications: Fully qualified name of the application
For C applications: Path to the master binary
For Python applications: Path to the .py file containing the main program
* Application arguments:
Command line arguments to pass to the application. Can be empty.
Warning
The cpu_affinity
feature is not available in macOS distributions. Then, for all macOS executions the flag
--cpu_affinity=disabled
must be specified, no matter if they are Java, Python or C/C++.
Running a COMPSs application
Before running COMPSs applications the application files must be in
the CLASSPATH. Thus, when launching a COMPSs application, users can
manually pre-set the CLASSPATH environment variable or can add the
--classpath
option to the runcompss
command.
The next three sections provide specific information for launching COMPSs applications developed in different code languages (Java, Python and C/C++). For clarity purposes, we will use the Simple application (developed in Java, Python and C++) available in the COMPSs Virtual Machine or at https://compss.bsc.es/projects/bar webpage. This application takes an integer as input parameter and increases it by one unit using a task. For further details about the codes please refer to Sample Applications.
Tip
For further information about applications scheduling refer to Schedulers.
Running Java applications
A Java COMPSs application can be launched through the following command:
compss@bsc:~$ cd tutorial_apps/java/simple/jar/
compss@bsc:~/tutorial_apps/java/simple/jar$ runcompss simple.Simple <initial_number>
compss@bsc:~/tutorial_apps/java/simple/jar$ runcompss simple.Simple 1
[ INFO] Using default execution type: compss
[ INFO] Using default location for project file: /opt/COMPSs/Runtime/configuration/xml/projects/default_project.xml
[ INFO] Using default location for resources file: /opt/COMPSs/Runtime/configuration/xml/resources/default_resources.xml
[ INFO] Using default language: java
----------------- Executing simple.Simple --------------------------
WARNING: COMPSs Properties file is null. Setting default values
[(1066) API] - Starting COMPSs Runtime v<version>
Initial counter value is 1
Final counter value is 2
[(4740) API] - Execution Finished
------------------------------------------------------------
In this first execution we use the default value of the --classpath
option to automatically add the jar file to the classpath (by executing
runcompss in the directory which contains the jar file). However, we can
explicitly do this by exporting the CLASSPATH variable or by
providing the --classpath
value. Next, we provide two more ways to
perform the same execution:
compss@bsc:~$ export CLASSPATH=$CLASSPATH:/home/compss/tutorial_apps/java/simple/jar/simple.jar
compss@bsc:~$ runcompss simple.Simple <initial_number>
compss@bsc:~$ runcompss --classpath=/home/compss/tutorial_apps/java/simple/jar/simple.jar \
simple.Simple <initial_number>
Running Python applications
To launch a COMPSs Python application users have to provide the
--lang=python
option to the runcompss command. If the extension of
the main file is a regular Python extension (.py
or .pyc
) the
runcompss command can also infer the application language without
specifying the lang flag.
compss@bsc:~$ cd tutorial_apps/python/simple/
compss@bsc:~/tutorial_apps/python/simple$ runcompss --lang=python ./simple.py <initial_number>
compss@bsc:~/tutorial_apps/python/simple$ runcompss simple.py 1
[ INFO] Using default execution type: compss
[ INFO] Using default location for project file: /opt/COMPSs/Runtime/configuration/xml/projects/default_project.xml
[ INFO] Using default location for resources file: /opt/COMPSs/Runtime/configuration/xml/resources/default_resources.xml
[ INFO] Inferred PYTHON language
----------------- Executing simple.py --------------------------
WARNING: COMPSs Properties file is null. Setting default values
[(616) API] - Starting COMPSs Runtime v<version>
Initial counter value is 1
Final counter value is 2
[(4297) API] - Execution Finished
------------------------------------------------------------
Attention
Executing without debug (e.g. default log level or --log_level=off
)
uses -O2 compiled sources, disabling asserts
and __debug__
.
Alternatively, it is possible to execute the a COMPSs Python application
using pycompss
as module:
compss@bsc:~$ python -m pycompss <runcompss_flags> <application> <application_parameters>
Consequently, the previous example could also be run as follows:
compss@bsc:~$ cd tutorial_apps/python/simple/
compss@bsc:~/tutorial_apps/python/simple$ python -m pycompss simple.py <initial_number>
If the -m pycompss
is not set, the application will be run ignoring
all PyCOMPSs imports, decorators and API calls, that is, sequentially.
In order to run a COMPSs Python application with a different interpreter, the runcompss command provides a specific flag:
compss@bsc:~$ cd tutorial_apps/python/simple/
compss@bsc:~/tutorial_apps/python/simple$ runcompss --python_interpreter=python3 ./simple.py <initial_number>
However, when using the pycompss module, it is inferred from the python used in the call:
compss@bsc:~$ cd tutorial_apps/python/simple/
compss@bsc:~/tutorial_apps/python/simple$ python3 -m pycompss simple.py <initial_number>
Finally, both runcompss and pycompss module provide a particular
flag for virtual environment propagation
(--python_propagate_virtual_environment=<bool>
). This, flag is
intended to activate the current virtual environment in the worker nodes
when set to true.
Specific flags
Some of the runcompss flags are only for PyCOMPSs application execution:
- --pythonpath=<path>
Additional folders or paths to add to the PYTHONPATH Default: /home/user
- --PyObject_serialize=<bool>
Only for Python Binding. Enable the object serialization to string when possible (true/false). Default: false
- --python_interpreter=<string>
Python interpreter to use (python/python2/python3). Default: “python” version
- --python_propagate_virtual_environment=<true>
Propagate the master virtual environment to the workers (true/false). Default: true
- --python_mpi_worker=<false>
Use MPI to run the python worker instead of multiprocessing. (true/false). Default: false
- --python_memory_profile
Generate a memory profile of the master. Default: false
See: Memory Profiling
- --python_worker_cache=<string>
Python worker cache (true/true:size/false). Only for NIO without mpi worker and python >= 3.8. Available for GPU if cupy installed. Default: false
See: Worker cache
- --python_cache_profiler=<bool>
Python cache profiler (true/false). Only for NIO without mpi worker and python >= 3.8. Default: false
Warning
For macOS systems, the flag --python_interpreter=/path_to/python
must be passed to ensure the
same Python version is used both in master and worker parts of the application (the application will crash
otherwise). We recommend to use pyenv to manage the macOS installed
Python versions. An example using pyenv would be: --python_interpreter=/Users/username/.pyenv/shims/python3
In addition, be careful with Xcode
updates, since they can modify the Python system version.
Worker cache
The --python_worker_cache
is used to enable a cache between processes on
each worker node. More specifically, this flag enables a shared memory space
between the worker processes, so that they can share objects between processess
in order to leverage the deserialization overhead.
If CUPY
is installed the cache is enabled, the cupy.ndarrays
will also
be cacheables in each GPU memory.
The possible values are:
--python_worker_cache=false
Disable the cache (CPU/GPU). This is the default value.
--python_worker_cache=true
Enable the cache (CPU/GPU). The default cache size is 25% of the worker node memory. And the hard limited gpu cache size is 25% of the gpu memory.
--python_worker_cache=true:<SIZE>
Enable the cache with specific cache size (in bytes and only for CPU). Setting the gpu cache size is not yet supported.
During execution, each worker will try to store automatically the parameters and return objects, so that next tasks can make use of them without needing to deserialize from file.
Important
The supported objects to be stored in the cache is limited to: python primitives (int, float, bool, str (less than 10 Mb), bytes (less than 10 Mb) and None), lists (composed by python primitives), tuples (composed by python primitives), Numpy ndarrays and Cupy ndarrays.
It is important to take into account that storing the objects in cache has some non negligible overhead that can be representative, while getting objects from cache shows to be more efficient than deserialization. Consequently, the applications that most benefit from the cache are the ones that reuse many times the same objects.
Avoiding to store an object into the cache is possible by setting Cache
to
False
into the @task
decorator for the parameter. For example,
Code 140 shows how to avoid caching the value
parameter.
from pycompss.api.task import task
from pycompss.api.parameter import *
@task(value={Cache: False})
def mytask(value):
....
Task return objects are also automatically stored into cache. To avoid caching
return objects it is necessary to set cache_returns=False
into the
@task
decorator, as Code 141 shows.
from pycompss.api.task import task
@task(returns=1, cache_returns=False)
def mytask():
return list(range(10))
Worker cache profiling
In order to use the cache profiler, you need to add the following flag:
--python_cache_profiler=true
Additionally, you also need to activate the cache with
--python_worker_cache=true
.
When using the cache profiler, the cache parameter in @task
decorator
is going to be ignored and all elements that can be stored in the cache
will be stored.
The cache profiling file will be located in the workers’ folder within the log folder. In this file, you will find a summary showing for each function and parameter (including the return of the function), how many times has been the parameter been added to the cache (PUT), and how many times has been this parameter been deserialized from the cache (GET). Furthermore, there is also a list (USED IN), that shows in which parameter of which function the added parameter has been used.
Additional features
Concurrent serialization
It is possible to perform concurrent serialization of the objects in the master
when using Python 3.
To this end, just export the COMPSS_THREADED_SERIALIZATION
environment
variable with any value:
compss@bsc:~$ export COMPSS_THREADED_SERIALIZATION=1
Caution
Please, make sure that the COMPSS_THREADED_SERIALIZATION
environment
variable is not in the environment (env
) to avoid the concurrent
serialization of the objects in the master.
Tip
This feature can also be used within supercomputers in the same way.
Running C/C++ applications
To launch a COMPSs C/C++ application users have to compile the
C/C++ application by means of the compss_build_app
command. For
further information please refer to C/C++ Binding. Once
complied, the --lang=c
option must be provided to the runcompss
command. If the main file is a C/C++ binary the runcompss command
can also infer the application language without specifying the lang
flag.
compss@bsc:~$ cd tutorial_apps/c/simple/
compss@bsc:~/tutorial_apps/c/simple$ runcompss --lang=c simple <initial_number>
compss@bsc:~/tutorial_apps/c/simple$ runcompss ~/tutorial_apps/c/simple/master/simple 1
[ INFO] Using default execution type: compss
[ INFO] Using default location for project file: /opt/COMPSs/Runtime/configuration/xml/projects/default_project.xml
[ INFO] Using default location for resources file: /opt/COMPSs/Runtime/configuration/xml/resources/default_resources.xml
[ INFO] Inferred C/C++ language
----------------- Executing simple --------------------------
JVM_OPTIONS_FILE: /tmp/tmp.ItT1tQfKgP
COMPSS_HOME: /opt/COMPSs
Args: 1
WARNING: COMPSs Properties file is null. Setting default values
[(650) API] - Starting COMPSs Runtime v<version>
Initial counter value is 1
[ BINDING] - @compss_wait_on - Entry.filename: counter
[ BINDING] - @compss_wait_on - Runtime filename: d1v2_1497432831496.IT
Final counter value is 2
[(4222) API] - Execution Finished
------------------------------------------------------------
Walltime
The runcompss
command provides the --wall_clock_limit
for the users to
specify the maximum execution time for the application (in seconds).
If the time is reached, the execution is stopped.
Tip
This flag enables to stop the execution of an application in a contolled way if the execution is taking more than expected.
Additional configurations
The COMPSs runtime has two configuration files: resources.xml
and
project.xml
. These files contain information about the execution
environment and are completely independent from the application.
For each execution users can load the default configuration files or
specify their custom configurations by using, respectively, the
--resources=<absolute_path_to_resources.xml>
and the
--project=<absolute_path_to_project.xml>
in the runcompss
command. The default files are located in the
/opt/COMPSs/Runtime/configuration/xml/
path. Users can manually edit
these files or can use the Eclipse IDE tool developed for COMPSs.
For further details please check the Configuration Files.