COMPSs Context

class ddf_library.context.COMPSsContext

Bases: object

Controls the execution of DDF tasks.

static context_status()

Generates a DAG (as a dot file) and prints information about the current execution status to the screen.
static import_compss_data(df_list, schema=None, parquet=False)

Imports a previously created list of Pandas DataFrames into the DDF abstraction.

Parameters:
  • df_list – the list of DataFrames to import
  • parquet – True if the data is saved as a list of parquet files
  • schema – (optional) a list with the column names, data types, and size of each partition
Returns:

DDF

Example:
>>> cc = COMPSsContext()
>>> ddf1 = cc.import_compss_data(df_list)
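As an illustration of the kind of input this method consumes, the sketch below builds a df_list by splitting one Pandas DataFrame into partitions, plus a schema-like list of (column names, data types, partition size) tuples. The helper name make_partitions and the exact tuple layout of the schema are assumptions for illustration, not part of the DDF API:

```python
import pandas as pd

# Hypothetical helper (not part of ddf_library): split a DataFrame into
# a list of contiguous partitions, the shape of input that
# import_compss_data expects in df_list.
def make_partitions(df, num_of_parts):
    size = -(-len(df) // num_of_parts)  # ceiling division
    return [df.iloc[i:i + size] for i in range(0, len(df), size)]

df = pd.DataFrame({"x": range(10), "y": list("abcdefghij")})
df_list = make_partitions(df, 4)

# Illustrative schema: column names, dtypes, and rows in each partition
# (the exact format DDF expects is an assumption here).
schema = [(list(p.columns), list(p.dtypes.astype(str)), len(p))
          for p in df_list]
```

With this df_list in hand, the call would look like the example above: cc.import_compss_data(df_list, schema=schema).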
static parallelize(df, num_of_parts='*')

Distributes a Pandas DataFrame into a DDF.

Parameters:
  • df – DataFrame input
  • num_of_parts – number of partitions (default '*', meaning all cores available on the master CPU)
Returns:

DDF

Example:
>>> cc = COMPSsContext()
>>> ddf1 = cc.parallelize(df)
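The default num_of_parts='*' means the number of partitions follows the core count of the master CPU. A minimal sketch of how that default might be resolved, assuming standard-library CPU detection (the helper name is hypothetical, not part of ddf_library):

```python
import os

# Hypothetical helper: turn num_of_parts into a concrete partition count.
# The docs state that '*' means all cores available on the master CPU.
def resolve_num_of_parts(num_of_parts):
    if num_of_parts == '*':
        # os.cpu_count() may return None on exotic platforms; fall back to 1.
        return os.cpu_count() or 1
    return int(num_of_parts)
```

So cc.parallelize(df) with the default would create one partition per available core, while cc.parallelize(df, 4) would create exactly four.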
set_log(enabled=True)

Sets the log level.

Parameters: enabled – True to enable debug logging, False to disable it.
show_tasks()

Shows all tasks in the current code (for debugging only).

start_monitor()

Starts a web service monitor that reports the current status of the environment, available at http://127.0.0.1:58227/.

static stop()

Stops the DDF environment. It is important to call stop at the end of an application to prevent COMPSs from sending back all partial results at the end.