Python Java Implementation Notes [Draft]

Chathura Widanage edited this page Sep 4, 2019 · 1 revision

TSet function (fn) implementations

Some function interfaces, such as edu.iu.dsc.tws.api.tset.fn.PartitionFunc, have predefined concrete implementations written in Java (edu.iu.dsc.tws.api.tset.fn.HashingPartitioner, edu.iu.dsc.tws.api.tset.fn.LoadBalancePartitioner, edu.iu.dsc.tws.api.tset.fn.OneToOnePartitioner). The Python API should let users use these predefined functions as well as their own implementations.

Steps to support predefined classes

Predefined classes will be exposed as follows.

env = Twister2Environment()
env.functions.partition.load_balance

load_balance is a Python object that keeps an internal pointer to an instance of edu.iu.dsc.tws.api.tset.fn.LoadBalancePartitioner. This pointer can therefore be passed directly to the partition method of a Java TSet via py4j.
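A minimal sketch of how such a handle could be structured on the Python side is shown below. It is an illustration only: JavaRefStub stands in for the object py4j would return, and the names PredefinedFunction, PartitionFunctions, java_ref and new_java_instance are hypothetical, not part of the Twister2 API.

```python
class JavaRefStub:
    """Stand-in for a py4j Java object reference (hypothetical, for illustration)."""

    def __init__(self, class_name):
        self.class_name = class_name


class PredefinedFunction:
    """Python-side handle that keeps a pointer to a Java function instance."""

    def __init__(self, java_ref):
        self._java_ref = java_ref

    @property
    def java_ref(self):
        # This is the reference that would be handed to the Java TSet.
        return self._java_ref


class PartitionFunctions:
    """Namespace playing the role of env.functions.partition."""

    def __init__(self, new_java_instance):
        # new_java_instance: callable that creates a Java object from a class
        # name. A real implementation would go through the py4j gateway; here
        # it is injected so the sketch runs without a JVM.
        self._new_java_instance = new_java_instance

    @property
    def load_balance(self):
        return PredefinedFunction(
            self._new_java_instance(
                "edu.iu.dsc.tws.api.tset.fn.LoadBalancePartitioner"
            )
        )


partition = PartitionFunctions(JavaRefStub)
fn = partition.load_balance
print(fn.java_ref.class_name)
```

Injecting the factory keeps the handle independent of how the Java instance is created, so the same wrapper shape would work for the other predefined partitioners.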

Steps to support user defined functions

When a user defines a function, it will be serialized and wrapped in a Java implementation as follows.

// pyBinary: the serialized Python function described above.
PythonClassProcessor partitionPython = new PythonClassProcessor(pyBinary);

// Each PartitionFunc callback is delegated to the Python method of the same name.
return new PartitionFunc() {
  @Override
  public void prepare(Set sources, Set destinations) {
    partitionPython.invoke("prepare", sources, destinations);
  }

  @Override
  public int partition(int sourceIndex, Object val) {
    return (Integer) partitionPython.invoke("partition", sourceIndex, val);
  }

  @Override
  public void commit(int source, int partition) {
    partitionPython.invoke("commit", source, partition);
  }
};

This reference will then be passed to the Python side, where it can be treated like a predefined implementation.
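The round trip for a user-defined function can be sketched in pure Python as follows. This is an assumption-laden illustration, not the actual implementation: ModuloPartitioner is a hypothetical user class, ClassProcessor only mimics the invoke-by-name behaviour of the Java PythonClassProcessor, and the standard pickle module is used as a placeholder for whatever serializer the real API adopts.

```python
import pickle


class ModuloPartitioner:
    """Hypothetical user-defined function with the three PartitionFunc methods."""

    def prepare(self, sources, destinations):
        # Sort so that the mapping from hash to destination is deterministic.
        self.destinations = sorted(destinations)

    def partition(self, source_index, val):
        return self.destinations[hash(val) % len(self.destinations)]

    def commit(self, source, partition):
        # Nothing to clean up in this simple example.
        pass


class ClassProcessor:
    """Mimics invoke(name, args...): restores the object, dispatches by name."""

    def __init__(self, py_binary):
        self._target = pickle.loads(py_binary)

    def invoke(self, method_name, *args):
        return getattr(self._target, method_name)(*args)


# Serialize on the "user" side, then restore and call on the "worker" side.
py_binary = pickle.dumps(ModuloPartitioner())
processor = ClassProcessor(py_binary)
processor.invoke("prepare", {0, 1}, {10, 11, 12})
target = processor.invoke("partition", 0, 7)
processor.invoke("commit", 0, target)
print(target)
```

Dispatching by method name keeps the Java wrapper generic: the same processor could back other function interfaces, as long as the Python object provides methods with matching names.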