
Piezo Web App Decision Record


Do not specify driver core limit

28 Feb 2019

A manifest of specifications is passed to the Spark Operator when submitting a job request. Key-value pairs in this manifest include the driver core and driver core limit. These specify:

  • core: the proportion of the node's CPU reserved for the driver pod
  • coreLimit: the maximum proportion of the node's CPU the driver pod may use

Despite describing the same underlying quantity (proportion of a CPU), the input format of the values is different; in the example manifests core is specified as a float (e.g. 0.1) and coreLimit as a string representation of milli-CPUs (e.g. "200m").
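
For illustration, here is a minimal sketch of how the driver section of the manifest could be assembled, using the value formats described above; the exact field names and values below are illustrative rather than a copy of the Piezo code.

```python
# Illustrative sketch only: assembles the "driver" section of a SparkApplication
# manifest using the value formats described above.
def build_driver_spec(include_core_limit: bool = False) -> dict:
    driver = {
        "cores": 0.1,  # float: proportion of a CPU reserved for the driver pod
    }
    if include_core_limit:
        # String in milli-CPUs; this is the key-value pair that was eventually
        # removed from the submitted manifest (see the decision below).
        driver["coreLimit"] = "200m"
    return driver

print(build_driver_spec(include_core_limit=True))
```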

Attempts to make this consistent for users of the Piezo web app frequently caused the Spark Operator to reject the manifest. No documentation on the required formats of the manifest values could be found.

The decision was made to remove this key-value pair from the submitted manifest, therefore implicitly accepting the default value. This did not cause any problems during informal testing, and it was even observed that the time taken to spin up a Spark cluster reduced from about a minute to a few seconds.

While accepting the default value for coreLimit performs well during development, it is quite possible that it will cause performance issues when more jobs are submitted during production use. This topic should therefore be revisited when assessing Piezo performance.


Maximum length of a job name

11 March 2019

The maximum length of a Spark Operator resource name is 63 characters. This is not documented, but is set as the constant maxNameLength in the Spark Operator source code.

The job UUID tag length (see below) must be subtracted from this to give the maximum job name submitted by users of the Piezo system.

18 March 2019

Exposing metrics from the Spark pods creates services that must be running for the Spark job to be submitted. These services append a postfix and a timestamp with a random component to the job name (this can be seen in the Spark Operator source where the driver service is created). This now becomes the limiting factor, setting a character limit of 35 on the job name.

Once again, the job UUID tag length (see below) must be subtracted from this to give the maximum job name submitted by users of the Piezo system.
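
As a rough illustration of the resulting budget (a sketch only: the 35-character limit and the 5-character UUID tag are recorded on this page, while the single-character separator is an assumption):

```python
# Sketch of the job-name budget using the limits recorded on this page.
SERVICE_NAME_LIMIT = 35  # limiting factor once the metrics services are created
UUID_TAG_LENGTH = 5      # see "Job UUID tag length" below
SEPARATOR_LENGTH = 1     # assumed single separator character between name and tag

max_user_job_name_length = SERVICE_NAME_LIMIT - UUID_TAG_LENGTH - SEPARATOR_LENGTH
print(max_user_job_name_length)  # 29 under these assumptions
```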

02 April 2019

Recent activity in the Spark Operator codebase suggests that future work might enable longer names, in which case the restriction on job name length may be relaxed somewhat. Keep track of this issue for updates: https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/issues/465 .


Job UUID tag length

8 March 2019

The standard UUID generated by the uuid package is a sequence of 32 hexadecimal digits, giving over 3.4E38 possible values. This is likely far larger than needed, and users may find such long tags inconvenient to work with.

If we define

  • the number of jobs with the same name already on the Kubernetes cluster as j
  • the number of characters in the UUID tag as c
  • the number of attempts at finding an unused UUID tag as a

then the probability of failure (i.e. randomly selecting a UUID that already exists a times in a row) is

P(failure) = (j × 16^(-c))^a

Both c and a can be set in piezo_web_app/PiezoWebApp/src/services/spark_job/spark_job_namer.py.

From experience, 5 characters is a comfortable string length for users to work with. Plugging this into the formula above with, say, j set to 100 and a set to 10 gives a probability of failure of ~ 6E-41. This is deemed acceptably low.
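
As a quick check of this figure, the formula can be evaluated directly (the variable names mirror the definitions above):

```python
# Probability of failing to find an unused UUID tag a times in a row,
# i.e. P(failure) = (j * 16^(-c))^a.
def failure_probability(j: int, c: int, a: int) -> float:
    return (j * 16.0 ** -c) ** a

print(failure_probability(j=100, c=5, a=10))  # ~6e-41
```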

The remaining character limit available to the user is then 57.


Updating validation rules

1 April 2019

The method chosen to make the web app use updated validation rules involves scaling the web app deployment to 0 replicas (essentially switching it off) and then back to 1, which restarts the web app. Downtime should be limited to a couple of seconds.

Applying an update without restarting the pod would require entirely rebuilding and restarting the web app. The simplest way to restart the pod by hand is to delete it, which causes a new copy to initiate; however, because Kubernetes appends a unique identifier to each pod name, automating this process would become more complicated. Scaling the replicas therefore appears to be the simplest option without many drawbacks.
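
A minimal sketch of how this could be automated, assuming kubectl access from wherever the automation runs; the deployment name and namespace below are placeholders rather than the real values:

```python
import subprocess

def restart_web_app(deployment: str = "piezo-web-app", namespace: str = "default") -> None:
    """Scale the web app deployment to 0 and back to 1 so that the updated
    validation rules are picked up on start-up."""
    for replicas in ("0", "1"):
        subprocess.run(
            ["kubectl", "scale", f"deployment/{deployment}",
             f"--replicas={replicas}", "--namespace", namespace],
            check=True,
        )

restart_web_app()
```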

Another possible route would be to save the configmap with a new name, reference the new name in the deployment, and apply the change with kubectl apply. This would cause a rolling update, removing any downtime. However, periodic clean-ups of old configmaps would be required, and careful attention would be needed to keep track of and update the correct manifests created during deployment. It would also not get around the issue of logs being lost when the pods are updated, as these are currently saved in the pod.


Exposing the Spark UI instead of Grafana

16 April 2019

The Spark UI provides users with an interactive dashboard for the duration of their Spark job. This shows the users the progress through their application and specific metrics for each stage of the job. Grafana provides visualisations of the metrics produced from the Kubernetes cluster and can be used to create custom dashboards for individual spark applications.

Research into Grafana shows that it can only be made accessible outside the cluster at the root path /. We tried to get around this both directly, by editing the Grafana configmap (in particular setting the root_url parameter), and indirectly, by using ingress rules on a sub-path and rewriting our targets to /. However, none of these attempts was successful.

Similarly, the Spark UI is served at / by default and contains automatic redirection that makes it impossible to move. We managed to provide access to a unique UI for each Spark application by using a proxy (see https://github.com/aseigneurin/spark-ui-proxy), but even with the proxy we are still constrained to serving everything from the root path /.

As a result we have had to choose whether to expose Grafana or the Spark UIs. The benefits of the Spark UI are that it comes ready-built with information specific to a user's Spark application, provides a better visualisation of the progress of a Spark job, and has a simpler, easier interface for the user to interact with. On the other hand, the Spark UI is only accessible while a job is running, whereas Grafana would remain available indefinitely. Grafana also provides additional metrics and is particularly rich for metrics related to resource usage. This information is still accessible via Prometheus, but Grafana gives a clearer visualisation.

Ideally we would use Grafana and the Spark UI together to help users optimise their Spark jobs, but given this restriction we have chosen the Spark UI: it involves less maintenance and set-up, and it provides a clearer overview of their application that will be useful to all users, not just those trying to optimise jobs.

It is also worth noting that in the future a user may wish to set up a Grafana server external to the Kubernetes cluster. This would be accessible directly without the need to go through ingress rules and would bypass the issues mentioned above.


System tests only in Python

23 April 2019

In task 111 an extra argument is prepended to the arguments of each job submission, specifying the directory for log files and output files. This is necessary to ensure that every job has a unique output directory (the job's UUID tag is used, and the user does not know this tag in advance).

Updating the Python test scripts with this change was straightforward, since the altered scripts can be uploaded directly onto the S3 storage. However, updating the Scala scripts also requires recompilation, and setting this up was not considered a high enough priority in the time remaining.

It is therefore worth noting that, although Scala scripts should run on Piezo (so long as they are written to accept the output directory as their first argument), this is not currently tested.
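
For reference, a minimal sketch of the pattern a submitted Python script is expected to follow, assuming only that the output directory is passed as the first argument; the workload and file name are placeholders:

```python
import sys

def main() -> None:
    # The output directory is prepended by the web app on submission;
    # the user does not know it in advance.
    output_dir = sys.argv[1]
    result = sum(range(100))  # placeholder workload
    # In practice results would be written to S3 under output_dir;
    # this just shows where the path is used.
    print(f"Would write results to {output_dir}/result.txt: {result}")

if __name__ == "__main__":
    main()
```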
