-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PUBLIC-2591: App code extraction, test improvement #16
Conversation
… tests less awful.
if hasattr(cls, "ws_name"): | ||
try: | ||
cls.ws_client.delete_workspace({"workspace": cls.ws_name}) | ||
except: |
Check notice
Code scanning / CodeQL
Except block handles 'BaseException' Note test
# for field in msa_in.keys(): | ||
# log(console, "MSA key: '"+field+"'") |
Check notice
Code scanning / CodeQL
Commented-out code Note
# for row_id in row_order: | ||
# log(console, "row_id: '"+row_id+"' default_row_label: '"+default_row_labels[row_id]+"'") |
Check notice
Code scanning / CodeQL
Commented-out code Note
# with open(input_msa_file_path, 'r') as input_msa_file_handle: | ||
# for line in input_msa_file_handle: | ||
# #log(console,"MSA_LINE: '"+line+"'") # too big for console | ||
# log(invalid_msgs,"MSA_LINE: '"+line+"'") |
Check notice
Code scanning / CodeQL
Commented-out code Note
# tree_attributes will always be None. | ||
# goodness knows what this was supposed to check for | ||
if tree_attributes is not None: | ||
output_tree["tree_attributes"] = tree_attributes |
Check warning
Code scanning / CodeQL
Unreachable code Warning
@@ -2,9 +2,6 @@ name: KBase SDK Tests | |||
|
|||
on: | |||
pull_request: | |||
branches: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
run the tests on every PR
pip install --upgrade pip && \ | ||
# install the python requirements | ||
# Note: You must use PyQt5==5.11.3 on debian | ||
pip install ete3==3.1.2 PyQt5==5.11.3 numpy==1.23.1 pytest coverage pytest-cov vcrpy |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
moved up the python installation to prior to copying over the repo
@@ -1,23 +1,4 @@ | |||
[kb_fasttree] | |||
kbase-endpoint = {{ kbase_endpoint }} | |||
job-service-url = {{ job_service_url }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
none of the rest of these are used - just the workspace URL and the scratch dir
@@ -0,0 +1,411 @@ | |||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
standard kbase app module - adding it here as I plan to update the base image in the next PR
@@ -0,0 +1,738 @@ | |||
"""Implementation of the fasttree app.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The majority of this is a big ol' cut and paste from the impl file. I separated out a couple of functions to make it easier to mock certain things for testing. There's a whole load of cruft still in the code that I will deal with in an upcoming PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll comment anything that I did change
def generate_run_id() -> str: | ||
"""Generate an ID for this fasttree run, based on the current date and time. | ||
|
||
:return: temporally unique run ID | ||
:rtype: str | ||
""" | ||
return datetime.now(tz=timezone.utc).strftime("%Y%m%d_%H%M%S") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
new: previously this app produced files with either a UUID or a timestamp as part of the filename. Instead of doing that, the app now creates a directory (in the scratch
dir) with the timestamp as its name, and adds the files sans UUID/timestamp to that directory. It means that app generates a predictable set of files, which makes testing easier, and if for some reason you run the app numerous times in a row, each run will have its output in a separate directory.
# create a directory for all files related to this run to live in | ||
run_id = generate_run_id() | ||
run_dir = Path(config["scratch"]) / run_id | ||
run_dir.mkdir(parents=True, exist_ok=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
all new files go into this directory
I switched over to using Path
from pathlib
for os-related stuff whilst I was updating file names and adding the new run directory
if intree_type_name != "Tree": | ||
raise ValueError("Cannot yet handle intree type of: " + intree_type_name) | ||
|
||
intree_newick_file_path = run_dir / f"{intree_name}.newick" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
simplified this file name
# timestamp = int((datetime.utcnow() - datetime.utcfromtimestamp(0)).total_seconds() * 1000) | ||
# html_output_dir = run_dir / f"output_html.{timestamp}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed in upcoming PR
def now_ISO(self): | ||
now_timestamp = datetime.now() | ||
now_secs_from_epoch = (now_timestamp - datetime(1970,1,1)).total_seconds() | ||
now_timestamp_in_iso = datetime.fromtimestamp(int(now_secs_from_epoch)).strftime('%Y-%m-%d_%T') | ||
return now_timestamp_in_iso | ||
|
||
def log(self, target, message): | ||
message = '['+self.now_ISO()+'] '+message | ||
if target is not None: | ||
target.append(message) | ||
print(message) | ||
sys.stdout.flush() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
both these functions have been transferred over to fasttree.py
I simplified now_ISO
to just record a timestamp for now()
print(message) | ||
sys.stdout.flush() | ||
|
||
def get_single_end_read_library(self, ws_data, ws_info, forward): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
surprisingly none of the following four functions are used
# fasttree_cmd.append('-out') | ||
# fasttree_cmd.append(output_newick_file_path) | ||
# BEGIN run_FastTree | ||
config = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
simplified how the args are passed to the fasttree.py
file
|
||
self.log(console,"run_FastTree DONE") | ||
#END run_FastTree | ||
returnVal = fasttree.run_fasttree(config, ctx, params) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice diff
work, GitHub!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -0,0 +1 @@ | |||
(Desulfotomaculum gibsoniae DSM 7213 - DESGI RS22175:0.12153,Candidatus Desulforudis audaxviator MP104C - DAUD RS11160:0.15587,(((Desulfovibrio vulgaris DP4 - DVUL RS12960:0.0,Desulfovibrio vulgaris str. Hildenborough - DVU0402:0.0,Desulfovibrio vulgaris RCH1 - DEVAL RS01880:0.0):0.04805,Desulfovibrio vulgaris str. 'Miyazaki F' - DVMF RS07895:0.05673)1.000:0.14274,((Candidatus Desulforudis audaxviator MP104C - DAUD RS11155:0.30486,(Desulfotomaculum nigrificans DSM 574 - DESNIDRAFT RS0211670:0.11266,Desulfotomaculum alkaliphilum DSM 12257 - BR02 RS0103515:0.08482)0.953:0.08154)1.000:1.05608,(Desulfotomaculum nigrificans DSM 574 - DESNIDRAFT RS0211675:0.13918,Desulfotomaculum alkaliphilum DSM 12257 - BR02 RS0103520:0.12923)0.971:0.07526)0.995:0.11388)1.000:0.08045); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
expected output from the sample file in test/data
@@ -0,0 +1 @@ | |||
(21855%2f4%2f1.f%3aDESGI_RS22175:0.12153,21855%2f2%2f1.f%3aDAUD_RS11160:0.15587,(((21855%2f8%2f1.f%3aDVUL_RS12960:0.0,21855%2f6%2f1.f%3aDVU0402:0.0,21855%2f7%2f1.f%3aDEVAL_RS01880:0.0):0.04805,21855%2f9%2f1.f%3aDVMF_RS07895:0.05673)1.000:0.14274,((21855%2f2%2f1.f%3aDAUD_RS11155:0.30486,(21855%2f3%2f1.f%3aDESNIDRAFT_RS0211670:0.11266,21855%2f5%2f1.f%3aBR02_RS0103515:0.08482)0.953:0.08154)1.000:1.05608,(21855%2f3%2f1.f%3aDESNIDRAFT_RS0211675:0.13918,21855%2f5%2f1.f%3aBR02_RS0103520:0.12923)0.971:0.07526)0.995:0.11388)1.000:0.08045); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these should be UPAs -- this will be fixed in an upcoming PR
@@ -0,0 +1,4 @@ | |||
[kb_fasttree] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
config for running tests
else: | ||
print("Test workspace was deleted") | ||
|
||
def get_ws_client(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
turned all these function names and variables into the proper pythonic versions
def get_context(self): | ||
return self.ctx | ||
|
||
def get_msa_ref(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was part of the test itself but I moved it to be a separate function to make the test a bit clearer and to allow more tests to be run using the same msa_ref
self.assertEqual(created_obj_0_info[NAME_I], obj_out_name) | ||
self.assertEqual(created_obj_0_info[TYPE_I].split("-")[0], obj_out_type) | ||
# patch generate_run_id to return self.run_id | ||
with patch("kb_fasttree.fasttree.generate_run_id", return_value=self.run_id): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ensures that the test output is in a predictable location
assert created_obj_0_info[1] == obj_out_name | ||
assert created_obj_0_info[2].split("-")[0] == obj_out_type | ||
|
||
# check that the expected files are in the expected places |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
New test assertions below -- these check that the expected files are produced with the expected content. This allows the fasttree app code to be edited with at least a little confidence that any changes in output are deliberate.
try: | ||
cls.ws_client.delete_workspace({"workspace": cls.ws_name}) | ||
except: | ||
print("Workspace already deleted") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added this in to handle the case where the server responses are being played back by vcrpy and thus the call to delete the workspace is not needed
for sub_dir_name in file_list: | ||
for f in file_list[sub_dir_name]: | ||
check_non_zero_file_size(run_dir / sub_dir_name / f) | ||
if not f.endswith("pdf") and not f.endswith("png"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't check binary files as the compare_files
function only looks at text files
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks fine. Didn't really review the actual fasttree code, I'm assuming 95+% was copy-pasted. The other changes seem good. 👍
pip install --upgrade pip && \ | ||
# install the python requirements | ||
# Note: You must use PyQt5==5.11.3 on debian | ||
pip install ete3==3.1.2 PyQt5==5.11.3 numpy==1.23.1 pytest coverage pytest-cov vcrpy |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No requirements file?
And do the dev requirements need versioning? Probably not, realistically, but I figured I should mention.
|
||
self.log(console,"run_FastTree DONE") | ||
#END run_FastTree | ||
returnVal = fasttree.run_fasttree(config, ctx, params) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
raise ValueError('Method run_FastTree return value ' + | ||
'returnVal is not type dict as required.') | ||
raise ValueError( | ||
"Method run_FastTree " + "return value returnVal " + "is not type dict as required." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That could maybe get squashed together. Not sure if that's generated code at this point, though, just post-auto-formatted.
I have the following PRs planned: