Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added method to GoogleBigQuery to copy between projects #1111

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

Charlie-Kramer
Copy link
Contributor

No description provided.

Copy link
Collaborator

@shaunagm shaunagm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes requested - happy to chat more about any of my review if you'd like. Also, ideally there would be a new test for this new method.

parsons/google/google_bigquery.py Show resolved Hide resolved
except Exception as e:
print("Exception during copy between projects", e)
else:
print("BigQuery copy failed")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we hit this failure after creating the dataset in the first try/except, the attempt might fail but still have a side effect of creating the dataset, right? That feels a little unpredictable although it wouldn't necessarily cause problems because this code doesn't try to re-create an already existing table. Still, you might want to either leave a note about order of execution so that it's clear that a failure might still have created a dataset or restructure this so we're checking for both failure modes at once rather than sequentially.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I shouldn't do the try/except at all but just create the job and if there's a general exception just let the computer say so. (I think this would be an example of your guidance below re general exceptions?)

dataset = bigquery.Dataset(dataset_id)
dataset = self.client.create_dataset(dataset, timeout=30)
else: # fail
print("BigQuery copy failed")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it makes sense to use the logger here (and throughout the method) rather than print. Here's an example of the logger in this file - if you've not used it before and are unfamiliar with log levels happy to explain a bit more how they work.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks--this looks like an error level (from what I know)--does that sound right?

result = job.result()
print(result)
except Exception as e:
print("Exception during copy between projects", e)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While it makes sense to capture the NotFound exceptions and log the results without actually raising the exception, this exception here just captures any exception and saves the feedback. Generally speaking we want to raise any unexpected errors so we can notice and fix whatever the problem is sooner rather than later, so I would remove this specific try/except block and just call the code normally. If there's an exception, there's an exception - and we want to know about it!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally speaking, in software development, "don't catch general exceptions" is a good rule of thumb. This blog post has some good details of why.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants