-
I want to work through the HTCondor tutorial. My first attempt failed, and I'm sorry to say I did not note down the error. Now I want to make a second attempt. I removed everything I could find from my first attempt, but now I get several 409 errors saying that various features already exist. I can provide more information if that is helpful, but I wondered if there is a basic clean-up step that I am missing? Thanks, Carl Ross
-
Hi Carl! Thanks for trying the HTCondor tutorial. My guess, based on what you are telling me, is that you tried the HTCondor tutorial using Cloud Shell (the command-line tool in our web interface) and that you manually removed resources after provisioning a first go? Then you tried to provision a 2nd set?
You can do (3) by changing the deployment_name inside the blueprint YAML or by supplying an override at the command line. We'd also love to know more about what your overall objective is.
-
You might find "designing a blueprint" to be a useful link to look at. What I'm trying to show you above is that you can set the deployment_name in the YAML file or at the command line.
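To make the YAML route concrete, here is a minimal Python sketch that swaps the deployment_name value in a blueprint's text, so a second attempt uses a fresh name instead of reusing the one from the failed run. The sample blueprint contents, the names `htcondor-001` and `htcondor-002`, and the helper `set_deployment_name` are all hypothetical and only for illustration; editing the file by hand works just as well.

```python
import re

# Hypothetical blueprint fragment; the keys and values are assumptions for illustration.
SAMPLE_BLUEPRINT = """\
blueprint_name: htcondor-pool
vars:
  project_id: my-project
  deployment_name: htcondor-001
"""


def set_deployment_name(blueprint_text: str, new_name: str) -> str:
    """Return blueprint_text with the first deployment_name value replaced."""
    # Capture the indentation and key, then replace everything after it on that line.
    pattern = re.compile(r"^([ \t]*deployment_name:[ \t]*).*$", re.MULTILINE)
    updated, count = pattern.subn(
        lambda m: m.group(1) + new_name, blueprint_text, count=1
    )
    if count == 0:
        raise ValueError("no deployment_name key found in blueprint text")
    return updated


if __name__ == "__main__":
    print(set_deployment_name(SAMPLE_BLUEPRINT, "htcondor-002"))
```

The sketch is line-based rather than a full YAML parse, so it assumes the key sits on a line of its own.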
We have a couple paths forward:
(1) is probably the simplest. (2) might teach you a bit about how terraform works if you're eager to learn. There is also a 3rd path.
I do think that we should discuss whether Batch would be more appropriate for you. An advantage of Batch is that there may be no infrastructure for you to maintain. Are you already running your job containerized in HTCondor? Our Batch solution can create a "login node" on which you can test your jobs and then submit them at larger scale.
The magic of both Batch and our HTCondor solutions is that the compute nodes expand and contract based upon job submissions, so you don't need to say in advance how many nodes you want. They will be turned on to run a job and powered down when no jobs remain.
Let me know about (1) or (2); we should definitely get you through a clean
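The expand-and-contract behaviour described above can be pictured with a toy model. This is only a sketch of the idea, not the actual scaling policy of Batch or of the HTCondor solution; the function `target_nodes` and its parameters are hypothetical.

```python
import math


def target_nodes(pending_jobs: int, running_jobs: int,
                 jobs_per_node: int, max_nodes: int) -> int:
    """Toy model: how many nodes should be powered on for the current demand."""
    if jobs_per_node <= 0:
        raise ValueError("jobs_per_node must be positive")
    if max_nodes < 0:
        raise ValueError("max_nodes must not be negative")
    demand = pending_jobs + running_jobs
    # Enough nodes to hold every job, capped at the configured maximum.
    # With no jobs at all the target drops to zero nodes.
    return min(max_nodes, math.ceil(demand / jobs_per_node))


if __name__ == "__main__":
    print(target_nodes(0, 0, jobs_per_node=4, max_nodes=10))
    print(target_nodes(10, 0, jobs_per_node=4, max_nodes=10))
    print(target_nodes(100, 20, jobs_per_node=4, max_nodes=10))
```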
-
I think the following manual steps are the shortest path to cleaning up. Browse to each URL and take the action noted. You will probably need to make sure that your project is selected the first time, but then it should "stick".
Regarding Batch and VM type selection... Are the tasks fully independent? E.g. if you run on a 112-core machine, are you running 112/N jobs, where N is the number of threads per job? And are you running the containerized version under Docker?
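As a quick worked version of the 112/N arithmetic, here is a small Python sketch. It assumes one thread per core and rounds down when N does not divide the core count evenly; the function name `concurrent_jobs` is hypothetical.

```python
def concurrent_jobs(machine_cores: int, threads_per_job: int) -> int:
    """Number of jobs that fit on one machine at one thread per core."""
    if machine_cores <= 0 or threads_per_job <= 0:
        raise ValueError("cores and threads per job must be positive")
    # Integer division: leftover cores that cannot hold a whole job stay idle.
    return machine_cores // threads_per_job


if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        print(f"N={n}: {concurrent_jobs(112, n)} jobs on a 112-core machine")
```

For example, with N=4 a 112-core machine runs 28 jobs at a time, which matters when picking a VM type for fully independent tasks.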
-
I'd like to begin by suggesting you run a non-containerized job using Batch:
The blueprint above provisions:
You could remove the shared filesystem if it's not useful for you, but it might be a useful place to install non-containerized code from the VM environment. Also a useful place for output data. We can then discuss a containerized approach if you'd like.
-
@carlkross we are going to close this discussion, but please do not hesitate to respond if you would like further assistance, especially with the Batch solution. My recommendation would be to try the example without (for now) considering your application and submit a few "hello, world!" jobs.
Hi Carl!
Thanks for trying the HTCondor tutorial. My guess, based on what you are telling me, is that you tried the HTCondor tutorial using Cloud Shell (the command-line tool in our web interface) and that you manually removed resources after provisioning a first go? Then you tried to provision a 2nd set?
terraform destroy
when you are done with your work. Did you do that? You can do (3) by changing the d…