
DOCK on AWS

First-time setup

  • 1. create an AWS account and define the IAM roles that Spot Fleet needs: AmazonEC2SpotFleetRole, AWSServiceRoleForEC2Spot, and AWSServiceRoleForEC2SpotFleet (see the sketch below).
Instructions: https://docs.aws.amazon.com/batch/latest/userguide/spot_fleet_IAM_role.html#spot-fleet-roles-cli
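
A minimal sketch of creating these roles with the AWS CLI, following the guide above; the trust-policy file name is our own choice.

# The two service-linked roles; AWS fixes their names automatically.
aws iam create-service-linked-role --aws-service-name spot.amazonaws.com
aws iam create-service-linked-role --aws-service-name spotfleet.amazonaws.com

# The Spot Fleet role, with the standard spotfleet.amazonaws.com trust policy.
cat > spotfleet-trust-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {"Service": "spotfleet.amazonaws.com"},
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF
aws iam create-role --role-name AmazonEC2SpotFleetRole \
  --assume-role-policy-document file://spotfleet-trust-policy.json
aws iam attach-role-policy --role-name AmazonEC2SpotFleetRole \
  --policy-arn arn:aws:iam::aws:policy/service-role/AmazonEC2SpotFleetTaggingRole
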
  • 2. create an IAM user awsuser (optional!)
  • 3. create an S3 bucket "results2021"
Within it, create dockfiles, database.txt, and output1 (see the sketch below).
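
A sketch of creating the bucket and its layout with the AWS CLI (the region matches the job definition below; local paths are examples):

# Create the bucket and the expected layout inside it.
aws s3 mb s3://results2021 --region us-east-2
aws s3 cp ./dockfiles s3://results2021/dockfiles --recursive
aws s3 cp ./database.txt s3://results2021/database.txt
# output1/ appears automatically the first time results are written under
# s3://results2021/output1/ (S3 prefixes are not real directories).
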
  • 4. set up AWS CLI access
On your client computer, install awscli and configure your credentials so that you can upload inputs to the bucket and download results from it (see the sketch below).
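
A sketch of the client-side setup; the exact install method varies by platform.

# Install the AWS CLI and store the credentials of the user created above.
python3 -m pip install --user awscli
aws configure   # prompts for access key, secret key, default region, output format

# Test uploading an input and downloading a result.
aws s3 cp someproject.dockfiles.tgz s3://results2021/dockfiles/
aws s3 sync s3://results2021/output1 ./output1
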
  • 5. AWS Batch (choose the new Batch experience; see the sketch below).
Set up a compute environment: env1. Managed, enabled, service role AWSBatchServiceRole, provisioning model Spot, maximum 100% of the On-Demand price, minimum vCPUs 0, maximum vCPUs 256, desired vCPUs 0, allocation strategy BEST_FIT_PROGRESSIVE.
Set up a job queue: queue1, mapped to compute environment env1.
Set up a job definition: jobdef4. Platform EC2, retry attempts 1, execution timeout 14400 seconds, image btingle/dockaws:latest, command bash, vCPUs 1, memory 2048 MiB. Enable privileged mode (run as root) and set the log driver to awslogs.
Environment variables for the job definition:
S3_DOCKFILES_LOCATION   s3://results2021/dockfiles
S3_INPUT_LOCATION       s3://btingletestbucket/input
S3_OUTPUT_LOCATION      s3://btingletestbucket/output1
SHRTCACHE               /tmp
AWS_ACCESS_KEY_ID       xxxxx
AWS_SECRET_ACCESS_KEY   xxxxx
AWS_DEFAULT_REGION      us-east-2
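
The same setup can be scripted with the AWS CLI. This is a hedged sketch only: the account ID, subnet, security group, and instance role are placeholders you must replace with values from your own account.

# Compute environment: managed, Spot, BEST_FIT_PROGRESSIVE, 0-256 vCPUs,
# bid up to 100% of the On-Demand price.
aws batch create-compute-environment \
  --compute-environment-name env1 --type MANAGED --state ENABLED \
  --service-role arn:aws:iam::<account-id>:role/AWSBatchServiceRole \
  --compute-resources 'type=SPOT,allocationStrategy=BEST_FIT_PROGRESSIVE,bidPercentage=100,minvCpus=0,maxvCpus=256,desiredvCpus=0,instanceTypes=optimal,subnets=<subnet-id>,securityGroupIds=<sg-id>,instanceRole=ecsInstanceRole,spotIamFleetRole=arn:aws:iam::<account-id>:role/AmazonEC2SpotFleetRole'

# Job queue fed by that compute environment.
aws batch create-job-queue \
  --job-queue-name queue1 --state ENABLED --priority 1 \
  --compute-environment-order order=1,computeEnvironment=env1

# Job definition matching the settings listed above.
aws batch register-job-definition \
  --job-definition-name jobdef4 --type container \
  --retry-strategy attempts=1 \
  --timeout attemptDurationSeconds=14400 \
  --container-properties '{
    "image": "btingle/dockaws:latest",
    "command": ["bash"],
    "vcpus": 1,
    "memory": 2048,
    "privileged": true,
    "logConfiguration": {"logDriver": "awslogs"},
    "environment": [
      {"name": "S3_DOCKFILES_LOCATION", "value": "s3://results2021/dockfiles"},
      {"name": "S3_INPUT_LOCATION", "value": "s3://btingletestbucket/input"},
      {"name": "S3_OUTPUT_LOCATION", "value": "s3://btingletestbucket/output1"},
      {"name": "SHRTCACHE", "value": "/tmp"},
      {"name": "AWS_ACCESS_KEY_ID", "value": "xxxxx"},
      {"name": "AWS_SECRET_ACCESS_KEY", "value": "xxxxx"},
      {"name": "AWS_DEFAULT_REGION", "value": "us-east-2"}
    ]
  }'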


Set up database list to run

  • 1. Create the database list (the database.txt mentioned above) naming the database files each job should dock, and upload it to the bucket (see the sketch below).
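
How the list is generated depends on where your database archives live; this is a hypothetical sketch that assumes they sit under a db2/ prefix in the same bucket and that one S3 path per line is the expected format.

# Write one S3 path per line to database.txt and upload it next to the dockfiles.
aws s3 ls s3://results2021/db2/ | awk '{print "s3://results2021/db2/" $4}' > database.txt
aws s3 cp database.txt s3://results2021/database.txt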


Each job

  • 1. Upload someproject.dockfiles.tgz into dockfiles.
  • 2. Reference the database list set up above.
  • 3. Set up an output directory in S3.
  • 4. Set up the job.
  • 5. Run the job (see the sketch below).
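
A sketch of scripting a job submission with the AWS CLI; the job name, array size, and output prefix are examples, and the exact contract with the container depends on the dockaws image.

# Upload the dockfiles for this project.
aws s3 cp someproject.dockfiles.tgz s3://results2021/dockfiles/

# Submit an array job against the queue and job definition created above,
# overriding the output location for this project.
aws batch submit-job \
  --job-name someproject-run1 \
  --job-queue queue1 \
  --job-definition jobdef4 \
  --array-properties size=100 \
  --container-overrides 'environment=[{name=S3_OUTPUT_LOCATION,value=s3://results2021/output1/someproject}]'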


after job completes

  • 1. Check that the jobs completed and re-run any that did not.
  • 2. Run the "blazing fast" combine step.
  • 3. Extract the mol2 files.
  • 4. Download the data for processing and review (see the sketch below).
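
A sketch of checking job status and pulling results down for the combine/extract steps (names follow the examples above):

# See which jobs in the queue finished and which failed.
aws batch list-jobs --job-queue queue1 --job-status SUCCEEDED
aws batch list-jobs --job-queue queue1 --job-status FAILED

# Download the output for local combining, mol2 extraction, and review.
aws s3 sync s3://results2021/output1/someproject ./someproject-output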


job maintenance

  • 1. Move finished results to Glacier storage (see the sketch below).
  • 2. Run a variation of the job.
  • 3. Harvest the results of that variation.
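
Archiving can be done per prefix or with a bucket lifecycle rule; a sketch of both (the output prefix and the 30-day delay are examples):

# Option 1: re-copy finished results in place with the GLACIER storage class.
aws s3 cp s3://results2021/output1/someproject s3://results2021/output1/someproject \
  --recursive --storage-class GLACIER

# Option 2: lifecycle rule that transitions everything under output1/ after 30 days.
cat > lifecycle.json <<'EOF'
{
  "Rules": [
    {
      "ID": "archive-output1",
      "Filter": {"Prefix": "output1/"},
      "Status": "Enabled",
      "Transitions": [{"Days": 30, "StorageClass": "GLACIER"}]
    }
  ]
}
EOF
aws s3api put-bucket-lifecycle-configuration \
  --bucket results2021 --lifecycle-configuration file://lifecycle.json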


background reading

  • https://aws.amazon.com/getting-started/hands-on/run-batch-jobs-at-scale-with-ec2-spot/
  • troubleshooting: https://aws.amazon.com/premiumsupport/knowledge-center/batch-invalid-compute-environment
  • watch out for spending too much money
  • more debugging: https://aws.amazon.com/premiumsupport/knowledge-center/batch-job-stuck-runnable-status/


[[Category:DOCK3.8]]
[[Category:ZINC22]]
[[Category:AWS]]