Here is the link to the Rocoto GitHub repository, along with a README and documentation that provide some instructions on configuring and using it:
https://
Molly and I have always found the instructions to be a bit limiting, but they have improved over the past few years. Still, there are four rocoto commands that will make your life much easier. Note that the very first thing you’ll need to do in your terminal is load the rocoto module. Simply type module load rocoto, and the four commands described below will be available.
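If you want to double-check that the module loaded, something like the sketch below can help. The function name check_rocoto_cmds is made up for this example (it is not part of rocoto), and it assumes the module puts the four commands on your PATH.

```shell
# Hypothetical helper (not part of rocoto): report any command that the
# current shell cannot find. With no arguments it checks the four rocoto
# commands covered in this note.
check_rocoto_cmds() {
  [ "$#" -gt 0 ] || set -- rocotorun rocotostat rocotocheck rocotoboot
  rocoto_missing=0
  for rocoto_cmd in "$@"; do
    if ! command -v "$rocoto_cmd" >/dev/null 2>&1; then
      echo "missing: $rocoto_cmd"
      rocoto_missing=1
    fi
  done
  return "$rocoto_missing"
}
```

After module load rocoto, running check_rocoto_cmds should print nothing if all four commands were found.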
rocotorun
Each time it is executed, this command updates the state of the workflow’s jobs/tasks and submits any that are ready to run. It does not keep running on its own, so it has to be called repeatedly. We typically put it in a wrapper script that is called every 5 minutes from the role.amb-verif crontab. For example, the workflow we were looking at earlier is in the crontab with this entry:
*/5 * * * * $HOME/VERIF/xml/submit_precip_1hr_rt
Inside the submit_precip_1hr_rt file, you’ll find the rocotorun command, with the XML file that configures the workflow and the .db file that rocoto uses as a type of database system to track progress:
rocotorun -w $HOME/VERIF/xml/precip_1hr_rt.xml -d $HOME/VERIF/xml/precip_1hr_rt.db
rocotostat
This command shows the status of the jobs/tasks that rocoto is running. The syntax is very similar to rocotorun; you can simply use the following:
rocotostat -w $HOME/VERIF/xml/precip_1hr_rt.xml -d $HOME/VERIF/xml/precip_1hr_rt.db
However, this lists every cycle/task/job that the workflow has run or ever will run, so it can be overwhelming. A shortcut is to add the -s flag, which prints a per-cycle summary, and pipe the output through grep to keep only the “Active” cycles:
$ rocotostat -w $HOME/VERIF/xml/precip_1hr_rt.xml -d $HOME/VERIF/xml/precip_1hr_rt.db -s | grep Active
202310191200 Active Oct 19 2023 12:00:00
Then you can include the -c flag to look at a specific cycle (I’ve only included the operational HRRR tasks here, but there are more):
$ rocotostat -w $HOME/VERIF/xml/precip_1hr_rt.xml -d $HOME/VERIF/xml/precip_1hr_rt.db -c 202310191200
CYCLE         TASK                   JOBID     STATE      EXIT STATUS  TRIES  DURATION
======================================================================================
202310191200  StageIV_interp         59546758  SUCCEEDED  0            1      28.0
202310191200  HRRR_OPER_precip_init  59546762  SUCCEEDED  0            1      27.0
202310191200  HRRR_OPER_precip_03km  59547493  SUCCEEDED  0            1      178.0
202310191200  HRRR_OPER_precip_13km  59547494  SUCCEEDED  0            1      60.0
202310191200  HRRR_OPER_precip_20km  59547495  SUCCEEDED  0            1      53.0
202310191200  HRRR_OPER_precip_40km  59547496  SUCCEEDED  0            1      48.0
202310191200  HRRR_OPER_precip_80km  59547497  SUCCEEDED  0            1      48.0
For each task, this output shows the job ID, state, exit status, number of tries, and duration. To look at a particular task in more detail, use rocotocheck.
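Before moving on, the two rocotostat steps above (find the active cycles with -s, then inspect each one with -c) can be combined into a small helper. This is only a sketch: the names WORKFLOW_XML, WORKFLOW_DB, active_cycles, and active_status are made up for illustration, and it assumes the -s summary prints the cycle in the first column and its state in the second, as in the sample output above.

```shell
# Paths taken from the examples above; adjust for other workflows.
WORKFLOW_XML="$HOME/VERIF/xml/precip_1hr_rt.xml"
WORKFLOW_DB="$HOME/VERIF/xml/precip_1hr_rt.db"

# Hypothetical helper: print the IDs of cycles whose summary state is
# "Active" (assumes column 1 = cycle, column 2 = state).
active_cycles() {
  rocotostat -w "$WORKFLOW_XML" -d "$WORKFLOW_DB" -s |
    awk '$2 == "Active" { print $1 }'
}

# Hypothetical helper: show the task table for every active cycle.
active_status() {
  for cycle in $(active_cycles); do
    rocotostat -w "$WORKFLOW_XML" -d "$WORKFLOW_DB" -c "$cycle"
  done
}
```

With these defined in your shell, running active_status prints the same kind of table as the -c example, once per active cycle.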
rocotocheck
This command gives you the complete picture of a task, including the exact configuration that was set up in our workflow XML file. I’ve found this particularly helpful when I’m trying to figure out exactly what went wrong with a failed task, from grabbing the exact log file to seeing whether there is a problem with the environment. The syntax is very similar to rocotostat; just add the -t flag to specify which task to look at:
$ rocotocheck -w $HOME/VERIF/xml/precip_1hr_rt.xml -d $HOME/VERIF/xml/precip_1hr_rt.db -c 202310191200 -t StageIV_interp
Task: StageIV_interp
  account: amb-verif
  command: /home/role.amb-verif/VERIF/bin/wrapper.ksh
  cores: 1
  cycledefs: 1hr
  deadline: 202310191600
  final: false
  jobname: StageIV_interp
  join: /home/role.amb-verif/VERIF/verif/precip_1hr/StageIV/log/StageIV_interp_2023101812.log
  maxtries: 2
  memory: 2G
  name: StageIV_interp
  partition: tjet:ujet:sjet:xjet
  throttle: 9999999
  walltime: 00:40:00
  environment
    DAY ==> 18
    EXECDIR ==> /home/role.amb-verif/VERIF/exec
    GRIB2VAR ==> APCP
    HOUR ==> 12
    IPOPTS ==> 2 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1
    MONTH ==> 10
    NCLDIR ==> /home/role.amb-verif/VERIF/bin
    OBSDIR ==> /public/data/precip/stage4/grib2
    PCPMASK ==> /home/role.amb-verif/VERIF/static/stageIV_pcp_mask.nc
    REALTIMEDIR ==> /home/role.amb-verif/VERIF/verif/precip_1hr/StageIV/realtime
    SCRIPT ==> /home/role.amb-verif/VERIF/bin/StageIV_interp_1hr_grib2.py
    SCRIPTDIR ==> /home/role.amb-verif/VERIF/bin
    YEAR ==> 2023
Cycle: 202310191200
  Valid for this task: YES
  State: done
  Activated: 2023-10-19 12:00:00 UTC
  Completed: 2023-10-19 12:10:12 UTC
  Expired: -
Job: 59546758
  State: SUCCEEDED (COMPLETED)
  Exit Status: 0
  Tries: 1
  Unknown count: 0
  Duration: 28.0
rocotoboot
Finally, should you ever need to manually resubmit a task, this is the command that will help you. The syntax is exactly the same as rocotocheck. It may prompt you for confirmation if the task is in an expired state (that is, it has exceeded the cycle lifespan):
$ rocotoboot -w $HOME/VERIF/xml/precip_1hr_rt.xml -d $HOME/VERIF/xml/precip_1hr_rt.db -c 202310191200 -t StageIV_interp
task 'StageIV_interp' for cycle '202310191200' has been booted
I hope this helps, and feel free to make changes wherever you’d like!