Hello,
I am new to posting on the forum, but I follow it frequently. Thank you to the various contributors; it is very helpful.
I have a regression that I need to run 1000 times, saving the output each time with regsave; the trial run uses 10 iterations. In each iteration, one variable (id_random) needs to be randomly generated. It is one of the independent variables in the regression, which has a large number of fixed effects, so I am trying to use batch mode. My dataset is an unbalanced panel of firms across years.
I am having two issues:
1) All 10 iterations seem to start from the same seed. I could set the seed in each iteration with set seed `=123456+`SLURM_ARRAY_TASK_ID'', but I am having difficulty capturing SLURM_ARRAY_TASK_ID in my do file.
2) The output is not being saved with the task id suffix. I want each iteration to write its own Stata output file (_FEresults_1, _FEresults_2, and so on up to _FEresults_10) from the regsave command in my do file:
regsave using ./_FEresults_SLURM_ARRAY_TASK_ID,
Again, the issue seems to be that I am having difficulty capturing SLURM_ARRAY_TASK_ID in my do file.
I submit the following script for batch mode.
-------------------------------------------------------------------
#!/bin/bash
#SBATCH -J state # Job name
#SBATCH -o stata_%A-%a.out # Job output file name
#SBATCH --array=1-10 # Replace with your range
#SBATCH -p standard-mem-s # Job queue
#SBATCH -c 6 # Cores
#SBATCH --mem-per-cpu=6G # Memory
module purge
module load stata/mp-15
stata-mp do randomization.do ${SLURM_CPUS_PER_TASK} ${SLURM_ARRAY_TASK_ID}
The randomization.do file contains the following:
-----------------------------------------------------
ssc install regsave
***my input file
use ./_Main, clear
***drop the variable from prior iteration and generate a new random number
drop id_random
gen id_random = runiform(0 , 1200)
***regression
sort firm year
tsset firm year, annual
reg y size i.firm i.year id_random, vce(cluster firm)
regsave using ./_FEresults_SLURM_ARRAY_TASK_ID, replace ci level(95) detail(scalars)
My output is as follows:
-------------------------------
10 log files named stata_SLURM_CPUS_PER_TASK_SLURM_ARRAY_TASK_ID
A single Stata output file named _FEresults_SLURM_ARRAY_TASK_ID, instead of _FEresults_1, _FEresults_2, and so on up to _FEresults_10
How do I get the do file to recognize and use the array task id?
Thank you!
Gauri
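
One possible direction, offered only as a sketch: the do file above writes SLURM_ARRAY_TASK_ID as literal text, so Stata never sees the value the shell holds. The submission script could instead compute a per-task seed and output name and pass them as positional arguments after the do-file name. The helper names seed_for_task and results_stem are hypothetical, and the fallback to task 1 is an assumption added so the script can be tried outside a SLURM array job.

```shell
#!/bin/bash
# Sketch: derive a per-task seed and output name in the shell, then pass
# them to the do-file as positional arguments.

# Hypothetical helper: distinct seed per array task (123456 is the base
# seed mentioned in the post).
seed_for_task() {
    echo $((123456 + $1))
}

# Hypothetical helper: output file stem for a given task id.
results_stem() {
    echo "_FEresults_$1"
}

# Assumption: fall back to task 1 when not running inside a SLURM array job.
task_id="${SLURM_ARRAY_TASK_ID:-1}"
seed="$(seed_for_task "$task_id")"
stem="$(results_stem "$task_id")"

# Command line that would launch this task; printed here rather than run.
echo "stata-mp do randomization.do ${seed} ${stem}"
```

Inside randomization.do, Stata's args command (args seed stem) would then name the two arguments as local macros, after which set seed `seed' and regsave using ./`stem' can refer to them. Stata can also read an environment variable directly with the extended macro function local taskid : environment SLURM_ARRAY_TASK_ID, which may be simpler if nothing else needs to be passed in.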