Hello,
I am new to posting on the forum, but I follow it frequently. Thank you to the various contributors; it is very helpful.
I have a regression that I need to run 1000 times (the trial run is 10 times), saving the output of each run with regsave. In each iteration, one variable (id_random) needs to be randomly generated. It is one of the independent variables in the regression, which has a large number of fixed effects. Therefore, I am trying to use batch mode. My dataset is an unbalanced panel of firms across years.
I am having two issues:
1) All 10 iterations seem to start from the same seed. I could set the seed in each iteration with set seed `=123456+`SLURM_ARRAY_TASK_ID'', but I am having difficulty capturing SLURM_ARRAY_TASK_ID in my do file.
2) The output is not saved with the task id suffix. I want each iteration to produce its own Stata output file (_FEresults_1, _FEresults_2, and so on up to _FEresults_10) from the regsave command in my do file:
regsave using ./_FEresults_SLURM_ARRAY_TASK_ID,
Again, the issue seems to be that I am having difficulty capturing SLURM_ARRAY_TASK_ID in my do file.
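For reference, one possible way to pick up the task id inside a do file is Stata's `environment` macro function, which reads an operating-system environment variable into a local macro. The following is a minimal, untested sketch, assuming the job runs under SLURM so that SLURM_ARRAY_TASK_ID is set; the local names taskid, seed, and outfile, and the fallback value of 1, are illustrative assumptions.

```stata
* Read the array index from the environment.
* The macro is empty if the variable is not set (e.g. outside SLURM).
local taskid : environment SLURM_ARRAY_TASK_ID

* Assumed fallback so the do file can also be tested interactively
if "`taskid'" == "" {
    local taskid 1
}

* A distinct seed for each array task
local seed = 123456 + `taskid'
set seed `seed'

* Task-specific output name, e.g. ./_FEresults_3 for task 3
local outfile "./_FEresults_`taskid'"
display "task `taskid': seed `seed', output `outfile'"
```

Without the surrounding `taskid' quotes, Stata treats SLURM_ARRAY_TASK_ID as literal text, which would explain a single output file carrying that literal name.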
I submit the following script for batch mode.
-------------------------------------------------------------------
#!/bin/bash
#SBATCH -J state # Job name
#SBATCH -o stata_%A-%a.out # Job output file name
#SBATCH --array=1-10 # Replace with your range
#SBATCH -p standard-mem-s # Job queue
#SBATCH -c 6 # Cores
#SBATCH --mem-per-cpu=6G # Memory
module purge
module load stata/mp-15
stata-mp do randomization.do ${SLURM_CPUS_PER_TASK} ${SLURM_ARRAY_TASK_ID}
The randomization.do file contains the following:
-----------------------------------------------------
ssc install regsave
***my input file
use ./_Main, clear
***drop the variable from prior iteration and generate a new random number
drop id_random
gen id_random = runiform(0 , 1200)
***regression
sort firm year
tsset firm year, annual
reg y size i.firm i.year id_random, vce(cluster firm)
regsave using ./_FEresults_SLURM_ARRAY_TASK_ID, replace ci level(95) detail(scalars)
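Because the batch script already passes the core count and the task id as arguments after the do-file name, another option is to capture them with `args` at the top of the do file. The version below is a sketch under that assumption, not a tested solution; the local names ncores, taskid, and seed are hypothetical, and it assumes regsave is installed once beforehand rather than by every array task.

```stata
* randomization.do (sketch)
* First argument: number of cores; second argument: array task id
args ncores taskid

* Assumption: regsave is already installed, so tasks do not reinstall it concurrently
* ssc install regsave

* Distinct seed per task
local seed = 123456 + `taskid'
set seed `seed'

* Input file
use ./_Main, clear

* Drop the variable from a prior run, if present, and draw a new random number
capture drop id_random
gen id_random = runiform(0, 1200)

* Regression
sort firm year
tsset firm year, annual
reg y size i.firm i.year id_random, vce(cluster firm)

* Save results with the task id as suffix, e.g. _FEresults_1
regsave using "./_FEresults_`taskid'", replace ci level(95) detail(scalars)
```

Here `taskid' is expanded by Stata before the command runs, so each array task writes its own file and uses its own seed.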
My output is as follows:
-------------------------------
10 log files named stata_SLURM_CPUS_PER_TASK_SLURM_ARRAY_TASK_ID
A single Stata output file, _FEresults_SLURM_ARRAY_TASK_ID, instead of _FEresults_1, _FEresults_2, and so on up to _FEresults_10
How do I get the do file to recognize and use the array task id?
Thank you!
Gauri