Hello,
I am new to posting on the forum, but I follow it frequently. Thank you to the various contributors; it is very helpful.
I have a regression that I need to run 1000 times (the trial run is only 10 times), saving the output each time with regsave. In each iteration, one variable (id_random) needs to be randomly generated; it is one of the independent variables. The regression has a large number of fixed effects, so I am trying to run it in batch mode. My dataset is an unbalanced panel of firms across years.
I am having two issues:
1) All 10 iterations seem to start from the same seed. I could set the seed in each iteration with set seed `=123456+`SLURM_ARRAY_TASK_ID'', but I am having difficulty capturing SLURM_ARRAY_TASK_ID in my do file.
2) The output is not saved with the task id suffix. I want each iteration to produce its own Stata output file, _FEresults_1, _FEresults_2, and so on up to _FEresults_10, based on the regsave command in my do file:
regsave using ./_FEresults_SLURM_ARRAY_TASK_ID,
Again, the issue seems to be that I cannot capture SLURM_ARRAY_TASK_ID in my do file.
I submit the following script for batch mode.
-------------------------------------------------------------------
#!/bin/bash
#SBATCH -J state # Job name
#SBATCH -o stata_%A-%a.out # Job output file name
#SBATCH --array=1-10 # Replace with your range
#SBATCH -p standard-mem-s # Job queue
#SBATCH -c 6 # Cores
#SBATCH --mem-per-cpu=6G # Memory
module purge
module load stata/mp-15
stata-mp do randomization.do ${SLURM_CPUS_PER_TASK} ${SLURM_ARRAY_TASK_ID}
The randomization.do file contains the following:
-----------------------------------------------------
ssc install regsave
***my input file
use ./_Main, clear
***drop the variable from prior iteration and generate a new random number
drop id_random
gen id_random = runiform(0 , 1200)
***regression
sort firm year
tsset firm year, annual
reg y size i.firm i.year id_random, vce(cluster firm)
regsave using ./_FEresults_SLURM_ARRAY_TASK_ID, replace ci level(95) detail(scalars)
My output is as follows:
-------------------------------
10 log files named stata_SLURM_CPUS_PER_TASK_SLURM_ARRAY_TASK_ID
A single Stata output file, _FEresults_SLURM_ARRAY_TASK_ID, instead of _FEresults_1, _FEresults_2, and so on up to _FEresults_10
How do I get the do file to recognize and use the array task id?
Thank you!
Gauri
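One possible approach, sketched here under assumptions rather than confirmed on this cluster: the submission script already passes ${SLURM_ARRAY_TASK_ID} as the second argument after the do file name, and Stata makes such arguments available inside the do file as positional macros. A line such as args ncpus taskid near the top of randomization.do would name them, after which `taskid' can be used in place of the literal text SLURM_ARRAY_TASK_ID, for example in the regsave file name and in set seed. The shell fragment below shows the submission side only. The helper names seed_for_task and results_name are hypothetical, and the third argument (a precomputed seed) is an addition that the do file would have to read with args ncpus taskid seed.

```shell
# Hypothetical helpers: derive a per-task seed and results file name.
seed_for_task() { echo $((123456 + $1)); }
results_name() { echo "./_FEresults_$1"; }

# Fall back to 1 so the fragment can be dry-run outside SLURM.
task_id="${SLURM_ARRAY_TASK_ID:-1}"
ncpus="${SLURM_CPUS_PER_TASK:-1}"
seed="$(seed_for_task "$task_id")"

echo "task ${task_id}: seed ${seed}, results in $(results_name "$task_id")"

# Same invocation form as the original script, plus the seed as a third argument.
if command -v stata-mp >/dev/null 2>&1; then
  stata-mp do randomization.do "$ncpus" "$task_id" "$seed"
else
  echo "dry run: stata-mp do randomization.do $ncpus $task_id $seed"
fi
```

With this in place, the do file would call set seed `seed' before runiform() and write to ./_FEresults_`taskid' in the regsave line. Computing the seed in the shell is a design choice; the same sum could instead be evaluated inside Stata from `taskid'.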