BJ Data Tech Solution

Specializing in data processing, data management implementation plans, data collection tools (electronic and paper-based), data cleaning specifications, data extraction, transformation and loading, analytical datasets, and data analysis. BJ Data Tech Solution teaches the design and development of electronic data collection tools using CSPro, and Stata commands for data manipulation. It also sets up data management systems using modern data technologies such as relational databases, C#, PHP, and Android.

How to create a weighted average for multiple observations for a variable by borrower_id and year

Tuesday, January 31, 2023 Data Cleaning Data management Data Processing
Borrower_id  Year  Loan amount  Loan maturity  Loan interest
1            2011  101          60             8.5
1            2011  95           55             5.7
2            2011  85           55             8.6
3            2011  90           44             6.5
3            2012  82           ...
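The excerpt is cut off before it says which variable is averaged or which one supplies the weights. As a minimal sketch, written in Python rather than Stata, the example below assumes the goal is the mean loan interest weighted by loan amount within each borrower_id and year. It uses only the four complete rows shown above; the function name and the choice of weight are assumptions for illustration.

```python
from collections import defaultdict

# The four complete rows from the excerpt:
# (borrower_id, year, loan_amount, loan_maturity, loan_interest)
loans = [
    (1, 2011, 101, 60, 8.5),
    (1, 2011, 95, 55, 5.7),
    (2, 2011, 85, 55, 8.6),
    (3, 2011, 90, 44, 6.5),
]


def weighted_average_by_group(rows):
    """Return {(borrower_id, year): interest averaged with loan amount as weight}.

    Assumption: loan amount is the weight and loan interest is the value.
    """
    weighted_sum = defaultdict(float)
    total_weight = defaultdict(float)
    for borrower_id, year, amount, _maturity, interest in rows:
        key = (borrower_id, year)
        weighted_sum[key] += amount * interest
        total_weight[key] += amount
    return {key: weighted_sum[key] / total_weight[key] for key in weighted_sum}


averages = weighted_average_by_group(loans)
for key in sorted(averages):
    print(key, round(averages[key], 3))
```

Groups with a single observation simply return that observation's interest rate, because the weight cancels out.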
Calculating differences between adjacent elements

8:24 PM Data Cleaning Data management Data Processing
Dear all, I wonder if there is any easy solution available in Mata for calculating the difference between adjacent elements of a (column) v...
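The question asks about Mata, and the excerpt is truncated before any solution appears. The underlying operation can still be illustrated in a language-neutral way. The sketch below is Python, not Mata, and the function name is hypothetical: it pairs each element with its successor and subtracts.

```python
def adjacent_differences(values):
    """Return the differences between each element and the one before it.

    The result has one element fewer than the input; inputs with fewer
    than two elements give an empty list.
    """
    return [later - earlier for earlier, later in zip(values, values[1:])]


column = [3, 7, 12, 20]
print(adjacent_differences(column))
```

The same idea carries over to any vector language: subtract the vector without its last element from the vector without its first element.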
save regression results for margins plotting

6:23 PM Data Cleaning Data management Data Processing
My question is straightforward. Could Stata save regression results so that I do not have to re-run the model when trying to use margins an...
Estimate the difference-in-differences estimator

3:24 PM Data Cleaning Data management Data Processing
Hi, I restricted my wave sample from 1-10 to 9-10 and I was wondering how I would use the command bysort to create a covid variable. To sh...
Combining variables in the variable list by country

2:23 PM Data Cleaning Data management Data Processing
I am looking at a dataset where I have three variable lists: life expectancy by country for specific age groups. The age interval for n...
Interpretation of Competing Risk with stcrreg

1:23 PM Data Cleaning Data management Data Processing
I'm new to using competing risk analysis and want to make sure I'm interpreting and using it correctly (given that I'm getting g...
Multiplying observations in a variable list

Monday, January 30, 2023 Data Cleaning Data management Data Processing
In a dataset of multiple variable lists, I would like to multiply the value of only selected variables in one variable list by 4. Namely, I ...
Zero event meta analysis with ipdmetan

8:24 PM Data Cleaning Data management Data Processing
Hey everyone, I'm trying to run a two-stage meta-analysis to generate a forest plot with individual participant data using ipdmetan to p...
Traj command (dropout model)

5:24 PM Data Cleaning Data management Data Processing
Hello everyone, I am using the traj command to examine gait speed trajectories and I have a question about the dropout model option. When...
Making a dummy for threshold

3:29 PM Data Cleaning Data management Data Processing
Hi, I am going to make a dummy Ii ( Em (i)<= Em) that selects firms whose employment (Em(i)) is below a certain threshold, the median ...
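The excerpt breaks off after naming the median as the threshold, so the exact definition is not recoverable. A minimal sketch in Python, assuming the threshold is the sample median of employment and that a firm is flagged when its employment is at or below it (the `<=` follows the excerpt's notation); the function name and sample values are hypothetical.

```python
from statistics import median


def below_median_dummy(employment):
    """Return 1 for firms with employment <= the sample median, else 0.

    Assumption: the threshold is the median of the values passed in.
    """
    threshold = median(employment)
    return [1 if value <= threshold else 0 for value in employment]


# Hypothetical employment figures for five firms.
firm_employment = [10, 50, 30, 20, 40]
print(below_median_dummy(firm_employment))
```

If the threshold should instead be computed within groups (for example by industry or year), the same comparison would be applied to each group's own median.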
Selecting Date Range

1:40 PM Data Cleaning Data management Data Processing
I have data in which one of the variables is CloseDate, whose Variable Type is int and Format is %tdnn/dd/CCYY. I would simply like to select ...
How are p-values calculated in an Oaxaca-Blinder Decomposition

1:40 PM Data Cleaning Data management Data Processing
Hi all, I am trying to interpret the results of an Oaxaca-Blinder Decomposition. I am using the popular Oaxaca command. For my project I a...
Solve dynamic non linear deterministic system

12:24 AM Data Cleaning Data management Data Processing
Dear forum, After several hours of work, I am asking the forum to help me find a solution to my problem. I am used to working with GA...
Calculating summary statistics for correlation coefficients

Sunday, January 29, 2023 Data Cleaning Data management Data Processing
Hello, I have calculated some pairwise correlation coefficients between observations in a panel data set within a grouping variable, and I ...
Country-pair specific id for gravity model

8:23 PM Data Cleaning Data management Data Processing
Hello all, I'm trying to estimate a gravity model with trade data that is disaggregated at industry level and trying to include countr...
How to merge datasets using joinby?

8:23 PM Data Cleaning Data management Data Processing
I have the following data. In Data set A, I have rows of children with "h13hhidc" indicating their household ID including the rele...
Describe variables

5:24 PM Data Cleaning Data management Data Processing
I have a dataset (example below) and I would like to know the frequency of males (sex==1) and females (sex==0) that mentioned (resinsulted3_...
How to perform relative mortality in STATA?

3:25 PM Data Cleaning Data management Data Processing
Hi Everyone. I have a dataset of cancer patients from 2004-2014 with survival status until 2015. Now I've been asked to do relative mort...
Scatter plot with different groups while adding label for one particular observation (sepscatter command)

4:24 AM Data Cleaning Data management Data Processing
Hi everyone, I have cross-sectional data for a group of countries and three variables: government spending, test scores and regions (pleas...
Creating a combined average using a panel dataset

2:24 AM Data Cleaning Data management Data Processing
I have a panel dataset in vertical form with 6 countries; each shows values for 14 indicators between 2000 and 2020. There is a column that ...
How to deal with multicollinearity when adding fixed effects dummies in regression with cross sectional data?

1:23 AM Data Cleaning Data management Data Processing
Hello, I have cross sectional data with 26 groups. I estimated a probit and fracreg regression for my two research questions. Since my key...
scatter Y X || lfit Y X ||, by(variable)

Saturday, January 28, 2023 Data Cleaning Data management Data Processing
Good evening, I'd like to run scatter Y X || lfit Y X ||, by(variable) but have everything on one graph instead of multiple graphs. I...
Brant Test Significance Question

7:23 PM Data Cleaning Data management Data Processing
Hello everyone. Thank you all for taking the time to answer other questions on this forum; it has been very helpful. This is my first time p...
Best way to improve processing speed for large data sets (~3gb)

4:24 PM Data Cleaning Data management Data Processing
How much will more RAM help me with processing speed? I am working with a dataset of 87 million records that is about 3 GB in size. It's a 20...
Hausman Test Failed

Friday, January 27, 2023 Data Cleaning Data management Data Processing
Hi! My name is Karina. I ran a Hausman test in Stata using the command: quietly xtreg ecgrowth jubgrowth sukubunga inflation, fe estimates...
Running ANOVA in loops

7:24 PM Data Cleaning Data management Data Processing
Hi, I want to run repeated measures ANOVA on math scores and semester, but the analysis has to be done for all levels of stress (high/mid/lo...
Time invariant dummies.

5:25 PM Data Cleaning Data management Data Processing
Hi, I am estimating an FE model and I am adding a series of country dummies to control for country FE. Unfortunately, all the country dumm...
How can I check if a string has repeated words?

4:23 PM Data Cleaning Data management Data Processing
I know the moss package is related to this, but I cannot make it work yet (one similar post can be found here). For example for the string ...
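The example string in the excerpt is elided, so the sketch below uses a made-up one. It is Python rather than Stata and does not use the moss package; it only illustrates the general check, assuming that words are separated by whitespace and that the comparison should ignore case. The function name is hypothetical.

```python
from collections import Counter


def repeated_words(text):
    """Return the sorted list of words that occur more than once in text.

    Assumptions: words are whitespace-separated and matching is
    case-insensitive; punctuation is not stripped.
    """
    counts = Counter(text.lower().split())
    return sorted(word for word, n in counts.items() if n > 1)


sample = "the cat saw the dog"  # hypothetical input, not from the post
print(repeated_words(sample))
print(bool(repeated_words("no repeats here")))
```

An empty result means the string has no repeated words, so the list can be used directly as a true/false flag.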
forvalues loop for discontinuous variable

4:23 PM Data Cleaning Data management Data Processing
How can I use a forvalues loop over a variable whose values are not sequentially continuous? I want to use a forvalues loop for the industry c...
Panel Data

2:28 PM Data Cleaning Data management Data Processing
Hi, I am using panel data with the wave variable ranging from 1 to 10; however, I would like to focus on waves 1 to 8 instead when testing fo...
Testing stability of regression discontinuity model (TED)

1:24 AM Data Cleaning Data management Data Processing
Hello Statalisters, I need your valuable advice. Having applied a regression discontinuity (RD) design in my analysis, I want to check the ...
matching single county with unique congressional district

Thursday, January 26, 2023 Data Cleaning Data management Data Processing
According to US House election data, a single county in a given state can belong to multiple congressional districts. However, based on th...
Jackknife xtqreg

7:24 PM Data Cleaning Data management Data Processing
Using the Grunfeld data, this works: Code: bootstrap, reps(50) cluster(company) idcluster(comp): xtqreg invest mvalue kstock i.comp#c.tim...
Combine multiple rows into one when end_date = start_date

7:23 PM Data Cleaning Data management Data Processing
Hi there, I am looking for code that combines the rows when end_date = start_date for a certain ID when all the variables have the same value...
Wild bootstrap with ML

7:23 PM Data Cleaning Data management Data Processing
Dear users, I need to estimate a maximum likelihood model with wild bootstrap because I have a few-clusters issue. I wonder if I can get any advice ...
Bug in the dir extended macro

4:23 PM Data Cleaning Data management Data Processing
Dear All, I wonder if anyone can interpret the below syntax line for me. Thank you, Sergiy
New Stata package: ddml for Double/debiased Machine Learning

2:25 PM Data Cleaning Data management Data Processing
I am happy to announce a new package that I have written together with Christian Hansen, Mark Schaffer and Thomas Wiemann. We introduce t...
Predictive Margins and Marginsplots for continuous variable

4:25 AM Data Cleaning Data management Data Processing
Hello everyone, I am completely new to margins and marginsplots and have the following question: I have this regression: Code: reghdf...
Merging problem of congressional district data with counties

Wednesday, January 25, 2023 Data Cleaning Data management Data Processing
This is one of my datasets, in which I have statefip, district (which stands for congressional district) and county. I want to merge this data...
Is there a way to make Python stop on error when running PyStata ?

7:24 PM Data Cleaning Data management Data Processing
Suppose I have the following (toy) code Code: import stata_setup stata_setup.config("C:/Program Files/Stata17", "mp")...
Create empty graph in a loop

4:24 PM Data Cleaning Data management Data Processing
Dear Statalisters, I have a series of graphs made in a loop which are then amalgamated using grc1leg (a wrapper for graph combine). There ar...
Changing an estimation stored macro with a previously stored macro

2:25 PM Data Cleaning Data management Data Processing
I am trying to change an estimation stored macro with the stored result from the previous estimation. I manage to do so with `e(cmd)' bu...
Discrete Choice Experiment - Fractional factorial design

2:25 PM Data Cleaning Data management Data Processing
Dear colleagues, As a PhD student, I want to set up a Discrete Choice Experiment in a consumer survey. The main objective is to measure cons...
10000 posts by William Lisowski

Tuesday, January 24, 2023 Data Cleaning Data management Data Processing
Congratulations William Lisowski on reaching the milestone of 10,000 posts on Statalist! Your contributions have greatly enriched the commu...
Need some help dealing with duplicates

7:24 PM Data Cleaning Data management Data Processing
Hi all, I need some help with dealing with duplicates in my data. I have something like this:
var1  var2  var3  var4
a     x     Red   1
...
Mediation effects with MLM?

6:23 PM Data Cleaning Data management Data Processing
I'm trying to do a somewhat tricky mediation test. Basically, I'm looking at the relationship between parental education (4 levels) ...
Very large t-statistics

4:24 PM Data Cleaning Data management Data Processing
Hi Stata Community, I'm running some regressions using Fixed Effect Methodology. The issue is that the outcome of my regression shows ve...
Calculate economic significance for censored (Tobit) regression with multiple imputation

4:24 PM Data Cleaning Data management Data Processing
Hi, I'm trying to calculate economic significance from a censored (Tobit) regression with multiple imputation according to this defini...
creating table following multiple imputation with svy suite

2:25 PM Data Cleaning Data management Data Processing
Hello. Is it possible to create tables using "collect" series of commands using mi estimate:svy, subpop (subpopulation): logit x...
Computing a cumulative score

Monday, January 23, 2023 Data Cleaning Data management Data Processing
Hi, I have the following sample of data. I have an indicator of 1 and 0 in one column and I want to compute the sum of consecutive 1s (i....
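The sample data and the rest of the definition are cut off in the excerpt. One plausible reading, assumed here, is a running count of consecutive 1s that resets to zero whenever a 0 appears. The Python sketch below illustrates that reading only; the function name and input values are hypothetical.

```python
def consecutive_ones(indicator):
    """Return a running count of consecutive 1s, resetting to 0 at each 0.

    Assumption: the score at each position is the length of the current
    unbroken run of 1s ending at that position.
    """
    scores = []
    run = 0
    for flag in indicator:
        run = run + 1 if flag == 1 else 0
        scores.append(run)
    return scores


# Hypothetical indicator column, not taken from the post.
print(consecutive_ones([1, 1, 0, 1, 1, 1, 0]))
```

In panel data the counter would also need to restart at the first observation of each unit, which amounts to calling the function once per unit.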
Adding Number of Observations Label to Stacked Bar Chart

7:23 PM Data Cleaning Data management Data Processing
Hi all, I'm trying to make a stacked bar chart that shows the number of observations within each group. My current code looks somethin...
Dates in Stata

5:24 PM Data Cleaning Data management Data Processing
I am trying to convert some dates to quarters. I am doing this by going into the variables manager and changing the format to quarters, as seen...
Dynamic Panel Equation

3:25 PM Data Cleaning Data management Data Processing
Hello I am trying to build a Dynamic Panel Equation with the following information - Variables: Y, X, and C (controlling variable) - Time...
Sorting data

1:23 PM Data Cleaning Data management Data Processing
Hello, I have the following data below: Code: * Example generated by -dataex-. To install: ssc install dataex clear input double(ID st...
Graphing Functions

Sunday, January 22, 2023 Data Cleaning Data management Data Processing
Hello, any idea on how to graph the following function: y=35+2x^2-ln(x*3) I tried this: twoway function y=35+2x^2-ln(x^3), range (0 1) ...
stpm2cr - Maximum Number of Iterations Reached

10:23 PM Data Cleaning Data management Data Processing
Hi everyone, I'm using -stpm2cr- to model the effect of a treatment on a set of cardiovascular outcomes over a three year period. I ca...
Line graph of percent of frequencies within categories instead of bar graph

10:23 PM Data Cleaning Data management Data Processing
Dear statalist, I have a dataset of skin cancer over ten years. I want to plot a line graph (instead of a bar graph) of the percent of fre...
Interpreting Log transformed ITSA model

6:23 PM Data Cleaning Data management Data Processing
This question might be less a programming question than an interpretation question. I have a sample dataset as follows: Code: *...
Why line, bar, and pie plot dominate publications

4:23 PM Data Cleaning Data management Data Processing
Dear Stata users, There are so many plot types in the statistical world: line, bar, pie, box, histogram, violin, radar, mosaic, and diagnose s...
Regression output featuring a period for one variable

3:23 PM Data Cleaning Data management Data Processing
I'm using logit regression to link state-level policies with an individual's probability of reemployment. As controls, I feature a h...
Filling in missing data from previous values?

1:23 PM Data Cleaning Data management Data Processing
Below I have included an example of a wide dataset in which children have ages reported at each wave (ex: wave 5 age is k5agebg). In some in...
Table1 command issue

11:23 AM Data Cleaning Data management Data Processing
Hello, when I write this command: table1, by (pes) vars(edad conts \ par_cat cat \ hta cat \ preprev cat \ diabm cat \ diabg cat \ imc conts...
Regression analysis with panel data

8:23 AM Data Cleaning Data management Data Processing
Hello Everyone, I have a panel dataset with approximately 5300 observations and I am analysing if a firm's ESG score has a significant...
cannot run GMM

12:24 AM Data Cleaning Data management Data Processing
Dear Statalist, I am using Stata v.14... I have unbalanced panel data with T = 17 and N = 18. I mostly have reverse causality from 2 of ...
What does "star" option in pwcorr mean?

12:24 AM Data Cleaning Data management Data Processing
Normally I use pwcorr to get a correlation matrix, and that command has a star option. From the description, they stated: ...
Why does the same code run on one machine but fail on another with r(123)?

Saturday, January 21, 2023 Data Cleaning Data management Data Processing
Hi all, I suspected the problem previously, but I can confirm today that some of my code works on my school's machine but my person...
About Me

Mtenga Baltazar