Dear Statalisters,

I want to create a set of observations that will be used later to test different models. Let's say I have three variables, y, x1 and x2, with 1000 observations each. The observations should be matched in a quasi-random process that follows a dependency structure between the variables x1/x2 and y. This matching should have a stochastic component.
For example, I divide the observations into quintiles based on their values and create a matching matrix with a negative dependency:
matrix match = (0.025, 0.05, 0.1,  0.2,  0.625 \ ///
                0.05,  0.1,  0.2,  0.45, 0.2   \ ///
                0.1,   0.2,  0.4,  0.2,  0.1   \ ///
                0.2,   0.45, 0.2,  0.1,  0.05  \ ///
                0.625, 0.2,  0.1,  0.05, 0.025)

As a result of this process I would have a data set where 62.5% of the values in quintile 1 of y are correctly matched with quintile 5 of x1, 20% are falsely matched with quintile 4, and so on. Here, for quintile 3 the correct matching is only 40%.
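
To make the idea concrete, here is a minimal do-file sketch of how I imagine this could work for y and x1. It is only a rough attempt under my own assumptions, not a tested solution: the data are placeholders for my real variables, the variable names (qy, target, slot, u, v, p) are made up for illustration, and I assume there are no ties so that every quintile holds exactly 200 observations. Within each quintile of y the observations are put in random order and split according to the matching row, then paired at random with x1 values from the assigned quintile.

```stata
clear
set seed 12345
set obs 1000

* Placeholder data only (assumption): my real y and x1 come from other generators
generate double y  = rnormal()
generate double x1 = exp(rnormal())

* Rows = quintiles of y, columns = quintiles of x1
matrix match = (0.025, 0.05, 0.1,  0.2,  0.625 \ ///
                0.05,  0.1,  0.2,  0.45, 0.2   \ ///
                0.1,   0.2,  0.4,  0.2,  0.1   \ ///
                0.2,   0.45, 0.2,  0.1,  0.05  \ ///
                0.625, 0.2,  0.1,  0.05, 0.025)

* x1 side: quintile of x1 plus a random slot number within each quintile
preserve
keep x1
xtile target = x1, nquantiles(5)
generate double v = runiform()
bysort target (v): generate long slot = _n
drop v
tempfile xside
save `xside'
restore

* y side: quintile of y and a random position p in (0,1] within each quintile
keep y
xtile qy = y, nquantiles(5)
generate double u = runiform()
bysort qy (u): generate double p = _n / _N

* Assign a target x1 quintile by cutting p at the cumulative row probabilities
generate byte target = .
forvalues i = 1/5 {
    local cum = 0
    forvalues j = 1/5 {
        local cum = `cum' + match[`i', `j']
        * small tolerance guards against rounding in the cumulative sum
        replace target = `j' if qy == `i' & missing(target) & p <= `cum' + 1e-8
    }
}

* Random slot within each target quintile, then pair with the x1 side
generate double v = runiform()
bysort target (v): generate long slot = _n
merge 1:1 target slot using `xside'

* Row percentages should reproduce the matching matrix
tabulate qy target, row
```

In this sketch the shares are hit exactly and only the choice of which observations are paired is random. If the shares themselves should vary from sample to sample, the target quintile would instead have to be drawn independently for each observation, and x1 values would then have to be sampled with replacement within the drawn quintile.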

Is such a matching process possible, or are there other ways to accomplish this matching?
Prior to this I have generated y, which follows a mixed distribution, and x1/x2, which are not normally distributed.


Kind regards
Steffen