Correlation of one variable between players in panel data

Hi all,

I am working with a panel dataset regarding financial products. The dataset has about 200,000 observations and 50 variables. I struggle with finding correlations due to the form of my dataset.

The dataset (a simplification) looks as follows:

Date	Return (monthly)	ProductID	CompanyID	Currency	Characteristic 1	Characteristic 2	Charcteristic 3
31jan2005	1.3843	1	A	USD	1	0	0
28feb2005	-.47	1	A	USD	1	0	0
30mar2005	2.213	1	A	USD	1	0	0
30jun2001	0.0321	2	C	EUR	1	1	1
30jul2001	0.8911	2	C	EUR	1	1	1
31jan2005	1.383	3	A	EUR	1	0	0
28feb2005	-.047	3	A	EUR	1	0	0
30mar2005	2.213	3	A	EUR	1	0	0
31dec2013	.	4	C	CAD	1	1	1
30jan2014	3.42	4	C	CAD	1	1	1
28feb2016	5.0134	5	A	USD	0	1	0
30mar2016	0.13	5	A	USD	0	1	0
31apr2016	-30.1	5	A	USD	0	1	0

The goal of my research is to determine performance differences between certain characteristics, but this requires cleaning this data first. As you can see, the products linked to productID 1 and 3 have a correlation of almost 1. You can see that the only differences between these two products is the reported currency. This is caused by the company (ProductID 1 & 3 are from the same company) reporting the same product but for two currencies, for whatever reason they have.

Now, this duplication is well-known in this database and I aim to use the adjustment from another research: exclude a product when its returns has a correlation of 0.99 or more with the returns of another product. I reckon with about 3000 unique ProductIDs and 2500 CompanyIDs, the most efficient way [read: not blow up my pc] is to calculate correlations of products if one company reports more than one product. In the case of CompanyID A, run a STATA command that calculates the correlation between the returns of:
ProductID 1 and ProductID 3
ProductID 1 and ProductID 5
ProductID 3 and ProductID 5

I am stuck on this issue as I cannot find other posts that solve this issue and my source that cleans this database does not provide a .do file (if STATA was even used). So here comes my question that I have been grinding for the last hours: With what STATA command, or code, do I calculate the correlations between all products by CompanyID?

BJ Data Tech Solution

Home / Data Cleaning / Data management / Data Processing / Correlation of one variable between players in panel data
Correlation of one variable between players in panel data

0 Response to Correlation of one variable between players in panel data

Post a Comment

Home / Data Cleaning / Data management / Data Processing / Correlation of one variable between players in panel data Correlation of one variable between players in panel data

Related Posts with Correlation of one variable between players in panel data

0 Response to Correlation of one variable between players in panel data

Post a Comment

Home / Data Cleaning / Data management / Data Processing / Correlation of one variable between players in panel data
Correlation of one variable between players in panel data