IMHO, Stata's biggest current flaw in the data management area is the lack of native support for key-value data (including JSON and flexible schemas). Maybe it leaves that to the Python integration (sfi), or perhaps it will debut in v18.
Anyway, most data available nowadays comes in such a format.
See the toy example below:
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input byte(month day) int year str8 product str45 tags
1 1 2022 "product1" `"{ "Dept": "HR", "Proj": "alpha"}"'
1 2 2022 "product3" `"{ "Dept": "OPS", "Proj": "beta"}"'
1 3 2022 "product3" `"{ "Dept": "IT", "Proj": "delta"}"'
1 4 2022 "product1" `"{ "Dept": "IT", "Proj": "gamma"}"'
1 5 2022 "product3" `"{ "Dept": "OPS", "Proj": "beta"}"'
1 6 2022 "product1" `"{ "Dept": "HR", "Proj": "epsilon"}"'
1 7 2022 "product4" `"{ "Dept": "HR", "Proj": "zeta"}"'
1 8 2022 "product4" `"{ "Dept": "MKT", "Proj": "gamma"}"'
end
------------------ copy up to and including the previous line ------------------

Listed 8 out of 8 observations

. list

     +--------------------------------------------------------------------+
     | month   day   year    product   tags                               |
     |--------------------------------------------------------------------|
  1. |     1     1   2022   product1   { "Dept": "HR", "Proj": "alpha"}   |
  2. |     1     2   2022   product3   { "Dept": "OPS", "Proj": "beta"}   |
  3. |     1     3   2022   product3   { "Dept": "IT", "Proj": "delta"}   |
  4. |     1     4   2022   product1   { "Dept": "IT", "Proj": "gamma"}   |
  5. |     1     5   2022   product3   { "Dept": "OPS", "Proj": "beta"}   |
     |--------------------------------------------------------------------|
  6. |     1     6   2022   product1   { "Dept": "HR", "Proj": "epsilon"} |
  7. |     1     7   2022   product4   { "Dept": "HR", "Proj": "zeta"}    |
  8. |     1     8   2022   product4   { "Dept": "MKT", "Proj": "gamma"}  |
     +--------------------------------------------------------------------+
Possible tools would be regular expressions, the JSON commands on SSC, or associative arrays.
Desired output formats:
Code:
     +------------------------------------------------+
     | month   day   year    product   Dept   Proj    |
     |------------------------------------------------|
  1. |     1     1   2022   product1   HR     alpha   |
  2. |     1     2   2022   product3   OPS    beta    |
  3. |     1     3   2022   product3   IT     delta   |
  4. |     1     4   2022   product1   IT     gamma   |
  5. |     1     5   2022   product3   OPS    beta    |
     |------------------------------------------------|
  6. |     1     6   2022   product1   HR     epsilon |
  7. |     1     7   2022   product4   HR     zeta    |
  8. |     1     8   2022   product4   MKT    gamma   |
     +------------------------------------------------+
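Since the Python integration is one of the routes mentioned above, here is a minimal sketch of how the one-column-per-key layout might be produced in plain Python, using only the standard library. The rows are hard-coded from the toy example, the function name `expand_tags` is made up for illustration, and moving the data in and out of Stata (for instance through `sfi`) is deliberately left out.

```python
import json

# Rows copied from the toy example: (month, day, year, product, tags)
rows = [
    (1, 1, 2022, "product1", '{ "Dept": "HR", "Proj": "alpha"}'),
    (1, 2, 2022, "product3", '{ "Dept": "OPS", "Proj": "beta"}'),
    (1, 3, 2022, "product3", '{ "Dept": "IT", "Proj": "delta"}'),
    (1, 4, 2022, "product1", '{ "Dept": "IT", "Proj": "gamma"}'),
    (1, 5, 2022, "product3", '{ "Dept": "OPS", "Proj": "beta"}'),
    (1, 6, 2022, "product1", '{ "Dept": "HR", "Proj": "epsilon"}'),
    (1, 7, 2022, "product4", '{ "Dept": "HR", "Proj": "zeta"}'),
    (1, 8, 2022, "product4", '{ "Dept": "MKT", "Proj": "gamma"}'),
]


def expand_tags(rows):
    """Hypothetical helper: turn each key in the JSON string into its own column."""
    out = []
    for month, day, year, product, tags in rows:
        record = {"month": month, "day": day, "year": year, "product": product}
        # json.loads parses the tags string into a dict; its keys become columns.
        record.update(json.loads(tags))
        out.append(record)
    return out


wide = expand_tags(rows)
for record in wide:
    print(record)
```

This assumes every `tags` value is valid JSON holding a flat object. Nested objects or malformed strings would need extra handling.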
Code:
     +-------------------------------------------------------------------------------------------------------------------------------------------------------+
     | month   day   year    product   Dept_HR   Dept_IT   Dept_OPS   Dept_MKT   Proj_alpha   Proj_beta   Proj_gamma   Proj_delta   Proj_epsilon   Proj_zeta |
     |-------------------------------------------------------------------------------------------------------------------------------------------------------|
  1. |     1     1   2022   product1         1         0          0          0            1           0            0            0              0           0 |
  2. |     1     2   2022   product3         0         0          1          0            0           1            0            0              0           0 |
  3. |     1     3   2022   product3         0         1          0          0            0           0            0            1              0           0 |
  4. |     1     4   2022   product1         0         1          0          0            0           0            1            0              0           0 |
  5. |     1     5   2022   product3         0         0          1          0            0           1            0            0              0           0 |
     |-------------------------------------------------------------------------------------------------------------------------------------------------------|
  6. |     1     6   2022   product1         1         0          0          0            0           0            0            0              1           0 |
  7. |     1     7   2022   product4         1         0          0          0            0           0            0            0              0           1 |
  8. |     1     8   2022   product4         0         0          0          1            0           0            1            0              0           0 |
     +-------------------------------------------------------------------------------------------------------------------------------------------------------+
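The indicator layout can be sketched the same way in plain Python, standard library only. This is an illustration under assumptions rather than a Stata solution: `indicator_columns` is a made-up name, the `tags` strings are copied from the toy example, and every value is assumed to be a simple string that is safe to use in a column name. Columns come out in order of first appearance, which differs from the order in the listing above.

```python
import json

# The tags strings from the toy example, one per observation.
tags = [
    '{ "Dept": "HR", "Proj": "alpha"}',
    '{ "Dept": "OPS", "Proj": "beta"}',
    '{ "Dept": "IT", "Proj": "delta"}',
    '{ "Dept": "IT", "Proj": "gamma"}',
    '{ "Dept": "OPS", "Proj": "beta"}',
    '{ "Dept": "HR", "Proj": "epsilon"}',
    '{ "Dept": "HR", "Proj": "zeta"}',
    '{ "Dept": "MKT", "Proj": "gamma"}',
]


def indicator_columns(tag_strings):
    """Hypothetical helper: build one 0/1 column per distinct key/value pair."""
    parsed = [json.loads(t) for t in tag_strings]

    # Collect distinct "key_value" names in order of first appearance.
    columns = []
    for obj in parsed:
        for key, value in obj.items():
            name = f"{key}_{value}"
            if name not in columns:
                columns.append(name)

    # One dict per observation: 1 if that key/value pair is present, else 0.
    table = []
    for obj in parsed:
        present = {f"{key}_{value}" for key, value in obj.items()}
        table.append({name: int(name in present) for name in columns})
    return columns, table


columns, table = indicator_columns(tags)
print(columns)
for row in table:
    print(row)
```

Because the column names are discovered from the data, a new department or project in the input adds a column without any change to the code, which is the flexible-schema behaviour the question is after.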