Dear all,
I am currently working on my master's thesis and, since I have only just started using Stata, I am struggling to find the right command for the following problem.
My data are based on the NBER patent datasets and match patent numbers to inventor names. Each patent number is therefore assigned to several inventors who worked on the same project within the same organization. However, the names are not unique, since the same person can work on several patents over time.
The problem can be illustrated by the following example:
Patent ID | Inventor | Unique pair ID | Count
1         | A        | AB             | 2
1         | B        | AC             | 1
1         | C        | BC             | 1
2         | D        | DE             | 2
2         | E        |                |
3         | A        | AB             | 2
3         | B        | AD             | 1
3         | D        | AE             | 1
3         | E        | AF             | 1
3         | F        | BD             | 1
          |          | BE             | 1
          |          | BF             | 1
          |          | DE             | 2
          |          | DF             | 1
          |          | EF             | 1
In theory, one needs to create pairwise links, each with a new unique ID, between every two inventors (A and B, A and C, B and C, etc.) who share a patent number (e.g. 1, 2, 3). But since the dataset contains over 5,700 inventors, I am not sure whether it is feasible to create dummy variables. Afterwards, the occurrences of each unique pair ID have to be counted.
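To make the intended result concrete, here is a minimal sketch of the logic in Python rather than Stata. It is only an illustration of the computation on the toy example, not a proposed Stata solution. The names `rows`, `count_pairs` and `pair_counts` are made up for this sketch, and it assumes that each inventor should be counted at most once per patent and that the order within a pair does not matter (AB is the same as BA).

```python
from collections import Counter
from itertools import combinations

# Toy data from the example: (patent ID, inventor)
rows = [
    (1, "A"), (1, "B"), (1, "C"),
    (2, "D"), (2, "E"),
    (3, "A"), (3, "B"), (3, "D"), (3, "E"), (3, "F"),
]


def count_pairs(rows):
    """Count on how many patents each unordered inventor pair appears together."""
    # Group inventors by patent; a set drops repeated names within one patent
    # (assumption: an inventor counts once per patent).
    inventors_by_patent = {}
    for patent, inventor in rows:
        inventors_by_patent.setdefault(patent, set()).add(inventor)

    counts = Counter()
    for inventors in inventors_by_patent.values():
        # Sorting first means "AB" and "BA" end up as the same pair ID.
        for a, b in combinations(sorted(inventors), 2):
            # Plain concatenation is enough for single-letter toy names;
            # real names would need a separator or a tuple key.
            counts[a + b] += 1
    return counts


pair_counts = count_pairs(rows)
for pair, n in sorted(pair_counts.items()):
    print(pair, n)
```

On the toy data this gives 12 distinct pairs, with AB and DE counted twice and all other pairs once. Only pairs that actually share a patent are ever created, so no dummy variable per inventor is needed, even though about 5,700 inventors could in principle form roughly 16 million pairs.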
Any guidance would be greatly appreciated.
Thank you very much!
Best,
Carolin