Hi,
First let me provide a generalised version of my dataset.
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str2 a byte(b1 b2 b3 c1 c2 c3 d1 d2 d3 e1 e2 e3)
"b1" 1 4 7  5  3 9 7  5  3  2 1 6
"b2" 3 6 9 12 15 6 9 12 15  3 7 3
"b3" 5 3 3  3  3 3 3  3  3 15 9 4
"c1" 8 2 4  6  8 2 4  6  8  3 3 5
"c2" 9 4 5  6  7 4 5  6  7  8 4 1
"c3" 9 5 7  9 11 5 7  9 11  7 5 1
"d1" 8 6 1  2  5 6 1  3  4 11 7 1
"d2" 7 4 1  2  7 4 1  5  1  5 1 5
"d3" 5 3 2  1  7 3 2  1  2  7 1 9
"e1" 4 3 3  3  3 3 3  3  3  7 2 8
"e2" 6 2 8 14 20 2 8 14 20  3 3 9
"e3" 6 1 9 17 25 1 9 17 25 20 8 6
end
I wish to compute row totals over groups of variables. Here, as in my true dataset, I want to sum the variables that share a common prefix. For this very simple dataset, I used the following code, which gives the desired results.
Code:
egen b = rowtotal(b1-b3)
egen c = rowtotal(c1-c3)
egen d = rowtotal(d1-d3)
egen e = rowtotal(e1-e3)
keep a b c d e
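The four egen calls above follow one pattern, so they could probably be written as a loop over the prefixes. The following is only a minimal sketch: it uses the first two observations of the example data so that it runs on its own, and it assumes every prefix has exactly three variables numbered 1 to 3, which may not hold in the true dataset.

```stata
* Sketch only: first two observations of the example data
clear
input str2 a byte(b1 b2 b3 c1 c2 c3 d1 d2 d3 e1 e2 e3)
"b1" 1 4 7  5  3 9 7  5  3 2 1 6
"b2" 3 6 9 12 15 6 9 12 15 3 7 3
end

* Loop over the prefixes instead of typing one egen line per group.
* Assumption: each prefix has variables numbered 1 to 3.
foreach p in b c d e {
    egen `p' = rowtotal(`p'1-`p'3)
}

keep a b c d e
list
```

With thousands of prefixes, the list in the foreach line would have to be built programmatically rather than typed out; how best to do that depends on the real variable names.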
The issue is that my true dataset contains 5000 variables, each with 5000 observations. I therefore need to generate a large number of additional variables, which is time-consuming.
My data are laid out like a square matrix, just like the sample dataset I provided above. Would it be quicker to convert the data to a matrix and then compute the sums?
Is there another alternative I have not yet thought of?
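To make the matrix idea concrete, here is a rough sketch of how the same sums might be computed in Mata, which reads the variables into a matrix and writes the row sums back as new variables. It again uses the first two observations of the example data and assumes three variables per prefix, numbered 1 to 3. Whether this is actually faster than egen on a 5000 by 5000 dataset is not something this sketch establishes; it would need to be timed.

```stata
* Sketch only: first two observations of the example data
clear
input str2 a byte(b1 b2 b3 c1 c2 c3 d1 d2 d3 e1 e2 e3)
"b1" 1 4 7  5  3 9 7  5  3 2 1 6
"b2" 3 6 9 12 15 6 9 12 15 3 7 3
end

mata:
// Assumption: these prefixes, each with variables numbered 1 to 3
prefixes = ("b", "c", "d", "e")
for (j = 1; j <= cols(prefixes); j++) {
    p = prefixes[j]
    // Read the three variables for this prefix into a matrix
    X = st_data(., (p + "1", p + "2", p + "3"))
    // Create the new variable and store the row sums in it;
    // rowsum() treats missing values as zero, as egen's rowtotal() does
    idx = st_addvar("double", p)
    st_store(., idx, rowsum(X))
}
end

keep a b c d e
list
```

The totals are stored as double here for simplicity; a smaller storage type could be used if the sums are known to be small integers.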