Channel: Statalist

Is there anything more efficient than rowtotal?

Hi,

First let me provide a generalised version of my dataset.

Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str2 a byte(b1 b2 b3 c1 c2 c3 d1 d2 d3 e1 e2 e3)
"b1" 1 4 7  5  3 9 7  5  3  2 1 6
"b2" 3 6 9 12 15 6 9 12 15  3 7 3
"b3" 5 3 3  3  3 3 3  3  3 15 9 4
"c1" 8 2 4  6  8 2 4  6  8  3 3 5
"c2" 9 4 5  6  7 4 5  6  7  8 4 1
"c3" 9 5 7  9 11 5 7  9 11  7 5 1
"d1" 8 6 1  2  5 6 1  3  4 11 7 1
"d2" 7 4 1  2  7 4 1  5  1  5 1 5
"d3" 5 3 2  1  7 3 2  1  2  7 1 9
"e1" 4 3 3  3  3 3 3  3  3  7 2 8
"e2" 6 2 8 14 20 2 8 14 20  3 3 9
"e3" 6 1 9 17 25 1 9 17 25 20 8 6
end

I wish to compute row sums over groups of variables. Here, as in my true dataset, the variables to be summed together share a common prefix. For this very simple dataset, I used the following code, which produces the desired results.

Code:
egen b = rowtotal(b1-b3)
egen c = rowtotal(c1-c3)
egen d = rowtotal(d1-d3)
egen e = rowtotal(e1-e3)

keep a b c d e

The issue is that my true dataset contains 5000 variables and 5000 observations. I therefore need to generate a large number of additional variables, which is time-consuming.
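For reference, the four egen calls can at least be written as a loop over the prefixes. This is only a minimal sketch for the example data: it assumes the prefixes are known in advance (b, c, d, e) and that each group runs from suffix 1 to suffix 3, neither of which need hold in the full dataset. It tidies the code but still calls egen once per group, so it may not be any faster.

```stata
* Sketch only: run on the example data right after the -input- block.
* Assumes prefixes b c d e and suffixes 1-3 (true for the example, not in general).
foreach p in b c d e {
    * Same call as before: rowtotal() treats missing values as zero
    egen `p' = rowtotal(`p'1-`p'3)
}
```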

My data are laid out as a square matrix, just like the sample dataset above. Would it be quicker to convert the data to a matrix and then compute the sums?
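To make the matrix idea concrete, here is a rough Mata sketch for the example data. It assumes the numeric variables are stored in group order and that every group has the same number of columns (k = 3 here); the new variable names are hard-coded for illustration. It multiplies the data by a block indicator matrix so that each group's columns are summed in one step. Whether this is actually quicker than egen would need to be timed on the real data.

```stata
* Sketch only: start from the original example data (b, c, d, e must not exist yet).
* Assumes columns are ordered by group and every group has exactly k columns.
unab vars : b1-e3
mata:
    k = 3                                      // columns per group (assumed)
    X = st_data(., tokens(st_local("vars")))   // copy the numeric data into Mata
    G = cols(X) / k                            // number of groups
    S = I(G) # J(k, 1, 1)                      // Kronecker product: (G*k) x G indicator matrix
    T = editmissing(X, 0) * S                  // group row sums, missing treated as zero
    names = ("b", "c", "d", "e")               // hypothetical names for the new variables
    idx = st_addvar("double", names)
    st_store(., idx, T)
end
```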

Is there another alternative I have not yet thought of?
