Stata Help

Calculating New Variables Using Existing Variables in Stata

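The basic command to generate a new variable is gen newvar = exp, where gen is short for generate and exp is the expression that defines the new variable's values.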

This will create a variable called, appropriately, newvar. There are a variety of built-in functions and manual calculations you can use to compute your desired value. Note that arithmetic only works on existing numeric variables; a number stored as a string variable must be converted to numeric before you can calculate with it.

For example, if I had four variables named se1, se2, se3 and se4 (implying they all came from a self-esteem scale) and I wanted to calculate a self-esteem value using them, I could type the following into the Command window:

gen se=(se1+se2+se3+se4)/4

This would create a new variable called se that had as its value the average of the four self-esteem items. Note that the / tells Stata to divide. The * tells Stata to multiply, and the + and - signs of course tell Stata to add and subtract. Also note that if any one of the four items is missing for an observation, se will be missing for that observation.
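To see the averaging command at work, here is a minimal sketch that can be pasted into the Command window or a do-file. The three rows of data are made up purely for illustration, and the egen line with rowmean() is an alternative I am adding for comparison, not part of the original example; it averages whatever items are not missing.

```stata
* Hypothetical toy data: three respondents, four self-esteem items
clear
input se1 se2 se3 se4
4 5 3 4
2 3 2 1
5 . 4 4
end

* Plain arithmetic average: missing if any item is missing
gen se = (se1 + se2 + se3 + se4)/4

* Alternative (assumption, not from the original text):
* rowmean() averages only the non-missing items in each row
egen se_alt = rowmean(se1 se2 se3 se4)

list
```

The third respondent has a missing se2, so se is missing for that row while se_alt is the average of the remaining three items. Which behavior you want depends on how your scale is meant to be scored.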

Now pretend I have a variable that is skewed. One way to attempt to correct for the skew is to take a logarithm. This is a good example of one of Stata's built-in mathematical functions that can be used in the generation of new variables. Let's pretend the variable is named askew. To take the log of askew, you would type:

gen logskew=log(askew)

As before, this command will generate a new variable called "logskew" that has as its content the natural log (base e) of the skewed variable. In Stata, log() and ln() both return the natural log; if you want the log base ten, use log10() instead. Values of zero or less have no logarithm, so logskew will be missing for those observations.

To see a complete list of mathematical functions available for calculating new variables, type help math functions into the Command window. In addition to the more common mathematical functions, help functions will bring up a complete list of all the types of functions Stata offers. Click on blue text to go to the specific help file.
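As a quick check on which logarithm you are getting, the sketch below builds a tiny made-up dataset and computes both the natural log and the log base ten. The values and the new variable names (lnskew, log10skew) are hypothetical and chosen only for illustration.

```stata
* Hypothetical toy data, including a zero to show what happens
clear
input askew
0
1
10
100
end

* Natural log (base e); ln(askew) would give the same result
gen lnskew = log(askew)

* Log base ten
gen log10skew = log10(askew)

list
```

The row where askew is 0 ends up missing on both new variables, and the row where askew is 100 has a log10skew of 2 but an lnskew of roughly 4.6.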


