Stata Help

Stata Data Formats & Changing Them

Numbers

Numbers in Stata can take a variety of formats, including negative values, decimals, and scientific notation with positive or negative exponents (e.g., 1.0e+2 for one hundred).

Every numeric display format in Stata begins with a % sign. What follows depends on how you want the data to be displayed. Adding a - left-aligns the data (the default is right-aligned). If you want to retain leading zeroes, add a 0. Next comes a number that sets the total display width, followed by a period (.) and another number specifying how many digits appear after the decimal point. Finally, end the format with e (scientific notation), f (fixed format), or g (general format, wherein Stata chooses based on the number being displayed).

Thus, to give a variable named wrongformat a width of 2 columns and 2 decimal places, the command is format wrongformat %2.2g. This tells Stata that wrongformat should be displayed 2 columns wide with 2 decimal places, in whatever format Stata thinks is best (general format). Note that a width of 2 is too narrow to hold a decimal point plus two decimals, so in practice you would pick a larger width, such as %9.2f.
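As a rough illustration of the pieces described above, the sketch below builds a small made-up dataset (the values are hypothetical) and tries several formats on wrongformat. It assumes a 9-column width rather than 2 so the results are wide enough to read.

```stata
* Hypothetical data: three values to experiment with
clear
set obs 3
generate wrongformat = 1234.5678 * _n

* Fixed format: 9 columns wide, 2 decimal places
format wrongformat %9.2f
list wrongformat

* Same format, left-aligned
format wrongformat %-9.2f
list wrongformat

* Retain leading zeroes
format wrongformat %09.2f
list wrongformat

* Scientific notation
format wrongformat %9.2e
list wrongformat

* General format: Stata picks the layout for each value
format wrongformat %9.0g
list wrongformat

* describe reports the format currently attached to the variable
describe wrongformat
```

Each format command changes only how the values are shown; the stored numbers stay the same throughout.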

Strings

Strings (shorthand in most data programs for strings of characters) have a single storage type, str. After str can come any number between 1 and 244 (Stata 13 and later allow up to 2045 and add a separate strL type for longer text). This number carries over into the default format as well. That is, a str6 type has a %6s format. When importing data, Stata often mistakes a numeric variable for a string variable. When this happens, the numbers will look right when you browse, but numeric commands such as summarize will turn up zero observations. In this case, you want to convert the actual variable, not the format type (see here).
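The linked page is not reproduced here, but one common way to convert the variable itself is Stata's destring command. A minimal sketch, assuming a made-up variable named income whose numbers were read in as text:

```stata
* Hypothetical data: numbers that were imported as strings
clear
input str6 income
"1200"
"850"
"430"
end

* Storage type is str6, so the default display format is %6s
describe income

* Convert the variable itself to numeric; changing the format would not help
destring income, replace

* income is now numeric, so numeric commands see all three observations
describe income
summarize income
```

The replace option overwrites the string variable in place. If you would rather keep the original, destring also accepts generate(newvar) to write the numeric version to a new variable.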

Note: The string format controls only the display, so Stata does not care if your actual data is longer than the format width. That is, if you have a state variable with a cell that reads 'Washington' but set the format to %4s, the cell will now read 'Wash...', while returning to a format of %10s will change the display back to 'Washington'.

Similar to changing the number format, the command to change the string format is format %[string length]s [variable name], with an optional - before the number to left-align the display. So format %-10s state would cause Stata to display the variable called state left-aligned, with the first 10 characters displayed.
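The state example can be tried out with a short sketch like the one below. The two-row dataset is made up for illustration.

```stata
* Hypothetical data: a short list of state names
clear
input str10 state
"Washington"
"Oregon"
end

* Narrow the display to 4 characters; the stored text is untouched
format %4s state
list state

* Widen the display again and left-align it
format %-10s state
list state

* The full value was there all along
assert state == "Washington" in 1
```

Because the format affects only the display, the assert at the end passes no matter which format is currently attached to state.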

Dates

Similar rules exist for changing the way dates and times are displayed. See the dates and times tutorial.
