## Weighted Data in Stata

There are four different ways to weight data in Stata: frequency weights (`fweight` or `frequency`), analytic weights (`aweight` or `cellsize`), sampling weights (`pweight`), and importance weights (`iweight`).

*Frequency* weights are the kind you have probably dealt with before. By adding a frequency weight, you are telling Stata that a single row stands for several identical observations, so the weight must be a whole number. For example, a weight of 3 means that three people share exactly the values recorded on that row. The other weighting options are a bit more complicated.
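As a rough illustration of what a frequency weight does, the sketch below builds a small made-up dataset (the variables `score` and `n` are hypothetical) and checks that weighting by `n` gives the same mean and observation count as physically duplicating each row `n` times with `expand`.

```stata
* Hypothetical data: each row stands for n identical people
clear
input score n
70 2
80 3
90 5
end

* Weighted summary: the 3 rows are treated as 2 + 3 + 5 = 10 observations
summarize score [fweight=n]
scalar mean_w = r(mean)
scalar N_w    = r(N)

* Same thing by hand: duplicate each row n times, then summarize unweighted
expand n
summarize score
scalar mean_e = r(mean)
scalar N_e    = r(N)

display "weighted mean = " mean_w ", expanded mean = " mean_e
```

Both approaches should report 10 observations and the same mean; the weighted version simply avoids storing the duplicated rows.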

*Analytic* weights treat each observation as a mean computed from a sample of size n, where n is the weight variable.
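To see how an analytic weight differs from a frequency weight, here is a small sketch on made-up cell means (`ybar` and `n` are hypothetical names). Both weight types give the same weighted mean, but with `aweight` Stata still counts only the rows actually present in the data, whereas `fweight` counts the sum of the weights.

```stata
* Hypothetical data: each row is a group mean based on n people
clear
input ybar n
2 10
3 30
5 60
end

* Analytic weights: the weighted mean uses n, but N stays at 3 rows
summarize ybar [aweight=n]
scalar mean_a = r(mean)
scalar N_a    = r(N)
scalar sumw_a = r(sum_w)

* Frequency weights: same mean, but N becomes 10 + 30 + 60 = 100
summarize ybar [fweight=n]
scalar mean_f = r(mean)
scalar N_f    = r(N)

display "aweight: N = " N_a ", mean = " mean_a
display "fweight: N = " N_f ", mean = " mean_f
```

The difference in N matters for standard errors: treating 3 group means as 100 independent observations would overstate the precision of the estimate.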

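*Sampling* weights (a.k.a. probability weights) cover situations where the data were drawn by random sampling without replacement from a larger population. Each weight is the inverse of the probability that the observation was selected, so it indicates how many members of the population that observation represents. You can learn more about sampling weights by reading this Demographic and Health Survey help page.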
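A minimal sketch of a sampling weight in use, on made-up data: `y`, `x`, and `w` are hypothetical, with `w` standing in for an inverse selection probability (a value of 150 would mean the row represents 150 people in the population). With `pweight`, `regress` reports robust standard errors, which the stored result `e(vce)` lets you confirm.

```stata
* Hypothetical survey extract: w = 1 / (probability of selection)
clear
input y x w
1.2 1 150
1.9 2 100
3.1 3 300
4.2 4 50
4.8 5 200
6.3 6 120
end

* Regression weighted by the sampling weight
regress y x [pweight=w]

* Check which variance estimator Stata used
display "variance estimator: `e(vce)'"
```

For data from a complex survey design (strata, clusters), the weight alone is usually not the whole story, so treat this as a starting point rather than a complete analysis.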

*Importance* weights, unlike the other three types, do not have a specific formula and can only be used with certain commands; they are primarily useful to programmers and will not be discussed here any further.

Most estimation commands can take the first three types of weights. If you are uncertain whether a command accepts weights, read its help page with `help [command]`. For example, the first thing in `help regress` is a syntax diagram that includes `[weight]`, which means the command accepts weights; the notes beneath the diagram list which weight types are allowed. To use a weight, your data must contain a variable that holds the weight information.

Assuming a command allows weights, the syntax simply adds `[[weight type]=[name of weight variable]]` after the variable list (and after any `if` or `in` qualifier) and before the comma that introduces any options. For example, presuming I wanted to run a regression and had an analytic weight column called "n", the command would be `regress y x1 x2 x3 [aweight=n]`. Typing `regress y x1 x2 x3 [cellsize=n]` runs the exact same command.
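The placement of the weight can be sketched on a small made-up dataset (`ybar`, `x1`, and `n` are hypothetical names chosen for illustration). The weight goes after the `if` qualifier and before the comma that starts the options, and the `cellsize` synonym should give the same coefficients as `aweight`.

```stata
* Hypothetical data: group means ybar, a predictor x1, and group sizes n
clear
input ybar x1 n
2.1 1 10
2.9 2 25
4.2 3 5
4.8 4 40
6.1 5 20
7.3 6 15
end

* Order: command, variables, if qualifier, [weight], then options
regress ybar x1 if x1 <= 5 [aweight=n], level(90)
scalar b_aweight = _b[x1]

* Same model using the synonym for analytic weights
regress ybar x1 if x1 <= 5 [cellsize=n], level(90)
scalar b_cellsize = _b[x1]

display "aweight slope = " b_aweight ", cellsize slope = " b_cellsize
```

Here `level(90)` is just an example option; any option the command supports goes in the same position after the comma.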