Data @ Reed


This walkthrough comes from work with an Economics thesis student who was conducting a time-series analysis of crop data. The crop data started as a *.csv (comma-separated values) file. Below is an explanation of each step, along with the Stata code used to carry out the analysis.

1) Bring the data into Stata from the *.csv format

. import delimited "/Users/bottk/Downloads/crop_price.csv"
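
If a dataset is already open in memory, or if you want to confirm how the file was read, a variation like the following may help. This is only a sketch: it assumes the first row of the *.csv file holds the variable names. Note that "clear" discards whatever data is currently in memory, so save first if you need it.

. import delimited "/Users/bottk/Downloads/crop_price.csv", varnames(1) clear
. * check the variable names and storage types that were read in
. describe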


2) Generate a new datetime variable

(Note: datetime can be pretty complicated. Read the datetime documentation for background.)

In this case, the dataset already contained a string variable “month” in the format M20Y. (For example, the dates were entered as “Sept00” or “Oct01”: the month (M) is stated explicitly, and the last two digits are the year (Y), assumed to fall in “20__”.) The code below generates a new variable called “eventdate2”. The monthly() function converts the text into a Stata monthly date, and the mask "M20Y" tells Stata where to find the different pieces of the date information.

. gen double eventdate2 = monthly(month, "M20Y")
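
When the text in “month” does not match the mask, monthly() returns a missing value rather than an error, so a quick check after generating the variable can be worthwhile. A minimal sketch, using the variable names above:

. * how many dates failed to convert?
. count if missing(eventdate2)
. * show the original text for any that failed
. list month if missing(eventdate2) & !missing(month)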


3) Format the variable you just created

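Once you have generated the new variable from the old information, you need to assign it a datetime display format. Until you do, Stata shows the values as plain integers (the number of months since January 1960); the %tm format displays them as monthly dates. This can be tricky; see datetime in the Stata help documentation (type “help datetime” at the command line in Stata) for details.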

. format eventdate2 %tm
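
To confirm that the format took effect, it can help to list the original text next to the new variable for a few rows. With %tm, a date in September 2000 displays as 2000m9. The row range below is an arbitrary choice for illustration.

. * compare the original text with the formatted date in the first five rows
. list month eventdate2 in 1/5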


4) Declare your data to be time series data

Now you have a time variable that Stata understands, so you are ready to define your time series. Use the variable you have created (in this case, “eventdate2”) to declare the data as time series.

. tsset eventdate2
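
tsset requires each value of the time variable to appear only once, and its output notes whether the series has gaps (months with no observation). If gaps matter for your analysis, something like the following could be a starting point. This is a sketch, not part of the original analysis; tsfill adds rows to the dataset, so use it only if that suits your data.

. * summarize any gaps in the time variable
. tsreport
. * optionally, add an observation (with missing values) for each missing month
. tsfill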


5) Visualize your time series

Graphs can provide a more accessible way to assess data than tables. “tsline” is a line graph designed specifically for time series.

. tsline wheat_price rice_price
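
tsline accepts the usual twoway graph options, so you can label the graph and save it to a file. The title, axis label, legend text, and file name below are placeholders chosen for illustration; adjust them to match your data and units.

. tsline wheat_price rice_price, title("Crop prices over time") ytitle("Price") legend(label(1 "Wheat") label(2 "Rice"))
. * save the current graph as a PNG in the working directory
. graph export "crop_prices.png", replace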