
From: "Nick Cox" <n.j.cox@durham.ac.uk>
To: <statalist@hsphsun2.harvard.edu>
Subject: st: RE: Reshape: from wide to long
Date: Tue, 16 Sep 2003 10:28:39 +0100

Christer.Thrane@hil.no

> I have a panel data set (in wide format) where the same individuals were
> interviewed eight times during a ten-year period. During the ten-year
> period, about 30% of the original sample was lost due to attrition.
>
> Some variables were measured 2 times, some were measured 3 ... and some
> were measured 5 times. That is, my data looks like:
>
> y1 y2 y3 y4 x1 x2 z1 z2 z3 v1 v2 v3 v4...
>
> First, if I understand the manual correctly (which I probably don't), the
> reshape command treats x3 above as a missing observation. The manual says
> (p. 393): "Missing variables are treated like variables with missing
> values". As I understand it, however, x3 is not missing (as in attrition) -
> it was just not included in the survey that particular year.
>
> Should this concern me, or is it just a technicality that I can overlook?
> More important, will reshape "solve" this case in point?

First, you are superimposing your own thinking about the data and what
they mean here. That's most sensible for you, but, from the list above,
-reshape- is aware only that there are no variables -x3- -x4- -z4-.
Whether they are logically impossible, or the people dropped out, or
they were there but you never interviewed them, or someone destroyed
the data by accident, or in a fit of pique: all that is your issue,
not Stata's.

Stata has a predisposition to rectangular data structures, so on
something like

    . reshape long y x z v, i(id)

missing observations will indeed spring into existence for "times" 3 and 4
for -x-, and so forth. What you do with them is up to you. By and large,
they do no harm and Stata commands will almost always do what you want
given them. (The main exception is if you forget that -if x > 5- includes
missing x, because Stata treats numeric missing as larger than any number.)
On the whole, I'd leave them in.

I can't say whether this solves your problem, as I'm not clear what the
problem is.

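A small made-up example may make both points concrete: the missing observations that -reshape long- creates for -x- at times 3 and 4, and the -if x > 5- trap. This is only a sketch. The data values are invented, and the name -time- in -j()- is an assumption for readability (the command above leaves -j()- at its default, -_j-).

```stata
* Two individuals; y measured 4 times, x measured only twice (invented values)
clear
input id y1 y2 y3 y4 x1 x2
1 3 7 2 9  4 6
2 8 1 5 6 10 7
end

* x3 and x4 do not exist, so x is set to missing for time 3 and 4
reshape long y x, i(id) j(time)
list, sepby(id) noobs

* Missing counts as larger than any number, so this counts 7 observations:
* the values 6, 10, 7 plus the four missing values of x
count if x > 5

* Excluding missing values explicitly counts 3 observations
count if x > 5 & !missing(x)
```

The second -count- is the guarded form to use whenever the created missing observations are left in the data.
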
> Second, must the variables be placed adjacent as above for reshape to work,
> or can they be organized as:
>
> y1 x1 z1 v1 y2 x2.... v4

-reshape- appears indifferent to variable order. If your data are like
this it might help you to tidy them up, but again that's your concern.

By the way, I find -reshape- a daunting command at times and it's not
always easy for me to think through what -reshape- will do in problematic
cases. Making up a little data set and playing with it to find out is an
obvious tip, but perhaps one worth mentioning. I use small random integers
-- it is easy to trace mappings with such data -- so to answer the question
above I did something like this:

    set obs 10
    * fill each wide variable with random integers up to 10
    foreach v in y1 y2 y3 y4 x1 x2 z1 z2 z3 v1 v2 v3 v4 {
        egen `v' = rndint(), max(10)
    }
    * describe and list the wide data before reshaping
    d
    l
    gen id = _n
    reshape long y x z v, i(id)

Here -rndint()- comes from -egenmore- on SSC.

Nick
n.j.cox@durham.ac.uk

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

**References**: **st: Reshape: from wide to long** *From:* Christer.Thrane@hil.no

