


How To Check What Template You Are Using For Graphs In Stata

3 Stata Graphics

Stata has excellent graphic facilities, accessible through the graph command; see help graph for an overview. The most common graphs in statistics are X-Y plots showing points or lines. These are available in Stata through the twoway subcommand, which in turn has many sub-subcommands or plot types, the most important of which are scatter and line. I will also briefly describe bar plots, available through the bar subcommand, and other plot types.

Stata 10 introduced a graphics editor that can be used to modify a graph interactively. I do not recommend this practice, however, because it conflicts with the goals of documenting and ensuring reproducibility of all the steps in your research.

All the graphs in this section (except where noted) use a custom scheme with blue titles and a white background, but otherwise should look the same as your own graphs. I discuss schemes in Section 3.2.5.

3.1 Scatterplots

In this section I will illustrate a few plots using the data on fertility decline first used in Section 2.1. To read the data from net-aware Stata type

. infile str14 country setting effort change ///
>     using https://data.princeton.edu/wws509/datasets/effort.raw, clear
(20 observations read)

To whet your appetite, here's the plot that we will produce in this section:

3.1.1 A Simple Scatterplot

To produce a simple scatterplot of fertility change by social setting you use the command

          graph twoway scatter change setting

Note that you specify y first, then x. Stata labels the axes using the variable labels, if they are defined, or variable names if not. The command may be abbreviated to twoway scatter, or just scatter if that is the only plot on the graph. We will now add a few bells and whistles.

3.1.2 Fitted Lines

Suppose we want to show the fitted regression line as well. In some packages you would need to run a regression, compute the fitted line, and then plot it. Stata can do all that in one step using the lfit plot type. (There is also a qfit plot for quadratic fits.) This can be combined with the scatter plot by enclosing each sub-plot in parentheses. (One can also combine plots using two vertical bars || to separate them.)

          graph twoway (scatter change setting) ///
                       (lfit change setting)
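
As a minimal sketch of the || notation mentioned in the previous paragraph, the overlay can also be written without parentheses. This assumes the effort data read in Section 3.1 are still in memory; the second command swaps in qfit to show the quadratic alternative.

```stata
* Scatterplot with a linear fit, using || separators instead of parentheses
twoway scatter change setting || lfit change setting

* Quadratic fit instead of a straight line
twoway scatter change setting || qfit change setting
```

Both notations produce the same graph; parentheses tend to read better once each plot carries its own options.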

Now suppose we wanted to put confidence bands around the regression line. Stata can do this with the lfitci plot type, which draws the confidence region as a gray band. (There is also a qfitci band for quadratic fits.) Because the confidence band can obscure some points, we draw the region first and the points later:

          graph twoway (lfitci change setting) ///
                       (scatter change setting)

Note that this command doesn't label the y-axis but uses a legend instead. You could specify a label for the y-axis using the ytitle() option, and omit the (rather obvious) legend using legend(off). Here we specify both as options to the twoway command. To make the options more obvious to the reader, I put the comma at the start of a new line:

          graph twoway (lfitci change setting) ///
                       (scatter change setting) ///
                     , ytitle("Fertility Decline") legend(off)

3.1.3 Labeling Points

There are many options that allow you to control the markers used for the points, including their shape and color; see help marker_options. It is also possible to label the points with the values of a variable, using the mlabel(varname) option. In the next step we add the country names to the plot:

          graph twoway (lfitci change setting) ///
                       (scatter change setting, mlabel(country) )

One slight problem with the labels is the overlap of Costa Rica and Trinidad Tobago (and to a lesser extent Panama and Nicaragua). We can solve this problem by specifying the position of the label relative to the marker using a 12-hour clock (so 12 is above, 3 is to the right, 6 is below and 9 is to the left of the marker) and the mlabv() option. We create a variable to hold the position, set by default to 3 o'clock, and then move Costa Rica to 9 o'clock and Trinidad Tobago to just a bit above that at 11 o'clock (we can also move Nicaragua and Panama up a bit, say to 2 o'clock):

. gen pos=3

. replace pos = 11 if country == "TrinidadTobago"
(1 real change made)

. replace pos = 9 if country == "CostaRica"
(1 real change made)

. replace pos = 2 if country == "Panama" | country == "Nicaragua"
(2 real changes made)

The command to generate this version of the graph is as follows:

          graph twoway (lfitci change setting) ///
                       (scatter change setting, mlabel(country) mlabv(pos) )

3.1.4 Titles, Legends and Captions

There are options that apply to all two-way graphs, including titles, labels, and legends. Stata graphs can have a title() and subtitle(), usually at the top, and a legend(), note() and caption(), usually at the bottom; type help title_options to learn more. Usually a title is all you need. Stata 11 allows text in graphs to include bold, italics, Greek letters, mathematical symbols, and a choice of fonts. Stata 14 introduced Unicode, greatly expanding what can be done. Type help graph text to learn more.
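
The title options and text formatting just described might be combined as in the sketch below. It assumes the effort data from Section 3.1 are in memory; the subtitle, note and caption strings are placeholders of my own, not text from the original figures.

```stata
* {bf:...} is bold, {it:...} is italics, {&alpha} and {&beta} are Greek letters
* (text formatting requires Stata 11 or later)
twoway (scatter change setting) (lfit change setting) ///
  , title("{bf:Fertility Decline} by {it:Social Setting}") ///
    subtitle("Linear fit") ///
    note("Fitted line: change = {&alpha} + {&beta} setting") ///
    caption("Placeholder caption") ///
    legend(off)
```

The title and subtitle appear at the top of the graph, the note and caption at the bottom.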

Our final tweak to the graph will be to add a legend to specify the linear fit and 95% confidence interval, but not fertility decline itself. We do this using the order(2 "linear fit" 1 "95% CI") option of the legend to label the second and first items in that order. We also use ring(0) to move the legend inside the plotting area, and pos(5) to place the legend box near the 5 o'clock position. Our complete command is then

. graph twoway (lfitci change setting) ///
>          (scatter change setting, mlabel(country) mlabv(pos) ) ///
>        , title("Fertility Decline by Social Setting") ///
>          ytitle("Fertility Decline") ///
>          legend(ring(0) pos(5) order(2 "linear fit" 1 "95% CI"))

. graph export fig31.png, width(500) replace
(file fig31.png written in PNG format)

The result is the graph shown at the beginning of this section.

3.1.5 Axis Scales and Labels

There are options that control the scaling and range of the axes, including xscale() and yscale(), which can be arithmetic, log, or reversed; type help axis_scale_options to learn more. Other options control the placing and labeling of major and minor ticks and labels, such as xlabel(), xtick() and xmtick(), and similarly for the y-axis; see help axis_label_options. Usually the defaults are acceptable, but it's nice to know that you can change them.
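
As an illustration of these axis options, here is a sketch that overrides the default ticks and range on the scatterplot. It assumes the effort data from Section 3.1 are in memory; the tick values and range are illustrative choices of mine, not recommendations.

```stata
* Major ticks every 10 units, minor ticks halfway between them,
* horizontal y-axis labels, and an x-axis extended to at least 30-95
twoway scatter change setting ///
  , xlabel(30(10)90) xmtick(35(10)85) ///
    ylabel(0(10)30, angle(horizontal)) ///
    xscale(range(30 95))
```

Note that range() can only widen an axis; it will not cut off data that fall outside the range given.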

3.2 Line Plots

I will illustrate line plots using data on U.S. life expectancy, available as one of the datasets shipped with Stata. (Try sysuse dir to see what else is available.)

. sysuse uslifeexp, clear
(U.S. life expectancy, 1900-1999)

The idea is to plot life expectancy for white and black males over the 20th century. Again, to whet your appetite I'll start by showing you the final product, and then we will build the graph bit by bit.

3.2.1 A Simple Line Plot

The simplest plot uses all the defaults:

          graph twoway line le_wmale le_bmale year                  

If you are puzzled by the dip before 1920, Google "US life expectancy 1918". We could abbreviate the command to twoway line, or even line if that's all we are plotting. (This shortcut only works for scatter and line.)

The line plot allows you to specify more than one "y" variable; the order is y1, y2, …, ym, x. In our example we specified two, corresponding to white and black life expectancy. Alternatively, we could have used two line plots: (line le_wmale year) (line le_bmale year).

3.2.2 Titles and Legends

The default graph is quite good, but the legend seems too wordy. We will move most of the information to the title and keep only ethnicity in the legend:

          graph twoway line le_wmale le_bmale year ///
            , title("U.S. Life Expectancy") subtitle("Males") ///
              legend( order(1 "white" 2 "black") )

Here I used three options, which as usual in Stata go after a comma: title, subtitle and legend. The legend option has many sub-options; I used order to list the keys and their labels, saying that the first line represented whites and the second blacks. To omit a key you just leave it out of the list. To add text without a matching key use a hyphen (or minus sign) for the key. There are many other legend options; see help legend_option to learn more.
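
The two order() tricks just mentioned (text without a key, and omitting a key) can be sketched as follows. The label strings are my own placeholders.

```stata
sysuse uslifeexp, clear

* A hyphen adds text with no matching key: here a heading for the legend
twoway line le_wmale le_bmale year ///
  , title("U.S. Life Expectancy") ///
    legend( order(- "Males:" 1 "white" 2 "black") )

* Leaving key 2 out of the list drops it from the legend;
* the second line is still drawn
twoway line le_wmale le_bmale year ///
  , legend( order(1 "white males") )
```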

I would like to use space a bit better by moving the legend inside the plot area, say around the 5 o'clock position, where improving life expectancy has left some spare room. As noted earlier we can move the legend inside the plotting area by using ring(0), the "inner circle", and place it near the 5 o'clock position using pos(5). Because these are legend sub-options they have to go inside legend():

          graph twoway line le_wmale le_bmale year ///
            , title("U.S. Life Expectancy") subtitle("Males") ///
              legend( order(1 "white" 2 "black") ring(0) pos(5) )

3.2.3 Line Styles

I don't know about you, but I find it hard to distinguish the default lines on the plot. Stata lets you control the line style in different ways. The clstyle() option lets you use a named style, such as foreground, grid, yxline, or p1-p15 for the styles used by lines 1 to 15; see help linestyle. This is useful if you want to pick your style elements from a scheme, as noted further below.

Alternatively, you can specify the three components of a style: the line pattern, width and color:

  • Patterns are specified using the clpattern() option. The most common patterns are solid, dash, and dot; see help linepatternstyle for more information.
  • Line width is specified using clwidth(); the available options include thin, medium and thick; see help linewidthstyle for more.
  • Colors can be specified using the clcolor() option using color names (such as red, white and blue, teal, sienna, and many others) or RGB values; see help colorstyle.
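
The pattern and width components in the list above can be set in the same way as color. This is a sketch with illustrative choices (solid versus dashed, medium versus thick); it uses the cl-prefixed option names from the list, and current Stata documentation describes the same settings under lpattern(), lwidth() and lcolor().

```stata
sysuse uslifeexp, clear

* One value per y variable, in the order the variables are listed
twoway (line le_wmale le_bmale year, ///
          clpattern(solid dash) clwidth(medium thick)) ///
  , title("U.S. Life Expectancy") subtitle("Males") ///
    legend( order(1 "white" 2 "black") )
```

Varying the pattern as well as the color keeps the lines distinguishable when the graph is printed in black and white.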

Here's how to specify blue for whites and red for blacks:

          graph twoway (line le_wmale le_bmale year , clcolor(blue red) ) ///
            , title("U.S. Life Expectancy") subtitle("Males") ///
              legend( order(1 "white" 2 "black") ring(0) pos(5))

Note that clcolor() is an option of the line plot, so I put parentheses round the line command and inserted it there.

3.2.4 Scale Options

It looks as if improvements in life expectancy slowed down a bit in the second half of the century. This can be better appreciated using a log scale, where a straight line would indicate a constant percentage improvement. This is easily done using the axis options of the two-way command, see help axis_options, and in particular yscale(), which lets you choose arithmetic, log, or reversed scales. There's also a suboption range() to control the plotting range. Here I will specify the y-range as 25 to 80 to move the curves a bit up:

. graph twoway (line le_wmale le_bmale year , clcolor(blue red) ) ///
>     , title("U.S. Life Expectancy") subtitle("Males") ///
>     legend( order(1 "white" 2 "black") ring(0) pos(5)) ///
>     yscale(log range(25 80))

3.2.5 Graph Schemes

Stata uses schemes to control the appearance of graphs; see help scheme. You can set the default scheme to be used in all graphs with set scheme scheme_name. You can also redisplay the (last) graph using a different scheme with graph display, scheme(scheme_name).

To see a list of available schemes type graph query, schemes. Try s2color for screen graphs, s1manual for the style used in the Stata manuals, and economist for the style used in The Economist. Using the latter we obtain the graph shown at the start of this section.

. graph display, scheme(economist)

. graph export fig32.png, width(400) replace
(file fig32.png written in PNG format)
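
To check which scheme is currently the default before changing it, something along these lines should work. This is a sketch; s1color is just an example scheme name, and the permanent variant is commented out so that running the block does not alter your saved preferences.

```stata
* Show all graphics settings, including the current default scheme
query graphics

* The scheme name alone is available as the c-class value c(scheme)
display "`c(scheme)'"

* Change the default for the rest of the session
set scheme s1color

* Uncomment to keep the setting across sessions
* set scheme s1color, permanently

* List the schemes installed on this machine
graph query, schemes
```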

3.3 Other Graphs

I conclude the graphics section discussing bar graphs, box plots, and kernel density plots using area graphs with transparency.

3.3.1 Bar Graphs

Bar graphs may be used to plot the frequency distribution of a categorical variable, or to plot descriptive statistics of a continuous variable within groups defined by a categorical variable. For our examples we will use the city temperature data that ships with Stata.

If I were to just type graph bar, over(region) I would obtain the frequency distribution of the region variable. Let us show instead the average temperatures in January and July. To do this I could specify (mean) tempjan (mean) tempjuly, but because the default statistic is the mean I can use the shorter version below. I think the default legend is too long, so I also specified a custom one.

I use over() so the regions are overlaid in the same graph; using by() instead would result in a graph with a separate panel for each region. The bargap() option controls the gap between bars for different statistics in the same over group; here I put a small space. The gap() option, not used here, controls the space between bars for different over groups. I also set the intensity of the color fill to 70%, which I think looks nicer.

. sysuse citytemp, clear
(City Temperature Data)

. graph bar tempjan tempjul, over(region) bargap(10) intensity(70) ///
>     title(Mean Temperature) legend(order(1 "January" 2 "July"))

. graph export bar.png, width(500) replace
(file bar.png written in PNG format)

Obviously the north-east and north-central regions are much colder in January than the south and west. There is less variation in July, but temperatures are higher in the south.

3.3.2 Box Plots

A quick summary of the distribution of a variable may be obtained using a "box-and-whiskers" plot, which draws a box ranging from the first to the third quartile, with a line at the median, and adds "whiskers" going out from the box to the adjacent values, defined as the highest and lowest values that are no farther from the box (that is, from the third and first quartiles) than 1.5 times the inter-quartile range. Values further out are outliers, indicated by circles.
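
To make the whisker rule concrete, here is a sketch that computes the limits by hand for January temperatures, measuring 1.5 times the inter-quartile range from the first and third quartiles. The scalar names q1, q3 and iqr are my own, and the quartiles come from summarize, detail.

```stata
sysuse citytemp, clear

* Quartiles of January temperature
quietly summarize tempjan, detail
scalar q1  = r(p25)
scalar q3  = r(p75)
scalar iqr = q3 - q1

* Whiskers extend to the most extreme observations inside these limits
display "lower limit = " q1 - 1.5*iqr
display "upper limit = " q3 + 1.5*iqr

* Observations beyond either limit are plotted individually as outliers
count if !missing(tempjan) & ///
    (tempjan < q1 - 1.5*iqr | tempjan > q3 + 1.5*iqr)
```

This computes the limits for all cities together; adding an if region == 1 qualifier (and so on) to both summarize and count would give the limits for one region at a time.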

Let us draw a box plot of January temperatures by region. I will use the over(region) option, so the boxes will be overlaid in the same graph, rather than by(region), which would produce a separate panel for each region. The option sort(1) arranges the boxes in order of the median of tempjan, the first (and in this case only) variable. I also set the box color to a nice blue by specifying the Red, Green and Blue (RGB) color components in a scale of 0 to 255:

. graph box tempjan, over(region, sort(1)) box(1, color("51 102 204")) ///
>     title(Box Plots of January Temperature by Region)

. graph export boxplot.png, width(500) replace
(file boxplot.png written in PNG format)

We see that January temperatures are lower and less variable in the north-east and north-central regions, with quite a few cities with unusually cold averages.

3.3.3 Kernel Density Estimates

A more detailed view of the distribution of a variable may be obtained using a smooth histogram, calculated with a kernel density smoother using the kdensity command.

Let us run separate kernel density estimates for January temperatures in each region using all the defaults, and save the results.

. forvalues i=1/4 {
  2.     capture drop x`i' d`i'
  3.     kdensity tempjan if region== `i', generate(x`i'  d`i')
  4. }

. gen zero = 0

Next we plot the density estimates using area plots with a floor at zero. Because the densities overlap, I use the new opacity option introduced in Stata 15 to make them 50% transparent. In this case I used color names, followed by a % symbol and the opacity. I also simplify the legend a bit, match the order of the densities, and put it in the top right corner of the plot.

. twoway rarea d1 zero x1, color("blue%50") ///
>    ||  rarea d2 zero x2, color("purple%50") ///
>    ||  rarea d3 zero x3, color("orange%50")  ///
>    ||  rarea d4 zero x4, color("red%50") ///
>        title(January Temperatures by Region) ///
>        ytitle("Smoothed density") ///
>        legend(ring(0) pos(2) col(1) order(2 "NC" 1 "NE" 3 "S" 4 "W"))

. graph export kernel.png, width(500) replace
(file kernel.png written in PNG format)

The plot gives us a clear picture of regional differences in January temperatures, with colder and narrower distributions in the north-east and north-central regions, and warmer with quite a bit of overlap in the south and west.
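
If filled areas are not needed, a simpler alternative is to let twoway estimate and draw the densities directly with the kdensity plot type, which skips the loop and the saved variables. This is a sketch of that variation, not the method used for the figure above; it draws lines rather than transparent areas.

```stata
sysuse citytemp, clear

* One density curve per region; region is coded 1 to 4
twoway (kdensity tempjan if region == 1) ///
       (kdensity tempjan if region == 2) ///
       (kdensity tempjan if region == 3) ///
       (kdensity tempjan if region == 4) ///
  , title("January Temperatures by Region") ///
    ytitle("Smoothed density") ///
    legend(ring(0) pos(2) col(1) order(1 "NE" 2 "NC" 3 "S" 4 "W"))
```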

3.4 Managing Graphs

Stata keeps track of the last graph you have drawn, which is stored in memory, and calls it "Graph". You can actually keep more than one graph in memory if you use the name() option to name the graph when you create it. This is useful for combining graphs; type help graph combine to learn more. Note that graphs kept in memory disappear when you exit Stata, even if you save the data, unless you save the graph itself.
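
A minimal sketch of naming graphs and combining them follows. The graph names white and black are my own choices.

```stata
sysuse uslifeexp, clear

* Keep two graphs in memory under different names
twoway line le_wmale year, name(white, replace) title("White males")
twoway line le_bmale year, name(black, replace) title("Black males")

* Put them side by side with a common y-axis scale
graph combine white black, ycommon

* List the graphs currently held in memory (and .gph files on disk)
graph dir
```

The replace suboption of name() lets the block be rerun without an error about the name already being in use.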

To save the current graph on disk using Stata's own format, type graph save filename. This command has two options: replace, which you need to use if the file already exists, and asis, which freezes the graph (including its current style) and then saves it. The default is to save the graph in a live format that can be edited in future sessions, for example by changing the scheme. After saving a graph in Stata format you can load it from the disk with the command graph use filename. (Note that graph save and graph use are analogous to save and use for Stata files.) Any graph stored in memory can be displayed using graph display [name]. (You can also list, describe, rename, copy, or drop graphs stored in memory; type help graph_manipulation to learn more.)
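
The save and reload cycle might look like the sketch below, written as a do-file fragment. The file names lifeexp and lifeexp_asis are hypothetical.

```stata
sysuse uslifeexp, clear
twoway line le_wmale le_bmale year

graph save lifeexp, replace             // live format, written to lifeexp.gph
graph save lifeexp_asis, asis replace   // frozen copy of the same graph

graph use lifeexp                       // load the live version from disk
graph display, scheme(s1manual)         // redisplay it with another scheme
```

Only the live version can be restyled this way; the asis copy keeps the look it had when it was saved.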

If you plan to incorporate the graph in another document you will probably need to save it in a more portable format. Stata's command graph export filename can export the graph using a wide variety of vector or raster formats, usually specified by the file extension. Vector formats such as Windows metafile (wmf or emf) or Adobe's PostScript and its variants (ps, eps, pdf) contain essentially drawing instructions and are thus resolution independent, so they are best for inclusion in other documents where they may be resized. Raster formats such as Portable Network Graphics (png) save the image pixel by pixel using the current display resolution, and are best for inclusion in web pages. Stata 15 added Scalable Vector Graphics (SVG), a vector image format that is supported by all major modern web browsers.
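
As a sketch of exporting the same graph to several formats, written as a do-file fragment: the file names are hypothetical, and which formats are available depends on your Stata version and operating system.

```stata
sysuse uslifeexp, clear
twoway line le_wmale le_bmale year

* Vector formats: the extension selects the format
graph export lifeexp.pdf, replace
graph export lifeexp.eps, replace
graph export lifeexp.svg, replace       // Stata 15 or later

* Raster format: width() sets the size in pixels
graph export lifeexp.png, width(1000) replace

* as() names the format explicitly when the extension does not identify it
graph export lifeexp.img, as(png) replace
```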

You can also print a graph using graph print, or copy and paste it into a document using the Windows clipboard; to do this right click on the window containing the graph and then select copy from the context menu.

Continue with 4. Programming Stata


Source: https://data.princeton.edu/stata/graphics

Posted by: robinsonlitaltalat.blogspot.com
