The objective of the sampling procedures used in this study was to produce a random sample of the target population. A random sample shares the properties and characteristics of the total population from which it is drawn, subject to a certain level of sampling error. This means that with a properly drawn sample we can make statements about the properties and characteristics of the total population within specified limits of certainty and sampling variability.
The confidence interval for sample estimates of population proportions, using simple random sampling without replacement, is calculated by the following formula:

$$se(x) = z \sqrt{\frac{p \, q}{n - 1}}$$

Where:
| se(x) | = | the standard error of the sample estimate for a proportion |
| p | = | some proportion of the sample displaying a certain characteristic or attribute |
| q | = | (1 - p) |
| n | = | the size of the sample |
| z | = | the standardized normal variable, given a specified confidence level (1.96 at the 95% confidence level for samples of this size) |
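As a rough check on the formula described above, the calculation can be sketched in Python. This is a minimal sketch, assuming the sampling error takes the form z * sqrt(p * q / (n - 1)), a form that reproduces the entries of the sampling error table in this section. The function name `sampling_error` is illustrative and is not part of the study.

```python
import math


def sampling_error(p, n, z=1.96):
    """Sampling error of a proportion under simple random sampling.

    Assumed form: z * sqrt(p * q / (n - 1)), with q = 1 - p.
    Returns a proportion (multiply by 100 for percentage points).
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if n < 2:
        raise ValueError("n must be at least 2")
    q = 1.0 - p
    return z * math.sqrt(p * q / (n - 1))


# A 50/50 response distribution in a sub-sample of 1,000 respondents.
error_points = 100 * sampling_error(0.5, 1000)
print(f"+/- {error_points:.1f} percentage points")  # prints "+/- 3.1 percentage points"
```

The default `z=1.96` corresponds to the 95% confidence level; other confidence levels would need a different value.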
The sample sizes for the surveys are large enough to permit estimates for
sub-samples of particular interest. Table 5 presents the expected size of
the sampling error for specified sample sizes of 12,000 or fewer, at
different response distributions on a categorical variable. As the table
shows, larger samples produce smaller expected sampling errors, but with
diminishing returns: each further increase in sample size yields a smaller
reduction in sampling error.
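Table 5. Expected Sampling Error (Plus or Minus), by Size of Sample or Sub-Sample and Percentage of the Sample or Sub-Sample Giving a Response

| Size of Sample or Sub-Sample | 10% or 90% | 20% or 80% | 30% or 70% | 40% or 60% | 50% |
|---|---|---|---|---|---|
| 12,000 | 0.5 | 0.7 | 0.8 | 0.9 | 0.9 |
| 6,000 | 0.8 | 1.0 | 1.2 | 1.2 | 1.3 |
| 4,500 | 0.9 | 1.2 | 1.3 | 1.4 | 1.5 |
| 4,000 | 0.9 | 1.2 | 1.4 | 1.5 | 1.5 |
| 3,000 | 1.1 | 1.4 | 1.6 | 1.8 | 1.8 |
| 2,000 | 1.3 | 1.8 | 2.0 | 2.1 | 2.2 |
| 1,500 | 1.5 | 2.0 | 2.3 | 2.5 | 2.5 |
| 1,300 | 1.6 | 2.2 | 2.5 | 2.7 | 2.7 |
| 1,200 | 1.7 | 2.3 | 2.6 | 2.8 | 2.8 |
| 1,100 | 1.8 | 2.4 | 2.7 | 2.9 | 3.0 |
| 1,000 | 1.9 | 2.5 | 2.8 | 3.0 | 3.1 |
| 900 | 2.0 | 2.6 | 3.0 | 3.2 | 3.3 |
| 800 | 2.1 | 2.8 | 3.2 | 3.4 | 3.5 |
| 700 | 2.2 | 3.0 | 3.4 | 3.6 | 3.7 |
| 600 | 2.4 | 3.2 | 3.7 | 3.9 | 4.0 |
| 500 | 2.6 | 3.5 | 4.0 | 4.3 | 4.4 |
| 400 | 2.9 | 3.9 | 4.5 | 4.8 | 4.9 |
| 300 | 3.4 | 4.5 | 5.2 | 5.6 | 5.7 |
| 200 | 4.2 | 5.6 | 6.4 | 6.8 | 6.9 |
| 150 | 4.8 | 6.4 | 7.4 | 7.9 | 8.0 |
| 100 | 5.9 | 7.9 | 9.0 | 9.7 | 9.8 |
| 75 | 6.8 | 9.1 | 10.4 | 11.2 | 11.4 |
| 50 | 8.4 | 11.2 | 12.8 | 13.7 | 14.0 |

NOTE: Entries are expressed as percentage points (+ or -).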
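The diminishing return from larger samples can be illustrated with a short Python sketch. It assumes the simple random sampling form z * sqrt(p * q / (n - 1)) at a 50/50 response distribution and z = 1.96; the helper name `sampling_error_points` is hypothetical. Each doubling of the sample size buys a smaller reduction in sampling error than the previous one.

```python
import math


def sampling_error_points(p, n, z=1.96):
    """Sampling error in percentage points, assuming z * sqrt(p * q / (n - 1))."""
    return 100 * z * math.sqrt(p * (1.0 - p) / (n - 1))


# Sample sizes that double at each step, evaluated at a 50/50 split.
sizes = [500, 1000, 2000, 4000]
errors = [sampling_error_points(0.5, n) for n in sizes]

for n, err in zip(sizes, errors):
    print(f"n={n:>5}: +/- {err:.1f} percentage points")

# Reduction in sampling error gained by each successive doubling of n.
gains = [before - after for before, after in zip(errors, errors[1:])]
for n, gain in zip(sizes[1:], gains):
    print(f"doubling to n={n}: error falls by {gain:.2f} points")
```

Because the error shrinks roughly with the square root of the sample size, halving the error requires about four times as many respondents.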
The sampling design for this study, however, included a separate, concurrently
administered over-sample of youth and young adults (ages 16-39). Both the
cross-sectional sample and the over-sample of the youth/younger adult population
were drawn as simple random samples; however, the disproportionate sampling
of the population aged 16-39 introduces a design effect that makes it inappropriate
to assume that the sampling errors for total sample estimates will be identical
to those of a simple random sample.
To calculate a specific interval for estimates from a sample, the appropriate statistical formula for the allowance for sampling error (at the 95% confidence level) in a stratified sample with a disproportionate design is:

$$ASE = 1.96 \sqrt{\sum_{h=1}^{g} \frac{W_h^2 \, (1 - f_h) \, s_h^2}{n_h - 1}}$$

where:
| ASE | = | allowance for sampling error at the 95% confidence level; |
| h | = | a sample stratum; |
| g | = | number of sample strata; |
| $W_h$ | = | stratum h as a proportion of total population; |
| $f_h$ | = | the sampling fraction for group h, that is, the number in the sample divided by the number in the universe; |
| $s_h^2$ | = | the variance in the stratum h; for proportions this is equal to $p_h (1.0 - p_h)$; |
| $n_h$ | = | the sample size for the stratum h. |
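The stratified calculation defined above can be sketched in Python. This is a minimal sketch under stated assumptions, not the study's own computation: it assumes the allowance for sampling error is 1.96 times the square root of the sum over strata of W_h^2 * (1 - f_h) * s_h^2 / (n_h - 1). The population shares (0.53 and 0.47) are those used in this study's design-effect assessment; the stratum proportions, stratum sample sizes, and negligible sampling fractions are hypothetical values chosen only for illustration, as are the names `Stratum` and `allowance_for_sampling_error`.

```python
import math
from dataclasses import dataclass


@dataclass
class Stratum:
    weight: float             # W_h: stratum share of the total population
    sampling_fraction: float  # f_h: number in sample / number in universe
    proportion: float         # p_h: proportion with the attribute in stratum h
    sample_size: int          # n_h: sample size in stratum h


def allowance_for_sampling_error(strata, z=1.96):
    """Assumed form: z * sqrt(sum of W_h^2 * (1 - f_h) * s_h^2 / (n_h - 1))."""
    variance = 0.0
    for s in strata:
        s2 = s.proportion * (1.0 - s.proportion)  # s_h^2 for a proportion
        variance += (
            s.weight ** 2 * (1.0 - s.sampling_fraction) * s2 / (s.sample_size - 1)
        )
    return z * math.sqrt(variance)


# Hypothetical inputs: only the two weights come from the study text.
strata = [
    Stratum(weight=0.53, sampling_fraction=0.0, proportion=0.60, sample_size=3500),
    Stratum(weight=0.47, sampling_fraction=0.0, proportion=0.68, sample_size=2680),
]
ase_points = 100 * allowance_for_sampling_error(strata)
print(f"ASE: +/- {ase_points:.2f} percentage points")
```

With a single stratum of weight 1.0 and a sampling fraction of 0.0, this expression reduces to the simple random sampling form, which offers a quick sanity check on any implementation.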
Although Table 5 above provides a useful approximation of the magnitude of expected sampling error, precise calculation of allowances for sampling error requires the use of this formula. To assess the design effect for sample estimates, we calculated sampling errors for the disproportionate sample for a number of key variables using the above formula. These estimates were then compared to the sampling errors for the same variables, assuming a simple random sample of the same size. The two strata (h1 and h2) in the disproportionate sample were all respondents aged 16-39 and all respondents aged 40 and over, respectively. The population proportion for the 16-39 stratum (W1) was 53.0 percent, while the proportion for the 40-and-over stratum (W2) was 47.0 percent.
As shown in Table 6, the disproportionate sampling widens the confidence
interval by an average of 0.7 percent, compared to a simple random sample
of the same size. This means the sample design slightly decreases the sampling
precision for total population estimates, while increasing the precision
of sampling estimates for the sub-sample aged 16-39. Since the
average difference in the confidence interval between the stratified disproportionate
sample and a simple random sample is less than one percent, the
sampling error table for a simple random sample provides a reasonable
approximation of the precision of sampling estimates in the survey.
Table 6. CONFIDENCE INTERVALS, PERCENTAGE POINTS ± AT 95% CONFIDENCE LEVEL

| VARIABLE (Version 1 only) | p= | HYPOTHETICAL PROPORTIONATE SAMPLING | CURRENT DISPROPORTIONATE SAMPLING | DIFFERENCE IN CONFIDENCE INTERVALS ABOUT ESTIMATES |
|---|---|---|---|---|
| Driven in the past year | 89.2% | 0.77 | 0.78 | 1.3% |
| Drank alcohol in past year | 63.4% | 1.21 | 1.23 | 1.7% |
| Always use safety belt (N=5502) | 85.1% | 0.94 | 0.94 | ---- |
| Dislike seat belts (N=5505) | 33.1% | 1.24 | 1.26 | 1.6% |
| Always use passenger belt (N=5655) | 82.7% | 0.98 | 0.98 | ---- |
| Favor (a lot) seat belt laws | 69.3% | 1.15 | 1.16 | .9% |
| Should be primary enforcement | 63.9% | 1.20 | 1.22 | .9% |
| Ever ticketed by police for seatbelt | 9.3% | 0.73 | 0.72 | -1.4% |
| Ever injured in vehicle accident | 23.6% | 1.06 | 1.08 | 1.9% |
| Drives a car for work almost every day | 17.2% | 0.94 | 0.96 | 2.1% |
| Set a good example for others (N=5413) (reason for using seat belts) | 74.1% | 1.17 | 1.19 | 1.7% |
| Driver-side air bag in vehicle (N=5551) | 76.5% | 1.12 | 1.14 | 1.8% |
| Race: Black/African American | 8.6% | 0.70 | 0.70 | ---- |
| Ethnicity: Hispanic | 13.2% | 0.84 | 0.81 | -3.6% |
| Gender: Male | 48.0% | 1.24 | 1.27 | 2.4% |
| AVERAGE DIFFERENCE IN CONFIDENCE INTERVALS | | | | 0.7%* |

NOTE: Total sample proportions using SRS formula. Unless specified otherwise, N=6180.
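The difference column can be approximated from the two interval columns. The sketch below assumes that the difference is the relative change in the interval from proportionate to disproportionate sampling, computed from the rounded values shown in the table; that reading is an inference from the published figures rather than something the report states. The function name `relative_difference` is illustrative.

```python
def relative_difference(proportionate, disproportionate):
    """Percent change in the confidence interval relative to proportionate sampling."""
    return 100.0 * (disproportionate - proportionate) / proportionate


# (proportionate, disproportionate) intervals in percentage points, from Table 6.
rows = {
    "Driven in the past year": (0.77, 0.78),
    "Ever ticketed by police for seatbelt": (0.73, 0.72),
    "Ethnicity: Hispanic": (0.84, 0.81),
}

differences = {name: relative_difference(*pair) for name, pair in rows.items()}
for name, diff in differences.items():
    print(f"{name}: {diff:+.1f}%")
```

A positive value means the disproportionate design widens the interval; a negative value means it narrows it, as happens for characteristics concentrated in the over-sampled age group.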