Field Outcomes

Survey data collection by the Federal government requires prior approval by the Office of Management and Budget (OMB). Before submitting the formal request for data collection to OMB, NHTSA published a Notice in the Federal Register soliciting comments on the information collection. The Notice appeared in the Federal Register, 63:50, pages 12858-12859, March 16, 1998. The closing date for comments was May 15, 1998. No comments were received in response to the Notice. NHTSA then submitted the request for data collection to OMB on June 29, 1998. OMB approved the information collection on September 9, 1998, assigning it the OMB number 2127-0596 with an expiration date of December 31, 1999.

The field interviewing for the study commenced on November 5, 1998, following training of the field interviewers, and was completed on January 12, 1999. This is approximately the same time period in which the 1996 Occupant Protection Survey (November 4, 1996 to January 5, 1997) and the 1994 Occupant Protection Survey (October 5, 1994 to December 11, 1994) were conducted. The status of cases as of the end of the field period is reported using the categories defined below.



FIGURE 4
Sample Disposition Categories

NIS/DIS/change#
    The number was not in service, had been disconnected, or yielded a recording indicating that it was no longer an active number
Non-residential
    The number yielded a contact with a business, government agency, pay telephone, or other non-residential unit
Computer/fax
    The number yielded an electronic tone indicating a fax machine or data line
Pre-screened NIS/DIS
    An automated dialer identified the number as no longer in service or disconnected before it was included in the sample
SSI Business Numbers
    The number was pre-identified by Survey Sampling Inc. (SSI) as a place of business before actual dialing began
No answer
    The number rang, but no one answered. The protocol required five calls to non-answering numbers.
Busy
    A busy signal was encountered
Answering machine
    An answering machine was reached at the telephone number
Language
    The interview could not be completed because of language barriers
Not Available
Health/Deaf/Deceased
    Those unable to participate due to death, self-defined health reasons, or deafness
Away for duration
    The designated respondent was out of the area for the entire field period
Callback
    Contact was made with the household, but not necessarily the designated respondent. By the end of the field period, the case had yielded neither a refusal nor a completed interview
Callback to complete
    The interview was interrupted, but not terminated. The field period ended before the full interview could be completed
Refusal -- Initial
    Someone in the household refused to participate in the study
Refusal -- Second
    During a refusal conversion attempt, a second refusal to participate in the study was encountered
Screen Outs
    Households whose eligible participants had met the gender and/or age quotas by Region
Terminate
    A respondent began the interview but refused to finish
Complete
    An interview was completed with the designated respondent

For survey Version 1 - Seat Belt Usage Issues, a total of 30,620 randomly selected telephone numbers were sampled within a geographically stratified national sampling frame for both sample components (the cross-section of youth and adults age 16 and older and the oversample of persons age 16-39):

At the close of the field period, only 521 cases (2%) were in callback status.

The participation rate represents one of the most critical measures of potential sample bias because it indicates the degree of self-selection by potential respondents into or out of the survey. The participation rate is calculated as the number of completed interviews (including respondents who screen out as ineligible) divided by the combined total number of completed interviews, terminated interviews, and refusals to interview. (The inclusion of screen outs in the numerator and denominator is mathematically equivalent to discounting the refusals by the estimated rate of non-eligibility among refusals.) The participation rate for Version 1 is based on the following elements:

Based on the standard calculations of participation rate, as defined by the Council of American Survey Research Organizations (CASRO), the participation rate for Version 1 was 79.6%. This formula treats the numerator as all respondents who participate by completing required survey questions, while the denominator includes those who complete required questions, those who begin but terminate before completing all required questions, and those who refuse entirely.
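As a minimal illustration in Python (not part of the original study materials), the participation rate described above can be checked against the Version 1 counts reported in Table 3. The function name is hypothetical; the counts are taken from the table, with initial and second refusals summed.

```python
def participation_rate(completes, screen_outs, terminates, refusals):
    """Participation rate as described in the text: completed interviews
    (including screen-outs) divided by completes + screen-outs +
    terminates + refusals. Returns a fraction between 0 and 1."""
    participating = completes + screen_outs
    return participating / (participating + terminates + refusals)


# Version 1 counts from Table 3 (initial + second refusals are summed).
total = participation_rate(completes=4094, screen_outs=2436,
                           terminates=43, refusals=407 + 1220)
cross_section = participation_rate(3071, 74, 35, 228 + 935)
oversample = participation_rate(1023, 2362, 8, 179 + 285)

print(f"{total:.1%} {cross_section:.1%} {oversample:.1%}")
# prints: 79.6% 72.4% 87.8%
```

The three values agree with the Participation Rate row of Table 3.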

The Final Summary Disposition of the Version 1 sample is given in Table 3. The table includes breakouts for each survey component (national youth and adult cross-section and the age 16-39 oversample). The average interview length for Version 1 was 22.7 minutes.




TABLE 3
Sample Disposition:
Version 1, Seat Belt Usage Issues

                                CROSS-     OVER-
                               SECTION    SAMPLE     TOTAL
TOTAL NUMBERS DIALED             16891     13729     30620
   NIS/Dis/Change#/Wrong#         1794      1331      3125
   Non-residential                1137       924      2061
   Computer/fax                    525       420       945
   Prescreened NIS-DIS            5440      4364      9804
   SSI Business Numbers           1228      1012      2240
   Other Reason Terminating         11         4        15
   No Answer                      1164       968      2132
   Answering Machine               342       310       652
   Busy                            106        49       155
   Callback                        315       206       521
   Not Available                    30       114       144
   Language                        146        77       223
   Health/Deaf/Deceased            236        61       297
   Away for Duration                72        31       103
   Refusals -- Initial             228       179       407
   Refusals -- Second              935       285      1220
   Total Contacts                 3180      3393      6573
   Screen out                       74      2362      2436
   Total Qualified                3106      1031      4137
   Callback to Complete              0         0         0
   Terminates                       35         8        43
   Completes                      3071      1023      4094
   Participation Rate            72.4%     87.8%     79.6%

For survey Version 2 - Child Safety Seat Issues, a total of 29,592 randomly selected telephone numbers were sampled within a geographically stratified national sampling frame for both sample components (the cross-section of youth and adults age 16 and older and the oversample of persons age 16-39):

At the close of the field period, there were 235 cases (1%) in callback status.

Based on the standard calculations of participation rate, the participation rate for Version 2 was 81.2 percent.

The Final Summary Disposition of the Version 2 sample is given in Table 4. The table includes breakouts for each survey component (national youth and adult cross-section and the age 16-39 oversample). The average interview length for Version 2 was 16.3 minutes.




TABLE 4
Sample Disposition:
Version 2, Child Safety Seat Issues

                                CROSS-     OVER-
                               SECTION    SAMPLE     TOTAL
TOTAL NUMBERS DIALED             15711     13881     29592
   NIS/Dis/Change#/Wrong#         1478      1494      2972
   Business#                      1061       909      1970
   Computer/Fax Tone               488       421       909
   Prescreened NIS-DIS            4930      4372      9302
   SSI Business Numbers           1254       986      2240
   Other Reason Terminating         16        15        31
   No Answer                      1056       836      1892
   Answering Machine               275       211       486
   Busy                             70        38       108
   Callback                        164        71       235
   Not Available                   306       210       516
   Language                        109        74       183
   Health/Deaf/Deceased            208        62       270
   Resp. Away for Duration          54        25        79
   Refusals                        361       153       514
   Second Refusals                 687       337      1024
   Total Contacts                 3190      3664      6854
   Screen out                       79      2614      2693
   Total Qualified                3111      1050      4161
   Callback to Complete              0         0         0
   Terminates                       28        12        40
   Completes                      3083      1038      4121
   Participation Rate            74.6%     87.9%     81.2%

Sample Weighting

The characteristics of a perfectly drawn sample of a population will vary from true population characteristics only within certain limits of sample variability (i.e., sampling error). Unfortunately, social surveys do not permit perfect samples. The sampling frames available to survey research are less than perfect. The absence of perfect cooperation from sampled units means that the completed sample will differ from the drawn sample. In order to correct these known problems of sample bias, the achieved sample is weighted to certain characteristics of the total population. Each of the survey samples was weighted separately.

The weighting plan for the survey was a multi-stage sequential process of weighting the achieved sample to correct for sampling and non-sampling biases in the final sample. The first stage in the sample weighting procedures was designed to correct the cases in the completed sample for known selection biases in the sampling procedures. At the household selection stage, a random digit dialing process will give households with more than one telephone number an unequal likelihood of selection. Nationally, about 18% of households selected by random digit dialing will have more than one telephone number. This selection bias was corrected by giving each household a first stage weight equal to the inverse of the number of different telephone numbers in the household, up to a maximum of three phone numbers.

The second step in the weighting process was to correct for selection procedures that yielded unequal probability of selection within sampled households. Although the survey was designed as a population survey, only one eligible person per household could be interviewed (because multiple interviews per household are burdensome and introduce additional design effects into the survey estimates). A respondent's probability of selection is therefore inversely proportional to the number of eligible persons in the household. Hence, the second stage weight was equal to the number of eligible respondents within the household.
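The first two weighting stages can be sketched in Python as follows. This is an illustrative port, not the study's code: the function names are hypothetical, and the caps and special codes mirror the recodes in the Version 1 SPSS program in Figure 5A (phone lines capped at three, eligibles capped at seven, code 99 treated as one). Treating a missing or non-positive line count as one line is an assumption of this sketch.

```python
def first_stage_weight(num_phone_lines):
    """Inverse of the number of household telephone numbers, capped at 3.
    Missing values (None) and counts of 11 or more are treated as one
    line, as in the SPSS recode; counts below 1 are also treated as one
    line (an assumption made here to avoid division by zero)."""
    if num_phone_lines is None or num_phone_lines < 1 or num_phone_lines >= 11:
        lines = 1
    else:
        lines = min(num_phone_lines, 3)
    return 1.0 / lines


def second_stage_weight(num_eligible):
    """Number of eligible respondents in the household, capped at 7.
    Codes 0 and 99 are set to 1, following the Version 1 recode."""
    if num_eligible in (0, 99):
        return 1
    return min(num_eligible, 7)


# Hypothetical household: two telephone numbers, three eligible persons.
combined = first_stage_weight(2) * second_stage_weight(3)
print(combined)  # 1.5
```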

The next step in the weighting process was to correct the study design for deliberate disproportionate selection of younger population subsets in the sample design. The survey included both a cross-sectional sample of 3,000 respondents, aged 16 and older, and an oversample of 1,000 persons, aged 16 to 39 years old. Hence, the total achieved sample yielded a disproportionate sample distribution by age. A third stage weight was used to correct the achieved sample for disproportionate sampling by dividing the expected population distribution, based on Census projections, by the achieved sample distribution on the stratification variables. Specifically, the third stage weight corrected the sample to the cell distribution of the population for six age cohorts (16-24, 25-34, 35-44, 45-54, 55-64, and 65 or older) by gender, using the Census Population Projections for Age, Sex and Race for 1998. After these corrections were made, no further weighting by other Census characteristics (e.g., race) was considered necessary or desirable.
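The third-stage calculation (expected population share divided by achieved sample share, cell by cell) can be sketched as below. The cell shares are hypothetical placeholders chosen only to show the arithmetic; they are not the Census projections or the achieved distributions used in the study.

```python
def cell_weights(population_share, sample_share):
    """Post-stratification weight per age-by-gender cell:
    population share divided by achieved sample share."""
    return {cell: population_share[cell] / sample_share[cell]
            for cell in population_share}


# Hypothetical shares for two of the twelve age-by-gender cells.
population_share = {("male", "16-24"): 0.08, ("male", "65+"): 0.07}
sample_share = {("male", "16-24"): 0.10, ("male", "65+"): 0.05}

weights = cell_weights(population_share, sample_share)
for cell, w in weights.items():
    print(cell, f"{w:.3f}")
```

A cell that is over-represented in the achieved sample (here the hypothetical 16-24 cell) receives a weight below 1, and an under-represented cell receives a weight above 1.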




FIGURE 5A
SPSS Program for Assigning Weights

    VERSION 1

    compute numtel=q109.
    recode numtel (sysmis=1)(4 thru 10=3)(11 thru highest=1).
    compute nadults=(q101a).
    recode nadults (7 thru 98=7)(0,99=1).
    compute weight1=(1/numtel).
    compute weight2=nadults.
    COMPUTE WEIGHT3=(WEIGHT1*WEIGHT2).

    *age by gender weight.
    compute catage=q100.
    recode catage (16 thru 24=1)(25 thru 34=2)(35 thru 44=3)
         (45 thru 54=4)(55 thru 64=5)(65 thru 97=6)(99=7).
    value labels catage 1 '16-24' 2 '25-34' 3 '35-44'
         4 '45-54' 5 '55-64' 6 '65+' 7 'Refused'.
    compute gender=cq283.
    value labels gender 1 'Male' 2 'Female'.
    compute weight4=0.
    if (gender eq 1 and catage eq 1) weight4=0.790.
    if (gender eq 1 and catage eq 2) weight4=0.796.
    if (gender eq 1 and catage eq 3) weight4=1.034.
    if (gender eq 1 and catage eq 4) weight4=1.283.
    if (gender eq 1 and catage eq 5) weight4=1.496.
    if (gender eq 1 and catage eq 6) weight4=1.624.
    if (gender eq 2 and catage eq 1) weight4=0.665.
    if (gender eq 2 and catage eq 2) weight4=0.721.
    if (gender eq 2 and catage eq 3) weight4=0.902.
    if (gender eq 2 and catage eq 4) weight4=1.204.
    if (gender eq 2 and catage eq 5) weight4=1.288.
    if (gender eq 2 and catage eq 6) weight4=1.749.
    compute weight5=(weight3*weight4).
    compute weight6=(weight5*.58061).
    recode weight6 (0=1).




FIGURE 5B
SPSS Program for Assigning Weights

    VERSION 2

    compute numtel=q104.
    recode numtel (sysmis=1)(4 thru 10=3)(11 thru highest=1).
    compute nadults=(q12a).
    recode nadults (7 thru 98=7)(99=1).
    compute weight1=(1/numtel).
    compute weight2=nadults.
    COMPUTE WEIGHT3=(WEIGHT1*WEIGHT2).

    *age by gender weight.
    compute catage=q11.
    recode catage (16 thru 24=1)(25 thru 34=2)(35 thru 44=3)
         (45 thru 54=4)(55 thru 64=5)(65 thru 97=6)(99=7).
    value labels catage 1 '16-24' 2 '25-34' 3 '35-44'
         4 '45-54' 5 '55-64' 6 '65+' 7 'Refused'.
    compute gender=cq273.
    value labels gender 1 'Male' 2 'Female'.
    compute weight4=0.
    if (gender eq 1 and catage eq 1) weight4=0.787.
    if (gender eq 1 and catage eq 2) weight4=0.815.
    if (gender eq 1 and catage eq 3) weight4=0.982.
    if (gender eq 1 and catage eq 4) weight4=1.272.
    if (gender eq 1 and catage eq 5) weight4=1.428.
    if (gender eq 1 and catage eq 6) weight4=1.634.
    if (gender eq 2 and catage eq 1) weight4=0.722.
    if (gender eq 2 and catage eq 2) weight4=0.766.
    if (gender eq 2 and catage eq 3) weight4=0.800.
    if (gender eq 2 and catage eq 4) weight4=1.264.
    if (gender eq 2 and catage eq 5) weight4=1.353.
    if (gender eq 2 and catage eq 6) weight4=1.684.
    compute weight5=(weight3*weight4).
    compute weight6=(weight5*.55556).
    recode weight6 (0=1).



The final step in the weighting process was designed to correct for the fact that the total number of cases in the weighted sample was larger than the unweighted sample size because of the use of the number of eligibles weight. In order to avoid misinterpretation of sample size, the total number of cases in the unweighted sample was divided by the total number of cases in the weighted sample to yield a sample size weight. When this weight is applied, the size of the weighted sample is identical to the size of the unweighted sample.
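A minimal sketch of this rescaling step, using hypothetical weights: the scaling factor is the unweighted case count divided by the sum of the weights. The constants .58061 and .55556 in the SPSS programs (Figures 5A and 5B) appear to play this role for the two survey versions, although the text does not say so explicitly.

```python
def sample_size_weight(weights):
    """Unweighted number of cases divided by the weighted number of cases."""
    return len(weights) / sum(weights)


# Hypothetical pre-scaling weights (WEIGHT5) for five respondents.
weight5 = [0.5, 1.5, 2.0, 3.0, 1.0]
scale = sample_size_weight(weight5)          # 5 / 8.0 = 0.625
weight6 = [w * scale for w in weight5]

print(scale, sum(weight6))  # the weighted total now equals the case count
```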

The final weight (WEIGHT6) incorporates all of the intermediate weighting steps described above. The final weight adjusts the total completed interviews in the achieved sample to correct for known sampling and participation biases, while maintaining the unweighted sample size.


Precision of Sample Estimates

The objective of the sampling procedures used on this study was to produce a random sample of the target population. A random sample shares the same properties and characteristics of the total population from which it is drawn, subject to a certain level of sampling error. This means that with a properly drawn sample we can make statements about the properties and characteristics of the total population within certain specified limits of certainty and sampling variability.

The confidence interval for sample estimates of population proportions, using simple random sampling without replacement, is calculated by the following formula:

The sample sizes for the surveys are large enough to permit estimates for subsamples of particular interest. Table 5 presents the expected size of the sampling error for specified sample sizes of 8,000 and less, at different response distributions on a categorical variable. As the table shows, larger samples produce smaller expected sampling variances, but there is a constantly declining marginal utility of variance reduction per sample size increase.
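The formula itself is not reproduced in this text. As a hedged sketch, the standard textbook expression for the 95% margin of error of a proportion, 1.96 * sqrt(p(1-p)/(n-1)) with an optional finite population correction, reproduces the entries of Table 5. The function name, the n-1 denominator, and the optional population_size parameter are assumptions of this sketch rather than the report's stated formula.

```python
import math


def srs_margin(p, n, z=1.96, population_size=None):
    """Half-width of the confidence interval, in percentage points, for a
    proportion p estimated from a simple random sample of size n."""
    variance = p * (1 - p) / (n - 1)
    if population_size is not None:
        # Finite population correction for sampling without replacement.
        variance *= 1 - n / population_size
    return 100 * z * math.sqrt(variance)


for n in (50, 1000, 4000):
    print(n, round(srs_margin(0.5, n), 1))
# prints 14.0, 3.1 and 1.5, matching the "50" column of Table 5
```

For a national population the correction term is negligible, which is why it is left off by default.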




TABLE 5
Expected Sampling Error (Plus or Minus)
At the 95% Confidence Level
(Simple Random Sample)

Size of       Percentage of the Sample or Subsample Giving
Sample or     A Certain Response or Displaying a Certain
Subsample     Characteristic for Percentages Near:

               10 or 90   20 or 80   30 or 70   40 or 60         50
8,000               0.7        0.9        1.0        1.1        1.1
4,000               0.9        1.2        1.4        1.5        1.5
3,000               1.1        1.4        1.6        1.8        1.8
2,000               1.3        1.8        2.0        2.1        2.2
1,500               1.5        2.0        2.3        2.5        2.5
1,300               1.6        2.2        2.5        2.7        2.7
1,200               1.7        2.3        2.6        2.8        2.8
1,100               1.8        2.4        2.7        2.9        3.0
1,000               1.9        2.5        2.8        3.0        3.1
900                 2.0        2.6        3.0        3.2        3.3
800                 2.1        2.8        3.2        3.4        3.5
700                 2.2        3.0        3.4        3.6        3.7
600                 2.4        3.2        3.7        3.9        4.0
500                 2.6        3.5        4.0        4.3        4.4
400                 2.9        3.9        4.5        4.8        4.9
300                 3.4        4.5        5.2        5.6        5.7
200                 4.2        5.6        6.4        6.8        6.9
150                 4.8        6.4        7.4        7.9        8.0
100                 5.9        7.9        9.0        9.7        9.8
75                  6.8        9.1       10.4       11.2       11.4
50                  8.4       11.2       12.8       13.7       14.0

NOTE: Entries are expressed as percentage points (+ or -)



However, the sampling design for this study included a separate, concurrently administered oversample of youth and young adults (age 16-39). Both the cross-sectional sample and the oversample of the youth/younger adult population were drawn as simple random samples; however, the disproportionate sampling of the age 16-39 population introduces a design effect that makes it inappropriate to assume that the sampling error for total sample estimates will be identical to those of a simple random sample.

In order to calculate a specific interval for estimates from a sample, the appropriate statistical formula for calculating the allowance for sampling error (at the 95% confidence level) in a stratified sample with a disproportionate design is:

Although Table 5 above provides a useful approximation of the magnitude of expected sampling error, precise calculation of allowances for sampling error requires the use of this formula. To assess the design effect for sample estimates, we calculated sampling errors for the disproportionate sample for a number of key variables using the above formula. These estimates were then compared to the sampling errors for the same variables, assuming a simple random sample of the same size. The two strata (h1 and h2) in the disproportionate sample were all respondents age 16-39 and all respondents age 40 and over respectively. The proportion for the 16-39 year old stratum (w1) was 45.7 percent while the proportion for the 40 and over stratum (w2) was 54.3 percent.
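The stratified formula is not reproduced in this text. The sketch below uses the standard textbook variance of a stratified proportion, the sum over strata of w_h^2 * p_h(1-p_h)/(n_h-1), which may differ in detail from the formula the authors used. The stratum weights (0.457 and 0.543) come from the paragraph above; the stratum proportions and sample sizes are hypothetical and serve only to show how the comparison with a simple random sample is made.

```python
import math


def stratified_margin(strata, z=1.96):
    """Margin of error in percentage points for a stratified proportion.
    strata: iterable of (population weight, stratum proportion, stratum n)."""
    variance = sum(w ** 2 * p * (1 - p) / (n - 1) for w, p, n in strata)
    return 100 * z * math.sqrt(variance)


def srs_margin(p, n, z=1.96):
    """Margin of error for a simple random sample of the same total size."""
    return 100 * z * math.sqrt(p * (1 - p) / (n - 1))


# Weights from the text; proportions and sample sizes are hypothetical.
strata = [(0.457, 0.60, 2400),   # h1: age 16-39
          (0.543, 0.70, 1700)]   # h2: age 40 and over

overall_p = sum(w * p for w, p, _ in strata)
total_n = sum(n for _, _, n in strata)

design = stratified_margin(strata)
simple = srs_margin(overall_p, total_n)
print(f"stratified: {design:.2f}  SRS: {simple:.2f}  ratio: {design / simple:.3f}")
```

The ratio of the two margins is the kind of quantity summarized in the last column of Table 6.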

As shown in Table 6 below, the disproportionate sampling increases the confidence interval by about 2 percent, compared to a simple random sample of the same size. This means that the sample design introduces almost no measurable loss in sampling precision for total population estimates, while increasing the precision of sampling estimates for the target population aged 16-39 years old. Since the difference in sampling precision between the stratified disproportionate sample and a simple random sample is less than one tenth of a percentage point in each case, the sampling error table for a simple random sample will provide a reasonable approximation of the precision of sampling estimates in the survey.




TABLE 6
Design Effect on Confidence Intervals for Sample Estimates
Between Disproportionate Sample Used in Occupant Protection Survey
And a Proportionate Sample of Same Size

CONFIDENCE INTERVALS
PERCENTAGE POINTS ± AT 95% CONFIDENCE LEVEL

                                            HYPOTHETICAL     CURRENT DIS-     DIFFERENCE IN
                                            PROPORTIONATE    PROPORTIONATE    CONFIDENCE INTERVALS
                                            SAMPLING*        SAMPLING         ABOUT ESTIMATES
Driven in the past year                          .61              .63              +3.2%
Drunk alcohol in past year                      1.39             1.37              -1.3%
Always use safety belt                           .93              .94              +0.7%
Dislike seat belts                              1.55             1.61              +3.4%
Always use passenger belt (front)               1.40             1.40               0.0%
Favor (a lot) seat belt laws                    1.45             1.48              +2.0%
Secondary enforcement                           1.41             1.44              +2.0%
Ever ticketed by police for seatbelt             .85              .83              -2.6%
Recall Crash dummies                            1.11             1.17              +5.0%
Ever injured in vehicle accident                 .94              .97              +2.9%
Drives a car for work almost every day          2.64             2.76              +4.3%
Set a good example for others
  (reason for using seat belts)                 1.43             1.47              +2.6%
Driver-side only Air Bag in vehicle             2.04             2.08              +1.6%
Race: Black/African American                    0.66             0.65              -0.5%
Ethnicity: Hispanic                             0.63             0.61              -4.0%
Male/Female                                     1.08             1.10              +2.2%
AVERAGE DIFFERENCE IN CONFIDENCE INTERVALS                                         +1.94%

* Total sample proportions using SRS formula



Estimating Statistical Significance

The estimates of sampling precision presented in the previous section yield confidence bands around the sample estimates, within which the true population value should lie. This type of sampling estimate is appropriate when the goal of the research is to estimate a population distribution parameter. However, the purpose of some surveys is to provide a comparison of population parameters estimated from independent samples (e.g., annual tracking surveys) or between subsets of the same sample. In such instances, the question is not simply whether there is any difference in the sample statistics that estimate the population parameter, but whether the difference between the sample estimates is statistically significant (i.e., beyond the expected limits of sampling error for both sample estimates).

To test whether or not a difference between two sample proportions is statistically significant, a rather simple calculation can be made. Call the total sampling error (i.e., var (x) in the previous formula) of the first sample s1 and the total sampling error of the second sample s2. Then, the sampling error of the difference between these estimates is sd which is calculated as:

Any difference between observed proportions that exceeds sd is a statistically significant difference at the specified confidence level. Note that this technique is mathematically equivalent to generating standardized tests of the difference between proportions.
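The formula for sd is not reproduced in this text. A common form, assumed here, combines the two sampling errors as the square root of the sum of their squares; with each sampling error computed as 1.96 * sqrt(p(1-p)/(n-1)) at p = 0.5, this reproduces the entries of Table 7. The function names are hypothetical.

```python
import math


def srs_margin(p, n, z=1.96):
    """Sampling error in percentage points for a proportion p, sample size n."""
    return 100 * z * math.sqrt(p * (1 - p) / (n - 1))


def pooled_margin(s1, s2):
    """Sampling error of the difference between two independent estimates."""
    return math.sqrt(s1 ** 2 + s2 ** 2)


def is_significant(p1, n1, p2, n2):
    """True if the observed difference exceeds the pooled sampling error."""
    sd = pooled_margin(srs_margin(p1, n1), srs_margin(p2, n2))
    return abs(p1 - p2) * 100 > sd


# Check against Table 7 (P = Q = 0.5).
print(round(pooled_margin(srs_margin(0.5, 50), srs_margin(0.5, 50)), 1))      # 19.8
print(round(pooled_margin(srs_margin(0.5, 1000), srs_margin(0.5, 1000)), 1))  # 4.4
```

For example, with two subsamples of 1,000 cases each and proportions near 50%, a difference has to exceed about 4.4 percentage points to be significant under this rule.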

An illustration of the pooled sampling error between subsamples for various sizes is presented in Table 7. This table can be used to indicate the size of difference in proportions between drivers and non-drivers or other subsamples that would be statistically significant.



TABLE 7. Pooled Sampling Error Expressed as Percentages For Given Sample Sizes (Assuming P=Q)

Sample                                                Sample Size
Size    50   100   200   300   400   500   600   700   800   900  1000  1500  2000  2500  3000  3500  4000
4000  14.1  10.0   7.1   5.9   5.1   4.7   4.3   4.0   3.8   3.6   3.5   3.0   2.7   2.5   2.4   2.3   2.2
3500  14.1  10.0   7.1   5.9   5.2   4.7   4.3   4.1   3.8   3.7   3.5   3.0   2.7   2.6   2.4   2.3
3000  14.1  10.0   7.2   5.9   5.2   4.7   4.4   4.1   3.9   3.7   3.6   3.1   2.8   2.7   2.5
2500  14.1  10.0   7.2   6.0   5.3   4.8   4.5   4.2   4.0   3.8   3.7   3.2   2.9   2.8
2000  14.2  10.1   7.3   6.1   5.4   4.9   4.6   4.3   4.1   3.9   3.8   3.3   3.1
1500  14.2  10.2   7.4   6.2   5.5   5.1   4.7   4.5   4.3   4.1   4.0   3.6
1000  14.3  10.3   7.6   6.5   5.8   5.4   5.1   4.8   4.7   4.5   4.4
900   14.4  10.4   7.7   6.5   5.9   5.5   5.2   4.9   4.8   4.6
800   14.4  10.4   7.8   6.6   6.0   5.6   5.3   5.1   4.9
700   14.5  10.5   7.9   6.8   6.1   5.7   5.5   5.2
600   14.6  10.6   8.0   6.9   6.3   5.9   5.7
500   14.7  10.8   8.2   7.2   6.6   6.2
400   14.8  11.0   8.5   7.5   6.9
300   15.1  11.4   9.0   8.0
200   15.6  12.1   9.8
100   17.1  13.9
50    19.8



Statistical Comparisons Between Samples

In order to permit statistical comparisons between the two samples, the data sets from the two separate samples were merged together on like questions. The sample versions (1 for Safety Belt Usage and 2 for Child Safety Seats) were crosstabulated with each of the survey questions which had been asked in an equivalent fashion in the two samples. A chi square test was conducted for each of these crosstabulations to test for the independence of samples.

An exact test of independence was calculated to test the differences between the two samples. Pearson's chi square is a widely used statistic to test the hypothesis that the row and column variables are independent. It is calculated by summing over all cells the squared residuals divided by the expected frequencies. The calculated chi-square is compared to the critical points of the theoretical chi-square distribution to produce an estimate of how likely (or unlikely) this calculated value is, if the two variables are in fact independent. This probability is also known as the observed significance level of the test. If the probability is small (usually less than 0.05), the hypothesis that the two variables are independent is rejected.
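The Pearson chi-square calculation described above can be sketched in plain Python. The 2x2 counts are hypothetical (rows are the two survey versions, columns are yes/no answers to some common question); the p-value helper is valid only for one degree of freedom, where the upper-tail probability equals erfc(sqrt(chi2/2)).

```python
import math


def pearson_chi_square(table):
    """Sum over all cells of (observed - expected)^2 / expected.
    Returns the statistic and its degrees of freedom."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df


def p_value_df1(chi2):
    """Observed significance level for a chi-square with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))


# Hypothetical counts: Version 1 (20 yes, 80 no), Version 2 (40 yes, 60 no).
chi2, df = pearson_chi_square([[20, 80], [40, 60]])
p = p_value_df1(chi2)
print(f"chi-square = {chi2:.3f}, df = {df}, p = {p:.4f}")
```

With these hypothetical counts the probability falls below 0.05, so the hypothesis of independence would be rejected.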

No statistically significant difference (at the .05 level) was found between the two samples on most demographic characteristics, including race, ethnicity (Hispanic), educational attainment, and marital status. There is no difference between samples on most vehicle characteristics (e.g., airbags, type of seatbelts) or driver behaviors (e.g., drive every day, always wear seatbelts).

There is, nonetheless, a limited set of differences large enough to be statistically significant with samples of this size. The proportion of van/minivan drivers in Version 2 is lower (8.2%) than in Version 1 (10.8%). The proportion of primary vehicles with airbags is higher in Version 1 (53.9%) compared to Version 2 (50.6%). The proportion of those reporting they had been injured in an accident was higher in Version 2 (25.4%) than in Version 1 (22.9%); and those who said their injuries prevented them from performing activities for at least one week were higher in Version 1 (55.5%) than in Version 2 (49.8%). The reported income between the two survey versions was also found to be significantly different. All these differences are likely due to the large sample size.

Finally, although the proportion of past month drinkers is the same in the two surveys (48.8% to 47.6%), the proportion who report driving after drinking in the past month is significantly lower in the Child Safety Survey (20.9%) than the Seatbelt Survey (27.3%). This last difference probably reflects a contextual effect (willingness to report drinking and driving after an intense discussion of child safety in cars) rather than sample differences.


