Model answers to examination questions 2003

Part A - Basic concepts and principles

Question 1

Select a disease for each of the following a), b) and c) and write brief notes on the important features of its epidemiology and its control, management or eradication:

Question 2

Briefly describe the essential features and application of THREE (3) of the following:

Question 3

Write brief notes to demonstrate your understanding of THREE (3) of the following:

Question 4

Using examples, write brief notes on THREE (3) of the following:


Part B - Practice and applications

Question 1

You have been asked to provide some assistance with the analysis of data from a pilot cross-sectional study of cattle from a number of herds and a number of geographic regions. The objective of the study was to identify possible risk factors for a particular disease. The investigators plan to use the results of the study as the basis for a more searching and expensive case-control study.

Explain the steps you would follow in analysing these data. Include in your answer the assumptions of any models, and reasons why you would choose them.

Outcome variable:
  Disease: Present or absent

Explanatory variables:
  Sex: Male (bulls and steers) or Female
  Breed: Poll Hereford, Black Angus, Santa Gertrudis, crossbred
  Age: Months
  Weight: Kilograms
  Herd size: Number of head of cattle
  Region: Temperate, Subtropical, Semi-arid

Answer provided by Jenny Weston.

It is important to carry out initial exploratory data analysis to check that all values are valid, to assess the distribution of each variable and to decide whether any need transformation before further analysis. A good case definition for the disease needs to be established, including whether an animal can have repeat episodes of disease, as this may influence further analysis. There are likely to be clustering effects for animals within the same herd, as management practices and disease status on the farm are likely to have effects above and beyond those attributed to other variables.

The following explanatory variables are categorical and can be used to calculate measures of association (risk of disease according to explanatory variable): sex, breed and region. The other explanatory variables (age, weight and herd size) are continuous variables and could be considered as such or determined as categorical variables though this would lose some of the precision of the data.

After the data have been through this screening process it would be prudent to develop null and alternative hypotheses and then run some simple analyses, such as chi-squared tests of categorical variables such as sex and breed (is there any difference between observed and expected values?). Similarly, Student's t-test can be used to compare the age distribution between diseased and non-diseased animals. At this stage it would be best to use a generous p-value threshold such as 0.1 or 0.2 before eliminating variables of interest.

It would be necessary to test for interaction and confounding by stratifying and using the Mantel-Haenszel technique. There is likely to be interaction between some of these variables (such as age and weight). Confounding can be assessed (though there is no statistical test) by looking at how coefficients change as further variables are added to the model. As a rule of thumb, if a coefficient changes by more than 10-20% when other variables are included in the model, this is an indication that confounding is occurring.
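The rule of thumb above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the change is expressed relative to the crude (unadjusted) coefficient, which is one common convention rather than the only one, and the 10% default threshold is simply the lower end of the range quoted above.

```python
def percent_change(crude: float, adjusted: float) -> float:
    """Relative change (%) in a coefficient after another variable is added.

    Assumption: the change is measured relative to the crude coefficient.
    """
    return abs(adjusted - crude) / abs(crude) * 100.0


def suggests_confounding(crude: float, adjusted: float,
                         threshold_pct: float = 10.0) -> bool:
    """True if the coefficient moved by more than the chosen threshold."""
    return percent_change(crude, adjusted) > threshold_pct
```

For example, a (hypothetical) coefficient that moves from 0.50 to 0.40 when a second variable is added has changed by about 20%, which would flag possible confounding under either end of the 10-20% range.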

It would be best to use multivariate logistic regression analysis to consider a variety of variables at once, as we can't assume that there will always be a linear relationship between the explanatory variables and the outcome variable. This is a more complex method than the simpler analytical measures of association and requires an appropriate computer programme to "fit" the best model to the data. Parameters for the possible interactions will need to be included and then assessed for significance. An assessment of how well the model fits the data is performed by analysis of the residuals.

Assumptions:

Although the outcome variable that is measured is dichotomous (disease present or absent, so 1 or 0), the result that comes from this analysis will be a risk or probability of disease among animals with that set of explanatory variables.
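To illustrate how a 0/1 outcome leads to a predicted probability, the sketch below applies the standard logistic (inverse logit) transformation to a linear predictor. The function name and arguments are hypothetical, and no coefficients are taken from the study; any values passed in are placeholders for whatever a fitted model would supply.

```python
import math


def predicted_probability(intercept, coefficients, values):
    """Probability of disease from a logistic model.

    intercept, coefficients: hypothetical fitted parameters (log-odds scale).
    values: the animal's explanatory variables, in the same order.
    """
    logit = intercept + sum(b * x for b, x in zip(coefficients, values))
    return 1.0 / (1.0 + math.exp(-logit))


def odds_ratio(coefficient):
    """Odds ratio for a one-unit increase in an explanatory variable."""
    return math.exp(coefficient)
```

The prediction always lies between 0 and 1, whatever the values of the explanatory variables, which is why the model suits a dichotomous outcome.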

Finally, it would be worthwhile to consider Hill's criteria for causation as all the data has shown so far is an association which is not the same thing! Need to consider:

Question 2

Virulent footrot (VFR) of sheep is considered by many to be a significant disease that is of economic importance to flock owners and the sheep industry as a whole. However, there are others who consider it to be of no importance. You have been asked to provide advice to the Animal Health authorities on the merits of proceeding with a control program. Describe the factors that you would consider and outline any activities you would implement to assist in reaching a recommendation.

Answer provided by Jenny Weston:

The merits of a disease control program depend upon: (1) the frequency of disease in the population, (2) the economic impact of the disease, and (3) the effect of the disease on the region's ability to trade.

The answers to these questions may not be straightforward because:

  1. The disease may occur at varying severity.
  2. The disease may occur more than once during the life of the animal.
  3. In some flocks recognition of the condition may be poor.
  4. The condition may be mistaken for other conditions.

Firstly, a sound case definition must be described. What is virulent footrot? What isn't virulent footrot? Methods of diagnosing the condition must be described. That is, what test are we using? For example:

Once a case definition and test with known performance (Se and Sp) are defined there are a number of options. Depending on the current state of knowledge:

  1. A descriptive study of outbreak farms (farms known or thought to have a problem). This would illuminate the natural history of the condition. Additional questions include: what host factors are important (e.g. age, sex, breed), what agent factors are important (e.g. are different strains of the organism having different virulence/pathogenicity characteristics present, how well does the organism survive in the environment, can recurrent infections occur), what environmental factors are present (e.g. under what conditions does infection survive and spread, is the disease seasonal?), what management factors are important? An understanding of some of the patterns of disease occurrence will hopefully follow.
  2. A cross-sectional survey may be possible, either by mail or from farmer or veterinarian records. The objective could be to determine the farm-level (between-farm) prevalence of the disease. This could also shed light upon spatial patterns of disease.
  3. For within-farm prevalence and measures of incidence and production effects, the most appropriate study would be to follow cohorts of sheep from weaning to slaughter on individual randomly selected affected properties, i.e. a cohort study.
  4. Presence of virulent footrot (yes/no) would be the dichotomous exposure variable and a quantitative measure of production the outcome variable.
  5. Once measures of association/production difference are quantified it may be possible to further define the financial measure of disease and/or disease control, e.g. by using a partial budget.

Question 3

Babesia gibsoni has recently been found in Pit Bull terriers in Victoria. However, little is known about its distribution or prevalence in Australia. You have therefore been asked to design a study to identify the prevalence of Babesia gibsoni in the Australian dog population.

Answer provided by Ian Langstaff.

Type of study: observational study, cross sectional survey of dog population.

Strengths:

Weaknesses:

Study design: Observational study, cross sectional survey of dog population.

Study objective: Estimate prevalence of infection in the Australian dog population and detect the distribution across Australian states and territories.

Hypothesis: The prior estimate of prevalence is x. The infection is absent from x, y, z states of Australia.

Unit of interest: Australian dogs.

Reference population: All Australian dogs.

Study population: Chosen as dogs registered with a veterinary clinic and dogs registered at a pound/animal refuge.

Design and sampling methods: Multistage method.

Sample size is calculated using prior estimates of prevalence, required confidence and precision of the estimate. In order to determine geographic distribution across states and territories (presence/absence) the sample number for each state/territory needs to exceed the number required to detect infection at an estimated level of disease. Equations exist for calculating sample size at the 3 stages and detecting infection above a minimum expected prevalence.
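The two kinds of calculation mentioned above can be sketched as follows. This is a simplified illustration, not the multistage calculation itself: it assumes simple random sampling from a large population and a perfect test, and it ignores any design effect from clustering. The function names are hypothetical.

```python
import math
from statistics import NormalDist


def n_to_estimate_prevalence(expected_prev, precision, confidence=0.95):
    """Sample size to estimate a prevalence to within +/- precision.

    Uses n = z^2 * p * (1 - p) / d^2 (large population, simple random sample).
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z ** 2 * expected_prev * (1 - expected_prev) / precision ** 2)


def n_to_detect_infection(design_prev, confidence=0.95):
    """Sample size to find at least one positive if prevalence >= design_prev.

    Uses n = ln(1 - confidence) / ln(1 - p) (large population, perfect test).
    """
    return math.ceil(math.log(1 - confidence) / math.log(1 - design_prev))
```

For instance, `n_to_estimate_prevalence(0.5, 0.05)` gives 385 and `n_to_detect_infection(0.05)` gives 59; the prevalence values here are placeholders, not estimates for Babesia gibsoni.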

Data collection is suggested to occur when healthy animals are sampled for screening, e.g. at vaccination (clinics) and rehoming (pounds), and when animals are presented for illness (clinics). The advantage is that samples are easy and cheap to acquire at these times, as other samples are being collected. The disadvantage is that there is opportunity for bias when dogs are selected to be sampled. The method suggested is systematic random sampling; however, little control is possible to ensure this occurs. Alternatively, blood samples reaching laboratories could be used for testing, but significant bias would exist with this method.

Bias is predominantly selection bias and can be controlled with careful study design. Other biases include misclassification bias, if test sensitivity and specificity are not 100%, and non-response bias, if the owners of selected dogs choose not to participate. Knowledge of test sensitivity and specificity can be used to estimate true prevalence from the apparent prevalence observed. Response bias may be controlled if the factors behind poor response are known and can be mitigated.
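One widely used way of correcting an observed prevalence for imperfect test performance is the Rogan-Gladen estimator, sketched below. The answer does not name a specific method, so this is offered as an assumption about what such a correction could look like; the function name is hypothetical.

```python
def true_prevalence(apparent_prev, sensitivity, specificity):
    """Rogan-Gladen estimate of true prevalence.

    TP = (AP + Sp - 1) / (Se + Sp - 1), truncated to the range 0..1.
    """
    if sensitivity + specificity <= 1:
        raise ValueError("Se + Sp must exceed 1 for the test to be informative")
    estimate = (apparent_prev + specificity - 1) / (sensitivity + specificity - 1)
    return min(1.0, max(0.0, estimate))
```

With a perfect test (Se = Sp = 1) the estimate equals the observed prevalence; with imperfect specificity, part of the observed prevalence is attributed to false positives and the estimate is lower.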

Data analysis would include calculation of proportions of infected dogs. Proportions and their confidence intervals can be calculated for regions, rural/urban settings and clinic/pound. Exceeding a cut point number of positives in each state would indicate if infection was detected at or above the minimum prevalence chosen, thus rejecting the null hypothesis.
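A confidence interval for a proportion can be calculated in several ways. The sketch below uses the Wilson score interval, chosen here because it behaves sensibly when the number of positives is small or zero; the answer does not specify a method, and the function name is hypothetical. It treats the sample as a simple random sample, so it does not allow for clustering by clinic or pound.

```python
import math
from statistics import NormalDist


def wilson_interval(positives, n, confidence=0.95):
    """Wilson score confidence interval for a proportion (positives / n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p = positives / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half_width, centre + half_width
```

The function would be applied separately to each region, setting or source of dogs to give the stratum-specific intervals described above.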

Question 4

You have been asked to consider lamb mortalities on a property divided by a roadway into two blocks - one block is hilly and the other is flatter and prone to flood. The owner believes that lambs raised on the flatter block are more likely to die than lambs raised on the hilly block. Because he believes them to be hardier, the owner tends to put more wethers on the flatter block than ewe lambs.

Part A (5 marks) - The owner has a total of 100 lambs, evenly split between the two blocks. Of these, he has observed 15 dead on the flatter block and 10 dead on the hilly block. Using these data and a contingency table, calculate the relative risk of mortality for lambs born on the flatter (versus hilly) block.

             Dead  Alive  Total
Flat block     15     35     50
Hilly block    10     40     50
Total          25     75    100

Relative risk of death = (15/50) ÷ (10/50)
Relative risk of death = 0.3 ÷ 0.2
Relative risk of death = 1.5

This means that the risk of dying is 1.5 times greater for lambs in the flatter paddocks. However, the 95% confidence interval for this RR is 0.75 to 3.01, which includes 1.0, so the result is not statistically significant. A larger sample size, or a more marked difference between the two areas, would be needed to show an association between grazing area and mortality.
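The relative risk and its 95% confidence interval can be checked with a short calculation. The sketch below uses the usual log-based (Wald) interval for a risk ratio, which is an assumption about how the quoted interval was obtained; the function name is hypothetical.

```python
import math


def relative_risk(a, n1, c, n0, z=1.96):
    """Risk ratio with a log-based confidence interval.

    a / n1: deaths and total in the exposed group (flat block).
    c / n0: deaths and total in the unexposed group (hilly block).
    """
    rr = (a / n1) / (c / n0)
    se_log_rr = math.sqrt(1 / a - 1 / n1 + 1 / c - 1 / n0)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


rr, lower, upper = relative_risk(15, 50, 10, 50)
print(round(rr, 2), round(lower, 2), round(upper, 2))  # 1.5 0.75 3.01
```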

Part B (15 marks) - Explain how stratified analysis can be used to determine whether a third dichotomous variable (for example, sex of lamb) might confound or otherwise modify the effect of one dichotomous variable on another. Include in your answer, how you would use stratified analysis to delineate between confounding and effect modification. If only confounding is occurring, how might the data be re-analysed?

Stratified analysis can be used to determine whether there is a mixing of effects when more than one explanatory variable is being considered. By dividing the data into homogeneous subgroups it is possible to derive an adjusted measure of association. Most commonly the Mantel-Haenszel method is used to calculate a single unconfounded measure of association. This method assumes that the measures of association within strata are uniform, and this allows us to combine stratum-specific measures to form a single summary measure of association. By stratifying the results we can calculate a weighted average of the stratum-specific risk ratios based on the precision and importance of the data within each stratum.

In this example it might be that the sex of the lamb has some effect on survival and so the number of deaths might be due to more than just the terrain of the paddocks. Female lambs may have a lower birthweight or poorer survival. The farmer had already stated that he believed the male lambs to be hardier and so put more of them on the area that he considered "dangerous". By taking any gender differences into account (i.e. by analysing them separately) we can see if there is something about gender that affects survival and therefore mortality is due to factors other than paddock terrain.

When the stratum-specific measures are calculated it is possible to simply "eyeball" the results to see if there appears to be a difference in the rates between the genders. This gives us a hint as to whether there is some effect of that variable, but it is necessary to proceed to calculate the MH summary relative risk and a 95% confidence interval for this. It is also necessary to test that the assumption of homogeneity holds for the data, and this can be done with a formal test. If the measure of association is not homogeneous across the strata (some statistical variation is allowed), this indicates effect modification: the stratum-specific results are reported and there is no point in going on to calculate a single summary measure.

Part C (10 marks) - The owner now tells you that 30 of his 55 wethers were sent to the flatter block. Of these, 6 subsequently died, whilst only 4 wethers from the hilly block died. He also tells you that 9 of the ewe lambs on the flatter block died. Use a stratified analysis and the data above to determine whether the sex of the lambs is likely to be confounding or otherwise modifying the effect of block on lamb mortality.

Stratify by sex of the lamb:

Wethers      Dead  Alive  Total
Flat block      6     24     30
Hilly block     4     21     25
Total          10     45     55

Ewes         Dead  Alive  Total
Flat block      9     11     20
Hilly block     6     19     25
Total          15     30     45

RR of death for wethers = (6/30) ÷ (4/25)
RR of death for wethers = 0.20 ÷ 0.16
RR of death for wethers = 1.25

RR of death for ewes = (9/20) ÷ (6/25)
RR of death for ewes = 0.45 ÷ 0.24
RR of death for ewes = 1.88

There appears to be effect modification due to gender (although the confidence intervals overlap). However, gender also affected allocation to the two different blocks, so gender is confounding. If there is only confounding then there are two ways of dealing with it: (1) adjust for the confounding variable in the analysis, e.g. use adjusted rates specific to the confounder or produce a summary odds ratio combined across each level of the confounder (the Mantel-Haenszel method); or (2) re-run the trial and match the two groups on the confounder during the design of the study.
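As an illustration of the adjustment step, the sketch below calculates a Mantel-Haenszel summary measure from the two sex strata. It uses the risk ratio form of the estimator (rather than the odds ratio) so that it is comparable with the relative risks calculated in Part A and Part C; that choice, and the function name, are assumptions made for this example.

```python
def mantel_haenszel_rr(strata):
    """Mantel-Haenszel summary risk ratio.

    strata: list of (a, n1, c, n0) tuples, where a / n1 are deaths and total
    on the flat block and c / n0 are deaths and total on the hilly block.
    """
    numerator = sum(a * n0 / (n1 + n0) for a, n1, c, n0 in strata)
    denominator = sum(c * n1 / (n1 + n0) for a, n1, c, n0 in strata)
    return numerator / denominator


# Counts from Part C: wethers first, then ewes.
strata = [(6, 30, 4, 25), (9, 20, 6, 25)]
crude_rr = (15 / 50) / (10 / 50)
adjusted_rr = mantel_haenszel_rr(strata)
print(round(crude_rr, 2), round(adjusted_rr, 2))  # 1.5 1.59
```

Comparing the crude and adjusted values shows how much of the apparent block effect changes once sex is taken into account.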

Question 5

Cattle farmers in an agricultural area are asking local veterinarians about the need for selenium supplementation. The area has a history of clinical selenium deficiency but trials have been inconsistent in regard to the benefits of supplementation. The range of products available to supplement with selenium has, in recent years, included fertilizer additives that have been widely adopted. However, some farmers have not supplemented at all and report no ill effects. As the local epidemiologist you have been asked to provide advice on the recommendations that should be promoted to cattle producers. Describe what you would do to enable you to provide that advice.

Answer provided by Jenny Weston:

The question that requires answering is: is there a need for Se supplementation? There is reported to be a history of clinical selenium deficiency in the area. Before accepting this, we need to know:

Other descriptive information may be useful eg review the natural history of the disease - from this it may be possible to determine which age groups of animals are most susceptible and under what conditions they are most likely to show disease.

As fertilizer supplementation is the recognized method of addition of the element it may be possible to design a randomized clinical trial:

  1. Specify an aim: to determine if Se supplementation is beneficial.
  2. Specify a null hypothesis eg there is no difference in the outcome (clinical (or subclinical) Se deficiency) between the exposed (fertilized) and unexposed (non supplemented) farms.
  3. Select farms in an area known to have deficient soil. These are the reference population.
  4. Randomly select farms (experimental population) from that area.
  5. Select enough farms to allow a predefined level of power e.g. 80%.
  6. Add fertilizer to all farms. Half of the farms are selected at random to have Se added to the fertilizer, and this is done in a blinded fashion.
  7. The outcome to be measured is the number of experimental units (animals of the predefined host factor status) that develop signs of selenium deficiency fulfilling our case definition, e.g. clinical and/or laboratory positives.
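The power requirement in step 5 can be turned into a number once the expected proportions in each group are assumed. The sketch below uses the standard normal-approximation formula for comparing two proportions. It is a simplification: it treats each unit as independent, so it ignores the clustering of animals within farms, and the proportions passed in are hypothetical planning values, not data from the area. The function name is also hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group(p1, p2, alpha=0.05, power=0.80):
    """Units needed per group to detect a difference between p1 and p2.

    n = (z_alpha/2 + z_beta)^2 * [p1(1-p1) + p2(1-p2)] / (p1 - p2)^2
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)
```

For example, if 20% of unsupplemented units and 10% of supplemented units were expected to show deficiency, `n_per_group(0.20, 0.10)` gives 197 per group; a smaller expected difference requires more units.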

The results will allow us to either reject or fail to reject our initial null hypothesis, i.e. to determine if there is an association between Se supplementation and the development of clinical Se deficiency.