Measuring of correlation communication




Standardization is the method of calculation of conditional (standardized) indices.

The essence of the standardization method consists in calculating conditional (standardized) indices, which replace the intensive or other quantities in cases where comparing those indices is complicated because the structures of the groups are not comparable.

The standardized indices are conditional because they indicate what the indices would have been if the influence of the factor that interferes with their comparison were absent; that is, they remove the influence of this or that factor on the true (real) indices. The standardized indices can be used only for the purpose of comparison, because they give no idea of the real size of the phenomenon.

There are different methods of calculation of the standardized indices. The most widespread method is the direct one.

The direct method of standardization is used when there are:

a) considerable differences in the levels of group indices (for example, different levels of lethality in hospitals or departments, different levels of morbidity for men and women, and others);

b) considerable heterogeneity of the populations being compared.

The standardized indices show what the true indices would have been if the influence of a certain factor had not been present. They make it possible to level out that influence on the indices.

Name the stages of the direct method of standardization.

Stage I: calculation of the general intensive indices (or averages) for the pair of populations being compared;

Stage II: choice and calculation of the standard. The half-sum of the two groups (populations) being compared is most often used as the standard;

Stage III: calculation of the "expected quantities" in every group of the standard;

Stage IV: determination of the standardized indices;

Stage V: comparison of the groups according to the intensive and standardized indices.
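The five stages can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two enterprises, their sex composition and the numbers of cases are invented, and the function names are hypothetical. The standard is taken as the half-sum of the two groups, as described in Stage II.

```python
# Hypothetical data (not from the source): morbidity in two enterprises.
# groups[name][stratum] = (number of workers, number of cases)
groups = {
    "A": {"men": (800, 40), "women": (200, 30)},
    "B": {"men": (200, 12), "women": (800, 128)},
}


def crude_rate(strata, base=100):
    """Stage I: general intensive index (cases per `base` persons)."""
    population = sum(n for n, _ in strata.values())
    cases = sum(c for _, c in strata.values())
    return base * cases / population


def half_sum_standard(a, b):
    """Stage II: the standard is the half-sum of the two groups, per stratum."""
    return {k: (a[k][0] + b[k][0]) / 2 for k in a}


def standardized_rate(strata, standard, base=100):
    """Stages III and IV: expected cases in the standard, then the index."""
    expected = {k: standard[k] * c / n for k, (n, c) in strata.items()}
    return base * sum(expected.values()) / sum(standard.values())


standard = half_sum_standard(groups["A"], groups["B"])

# Stage V: compare the intensive and the standardized indices.
for name, strata in groups.items():
    print(name, crude_rate(strata), standardized_rate(strata, standard))
```

With these invented figures the intensive indices differ twofold (7 and 14 per 100), while the standardized indices are close (10 and 11 per 100): most of the apparent difference comes from the different sex composition of the two groups.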


In the conclusions it must be noted that the standardized index is a conditional index, which answers only one question: what would the level of the phenomenon under study have been if the conditions of its origin had been standard.

Ordinary intensive indices characterize the true level (frequency) of the phenomenon, whereas standardized indices are conditional and may change depending on the standard chosen.



Standardization of indexes



Methods of Standardization














The stages of the direct method of standardization:

1. Calculation of general intensive (or average) indices in compared groups.

2. Choice and calculation of the standard.

3. Calculation of “expected” figures in every group of the standard.

4. Determination of standardized indices.

5. Comparison of the simple intensive and standardized indices. Conclusions.


Usage of standardized indices:

1.     Comparative evaluation of demographic indices in different age and social groups.

2.     Comparative analysis of morbidity in different age and social groups.

3.     Comparative evaluation of treatment quality in hospitals with a different mix of patients across departments.


The types of values used in science:

·        absolute – represent the absolute size of a phenomenon or environment

·        average – represent the variational type of distribution of characteristics

·        relative – represent the alternative type of distribution of characteristics

As we see, the average length of treatment in hospital №2 is much shorter than in hospital №1. But analysis of these figures by department shows that this conclusion is inaccurate.

In hospital №1 therapeutic patients prevail, and in hospital №2 gynecologic patients, and the lengths of treatment for these two groups differ substantially.
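The effect described above can be reproduced with a small sketch. All numbers below are invented for illustration (they are not the figures of the original table), and the names are hypothetical. The standardized average is a weighted mean of the department averages, with the half-sum of the two hospitals' patient numbers as weights.

```python
# Hypothetical data (not from the source):
# hospitals[name][department] = (number of patients, mean stay in days)
hospitals = {
    "hospital_1": {"therapeutic": (800, 14.0), "gynecologic": (200, 6.0)},
    "hospital_2": {"therapeutic": (200, 15.0), "gynecologic": (800, 7.0)},
}


def weighted_mean(means, weights):
    """Mean of department averages, weighted by patient numbers."""
    return sum(means[d] * weights[d] for d in means) / sum(weights.values())


def crude_mean(depts):
    """Ordinary average for the hospital, weighted by its own patient mix."""
    means = {d: m for d, (_, m) in depts.items()}
    own_mix = {d: n for d, (n, _) in depts.items()}
    return weighted_mean(means, own_mix)


# Standard patient mix: half-sum of the two hospitals, per department.
standard = {
    d: (hospitals["hospital_1"][d][0] + hospitals["hospital_2"][d][0]) / 2
    for d in hospitals["hospital_1"]
}


def standardized_mean(depts):
    """Average the hospital would show if it had the standard patient mix."""
    means = {d: m for d, (_, m) in depts.items()}
    return weighted_mean(means, standard)


for name, depts in hospitals.items():
    print(name, crude_mean(depts), standardized_mean(depts))
```

In this made-up case the ordinary average is lower in the second hospital (8.6 against 12.4 days), although the stay is one day longer in each of its departments; with a common patient mix the standardized averages are 11 and 10 days.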

One of the fundamentals of health situation analysis (HSA) is the comparison of basic health indicators. Among other objectives of HSA, this makes it possible to identify risk areas, define needs, and document inequalities in health, in two or more populations, in subgroups of a population, or in a single population at different points in time. Crude rates, whether they represent mortality, morbidity or other health events, are summary measures of the experience of populations that facilitate this comparative analysis. However, the comparison of crude rates can sometimes be inadequate, particularly when the population structures are not comparable for factors such as age, sex or socioeconomic level. Indeed, these and other factors influence the magnitude of crude rates and may distort their interpretation in an effect called confounding (Box 1) (1, 2, 3).

The calculation of specific rates in well defined subgroups of a population is a way of avoiding certain confounding factors. For example, specific rates calculated by age groups are often used to examine how diseases affect people differently depending on their age. However, although this uncovers the patterns of health events in the population and allows for more rigorous comparison of rates, it can sometimes be impractical to work with a large number of subgroups.(4) Furthermore, if the subgroups consist of small populations, the specific rates can be very imprecise. The process of standardization (or adjustment) of rates is a classic epidemiological method that removes the confounding effect of variables that we know — or think — differ in populations we wish to compare. It provides an easy to use summary measure that can be useful for information users, such as decision-makers, who prefer to use synthetic health indices in their activities.

In practice, age is the factor that is most frequently adjusted for. Age-standardization is particularly used in comparative mortality studies, since the age structure has an important impact on a population’s overall mortality. For example, in situations with moderate levels of mortality, as in the majority of the countries of the Americas, a population with an older age structure will always present higher crude rates than a younger population.

There are two main standardization methods, characterized by whether the standard used is a population distribution (direct method) or a set of specific rates (indirect method). The two methods are presented below.


Direct method

In the direct standardization method, we calculate the rate that we would expect to find in the populations under study if they all had the same composition with respect to the variable whose effect we wish to adjust for or control (such as age, socioeconomic group, or other characteristics). We use the structure of a population called the “standard”, stratified according to the control variable, and apply to it the specific rates of the corresponding strata in the population under study. We thus obtain the number of cases “expected” in each stratum if the populations had the same composition. The adjusted or “standardized” rate is obtained by dividing the total of expected cases by the standard population. An example is presented in Box 2.
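As a complement, the calculation just described can be written as a short Python sketch. It is a minimal illustration, not the content of Box 2: the age bands, the standard population and the two study populations are invented, and the function names are hypothetical.

```python
# Hypothetical standard population by age band (invented, sums to 100 000).
STANDARD = {"0-14": 30000, "15-44": 45000, "45-64": 18000, "65+": 7000}

# Hypothetical study populations: age band -> (population, deaths).
younger = {"0-14": (40000, 40), "15-44": (45000, 90),
           "45-64": (12000, 120), "65+": (3000, 150)}
older = {"0-14": (20000, 16), "15-44": (40000, 60),
         "45-64": (25000, 200), "65+": (15000, 600)}


def crude_rate(strata, base=100_000):
    """Total deaths divided by total population, per `base` persons."""
    population = sum(n for n, _ in strata.values())
    deaths = sum(d for _, d in strata.values())
    return base * deaths / population


def expected_cases(strata, standard):
    """Apply each stratum-specific rate to the standard population."""
    return {age: standard[age] * d / n for age, (n, d) in strata.items()}


def adjusted_rate(strata, standard, base=100_000):
    """Total of expected cases divided by the standard population."""
    expected = expected_cases(strata, standard)
    return base * sum(expected.values()) / sum(standard.values())


for label, strata in (("younger", younger), ("older", older)):
    print(label, crude_rate(strata), adjusted_rate(strata, STANDARD))
```

With these invented data the older population has the higher crude rate (876 against 400 per 100 000) but the lower adjusted rate (515.5 against 650 per 100 000), because its age-specific rates are lower in every band.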

An important step in the direct standardization method is the selection of a standard population. The value of the adjusted rate depends on the standard population used, but to a certain extent this population can be chosen arbitrarily, because there is no significance in the calculated value itself. Indeed adjusted rates are products of a hypothetical calculation and do not represent the exact values of the rates. They serve only for comparisons between groups, not as a measure of absolute magnitude. However, some aspects should be taken into account in the selection of the standard population. The standard population may come from the study population (sum or average for example). In this case however, it is important to ensure that the populations do not differ in size, since a larger population may unduly influence the adjusted rates. The standard population may also be a population without any relation to the data under study, but in general, its distribution with regard to the adjustment factor should not be radically different from the populations we wish to compare.

The comparative study of adjusted rates may be carried out in different ways: we can calculate the absolute difference between the rates, their ratio, or the percentage difference between them. Obviously, this comparison is valid only when the same standard was used to calculate the adjusted rates. When the national standards change (as in the United States in 1999 for example, when a new standard was adopted based on the 2000 population instead of the 1940 standard), the time series have to be recalculated at all levels. Updating the standard populations provides a more current common standard. For comparison of rates from different countries, the standard population used by WHO and PAHO is the so-called “old” standard population defined by Waterhouse. The age distribution of this population is shown in Box 3.
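The three comparison measures named above (absolute difference, ratio, percentage difference) are simple arithmetic, sketched here in Python. The class and function names are hypothetical, the rates are invented, and the percentage difference is taken relative to the second rate, which is an assumption about the convention. The check on the standard label mirrors the requirement that both rates be adjusted to the same standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AdjustedRate:
    value: float   # adjusted rate, for example per 1000 population
    standard: str  # label of the standard population that was used


def compare(a, b):
    """Return the difference, ratio and percentage difference of a versus b."""
    if a.standard != b.standard:
        # Rates adjusted to different standards are not comparable.
        raise ValueError("adjusted rates use different standard populations")
    return {
        "difference": a.value - b.value,
        "ratio": a.value / b.value,
        "percent_difference": 100 * (a.value - b.value) / b.value,
    }


# Invented rates, both adjusted to the same (hypothetical) standard.
result = compare(AdjustedRate(12.0, "standard-X"), AdjustedRate(10.0, "standard-X"))
print(result)
```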

The direct method is most often used. However, it requires rates specific to population strata corresponding to the variable of interest in all the populations we wish to compare, which are sometimes not available. Even when these specific rates are available for all the subgroups, they are sometimes calculated from very small numbers and can be very imprecise. In this case, the indirect standardization method is recommended.


Rates and standardization

Since epidemiology is concerned with the distribution of disease in populations, summary measures are required to describe the amount of disease in a population. There are two basic measures, incidence and prevalence.

Incidence, denoted by I, is a measure of the rate at which new cases of disease occur in a population, and is defined as

I = (number of new cases in a period of time) / (population at risk).

The period of time is specified in the units in which the rate is expressed. Often the rate is multiplied by a base such as 1000 or 1 000 000 to avoid small decimal fractions. For example, there were 280 new cases of cancer of the pancreas in men in New South Wales in 1997 out of a population of 3.115 million males. The incidence was 280/3.115 = 90 per million per year.

Prevalence, denoted by P, is a measure of the frequency of existing disease at a given time, and is defined as

P = (number of cases of disease at a given time) / (population at that time).

Both incidence and prevalence usually depend on age, and possibly sex, and sex- and age-specific figures would be calculated.

The prevalence and incidence rates are related, since an incident case is, immediately on occurrence, a prevalent case and remains as such until recovery or death (disregarding emigration and immigration). Provided the situation is stable, the link between the two measures is given by

P = It,                                                       (19.4)

where t is the average duration of disease. For a chronic disease from which there is no recovery, t would be the average survival after occurrence of the disease.
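The incidence example and relation (19.4) can be checked with a few lines of Python. The incidence figures are those quoted in the text; the mean duration of 0.5 years is an invented value used only to show how P = It is applied, and the function names are hypothetical.

```python
def incidence(new_cases, population_at_risk, base=1_000_000):
    """New cases per `base` persons at risk, per unit of time."""
    return base * new_cases / population_at_risk


def prevalence_from_incidence(incidence_rate, mean_duration):
    """Stable-situation link P = I * t; t is in the time unit of the rate."""
    return incidence_rate * mean_duration


# 280 new cases among 3.115 million males in one year (figures from the text).
i_per_million = incidence(280, 3_115_000)

# Hypothetical mean duration of disease: 0.5 years (assumption).
p_per_million = prevalence_from_incidence(i_per_million, 0.5)

print(round(i_per_million), round(p_per_million, 1))
```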


Problems due to confounding arise frequently in vital statistics and have given rise to a group of methods called standardization. We shall describe briefly one or two of the most well-known methods.

Mortality in a population is usually measured by an annual death rate — for example, the number of individuals dying during a certain calendar year divided by the estimated population size midway through the year. Frequently this ratio is multiplied by a convenient base, such as 1000, to avoid small decimal fractions; it is then called the annual death rate per 1000 population. If the death rate is calculated for a population covering a wide age range, it is called a crude death rate.

In a comparison of the mortality of two populations, say, those of two different countries, the crude rates may be misleading. Mortality depends strongly on age. If the two countries have different age structures, this contrast alone may explain a difference in crude rates (just as, in Table 15.6, the contrast between the ‘crude’ proportions with factor A was strongly affected by the different sex distributions in the disease and control groups). An example is given in Table 19.1 which shows the numbers of individuals and numbers of deaths separately in different age groups, for two countries: A, typical of highly industrialized countries, with a rather high proportion of individuals at the older ages; and B, a developing country with a small proportion of old people. The death rates at each age (which are called age-specific death rates) are substantially higher for B than for A, and yet the crude death rate is higher for A than for B.

Sometimes, however, mortality has to be compared for a large number of different populations, and some form of adjustment for age differences is required. For example, the mortality in one country may have to be compared over several different years; different regions of the same country may be under study; or one may wish to compare the mortality for a large number of different occupations. Two obvious generalizations are: (i) in standardizing for factors other than, or in addition to, age (for example, sex, as in Table 15.6); and (ii) in morbidity studies where the criterion studied is the occurrence of a certain illness rather than of death. We shall discuss the usual situation: the standardization of mortality rates for age.

The basic idea in standardization is that we introduce a standard population with a fixed age structure. The mortality for any special population is then adjusted to allow for discrepancies in age structure between the standard and special populations. There are two main approaches: direct and indirect methods of standardization. The following brief account may be supplemented by reference to Liddell (1960), Kalton (1968) or Hill and Hill (1991).

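The following notation will be used. For age groups $i = 1, \dots, k$, $N_i$ is the number of individuals in the standard population; in the special population, $n_i$ is the number of individuals, $r_i$ the number of deaths and $p_i = r_i / n_i$ the death rate.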


Direct method

In the direct method the death rate is standardized to the age structure of the standard population. The directly standardized death rate for the special population is, therefore,

$$p' = \frac{\sum_i N_i p_i}{\sum_i N_i}. \qquad (19.5)$$

It is obtained by applying the special death rates, $p_i$, to the standard population sizes, $N_i$. Alternatively, $p'$ can be regarded as a weighted mean of the $p_i$, using the $N_i$ as weights. The variance of $p'$ may be estimated as

$$\operatorname{var}(p') = \frac{\sum_i N_i^2 p_i q_i / n_i}{\left(\sum_i N_i\right)^2}, \qquad (19.6)$$

where $q_i = 1 - p_i$; if, as is often the case, the $p_i$ are all small, the binomial variance of $p_i$, $p_i q_i / n_i$, may be replaced by the Poisson term $p_i / n_i$ ($= r_i / n_i^2$), giving

$$\operatorname{var}(p') \simeq \frac{\sum_i N_i^2 r_i / n_i^2}{\left(\sum_i N_i\right)^2}. \qquad (19.7)$$

To compare two special populations, A and B, we could calculate a standardized rate for each ($p'_A$ and $p'_B$), and consider

$$p'_A - p'_B.$$

From (19.5),

$$p'_A - p'_B = \frac{\sum_i N_i (p_{Ai} - p_{Bi})}{\sum_i N_i},$$

which has exactly the same form as (15.15), with $w_i = N_i$ and $d_i = p_{Ai} - p_{Bi}$ as in (15.14). The method differs from that of Cochran’s test only in using a different system of weights. The variance is given by

$$\operatorname{var}(p'_A - p'_B) = \frac{\sum_i N_i^2 \operatorname{var}(d_i)}{\left(\sum_i N_i\right)^2},$$

with $\operatorname{var}(d_i)$ given by (15.17). Again, when the $p_{0i}$ are small, $q_{0i}$ can be put approximately equal to 1 in (15.17).

If it is required to compare two special populations using the ratio of the standardized rates, $p'_A / p'_B$, then the variance of the ratio may be obtained using (19.6) and (5.12).

The variance given by (19.7) may be unsatisfactory for the construction of confidence limits if the numbers of deaths in the separate age groups are small, since the normal approximation is then unsatisfactory and the Poisson limits are asymmetric. The standardized rate (19.5) is a weighted sum of the Poisson counts, ri. Dobson et al. (1991) gave a method of calculating an approximate confidence interval based on the confidence interval of the total number of deaths.
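A minimal Python sketch of the directly standardized rate and its Poisson-type variance follows. The data and the function names are invented for illustration. Two simplifications are assumptions of the sketch and not the method of the text: the variance of the difference is taken as the unpooled sum of the two variances (rather than the pooled form referred to via (15.17)), and the interval uses the normal approximation, which, as noted above, may be unsatisfactory when the numbers of deaths are small. The method of Dobson et al. (1991) is not implemented here.

```python
import math


def direct_standardized(std_sizes, sizes, deaths):
    """Return (p', var(p')) with the Poisson term r_i / n_i**2 for var(p_i)."""
    total = sum(std_sizes)
    strata = list(zip(std_sizes, sizes, deaths))
    rate = sum(N * r / n for N, n, r in strata) / total
    variance = sum(N**2 * r / n**2 for N, n, r in strata) / total**2
    return rate, variance


def normal_ci(estimate, variance, z=1.96):
    """Approximate 95% interval; a normal approximation (assumption)."""
    half_width = z * math.sqrt(variance)
    return estimate - half_width, estimate + half_width


# Hypothetical standard population sizes N_i for four age groups.
N = [30000, 45000, 18000, 7000]

# Hypothetical special populations A and B: sizes n_i and deaths r_i.
rate_a, var_a = direct_standardized(N, [40000, 45000, 12000, 3000], [40, 90, 120, 150])
rate_b, var_b = direct_standardized(N, [20000, 40000, 25000, 15000], [16, 60, 200, 600])

# Difference of standardized rates; unpooled variance (assumption).
diff = rate_a - rate_b
diff_ci = normal_ci(diff, var_a + var_b)

print(rate_a, rate_b, diff, diff_ci)
```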



Surveys to investigate associations

A question commonly asked in epidemiological investigations into the aetiology of disease is whether some manifestation of ill health is associated with certain personal characteristics or habits, with particular aspects of the environment in which a person has lived or worked, or with certain experiences which a person has undergone. Examples of such questions are the following.

1.     Is the risk of death from lung cancer related to the degree of cigarette smoking, whether current or in previous years?

2.     Is the risk that a child dies from acute leukaemia related to whether or not the mother experienced irradiation during pregnancy?

3.     Is the risk of incurring a certain illness increased for individuals who were treated with a particular drug during a previous illness?

Sometimes questions like these can be answered by controlled experimentation in which the presumptive personal factor can be administered or withheld at the investigator’s discretion; in example 3, for instance, it might be possible for the investigator to give the drug in question to some patients and not to others and to compare the outcomes. In such cases the questions are concerned with causative effects: ‘Is this drug a partial cause of this illness?’ Most often, however, the experimental approach is out of the question. The investigator must then be satisfied to observe whether there is an association between factor and disease, and to take the risk which was emphasized in §7.1 if he or she wishes to infer a causative link.

These questions, then, will usually be studied by surveys rather than by experiments. The precise population to be surveyed is not usually of primary interest here. One reason is that in epidemiological surveys it is usually administratively impossible to study a national or regional population, even on a sample basis. The investigator may, however, have facilities to study a particular occupational group or a population geographically related to a particular medical centre. Secondly, although the mean values or relative frequencies of the different variables may vary somewhat from one population to another, the magnitude and direction of the associations between variables are unlikely to vary greatly between, say, different occupational groups or different geographical populations.

There are two main designs for aetiological surveys: the case-control study, sometimes known as a case-referent study, and the cohort study. In a case-control study a group of individuals affected by the disease in question is compared with a control group of unaffected individuals. Information is obtained, usually in a retrospective way, about the frequency in each group of the various environmental or personal factors which might be associated with the disease. This type of survey is convenient in the study of rare conditions which would appear too seldom in a random population sample. By starting with a group of affected individuals one is effectively taking a much higher sampling fraction of the cases than of the controls. The method is appropriate also when the classification by disease is simple (particularly for a dichotomous classification into the presence or absence of a specific condition), but in which many possible aetiological factors have to be studied. A further advantage of the method is that, by means of the retrospective enquiry, the relevant information can be obtained comparatively quickly.

In a cohort study a population of individuals, selected usually by geographical or occupational criteria rather than on medical grounds, is studied either by complete enumeration or by a representative sample. The population is classified by the factor or factors of interest and followed prospectively in time so that the rates of occurrence of various manifestations of disease can be observed and related to the classifications by aetiological factors. The prospective nature of the cohort study means that it will normally extend longer in time than the case-control study and is likely to be administratively more complex. The corresponding advantages are that many medical conditions can be studied simultaneously and that direct information is obtained about the health of each subject through an interval of time.

Case-control and cohort studies are often called, respectively, retrospective and prospective studies. These latter terms are usually appropriate, but the nomenclature may occasionally be misleading since a cohort study may be based entirely on retrospective records. For example, if medical records are available of workers in a certain factory for the past 30 years, a cohort study may relate to workers employed 30 years ago and be based on records of their health in the succeeding 30 years. Such a study is sometimes called a historical prospective study.

A central problem in a case-control study is the method by which the controls are chosen. Ideally, they should be on average similar to the cases in all respects except in the medical condition under study and in associated aetiological factors. Cases will often be selected from one or more hospitals and will then share the characteristics of the population using those hospitals, such as social and environmental conditions or ethnic features. It will usually be desirable to select the control group from the same area or areas, perhaps even from the same hospitals, but suffering from quite different illnesses unlikely to share the same aetiological factors. Further, the frequencies with which various factors are found will usually vary with age and sex. Comparisons between the case and control groups must, therefore, take account of any differences there may be in the age and sex distributions of the two groups. Such adjustments are commonly avoided by arranging that each affected individual is paired with a control individual who is deliberately chosen to be of the same age and sex and to share any other demographic features which may be thought to be similarly relevant.

The remarks made in §19.2 about non-sampling errors, particularly those about non-response, are also relevant in aetiological surveys. Non-responses are always a potential danger and every attempt should be made to reduce them to as low a proportion as possible.