Cohort studies, also called longitudinal studies, are an important tool of epidemiology and medical research.
This chapter contains the following sections:
- Incidence and Prevalence
- The Basics of Cohort Studies
- Case Control Studies
- Association and Causation
- Further Reading
- Site Index
These studies help us to find what is good or bad for us in such matters as smoking, diet, exercise and lifestyle. They are not simple and must be done diligently if they are to be of value. Furthermore, just because two things are apparently linked does not mean that one causes the other. We shall look at association and causation too. There are examples from some of the classic studies, including the ones that elucidated the risks of smoking, the male doctors study, the Million Women Study and the Framingham Study.
Incidence and Prevalence
Two terms which describe the frequency of any condition are incidence and prevalence. They are not the same.
- Incidence is the number of new cases arising in a population over a given period, usually a year
- Prevalence is the number of people with that condition at any moment in time
Either term may refer to an infectious disease or to a chronic condition such as asthma, heart disease or osteoarthritis. Each may be expressed as per cent, per 1,000, per 100,000 or per million, depending upon how common the condition is. For a short-lived infectious disease, incidence will be higher than prevalence, but for chronic conditions that last for many years, prevalence will be higher. For the common cold, the incidence may be around 100% a year: most people get at least one cold a year, and those who escape are matched by those who get more than one. Prevalence of many diseases increases with age. Examples are osteoarthritis and diabetes.
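When a condition is in a roughly steady state, the two measures are linked by a well-known approximation: prevalence is about incidence multiplied by average duration. The following is a minimal sketch of that arithmetic. The one-week duration of a cold and the figures for the chronic condition are illustrative assumptions, not data from any study.

```python
def approx_prevalence(incidence_per_year: float, mean_duration_years: float) -> float:
    """Steady-state approximation: prevalence ~ incidence x average duration.

    Reasonable only when the condition is fairly uncommon at any one moment
    and both incidence and duration are stable over time.
    """
    return incidence_per_year * mean_duration_years


# Common cold: incidence about 100% a year (from the text);
# a duration of about one week is an illustrative assumption.
cold_prevalence = approx_prevalence(1.0, 1 / 52)

# Hypothetical chronic condition: 0.5% new cases a year, lasting 20 years on average.
chronic_prevalence = approx_prevalence(0.005, 20)

print(f"Cold: incidence 100% a year, prevalence about {cold_prevalence:.1%}")
print(f"Chronic condition: incidence 0.5% a year, prevalence about {chronic_prevalence:.0%}")
```

The short illness has a high incidence but a low prevalence, while the long-lasting condition shows the reverse.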
The Basics of Cohort Studies
This type of study takes a group of people and follows them over a period of time. The name “cohort” means a representative group of people. A cohort was a tenth part of a Roman legion, representing between 300 and 600 infantry. The term “longitudinal” is used because the study follows the subjects through time. This is an epidemiological study but epidemiology consists of rather more than this. Epidemiology deals with the causes, distribution, and control of disease in populations.
A cohort study may examine criteria such as smoking habits, blood pressure or blood cholesterol and see how these predict events such as heart attacks or death. They may also predict good outcomes such as a long and healthy life. Studies may start at the current time and look backwards. These are called retrospective studies. Alternatively, they may start at the current time and follow up the subjects over a number of years. These are called prospective studies and are usually better respected as there is less risk of bias creeping in. Bias is not the same as fraud. It is usually unintentional and the researchers may be unaware of it but it does make results less reliable.
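The usual summary of a cohort study is the relative risk: the risk of the event in the exposed group divided by the risk in the unexposed group. The following is a minimal sketch of the calculation. The function name and all the counts are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def relative_risk(exposed_events: int, exposed_total: int,
                  unexposed_events: int, unexposed_total: int) -> float:
    """Risk in the exposed group divided by risk in the unexposed group."""
    risk_exposed = exposed_events / exposed_total
    risk_unexposed = unexposed_events / unexposed_total
    return risk_exposed / risk_unexposed


# Hypothetical cohort followed for 10 years:
# 1,000 smokers, of whom 30 have a heart attack (risk 3%);
# 4,000 non-smokers, of whom 40 have a heart attack (risk 1%).
rr = relative_risk(30, 1000, 40, 4000)
print(f"Relative risk of heart attack in smokers: {rr:.1f}")
```

A relative risk of 1 would mean no difference between the groups. A value above 1 means that the event is more common in the exposed group, although this shows association only.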
Good studies should have large numbers. The subjects should be either a homogeneous or a representative cohort. A homogeneous group is a very even one. A landmark example is Sir Richard Doll’s follow up of male doctors, which started in the 1950s. A representative cohort is a group that contains a good cross section of the society to be studied. However, there can be advantages to taking a narrow group to avoid confounding factors that will give false correlations. During the study few should be lost to follow up. Follow up should be over a reasonable length of time.
Sir Richard Doll (left) and Sir Austin Bradford Hill (right), two of the great doyens of epidemiological
Sir Richard Doll chose to study a large group of male doctors, looking particularly at the effect of smoking habits. Choosing just doctors gave a close socio-economic group. He selected only male doctors, but at the time female doctors amounted to only about 10% of the profession and they may have been more difficult to follow up with possible change of name and taking time out of practice to raise a family. Nowadays “equal opportunities” has produced a medical school intake that is up to 70% female. Equal has a new meaning. He started with 34,439 doctors in 1951. Very few of Sir Richard’s subjects were lost and his last publication in 2004 described 50 years of follow up. By 2001, only 5,902 were known to be alive and 25,346 were known to have died. Only 248 subjects were untraced. This is quite exceptional [1]. Unfortunately we can expect no further work from Sir Richard Doll as he died in 2005 at the age of 92.
An example of a retrospective study came after the allegation that the MMR vaccine was associated with autism. It examined more than 500,000 children born in Denmark from January 1991 to December 1998 of whom 82% had received the MMR vaccine. It concluded that there was no association between the MMR vaccine and autism [2]. This cohort study of more than half a million children is far more impressive than a case report of a mere dozen and yet few people know about it. Since then, mostly the same team has produced a prospective study of 650,000 children [3]. This was not published until 2019 as it took time to conduct the study. It really is astounding that some people are more impressed by one small and fraudulent report than by such diligent research but some people will only believe what they want to believe.
Another excellent example is the “Million Women Study”. This was an incredible achievement in terms of scale and examined the effects of hormone replacement therapy (HRT). One paper looked at the risk of cancer of the body of the uterus (endometrial carcinoma). There were 716,738 postmenopausal women in the UK who had not had previous cancer or a previous hysterectomy who were recruited into the Million Women Study in 1996-2001. This is rather short of a million on this occasion but still an impressive number. They provided information about their use of HRT and other personal details, and were followed up for an average of 3.4 years. During that time 1,320 cases of endometrial cancer were diagnosed [4]. The Million Women Study also looked at HRT and breast cancer and this time there were more than a million participants. There were 1,084,110 British women aged 50 to 64 years who were recruited into the Million Women Study between 1996 and 2001. They provided information about their use of HRT and other personal details and were followed up for cancer incidence and death [5].
If you look up this last reference on the Internet, the page shows some interesting features that illustrate how opinion is formed and good scientific practice. One says: Erratum in Lancet. 2003 Oct 4;362(9390):1160. What this means is that after the paper had been published the authors realised that they had made a mistake. It was probably only a slight mistake but they made a point of admitting it and getting it corrected. The other point is that there are seven links to the Lancet in the following weeks where various people have made comments about the paper, possibly criticisms, and the authors have replied. This open forum of discussion enables all readers to see how expert opinion is formed.
Even cohort studies can be subject to meta-analysis, giving a large and powerful number. One such analysis examined studies of the effect of HRT on breast cancer dating from between the beginning of 1997 and the end of 2017 [6]. It concluded that if these associations are largely causal, then for women of average weight in developed countries, 5 years of HRT, starting at age 50 years, would increase breast cancer incidence at ages 50–69 years by about one in every 50 users of oestrogen plus daily progestagen preparations; one in every 70 users of oestrogen plus intermittent progestagen preparations; and one in every 200 users of oestrogen-only preparations. The corresponding excesses from 10 years of HRT would be about twice as great.
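Figures quoted as “one in every 50” are easier to compare when put on a common scale such as extra cases per 1,000 users. The following small sketch does that conversion with the numbers quoted above. The function name is made up for illustration, and the 10-year figures simply double the 5-year ones because the paper says “about twice as great”.

```python
def excess_per_1000(one_in_n: int) -> float:
    """Convert 'one extra case in every n users' to extra cases per 1,000 users."""
    return 1000 / one_in_n


# Figures quoted in the text for 5 years of HRT started at age 50.
five_year_excess = {
    "oestrogen plus daily progestagen": 50,
    "oestrogen plus intermittent progestagen": 70,
    "oestrogen only": 200,
}

for preparation, one_in_n in five_year_excess.items():
    five_years = excess_per_1000(one_in_n)
    ten_years = 2 * five_years  # approximation: "about twice as great"
    print(f"{preparation}: about {five_years:.0f} extra cases per 1,000 users "
          f"after 5 years, about {ten_years:.0f} after 10 years")
```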
Another interesting but later piece of research from the Million Women Study was to look at those who ate organic food to ask if they had any less cancer than the rest of the group [7]. Other than a possibly lower incidence of non-Hodgkin’s lymphoma, there was no difference.
It was a case-control study by Doll and Hill, a design described later in this chapter, that first showed the relationship between smoking and lung cancer as early as 1950 [8], although this did not become common knowledge until the first report of the Royal College of Physicians, called Smoking and Health, in 1962 [9]. The relationship between smoking and other diseases, especially heart disease, came from the doctors’ cohort study in a report published in 1956 [10].
A classic early piece from the USA was the Framingham Heart Study [11]. The researchers recruited 5,209 men and women between the ages of 30 and 62 from the town of Framingham in Massachusetts in 1948. They had extensive physical examinations and lifestyle interviews that would later be analysed for risk factors for cardiovascular disease, which includes angina, heart attacks and stroke. Since 1948, the subjects have returned to the study every two years for a detailed medical history, physical examination and laboratory tests. In 1971 the researchers enrolled a second generation. They were 5,124 of the original participants’ adult children and their spouses and they participated in similar examinations.
The Framingham Study led to the identification of the major risk factors for heart disease and strokes. They are high blood pressure, high blood cholesterol, smoking, obesity, diabetes, and physical inactivity. The Framingham cohort is mostly of European descent but the risk factors identified have been shown in other studies to apply to other racial groups although perhaps with slight variation. Over the years the Study has produced approximately 1,200 articles in leading medical journals. When studying several potential risk factors simultaneously the mathematics gets rather complex with multivariate analysis. I shall make no attempt to explain this. The Framingham Heart Study has added new diagnostic technologies, such as echocardiography (an ultrasound examination of the heart), carotid artery ultrasound (detecting narrowing of the arteries to the brain), magnetic resonance imaging of the heart and brain, CT scans of the heart and its vessels and bone densitometry (for monitoring osteoporosis).
The Framingham Study has contributed enormously to our understanding of the many risk factors involved, especially for coronary heart disease. General practitioners have charts or computer programmes in which they can enter the patient’s height, weight, blood pressure, cholesterol and smoking status to give a risk of heart attack or stroke over the next 10 years. The heart attack risk is usually about 10 times the risk of stroke but I have found that it is the stroke risk that worries patients most. They no longer use the American Framingham risk assessment, but one designed for British patients by the joint British societies [12]. These are the British Cardiac Society, British Hypertension Society, Diabetes UK, HEART UK, Primary Care Cardiovascular Society and The Stroke Association. You may even like to try it yourself [13].
A great problem of population studies, as opposed to randomised controlled trials, is that whereas in the latter people are allocated to a group, in the former they choose their lifestyle regarding smoking, exercise, obesity or whatever is being examined. This can cause bias.
Most studies of the risk of cancer amongst those who eat organic foods have found no significant benefit, as concluded in this review [14]. However, a study from Paris did find a significant reduction in cancer amongst those who ate organic food [15]. In contrast to the Million Women Study, I can find no evidence that the authors accounted for smoking status or social class. Organic food is rather more expensive and only those with a commitment to their health would buy it. They are also likely to be more affluent and so of higher social class. Hence those who choose organic food are probably less likely to smoke, less likely to be obese and less likely to consume alcohol to excess. As smoking and obesity are the two strongest predisposing factors for cancer in the developed world, they could easily account for the apparent benefit.
Case Control Studies
Something that is fairly similar to a cohort study but does not involve the flow of time is the case-control study. In one group are people with a certain outcome or disease whilst the control group does not have that outcome or disease. The controls must be carefully selected to match the other group in as many ways as possible except for that outcome or disease. Then information is obtained on whether the subjects have been exposed to the factor under investigation such as smoking or exposure to asbestos.
This is a rather quicker and cheaper alternative to cohort studies. It may be the only feasible method for very rare disorders or those with a long lag between exposure and outcome. It needs fewer subjects than cross-sectional studies. However, there are problems.
It is difficult to select a control group without introducing bias. There may also be what is called “recall bias”. If there has been an adverse outcome such as disease or an imperfect baby from a pregnancy, there is a greater tendency to recall matters such as drugs taken in pregnancy than if the outcome was good.
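Because the researcher decides how many cases and controls to recruit, a case-control study cannot give the risk of disease directly. The usual measure is the odds ratio, which compares the odds of past exposure among cases with the odds among controls. The following is a minimal sketch with entirely hypothetical counts. The function name is made up for illustration.

```python
def odds_ratio(cases_exposed: int, cases_unexposed: int,
               controls_exposed: int, controls_unexposed: int) -> float:
    """Odds of exposure among cases divided by odds of exposure among controls."""
    odds_in_cases = cases_exposed / cases_unexposed
    odds_in_controls = controls_exposed / controls_unexposed
    return odds_in_cases / odds_in_controls


# Hypothetical study: 100 people with the disease, of whom 80 had smoked,
# and 100 matched controls without it, of whom 50 had smoked.
smoking_or = odds_ratio(80, 20, 50, 50)
print(f"Odds ratio for smoking: {smoking_or:.1f}")
```

When the disease is rare, the odds ratio is close to the relative risk that a cohort study would have found. Recall bias would distort it by inflating the number of cases recorded as exposed.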
Association and Causation
With all these techniques it is important to appreciate that what they show is an association between something such as smoking and something else such as heart disease or lung cancer. Association and causation are not the same. Just because A and B are associated does not mean that A causes B. It may be that B causes A or there may be an innocent reason for their association. People who smoke are more likely to have lung cancer. There are chemicals in tobacco smoke that are carcinogenic (cause cancer). Smoke is inhaled. Therefore there is a logical link between smoking and lung cancer.
An example of getting it the wrong way round came from the excellent television series Yes Minister or Yes Prime Minister. Sir Humphrey pointed out that the areas of the country with the most deprivation and social problems had the most social workers. Therefore, to reduce the amount of deprivation and social problems, the number of social workers should be reduced.
The incidence of death by drowning is greatest when ice cream sales are high. Therefore ice cream causes drowning. Of course the real reason is that when it is very hot, people tend to do stupid things in water, often having drunk too much alcohol and this causes drowning. Also when it is very hot ice cream sales rise. There is no rational reason why eating ice cream should lead to drowning, and hence no reason why ice cream should be banned to prevent drowning. The association is fortuitous and innocent.
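The ice cream example can be imitated with a small simulation. In the sketch below, temperature drives both ice cream sales and drowning, and neither has any effect on the other. All the numbers are invented for illustration, and the “drowning index” is an arbitrary scale rather than a count of deaths.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate_days(n_days=20000, seed=1):
    """Invented data: temperature is the only cause of both outcomes."""
    rng = random.Random(seed)
    days = []
    for _ in range(n_days):
        temperature = rng.uniform(0, 35)                    # degrees Celsius
        ice_cream = 10 * temperature + rng.gauss(0, 20)     # sales, arbitrary units
        drowning = 0.1 * temperature + rng.gauss(0, 1)      # drowning index, arbitrary units
        days.append((temperature, ice_cream, drowning))
    return days


days = simulate_days()

# All days together: ice cream and drowning appear to be linked.
overall = pearson([d[1] for d in days], [d[2] for d in days])

# Only days of nearly the same temperature: the link disappears.
similar_days = [d for d in days if 20 <= d[0] < 21]
within = pearson([d[1] for d in similar_days], [d[2] for d in similar_days])

print(f"Correlation across all days: {overall:.2f}")
print(f"Correlation on days of 20 to 21 degrees: {within:.2f}")
```

Comparing like with like, here days of the same temperature, is the simplest way of allowing for a confounding factor. The multivariate analysis used in large studies is a more elaborate version of the same idea.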
Between about 1950 and 1980 the incidence of childhood leukaemia seemed to rise. It was easy to attribute this to the increasing amount of radioactivity in the atmosphere due to the testing of nuclear bombs. However, the rise may have been spurious and due to antibiotics. This does not mean that antibiotics cause leukaemia, but before we had antibiotics, children or others with leukaemia would present with overwhelming infection which would be fatal. The death certificate would state pneumonia or whatever sort of infection it was. With antibiotics this infection could be overcome and then the diagnosis of leukaemia would become apparent.
When I first read about the association between smoking and cancer of the cervix back in the 1970s, I wondered if this was really cause or just association. The link is now well established [16]. It is well known that cervical cancer is associated with early onset of sexual intercourse and multiple partners [17]. I wondered if the lifestyle and gullibility of those who could be persuaded into early and promiscuous sexual activity may be similar to those who are gullible enough to smoke. However, it seems that carcinogens from tobacco smoke can reach the cervical mucus and so this offers an explanation for this association [18].
Usually we find that studies showing association are followed by research that elucidates causation. Failure to show a plausible cause may mean that the association is innocent. The great doyen of early epidemiological studies, Sir Austin Bradford Hill, produced criteria for showing if association made causation feasible [19]. The original paper was from 1965 [20].
We can look at this with relation to smoking and lung cancer:
- Strength of association. Of the couple of dozen other possible candidates, none showed so strong a correlation as cigarette smoking. The link was strong and clear.
- Consistency of findings. Since the original research, many others in many other parts of the world have conducted similar studies and the results consistently show a strong association between smoking and lung cancer.
- Specificity of the association. No other risk factor matches smoking nor seems to contribute to the risk anything like as strongly.
- Temporal sequence of association. This means that the exposure must precede the outcome. It takes years for the risk to mature.
- Biological gradient. This is sometimes called dose dependency. People who smoke less than 10 cigarettes a day have a lower risk of lung cancer than those who smoke 10 to 20 a day whilst the risk for those who smoke more than 20 a day is higher still.
- Biological plausibility. There is abundant evidence that tobacco smoke contains many carcinogens. They can cause malignancies in animal experiments and more recent work has shown that the disruption to DNA is in keeping with carcinogenesis.
- Coherence. This means that the relationship agrees with the current knowledge of the natural history of the disease. It does. Lung cancer used to be a mainly male disease but as more women took up smoking in the 20th century, the incidence of lung cancer rose towards that of men after the expected interval.
- Experiment. This means that removal of the exposure alters the frequency of the outcome. Giving up smoking does reverse the risk and after a number of years it may well revert to the risk of non-smokers.
- Analogy. This is the last of the nine criteria in Hill’s original paper. It means that if a virus is accepted as being responsible for one type of cancer, for example, then the suggestion of another virus causing another cancer is more reasonable. There are many chemicals found in tobacco smoke that are well recognised carcinogens.
If you try the same with the association between the sales of ice cream and the incidence of drowning, it is easy to dismiss the idea of causation.
The evidence linking smoking to lung cancer and coronary heart disease started in the 1950s and became stronger and stronger. However, this was not generally known until early 1962, when the Royal College of Physicians produced a report called Smoking and Health [9] and there was much publicity. I remember that it was on the front page of all the Sunday newspapers.
This should have been the end of the tobacco industry. Its product was dangerous. However, they were not done yet. They had no intention of ceasing to peddle their deadly product. Their main weapon was to suggest uncertainty in the science. It was just statistics. As we have seen from the Hill criteria it was far more than that.
The struggle against tobacco continues and more than 55 years on it is far from won. Only in recent years has the overall number of adults who smoke fallen below 20% and the cohort with the highest rate of smoking is the age group 15 to 25, who have absolutely no excuse for having started in the first place.
Smoking is now linked to far more than just lung cancer.
- It increases the risk of cancer of the mouth, throat, larynx, oesophagus, bladder, bowel, cervix, kidney, liver, stomach and pancreas.
- It is very much associated with chronic bronchitis and emphysema which are nowadays lumped together as COPD (Chronic obstructive pulmonary disease).
- As well as increasing the risk of coronary heart disease it increases the risk of stroke and of peripheral vascular disease, which is the narrowing of arteries, usually the terminal aorta or arteries to the legs, and which can lead to amputation.
- It increases the risk of type 2 diabetes.
- It has a marked adverse effect on reproduction, increasing the risk of miscarriage, premature delivery, stillbirth and low weight babies.
- Babies are also at risk in a smoky environment and the children of smoking mothers have delayed milestones in reading.
The battle against the tobacco industry is an interesting and instructive story. It will be told in greater detail in the section about electronic or e-cigarettes. The battle is far from won. An NHS website gives a figure of 78,000 premature deaths each year from smoking [21]. The figure used to be 100,000.
Epidemiology has shown us where the risk factors for disease lie and then it is up to the public, informed by expert opinion, to make the changes. An article in The Times stated that the USA has recorded the biggest drop in cancer deaths in history as it reaps the dividends of plummeting rates of smoking [22]. Since its peak in the early 1990s, the death rate from cancer in the US has fallen by 29%. As well as lung cancer, some of the biggest improvements have come in breast cancer, prostate cancer and colorectal cancer. Improved treatment has played a part but the mortality rate for each has halved, in part because of healthier lifestyles.
However, not all aspects of lifestyles are getting healthier and as obesity rises, so do cancers that are associated with it [23]. Cancer Research UK states that excess weight is a bigger cause of bowel, kidney, ovarian and liver cancer than tobacco. In the UK, about 29% of adults are obese, compared with 14% who smoke. There are 13 cancers caused by obesity, including oesophageal, gallbladder, pancreatic, body of uterus, thyroid, gastric cardia (a type of stomach cancer), meningioma (a type of brain cancer), multiple myeloma (a cancer of the white blood cells) and breast cancer in post-menopausal women.
Further Reading
- Centre for Evidence Based Medicine. University of Oxford. Study designs. http://www.cebm.net/index.aspx?o=1039
May be a little difficult as designed for professional use.
- Health Knowledge. Public health textbook https://www.healthknowledge.org.uk/public-health-textbook
An excellent online resource but designed for practitioners in the field.
- Health Knowledge. Introduction to study designs for cohort studies. https://www.healthknowledge.org.uk/e-learning/epidemiology/practitioners/introduction-study-design-cs
An excellent online resource but designed for practitioners in the field.
- Health Knowledge. Case control studies. https://www.healthknowledge.org.uk/e-learning/epidemiology/practitioners/introduction-study-design-ccs
An excellent online resource but designed for practitioners in the field.
- Health Knowledge. Causation in epidemiology: association and causation. https://www.healthknowledge.org.uk/e-learning/epidemiology/practitioners/causation-epidemiology-association-causation
An excellent online resource but designed for practitioners in the field.
- NHS. What are the health risks of smoking?
Again, the NHS offers authoritative advice on the true risks of smoking.
References
1. Doll R, Peto R, Boreham J, Sutherland I. Mortality in relation to smoking: 50 years’ observations on male British doctors. BMJ. 2004 Jun 26;328(7455):1519. Epub 2004 Jun 22.
2. Madsen KM, Hviid A, Vestergaard M, et al. A population-based study of measles, mumps and rubella vaccination and autism. New England Journal of Medicine. 2002;347:1477-1482.
3. Hviid A, Hansen JV, Frisch M, Melbye M. Measles, Mumps, Rubella Vaccination and Autism: A Nationwide Cohort Study. Ann Intern Med. 5 March 2019.
4. Beral V, Bull D, Reeves G. Endometrial cancer and hormone-replacement therapy in the Million Women Study. Lancet. 2005 Apr 30-May 6;365(9470):1543-51.
5. Beral V; Million Women Study Collaborators. Breast cancer and hormone-replacement therapy in the Million Women Study. Lancet. 2003 Aug 9;362(9382):419-27.
6. Collaborative Group on Hormonal Factors in Breast Cancer. Type and timing of menopausal hormone therapy and breast cancer risk: individual participant meta-analysis of the worldwide epidemiological evidence. Lancet. 2019 Aug 29. pii: S0140-6736(19)31709-X.
7. Bradbury KE, Balkwill A, Spencer EA, Roddam AW, Reeves GK, Green J, et al. Organic food consumption and the incidence of cancer in a large prospective study of women in the United Kingdom. Br J Cancer. 2014 Apr 29;110(9):2321-6.
8. Doll R, Hill AB. Smoking and carcinoma of the lung; preliminary report. Br Med J. 1950 Sep 30;2(4682):739-48. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2038856/
9. Royal College of Physicians. Smoking and Health. 1962.
10. Doll R, Hill AB. Lung cancer and other causes of death in relation to smoking; a second report on the mortality of British doctors. Br Med J. 1956 Nov 10;2(5001):1071-81.
11. The Framingham Heart Study. https://www.framinghamheartstudy.org/fhs-about/history/
12. Tidy C. Cardiovascular Risk Assessment. PatientInfo (said to be for professional use). 2016.
13. QRISK3 calculator.
14. Barański M, Rempelos L, Iversen PO, Leifert C. Effects of organic food consumption on human health; the jury is still out! Food Nutr Res. 2017;61(1):1287333.
15. Baudry J, Assmann KE, Touvier M, et al. Association of Frequency of Organic Food Consumption with Cancer Risk. JAMA Intern Med. Published online October 22, 2018. doi:10.1001.
16. Plummer M, Herrero R, Franceschi S, Meijer CJ, Snijders P, Bosch FX, de Sanjosé S, Muñoz N; IARC Multi-centre Cervical Cancer Study Group. Smoking and cervical cancer: pooled analysis of the IARC multi-centric case–control study. Cancer Causes Control. 2003 Nov;14(9):805-14.
17. Cuzick J, Sasieni P, Singer A. Risk factors for invasive cervix cancer in young women. Eur J Cancer. 1996 May;32A(5):836-41.
18. Prokopczyk B, Cox JE, Hoffmann D, Waggoner SE. Identification of tobacco-specific carcinogen in the cervical mucus of smokers and nonsmokers. J Natl Cancer Inst. 1997 Jun 18;89(12):868-73.
19. Crislip M. Causation and Hill’s Criteria. Science-based Medicine. 2010.
20. Hill AB. The Environment and Disease: Association or Causation? Proc R Soc Med. 1965 May;58(5):295–300.
21. NHS. What are the health risks of smoking?
22. Non-smokers drive record drop in US cancer deaths. The Times. 9 January 2020.
23. Obesity tops smoking as main cause of cancers. The Times. 3 July 2019.