Canadian Medical Association Journal 1997; 156: 193-199
At the time of the review Dr. Bailar was Chair of the Department of Epidemiology and Biostatistics, McGill University, Montreal, Que., and Dr. MacMahon was Henry Pickering Walcott Professor of Epidemiology Emeritus at the Harvard School of Public Health and formerly chairman of that department.
This article has been peer reviewed.
Paper reprints may be obtained from: Dr. Robert A. Phillips, Executive Director, National Cancer Institute of Canada, 200-10 Alcorn Ave., Toronto ON M4V 3B1
© 1997 Canadian Medical Association (text and abstract/résumé)
The review was prompted by charges that the study may have failed to demonstrate lower rates of death from breast cancer among the women assigned to the mammography arm because the randomization of enrolled women was compromised. In particular, it has been suggested that examining nurses or coordinators at some of the NBSS centres may have arranged for particular women to be enrolled in a study arm (mammography or usual care [control]) other than the one designated in the study design. Rumours of such misallocations have been fuelled by repeated communications in published papers, presentations at meetings and personal letters to Dr. J. David Beatty, then executive director of the NCIC, and to us from Dr. Daniel B. Kopans, associate professor of radiology and director of breast imaging at the Massachusetts General Hospital, and in a publication by Dr. Norman F. Boyd.
Dr. Robert E. Tarone, of the Biostatistics Branch of the US National Cancer Institute, published a paper pointing to the apparent excess of cases of advanced breast cancer among the NBSS subjects aged 40-49 assigned to the mammography group relative to those in the control group. This paper is frequently cited by Kopans and other critics of the NBSS. Tarone's criticism has to do with entry criteria for the study and not with randomization, but we comment briefly on this at the end of this article. There have been other areas of criticism of the NBSS (e.g., the quality of the mammographic imaging[1,4]), but they were not part of our charge and are not addressed in this report.
There has been particular interest in the NBSS findings pertaining to women aged 40-49. Although the benefits of screening mammography for women over 50 are well established, the few data from randomized studies for women under 50 have not established any benefit. We therefore focused most of our attention on the NBSS data for women aged 40-49.
The NBSS randomization strategy
Women were recruited into the study by a variety of means, including publicity in the news media, notification of health care professionals in the community, and direct mailings to women of appropriate age who were identified from employee lists in major institutions, members of professional associations, and recipients of family-allowance checks. When confidentiality would not be violated, the personal mailings were followed up with telephone calls.
Fifteen screening centres were established across Canada in institutions with both the facilities necessary to conduct the screening and "considerable expertise in the diagnosis of breast cancer." At the reception desk it was confirmed that a potential study subject was in the right age group (40-59 years), had had no mammogram in the previous 12 months and was not pregnant. The reception clerk recorded identifying information, and the woman was given a preliminary questionnaire to complete and an informed consent form to sign. The consent form specifically stated that the woman would be assigned, at random, to either mammography or no mammography. The woman was then examined by a specially trained nurse (or, in the 3 centres in Quebec, by a physician) and taught breast self-examination.
Information on the actual timing of the randomization is not consistent. The originally published protocol stated that after the woman completed the questionnaire and signed the consent form the form was to be "passed to the center coordinator for randomization. The allocated regimen would not, however, be reported until after the physical examination was completed to avoid bias in decisions over possible abnormalities." However, it is clear from other publications that, except in one centre (identified as centre 03), randomization occurred after the clinical examination: the material completed at the registration desk accompanied the patient to the examination, and, after the information from the examination by the nurse or physician had been added, the folder was given to the coordinator for randomization. We are unclear (as seemingly were the study centre staff at the time of our visit) as to why this procedure became the normal practice in all but one of the centres. We believe that questions about whether clinical findings prompted circumvention of the randomization strategy would have been avoided if the strategy as originally specified had been followed.
It was the task of the coordinator to enter each woman's name and identification number on the next available line in an "allocation book." These books were specific for centre and for 5-year age group. The lines were randomly allocated between mammography (MA) and usual care (UC). If the line on which the woman's name was placed indicated MA, the procedure was usually undertaken at the first visit.
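The allocation-book mechanism described above can be illustrated with a minimal sketch. It assumes simple, independent randomization of each line; the source does not say whether blocking or some other scheme was used. The function names `make_allocation_book` and `enrol` are hypothetical and are not taken from the NBSS protocol.

```python
import random

def make_allocation_book(n_lines, seed=None):
    """Build a book whose lines are pre-allocated to MA or UC.

    Assumption: each line is allocated independently with probability 0.5.
    """
    rng = random.Random(seed)
    return [{"arm": rng.choice(["MA", "UC"]), "name": None}
            for _ in range(n_lines)]

def enrol(book, name):
    """Enter a name on the next available line and return that line's arm."""
    for line in book:
        if line["name"] is None:
            line["name"] = name
            return line["arm"]
    raise ValueError("allocation book is full")

book = make_allocation_book(10, seed=1)
for i in range(3):
    print(f"woman-{i}", enrol(book, f"woman-{i}"))
```

The point of the sketch is that the arm belongs to the line, not to the woman: a woman's allocation is fixed entirely by the order in which names are entered.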
If we understand the process correctly, except in centre 03 the nurses (and probably also the coordinators) were aware of the findings of the clinical examination when the allocation was made. Herein lies the basis of the charge that examiners who thought that a woman should or should not have a mammogram, because of findings at clinical examination or personal information obtained during the examination (e.g., risk factors for breast cancer), may have compromised the randomization. It should be noted that in each centre, a nurse or physician finding an abnormality, regardless of the woman's allocation, would so inform the coordinator, and the case would usually be referred to a special review clinic for examination by the project surgeon. The woman might then have a mammogram even if her allocation was to the UC arm. However, because referral would not have ensured mammography, the charge has been made that there remained a motive for an examiner or a coordinator to subvert the randomization if for clinical or other reasons he or she believed that the subject should or should not have a mammogram.
To avoid subversion of randomization of this type, it is current practice to conceal the allocation from both the study subject and the person doing the randomization until or after the commitment of the subject to a particular arm of the study. Randomization by telephone through a central study office is one method currently employed. This may not have been possible in a study the size of the NBSS, but a simple procedure (involving, for example, removal of labels identifying the study arm after the patient's name had been entered) would have strengthened the credibility of the process.
On two occasions we met with David Beatty, Arnold Aberman, dean of the Faculty of Medicine at the University of Toronto, and the NBSS senior staff, including Anthony B. Miller, Cornelia J. Baines and Claus Wall. In two respects, we did not fulfil the terms of reference given to us. First, we did not interview any of the field staff of the study, even though a few are still working in the participating centres. We felt that any "steering" of the randomization was likely to have been highly specific to location (centre) and possibly time, and that any examiner or coordinator who participated in or knew of active subversion of the randomization but did not come forward at the time would have been unlikely to admit it to us, more than 10 years later, even if he or she remembered the details. Second, we wrote to the individual (a radiology technician) quoted by Kopans as having "personal knowledge" of such details, but she did not respond to our letter, even though the letter assured her that her response would be kept confidential. Beatty reported to us that, before our review, he had, after several attempts, spoken by phone to this person, who told him that on one social occasion (Kopans was not present) she had made idle comments on this subject but was unaware of any substance to the charges. She declined, however, to put any of her statements in writing, despite Beatty's assurance of confidentiality. She had not been employed at the centre until about 2 years after the close of randomization.
At our request, the NCIC arranged for document specialists to review certain aspects of the records. The company selected, KPMG Investigation and Securities Inc., Toronto, has special expertise and experience in the examination of documents, particularly for the detection of fraud. Members of the company have worked extensively with the Royal Canadian Mounted Police, Interpol and many private companies. The résumés of the investigators from this company who were assigned to review the NBSS documents are on record in the office of the NCIC executive director.
The KPMG investigators examined the allocation books for evidence of alteration or substitution of names, one of the ways in which randomization could have been compromised. Because the goal of mammography is to prevent death (and morbidity) from breast cancer rather than to prevent the disease itself, and because time and expense were important issues, the investigators' review was limited to the three centres where there was an excess of deaths in the mammography arm compared with the control arm among women aged 40-49. These were centre 02 (6 and 2 deaths respectively), centre 03 (7 and 4 deaths respectively) and centre 11 (4 and 2 deaths respectively). Although these centres were selected on the basis of data for the group aged 40-49, the allocation books for women aged 50-59 at the same centres were also examined. In addition, the books were examined for limited periods in two centres where the NBSS central office had suspected administrative problems: centre 01 (from October 1980 to January 1982) and centre 04 (from May 1981 to August 1982).
Clerical errors are to be expected in any study, and the procedure for correcting them had been specified by the central office: when a name had to be changed (e.g., the original name had been entered in the wrong age book) the existing name was to be crossed out with a single line (so that the original name could still be read) and the correct name written above it. However, staff at the centres frequently used white-out or correcting tape to block out the original name. Fortunately, the KPMG investigators were able, in most instances, to decipher both the original name and the replacement. The original name was then sought in the database, by both the KPMG investigators and the NBSS central staff. Information from the database sometimes provided an explanation for the change -- for example, if the original name was later found in the book for a different age group in the same centre with the same date of recruitment, or there was confusion between a maiden name and a married name. However, this process was incomplete. In some cases the original name was not found. In others a difference in recruitment dates implied that the names referred to two different women with identical names; such occurrences are not uncommon in so large a database, especially, according to Wall, in the francophone Quebec centres, where the range of names was smaller than in the anglophone centres.
The KPMG investigators stated that the likelihood of successful cover-up of an alteration would be near zero under their examination. Entries made in pencil could be erased well enough to obscure the original entry, although evidence would be left that a change had been made. The investigators added that they found no evidence of any deliberate attempt to conceal alterations.
The list of names given to KPMG did not include outcome data. These data were provided to us by the NBSS central staff for both the original names and the replacements. The main outcome of interest, as indicated earlier, was death from breast cancer, but the occurrence of breast cancer was also noted. The results of this review are detailed in 3 reports (2 from KPMG,[10,11] and 1 from Wall), which are held by the executive director of the NCIC and the NBSS director. All of the reports contain personal confidential information and can be made available only if, in the opinion of the NBSS director, the assurance regarding confidentiality given to the participants can be met. The copies provided to us will be returned to the executive director of the NCIC.
What the allocation books revealed
We examined all 3 reports and found them highly consistent insofar as they considered the same issues. The results of the review are most clearly presented in the report by Wall, which, as previously noted, was the only source of information on outcome for the women whose names were either erased or overwritten. The numbers of alterations in the Wall report differed slightly from those in the KPMG reports because the NBSS applied a stricter definition of similarity of date of entry in order to declare a name a "match."
A total of 30 182 records were inspected, of which 467 (1.5%) required investigation. Of the 467 records 219 (47%) indicated clerical errors (e.g., given and family names in reverse order) and involved no change in the identity of the woman entered on the allocation line. A search of the database for the remaining 248 records of women, whose names had been covered and substituted, revealed 147 with a "credible match" (i.e., the same name and date of entry, within specified ranges described by Wall), and 101 whose names were not found elsewhere. When an uncovered name differed from that superimposed, the two names were listed as "pairs" in Wall's report. In our report, record 1 refers to the name now visible on the allocation line and record 2 to the name uncovered by the KPMG investigators.
For 86 (59%) of the 147 women for whom a match for record 2 was found elsewhere in the database, the matched woman was in a different age group from the woman in record 1. For these, it is likely that the original entry was made in the wrong age book and was subsequently corrected.
Regarding the remaining 61 subjects matched with another woman in the same age group, it is important to recognize that the items randomized in the study were not women but lines in the allocation books. Which study arm a woman was assigned to depended on which line she occupied in the allocation book. If there was subversion of the randomization it had to have affected the woman's placement in the book, because the allocation assigned to the line could not be changed. Recognition of this rather obvious point leads to the understanding that any credible pattern of subversion (e.g., to assure that specific individuals were assigned to mammography) would be apparent in record 1, and that any effect on record 2 would be secondary. Thus, we focused on the data in record 1.
It is suspicious that there was an excess of mammography assignments among the record 1 files. This was evident both when no match was found for record 2 (57 mammography, 44 control) and when a match was found (40 mammography, 21 control). Thus, after we eliminated obvious clerical errors and instances in which a woman might have been first entered in the wrong age book, we discovered alterations on 97 lines allocated to mammography and on 65 lines allocated to usual care.
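The way the 467 investigated records break down into the 97 and 65 altered lines can be checked with a short tally. This is only an arithmetic sketch using the counts quoted in the text; the variable names are ours.

```python
# Counts as reported in the text.
investigated = 467
clerical_errors = 219          # identity of the woman unchanged
name_substitutions = 248       # name covered and replaced

matched, unmatched = 147, 101  # record 2 found / not found in the database
matched_other_age = 86         # likely entered in the wrong age book
matched_same_age = 61

unmatched_ma, unmatched_uc = 57, 44
same_age_ma, same_age_uc = 40, 21

# Each level of the breakdown should add up to the level above it.
assert clerical_errors + name_substitutions == investigated
assert matched + unmatched == name_substitutions
assert matched_other_age + matched_same_age == matched
assert unmatched_ma + unmatched_uc == unmatched
assert same_age_ma + same_age_uc == matched_same_age

altered_ma = unmatched_ma + same_age_ma
altered_uc = unmatched_uc + same_age_uc
print(altered_ma, altered_uc)
```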
Assuming an equal probability of allocation to either group, the split of 97/65 is highly unlikely (χ² = 6.3, 1 degree of freedom; p = 0.01, by McNemar's test). However, it must be considered that the women allocated to mammography returned to the centre annually for repeat visits, and so there would be more opportunity to correct originally incorrect entries and to make changes, such as a change of name. In addition, Miller stated that there was more interaction between the centre staff and the women assigned to mammography than between the staff and those assigned to usual care in the period immediately after allocation because of the mammography procedure itself and other procedures associated with it (e.g., biopsy, review, clinic visits and documentation about patient outcome). Changes to the pages of the allocation books could have been made at the centre before the pages were sent to the NBSS central office, or they could have been made at the central office at a later date. It is therefore not surprising that there would be more changes to the allocation pages of the mammography group.
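The reported statistic for the 97/65 split can be reproduced with a few lines of arithmetic. The sketch below assumes the uncorrected statistic (a - b)² / (a + b), which has the same algebraic form as McNemar's test, referred to a chi-square distribution with 1 degree of freedom; the function name is hypothetical.

```python
import math

def equal_split_chi2(a, b):
    """Chi-square (1 df) for testing whether a and b depart from a 50:50 split."""
    chi2 = (a - b) ** 2 / (a + b)
    # Upper-tail probability of a chi-square with 1 df, via the
    # complementary error function.
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

chi2, p = equal_split_chi2(97, 65)
print(f"chi2 = {chi2:.1f}, p = {p:.3f}")
```

Under these assumptions the statistic rounds to 6.3 and the p value to 0.01, in agreement with the figures quoted above.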
The difference in the number of changes in the lines allocated to the two study arms may have been due to greater interaction between centre staff and patients in the mammography group or to chance, but it is also possible that some of the alterations may have been made to free up a line allocated to mammography to make room for an improper allocation. However, the logistics of such a manoeuvre would have been challenging. By the time a name would have been covered, the woman first entered would have probably been told of her allocation and her identification number would have been entered on several study forms. Seventeen (18%) of the 97 women whose names were entered onto the mammography lines were referred by the nurse (or physician) for surgical review, as compared with 8 (12%) of the 65 whose names were entered onto the usual care lines. This difference is well within the bounds of chance.
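The comparison of referral proportions (17 of 97 versus 8 of 65) can be sketched as a test on a 2 x 2 table. The source does not say which test was applied; the uncorrected Pearson chi-square used here is an assumption, and the function name is hypothetical.

```python
import math

def two_by_two_chi2(a, b, c, d):
    """Uncorrected Pearson chi-square (1 df) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p = math.erfc(math.sqrt(chi2 / 2))  # upper tail, 1 degree of freedom
    return chi2, p

# Rows: mammography lines, usual-care lines.
# Columns: referred for surgical review, not referred.
chi2, p = two_by_two_chi2(17, 97 - 17, 8, 65 - 8)
print(f"referred: {17 / 97:.0%} vs {8 / 65:.0%}; chi2 = {chi2:.2f}, p = {p:.2f}")
```

On this calculation the p value is well above 0.05, consistent with the statement that the difference is within the bounds of chance.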
Among the 97 women in the mammography arm, there were 4 deaths, only 1 of which was attributed to breast cancer. Among the 65 in the control arm, there were 2 deaths, neither of which was attributed to breast cancer. One other woman in the mammography group had breast cancer and is still living. There was no such case in the control group.
The Wall report included a useful listing of 28 records that indicated an alteration had occurred and that involved a death or a case of breast cancer in either the woman whose name was originally entered or the one whose name was substituted. In 9 of these records the alteration involved a trivial change and the individual's identity was not changed. Of the remaining records, there was only 1 death from breast cancer: the same woman allocated to mammography referred to in the previous paragraph. Clearly, whatever misallocation might have occurred by the overwriting of names could have had only a trivial effect on the results as published in 1992.
Other possible methods of subversion
We considered two ways in which the randomization could have been subverted other than the overwriting of names in the allocation books. One method would have been to enter a woman's name in the next line allocated to the procedure desired, even if the line was not the next one in the appropriate book. This would have involved considerable risk, since another eligible woman of the same age group would have had to appear at the same centre on the same day to fill the gap in the page of the allocation book. The date when a line was filled was noted, but not the time, so women could have been listed out of time sequence. The pages from the allocation books were sent to the NBSS central office every month and were checked for gaps and other errors. When asked in the meeting what was done if a gap was found, Baines responded that there never were any. It is conceivable that a false name could have been entered temporarily to conceal a gap, but that demands a level of deceitfulness and collusion among the study staff that, in the absence of evidence to the contrary, we find not credible.
Another method of subversion would have been, in collusion with the study subject, to have her wait or come in on another day when the coordinator knew that a line allocated to the desired procedure would be available. We can think of no way of checking this possibility, but again it would require a level of knowledge and deviousness of which there is no evidence.
The central office became aware of rumours that the coordinator at one of the study centres was subverting the randomization to ensure mammography for some of her friends. When confronted, the coordinator firmly denied the allegations. However, after examining the allocation books the study director deemed it sufficiently likely that the rumours were true, and the coordinator was promptly removed from her position.
Records from this centre were reviewed for the 14-month period when this coordinator was in charge. Of the 4945 records, 34 (0.69%) indicated insignificant alterations (i.e., a change that did not result in a different name appearing on the allocation line). The proportion of insignificant alterations is similar to that among the records reviewed from the other four centres (0.73%). Of these 34 insignificant alterations 25 were on lines allocated to mammography and 9 on lines allocated to usual care. This ratio (25:9) is significantly different from the ratio of 17:17 that one might expect (p = 0.01). However, we have noted earlier in this report the reasons why such a discrepancy might exist. Only 1 of the patients with an insignificant alteration in her record died; her death occurred 7 years after entry into the study and was not attributed to breast cancer.
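The 25:9 split among the 34 insignificant alterations can be checked with an exact binomial calculation. The source does not name the test behind p = 0.01; the two-sided exact test under a 50:50 hypothesis shown here is an assumption, and the function name is hypothetical.

```python
from math import comb

def two_sided_binomial_p(k, n):
    """Two-sided exact p value for k of n under a 50:50 null hypothesis."""
    k = max(k, n - k)  # work with the larger tail; the null is symmetric
    upper_tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * upper_tail)

p = two_sided_binomial_p(25, 34)
rate = 100 * 34 / 4945
print(f"alteration rate = {rate:.2f}%, p = {p:.3f}")
```

Under these assumptions the p value rounds to 0.01, in line with the figure quoted above.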
During the 14 months at this centre, there was only 1 significant alteration (i.e., a name substitution); it was on a line allocated to mammography. The woman was not found to have breast cancer. Overall, among the women aged 40-49 enrolled at this centre, 8 died of breast cancer: 4 in each study group (Dr. Anthony B. Miller: personal communication, 1995).
In addition, we explored the question of whether, in the centre where the coordinator was removed, the pattern of allocation itself was unusual, irrespective of whether any unusual allocation affected the results of the study. During the total study period at this centre, 4111 women aged 40-49 years were allocated to mammography and 4120 to usual care. The corresponding figures for women aged 50-59 were 3208 and 3199. We requested data for the 4945 women for whom records were reviewed by KPMG (i.e., the women entered into the study during the 14 months when this coordinator was in office). The numbers of allocations to the mammography and control arms were virtually identical in both age groups: 1397 and 1394 in the 40-49 age group, and 1055 and 1060 in the 50-59 age group. There were 39 refusals: 22 among the women assigned to the mammography group and 17 among those assigned to the control group. It does not appear that the activities of the coordinator in question influenced either the pattern of allocation or the mortality results from this centre as included in the data reported in 1992.[6,7]
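The allocation figures for the 14-month period can be reconciled with the 4945 reviewed records in a short sketch. It assumes that the 39 refusals are counted separately from the allocations listed by age group; the text does not state this explicitly, although the totals agree on that reading.

```python
# Allocations during the 14-month period, as reported in the text.
allocations = {
    "40-49": {"MA": 1397, "UC": 1394},
    "50-59": {"MA": 1055, "UC": 1060},
}
refusals = {"MA": 22, "UC": 17}

allocated = sum(sum(arms.values()) for arms in allocations.values())
total = allocated + sum(refusals.values())  # assumption: refusals counted separately

for age_group, arms in allocations.items():
    share_ma = arms["MA"] / (arms["MA"] + arms["UC"])
    print(f"{age_group}: {share_ma:.1%} allocated to mammography")
print("records accounted for:", total)
```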
Although we cannot exclude the possibility that one or the other of these methods of subversion (or other methods we are unaware of) was used, the only credible evidence we found or learned about was the one instance of possible subversion by the coordinator at one of the centres. Otherwise we found no evidence that any of these methods of subversion was used. Even if they were used, we believe that they would have affected very few individuals, that they would have been more likely to have been used on a personal basis than on the basis of risk for breast cancer and that they would have been unlikely to have affected the results of the study. In short, these other ways of subverting the randomization would have been more difficult, carried greater risk of detection and required a higher level of sophistication than the overwriting of names in the allocation books. As we have already noted, the KPMG investigators commented that the alterations they detected had been made by means that did not suggest any deliberate effort at concealment.
The effect of entry criteria
Tarone pointed to the apparent excess of cases of advanced breast cancer among the NBSS subjects aged 40-49 in the mammography arm relative to those in the control arm. Although we believe that this apparent excess is not strictly a consequence of defects in randomization but, rather, of criteria for entry into the study, it is sufficiently close to the topic of randomization that we chose to discuss it briefly here.
Tarone recommended that, because there was a higher proportion of patients with 4 or more positive lymph nodes among the subjects assigned to mammography than among those assigned to usual care, mortality analyses should have been undertaken after elimination of "advanced cases detected by physical examination at the initial screening visit" (emphasis added). We agree with Tarone but note that, as Miller has pointed out, allocation to mammography may itself lead to surgery and the discovery of nodal involvement; therefore, elimination of, for example, patients with 4 or more positive nodes may introduce a bias favouring survival for the patients with no advanced disease in the mammography group. The subjects eliminated from the mortality analysis should, as Tarone stated, have been those in whom an abnormality was detected at physical examination, regardless of the group to which they were assigned. The most appropriate way to identify such women would have been to note whether or not they had been referred by the examining nurse or physician for surgical evaluation. This referral would have occurred before mammography and, therefore, should be the least biased with respect to allocation. We discussed this issue with Miller, who stated that such analyses had been done and the results did not substantially affect the initial study findings, but that the NBSS group had decided to delay publication of these results until the 10-year follow-up data were available.
We believe that there would be two advantages to publishing the 7-year follow-up data (now available) as soon as possible (perhaps in the form of a brief letter for speed of publication). First, this criticism of the study would end. Second, the understanding of the methodology of trials such as this one would be improved -- would randomization after clinical examination really have affected the outcome for patients enrolled in the study, or is this only a theoretical objection? To address this question the analysis must, of course, be done using the follow-up data in the 1992 report that gave rise to the criticism in the first place. There is no reason why it could not be repeated on the 10-year follow-up data when they become available.
We noted several ways in which the randomization in the NBSS could have been subverted, but a thorough search has not uncovered any credible evidence to support the charge. If there were subversions, they were very few.
If there was subversion by the method of overwriting one name with another in the allocation books, the occurrence of only 1 death from breast cancer among the women whose names were altered indicates that the alterations could have had only a trivial effect on the study findings as reported in 1992.