Report of the Ministerial Inquiry into the Under-reporting of Cervical Smear Abnormalities in the Gisborne Region

4. Term of Reference One

Has there been an unacceptable level of under-reporting in consequence of misreading and/or mis-reporting of abnormalities in cervical smears in the Gisborne region?

4.1 The Committee of Inquiry is satisfied that there has been an unacceptable level of under-reporting of abnormalities in cervical smear tests in the Gisborne region during the period from 1991 to March 1996. The Committee has only heard evidence in regard to cervical smear test readings by Gisborne Laboratories Limited. It has heard no evidence which would have allowed it to determine whether or not there had been under-reporting of cervical smear tests read in the laboratory at Gisborne Hospital and therefore it is unable to comment on the performance of that laboratory’s reading of cervical smear tests during the relevant period. Its finding on the presence and the level of under-reporting of cervical smear tests in the Gisborne region is based only upon an analysis of the performance of Gisborne Laboratories.

4.2 Because the terms of reference directed the Committee to look into the reading of abnormalities in cervical smear tests in the Gisborne region prior to March 1996 it has not heard sufficient evidence on this topic post March 1996 to be able to comment on laboratory performance since then. It has heard no evidence to suggest that there has been an unacceptable level of under-reporting of cervical smear tests from the Gisborne region since March 1996. However, as a comprehensive evaluation of the performance of the National Cervical Screening Programme has never been completed and laboratory cervical smear test reporting is still not routinely monitored the Committee considers that the quality of cervical smear test reporting for this later period is unknown.

4.3 Dr Bottrill read most of the cervical smear tests that were carried out in Gisborne Laboratories. There were times when, either due to Dr Bottrill’s absence on leave or because extra help was needed, locum pathologists were used to read the cervical smear tests. However, the evidence shows that the reading of cervical smear tests by these persons cannot account for the under-reporting which has occurred.

4.4 By the end of the inquiry hearings there was clear evidence before the Committee that, among cervical smear tests that were carried out in Gisborne Laboratories during the period under consideration, positive tests had been under-reported to an unacceptable extent. Initially, the task of determining whether or not there had been an unacceptable level of under-reported cervical smear tests in the Gisborne Region seemed intractable. The reading of smear tests is based upon a microscopic evaluation of the smear test by one or more observers. Evaluation is prone to human error for a number of reasons, chief amongst them being the difficulty in consistently maintaining the high level of concentration needed to detect the abnormal cells and also because the interpretation of the abnormal cell is somewhat subjective. Some under-reporting of cervical smear tests in consequence of misreading and/or misreporting is therefore inevitable. Even in well run laboratories with state-of-the-art technology and appropriate quality control systems cervical smear tests can be under-reported.

4.5 Failure of the laboratory to detect pre-cancer or cervical cancer cell changes when the abnormal cells are actually present in the smear test is referred to as a false negative result. A false negative result is defined in a number of ways, and consequently the false negative rate can be measured in a number of ways. One approach is to measure how many high-grade lesions confirmed by biopsy had a negative smear test report 6 months prior to that biopsy. Another is to re-read all or a proportion of a laboratory’s negative smears to measure how many were actually abnormal. While there are published standards for false negative rates using these definitions in other countries, from the evidence given, the Committee understood that the false negative rate of any laboratory could only be compared to another laboratory or a published standard if the methodologies for measuring the false negative rate were the same. While the Committee was not specifically charged with investigating the over-reporting of cervical smear abnormalities in the Gisborne region, this form of laboratory error, ie the reporting of a cellular abnormality when none is present in the test, did come into evidence during the Inquiry. This type of error is also called a false positive result and, like a false negative result, can be defined and measured in a number of ways. Published standards are also in existence in some countries. The same caution must be used when comparing false positive rates from different laboratories and published standards as is used when comparing false negative rates.

4.6 The Committee’s task was made even more difficult by the fact that standards did not exist for New Zealand and the methodologies used by the Health Funding Authority to determine the false negative rate of Gisborne Laboratories were not comparable to those of published methodologies. In some countries with established screening programmes quantitative standards for reporting cervical smear tests have been set to provide a measurement of laboratory performance. During the time that Dr Bottrill was in practice the National Cervical Screening Programme imposed no quantitative standards on laboratory performance. Apart from extreme cases of under-reporting, which on any view would be unacceptable, without clearly set standards against which to measure a laboratory’s performance it is not easy to distinguish unacceptable under-reporting from the accepted level of under-reporting that is inherent in cervical smear evaluation.

4.7 The absence of quantitative standards over the relevant period and the inevitability of some under-reporting have meant that the Committee of Inquiry has had to determine for itself what is an unacceptable level of under-reporting of cervical smear tests. The Committee was advised by more than one expert witness of the need for it to take a common sense view of the matter. The Committee agrees with this advice. In the end it has chosen to consider the combined effect of a number of indicators to assist it to report on term of reference one. The Committee recognises that no single indicator may be sufficient to reach a conclusion on the level of under-reporting; however, it considers that the combined effect of these indicators convinced it that there had been an unacceptable level of under-reporting. The Committee considered that to reach a common sense view it would adopt the test the common law uses to determine civil issues: namely the balance of probabilities. However, having heard all the evidence the Committee was in no doubt whatsoever that there had been unacceptable under-reporting.

4.8 The Committee has received evidence from more than one source which shows that at Gisborne Laboratories there was a failure to read correctly the cervical smear tests of a large number of women in the Gisborne region and that many of these women went on to develop cervical cancer which could have been prevented if their pre-cancer had been detected earlier on. When the results of the re-examination of Gisborne cervical smear tests by Douglass Hanly Moir Pathology (the Sydney re-read) are compared with the original smear test reports from Gisborne Laboratories the high level of under-reporting becomes apparent. In total 22,976 slides were sent to Sydney for re-examination. Of these slides Dr Bottrill had originally read 20,860 and the locums, employed by Gisborne Laboratories, had read 2,116. From these figures, which appear in exhibit TM/HFA/097, it can be seen that the impact of the locums’ reading at Gisborne Laboratories was negligible.

4.9 The Committee has had the benefit of hearing from a number of expert witnesses whose evidence on this term of reference has been of great assistance to the Committee. The witnesses included:

Dr Annabelle Farnsworth MB BS(Hons), the director of cytopathology at Douglass Hanly Moir Pathology;

Dr Euphemia McGoogan MB ChB, member of the Royal College of Pathologists. Her area of special expertise is cervical cytopathology. She is currently Pathology Patient Services Director for the Lothian University Hospitals NHS Trust in Edinburgh and as such is responsible for the largest combined morbid anatomy, histopathology and cytopathology service in the UK;

(ii) Professor David Skegg BMedSc; ChB (Otago); DPhil (Oxon); FFPHM; FAFPHM; FRSNZ, Professor of Preventive and Social Medicine at the University of Otago Medical School. He has carried out extensive research on the causes and control of cancer.

(iii) Dr Brian Cox BSc (Hons) MB ChB PhD, specialist in public health medicine and an epidemiologist. He is employed by the University of Otago as a Senior Research Fellow and he is the director of the Hugh Adam Cancer Epidemiology Unit, Department of Preventive and Social Medicine, University of Otago Medical School. He is a Fellow of the Australasian Faculty of Public Health Medicine and he is registered as a specialist in public health medicine.

(iv) Dr George Wain MB BS, Fellow of the Royal Australian College of Obstetricians and Gynaecologists. He holds the Certificate of Gynaecological Oncology of the Royal College of Obstetricians and Gynaecologists. He is the Director of Gynaecological Oncology at Westmead Hospital in Sydney and a Senior Lecturer in Gynaecological Oncology at the University of Sydney.

4.10 The Health Funding Authority provided for the Committee a report titled the Action Update Report (received as exhibit TM/HFA/087). This report updated the results of the Sydney re-read as compared with the results of Gisborne Laboratories. In the course of her evidence to the Committee Dr Farnsworth produced a document (exhibit AF/HFA/004) which set out her analysis of the data in the Action Update Report. She elaborated on this analysis when questioned by the Committee. The analysis Dr Farnsworth provided in exhibit AF/HFA/004 produced three discrete indicators of the two laboratories’ performance. These three indicators were enough to satisfy Dr Farnsworth that there had been an unacceptable level of under-reporting of cervical smear tests at Gisborne Laboratories.

4.11 For the purpose of understanding the first indicator it is important to note that in the interchange between Dr Farnsworth and the Committee the term "false positive reporting" was defined as the percentage of smear tests which were not confirmed by the biopsy or for which there were no biopsy results. However, when Dr Farnsworth came to give evidence on the second indicator the definition of false positive changed from that used in the first indicator. Here the term "false positive" referred to the percentage of women with normal histology who had been reported as having high-grade/cancer cytology. To arrive at these percentages the denominator used to calculate the first indicator included all the women with a high-grade/cancer cytology result recorded in tables 5.3 and 5.4 of exhibit TM/HFA/087 regardless of whether or not they had histology results recorded as well. The denominator used to calculate the second indicator only included those women recorded in tables 5.3 and 5.4 who had histology results and was restricted to women with negative histology. Women who did not have histology results were not included. Similarly, for the third indicator the group of women being considered, and the denominator used to calculate the percentages, is different to the other two indicators. It follows that because the denominators for each group are different each indicator must be viewed discretely.

First Indicator

4.12 The first indicator is taken from the proportion of women with high-grade/cancer cytology who were later confirmed on biopsy as having high-grade/cancer histology. It is a measure of the accuracy of high-grade/cancer cytology reporting. This indicator is derived from data set out in tables 5.3 and 5.4 of exhibit TM/HFA/087. The data in table 5.3 refers to the original reading by Gisborne Laboratories and in 5.4 to the re-reading by Douglass Hanly Moir Pathology. Table 5.3 comprises 3 sub-tables (5.3(a) to (c)) of data which set out the histology results from initial colposcopy in relation to the highest original smear test result, for all women over three time periods. The three time periods were 1991 to February 1996, March 1996 to April 1999, and May 1999 to the present. Evidence before the Committee explained that the data was presented in this format in order to allow for the effect of time on the analysis and interpretation. The importance of this related to the fact that cervical pre-cancer can over time regress to normal or a lesser pre-cancerous lesion, persist unchanged, or progress to a more severe pre-cancerous lesion or cancer. Dr Farnsworth explained that for the purposes of the inter-laboratory comparison, ie the comparison of Gisborne Laboratories with Douglass Hanly Moir Pathology, the impact of time would be the same for both laboratories and would not need to be allowed for. The results in exhibit AF/HFA/004 relate to aggregated data from the three time periods.

4.13 From table 5.3 the proportion of women who had high-grade/cancer cytology reports from Gisborne Laboratories and who were subsequently confirmed by biopsy as having high-grade/cancer histology can be seen. Table 5.4 also comprises three sub-tables (5.4(a) to (c)) which set out the histology results from initial colposcopy, in relation to the highest smear result read by Douglass Hanly Moir Pathology, for all women over the same three time periods as in table 5.3. From table 5.4 the proportion of women who had high-grade cytology including cancer reports from Douglass Hanly Moir Pathology and who were subsequently confirmed by biopsy as having high-grade/cancer histology can be seen.

4.14 When table 5.3 is compared with table 5.4 two points emerge. The first is that both laboratories had approximately the same proportion of high-grade/cancer cytology confirmed as high-grade/cancer by histology. The original smear test results showed that 37 out of 72 women who were reported as having high-grade/cancer cytology were confirmed as high-grade/cancer on biopsy. This makes the confirmation rate for high-grade/cancer cytology reported at Gisborne Laboratories 51.3%. The results of the Sydney re-read showed that 132 out of 260 women who Douglass Hanly Moir Pathology reported as having high-grade/cancer cytology were later confirmed as high-grade/cancer on biopsy. This makes the confirmation rate for high-grade/cancer cytology reported at Douglass Hanly Moir Pathology 50.7%. These results indicate that each laboratory’s confirmation rate for its high-grade/cancer smear results, at approximately 50%, was much the same. The remainder were either not confirmed by the histology or there was no histology result yet available. Dr Farnsworth gave evidence that some of these unconfirmed high-grade/cancer cytology results could be due to false positive reporting or the reporting could be correct as their disease status was unknown until they had undergone a biopsy. The Committee understood from this evidence that the 50% confirmation rate of each laboratory was a minimum rate and that the inclusion of additional histology results might increase the confirmation rate of one or both laboratories, but would not decrease it.
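The confirmation rates in paragraph 4.14 can be reproduced from the counts quoted there. The following Python sketch is illustrative only: the function name `confirmation_rate` is an assumption made for this example and does not appear in the exhibits.

```python
def confirmation_rate(confirmed: int, reported: int) -> float:
    """Percentage of high-grade/cancer cytology reports later confirmed on biopsy."""
    if reported <= 0:
        raise ValueError("reported must be a positive count")
    return 100.0 * confirmed / reported


# Counts quoted in paragraph 4.14 (tables 5.3 and 5.4, all three time periods).
gisborne = confirmation_rate(37, 72)    # original Gisborne Laboratories reading
sydney = confirmation_rate(132, 260)    # Douglass Hanly Moir Pathology re-read

print(f"Gisborne: {gisborne:.2f}%  Sydney: {sydney:.2f}%")
```

The computed values are about 51.39% and 50.77%. The figures of 51.3% and 50.7% given in the paragraph correspond to these values truncated, rather than rounded, to one decimal place.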

4.15 Because the re-read exercise had been carried out to ascertain if women whose cervical smear tests had been read at Gisborne Laboratories were at risk there was a concern that when the smear tests were re-read at Douglass Hanly Moir Pathology the screeners, who would know that the smear tests were being re-read, would be overly cautious in their approach. If the screeners at Douglass Hanly Moir Pathology had been overly cautious this could lead them to over-report smear tests as high-grade/cancer. In this case the results of the re-reading would not provide a fair basis for comparison with the original results of the readings at Gisborne Laboratories. For this reason doubts had been raised about the usefulness to the Committee of the information coming from the Sydney re-read. However it became clear to the Committee, when it heard the evidence of Dr Farnsworth of Douglass Hanly Moir Pathology, that both laboratories had a similar rate of accuracy in reporting high-grade/cancer. If Douglass Hanly Moir Pathology had over-reported the smear tests relative to Gisborne Laboratories, its confirmation rate of high-grade/cancer cytology would have been less than Gisborne Laboratories. The similarity in their rate of accuracy was enough to allay any doubts the Committee might otherwise have had about using the results from the Sydney re-read as a basis for comparison with the original results from Gisborne Laboratories. Hence, the Committee was confident about using the information from the Sydney re-read results for the purposes of determining if there had been under-reporting of cervical smear tests at Gisborne Laboratories.

4.16 The second point to emerge from a comparison of table 5.3 with table 5.4 is the more significant. When the original results are compared with the results of the Sydney re-read a wide discrepancy between the laboratories in the number of reported high-grade/cancer cytology results becomes readily apparent. At Douglass Hanly Moir Pathology 132 smear tests had been read and confirmed by biopsy as high-grade/cancer which is 3.5 times more than the 37 smears read as high-grade/cancer by Gisborne Laboratories. This wide discrepancy between the number of cervical smear tests recognised by Douglass Hanly Moir Pathology as showing high-grade/cancer abnormalities, and the number recognised by Gisborne Laboratories shows that at Gisborne Laboratories there was a frequent failure to recognise the presence of high-grade/cancer abnormalities. Dr Farnsworth’s evidence on this point was:

Question by Professor Duggan: I’m going to put this statement to you and perhaps you can comment on it. What these calculations [in exhibit TM/HFA/87] indicate to me is that the confirmation by the biopsy of a smear called cancer or high-grade for both laboratories over the three time periods are essentially the same?

A That’s right.

Q However, the number of smears confirmed by Sydney [Douglass Hanly Moir Pathology] as high-grade is 3.5 times more than the number of smears confirmed as high-grade by Dr Bottrill’s laboratory?

A That’s right.…

Q What does that result mean to you?

A It actually means to me that …both confirmation rates are essentially the same, but it would confirm to me that the extra or the % of extra high-grades that we found were in fact true high-grades.

Q At the same rate as Dr Bottrill?

A At the same rate as Dr Bottrill’s.

Q Dr Farnsworth, you may recall that yesterday one of the very first points I inquired of you was in relation to the histology.

A Yes.

Q Was the reading of the histology for the period for both laboratories the same?

A It would be very much the same.

Q It’s the same. And any regression of disease would be the same for both laboratories?

A Exactly.

Q And thereafter is it correct to say that false positive reporting – ie the 50% that weren't recognised or confirmed by the biopsy, some of that may be due to false positive reporting or some may be due to disease that is yet to be detected?

A Yes, that’s also possible.

Q but this would apply to both Dr Bottrill’s results and to your results?

A That’s exactly right.

Q so there is an internal standard, in terms of the histopathology and the regression of disease for both laboratories because you are comparing the same variables?

A Exactly.

Q And the only difference between the two re-reads is that your laboratory detected 3.5 times more biopsy confirmed high-grade disease than Dr Bottrill’s laboratory?

A That’s exactly right.

Q Now could this represent under-reporting?

A Yes.

Q By Dr Bottrill?

A Yes.
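The "3.5 times" figure discussed in paragraph 4.16 and in the exchange above is the ratio of the two laboratories' biopsy-confirmed high-grade/cancer counts. A minimal Python sketch of that arithmetic follows; the variable names are assumptions chosen for this illustration.

```python
# Biopsy-confirmed high-grade/cancer cytology counts quoted in paragraph 4.16.
sydney_confirmed = 132    # Douglass Hanly Moir Pathology re-read
gisborne_confirmed = 37   # original Gisborne Laboratories reading

ratio = sydney_confirmed / gisborne_confirmed
print(f"Sydney confirmed {ratio:.2f} times as many high-grade/cancer smears as Gisborne")
```

The ratio is approximately 3.57, which the report and the transcript express as 3.5.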

Second Indicator

4.17 The second indicator is taken from the proportion of women (in tables 5.3 and 5.4) with normal histology who had been reported as having high-grade/cancer cytology. This indicator gives a measure of the false positive reporting of each laboratory. Dr Farnsworth told the Committee that the usual denominator used to calculate the rate of false positive reporting is the number of normal histologies on biopsy. The number of normal histologies on biopsy in tables 5.3 and 5.4 was 76 so this became the common denominator for calculating the false positive reporting rate of Gisborne Laboratories and Douglass Hanly Moir Pathology. Analysis of the data in tables 5.3 and 5.4 of exhibit TM/HFA/087 shows that over the three time periods out of 76 women Gisborne Laboratories reported three of them as having high-grade cytology and they were later found on biopsy to have normal histology, whereas Douglass Hanly Moir Pathology reported 22 out of the same group of women as having high-grade cytology and they were later confirmed by biopsy to have normal histology. This means that there was a wide discrepancy between the false positive reporting rates of the two laboratories in relation to the data. The false positive reporting rate of Gisborne Laboratories was 3.9% whereas the false positive reporting rate of Douglass Hanly Moir Pathology was 28.9%.
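The false positive reporting rates in paragraph 4.17 follow from the common denominator of 76 normal histologies. The Python sketch below is illustrative only; the function name `false_positive_rate` and the constant name are assumptions made for this example.

```python
def false_positive_rate(high_grade_reports: int, normal_histologies: int) -> float:
    """Percentage of women with normal histology who were reported as high-grade/cancer."""
    if normal_histologies <= 0:
        raise ValueError("normal_histologies must be a positive count")
    return 100.0 * high_grade_reports / normal_histologies


# Common denominator from paragraph 4.17: normal histologies on biopsy.
NORMAL_HISTOLOGIES = 76

gisborne_fp = false_positive_rate(3, NORMAL_HISTOLOGIES)   # Gisborne Laboratories
sydney_fp = false_positive_rate(22, NORMAL_HISTOLOGIES)    # Douglass Hanly Moir Pathology

print(f"Gisborne: {gisborne_fp:.1f}%  Sydney: {sydney_fp:.1f}%")
```

To one decimal place these evaluate to 3.9% and 28.9%, matching the rates stated in the paragraph.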

4.18 In cervical smear reading there is always a trade off between the sensitivity and specificity of a test. In the context of high-grade/cancer detection, sensitivity is the proportion of all people who have the disease who are correctly identified as such by the test. Anyone with the disease who is not identified by the test is a false negative. Specificity is the proportion of all people who do not have the disease who are correctly identified as such by the test. Anyone who does not have the disease but whose test is positive is a false positive result. Pathologists would agree that some degree of false positive reporting due to over-reporting, (sometimes referred to as over-calling) is acceptable as that increases the probability of high-grade lesions being detected and reduces the potential for under-reporting a cervical smear test. When viewed against the 28.9% rate of Douglass Hanly Moir Pathology the Gisborne Laboratories false positive reporting rate of 3.9% appears to be extremely low and likely to carry with it a greater risk of under-reporting. Dr Farnsworth’s evidence on this point was:

Question by Professor Duggan: Dr Farnsworth, what does this mean?

A It means that Dr Bottrill had a very low false positive rate, especially compared to the Sydney re-read.

Q Now the Sydney re-read was geared towards ensuring that women would have the best treatment?

A That’s right, yes.

Q And with that background, is it likely that you over-called?

A It is perceived as over-calling on the straight numbers. The appearances that we used to make the …reports of high-grades are appearances that we use in our everyday laboratory, and it may be that we do it in our normal day to day work. …By increasing your sensitivity, which means increasing your false positive rate, you do lower …specificity, …And I have heard it colloquially put [as] where one sets the bar. But in a screening population where the Pap smear is designed to…sort out women who need to be further investigated from women who can then return to their normal screening interval, it is an accepted practice to in fact increase one’s sensitivity at the expense of specificity for that purpose. And it is an accepted screening technique to in fact have a higher false positive rate so that one can in fact detect as many …high-grade lesions as possible.

Q If I’ve heard you correctly then, you have said that it is accepted in cervical screening practices that the specificity will be compromised in order to attain a better sensitivity –

A that’s right.

Q - and you are not surprised at the false positive rate [of Douglass Hanly Moir Pathology]?

A Exactly. And although a false positive rate is something that needs to be continually assessed and looked at as part of a normal laboratory’s processes, it would be of great concern if your false positive rate was extremely low because it would mean that you are therefore missing a large number of the high-grade lesions that you're in fact looking for.

CHAIR Could that mean if you had a very low false positive rate that there was a greater likelihood that you may be under-reporting?

A Absolutely, …If you have, …, a very high … false positive rate, …it means that…some specificity will be lost. But that is acceptable, and in fact, arguably, it’s the way Pap smears should be read.

Q Therefore, if you were looking for indicators of under-reporting could one possible indicator of under-reporting be a very low false positive rate?

A Yes…. By the way, it’s important that any one indicator is not taken alone.

Q No.

A Absolutely critical.

Q But taken with other indicators a low false positive rate would be a factor that would suggest under-reporting.

A …They should never be taken in isolation but yes in a group, but one would …look at the false positive rate and then go straight to the false negative rate…they should balance, …and in fact one would probably get more concerned if they didn’t balance.

Q And the false positive rate that you’ve just given in this exhibit working through with Dr Duggan for Dr Bottrill’s laboratory, do you consider that to be high, low, acceptable. I know that we don’t have standards here.

A The false positive rate that we’ve just talked about of 3.9%?

Q Yes, what's your opinion of it.

A Well it's extremely low.

Q Right so it would be permissible to take a false positive rate of 3.9% together with other factors as an indicator of under-reporting.

A In isolation arguably it means that the cytology that was seen was in fact spot on. …However if one is talking about a population screening exercise and one saw a very low false positive rate in association with a high false negative rate, one would be very concerned for that screening population.
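The definitions of sensitivity and specificity given in paragraph 4.18 can be written out as simple ratios. The Python sketch below uses hypothetical counts invented purely for illustration; they are not taken from any exhibit, and the function names are likewise assumptions.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Proportion of people with the disease who are correctly identified by the test."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives: int, false_positives: int) -> float:
    """Proportion of people without the disease who are correctly identified by the test."""
    return true_negatives / (true_negatives + false_positives)


# HYPOTHETICAL counts, not drawn from the Inquiry's evidence.
tp, fn = 80, 20     # diseased: detected vs missed (false negatives)
tn, fp = 900, 100   # not diseased: cleared vs wrongly flagged (false positives)

print(f"sensitivity = {sensitivity(tp, fn):.2f}")
print(f"specificity = {specificity(tn, fp):.2f}")
```

With these made-up counts, reclassifying borderline smears as positive would move cases from `fn` to `tp` and from `tn` to `fp`, raising sensitivity and lowering specificity, which is the trade-off described in the evidence.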

Third Indicator

4.19 From the data in tables 5.3 and 5.4 the third indicator is taken from the proportion of women with high-grade/cancer histology whose cytology had been reported as abnormal. It is a measure of true positive reporting by the laboratories. Dr Farnsworth described the third indicator as showing under-reporting in the sense of failing to recognise an abnormal smear and under-reporting in the sense of failing to recognise the appropriate category of abnormality:

"… what we’re looking at here is in fact under-reporting not just in the yes/no separation but under-reporting within the categorisation of those [abnormal] appearances."

4.20 The third indicator has two parts: First, it takes the proportion of women with high-grade/cancer histology whose cervical smear tests had been reported as high-grade/cancer. Across all the time periods, table 5.3 showed that out of 216 women with cancer/high-grade histology, Gisborne Laboratories had reported 37 of them as having high-grade/cancer cytology. Whereas table 5.4 showed that for the same group of women Douglass Hanly Moir Pathology had reported 132 of them as having high-grade/cancer cytology. These calculations show Gisborne Laboratories to have a rate of 17% for detecting high-grade/cancer abnormalities whereas Douglass Hanly Moir Pathology has a rate of 61%.
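The detection rates in paragraph 4.20 use the 216 women with high-grade/cancer histology as the denominator. The following Python sketch reproduces them; the function name `detection_rate` is an assumption made for this example.

```python
def detection_rate(reported_high_grade: int, high_grade_histology: int) -> float:
    """Percentage of women with high-grade/cancer histology whose cytology was reported as such."""
    if high_grade_histology <= 0:
        raise ValueError("high_grade_histology must be a positive count")
    return 100.0 * reported_high_grade / high_grade_histology


# Denominator from paragraph 4.20: women with high-grade/cancer histology.
HIGH_GRADE_HISTOLOGY = 216

gisborne_rate = detection_rate(37, HIGH_GRADE_HISTOLOGY)    # Gisborne Laboratories
sydney_rate = detection_rate(132, HIGH_GRADE_HISTOLOGY)     # Douglass Hanly Moir Pathology

print(f"Gisborne: {gisborne_rate:.0f}%  Sydney: {sydney_rate:.0f}%")
```

Rounded to whole percentages these are 17% and 61%, as stated in the paragraph.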

Dr Farnsworth’s comments on the wide variation between the 17% reporting rate for Gisborne Laboratories and the 61% reporting rate for Douglass Hanly Moir Pathology were:

CHAIR: Is the rate of 17% for Dr Bottrill’s laboratory in the third indicator, I know we don’t have benchmark standards in New Zealand but nevertheless, in your experience as a pathologist would you describe that as a very low rate, low, medium, high, whatever.

A It's extremely low.

Q Would you say was unacceptably low?

A Yes I would.

Q And can you say why?

A Back to my comments about cervical cancer remember that we are actually screening for these lesions, we are screening high-grade lesions both the Australian Government and the New Zealand Government spend a large amount of money trying to look after the women of their countries. These are the lesions we are actually looking for because it's these lesions that by finding them at this stage you can remove and actually prevent cancer. It would seem to me that if you are picking up such a small percentage of the actual disease that exists in that community of screened women, then basically you shouldn’t have a screening programme at all because it's not doing any good.

4.21 The second part of the third indicator looked at the proportion of women shown in tables 5.3 and 5.4 with high-grade/cancer histology whose cervical smear tests had been reported as abnormal but to a lesser degree than high-grade or cancer. The data in table 5.3 showed that out of 216 women with high-grade/cancer histology Gisborne Laboratories had read 111 of them as having abnormal cytology. Table 5.4 showed that out of the same group of women Douglass Hanly Moir Pathology had read 85 of them as having abnormal cytology. The reporting rate for Gisborne Laboratories for the three time periods was 51% whereas the rate for Douglass Hanly Moir Pathology was 40%. Dr Farnsworth’s evidence, when asked to comment on these rates, was that they showed that Douglass Hanly Moir Pathology had more accurately read the cytology of the 216 women whose results were given in tables 5.3 and 5.4:

Q Now what does this indicator mean in terms of Dr Bottrill’s reporting and the Sydney laboratory reporting?

A It is in fact a more specific marker of false negative cytology if one takes it globally. …if we actually did or organised a screening programme so that one had either an abnormal category v’s a normal category this particular additional data would show that in fact Sydney would have separated all the correct results into the need investigation group whereas the original laboratory would have not identified a significant percentage of women…

Q So which laboratory is better?

A The Sydney re-read would in terms of screening programmes be much more accurate because the whole purpose …is to separate out … the women that did deserve to have further investigation whereas [in the case of Gisborne Laboratories’ reporting] there would have been 32% of women in this particular population who had high-grade lesions who would have then been returned to the screening pool and said that they don’t actually have to have another smear for 3 years.

CHAIR INTERJECTS

CHAIR: Would you just say why that is? Could you just say how you come to that conclusion?

A Again, I’m using the very simple concept of a screening programme, talking about sensitivity and leaving aside specificity, and if we take the example that a screening programme should be designed …to detect abnormalities that are present in the screened population or the potentially screened population, and if one takes a very simplistic premise that you call that group perfectly okay, they can return and come back for their next Pap smear in 3 years as opposed to the group that needs to have something further done - and arguably that is the whole purpose of the screening programme - then the Sydney re-read would have …put all the women who had abnormalities present and high-grade significant abnormalities, which is the one we’re trying to detect, …into the "correct" basket, for want of a better word. Whereas in the original re-read, …, there would have been 68 women who were arguably falsely reassured that they had nothing wrong with their cervix and could just return for a further smear.…

Q Yes. So these 68 women are women who would have [been] read … as normal, [were] put back into the screening population, therefore, when in fact they should have gone on to colposcopy?

A Yes, exactly, which is about one third of the women.

Dr Farnsworth was questioned by the Chair on this aspect of the third indicator:

Q it seems that the third indicator falls in to two parts, this is the second part –

 A that’s right.

Q - which we hadn't considered before.

A That’s right,…but it is further evidence.

...

Q - further evidence of –

 A Of significant under-reporting.

4.22 Dr Farnsworth acknowledged that each indicator on its own was not sufficient to support the conclusion that Gisborne Laboratories had an unacceptable level of under-reporting. Indeed she was careful in her evidence to point out the dangers of relying on one indicator in isolation. She also acknowledged that the calculations from tables 5.3 and 5.4 of exhibit TM/HFA/087 only allowed a comparison between the performance of the two laboratories in relation to their reporting on the results given in those tables. However, the combination of the three indicators signified to her that there had been an unacceptable level of under-reporting by Gisborne Laboratories:

Q And if we could just go back over to summarise, we’ve gone through the three indicators, if we take each of these three indicators and look at them as a group, do the three of them together go someway to providing an indication that Dr Bottrill was under-reporting?

A Yes they do.

Q And on a 10 point scale if you can, can you tell me how far does the combination of these three indicators take you?

A You want an under-reporting, 10 is high and 0 is low?

Q Yes.

A They indicate a very high level of under-reporting, a very high level and if one wanted to grade it from 10 being the highest level of under-reporting you could have v’s 0 to no under-reporting I’d give him an 8.

Q Right. Would you say that was unacceptable under-reporting?

A Absolutely.

Q Now the other point I’d like to know is, you’ve come to this opinion on the basis of these three indicators. Are they sufficient to come to a view on under-reporting or do you need to take other factors into account. In other words, can you reliably say on the basis of these three indicators, there has been under-reporting to a level of an 8 which you would say is unacceptable?

A These three indicators would allow me to say that but there are other factors that I am aware of which would also influence, if you wanted to ask me again, from other points of view but alone these three indicators would indicate…an 8 level of under-reporting.

Other Evidence Showing Unacceptable Under-reporting

4.23 Other witnesses also gave evidence which supported the conclusion that the level of under-reporting was unacceptable. Professor David Skegg suggested that the Committee should consider the number of women who had developed invasive cervical cancer despite being screened regularly. Since the purpose of a cervical screening programme is to identify those women with pre-cancerous abnormalities and to offer them early treatment before the abnormalities develop into cervical cancer, a successful screening programme should prevent pre-cancerous abnormalities from developing into invasive cervical cancer. If, therefore, in a population of women who are screened regularly there are a substantial number of cases of cervical cancer which could have been prevented if detected at the pre-cancerous stage, that indicates an unacceptable level of under-reporting.

4.24 Professor Skegg said that the three indicators which Dr Farnsworth presented had demonstrated to him that there had been "a substantial under-reporting." For him a "striking" factor, which he derived from the data in table 5.6 of exhibit TM/HFA/87, was that in the case of 16 women who developed cervical cancer Gisborne Laboratories had read their cervical smear tests as normal whereas Douglass Hanly Moir Pathology had read the same cervical smear tests as cervical cancer or high-grade/cancer abnormality. Professor Skegg considered that, even when the high reporting rate of Douglass Hanly Moir Pathology, which was high in comparison with New Zealand laboratories overall, and other limitations on the use of the data in TM/HFA/87 were taken into account, this difference in reporting high-grade abnormalities or cervical cancer was significant and showed Gisborne Laboratories to have been reporting at an unacceptable level:

A Just returning to this table [5.6 exhibit TM/HFA/87] for a moment, even though I believe one must temper one’s conclusions with the awareness that the Sydney laboratory was reporting at a much higher level than any NZ laboratory, I still think these two observations, the first is that there were 17 women who developed cervical cancer after having 1 or more normal smears is striking, and even though we may have to set aside 6 of those 17 as possibly being diagnostic, and also the dichotomy from the Sydney results, the fact that in the second two periods which I think– one can put the most reliance on, that 16 had all been reported as either normal or low-grade or ASCUS by Dr Bottrill and all [were] reported as high-grade or cancer by Sydney, I believe that that does indicate a substantial level of under-reporting.

Q You’ve said there is a substantial level of under-reporting. Would you be prepared to grade it on a scale from 1 to 10, 10 being the worst case of under-reporting and 1 being the least serious case of under-reporting. Where would you say this level of under-reporting fell?

A I’m sorry to be unhelpful but I think that will be very subjective and I would be unwilling to do it. All I can say is that it seems to me very substantial.

Q When you say it’s very substantial would you say that it was unacceptable?

A Yes, I would.

4.25 Dr Cox used the data in table 5.6 of exhibit TM/HFA/87 to calculate the sensitivity of the reporting of the two laboratories. He concluded that Gisborne Laboratories had a sensitivity of 43.5% whereas Douglass Hanly Moir Pathology had a sensitivity of 95%. He described the sensitivity of Gisborne Laboratories as being unacceptably low:

A I’d like to start, if I may, on 5.6 because I believe that this table is very crucial to the term of reference 1 as has been identified yesterday. I would like to use this table to estimate the laboratory sensitivity for the detection of high-grade or cancer of both Dr Bottrill’s laboratory and the Sydney laboratory. And to do that I would like to invoke an assumption that of those who’ve developed cancer right through to beyond May 1999 that they had either cancer or high-grade throughout the entire period.

CHAIR: What period’s that?

A From 91 right through. Now I realise that it is possible, although I think a relatively small probability, that high-grade or worse has not been present throughout, and for many of these it may have been high-grade and then subsequently developed cancer. And if I invoke that, the original laboratory or Dr Bottrill’s laboratory, which is 5.6b, we end up …with an original laboratory sensitivity of 35.9% in my calculations …which is 14 over 39, and if you [do] a similar thing for the re-read at the Sydney laboratory and I’m not including ASCUS H in at this time …you end up with 37 out of 39 being positive which would give a sensitivity for that laboratory of 95%. Now I realise that I would also like to invoke a benchmark of say 85% laboratory sensitivity. Now I know normally in terms of Dr McGoogan’s evidence that has been calculated in a very different manner to do with rereading of slides within the laboratory but if I invoke that then Dr Bottrill’s sensitivity as I measure [it] is statistically significantly lower than that benchmark. Moreover the benchmark would have to be 51% for the difference between the benchmark and Dr Bottrill’s laboratory to not be statistically significant and I believe that even under the assumptions I need to invoke if you like to calculate these sensitivities, a figure of 51% would not be agreed on by anybody.

PROFESSOR DUGGAN: Could I just ask you to clarify one thing. For Dr Bottrill’s laboratory you are accepting as a predictor of the cancer his 6 diagnoses of cancer in the first row, the 6 of high-grade in the second row and the 5 low-grade.

A Sorry I have missed that. I take that back.

A I can recalculate things but I still don’t think and I’m pretty sure –

CHAIR INTERJECTS

CHAIR: Could you please recalculate so we’ve got something.

A 43.5%. And I therefore need to do something a little different. In which case the benchmark cut off that I mentioned before would not be 51% it would be 59% and I still believe that would not be a level which would be acceptable.

PROFESSOR DUGGAN: Just for the committee how did you calculate that benchmark of 59%.

A I believe the variance for a binomial proportion which is what the laboratory sensitivity is what’s called PQ/N. P which is this probability here of .435 x 1 minus that figure divided by the number overall which is 39 and the square root of that figure is the standard deviation. By taking that standard deviation and multiplying it by 1.96 which is a standard figure in the normal distribution table for 95% confidence interval or limit you get a figure of something like .15. You then have to add that to your original .435 because when you just multiply the standard deviation by 1.96 you get the difference between a benchmark and this particular figure then you have to add that difference to the figure so from that I calculate that the benchmark would need to be 59% for there not to be a statistically significant difference between Dr Bottrill’s sensitivity invoking the assumptions I did and the benchmark. Obviously the re-read laboratory has a figure and I hope I got this right of 95% sensitivity and is obviously – would be very acceptable.

Q So the Sydney reporting is acceptable?

A On the basis of table 5.6 and the assumptions that I invoked except in terms of its estimated sensitivity. There are other issues with the Sydney laboratory but not related to the sensitivity.

Q What about Dr Bottrill’s result.

A Dr Bottrill’s result I believe is unacceptably low.

CHAIR INTERJECTS

CHAIR: You said you’ve used as a reliable benchmark a figure of 85% where did you get that from?

A …I just said I would invoke it partly because in Dr McGoogan’s evidence in calculating the laboratory sensitivity a very different way which was by relooking at slides, their range of laboratory sensitivities .85 - .95, 85% or 95% for their standard as you like.

Q So you’re using it as a rule of thumb here.

A I was trying to use that as a rule of thumb as a starting point. I realise the benchmark and the way this is calculated is quite different and so I actually prefer to calculate what the benchmark would need to be.

Q And on that basis then you have a benchmark of 59% and in your view that would be too low by anyone’s standards.

A Yes.

PROFESSOR DUGGAN: Dr Cox even if you were to evaluate this data without using the 85% benchmark put forward by Dr McGoogan, a sensitivity of 95% for Sydney versus a sensitivity of 43.5% for Dr Bottrill, could you comment on those just approaching it as an inter-laboratory comparison where variables for each laboratory are essentially controlled except for the reporting of the smear?

A Well obviously that difference is even greater than the benchmark I invoked and is highly statistically significant. The issue here is that laboratories set their own trade-off between sensitivity and specificity, which is a technical term. I think they’ve been defined to the Inquiry earlier. And each laboratory is probably different in the balance between sensitivity and specificity they choose. Unfortunately in some laboratories it occurs by default rather than by intent. I think here we have a situation where we have if you like, two opposite extremes where the Sydney laboratory has a high sensitivity in terms of laboratory reporting and Dr Bottrill’s laboratory has a relatively low sensitivity. ….

…A And the Sydney laboratory has a high specificity but it’s lower than Dr Bottrill’s. So we have this contrast and the trade-off is that if the Sydney laboratory had been, if you like, reading the smears through to the time period of 1991 to 1996 then we would most likely detect something like twice as many cancers and we would have had about 3., or maybe 3 times the amount of referral for colposcopy or having a repeat smear. I must say that in these calculations I have to acknowledge that there is a combination of both screening smears and diagnostic smears within the series, but I would expect that the presence of diagnostic smears to actually increase the sensitivity because most times I would expect an indication or signs or symptoms on the request form which would heighten the reader’s index of suspicion when reading the smears in the first place.

Q the assumption you have made that the women concerned were likely to have cancer or high-grade abnormality between 91 and 99, how comfortable are you with making that assumption – in other words, is there a high probability that that was so, a low probability, in the middle – what?

A I believe there’s a high probability that great majority of those people who developed the cancer during the period will have had high-grade or as I’ve said earlier, low-grade or cancer present on their cervix all the way through.

Q So if you were doing this as an epidemiological study you would feel scientifically comfortable about making that assumption?

A I would feel some nervousness about making the assumption, and in a way I am disappointed in the sense that from the way the tables are created, you expect that the individual record data would allow this to be calculated in a different way that might be much more informative and reduce that possibility. So I have some nervousness about the assumption but I think, in terms of comparative purposes, it applies to both.

Yes, thank you.
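
The arithmetic Dr Cox described in evidence can be retraced with a short sketch. It is offered only as an illustration of the method as he explained it (the square root of PQ/N multiplied by 1.96 and added to the estimated sensitivity), not as part of the evidence. The function and variable names are hypothetical, and the inputs are the figures stated in the transcript: 14 of 39 for his first estimate, .435 for his recalculated estimate, and 37 of 39 for the Sydney re-read.

```python
import math

Z_95 = 1.96  # normal-distribution multiplier quoted in the evidence


def binomial_sd(p, n):
    """Square root of PQ/N, which the witness calls the standard deviation."""
    return math.sqrt(p * (1.0 - p) / n)


def benchmark_cutoff(p, n, z=Z_95):
    """Benchmark at which the difference from p stops being significant."""
    return p + z * binomial_sd(p, n)


N_WOMEN = 39  # women in table 5.6 covered by the witness's assumption

first_estimate = 14 / N_WOMEN    # initial figure given in evidence (35.9%)
revised_estimate = 0.435         # recalculated figure, as stated in evidence
sydney_estimate = 37 / N_WOMEN   # re-read laboratory (about 95%)

for label, p in [("first estimate", first_estimate),
                 ("revised estimate", revised_estimate)]:
    cutoff = benchmark_cutoff(p, N_WOMEN)
    print(f"{label}: sensitivity {p:.1%}, benchmark cut-off {cutoff:.1%}")

print(f"Sydney re-read: sensitivity {sydney_estimate:.1%}")
```

On these inputs the cut-offs come out at roughly 51% and 59%, matching the figures given in evidence, and both sit well below the 85% rule of thumb that Dr Cox mentioned.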

4.26 A subsequent audit of the data in exhibit TM/HFA/087 by the Health Funding Authority revealed that it had wrongly recorded data in some of the tables. An audit of table 5.6, which Professor Skegg and Dr Cox had each relied upon to reach their separate conclusions that Gisborne Laboratories had under-reported at an unacceptable level, could not confirm the diagnosis of one of the 39 women recorded as having cancer. Dr Cox was asked to provide additional expert evidence to the Committee on the epidemiological impact of the one unconfirmed diagnosis in table 5.6 on the conclusions which he had reached. His evidence, which was given to the Committee in the form of an unsworn written statement, was that:

" …reducing the number of women with invasive cervical cancer by one, to 38 would not appear to be sufficient to alter the conclusion that there was a significant level of under-reporting of cervical cytology in Gisborne."

4.27 Dr Wain was another expert witness who considered that the statistical data contained in exhibit "TM/HFA/87" showed there had been an unacceptable level of under-reporting. Of all the women diagnosed with invasive cervical cancer, Gisborne Laboratories Limited had reported only 30% as having either a high-grade/cancer abnormality or abnormal cells suspicious but not conclusive of HSIL (ASCUS-H), whereas Douglass Hanly Moir Pathology had reported every one in the group as having either a high-grade abnormality or cancer:

Q Would you agree with this summary, that all of the women who developed cancer were re-read by Sydney as cancer high-grade or ASCUS-H?

A Yes.

Q Whereas only 12, which is 30% of the women who developed cancer had their smears read by Dr Bottrill as cancer or high-grade?

A I would agree with that.

Q What do those rates mean to you?

A I think that number 1 it confirms to me that the Sydney re-read is likely to be correct in those women since they’ve all been subsequently shown to have cancer and number 2 that Dr Bottrill wasn’t very good at picking up women with definite abnormalities on their cervix.

Q Could this be under-reporting by Dr Bottrill?

A I think it is almost certainly under-reporting.

Q Could it be anything else?

A When you compare the two I can’t think of anything else that it could be.

CHAIR INTERJECTS

CHAIR: From that table alone are you able to give an indication of the level of under-reporting?

A It's extreme.

CHAIR: On a 10 point scale, with 1 being the lowest, 10 being the highest, where would you put the level of under-reporting on the basis of that table which is table 5.6 in the exhibit 87 of Mellor’s supplementary?

A I feel like an Olympic judge! I’ve heard you ask that question yesterday and thought it was a very difficult question. I think this is as bad as it gets.

Q So where would you put it.

A: 10.

Q You’d give it a 10. And would you say that was unacceptable under-reporting?

A Completely unacceptable.

PROFESSOR DUGGAN: Dr Wain I have one further question about this table. You have already mentioned that in your practice the women who present with invasive cancer have not been screened and it's rare for you to manage a woman with invasive cancer who has had a Pap smear. Looking at these two tables here what can you say about these women who have developed invasive cancer in the Tairawhiti region?

A It certainly doesn’t match with my clinical experience and they have been very unlucky to have developed cervical cancer despite the fact that they’ve gone through the process of having Pap smears. They’re a screened population but they’ve got no benefit from screening.

Q Thank you.

4.28 Dr Ron Jones was a part of the HFA advisory group for the Sydney re-read and was involved in providing follow up colposcopy services. The data from colposcopy is complicated (as he explained) because colposcopy is, like cytology, not an exact science. Accepting that limitation on the data, however, Dr. Jones’ evidence was that the colposcopy follow up data also tended to support the accuracy of the Sydney re-read because a number of women with non symptomatic invasive cervical cancer were detected as a result of the re-read. There were more cancers than he expected to see.

4.29 Because some false negative results are expected, a cervical screening programme depends on women having cervical smear tests at regular intervals so that an abnormality which a laboratory misses on one occasion will be less likely to be missed on a subsequent occasion. Although Dr. Wain only considered the records of a small group of women, he was struck by the number of what appeared to him to be repeat misreads. After considering the cases of more than one misread, and some cases of women with 5 and even one with 6 apparently misread slides, he said:

"I am not a gambler but if you work out the probability of that happening, it must be extraordinarily rare…almost unbelievable."

4.30 The impression Dr Wain had from looking at the patient files was consistent with his expectation of the natural progression of the disease in the absence of a screening programme. Since the population seemed to him to have been well screened (meaning that there were a high number of enrolments) it was his view that:

"somewhere along the way things were going wrong very badly"

4.31 There were other factors which, on their own, are not reliable indicators of under-reporting; however, when considered together with the above evidence, they support the conclusion that there was an unacceptable level of under-reporting at Gisborne Laboratories. First, there is a marked difference between the reporting rates for high-grade abnormalities when Dr Bottrill was in practice and after he retired and the business of Gisborne Laboratories was sold to Medlab Hamilton. The Committee is aware that there are issues surrounding the question of whether reporting rates of abnormal test results are in themselves a reliable indicator of laboratory performance; nevertheless, it considers that the difference in the level of reporting of abnormalities before and after Dr Bottrill’s retirement is so great that the Committee can take note of it.

4.32 Secondly, statistics which were prepared jointly by the Ministry of Health and the Health Funding Authority and produced in evidence to the Committee show that a regional analysis of cervical cancer incidence between 1990 and 1997 puts the Gisborne region at the second highest rate of cervical cancer in New Zealand. The analysis of these statistics included the calculation of the ratio of observed numbers of cases to expected numbers of cases expressed as a percentage. This percentage was called the standardised registration ratio. The national average was expressed as 100% and standardised registration ratios higher than 100% were above the national average and conversely percentages lower than 100% were below the national average. The Gisborne region had a standardised registration ratio of 181.3% or almost twice the national average. Therefore, one would expect to see a higher rate of abnormalities being reported from this region. However, the reporting rate of abnormalities in the period from 1990 to March 1996 was low. In contrast, the reporting rates for abnormalities after March 1996, when Medlab Hamilton took over the business of Gisborne Laboratories, seem to fit better with the region’s significantly high rate of cervical cancer.
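
The standardised registration ratio described in paragraph 4.32 is the observed number of cases divided by the expected number, expressed as a percentage. The following is a minimal sketch of that calculation. The function name and the counts of 29 observed and 16 expected cases are hypothetical values chosen only to give a ratio near the reported 181.3%; the actual counts are not set out in this section.

```python
def standardised_registration_ratio(observed, expected):
    """Observed cases as a percentage of expected cases (100 = national average)."""
    return 100.0 * observed / expected


# Hypothetical counts for illustration only; not figures from the evidence.
example_ratio = standardised_registration_ratio(observed=29, expected=16)
print(f"Illustrative ratio: {example_ratio:.2f}%")

# The reported Gisborne ratio, read as a multiple of the national average.
gisborne_multiple = 181.3 / 100.0
print(f"Reported ratio as a multiple of the national average: {gisborne_multiple:.3f}")
```

A ratio of 100% would mean the region recorded exactly the number of cases expected from national rates, so 181.3% corresponds to a little over 1.8 times the expected number.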

4.33 The Committee is drawn to the conclusion that it is difficult to think of any convincing explanation for the sharp increase in the number of abnormalities being reported other than that, after the sale of Gisborne Laboratories, Dr Bottrill had stopped reading the cervical cytology of women in the region. Further support for this conclusion can be obtained from the anecdotal observations made by the local programme co-ordinator Ms Reid in June 1997 in her report, which appears in exhibit "JMG/MOH 62", that there seemed to be more high-grade abnormalities being diagnosed than previously.

4.34 Thirdly, there is the evidence of Ms Tracy Mellor of the Health Funding Authority on the rate of abnormality reporting since 1991 which is the time from when women were recording their first smear on the National Cervical Screening Register. This information is to be found in exhibit "TM/HFA/85". It shows that the reporting rates of Gisborne Laboratories for abnormalities remained about the same despite the fact that by 1994 and 1995 over half of the women enrolled on the National Cervical Screening Programme were having their second or a subsequent smear. If screening were providing a benefit one would expect to see a drop in the abnormality rates. The fact that rates did not drop over time can also be seen as an indication of under-reporting.

4.35 Fourthly, there is the evidence of Mr. Jim du Rose on 116 smear tests reported as high-grade or cancer by Douglass Hanly Moir Pathology in TM/HFA/87 at p51, but which were originally reported as normal by Gisborne Laboratories. More than half (53.4%) of these false negative smear tests from Gisborne Laboratories were subsequently confirmed as high-grade or cancer by histology.

4.36 Finally, the evidence the Committee heard from Dr Ron Jones, Dr Teague and Dr Tie is consistent with under-reporting. Moreover, it is significant that the Committee has not heard any evidence to suggest that the rate of reporting abnormalities at Gisborne Laboratories was acceptable. Indeed, Dr Bottrill himself accepted that his level of under-reporting was unacceptable:

Q: Do you now accept, from what you have seen, read of the evidence that has been given that during the period 1991 to March 1996, there has been an unacceptable level of under-reporting of cervical smears in the Gisborne Region as a consequence of your misreading and/or misreporting of those smears?

A: Regretfully yes (B3079/24).

Conclusion

4.37 In view of the evidence the Committee has heard on term of reference one it has no difficulty in concluding that there has been an unacceptable level of under-reporting in the Gisborne region in the period to which this term of reference relates. The Committee has been able to reach this conclusion even though during the relevant period there were no performance standards in place against which the performance of Gisborne Laboratories could be measured. Although at an early stage in the inquiry hearings there was evidence to suggest that the Committee might not be able to answer this term of reference without the assistance of an audit of the cases of cervical cancer, in the end on the evidence available the conclusion which the Committee has reached was inevitable.

 
