Has there been an unacceptable level of under-reporting
in consequence of misreading and/or mis-reporting of abnormalities in
cervical smears in the Gisborne region?
4.1 The Committee of Inquiry is satisfied that there
has been an unacceptable level of under-reporting of abnormalities in
cervical smear tests in the Gisborne region during the period from 1991
to March 1996. The Committee has only heard evidence in regard to cervical
smear test readings by Gisborne Laboratories Limited. It has heard no
evidence which would have allowed it to determine whether or not there
had been under-reporting of cervical smear tests read in the laboratory
at Gisborne Hospital and therefore it is unable to comment on the performance
of that laboratory’s reading of cervical smear tests during the relevant
period. Its finding on the presence and the level of under-reporting
of cervical smear tests in the Gisborne region is based only upon an
analysis of the performance of Gisborne Laboratories.
4.2 Because the terms of reference directed the Committee
to look into the reading of abnormalities in cervical smear tests in
the Gisborne region prior to March 1996, it has not heard sufficient
evidence on the period after March 1996 to be able to comment on laboratory
performance since then. It has heard no evidence to suggest that there
has been an unacceptable level of under-reporting of cervical smear
tests from the Gisborne region since March 1996. However, as a comprehensive
evaluation of the performance of the National Cervical Screening Programme
has never been completed and laboratory cervical smear test reporting
is still not routinely monitored, the Committee considers that the quality
of cervical smear test reporting for this later period is unknown.
4.3 Dr Bottrill read most of the cervical smear tests
that were carried out in Gisborne Laboratories. There were times when,
either because Dr Bottrill was absent on leave or because extra help was
needed, locum pathologists were used to read the cervical smear tests.
However, the evidence shows that the reading of cervical smear tests
by these persons cannot account for the under-reporting which has occurred.
4.4 By the end of the inquiry hearings there was clear
evidence before the Committee that, among cervical smear tests that were
carried out in Gisborne Laboratories during the period under consideration,
positive tests had been under-reported to an unacceptable extent.
Initially, the task of determining whether or not there had been an
unacceptable level of under-reported cervical smear tests in the Gisborne
Region seemed intractable. The reading of smear tests is based upon
a microscopic evaluation of the smear test by one or more observers.
Evaluation is prone to human error for a number of reasons, chief amongst
them the difficulty of consistently maintaining the high level
of concentration needed to detect abnormal cells and the fact that
the interpretation of an abnormal cell is somewhat subjective. Some
under-reporting of cervical smear tests in consequence of misreading
and/or misreporting is therefore inevitable. Even in well-run laboratories
with state-of-the-art technology and appropriate quality control systems,
cervical smear tests can be under-reported.
4.5 Failure of the laboratory to detect pre-cancer
or cervical cancer cell changes when the abnormal cells are actually
present in the smear test is referred to as a false negative result.
A false negative result is defined in a number of ways, and consequently
the false negative rate can be measured in a number of ways. One approach
is to measure how many high-grade lesions confirmed by biopsy had a
negative smear test report 6 months prior to that biopsy. Another is
to re-read all or a proportion of a laboratory’s negative smears to
measure how many were actually abnormal. While there are published standards
for false negative rates using these definitions in other countries,
from the evidence given, the Committee understood that the false negative
rate of any laboratory could only be compared to another laboratory
or a published standard if the methodologies for measuring the false
negative rate were the same. While the Committee was not specifically
charged with investigating the over-reporting of cervical smear abnormalities
in the Gisborne region, this form of laboratory error, ie the reporting
of a cellular abnormality when none is present in the test, did come
into evidence during the Inquiry. This type of error is also called
a false positive result and, like a false negative result, can be
defined and measured in a number of ways. Published standards are also
in existence in some countries. The same caution must be used when comparing
false positive rates from different laboratories and published standards
as is used when comparing false negative rates.
4.6 The Committee’s task was made even more difficult
by the fact that standards did not exist for New Zealand and the methodologies
used by the Health Funding Authority to determine the false negative
rate of Gisborne Laboratories were not comparable to those of published
methodologies. In some countries with established screening programmes
quantitative standards for reporting cervical smear tests have been
set to provide a measurement of laboratory performance. During the time
that Dr Bottrill was in practice the National Cervical Screening Programme
imposed no quantitative standards on laboratory performance. Apart from
extreme cases of under-reporting, which on any view would be unacceptable,
without clearly set standards against which to measure a laboratory’s
performance it is not easy to distinguish unacceptable under-reporting
from the accepted level of under-reporting that is inherent in cervical
smear evaluation.
4.7 The absence of quantitative standards over the
relevant period and the inevitability of some under-reporting have meant
that the Committee of Inquiry has had to determine for itself what is
an unacceptable level of under-reporting of cervical smear tests. The
Committee was advised by more than one expert witness of the need for
it to take a common sense view of the matter. The Committee agrees with
this advice. In the end it has chosen to consider the combined effect
of a number of indicators to assist it to report on term of reference
one. The Committee recognises that no single indicator may be sufficient
to reach a conclusion on the level of under-reporting; however,
the combined effect of these indicators convinced it that there
had been an unacceptable level of under-reporting. The Committee considered
that to reach a common sense view it would adopt the test the common
law uses to determine civil issues: namely the balance of probabilities.
However, having heard all the evidence the Committee was in no doubt
whatsoever that there had been unacceptable under-reporting.
4.8 The Committee has received evidence from more than
one source which shows that at Gisborne Laboratories there was a failure
to read correctly the cervical smear tests of a large number of women
in the Gisborne region and that many of these women went on to develop
cervical cancer which could have been prevented if their pre-cancer
had been detected earlier on. When the results of the re-examination
of Gisborne cervical smear tests by Douglass Hanly Moir Pathology (the
Sydney re-read) are compared with the original smear test reports from
Gisborne Laboratories the high level of under-reporting becomes apparent.
In total 22,976 slides were sent to Sydney for re-examination. Of these
slides Dr Bottrill had originally read 20,860 and the locums, employed
by Gisborne Laboratories, had read 2,116. From these figures, which
appear in exhibit TM/HFA/097, it can be seen that the impact of the
locums’ reading at Gisborne Laboratories was negligible.
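The slide counts in paragraph 4.8 can be checked with simple arithmetic. The short Python sketch below uses only the three figures quoted from exhibit TM/HFA/097; the variable names are illustrative and do not come from the exhibit.

```python
# Slide counts quoted in paragraph 4.8 (exhibit TM/HFA/097).
total_slides = 22_976      # slides sent to Sydney for re-examination
read_by_bottrill = 20_860  # originally read by Dr Bottrill
read_by_locums = 2_116     # originally read by locum pathologists

# The two reader groups should account for every slide that was sent.
accounted_for = read_by_bottrill + read_by_locums

locum_share = read_by_locums / total_slides
bottrill_share = read_by_bottrill / total_slides

print(f"Slides accounted for: {accounted_for} of {total_slides}")
print(f"Read by locums:       {locum_share:.1%}")
print(f"Read by Dr Bottrill:  {bottrill_share:.1%}")
```

On these figures the locums read roughly 9% of the slides that were re-examined and Dr Bottrill read roughly 91%.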
4.9 The Committee has had the benefit of hearing from
a number of expert witnesses whose evidence on this term of reference
has been of great assistance to the Committee. The witnesses included:
Dr Annabelle Farnsworth MB BS(Hons), the director
of cytopathology at Douglass Hanly Moir Pathology;
Dr Euphemia McGoogan MB ChB, member of the Royal
College of Pathologists. Her area of special expertise is cervical
cytopathology. She is currently Pathology Patient Services Director
for the Lothian University Hospitals NHS Trust in Edinburgh and as
such is responsible for the largest combined morbid anatomy, histopathology
and cytopathology service in the UK;
(ii) Professor David Skegg BMedSc; ChB (Otago); DPhil
(Oxon); FFPHM; FAFPHM; FRSNZ, Professor of Preventive and Social Medicine
at the University of Otago Medical School. He has carried out extensive
research on the causes and control of cancer.
(iii) Dr Brian Cox BSc (Hons) MB ChB PhD, specialist
in public health medicine and an epidemiologist. He is employed by
the University of Otago as a Senior Research Fellow and he is the
director of the Hugh Adam Cancer Epidemiology Unit, Department of
Preventive and Social Medicine, University of Otago Medical School.
He is a Fellow of the Australasian Faculty of Public Health Medicine
and he is registered as a specialist in public health medicine.
(iv) Dr George Wain MB BS, Fellow of the Royal Australian
College of Obstetricians and Gynaecologists. He holds the Certificate
of Gynaecological Oncology of the Royal College of Obstetricians and
Gynaecologists. He is the Director of Gynaecological Oncology at Westmead
Hospital in Sydney and a Senior Lecturer in Gynaecological Oncology
at the University of Sydney.
4.10 The Health Funding Authority provided for the
Committee a report titled the Action Update Report, (received as exhibit
TM/HFA/087). This report updated the results of the Sydney re-read as
compared with the results of Gisborne Laboratories. In the course of
her evidence to the Committee Dr Farnsworth produced a document (exhibit
AF/HFA/004) which set out her analysis of the data in the Action Update
Report. She elaborated on this analysis when questioned by the Committee.
The analysis Dr Farnsworth provided in exhibit AF/HFA/004 produced three
discrete indicators of the two laboratories’ performance. These three
indicators were enough to satisfy Dr Farnsworth that there had been
an unacceptable level of under-reporting of cervical smear tests at
Gisborne Laboratories.
4.11 For the purpose of understanding the first indicator
it is important to note that in the interchange between Dr Farnsworth
and the Committee the term "false positive reporting" was
defined as the percentage of smear tests which were not confirmed by
the biopsy or for which there were no biopsy results. However, when
Dr Farnsworth came to give evidence on the second indicator the definition
of false positive changed from that used in the first indicator. Here
the term "false positive" referred to the percentage of women
with normal histology who had been reported as having high-grade/cancer
cytology. To arrive at these percentages the denominator used to calculate
the first indicator included all the women with a high-grade/cancer
cytology result recorded in tables 5.3 and 5.4 of exhibit TM/HFA/087
regardless of whether or not they had histology results recorded as
well. The denominator used to calculate the second indicator only included
those women recorded in tables 5.3 and 5.4 who had histology results
and was restricted to women with negative histology. Women who did not
have histology results were not included. Similarly, for the third indicator
the group of women being considered, and the denominator used to calculate
the percentages, is different to the other two indicators. It follows
that because the denominators for each group are different each indicator
must be viewed discretely.
4.12 The first indicator is taken from the proportion
of women with high-grade/cancer cytology who were later confirmed on
biopsy as having high-grade/cancer histology. It is a measure of the
accuracy of high-grade/cancer cytology reporting. This indicator is
derived from data set out in tables 5.3 and 5.4 of exhibit TM/HFA/087.
The data in table 5.3 refers to the original reading by Gisborne Laboratories
and in 5.4 to the re-reading by Douglass Hanly Moir Pathology. Table
5.3 comprises 3 sub-tables (5.3(a) to (c)) of data which set out the
histology results from initial colposcopy in relation to the highest
original smear test result, for all women over three time periods. The
three time periods were 1991 to February 1996, March 1996 to April 1999,
and May 1999 to the present. Evidence before the Committee explained
that the data was presented in this format in order to allow for the
effect of time on the analysis and interpretation. The importance of
this related to the fact that cervical pre-cancer can over time regress
to normal or a lesser pre-cancerous lesion, persist unchanged, or progress
to a more severe pre-cancerous lesion or cancer. Dr Farnsworth
explained that for the purposes of the inter-laboratory comparison,
ie the comparison of Gisborne Laboratories with Douglass Hanly Moir
Pathology, the impact of time would be the same for both laboratories
and would not need to be allowed for. The results in exhibit AF/HFA/004
relate to aggregated data from the three time periods.
4.13 From table 5.3 the proportion of women who had
high-grade/cancer cytology reports from Gisborne Laboratories and who
were subsequently confirmed by biopsy as having high-grade/cancer histology
can be seen. Table 5.4 also comprises three sub-tables (5.4(a) to (c))
which set out the histology results from initial colposcopy, in relation
to the highest smear result read by Douglass Hanly Moir Pathology, for
all women over the same three time periods as in table 5.3. From table
5.4 the proportion of women who had high-grade cytology including cancer
reports from Douglass Hanly Moir Pathology and who were subsequently
confirmed by biopsy as having high-grade/cancer histology can be seen.
4.14 When table 5.3 is compared with table 5.4 two
points emerge. The first is that both laboratories had approximately
the same proportion of high-grade/cancer cytology confirmed as high-grade/cancer
by histology. The original smear test results showed that 37 out of
72 women who were reported as having high-grade/cancer cytology were
confirmed as high-grade/cancer on biopsy. This makes the confirmation
rate for high-grade/cancer cytology reported at Gisborne Laboratories
51.3%. The results of the Sydney re-read showed that 132 out of 260
women who Douglass Hanly Moir Pathology reported as having high-grade/cancer
cytology were later confirmed as high-grade/cancer on biopsy. This makes
the confirmation rate for high-grade/cancer cytology reported at Douglass
Hanly Moir Pathology 50.7%. These results indicate that each laboratory’s
confirmation rate for its high-grade/cancer smear results, at approximately
50%, was much the same. The remainder were either not confirmed by
histology or had no histology result yet available. Dr Farnsworth
gave evidence that some of these unconfirmed high-grade/cancer cytology
results could be due to false positive reporting, while others could
be correct, since the women’s disease status would remain unknown until they
had undergone a biopsy. The Committee understood from this evidence that the 50% confirmation
rate of each laboratory was a minimum rate and that the inclusion of
additional histology results might increase the confirmation rate of
one or both laboratories, but would not decrease it.
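The first indicator is a simple ratio: women confirmed on biopsy as high-grade/cancer divided by women reported as high-grade/cancer on cytology. The Python sketch below recomputes it from the counts quoted in paragraph 4.14. The function name is illustrative only. Conventional rounding gives figures about one tenth of a percentage point above those stated in the text, which appear to have been truncated; the comparison between the laboratories is unaffected.

```python
def confirmation_rate(confirmed_on_biopsy, reported_high_grade):
    """Share of high-grade/cancer cytology reports later confirmed by histology."""
    return confirmed_on_biopsy / reported_high_grade

# Counts quoted in paragraph 4.14 (tables 5.3 and 5.4, all time periods combined).
gisborne_confirmation = confirmation_rate(37, 72)
sydney_confirmation = confirmation_rate(132, 260)

print(f"Gisborne Laboratories:        {gisborne_confirmation:.1%}")
print(f"Douglass Hanly Moir Pathology: {sydney_confirmation:.1%}")
print(f"Difference: {abs(gisborne_confirmation - sydney_confirmation):.1%}")
```

Both rates sit just above 50%, less than one percentage point apart, which is the similarity the Committee relied on.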
4.15 Because the re-read exercise had been carried
out to ascertain if women whose cervical smear tests had been read at
Gisborne Laboratories were at risk there was a concern that when the
smear tests were re-read at Douglass Hanly Moir Pathology the screeners,
who would know that the smear tests were being re-read, would be overly
cautious in their approach. If the screeners at Douglass Hanly Moir
Pathology had been overly cautious this could lead them to over-report
smear tests as high-grade/cancer. In this case the results of the re-reading
would not provide a fair basis for comparison with the original results
of the readings at Gisborne Laboratories. For this reason doubts had
been raised about the usefulness to the Committee of the information
coming from the Sydney re-read. However it became clear to the Committee,
when it heard the evidence of Dr Farnsworth of Douglass Hanly Moir
Pathology, that both laboratories had a similar rate of accuracy in
reporting high-grade/cancer. If Douglass Hanly Moir Pathology had over-reported
the smear tests relative to Gisborne Laboratories, its confirmation
rate of high-grade/cancer cytology would have been lower than that of Gisborne
Laboratories. The similarity in their rate of accuracy was enough to
allay any doubts the Committee might otherwise have had about using
the results from the Sydney re-read as a basis for comparison with the
original results from Gisborne Laboratories. Hence, the Committee was
confident about using the information from the Sydney re-read results
for the purposes of determining if there had been under-reporting of
cervical smear tests at Gisborne Laboratories.
4.16 The second point to emerge from a comparison of
table 5.3 with table 5.4 is the more significant. When the original
results are compared with the results of the Sydney re-read a wide discrepancy
between the laboratories in the number of reported high-grade/cancer
cytology results becomes readily apparent. At Douglass Hanly Moir Pathology
132 smear tests had been read and confirmed by biopsy as high-grade/cancer
which is 3.5 times more than the 37 smears read as high-grade/cancer
by Gisborne Laboratories. This wide discrepancy between the number of
cervical smear tests recognised by Douglass Hanly Moir Pathology as
showing high-grade/cancer abnormalities, and the number recognised by
Gisborne Laboratories shows that at Gisborne Laboratories there was
a frequent failure to recognise the presence of high-grade/cancer abnormalities.
Dr Farnsworth’s evidence on this point was:
Question by Professor Duggan: I’m going to put this
statement to you and perhaps you can comment on it. What these calculations
[in exhibit TM/HFA/87] indicate to me is that the confirmation by
the biopsy of a smear called cancer or high-grade for both laboratories
over the three time periods are essentially the same?
A That’s right.
Q However, the number of smears confirmed
by Sydney [Douglass Hanly Moir Pathology] as high-grade is 3.5 times
more than the number of smears confirmed as high-grade by Dr Bottrill’s
laboratory?
A That’s right.…
Q What does that result mean to you?
A It actually means to me that …both
confirmation rates are essentially the same, but it would confirm
to me that the extra or the % of extra high-grades that we found were
in fact true high-grades.
Q At the same rate as Dr Bottrill?
A At the same rate as Dr Bottrill’s.
…
Q Dr Farnsworth, you may recall that
yesterday one of the very first points I inquired of you was in relation
to the histology.
A Yes.
Q Was the reading of the histology
for the period for both laboratories the same?
A It would be very much the same.
Q It’s the same. And any regression
of disease would be the same for both laboratories?
A Exactly.
Q And thereafter is it correct to
say that false positive reporting – ie the 50% that weren't recognised
or confirmed by the biopsy, some of that may be due to false positive
reporting or some may be due to disease that is yet to be detected?
A Yes, that’s also possible.
Q but this would apply to both Dr
Bottrill’s results and to your results?
A That’s exactly right.
Q so there is an internal standard,
in terms of the histopathology and the regression of disease for both
laboratories because you are comparing the same variables?
A Exactly.
Q And the only difference between
the two re-reads is that your laboratory detected 3.5 times more biopsy
confirmed high-grade disease than Dr Bottrill’s laboratory?
A That’s exactly right.
Q Now could this represent under-reporting?
A Yes.
Q By Dr Bottrill?
A Yes.
4.17 The second indicator is taken from the proportion
of women (in tables 5.3 and 5.4) with normal histology who had been
reported as having high-grade/cancer cytology. This indicator gives
a measure of the false positive reporting of each laboratory. Dr Farnsworth
told the Committee that the usual denominator used to calculate the
rate of false positive reporting is the number of normal histologies
on biopsy. The number of normal histologies on biopsy in tables 5.3
and 5.4 was 76 so this became the common denominator for calculating
the false positive reporting rate of Gisborne Laboratories and Douglass
Hanly Moir Pathology. Analysis of the data in tables 5.3 and 5.4 of
exhibit TM/HFA/087 shows that, over the three time periods, Gisborne Laboratories
reported three of the 76 women as having high-grade cytology
when they were later found on biopsy to have normal histology, whereas
Douglass Hanly Moir Pathology reported 22 of the same group of women
as having high-grade cytology when they were later found on biopsy
to have normal histology. This means that there was a wide discrepancy
between the false positive reporting rates of the two laboratories in
relation to the data. The false positive reporting rate of Gisborne
Laboratories was 3.9% whereas the false positive reporting rate of Douglass
Hanly Moir Pathology was 28.9%.
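The second indicator uses a common denominator of 76 women with normal histology on biopsy. The Python sketch below recomputes the two false positive reporting rates from the counts given in paragraph 4.17; the function and variable names are illustrative.

```python
def false_positive_rate(reported_high_grade, normal_histologies):
    """Share of women with normal histology whose cytology was reported as high-grade."""
    return reported_high_grade / normal_histologies

NORMAL_HISTOLOGIES = 76  # common denominator from tables 5.3 and 5.4

gisborne_fp_rate = false_positive_rate(3, NORMAL_HISTOLOGIES)
sydney_fp_rate = false_positive_rate(22, NORMAL_HISTOLOGIES)

print(f"Gisborne Laboratories:        {gisborne_fp_rate:.1%}")
print(f"Douglass Hanly Moir Pathology: {sydney_fp_rate:.1%}")
```

The results agree with the 3.9% and 28.9% stated above.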
4.18 In cervical smear reading there is always a trade
off between the sensitivity and specificity of a test. In the context
of high-grade/cancer detection, sensitivity is the proportion of all
people who have the disease who are correctly identified as such by
the test. Anyone with the disease who is not identified by the test
is a false negative. Specificity is the proportion of all people who
do not have the disease who are correctly identified as such by the
test. Anyone who does not have the disease but whose test is positive
is a false positive result. Pathologists would agree that some degree
of false positive reporting due to over-reporting (sometimes referred
to as over-calling) is acceptable, as that increases the probability
of high-grade lesions being detected and reduces the potential for under-reporting
a cervical smear test. When viewed against the 28.9% rate of Douglass
Hanly Moir Pathology the Gisborne Laboratories false positive reporting
rate of 3.9% appears to be extremely low and likely to carry with it
a greater risk of under-reporting. Dr Farnsworth’s evidence on this
point was:
Question by Professor Duggan: Dr Farnsworth,
what does this mean?
A It means that Dr Bottrill had a
very low false positive rate, especially compared to the Sydney re-read.
Q Now the Sydney re-read was geared
towards ensuring that women would have the best treatment?
A That’s right, yes.
Q And with that background, is it
likely that you over-called?
A It is perceived as over-calling
on the straight numbers. The appearances that we used to make the
…reports of high-grades are appearances that we use in our everyday
laboratory, and it may be that we do it in our normal day to day work.
…By increasing your sensitivity, which means increasing your false
positive rate, you do lower …specificity, …And I have heard it colloquially
put [as] where one sets the bar. But in a screening population where
the Pap smear is designed to…sort out women who need to be further
investigated from women who can then return to their normal screening
interval, it is an accepted practice to in fact increase one’s sensitivity
at the expense of specificity for that purpose. And it is an accepted
screening technique to in fact have a higher false positive rate so
that one can in fact detect as many …high-grade lesions as possible.
Q If I’ve heard you correctly then,
you have said that it is accepted in cervical screening practices
that the specificity will be compromised in order to attain a better
sensitivity –
A that’s right.
Q - and you are not surprised at the
false positive rate [of Douglass Hanly Moir Pathology]?
A Exactly. And although a false positive
rate is something that needs to be continually assessed and looked
at as part of a normal laboratory’s processes, it would be of great
concern if your false positive rate was extremely low because it would
mean that you are therefore missing a large number of the high-grade
lesions that you're in fact looking for.
CHAIR Could that mean if you had a
very low false positive rate that there was a greater likelihood that
you may be under-reporting?
A Absolutely, …If you have, …, a very
high … false positive rate, …it means that…some specificity will be
lost. But that is acceptable, and in fact, arguably, it’s the way
Pap smears should be read.
Q Therefore, if you were looking for
indicators of under-reporting could one possible indicator of under-reporting
be a very low false positive rate?
A Yes…. By the way, it’s important
that any one indicator is not taken alone.
Q No.
A Absolutely critical.
Q But taken with other indicators
a low false positive rate would be a factor that would suggest under-reporting.
A …They should never be taken in isolation
but yes in a group, but one would …look at the false positive rate
and then go straight to the false negative rate…they should balance,
…and in fact one would probably get more concerned if they didn’t
balance.
Q And the false positive rate that
you’ve just given in this exhibit working through with Dr Duggan for
Dr Bottrill’s laboratory, do you consider that to be high, low, acceptable.
I know that we don’t have standards here.
A The false positive rate that we’ve
just talked about of 3.9%?
Q Yes, what's your opinion of it.
A Well it's extremely low.
Q Right so it would be permissible
to take a false positive rate of 3.9% together with other factors
as an indicator of under-reporting.
A In isolation arguably it means that
the cytology that was seen was in fact spot on. …However if one is
talking about a population screening exercise and one saw a very low
false positive rate in association with a high false negative rate,
one would be very concerned for that screening population.
4.19 From the data in tables 5.3 and 5.4 the third
indicator is taken from the proportion of women with high-grade/cancer
histology whose cytology had been reported as abnormal. It is a measure
of true positive reporting by the laboratories. Dr Farnsworth described
the third indicator as showing under-reporting in the sense of failing
to recognise an abnormal smear and under-reporting in the sense of failing
to recognise the appropriate category of abnormality:
"… what we’re looking at here
is in fact under-reporting not just in the yes/no separation but under-reporting
within the categorisation of those [abnormal] appearances."
4.20 The third indicator has two parts: First, it takes
the proportion of women with high-grade/cancer histology whose cervical
smear tests had been reported as high-grade/cancer. Across all the time
periods, table 5.3 showed that out of 216 women with cancer/high-grade
histology, Gisborne Laboratories had reported 37 of them as having high-grade/cancer
cytology. Whereas table 5.4 showed that for the same group of women
Douglass Hanly Moir Pathology had reported 132 of them as having high-grade/cancer
cytology. These calculations show Gisborne Laboratories to have a rate
of 17% for detecting high-grade/cancer abnormalities whereas Douglass
Hanly Moir Pathology has a rate of 61%.
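The first part of the third indicator divides the number of women correctly reported as high-grade/cancer by the 216 women whose histology showed high-grade/cancer. The Python sketch below recomputes both rates from the counts in paragraph 4.20 and, because the denominator is shared, also shows the ratio between the two laboratories. The names used are illustrative.

```python
HIGH_GRADE_HISTOLOGY = 216  # women with high-grade/cancer histology (tables 5.3 and 5.4)

gisborne_detected = 37   # reported as high-grade/cancer by Gisborne Laboratories
sydney_detected = 132    # reported as high-grade/cancer by Douglass Hanly Moir Pathology

gisborne_detection_rate = gisborne_detected / HIGH_GRADE_HISTOLOGY
sydney_detection_rate = sydney_detected / HIGH_GRADE_HISTOLOGY

# With a shared denominator, the ratio of the rates equals the ratio of the counts.
detection_ratio = sydney_detected / gisborne_detected

print(f"Gisborne Laboratories:        {gisborne_detection_rate:.0%}")
print(f"Douglass Hanly Moir Pathology: {sydney_detection_rate:.0%}")
print(f"Ratio of detections:           {detection_ratio:.2f}")
```

The rates round to the 17% and 61% given above, and the ratio of a little over 3.5 corresponds to the "3.5 times" figure discussed under the first indicator.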
Dr Farnsworth’s comments on the wide variation between
the 17% reporting rate for Gisborne Laboratories and the 61% reporting
rate for Douglass Hanly Moir Pathology were:
…
CHAIR: Is the rate of 17% for Dr Bottrill’s
laboratory in the third indicator, I know we don’t have benchmark
standards in New Zealand but nevertheless, in your experience as a
pathologist would you describe that as a very low rate, low, medium,
high, whatever.
A It's extremely low.
Q Would you say was unacceptably low?
A Yes I would.
Q And can you say why?
A Back to my comments about cervical
cancer remember that we are actually screening for these lesions,
we are screening high-grade lesions both the Australian Government
and the New Zealand Government spend a large amount of money trying
to look after the women of their countries. These are the lesions
we are actually looking for because it's these lesions that by finding
them at this stage you can remove and actually prevent cancer. It
would seem to me that if you are picking up such a small percentage
of the actual disease that exists in that community of screened women,
then basically you shouldn’t have a screening programme at all because
it's not doing any good.
4.21 The second part of the third indicator looked
at the proportion of women shown in tables 5.3 and 5.4 with high-grade/cancer
histology whose cervical smear tests had been reported as abnormal but
to a lesser degree than high-grade or cancer. The data in table
5.3 showed that out of 216 women with high-grade/cancer histology Gisborne
Laboratories had read 111 of them as having abnormal cytology. Table
5.4 showed that out of the same group of women Douglass Hanly Moir Pathology
had read 85 of them as having abnormal cytology. The reporting rate
for Gisborne Laboratories for the three time periods was 51% whereas
the rate for Douglass Hanly Moir Pathology was 40%. Dr Farnsworth’s
evidence, when asked to comment on these rates, was that they showed
that Douglass Hanly Moir Pathology had more accurately read the cytology
of the 216 women whose results were given in tables 5.3 and 5.4:
Q Now what does this indicator mean
in terms of Dr Bottrill’s reporting and the Sydney laboratory reporting?
A It is in fact a more specific marker
of false negative cytology if one takes it globally. …if we actually
did or organised a screening programme so that one had either an abnormal
category v’s a normal category this particular additional data would
show that in fact Sydney would have separated all the correct results
into the need investigation group whereas the original laboratory
would have not identified a significant percentage of women…
Q So which laboratory is better?
A The Sydney re-read would in terms
of screening programmes be much more accurate because the whole purpose
…is to separate out … the women that did deserve to have further investigation
whereas [in the case of Gisborne Laboratories’ reporting] there would
have been 32% of women in this particular population who had high-grade
lesions who would have then been returned to the screening pool and
said that they don’t actually have to have another smear for 3 years.
CHAIR INTERJECTS
CHAIR: Would you just say why that
is? Could you just say how you come to that conclusion?
A Again, I’m using the very simple
concept of a screening programme, talking about sensitivity and leaving
aside specificity, and if we take the example that a screening programme
should be designed …to detect abnormalities that are present in the
screened population or the potentially screened population, and if
one takes a very simplistic premise that you call that group perfectly
okay, they can return and come back for their next Pap smear in 3
years as opposed to the group that needs to have something further
done - and arguably that is the whole purpose of the screening programme
- then the Sydney re-read would have …put all the women who had abnormalities
present and high-grade significant abnormalities, which is the one
we’re trying to detect, …into the "correct" basket, for
want of a better word. Whereas in the original re-read, …, there would
have been 68 women who were arguably falsely reassured that they had
nothing wrong with their cervix and could just return for a further
smear.…
Q Yes. So these 68 women are women
who would have [been] read … as normal, [were] put back into the
screening population, therefore, when in fact they should have gone
on to colposcopy?
A Yes, exactly, which is about one
third of the women.
Dr Farnsworth was questioned by the Chair on this
aspect of the third indicator:
Q it seems that the third indicator
falls in to two parts, this is the second part –
A that’s right.
Q - which we hadn't considered before.
A That’s right,…but it is further
evidence.
...
Q - further evidence of –
A Of significant under-reporting.
4.22 Dr Farnsworth acknowledged that each indicator
on its own was not sufficient to support the conclusion that Gisborne
Laboratories had an unacceptable level of under-reporting. Indeed she
was careful in her evidence to point out the dangers of relying on one
indicator in isolation. She also acknowledged that the calculations
from tables 5.3 and 5.4 of exhibit TM/HFA/087 only allowed a comparison
between the performance of the two laboratories in relation to their
reporting on the results given in those tables. However, the combination
of the three indicators signified to her that there had been an unacceptable
level of under-reporting by Gisborne Laboratories:
4.23 Other witnesses also gave evidence which supported
the conclusion that the level of under-reporting was unacceptable. Professor
David Skegg suggested that the Committee should consider the number
of women who had developed invasive cervical cancer despite being screened
regularly. Since the purpose of a cervical screening programme is to
identify those women with pre-cancerous abnormalities and to offer them
early treatment before the abnormalities develop into cervical cancer
a successful screening programme should prevent pre-cancerous abnormalities
from developing into invasive cervical cancer. If, therefore, in a population
of women who are screened regularly there are a substantial number of
cases of cervical cancer which could have been prevented if detected
at the pre-cancerous stage, that indicates an unacceptable level of
under-reporting.
4.24 Professor Skegg said that the three indicators
which Dr Farnsworth presented had demonstrated to him that there had
been "a substantial under-reporting." For him a "striking"
factor, which he derived from the data in table 5.6 of exhibit TM/HFA/87,
was that in the case of 16 women who developed cervical cancer Gisborne
Laboratories had read their cervical smear tests as normal whereas Douglass
Hanly Moir Pathology had read the same cervical smear tests as cervical
cancer or high-grade/cancer abnormality. Professor Skegg considered
that, even when the high reporting rate of Douglass Hanly Moir Pathology,
which was high in comparison with New Zealand laboratories overall,
and other limitations on the use of the data in TM/HFA/87 were taken
into account, this difference in reporting high-grade abnormalities
or cervical cancer was significant and showed Gisborne Laboratories
to have been reporting at an unacceptable level:
A Just returning to this table [5.6
exhibit TM/HFA/87] for a moment, even though I believe one must temper
one’s conclusions with the awareness that the Sydney laboratory was
reporting at a much higher level than any NZ laboratory, I still think
these two observations, the first is that there were 17 women who
developed cervical cancer after having 1 or more normal smears is
striking, and even though we may have to set aside 6 of those 17 as
possibly being diagnostic, and also the dichotomy from the Sydney
results, the fact that in the second two periods which I think– one
can put the most reliance on, that 16 had all been reported as either
normal or low-grade or ASCUS by Dr Bottrill and all [were] reported
as high-grade or cancer by Sydney, I believe that that does indicate
a substantial level of under-reporting.
…
Q You’ve said there is a substantial
level of under-reporting. Would you be prepared to grade it on a scale
from 1 to 10, 10 being the worst case of under-reporting and 1 being
the least serious case of under-reporting. Where would you say this
level of under-reporting fell?
A I’m sorry to be unhelpful but I
think that will be very subjective and I would be unwilling to do
it. All I can say is that it seems to me very substantial.
Q When you say it’s very substantial
would you say that it was unacceptable?
A Yes, I would.
4.25 Dr Cox used the data in table 5.6 of exhibit
TM/HFA/87 to calculate the sensitivity of the reporting of the two
laboratories. He concluded that Gisborne Laboratories had a sensitivity
of 43.5% whereas Douglass Hanly Moir Pathology had a sensitivity of
95%. He described the sensitivity of Gisborne Laboratories as being
unacceptably low:
A I’d like to start, if I may, on
5.6 because I believe that this table is very crucial to the term
of reference 1 as has been identified yesterday. I would like to use
this table to estimate the laboratory sensitivity for the detection
of high-grade or cancer of both Dr Bottrill’s laboratory and the Sydney
laboratory. And to do that I would like to invoke an assumption that
of those who’ve developed cancer right through to beyond May 1999
that they had either cancer or high-grade throughout the entire period.
CHAIR: What period’s that?
A From 91 right through. Now I realise
that it is possible, although I think a relatively small probability,
that high-grade or worse has not been present throughout, and for
many of these it may have been high-grade and then subsequently developed
cancer. And if I invoke that, the original laboratory or Dr Bottrill’s
laboratory, which is 5.6b, we end up …with an original laboratory
sensitivity of 35.9% in my calculations …which is 14 over 39, and
if you [do] a similar thing for the re-read at the Sydney laboratory
and I’m not including ASCUS H in at this time …you end up with 37
out of 39 being positive which would give a sensitivity for that laboratory
of 95%. Now I realise that I would also like to invoke a benchmark
of say 85% laboratory sensitivity. Now I know normally in terms of
Dr McGoogan’s evidence that has been calculated in a very different
manner to do with rereading of slides within the laboratory but if
I invoke that then Dr Bottrill’s sensitivity as I measure [it] is
statistically significantly lower than that benchmark. Moreover the
benchmark would have to be 51% for the difference between the benchmark
and Dr Bottrill’s laboratory to not be statistically significant and
I believe that even under the assumptions I need to invoke if you
like to calculate these sensitivities, a figure of 51% would not be
agreed on by anybody.
PROFESSOR DUGGAN: Could I just ask
you to clarify one thing. For Dr Bottrill’s laboratory you are accepting
as a predictor of the cancer his 6 diagnoses of cancer in the first
row, the 6 of high-grade in the second row and the 5 low-grade.
A Sorry I have missed that. I take
that back.
…
A I can recalculate things but I still
don’t think and I’m pretty sure –
CHAIR INTERJECTS
CHAIR: Could you please recalculate
so we’ve got something.
A 43.5%. And I therefore need to do
something a little different. In which case the benchmark cut off
that I mentioned before would not be 51% it would be 59% and I still
believe that would not be a level which would be acceptable.
PROFESSOR DUGGAN: Just for the committee
how did you calculate that benchmark of 59%.
A I believe the variance for a binomial
proportion which is what the laboratory sensitivity is what’s called
PQ/N. P which is this probability here of .435 x 1 minus that figure
divided by the number overall which is 39 and the square root of that
figure is the standard deviation. By taking that standard deviation
and multiplying it by 1.96 which is a standard figure in the normal
distribution table for 95% confidence interval or limit you get a
figure of something like .15. You then have to add that to your original
.435 because when you just multiply the standard deviation by 1.96
you get the difference between a benchmark and this particular figure
then you have to add that difference to the figure so from that I
calculate that the benchmark would need to be 59% for there not to
be a statistical significant difference between Dr Bottrill’s sensitivity
invoking the assumptions I did and the benchmark. Obviously the re-read
laboratory has a figure and I hope I got this right of 95% sensitivity
and is obviously – would be very acceptable.
Q So the Sydney reporting is acceptable?
A On the basis of table 5.6 and the
assumptions that I invoked except in terms of its estimated sensitivity.
There are other issues with the Sydney laboratory but not related
to the sensitivity.
Q What about Dr Bottrill’s result.
A Dr Bottrill’s result I believe is
unacceptably low.
CHAIR INTERJECTS
CHAIR: You said you’ve used as a reliable
benchmark a figure of 85% where did you get that from?
A …I just said I would invoke it partly
because in Dr McGoogan’s evidence in calculating the laboratory sensitivity
a very different way which was by relooking at slides, their range
of laboratory sensitivities .85 - .95, 85% or 95% for their standard
as you like.
Q So you’re using it as a rule of thumb
here.
A I was trying to use that as a rule
of thumb as a starting point. I realise the benchmark and the way
this is calculated is quite different and so I actually prefer to
calculate what the benchmark would need to be.
Q And on that basis then you have
a benchmark of 59% and in your view that would be too low by anyone’s
standards.
A Yes.
PROFESSOR DUGGAN: Dr Cox even if you
were to evaluate this data without using the 85% benchmark put forward
by Dr McGoogan, a sensitivity of 95% for Sydney versus a sensitivity
of 43.5% for Dr Bottrill, could you comment on those just approaching
it as an inter-laboratory comparison where variables for each laboratory
are essentially controlled except for the reporting of the smear?
A Well obviously that difference is
even greater than the benchmark I invoked and is highly statistically
significant. The issue here is that laboratories set their own trade-off
between sensitivity and specificity, which is a technical term. I
think they’ve been defined to the Inquiry earlier. And each laboratory
is probably different in the balance between sensitivity and specificity
they choose. Unfortunately in some laboratories it occurs by default
rather than by intent. I think here we have a situation where we have
if you like, two opposite extremes where the Sydney laboratory has
a high sensitivity in terms of laboratory reporting and Dr Bottrill’s
laboratory has a relatively low sensitivity. ….
…A And the Sydney laboratory has a
high specificity but it’s lower than Dr Bottrill’s. So we have this
contrast and the trade-off is that if the Sydney laboratory had been,
if you like, reading the smears through to the time period of 1991
to 1996 then we would most likely detect something like twice as many
cancers and we would have had about 3., or maybe 3 times the amount
of referral for colposcopy or having a repeat smear. I must say that
in these calculations I have to acknowledge that there is a combination
of both screening smears and diagnostic smears within the series,
but I would expect that the presence of diagnostic smears to actually
increase the sensitivity because most times I would expect an indication
or signs or symptoms on the request form which would heighten the
reader’s index of suspicion when reading the smears in the first place.
Q the assumption you have made that
the women concerned were likely to have cancer or high-grade abnormality
between 91 and 99, how comfortable are you with making that assumption
– in other words, is there a high probability that that was so, a
low probability, in the middle – what?
A I believe there’s a high probability
that great majority of those people who developed the cancer during
the period will have had high-grade or as I’ve said earlier, low-grade
or cancer present on their cervix all the way through.
Q So if you were doing this as an
epidemiological study you would feel scientifically comfortable about
making that assumption?
A I would feel some nervousness about
making the assumption, and in a way I am disappointed in the sense
that from the way the tables are created, you expect that the individual
record data would allow this to be calculated in a different way that
might be much more informative and reduce that possibility. So I have
some nervousness about the assumption but I think, in terms of comparative
purposes, it applies to both.
Yes, thank you.
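The arithmetic Dr Cox described in the passage above can be set out as a short sketch. It is illustrative only and assumes the method he outlined: a normal approximation for a binomial proportion, with standard deviation given by the square root of PQ/N, multiplied by 1.96. The function and variable names are hypothetical and do not come from the evidence. The counts 14, 37 and 39 and the revised figure of 43.5% are those quoted in the transcript.

```python
import math


def sensitivity(detected, total):
    """Proportion of cases reported as high-grade or cancer."""
    return detected / total


def benchmark_cutoff(p, n, z=1.96):
    """Sensitivity plus z standard deviations, using sqrt(p * (1 - p) / n).

    On the method described in the transcript, a benchmark at or below this
    value would not differ from p at the 95% level.
    """
    standard_deviation = math.sqrt(p * (1 - p) / n)
    return p + z * standard_deviation


N_WOMEN = 39  # women recorded in table 5.6 as having developed cancer

# First estimate given in evidence: 14 of 39.
first_estimate = sensitivity(14, N_WOMEN)

# Sydney re-read: 37 of 39.
reread = sensitivity(37, N_WOMEN)

# Revised figure quoted after recalculation (taken as stated, not recomputed).
revised = 0.435

print(f"first estimate: {first_estimate:.1%}, "
      f"cutoff {benchmark_cutoff(first_estimate, N_WOMEN):.0%}")
print(f"revised figure: {revised:.1%}, "
      f"cutoff {benchmark_cutoff(revised, N_WOMEN):.0%}")
print(f"re-read:        {reread:.1%}")
```

Under these assumptions the cutoffs come out at about 51% for the first estimate and about 59% for the revised figure, which agrees with the figures given in evidence. With only 39 cases a normal approximation is rough, so the sketch reproduces the reasoning in the transcript rather than offering an independent analysis.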
4.26 A subsequent audit of the data in exhibit TM/HFA/087
by the Health Funding Authority revealed that it had wrongly recorded
data in some of the tables. An audit of table 5.6, which Professor Skegg
and Dr Cox had each relied upon to reach their separate conclusions
that Gisborne Laboratories had under-reported at an unacceptable level,
could not confirm the diagnosis of one of the 39 women recorded as having
cancer. Dr Cox was asked to provide additional expert evidence to the
Committee on the epidemiological impact of the one unconfirmed diagnosis
in table 5.6 on the conclusions which he had reached. His evidence,
which was given to the Committee in the form of an unsworn written statement,
was that:
" …reducing the number of women
with invasive cervical cancer by one, to 38 would not appear to be
sufficient to alter the conclusion that there was a significant level
of under-reporting of cervical cytology in Gisborne."
4.27 Dr Wain was another expert witness who considered
that the statistical data contained in exhibit "TM/HFA/87"
showed there had been an unacceptable level of under-reporting. Of all
the women diagnosed with invasive cervical cancer Gisborne Laboratories
Limited had reported only 30% of this group as having either a high-grade/cancer
abnormality or abnormal cells suspicious but not conclusive of HSIL
(ASCUS-H) whereas Douglass Hanly Moir Pathology had reported every one
in the group as having either a high-grade abnormality or cancer:
Q Would you agree with this summary,
that all of the women who developed cancer were re-read by Sydney
as cancer high-grade or ASCUS-H?
A Yes.
Q Whereas only 12, which is 30% of
the women who developed cancer had their smears read by Dr Bottrill
as cancer or high-grade?
A I would agree with that.
Q What do those rates mean to you?
A I think that number 1 it confirms
to me that the Sydney re-read is likely to be correct in those women
since they’ve all been subsequently shown to have cancer and number
2 that Dr Bottrill wasn’t very good at picking up women with definite
abnormalities on their cervix.
Q Could this be under-reporting by
Dr Bottrill?
A I think it is almost certainly under-reporting.
Q Could it be anything else?
A When you compare the two I can’t
think of anything else that it could be.
CHAIR INTERJECTS
CHAIR: From that table alone are you
able to give an indication of the level of under-reporting?
A It's extreme.
CHAIR: On a 10 point scale, with 1
being the lowest, 10 being the highest, where would you put the level
of under-reporting on the basis of that table which is table 5.6 in
the exhibit 87 of Mellor’s supplementary?
A I feel like an olympic judge! I’ve
heard you ask that question yesterday and thought it was a very difficult
question I think this is as bad as it gets.
Q So where would you put it.
A: 10.
Q You’d give it a 10. And would you
say that was unacceptable under-reporting?
A Completely unacceptable.
PROFESSOR DUGGAN: Dr Wain I have one
further question about this table. You have already mentioned that
in your practice the women who present with invasive cancer have not
been screened and it's rare for you to manage a woman with invasive
cancer who has had a Pap smear. Looking at these two tables here what
can you say about these women who have developed invasive cancer in
the Tairawhiti region?
A It certainly doesn’t match with
my clinical experience and they have been very unlucky to have developed
cervical cancer despite the fact that they’ve gone through the process
of having Pap smears. They’re a screened population but they’ve got
no benefit from screening.
Q Thank you.
4.28 Dr Ron Jones was a part of the HFA advisory group
for the Sydney re-read and was involved in providing follow up colposcopy
services. The data from colposcopy is complicated (as he explained)
because colposcopy is, like cytology, not an exact science. Accepting
that limitation on the data, however, Dr. Jones’ evidence was that the
colposcopy follow up data also tended to support the accuracy of the
Sydney re-read because a number of women with non symptomatic invasive
cervical cancer were detected as a result of the re-read. There were
more cancers than he expected to see.
4.29 Because some false negative results are expected
a cervical screening programme depends on women having cervical smear
tests at regular intervals so that an abnormality which a laboratory
misses on one occasion will be less likely to be missed on a subsequent
occasion. Although Dr. Wain only considered the records of a small group
of women he was struck by the number of what appeared to him to be repeat
misreads. After considering the cases of more than one misread, and
some cases of women with 5 and even one with 6 apparently misread slides
he said:
"I am not a gambler but if you
work out the probability of that happening, it must be extraordinarily
rare…almost unbelievable."
4.30 The impression Dr Wain had from looking at
the patient files was consistent with his expectation of the natural
progression of the disease in the absence of a screening programme.
Since the population seemed to him to have been well screened (meaning
that there were a high number of enrolments) it was his view that:
"somewhere along the way things
were going wrong very badly"
4.31 There were other factors which, on their own, are
not reliable indicators of under-reporting. However, when considered
together with the above evidence they support the conclusion that there
was an unacceptable level of under-reporting at Gisborne Laboratories:
First, there is a marked difference between the reporting rates for
high-grade abnormalities when Dr Bottrill was in practice and after
he retired and the business of Gisborne Laboratories was sold to Medlab
Hamilton. The Committee is aware that there are issues surrounding
the question of whether reporting rates of abnormal test results are
in themselves a reliable indicator of laboratory performance, nevertheless,
it considers that the difference in the level of reporting of abnormalities
before and after Dr Bottrill’s retirement is so great that the Committee
can take note of it.
4.32 Secondly, statistics which were prepared jointly
by the Ministry of Health and the Health Funding Authority and produced
in evidence to the Committee, show that a regional analysis of cervical
cancer incidence between 1990 and 1997 puts the Gisborne region at the
second highest rate of cervical cancer in New Zealand. The analysis
of these statistics included the calculation of the ratio of observed
numbers of cases to expected numbers of cases expressed as a percentage.
This percentage was called the standardised registration ratio. The
national average was expressed as 100% and standardised registration
ratios higher than 100% were above the national average and conversely
percentages lower than 100% were below the national average. The Gisborne
region had a standardised registration ratio of 181.3% or almost twice
the national average. Therefore, one would expect to see a higher rate
of abnormalities being reported from this region. However, the reporting
rate of abnormalities in the period from 1990 to March 1996 was low.
In contrast the reporting rates for abnormalities after March 1996 when
Medlab Hamilton took over the business of Gisborne Laboratories seem
to fit better with the region’s significantly high rate of cervical
cancer.
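The standardised registration ratio described in this paragraph can be illustrated with a minimal sketch. The function name is hypothetical, and the observed and expected counts used below are placeholders chosen only to show the arithmetic. They are not figures from the evidence, which records only the resulting ratio of 181.3% for the Gisborne region.

```python
def standardised_registration_ratio(observed_cases, expected_cases):
    """Observed cases divided by expected cases, expressed as a percentage.

    A value of 100 corresponds to the national average. Higher values are
    above the national average and lower values are below it.
    """
    return 100.0 * observed_cases / expected_cases


# Placeholder counts for illustration only (not taken from the evidence).
example_ratio = standardised_registration_ratio(observed_cases=20,
                                                expected_cases=10)
print(f"{example_ratio:.1f}%")
```

On this definition a ratio of 181.3% means that the observed number of cases was a little over 1.8 times the number expected, which is the sense in which the paragraph describes the region as almost twice the national average.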
4.33 The Committee is drawn to the conclusion that
it is difficult to think of any convincing explanation for the sharp
increase in the number of abnormalities being reported other than that
after the sale of Gisborne Laboratories Dr Bottrill had stopped reading
the cervical cytology of women in the region. Further support for this
conclusion can be obtained from the anecdotal observations made by the
local programme co-ordinator Ms Reid in June 1997 in her report which
appears in exhibit "JMG/MOH 62" that there seemed to be more
high-grade abnormalities being diagnosed than previously.
4.34 Thirdly, there is the evidence of Ms Tracy Mellor
of the Health Funding Authority on the rate of abnormality reporting
since 1991 which is the time from when women were recording their first
smear on the National Cervical Screening Register. This information
is to be found in exhibit "TM/HFA/85". It shows that the reporting
rates of Gisborne Laboratories for abnormalities remained about the
same despite the fact that by 1994 and 1995 over half of the women enrolled
on the National Cervical Screening Programme were having their second
or a subsequent smear. If screening were providing a benefit one would
expect to see a drop in the abnormality rates. The fact that rates did
not drop over time can also be seen as an indication of under-reporting.
4.35 Fourthly, there is the evidence of Mr. Jim du Rose
on 116 smear tests reported as high-grade or cancer by Douglass Hanly
Moir Pathology in TM/HFA/87 at p51, but which were originally reported
as normal by Gisborne Laboratories. More than half (53.4%) of these
false negative smear tests from Gisborne Laboratories were subsequently
confirmed as high-grade or cancer by histology.
4.36 Finally, the evidence the Committee heard from
Dr Ron Jones, Dr Teague and Dr Tie is consistent with under-reporting.
Moreover it is significant that the Committee has not heard any evidence
to suggest that the rate of reporting abnormalities at Gisborne Laboratories
was acceptable. Indeed Dr Bottrill himself accepted that his level of
under-reporting was unacceptable.
Q: Do you now accept, from what you
have seen, read of the evidence that has been given that during the
period 1991 to March 1996, there has been an unacceptable level of
under-reporting of cervical smears in the Gisborne Region as a consequence
of your misreading and/or misreporting of those smears?
A: Regretfully yes (B3079/24).
Conclusion
4.37 In view of the evidence the Committee has heard
on term of reference one it has no difficulty in concluding that there
has been an unacceptable level of under-reporting in the Gisborne region
in the period to which this term of reference relates. The Committee
has been able to reach this conclusion even though during the relevant
period there were no performance standards in place against which the
performance of Gisborne Laboratories could be measured. Although at
an early stage in the inquiry hearings there was evidence to suggest
that the Committee might not be able to answer this term of reference
without the assistance of an audit of the cases of cervical cancer,
in the end on the evidence available the conclusion which the Committee
has reached was inevitable.