Novices May Be Trained to Screen for AAA's Using Ultrasound
Results
During the screening period, novices and assessors scanned 215 participants: 87 men and 128 women. The median age was 64 (range 50–86) years for men and 62 (range 50–105) years for women. The average BMI was 28 ± 4.7 kg/m²; waist circumference was 93.7 ± 13.4 cm; 30% were obese (BMI ≥ 30 kg/m²). Regarding fasting status, 74% had eaten within 4 hours before their scans. One novice completed only 182 examinations because of absence on sick leave.
Novice Performance Assessment
Interobserver differences: The mean differences, variability, and LOA of the measurements for each novice at each aortic section are presented in Table 1. A high percentage of novice observations were within the CAD of ± 0.5 cm. The most accurate novice measurements were the distal infrarenal measurements, 94% (91–97%), and the maximal coronal measurements, 92% (88–96%), as illustrated by Figure 1A. In contrast, measurements taken in the transverse plane of the mid-infrarenal aortic section exceeded the clinically acceptable range of variability, as illustrated by Figure 1B. The poorest performance was in suprarenal measurements of the aorta, where novices had an under-sizing bias of 0.5 cm and variability outside the acceptable limits, with only 62% (56–68%) of observations within the CAD, as illustrated in Figure 1C.
Strength of measurement agreement: Novices had very high levels of agreement with assessors in the diagnosis of infrarenal AAA, with all novices having a Kappa coefficient greater than 0.8. Novices had no agreement with the assessors in the identification of localised suprarenal aortic dilatation.
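As a rough illustration of the agreement statistic reported above, the following sketch computes Cohen's kappa for two raters from paired yes/no diagnoses. The function name `cohen_kappa` and the example data are hypothetical and are not taken from the study; the source does not state which software or kappa variant was used.

```python
def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same cases.

    Illustrative only: the inputs are paired categorical labels
    (e.g. True = AAA diagnosed, False = no AAA).
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(rater_a)
    # Observed agreement: share of cases where both raters gave the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal rate, summed over labels.
    labels = set(rater_a) | set(rater_b)
    expected = sum(
        (rater_a.count(label) / n) * (rater_b.count(label) / n)
        for label in labels
    )
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical data: the assessor finds 10 aneurysms in 100 scans,
# and the novice agrees on all but one of them.
assessor = [True] * 10 + [False] * 90
novice = [True] * 9 + [False] * 91
kappa = cohen_kappa(assessor, novice)
```

With these made-up labels kappa is about 0.94, which would sit above the 0.8 level mentioned in the text. A value near zero corresponds to agreement no better than chance.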
Aneurysm Detection: The assessors identified ten infrarenal aneurysms, which ranged in size from 3.0 to 4.8 cm. Novice 1 missed three of these aneurysms, while Novice 2 and Novice 3 missed one each. All three novices missed a 3.5 cm AAA from the same patient. Novice 1 missed a second AAA that was measured by an assessor at 4.3 cm. In both these cases all the novices graded the scans as a four on the difficulty scale, which would have resulted in these patients being referred back to a more experienced technician and re-evaluated. The third AAA missed by Novice 1 was a small 3.1 cm eccentric aneurysm with a localized area of dilatation. The examination was rated "not difficult", which was inconsistent with the rating by the assessor and a peer novice (the third novice was not involved in measuring this aneurysm) and thus would not have been referred back for re-evaluation.
Figure 1.
Bland-Altman plots illustrating variations in performance. Bland-Altman graphs plot the inter-observer differences between novice and assessor (y-axis) against the averages of those measurements. The dotted lines demarcate the limits of agreement (LOA), which should ideally lie within the clinically acceptable difference (CAD) of 0.5 cm. A demonstrates acceptable performance by Novice 3 when measuring the maximal coronal diameter, with no measurement bias and the LOA within the CAD. B shows unacceptably high variability in Novice 1 measurements from the mid-infrarenal aortic section, with outliers in both normal and aneurysmal aortae. C shows a large under-sizing bias of 0.5 cm by Novice 1 when measuring the suprarenal segment of the aorta; in addition, the variability of measurement exceeded the CAD.
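The quantities shown on a Bland-Altman plot can be sketched in a few lines. The example below is a minimal illustration, assuming the conventional definition of the limits of agreement as bias ± 1.96 standard deviations of the paired differences (the source does not state the multiplier). The function name `bland_altman` and the measurements are hypothetical.

```python
from statistics import mean, stdev

CAD_CM = 0.5  # clinically acceptable difference quoted in the text


def bland_altman(novice_cm, assessor_cm, cad_cm=CAD_CM):
    """Summarise paired diameter measurements (in cm).

    Returns the bias (mean novice-minus-assessor difference), the limits
    of agreement (assumed here to be bias +/- 1.96 SD), the share of
    differences within the CAD, and whether both limits lie within it.
    """
    if len(novice_cm) != len(assessor_cm) or len(novice_cm) < 2:
        raise ValueError("need two equal-length series with at least two pairs")
    diffs = [n - a for n, a in zip(novice_cm, assessor_cm)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    lower, upper = bias - 1.96 * sd, bias + 1.96 * sd
    within = sum(abs(d) <= cad_cm for d in diffs) / len(diffs)
    return {
        "bias": bias,
        "loa": (lower, upper),
        "within_cad": within,
        "loa_within_cad": lower >= -cad_cm and upper <= cad_cm,
    }


# Hypothetical paired measurements, not data from the study.
novice = [2.0, 2.2, 1.8, 3.1, 2.5]
assessor = [2.1, 2.0, 1.9, 3.0, 2.4]
summary = bland_altman(novice, assessor)
```

A negative bias would indicate under-sizing by the novice relative to the assessor, as described for panel C; limits of agreement wider than ± 0.5 cm correspond to the situation described for panel B.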
Factors Affecting Performance
Scanning Difficulties: Both assessors and novices were able to capture and measure the aortic diameter in 92–100% of images taken. Novices had less confidence in their infrarenal measures, rating on average 18 scans (8.1%) as very difficult, whereas assessors rated only one infrarenal image as very difficult (0.5%; P < .0001). Conversely, assessors rated more suprarenal scans as very difficult (8.4%) than the novice trainees did (4.5%). There were no significant differences in measurement performance between scans rated 1 to 3 on the difficulty scale.
Body Habitus: For both assessors and novices, patient habitus was a significant factor contributing to scanning difficulty: obese patients (BMI > 30 kg/m²) were more likely than their normal-sized counterparts to have at least one image in their scan rated as a 4 on the difficulty scale (41.8% vs. 17.6%; P = .0003). However, obesity and central adiposity did not affect the overall measurement accuracy of the novices.
Aortic size: The measures of variation between each of the novices and the assessors were generally greater for aortas ≥ 2.5 cm than for smaller ones. At the mid-infrarenal level, the absolute mean difference of novice measurements was significantly greater in aortae measuring 2.5 cm or more (0.68 ± 0.48) than in aortae with diameters less than 2.5 cm (0.22 ± 0.26; P = .001). This effect was also observed in distal infrarenal measurements (0.40 ± 0.50 vs. 0.19 ± 0.17; P = .0251) and in the maximal coronal measurements (0.29 ± 0.24 vs. 0.16 ± 0.15; P = .0067).
Documenting Progression in Learning
Cusum performance: During the screening period the assessors completed 95% (205/216) of the scans within 5 minutes. Novice trainees achieved this in only 7.4% (48/653) of scans, and therefore this was not a useful learning benchmark. Using the 10-minute criterion with an expected 90% success rate as the benchmark, an improvement in the novices' performance was seen over the course of the scanning period. Initially all novice trainees failed to achieve an acceptable success rate, but after scan 110 a plateau on the learning curve of all trainees was observed. Thereafter one trainee immediately, and one at scan 180, showed performance that paralleled the assessors' on this criterion, but the third had not yet reached this level even by scan 210 (Figure 2).
Learning progression: After 110 scans, the variability of novice measurements from the mid-infrarenal section of the aorta was lower than in the first half of the screening period (-0.16 ± 0.34 vs. 0.01 ± 0.44; P < .0001). Little change in variability was observed for measurements in the more consistently measured coronal plane and distal transverse diameters. In the latter half of the screening period novices also had more confidence in their scanning ability, rating fewer images as 3 or 4 on the difficulty scale (29.4% vs. 37.5%; P = .03).
Figure 2.
Cusum plots demonstrating the progression in scanning efficiency for each novice over the study period. The mean and SD were calculated from a sample of 220 scans with a 90% success rate, using 10,000 iterations in the bootstrapping procedure. Limit lines (dotted grey lines) demarcate when the success rate becomes higher or lower than would be expected by chance, with 97.5% certainty. A success was any scan completed within 10 min and is charted as an increase of 0.21 on the plot. A failure is charted as a decrease of 1.79. Assessors have a success rate higher than 90% and cross the upper boundary limits at scans 56, 118, and 171. Novice scanners have a lower success rate than expected: Novice 1 crosses the lower limits at scans 22, 39, 59, 83, 97, 104, and 173; Novice 2 crosses the lower limits at scans 30, 44, 58, 90, and 103; Novice 3 crosses the lower limits at scan 81. After scan 110 a plateau occurs in the novices' performance.
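The charting rule in the caption can be sketched as follows. The step sizes (+0.21 for a success, -1.79 for a failure), the 10-minute criterion, the 220-scan sample, the 90% success rate, and the 10,000 iterations come from the caption. Everything else is an assumption made for illustration: the function names, the example scan times, and in particular the way the limit lines are formed (simulated mean ± 1.96 SD at each scan number), which the source does not spell out.

```python
import random
from statistics import mean, stdev

SUCCESS_STEP = 0.21         # charted increase for a scan within the time limit
FAILURE_STEP = -1.79        # charted decrease for a scan over the time limit
TIME_LIMIT_MIN = 10.0       # success criterion from the caption
TARGET_SUCCESS_RATE = 0.90  # expected success rate used as the benchmark


def cusum_path(scan_minutes, limit=TIME_LIMIT_MIN):
    """Running cusum score after each scan, given scan durations in minutes."""
    total, path = 0.0, []
    for minutes in scan_minutes:
        total += SUCCESS_STEP if minutes <= limit else FAILURE_STEP
        path.append(round(total, 2))
    return path


def simulated_limits(n_scans=220, p_success=TARGET_SUCCESS_RATE,
                     iterations=10_000, z=1.96, seed=0):
    """Per-scan (lower, upper) limits from simulated benchmark operators.

    Assumption: limits are mean +/- z * SD of the simulated cusum score
    at each scan number; the study's exact construction may differ.
    """
    rng = random.Random(seed)
    paths = []
    for _ in range(iterations):
        total, path = 0.0, []
        for _ in range(n_scans):
            total += SUCCESS_STEP if rng.random() < p_success else FAILURE_STEP
            path.append(total)
        paths.append(path)
    limits = []
    for scores_at_scan in zip(*paths):  # all simulated scores at one scan number
        m, s = mean(scores_at_scan), stdev(scores_at_scan)
        limits.append((m - z * s, m + z * s))
    return limits


# Hypothetical scan times in minutes: two of the five exceed the limit.
example_path = cusum_path([12, 9, 8, 11, 7])
# Expected change per scan for an operator succeeding exactly 90% of the time.
drift_at_target = (TARGET_SUCCESS_RATE * SUCCESS_STEP
                   + (1 - TARGET_SUCCESS_RATE) * FAILURE_STEP)
```

For the made-up times, the path is -1.79, -1.58, -1.37, -3.16, -2.95. The expected change per scan at a 90% success rate is 0.9 × 0.21 − 0.1 × 1.79 = 0.01, so an operator performing at the benchmark stays roughly level, while a lower success rate drives the curve downward toward the lower limit.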
The two novices who were recruited after the initial study and assessed with the same quality control tools yielded comparable outcomes on the Bland-Altman plots. Cusum was used in a similar manner to track their learning progression.