Classification and evaluation strategies of auto-segmentation approaches for PET: Report of AAPM task group No. 211

Overview of attention for article published in Medical Physics, May 2017

About this Attention Score

  • Above-average Attention Score compared to outputs of the same age (52nd percentile)
  • Good Attention Score compared to outputs of the same age and source (72nd percentile)

Mentioned by

3 tweeters


106 Dimensions

Readers on

132 Mendeley
Published in: Medical Physics, May 2017
DOI: 10.1002/mp.12124
Pubmed ID

Mathieu Hatt, John A. Lee, Charles R. Schmidtlein, Issam El Naqa, Curtis Caldwell, Elisabetta De Bernardi, Wei Lu, Shiva Das, Xavier Geets, Vincent Gregoire, Robert Jeraj, Michael P. MacManus, Osama R. Mawlawi, Ursula Nestle, Andrei B. Pugachev, Heiko Schöder, Tony Shepherd, Emiliano Spezi, Dimitris Visvikis, Habib Zaidi, Assen S. Kirov


The purpose of this educational report is to provide an overview of the present state-of-the-art PET auto-segmentation (PET-AS) algorithms and their respective validation, with an emphasis on helping the user understand the challenges and pitfalls associated with selecting and implementing a PET-AS algorithm for a particular application.

A brief description of the different types of PET-AS algorithms is provided using a classification based on method complexity and type. The advantages and the limitations of the current PET-AS algorithms are highlighted based on current publications and existing comparison studies. A review of the available image datasets and contour evaluation metrics in terms of their applicability for establishing a standardized evaluation of PET-AS algorithms is provided. The performance requirements for the algorithms and their dependence on the application, the radiotracer used and the evaluation criteria are described and discussed. Finally, a procedure for algorithm acceptance and implementation, as well as the complementary role of manual and auto-segmentation, are addressed.

A large number of PET-AS algorithms have been developed within the last 20 years. Many of the proposed algorithms are based on either fixed or adaptively selected thresholds. More recently, numerous papers have proposed the use of more advanced image analysis paradigms to perform semi-automated delineation of the PET images. However, the level of algorithm validation is variable and, for most published algorithms, is either insufficient or inconsistent, which prevents recommending a single algorithm. This is compounded by the fact that realistic image configurations with low signal-to-noise ratios (SNR) and heterogeneous tracer distributions have rarely been used. Large variations in the evaluation methods used in the literature point to the need for a standardized evaluation protocol.
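To make the threshold-based family and the idea of a contour evaluation metric concrete, here is a minimal sketch, not taken from the report. It assumes a fixed threshold expressed as a fraction of the maximum uptake in the image and uses the Dice similarity coefficient as one possible overlap metric. The function names, the 0.4 default and the toy image are all hypothetical and chosen only for illustration; the default is not a recommendation.

```python
from typing import List

Image = List[List[float]]  # one 2D slice of uptake values (toy data)
Mask = List[List[bool]]    # True where a voxel is labelled as lesion


def fixed_threshold_mask(image: Image, fraction: float = 0.4) -> Mask:
    """Label voxels whose uptake is at least `fraction` of the image maximum.

    The 0.4 default is an illustrative assumption, not a recommended value.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    peak = max(max(row) for row in image)
    cutoff = fraction * peak
    return [[value >= cutoff for value in row] for row in image]


def dice(a: Mask, b: Mask) -> float:
    """Dice similarity coefficient: 2|A and B| / (|A| + |B|)."""
    size_a = sum(v for row in a for v in row)
    size_b = sum(v for row in b for v in row)
    overlap = sum(x and y for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    if size_a + size_b == 0:
        return 1.0  # both masks empty: treated here as perfect agreement
    return 2.0 * overlap / (size_a + size_b)


# Toy 4x4 "lesion" on a low background (made-up numbers, not from the report).
toy_image = [
    [1.0, 1.2, 1.1, 0.9],
    [1.0, 6.0, 8.0, 1.1],
    [0.8, 5.0, 10.0, 1.0],
    [1.1, 1.0, 3.0, 0.9],
]
# Hypothetical reference contour for the same slice (5 voxels).
reference = [
    [False, False, False, False],
    [False, True, True, False],
    [False, True, True, False],
    [False, False, True, False],
]

mask_40 = fixed_threshold_mask(toy_image, fraction=0.4)
score = dice(mask_40, reference)  # 4 overlapping voxels: 2*4 / (4 + 5) = 8/9
print(f"Dice at 40% of maximum: {score:.3f}")
```

In this toy case the 40% cutoff misses the low-uptake voxel at the lesion edge, which hints at why a single fixed threshold may struggle with heterogeneous tracer distributions.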
Available comparison studies suggest that PET-AS algorithms relying on advanced image paradigms generally provide more accurate segmentation than approaches based on PET activity thresholds, particularly for realistic configurations. However, this may not be the case for simple-shaped lesions in situations with a narrower range of parameters, where simpler methods may also perform well. Recent algorithms that employ some type of consensus or automatic selection between several PET-AS methods have the potential to overcome the limitations of the individual methods when appropriately trained. In either case, accuracy evaluation is required for each different PET scanner and each scanning and image reconstruction protocol. For the simpler, less robust approaches, adaptation to scanning conditions, tumor type and tumor location by optimization of parameters is necessary. The results from the method evaluation stage can be used to estimate the contouring uncertainty. All PET-AS contours should be critically verified by a physician.

A standard test, i.e., a benchmark dedicated to evaluating both existing and future PET-AS algorithms, needs to be designed in order to aid clinicians in evaluating and selecting PET-AS algorithms and to establish performance limits for their acceptance for clinical use. The initial steps towards designing and building such a standard are being undertaken by the task group members.
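The consensus idea mentioned above can be sketched in its simplest form as a voxel-wise majority vote over the binary masks produced by several methods. This is an assumed stand-in for illustration only: the consensus and automatic-selection algorithms the abstract refers to are trained and considerably more elaborate. The function name and the three example masks are hypothetical.

```python
from typing import List, Sequence

Mask = List[List[bool]]  # True where a voxel is labelled as lesion


def majority_vote(masks: Sequence[Mask]) -> Mask:
    """Keep a voxel when more than half of the input masks include it."""
    if not masks:
        raise ValueError("at least one mask is required")
    rows, cols = len(masks[0]), len(masks[0][0])
    if any(len(m) != rows or any(len(r) != cols for r in m) for m in masks):
        raise ValueError("all masks must share the same shape")
    needed = len(masks) // 2 + 1  # strict majority
    return [
        [sum(m[i][j] for m in masks) >= needed for j in range(cols)]
        for i in range(rows)
    ]


# Three hypothetical method outputs for the same 3x3 slice.
method_a = [[False, True, False], [True, True, False], [False, False, False]]
method_b = [[False, True, False], [False, True, True], [False, False, False]]
method_c = [[False, False, False], [True, True, True], [False, True, False]]

consensus = majority_vote([method_a, method_b, method_c])
for row in consensus:
    print("".join("#" if v else "." for v in row))
```

A plain vote weights every method equally; weighting methods by their measured accuracy on a given scanner and protocol would be one way to reflect the evaluation step the abstract calls for.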

Twitter Demographics

The data shown below were collected from the profiles of 3 tweeters who shared this research output.

Mendeley readers

The data shown below were compiled from readership statistics for 132 Mendeley readers of this research output.

Geographical breakdown

Country    Count    As %
Unknown    132      100%

Demographic breakdown

Readers by professional status    Count    As %
Researcher                        30       23%
Student > Ph. D. Student          26       20%
Student > Master                  14       11%
Other                             14       11%
Student > Postgraduate            6        5%
Other                             18       14%
Unknown                           24       18%

Readers by discipline             Count    As %
Medicine and Dentistry            28       21%
Physics and Astronomy             27       20%
Engineering                       16       12%
Computer Science                  11       8%
Mathematics                       4        3%
Other                             15       11%
Unknown                           31       23%

Attention Score in Context

This research output has an Altmetric Attention Score of 2. This is our high-level measure of the quality and quantity of online attention that it has received. This Attention Score, as well as the ranking and number of research outputs shown below, was calculated when the research output was last mentioned on 25 January 2017.
Altmetric has tracked 12,362,744 research outputs across all sources so far. This one is in the 42nd percentile – i.e., 42% of other outputs scored the same or lower than it.
So far Altmetric has tracked 5,413 research outputs from this source. They receive a mean Attention Score of 3.0. This one is in the 38th percentile – i.e., 38% of its peers scored the same or lower than it.
Older research outputs will score higher simply because they've had more time to accumulate mentions. To account for age we can compare this Altmetric Attention Score to the 334,698 tracked outputs that were published within six weeks on either side of this one in any source. This one has gotten more attention than average, scoring higher than 52% of its contemporaries.
We're also able to compare this research output to 175 others from the same source and published within six weeks on either side of this one. This one has gotten more attention than average, scoring higher than 72% of its contemporaries.