↓ Skip to main content

Wiley Online Library

Classification and evaluation strategies of auto-segmentation approaches for PET: Report of AAPM task group No. 211

Overview of attention for article published in Medical Physics, May 2017
Altmetric Badge

About this Attention Score

  • Average Attention Score compared to outputs of the same age
  • Above-average Attention Score compared to outputs of the same age and source (63rd percentile)

Mentioned by

3 tweeters


137 Dimensions

Readers on

163 Mendeley
Classification and evaluation strategies of auto-segmentation approaches for PET: Report of AAPM task group No. 211
Published in
Medical Physics, May 2017
DOI 10.1002/mp.12124
Pubmed ID

Mathieu Hatt, John A. Lee, Charles R. Schmidtlein, Issam El Naqa, Curtis Caldwell, Elisabetta De Bernardi, Wei Lu, Shiva Das, Xavier Geets, Vincent Gregoire, Robert Jeraj, Michael P. MacManus, Osama R. Mawlawi, Ursula Nestle, Andrei B. Pugachev, Heiko Schöder, Tony Shepherd, Emiliano Spezi, Dimitris Visvikis, Habib Zaidi, Assen S. Kirov


The purpose of this educational report is to provide an overview of the present state-of-the-art PET auto-segmentation (PET-AS) algorithms and their respective validation, with an emphasis on providing the user with help in understanding the challenges and pitfalls associated with selecting and implementing a PET-AS algorithm for a particular application. A brief description of the different types of PET-AS algorithms is provided using a classification based on method complexity and type. The advantages and the limitations of the current PET-AS algorithms are highlighted based on current publications and existing comparison studies. A review of the available image datasets and contour evaluation metrics in terms of their applicability for establishing a standardized evaluation of PET-AS algorithms is provided. The performance requirements for the algorithms and their dependence on the application, the radiotracer used and the evaluation criteria are described and discussed. Finally, a procedure for algorithm acceptance and implementation, as well as the complementary role of manual and auto-segmentation are addressed. A large number of PET-AS algorithms have been developed within the last 20 years. Many of the proposed algorithms are based on either fixed or adaptively selected thresholds. More recently, numerous papers have proposed the use of more advanced image analysis paradigms to perform semi-automated delineation of the PET images. However, the level of algorithm validation is variable and for most published algorithms is either insufficient or inconsistent which prevents recommending a single algorithm. This is compounded by the fact that realistic image configurations with low signal-to-noise ratios (SNR) and heterogeneous tracer distributions have rarely been used. Large variations in the evaluation methods used in the literature point to the need for a standardized evaluation protocol. Available comparison studies suggest that PET-AS algorithms relying on advanced image paradigms provide generally more accurate segmentation than approaches based on PET activity thresholds, particularly for realistic configurations. However, this may not be the case for simple shape lesions in situations with a narrower range of parameters, where simpler methods may also perform well. Recent algorithms which employ some type of consensus or automatic selection between several PET-AS methods have potential to overcome the limitations of the individual methods when appropriately trained. In either case, accuracy evaluation is required for each different PET scanner and scanning and image reconstruction protocol. For the simpler, less robust approaches, adaptation to scanning conditions, tumor type and tumor location by optimization of parameters is necessary. The results from the method evaluation stage can be used to estimate the contouring uncertainty. All PET-AS contours should be critically verified by a physician. A standard test, i.e., a benchmark dedicated to evaluating both existing and future PET-AS algorithms needs to be designed, in order to aid clinicians in evaluating and selecting PET-AS algorithms and to establish performance limits for their acceptance for clinical use. The initial steps towards designing and building such a standard are undertaken by the task group members. This article is protected by copyright. All rights reserved.

Twitter Demographics

The data shown below were collected from the profiles of 3 tweeters who shared this research output. Click here to find out more about how the information was compiled.

Mendeley readers

The data shown below were compiled from readership statistics for 163 Mendeley readers of this research output. Click here to see the associated Mendeley record.

Geographical breakdown

Country Count As %
Unknown 163 100%

Demographic breakdown

Readers by professional status Count As %
Researcher 35 21%
Student > Ph. D. Student 30 18%
Student > Master 15 9%
Other 14 9%
Student > Postgraduate 9 6%
Other 24 15%
Unknown 36 22%
Readers by discipline Count As %
Physics and Astronomy 31 19%
Medicine and Dentistry 30 18%
Engineering 19 12%
Computer Science 13 8%
Business, Management and Accounting 4 2%
Other 24 15%
Unknown 42 26%

Attention Score in Context

This research output has an Altmetric Attention Score of 2. This is our high-level measure of the quality and quantity of online attention that it has received. This Attention Score, as well as the ranking and number of research outputs shown below, was calculated when the research output was last mentioned on 25 January 2017.
All research outputs
of 22,947,506 outputs
Outputs from Medical Physics
of 7,698 outputs
Outputs of similar age
of 313,576 outputs
Outputs of similar age from Medical Physics
of 145 outputs
Altmetric has tracked 22,947,506 research outputs across all sources so far. This one is in the 37th percentile – i.e., 37% of other outputs scored the same or lower than it.
So far Altmetric has tracked 7,698 research outputs from this source. They receive a mean Attention Score of 3.4. This one is in the 39th percentile – i.e., 39% of its peers scored the same or lower than it.
Older research outputs will score higher simply because they've had more time to accumulate mentions. To account for age we can compare this Altmetric Attention Score to the 313,576 tracked outputs that were published within six weeks on either side of this one in any source. This one is in the 44th percentile – i.e., 44% of its contemporaries scored the same or lower than it.
We're also able to compare this research output to 145 others from the same source and published within six weeks on either side of this one. This one has gotten more attention than average, scoring higher than 63% of its contemporaries.