
Learning the representation of instrument images in laparoscopy videos

Automatic recognition of instruments in laparoscopy videos poses many challenges, such as identifying multiple instruments that appear in varied representations and under different lighting conditions, and that may be occluded by other instruments, tissue, blood, or smoke. Given these challenges, it can be beneficial for recognition approaches to first detect the video frames that contain instruments, so that only these frames need further investigation. This pre-recognition step is also relevant for many other classification tasks in laparoscopy videos, such as action recognition or adverse event analysis. In this work, the authors address the task of binary classification to recognise video frames as either instrument or non-instrument images. They examine convolutional neural network models to learn the representation of instrument frames in videos and take a closer look at the learned activation patterns. For this task, GoogLeNet with batch normalisation is trained and validated on a publicly available dataset for instrument count classification. They compare transfer learning with learning from scratch and evaluate both on datasets from cholecystectomy and gynaecology. The evaluation shows that fine-tuning a pre-trained model on instrument and non-instrument images is much faster and more stable than training a model from scratch.

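The fine-tuning setup the abstract describes can be sketched briefly. The following is a minimal PyTorch illustration, not the authors' implementation: torchvision's GoogLeNet (which includes batch normalisation) is loaded with ImageNet weights, its 1000-class head is swapped for a two-class instrument/non-instrument head, and the whole network is fine-tuned. The frame-folder layout, batch size, and learning rate are assumptions for illustration only.

# Minimal fine-tuning sketch: GoogLeNet (Inception v1 with batch
# normalisation) pre-trained on ImageNet, adapted to the binary
# instrument vs. non-instrument frame classification task.
import torch
import torch.nn as nn
from torchvision import datasets, models, transforms

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Load GoogLeNet with ImageNet weights; torchvision drops the auxiliary
# classifiers by default when pre-trained weights are requested.
model = models.googlenet(weights=models.GoogLeNet_Weights.IMAGENET1K_V1)
model.fc = nn.Linear(model.fc.in_features, 2)  # replace 1000-class head
model = model.to(device)

# Standard ImageNet preprocessing so inputs match the pre-trained weights.
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# Hypothetical folder of extracted video frames, one subfolder per class
# (e.g. frames/train/instrument, frames/train/non_instrument); this is
# not the dataset layout used in the paper.
train_set = datasets.ImageFolder("frames/train", transform=preprocess)
loader = torch.utils.data.DataLoader(train_set, batch_size=64, shuffle=True)

criterion = nn.CrossEntropyLoss()
# A modest learning rate, since we only fine-tune pre-trained weights
# rather than train from scratch.
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)

model.train()
for images, labels in loader:  # one epoch over the frame dataset
    images, labels = images.to(device), labels.to(device)
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()

Training from scratch, the comparison case in the paper, would correspond to constructing the model with weights=None; the reported result is that the fine-tuned variant learns faster and more stably.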

Bibliographic Details
Main Authors: Kletz, Sabrina; Schoeffmann, Klaus; Husslein, Heinrich
Format: Online Article Text
Language: English
Published: The Institution of Engineering and Technology, 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6952264/
https://www.ncbi.nlm.nih.gov/pubmed/32038857
http://dx.doi.org/10.1049/htl.2019.0077
collection PubMed
id pubmed-6952264
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-6952264, 2020-02-07. Healthc Technol Lett, Special Issue: Papers from the 13th Workshop on Augmented Environments for Computer Assisted Interventions. The Institution of Engineering and Technology, 2019-11-26. This is an open access article published by the IET under the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0/).
title Learning the representation of instrument images in laparoscopy videos
topic Special Issue: Papers from the 13th Workshop on Augmented Environments for Computer Assisted Interventions