Problem statement
A medical startup is building a new product that automatically detects health conditions in x-ray images. To train their algorithm, the startup has thousands of x-rays that must be tagged with any medical conditions (e.g. broken or fractured bones). Design a product to easily collect this data to enable doctors to highlight and tag medical conditions from this large set of x-ray images.
Why annotation
The accuracy of an artificial intelligence algorithm depends on the quality and quantity of the training data fed into it. The more attributes attached to the data, the richer the dataset becomes. Annotation ensures that relevant features and tags are associated with the input data. On the quantity side, at least thousands of such training pairs are required to train the algorithm, so an efficient tool that speeds up the process can provide substantial benefits.
Medical image annotation vs regular annotation
a. Medical images contain transparencies
This means occlusions must be treated differently: objects that lie in front of one another may appear to be behind one another.
See the chest x-ray below: are the lungs behind or in front of the diaphragm? The answer is both. Traditional computer vision methods struggle to perceive the occluded portion of the lungs, whereas a deep neural network can learn to spot it.

b. Different file formats
Most medical imaging will be in Digital Imaging and Communications in Medicine (DICOM) format. DICOM is the international standard for medical images and related information. It defines the formats for medical images that can be exchanged with the data and quality necessary for clinical use. A DICOM file represents a case, which may contain one or more images, and typically carries the ".dcm" extension.
DICOM is implemented in almost every radiology, cardiology imaging, and radiotherapy device (X-ray, which is the use case for this problem statement, as well as CT, MRI, ultrasound, etc.) and increasingly in devices in other medical domains such as ophthalmology and dentistry.
In addition to the DICOM format, the radiologist routinely encounters images in several other file formats such as JPEG, TIFF, GIF, and PNG.
Knowledge about these formats and their attributes, such as image resolution, image compression, and image metadata, helps the radiologist in optimizing the archival, organization, and display of images.
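Because an annotation tool may receive any of these formats, it helps to identify a file by its leading bytes rather than by its extension. The following is a minimal sketch, not part of the original design: the function name `sniff_format` and the `SIGNATURES` table are illustrative assumptions, while the byte signatures themselves are the standard ones for each format (a DICOM file stores the marker "DICM" after a 128-byte preamble).

```python
from typing import Optional

# Leading byte signatures ("magic numbers") for common raster formats.
SIGNATURES = {
    "PNG": [b"\x89PNG\r\n\x1a\n"],
    "JPEG": [b"\xff\xd8\xff"],
    "GIF": [b"GIF87a", b"GIF89a"],
    "TIFF": [b"II*\x00", b"MM\x00*"],  # little-endian and big-endian variants
}


def sniff_format(header: bytes) -> Optional[str]:
    """Guess the image format from the first bytes of a file.

    `header` should hold at least the first 132 bytes so the DICOM
    marker can be checked. Returns None when nothing matches.
    """
    # A DICOM file has a 128-byte preamble followed by b"DICM".
    if len(header) >= 132 and header[128:132] == b"DICM":
        return "DICOM"
    for name, prefixes in SIGNATURES.items():
        if any(header.startswith(prefix) for prefix in prefixes):
            return name
    return None
```

In practice the tool would read the first 132 bytes of an uploaded file with `open(path, "rb").read(132)` and pass them to this function before choosing a decoder.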

c. DICOM vs TIFF
A DICOM file consists of a header and image data sets packed into a single file. The information within the header is organized as a constant and standardized series of tags. By extracting data from these tags one can access important information regarding the patient demographics, study parameters, etc. In the interest of patient confidentiality, all information that can be used to identify the patient should be removed before DICOM images are transmitted over a network for educational or other purposes.
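The de-identification step described above can be sketched as follows. This is an illustration under stated assumptions: a plain dictionary stands in for an already parsed DICOM header, the function name `deidentify` is hypothetical, and the keyword list is a small, incomplete sample of identifying attributes, not a full de-identification profile.

```python
# Illustrative, incomplete sample of identifying DICOM attribute keywords.
# A real deployment should follow a complete de-identification profile.
IDENTIFYING_KEYWORDS = {
    "PatientName",
    "PatientID",
    "PatientBirthDate",
    "PatientAddress",
    "InstitutionName",
    "ReferringPhysicianName",
}


def deidentify(header: dict) -> dict:
    """Return a copy of the header without the identifying tags.

    The input dictionary is left unchanged.
    """
    return {
        keyword: value
        for keyword, value in header.items()
        if keyword not in IDENTIFYING_KEYWORDS
    }


# Hypothetical header for a hand x-ray; values are made up for illustration.
header = {
    "PatientName": "DOE^JANE",
    "PatientID": "12345",
    "Modality": "DX",
    "BodyPartExamined": "HAND",
}
clean = deidentify(header)
```

Study parameters such as the modality and body part survive, so the image remains useful for training while the patient fields are dropped.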
Although DICOM images have found wide acceptance in medical practice, they have two disadvantages: file sizes are large, and special software is required for viewing them on personal computers. Outside the radiology department, most personal computers run the Windows operating system, which does not natively recognize the DICOM file structure. Thus, for incorporating images in PowerPoint presentations, creating teaching files, or publishing on web pages, DICOM images need to be converted into image formats that Windows can recognize.
The TIFF format is versatile and supports the full range of image sizes, resolutions, and color depths. Since TIFF images are typically saved without compression or with a lossless compression scheme, they retain the original image quality. TIFF is preferred where high image quality is desired, for example, when the image contains illustrations and line diagrams.
d. Differing views and volumes
A case may contain 2D or 3D imaging. In both cases, more than one view is often necessary to assess what is happening. For example, the x-ray of a hand may only reveal a fracture when the hand is in a certain pose or at a certain angle.

The final product
Assumptions
- The data set (a TIFF file of the X-ray on which image interpretation needs to happen) is already uploaded to the machine.
- An annotation class named X-ray is created.
- The annotation type is predefined as polygon directional vector to limit the scope of the project. Other annotation types, such as line, box, cuboid, keypoint, or tag, may apply depending on the type of data being interpreted.
How it works
1. The tool allows the user to manually label things within images by making a polygon mask.

2. However, as an AI platform, it also allows the user to leverage the AI to make these annotations autonomously. This is done using the auto-annotate feature, which creates the segmentation masks automatically. The tool tries to identify the enclosed object, such as the index finger in the case of a hand fracture, and paints its pixels.

3. Comments can be added by selecting a segment and typing in the comment box, which makes it easy to collaborate and discuss cases. Tagging the image with attributes such as broken, normal, or something specific to the case makes tracking easy.
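The manual labeling, tagging, and commenting steps above can be sketched in a few lines. This is a minimal illustration, not the product's implementation: the `Annotation` record and both function names are hypothetical, and the mask is produced with the standard even-odd ray casting test sampled at pixel centers. The auto-annotate feature is not modeled here.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Point = Tuple[float, float]


def point_in_polygon(x: float, y: float, polygon: List[Point]) -> bool:
    """Even-odd ray casting: count edge crossings of a ray going right."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can cross.
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge meets that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def polygon_mask(polygon: List[Point], width: int, height: int) -> List[List[int]]:
    """Rasterize the polygon into a height x width grid of 0/1 values."""
    return [
        [1 if point_in_polygon(col + 0.5, row + 0.5, polygon) else 0
         for col in range(width)]
        for row in range(height)
    ]


@dataclass
class Annotation:
    """Hypothetical record for one labeled region of an image."""
    label_class: str
    polygon: List[Point]
    tags: List[str] = field(default_factory=list)
    comments: List[str] = field(default_factory=list)


# A square region drawn by the user on a tiny 4 x 4 image.
square = [(1, 1), (3, 1), (3, 3), (1, 3)]
annotation = Annotation("X-ray", square, tags=["broken"])
annotation.comments.append("Possible fracture, please review.")
mask = polygon_mask(annotation.polygon, width=4, height=4)
```

Storing the polygon vertices rather than the mask keeps the record small and editable; the mask can be regenerated at any resolution when training data is exported.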

High fidelity design

Next steps
- An image manipulation feature can be added to reduce the opacity of annotations, so that faint medical imaging underneath is easier to see, and to increase or decrease image brightness and saturation.
- The class selected for this problem statement is fracture. This can be extended to other imaging modalities such as CT and MRI by defining the classes (kidney, lung, adrenal gland) accordingly for segmentation.
- The tool can also be trained to do smart selection, wherein clicking outside the polygon includes that area and clicking inside excludes it, with axis defining vector points. This could be achieved by designing a deep neural network that learns and adapts to any object, in either the medical or non-medical world.
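The image manipulation idea from the first next step can be sketched as follows, assuming an 8-bit grayscale image stored as nested lists. The function names are hypothetical, and saturation is left out because it does not apply to a grayscale x-ray.

```python
from typing import List

Gray = List[List[int]]  # 8-bit grayscale image as rows of 0..255 values


def clamp(value: float) -> int:
    """Round to the nearest integer and keep it within 0..255."""
    return max(0, min(255, int(round(value))))


def adjust_brightness(image: Gray, offset: int) -> Gray:
    """Add a constant offset to every pixel (negative values darken)."""
    return [[clamp(pixel + offset) for pixel in row] for row in image]


def overlay_mask(image: Gray, mask: List[List[int]], opacity: float,
                 mask_value: int = 255) -> Gray:
    """Alpha-blend mask_value into the image wherever the mask is 1.

    opacity = 0.0 hides the annotation; opacity = 1.0 paints it solid.
    """
    return [
        [clamp((1 - opacity) * pixel + opacity * mask_value) if m else pixel
         for pixel, m in zip(image_row, mask_row)]
        for image_row, mask_row in zip(image, mask)
    ]
```

Lowering `opacity` lets the underlying bone structure show through the annotation, which is the behavior the next step asks for.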