Diffusion-Weighted MRI: from Brownian Motion to Head and Neck Tumor Characterization

Diffusion-weighted (DW) magnetic resonance (MR) imaging is a non-invasive functional MR imaging technique that provides information complementary to classic anatomical imaging sequences. DW imaging is a form of MR imaging that focuses on the random (Brownian) micromovements of water molecules inside voxels. The relationship between histology and diffusion is complex; in general, however, densely cellular tissues or those with cellular swelling exhibit lower diffusion coefficients, and thus diffusion imaging is particularly useful in tumor characterization and cerebral ischemia. The diffusion characteristics of a tissue depend on its internal architecture (cellular packing, nucleus/cytoplasm ratio, intracellular organelles, cell membranes, nature of the extracellular matrix...) and its perfusion (micromotion of molecules in its capillaries) [1]. Molecular diffusion, or Brownian motion, was first formally described by Einstein in 1905. Le Bihan et al. applied DWI to the human brain for the first time in 1986 [2]. DW imaging has been used since the 1990s in central nervous system imaging; more recently, whole-body DWI has been becoming a standard application in routine imaging, particularly in oncologic imaging.


A. What is Diffusion?
Molecular diffusion is the random movement of molecules, in our case water (H2O), within tissues, propelled by thermal energy. The motion of water protons in the extracellular and intravascular compartments, their interaction with cell membranes and the motion of intracellular water protons all contribute to total diffusion. The aim of DW sequences is to obtain images whose contrast is influenced by differences in water molecule mobility. Stejskal and Tanner [3] introduced the application of a symmetric pair of pulsed gradients during the preparatory phase of the basic, T2-weighted spin echo sequence.

B. How is MR Sensitizing the Tissue for Diffusion Effects?
Within the spin echo preparation period of an EPI sequence, two strong gradient pulses are played out around the 180° pulse. The first pulse dephases the magnetization of both moving and static spins. The second pulse rephases only the static spins, so their signal experiences almost no change; it cannot completely undo the phase changes that the first gradient induced in spins that have moved. Moving water spins therefore acquire a non-zero phase dispersion, resulting in signal attenuation. Free water experiences the strongest signal attenuation at higher b-values (Fig. 1 & 2). Freely diffusing water molecules displace in all spatial directions, travelling long distances between the two gradient applications. These highly mobile molecules acquire a phase shift during the first gradient but, because of their movement, do not rephase completely during the second gradient, and so lose signal. Water molecules in a restricted environment do not travel long distances, so the phase changes acquired during the first gradient are canceled by those acquired during the second gradient, without signal loss.
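The dephasing and rephasing mechanism described above can be illustrated with a small Monte Carlo sketch. It is not a description of any scanner implementation: it assumes a one-dimensional random walk, idealized rectangular gradient pulses, and illustrative values for gradient amplitude, pulse duration and pulse spacing. The function name `simulate_signal` and all parameter values are hypothetical.

```python
import numpy as np

GAMMA = 2.675e8  # proton gyromagnetic ratio, rad/(s*T)


def simulate_signal(D, G=0.04, delta=0.02, Delta=0.04,
                    n_spins=10000, dt=1e-4, seed=0):
    """Relative echo signal of spins diffusing in 1D between two gradient pulses.

    D is in m^2/s, G in T/m, delta (pulse duration) and Delta (pulse spacing) in s.
    All default values are illustrative assumptions, not protocol settings.
    """
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1e-4, 1e-4, n_spins)   # starting positions in m
    phase = np.zeros(n_spins)
    n_delta = int(round(delta / dt))         # time steps per gradient pulse
    n_Delta = int(round(Delta / dt))         # time steps between pulse onsets
    step_sd = np.sqrt(2.0 * D * dt)          # random-walk step size per time step
    for i in range(n_Delta + n_delta):
        if i < n_delta:
            g = G                            # first (dephasing) pulse
        elif i >= n_Delta:
            g = -G                           # second pulse, sign flipped by the 180° pulse
        else:
            g = 0.0                          # no gradient between the pulses
        phase += GAMMA * g * x * dt          # phase accrued at the current position
        x += step_sd * rng.standard_normal(n_spins)
    # Net signal is the magnitude of the average of all spin phasors.
    return float(np.abs(np.mean(np.exp(1j * phase))))


static = simulate_signal(D=0.0)      # spins that do not move
hindered = simulate_signal(D=1e-9)   # slow diffusion (about 1e-3 mm^2/s)
free = simulate_signal(D=3e-9)       # fast diffusion, roughly free water
print(static, hindered, free)
```

Under these assumptions the static spins refocus almost perfectly, while the signal falls as the diffusion coefficient increases, which is the contrast mechanism the section describes.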

C. What does the b-value Mean?
The degree of diffusion weighting of the sequence, expressed as the b-factor, depends on the characteristics of the diffusion gradients: gradient amplitude, application time and time between the two gradients.
The b-value expresses the measurement's sensitivity to diffusion and is determined by the strength and duration of the diffusion gradients. It combines the following physical factors into a single value, measured in s/mm²; the quantity derived from it, the apparent diffusion coefficient (ADC), is measured in mm²/s (Fig. 3).
The ratio of the diffusion-weighted to the non-diffusion-weighted signal is S/S0 = exp(−b·D), with b = γ²G²δ²(Δ − δ/3), where:
• S - signal intensity with the diffusion weighting.
• S0 - signal intensity without the diffusion weighting.
• γ - gyromagnetic ratio.
• G - amplitude of the two diffusion gradient pulses.
• δ - duration (application time) of each pulse.
• Δ - time between the two pulses.
• D - diffusion coefficient, a measure of the strength (velocity) of diffusion in tissue. The stronger the diffusion, the greater the diffusion coefficient, i.e. the ADC in our in vivo case.
The stronger the gradients, the longer they are applied and the more spread out in time, the greater the b-factor.
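As a worked illustration of how the three gradient characteristics combine, the sketch below evaluates the standard Stejskal-Tanner expression for rectangular pulses, b = γ²G²δ²(Δ − δ/3), where γ is the gyromagnetic ratio, G the gradient amplitude, δ the pulse duration and Δ the time between the pulses. The function name and the example values are assumptions chosen for illustration; real sequences use vendor-specific gradient shapes.

```python
GAMMA = 2.675e8  # proton gyromagnetic ratio, rad/(s*T)


def b_value(G_mT_per_m, delta_ms, Delta_ms):
    """b-value in s/mm^2 for two rectangular gradient pulses (Stejskal-Tanner)."""
    G = G_mT_per_m * 1e-3       # T/m
    delta = delta_ms * 1e-3     # s
    Delta = Delta_ms * 1e-3     # s
    b_si = GAMMA**2 * G**2 * delta**2 * (Delta - delta / 3.0)  # s/m^2
    return b_si * 1e-6          # 1 s/m^2 = 1e-6 s/mm^2


# Illustrative (hypothetical) settings: 40 mT/m, 20 ms pulses, 40 ms apart.
b_example = b_value(40, 20, 40)
print(round(b_example))
```

Because G and δ enter squared, doubling the gradient amplitude quadruples the b-value, whereas lengthening Δ increases it only linearly.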

D. What is the Optimum b-value?
Images obtained with the lowest b-values (0-100 s/mm²) provide T2-weighted EPI images with higher signal-to-noise ratio for anatomical reference. However, they include additional effects due to perfusion or microvascularization.
In the range of clinically relevant b-values (up to approximately 1000 s/mm²), the greater the b-value, the stronger the diffusion weighting and the higher the contrast observed in pathological regions.
Higher b-values may depict even more lesions, at the price of poorer SNR due to longer TEs and increased susceptibility effects. Increasing the number of averages, which results in longer scan times, can compensate for this [4]. Changing the b-value immediately influences other parameters, such as minimum TE, slice thickness and FOV, as well as the maximum matrix at a given optimal bandwidth. Furthermore, the anisotropy of tissues also influences the choice (Fig. 4).

A. Why do most Head and Neck DWI Protocols Start with b-value 50 s/mm²?
Selecting a low b-value larger than zero suppresses the signal of large vessels, which makes lesions more conspicuous. The calculation of the tissue ADC can be more accurate when starting with even higher b-values, such as 100 or 200 s/mm², to omit the contribution of flow and microvascular effects.
Low b-value images more often serve as anatomical reference. A b-value of 0 is used in head and neck diffusion to shorten the acquisition or to obtain greater SNR.
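The effect of the lowest b-value on the calculated ADC can be sketched numerically. The example below assumes a simple two-compartment (bi-exponential) signal model, which the text does not name explicitly, with a small fast-decaying perfusion fraction on top of slower tissue diffusion. The perfusion fraction, pseudo-diffusion coefficient and tissue diffusion coefficient are hypothetical values chosen only for illustration.

```python
import math


def biexp_signal(b, f=0.1, d_star=10e-3, d=1.0e-3):
    """Relative signal S/S0 for perfusion fraction f (pseudo-diffusion d_star)
    plus tissue diffusion d; coefficients in mm^2/s, b in s/mm^2."""
    return f * math.exp(-b * d_star) + (1.0 - f) * math.exp(-b * d)


def two_point_adc(b_low, b_high):
    """ADC (mm^2/s) from the signals at two b-values."""
    return math.log(biexp_signal(b_low) / biexp_signal(b_high)) / (b_high - b_low)


adc_from_0 = two_point_adc(0, 1000)
adc_from_50 = two_point_adc(50, 1000)
adc_from_200 = two_point_adc(200, 1000)
print(adc_from_0, adc_from_50, adc_from_200)
```

In this toy model the ADC calculated from b 0 overestimates the assumed tissue coefficient the most, and the estimate approaches it as the starting b-value is raised, in line with the reasoning above.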

B. Why is a Minimum of three Directions Measured for each high b-value?
The sensitivity of these sequences is limited to diffusion in the direction of the gradients. Because diffusion may differ in all three dimensions, the sequences must be repeated with diffusion gradients applied in at least 3 spatial directions.
The diffusion magnitude calculated from the 3 diffusion images thus obtained minimizes the influence of anisotropy and renders an image weighted in global diffusion (trace image). The ADC images of the individual directions, by contrast, differ depending on the sensitizing direction.
The 'trace image' displays the geometric average of all three directional measurements, resulting in trace-weighted images. It suppresses anisotropy information to some extent and focuses on differences in signal attenuation. Like the ADC map, the trace-weighted map shows the strength of the diffusion and not its orientation.
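The geometric averaging can be written in a few lines. The sketch below assumes three co-registered directional DW images held as NumPy arrays; the function name and the single-voxel example values are hypothetical.

```python
import numpy as np


def trace_image(s_x, s_y, s_z):
    """Geometric mean of three directional diffusion-weighted images."""
    stack = np.stack([s_x, s_y, s_z]).astype(float)
    return np.cbrt(np.prod(stack, axis=0))


# One anisotropic voxel: fast diffusion along x, slow along y and z.
s0, b = 1000.0, 1000.0                       # b in s/mm^2
d_dir = np.array([1.5e-3, 0.3e-3, 0.3e-3])   # directional coefficients, mm^2/s
s_x, s_y, s_z = (np.array([[s0 * np.exp(-b * d)]]) for d in d_dir)

trace = trace_image(s_x, s_y, s_z)
expected = s0 * np.exp(-b * d_dir.mean())    # decay with the mean coefficient
print(trace, expected)
```

For mono-exponential decay, the geometric mean of the three directional signals decays with the mean of the three directional coefficients, which is why the trace image no longer depends on the orientation of the tissue relative to the gradients.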
Two diffusion sequences with different b-factors can be used to measure the degree of molecular mobility quantitatively by calculating the ADC, which is represented in the form of a map whose values (in mm²/s) no longer depend on T2. A low signal on the ADC map thus corresponds to a restriction in diffusion.

C. How is the ADC Calculated?
Having measured a set of at least 2 images with different b-values (e.g., b 0 and b 1000 s/mm²), the system calculates the ADC pixel by pixel by linear regression.
ADC = −ln(S/S0)/b
The ADC pixel values together form the ADC map. On a semi-logarithmic scale, the signal decay is a straight, tilted line whose slope provides the ADC. The faster the signal decay, the steeper the slope and the higher the ADC.
The Diffusion image (b 1000) below displays reduced diffusion as hyperintense (brighter pixels); in contrast the ADC map displays it as hypointense (darker pixels).
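The pixel-by-pixel regression can be sketched as follows. This is a minimal illustration on synthetic, noise-free data, not the scanner's implementation: it fits a straight line to the logarithm of the signal against the b-value and takes the negative slope as the ADC. The function name `adc_map` and the example values are assumptions.

```python
import numpy as np


def adc_map(signals, b_values, eps=1e-6):
    """ADC map (mm^2/s) from DW images of shape (n_b, rows, cols).

    b_values are in s/mm^2; the ADC is the negative slope of ln(S) versus b.
    """
    b = np.asarray(b_values, dtype=float)
    s = np.asarray(signals, dtype=float)
    # Clip to avoid log(0) in background pixels, then flatten the image axes.
    log_s = np.log(np.clip(s, eps, None)).reshape(len(b), -1)
    slope, _intercept = np.polyfit(b, log_s, 1)   # one line fit per pixel
    return (-slope).reshape(s.shape[1:])


# Synthetic 2x2 example with known coefficients (hypothetical values).
true_adc = np.array([[0.8e-3, 1.2e-3], [2.0e-3, 0.6e-3]])
s0 = np.array([[1000.0, 800.0], [600.0, 1200.0]])
b_values = np.array([50.0, 500.0, 1000.0])
signals = s0 * np.exp(-b_values[:, None, None] * true_adc)

estimated = adc_map(signals, b_values)
print(estimated)
```

With exactly two b-values the fit reduces to the two-point formula; with three or more, the regression averages over the measurements.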

D. Why should I measure three or more b-values for a DWI protocol when two would be enough for calculating ADC?
While two b-values are sufficient for creating an ADC image, the selection of three b-values (b 50, b 500, b 1000) delivers a more accurate calculation of the ADC values (Fig. 5).
The lower SNR of the b 1000 images introduces a higher standard deviation of the ADC, which is partially compensated by the intermediate b 500 measurement.
Here is an example of two ADC images, the first acquired with three b-values and the second with two b-values (Fig. 6).

E. Why do we Need ADC Images, and What does the 'A' in ADC Stand For?
Diffusion sequences are actually T2-weighted sequences, sensitized to diffusion by gradients.
The contrast of the diffusion image has both a diffusion and a T2 component, which must be taken into consideration in the interpretation. In areas with long T2, this can simulate reduced diffusion ('T2 Shine-Through' effect). Calculating a pure diffusion coefficient can eliminate these portions of the signal.
The 'A' stands for apparent because we do not measure the pure diffusion coefficient (D or DC). In vivo, capillary pseudo-diffusion and gross motion are superimposed on the diffusion process, and the MR measurement is also very sensitive to them.

F. Why are some Lesions Typically Brighter than the Background Head and Neck Tissue on the Higher b-value Image and Darker on the ADC Map?
Due to the nature of certain lesions and their missing perfusion, the cells swell and hinder normal diffusion; i.e., the mean free path is shorter. Water molecules cannot move as far in the damaged tissue as in normal tissue [5]. As a result, the ADC is lower and the lesion appears darker than the surrounding normal tissue on the ADC map.

G. Which Benefit does the Calculation of an Exponential Map Deliver?
Diffusion imaging cannot distinguish between the diffusive motion of water molecules and other microscopic movements, such as those occurring in the microcirculation. Water molecule movements differ depending on the tissue composition, which is why the apparent diffusion is measured in each voxel.
The exponential map or image is calculated by dividing the maximal b-value diffusion-weighted image by the b0 image. Mathematically, the exponential map displays the negative exponential of the ADC scaled by the b-value, exp(−b·ADC); it is a synthetic DW image without T2 'shine-through' effect [6].
The contrast behavior is similar to the high b-value image (Fig. 7).
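A short numerical sketch shows why the division removes the T2 contribution. It assumes mono-exponential decay and two hypothetical voxels with the same ADC but different b0 signal (standing in for a different T2); the function name and values are illustrative only.

```python
import numpy as np


def exponential_map(s_high_b, s_b0, eps=1e-6):
    """Ratio of the high b-value image to the b0 image, equal to exp(-b*ADC)."""
    s_high_b = np.asarray(s_high_b, dtype=float)
    s_b0 = np.clip(np.asarray(s_b0, dtype=float), eps, None)  # avoid division by zero
    return s_high_b / s_b0


b = 1000.0                                # s/mm^2
adc = np.array([1.0e-3, 1.0e-3])          # identical diffusion in both voxels
s_b0 = np.array([500.0, 1500.0])          # second voxel has a longer T2
s_high_b = s_b0 * np.exp(-b * adc)        # DW image: second voxel looks brighter

exp_map = exponential_map(s_high_b, s_b0)
print(s_high_b, exp_map)
```

On the DW image the long-T2 voxel is three times brighter although its diffusion is the same; on the exponential map both voxels have the same value.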

H. Why is Fat Saturation Important in Diffusion Studies?
In most applications the diffusion gradients are integrated in echo planar imaging (EPI) sequences, which exhibit high signal intensity in areas with restricted diffusion as well as in fatty tissue.
Furthermore, the fat signal is displaced in the direction of the chemical shift as compared to the water signal. This makes fat saturation techniques necessary to identify the lesions in the diffusion-weighted images.
There are several techniques to suppress the fat signal in MRI.
For head and neck DWI protocols, fat saturation based on the SPAIR technique (SPectral Attenuated Inversion Recovery) is a good compromise between acquisition time, SNR, artifacts and homogeneity, whereas STIR (Short Tau Inversion Recovery, another fat saturation technique) provides a more homogeneous fat saturation, free of artifacts.

I. Why is DWI in Head and Neck Imaging so Sensitive to Artifacts?
The most common artifacts are image distortions due to field inhomogeneity and to the differences in magnetic susceptibility between the anatomical tissues that make up this region, compounded by the presence of air-tissue interfaces and of the metal implants, dental amalgams or dental implants that are common in this area.
Another source of artifacts is movement, either voluntary or involuntary, such as breathing or coughing. Patient cooperation is mandatory to obtain high image quality, as are acquisition times that are as short as possible.

J. High-resolution DWI, RESOLVE
Single-shot echo-planar imaging (EPI) is well established as the method of choice for clinical, diffusion-weighted imaging with MRI because of its low sensitivity to the motion-induced phase errors that occur during diffusion sensitization of the MR signal.
However, the method is prone to artifacts due to susceptibility changes at tissue interfaces and has a limited spatial resolution. RESOLVE, a multi-shot EPI sequence that combines readout-segmented EPI and parallel imaging, can be used to address these issues by generating high-resolution, diffusion-weighted images with a significant reduction in susceptibility artifact compared with the single-shot case. The technique uses data from a 2D navigator acquisition to perform a nonlinear phase correction and to control the real-time reacquisition of unusable data that cannot be corrected (Fig. 8).

K. Which new DWI Features are Introduced with Software for Siemens?
There is a new 'body diffusion' application card with many new applications [7]:
• diffusion scheme monopolar/bipolar
• start ADC calculation for b > = …
• exponential ADC; no T2 shine-through
• invert gray scale ("PET-like" image) (Fig. 9)
• choice of dynamic field correction
• improved fat saturation schemes

L. Image Evaluation
DWI analysis is usually qualitative, evaluating the signal intensity of the images obtained with high b-values as well as the correlation with the ADC map. The analysis can also be quantitative: ADC values are calculated by placing a ROI (region of interest) on the ADC map and recording the mean value in that ROI. A value of 1000 intensity points is to be interpreted as 1 × 10⁻³ mm²/s. There are more complex methods, such as parametric response maps, that allow a tumor to be segmented and provide better information about intratumoral heterogeneity.
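The quantitative ROI reading can be sketched as below. The example assumes the ADC map is available as a NumPy array of intensity points and the ROI as a boolean mask of the same shape; the scale of 10⁻⁶ mm²/s per intensity point follows from the conversion stated above, while the function name and pixel values are hypothetical.

```python
import numpy as np


def roi_mean_adc(adc_points, roi_mask, scale=1e-6):
    """Mean ADC (mm^2/s) inside a ROI of an ADC map stored in intensity points.

    scale converts intensity points to mm^2/s (1000 points = 1e-3 mm^2/s).
    """
    values = np.asarray(adc_points, dtype=float)[np.asarray(roi_mask, dtype=bool)]
    if values.size == 0:
        raise ValueError("ROI mask selects no pixels")
    return float(values.mean() * scale)


adc_points = np.array([[900, 1000], [1200, 2500]])     # hypothetical ADC map
roi_mask = np.array([[False, True], [True, False]])    # ROI covering two pixels
mean_adc = roi_mean_adc(adc_points, roi_mask)
print(mean_adc)
```

Keeping the scale as an explicit parameter is a deliberate choice, since the stored scaling of ADC maps may differ between systems.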

I. CLINICAL APPLICATIONS
The variation in motion and redistribution of water molecules between tissue compartments that is reflected in DW Imaging and ADC values helps to differentiate disease processes [5] and to characterize tissues, providing complementary information to conventional structural MR imaging [8].
At our institution most of the studies for imaging head and neck pathology are performed on a 1.5 Tesla MR scanner (Avanto, Siemens, Erlangen, Germany), using EPI-DW sequences with 3 different b-values (0, 500 and 1000 s/mm²).
It is important not to use a different MR scanner or change the imaging protocol during a patient's follow-up, as ADC values may differ significantly between MRI systems and sequences. By contrast, ADC measurements obtained by one person with the same MR imaging system, protocol, and sequence are reproducible and independent of time.
In the head and neck region, anatomical structures such as the lymph nodes, tonsillar tissue, spinal cord and nerve roots are associated with non-pathological restricted diffusion, probably due to their high cellularity or highly packed internal structure [9]. Variable diffusion is usually observed within the submandibular and parotid glands. The spinal cord and tonsillar tissue are the structures with the lowest ADC variability and should therefore serve as reference tissue for studies of the head and neck region. This is especially relevant when evaluating the treatment response of a tumor, as they can be used for comparison. DWI in the head and neck, mainly in cancer patients, is indicated for tissue characterization of primary tumors and nodal metastases, prediction and monitoring of treatment response after chemotherapy or radiation therapy, and differentiation of radiation changes from residual or recurrent disease [10].

A. Characterization of Primary Tumors
Head and neck cancers are the sixth most common type of cancer worldwide and cause significant morbidity and mortality; tobacco and alcohol consumption are important risk factors. Differentiation of malignant head and neck tumors from benign lesions and an accurate, definitive diagnosis are essential for treatment planning as well as for the prognosis of malignant tumors.
The most relevant reports found that the mean ADC values of benign solid tumors were higher than those observed in malignant tumors, as a result of their histopathological differences.
Furthermore, due to differences in the internal architecture of each lesion, variability in ADC values has been reported within each group of tumors (benign or malignant) [12] (Fig. 10 & 11). In fact, among squamous cell carcinomas (SCC), those of a highly or moderately differentiated histological type present higher ADC values than poorly differentiated SCC [13][14]. This may be explained by the presence of liquefactive necrosis in the highly differentiated type.
DWI and ADC values can help to discriminate between SCC and non-Hodgkin lymphoma; pathologic differences between these two tumors, such as the greater cellularity of lymphomas, lead to different behavior in this sequence. Lymphomas usually present greater diffusion restriction and hence lower ADC values [15][16]. The reported mean ADC for lymphoma is fairly consistent, in the range of 0.64 to 0.66 × 10⁻³ mm²/s [14,16]. Distinguishing between SCC and lymphoma is important to optimize the treatment of these patients. SCCs usually require complex surgery with extensive resection and reconstruction, alone or combined with radiation therapy and/or chemotherapy, whereas lymphomas are usually treated with radio-chemotherapy.
Salivary gland tumors are rare, accounting for less than 3% of all head and neck cancers [17]. They display a wide spectrum of histologic features and are composed of distinctive tissues (tumor cells, myxomatous tissues, lymphoid tissues, necrosis, and cysts). Conventional MRI has limited utility in the differentiation of salivary gland tumors [18,19]; DWI, on the other hand, has been shown to be very sensitive to biophysical abnormalities within the tumor. Preoperative prediction of tumor malignancy is clinically very important, because this information strongly influences the surgical plan. The most common benign salivary gland tumors are pleomorphic adenomas and Warthin tumors. In general, pleomorphic adenomas have the highest ADC values, due to the cystic or myxomatous component that characterizes them, while Warthin tumors have the lowest ADC values, in keeping with the presence of lymphoid tissue [20] (Fig. 12). The ADC maps of malignant salivary gland tumors (such as mucoepidermoid carcinomas) demonstrate relatively homogeneous areas of low ADC values (representing cell proliferation), in contrast to other salivary gland tumors, for example lymphomas arising in the salivary glands, which are associated with extremely low ADC values (because of the presence of lymphoma cells) [15].
However, in some cases there is considerable overlap of ADC values, and DWI alone may not be sufficient to discriminate between benign and malignant salivary gland tumors [21,22].

B. Evaluation of Lymph Nodes
The presence of cervical lymph node metastases is the most important prognostic factor in head and neck squamous cell carcinomas as this worsens the treatment outcome. Pretreatment staging is crucial in the management of head and neck cancer, and it has been considered one of the most important aspects in the selection of treatment options.
Differentiation between inflammatory and metastatic lymphadenopathy is often challenging with conventional imaging [23]. Also, morphologic and size criteria in MRI are not enough for the assessment of lymph node metastases.
DW imaging can help to detect cervical lymph node metastases and to differentiate between benign and malignant enlarged lymph nodes. The general consensus appears to be that the ADCs of malignant lymph nodes are significantly lower than those of normal lymph nodes [13,23,24]. Threshold ADC values (1.0-1.38 × 10⁻³ mm²/s) have been reported to differentiate between malignant and benign lymph nodes [13,23,25]. De Bondt et al. reported that an ADC lower than 1.0 × 10⁻³ mm²/s was the strongest independent predictor of the presence of metastasis (Fig. 13). DWI differentiates better between malignant and benign lymph nodes when abnormal lymph nodes show diffusion characteristics significantly different from those of normal lymph nodes within the same patient, as they are easy to compare. Despite the promising potential of DWI in the detection of small malignant lymph nodes, the low in-plane resolution of ADC maps and the presence of image artifacts can have a negative impact on the specificity and reproducibility of findings. For this reason, DWI should always be interpreted in conjunction with other MRI sequences to improve diagnostic accuracy [10]. To date, depicting small metastatic lymph nodes (<4 mm) and lymph nodes with micrometastases, which are below the resolution of currently available morphologic MR and DW images, remains challenging.

C. Monitoring and Prediction of Treatment Response after Chemotherapy or Radiation Therapy
The prognosis of patients with SCC of the head and neck remains poor despite aggressive therapeutic regimens and technological advances in surgery [26]. DWI is a noninvasive imaging biomarker to predict tumor response and one of the greatest potential benefits of DW imaging lies in the identification of the group of patients who could respond or fail to respond to therapy. Furthermore this technique could detect disease before clinical signs or symptoms are evident.
As DWI evaluates the motion of water molecules within intracellular and extracellular spaces, it reflects biological changes in tumor microenvironment, and therefore changes in ADC may imply changes in tumor composition.
There are published data suggesting that the change in ADC over the course of treatment may indeed be a predictor of outcome [27][28][29] and could be used in monitoring treatment response. A treatment-induced increase in ADC during therapy for head and neck squamous cell carcinoma has been confirmed in several studies, and results suggest that tumors that show a lower increase or even a decrease in ADC are more likely to fail treatment [24,28,30,31,32] (Fig. 14 & 15).
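One common way to express the change in ADC over the course of treatment is as a percentage of the pretreatment value. The sketch below assumes this definition, which the cited studies may or may not share, and uses hypothetical ROI values; it is not a validated response criterion.

```python
def percent_adc_change(adc_pre, adc_post):
    """Relative ADC change in percent between a pretreatment and a later study."""
    if adc_pre <= 0:
        raise ValueError("pretreatment ADC must be positive")
    return 100.0 * (adc_post - adc_pre) / adc_pre


# Hypothetical mean ROI values in mm^2/s.
rising = percent_adc_change(1.00e-3, 1.25e-3)    # ADC rises during therapy
falling = percent_adc_change(1.00e-3, 0.95e-3)   # ADC falls during therapy
print(rising, falling)
```

Both measurements should come from the same scanner and protocol, because ADC values are not directly comparable across systems.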
Vandecaveye and colleagues reported that ADC changes between the pretreatment study and the study 3 weeks after chemo- or radiotherapy allow early assessment of treatment response. The change in ADC (ΔADC) showed a PPV of 89% and an NPV of 100% for primary lesions and a PPV of 70% and an NPV of 96% for lymph nodes [33]. These findings could lead to stopping ineffective treatments and avoiding delays in starting alternative and possibly more effective therapies.
It could also be important to develop prognostic imaging markers that can accurately predict treatment response before therapy. These imaging biomarkers may help in stratifying patients into those who would benefit from chemo-radiation therapy and those who would not. DWI studies of HNSCC have suggested that ADC can be used as a potential marker for prediction of treatment response and long-term survival [28,32,34]. These results are consistent with the hypothesis that a high pretreatment ADC value may be indicative of micronecrosis and, consequently, of hypoxia-mediated increased resistance to treatment and poor prognosis in these patients [35].
Kim et al. reported that the mean ADC of responders increases significantly after one week and continues to increase until the end of treatment. Values were found to be higher in complete responders than in partial responders [30].
This technique could potentially help to distinguish responders from non-responders.

D. Residual or Recurrent Disease
Changes induced by chemotherapy or radiation therapy and recurrent neck tumor have a similar CT and MR appearance and are difficult to differentiate. Anatomical distortion due to surgery and the presence of edema and necrosis after chemo-radiation therapy may hinder the interpretation of the findings [36].
FDG-PET/CT may help to detect recurrent SCC [37], but inflammatory changes within the first 4 months following radiotherapy are an important confounding factor; even biopsies performed after radiotherapy to identify residual/recurrent disease are often equivocal [38].
Qualitative DW imaging analysis after treatment may be helpful and is most often performed by visual assessment of signal intensity on DW images [39].
Post-therapeutic changes induced by radio or chemotherapy can be visualized as high, or sometimes also low, signal intensity on high-b-value images but generally show high signal intensity on the corresponding ADC map, as compared with tumors [10].
Although ADCs often allow differentiation between tumor and inflammation, reported ADC thresholds differ from one series to another because of variable technical parameters used by various investigators [33,39,41]. For example, Vandecaveye et al [33] reported a high sensitivity (94.6%), specificity (95.9%) and accuracy (95.5%) for DWI to distinguish between tumoral and nontumoral tissue. The ΔADC showed a PPV of 89% and an NPV of 100% for primary lesions and a PPV of 70% and an NPV of 96% for lymph nodes. They also found that DWI yielded fewer false positives in comparison with CT or PET for both residual primary tumor and lymph node metastases.
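The performance figures quoted above (sensitivity, specificity, accuracy, PPV and NPV) all derive from the same 2×2 table of imaging result against reference standard. The sketch below shows the arithmetic with hypothetical counts; they are not the data of any cited study.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard diagnostic accuracy measures from a 2x2 confusion table."""
    return {
        "sensitivity": tp / (tp + fn),   # diseased cases correctly detected
        "specificity": tn / (tn + fp),   # non-diseased cases correctly cleared
        "ppv": tp / (tp + fp),           # positive results that are true
        "npv": tn / (tn + fn),           # negative results that are true
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Hypothetical counts: 20 lesions with tumor, 30 without.
metrics = diagnostic_metrics(tp=18, fp=2, tn=28, fn=2)
print(metrics)
```

Unlike sensitivity and specificity, PPV and NPV depend on how common residual or recurrent tumor is in the studied population, which is one reason why values differ between series.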
As there may be some overlap between ADCs measured in recurrent tumors and those in radiation therapy-induced inflammatory tissue, DW imaging findings must be correlated with morphologic MR imaging findings.

II. WHAT ABOUT THE FUTURE?
DWI has been shown to add value in several areas by being part of the multi-parametric MRI approach, even though quantitative values tend to overlap.
Investigations into the clinical applications are still at an early stage. The challenges that DWI faces are the standardization of imaging protocols to allow better comparisons across studies, higher spatial resolution for better tumor delineation and depiction of smaller lesions, and the reduction of susceptibility artifacts and acquisition times.
The improvement of other (non-EPI) techniques that are less sensitive to artifacts, such as half-Fourier single-shot turbo spin-echo (HASTE), split acquisition of fast spin-echo signals (SPLICE), PROPELLER, or BLADE, could lead to a better approach to this difficult area. Achieving better field homogeneity at both 1.5 Tesla and 3 Tesla, allowing better fat suppression, could also be helpful.
There are new applications in this field: monopolar/bipolar diffusion scheme, start ADC calculation for b >, exponential ADC (no T2 shine-through), inverted gray scale ("PET-like" image), calculated images of artificial b-values, choice of dynamic field correction and improved fat saturation schemes.
Moreover, developing methods to analyze ADC maps more accurately will be essential, as will standardizing the acquisition technique and post-processing methods, which would allow thresholds to be set and integrated into clinical settings.
Finally, it will be crucial to correlate DWI with morphology on MRI and other functional techniques, such as Perfusion MRI, Dynamic Contrast Enhanced (DCE) Imaging and PET to reach a better clinical approach.