Justine's AP186 Blog: Activity 7 : Fourier Transform Model of Image Formation

(Welcome, Interwebs traveler. A note before you go: please click any image to view in enlarged/theater mode. Now explore on!)

Lens as a Fourier Transformer

For this activity, we learn about the Fourier transform in relation to image formation. What is the Fourier transform (FT)? In the words of our Activity 7 handout [1], the FT is a linear transform which converts a signal with X dimension into one with 1/X dimension; particularly, if the signal is in dimensions of space, the FT will convert it to one of inverse space or spatial frequency [1]. In its continuous two-dimensional (2D) form, the FT of some function f is expressed as [2]:

Figure 1. Continuous two-dimensional expression of the Fourier transform (FT) of a function f.
Image taken from source [2].

where the function f has dimensions in x and y. While in theoretical or mathematical calculations use of this form is suitable, it will take quite some time in numerical calculations. An example would be if the FT is used in image processing (where it is often used). To reduce processing time, the Fast Fourier Transform (FFT) algorithm was developed [3], which provides quick calculation of the FT in its discrete form [2]:

Figure 2. Discrete two-dimensional expression of the FT of a function f.
Image taken from source [2].

We demonstrate the effectiveness of the FFT by employing the pre-programmed algorithm in MATLAB. This is called by using the code fft2(). We use a white circle on a black background (created in MATLAB) to simulate a circular aperture. Figure 3 shows the expected FT plot [4] and the calculated FT plot. We can see that the calculated/numerical FT plot is similar to the expected FT plot, signifying that the algorithm works. Yay!

Figure 3. Circular aperture and its Fourier transform plot,
plotted theoretically (a) and (b) (taken from source [4]),
and numerically (c) and (d).

The numerical FT in Figure 3 above was calculated and saved after applying fftshift(), which essentially shifts the low-frequency components in the frequency space to the center of the image. Figure 4 compares the plot of the FT before and after shifting (shown in Fig 4(a) and Fig 4(b), respectively).

Figure 4. 2D FFT of the circle: (a) before shifting, and (b) after shifting.

We also check what happens when we apply the FFT algorithm on the calculated FT plot, and the result is shown in Fig 5(c). The result is the original centered circle, indicating that the FFT transforms the image to its spatial frequency domain and another FFT takes it back to the spatial domain.

Figure 5. The circle as viewed in the different domains:
(a) in spatial domain; (b) in frequency domain, after first application of 2D FFT;
and, (c) in spatial domain, after second application of 2D FFT.

But what if the object in the image is not symmetrical? We apply the 2D FFT algorithm once and twice on certain Latin letters, the letter 'A' and the letters in my name 'JUSTINE'. Of course, after one application of the FFT algorithm, the letters are transformed to the frequency domain and we see the signal frequency of each letter. After the second FFT, we see that the letters are recovered but are inverted.

Figure 6. Latin letters as viewed in the different domains. 1st row: in the spatial domain;
2nd row: in the frequency domain, after first application of 2D FFT;
3rd row: in the spatial domain, after second application of 2D FFT.

Wait a minute, this looks familiar... why yes, of course! This is the same effect that using a convex lens would produce! For light passing through the circular lens, it would be focused to a point (with some diffraction effects) and then enlarge again to the circular figure. For asymmetrical objects placed more than the lens's focal length, the image will be inverted [5], which we see for the Latin letters. Thus, we have demonstrated that the FFT simulates using a concave lens.

Convolution

The convolution is an operation which gives the overlap of two functions f and g, expressed by

Figure 7. Continuous expression of the convolution of f and g.
This is also denoted by f *g, where the star or asterisk (*) indicates convolution [6].
Image taken from the AP186 handout [1].

where x' and y' are the original 2D coordinates of f and g, and x and y are the 2D coordinates of the convolution h [1]. This is denoted by a star or asterisk (*) as in (f *g) [6]. Our handout helpfully adds that when applied to functions under linear transforms such as our FT above, the convolution of f and g in one domain becomes the multiplication of their transform F and G in the other domain [1]. This means that to simulate the overlap of f on g (or vice versa), we only need to multiply their corresponding linear transforms.

We can use this knowledge to simulate the passage of light through a circular aperture by taking the FFT of an object, namely the letters VIP, and convolving it with circles of different radii. Note that we consider the circles to be transfer functions already in the frequency domain. Figure 8 shows the result of changing the circle radius before convolution. Note that the resulting image is rotated 180 degrees (using Adobe Illustrator) for presentation purposes on this blog post.

Figure 8. Result of convolution between a circular aperture (1st row)
of radius 0.1, 0.2, 0.3, 0.4 and 0.5 a.u. (left to right columns),
and the letters VIP. The results are shown in the 2nd row.

We can see that the convolution simulates light passing through an aperture, where a small radius creates an image distorted by diffraction and a large radius allows most (if not all) of the object light to pass.

But does this work for other optical phenomena? For fun, I wanted to see what happens if we use two rectangles placed close together as the transfer function. We expect this to simulate Young's double-slit experiment [7]. Figure 9 illustrates the effect of the slit widths, indicating that as the slit widths increase the diffraction envelope shrinks and thus the diffraction effects should be less visible [8].

Figure 9. Illustration of the effect of the slit widths, for a 2-slit aperture.
As the slit widths increase, the diffraction envelope shrinks.
Image taken from source [8].

From the resulting convolved images shown in Figure 10, we can see that the double-slit experiment is simulated again by the convolution method, since for an increase in slit widths there are less diffraction effects perceived in the convolved images.

Figure 10. Result of convolution between two rectangles (1st row)
of width 1, 2, 3, 4 and 5 pixels (left to right columns),
and the letters VIP. The results are shown in the 2nd row.

Correlation

An operator similar to convolution is correlation, which is given by

Figure 11. Continuous expression of the correlation of f and g.
Image taken from the AP186 handout [1].

where x' and y' are the original 2D coordinates of f and g, and x and y are the 2D coordinates of the correlation p [1]. This is denoted by

Figure 12. Correlation p expressed in terms of f and g (middle term),
and in terms of the convolution of f and g (last term).
Image taken from AP186 handout [1].

and we can see that the correlation is simply the correlation of f (with conjugate coordinates) and g [1]. Now, what the correlation does is it gives an indication of the relationship between f and g [9]. If f and g are highly and positively correlated, meaning that when one increases the other increases as well, p will have a value +1 or almost +1. If they are highly but negatively correlated, meaning that when one increases the other decreases, p will have a value -1 or almost -1. If they are not correlated, p will have a value 0 or around 0.

Due to this nature of the correlation operator, we can use it to locate frequencies or perform 'template matching'. We demonstrate this on the text "The rain in Spain stays mainly in the plain", capitalized and evenly sized, in Figure 13(a). Taking the correlation of the letters 'A', 'N', and 'S' of the same size as the "rain in Spain" text, we get the result in Figure 13(b) where a bright peak is shown at the position the letters are found.

Figure 13. Correlation of text in (a), made using capitalized and evenly sized letters,
and select letters of the same size as shown in (b).
A bright spot or peak is present at the position the letters are found in the text.

What if the text has letters of different sizes and position? Figure 14(a) shows the new image after resizing and rotation of some of the letters. We take its correlation with the letters 'A', 'N', and 'S' from Figure 13(b). The result is shown by Figure 14(b). We can see that the peaks occur only where the letters have the same size and orientation. However, for the correlation with 'A', some peaks can be found possibly because the FT of the rotated 'SP' in Spain resemble the FT of the upright 'A'. For 'S', no bright peaks stand out because there is little to no correlation between the text and 'S'.

Figure 14. Correlation of new text in (a), made using resized and rotated letters,
and the same letters as in Figure 13(b).

We also test the text in Figure 14(a) for resized and rotated letters, such that they have the same size and orientation as the word 'Spain'. As expected, there are bright peaks in the corresponding letters in the word 'Spain'. Yay!

Figure 15. Correlation of new text in (a), made using resized and rotated letters,
and new resized and rotated letters made to be the same as the word 'Spain' in (a).

We can also use the correlation for edge detection [1], which is essentially template matching with a line template. Using a 3x3-pixel line template (1st column in Figure 16) and taking the correlation with the 'VIP', 'Rain in Spain' and the 'Rain' rotated counterpart, we obtain the edge outlines of the texts.

Figure 14. Images resulting from edge detection, with the edge template on the 1st column
and three sample texts in the 2nd to 4th columns.

We can see that the orientation of the templates dictate the edge outline that is emphasized. For example, the first template shows a white horizontal line. The resulting image shows the parts of the text edge outline that is horizontal in orientation. All the templates are lines (horizontal, vertical, slanting) except for the third template which is essentially a single white pixel surrounded by black background. Correlation of the template with the texts produce the corresponding edge outline which is equally emphasized on all sides.

Grading

Based on the criteria required, I believe I deserve a grade of 12/10 since I performed a few extra activities such as the resized and rotated letters for template matching.

Thursday, July 4, 2013

Activity 7 : Fourier Transform Model of Image Formation

No comments:

Post a Comment