The need for focus stacking arises due to the problem of insufficient depth of field (DoF) in macro photography. DoF is particularly shallow when it comes to objects that are very close to the lens. Therefore, when a subject fills the frame, i.e., the sensor, it is often so close to the camera that the DoF simply does not cover the entire depth of the subject. As a result, it may not be fully in sharp focus. This sharp focus is defined using a property called “circle of confusion”.
DoF can be mathematically defined using the following formula:
\[\text{DoF} \approx \frac{2u^2Nc}{f^2}\]Where:
- \(u\) is the distance to the subject,
- \(N\) is the f-number of the aperture,
- \(c\) is the diameter of the circle of confusion, and
- \(f\) is the focal length of the lens.
Note that the DoF grows with the square of the distance to the subject and falls with the square of the focal length. These two parameters therefore have the greatest impact on the DoF.
Now, to create an example, let’s say I go to photograph a subject that is 1 cm deep (along the z-axis) with a full-frame camera and a macro lens with a focal length of 105 mm at aperture f/4. The subject is 30 cm away from the lens. What will the DoF be?
In this example:
- \(u = 0.30\) m (subject distance),
- \(N = 4\) (f-number),
- \(c = 0.00003\) m, i.e., 0.03 mm (a common circle-of-confusion value for full-frame sensors),
- \(f = 0.105\) m (focal length).
Using these values, you can calculate the DoF as follows:
\[\text{DoF} \approx \frac{2 \cdot (0.30)^2 \cdot 4 \cdot 0.00003}{(0.105)^2}\]Now, let’s calculate the DoF:
\[\text{DoF} = \frac{0.0000216}{0.011025}\] \[\text{DoF} \approx 0.00196 \, \text{m}\]So, in this example, with a subject at a distance of 30 cm (0.30 m), an f-number of 4, a focal length of 105 mm (0.105 m), and a circle of confusion of 0.03 mm (0.00003 m), the DoF is approximately 0.00196 meters, i.e., roughly 2 millimeters. This means that only the area within this range in front of and behind the plane of focus (approximately 29.9 cm to 30.1 cm) will be in acceptable focus.
With a DoF this shallow, the task of capturing the entire subject acceptably sharp in a single photo becomes impossible. Many practical examples of macro photography in the wild, such as mushroom photography, often offer limited flexibility in adjusting these parameters. For instance, changing the aperture to a high value, like f/22, introduces the challenge of ensuring sufficient light reaches the sensor without resorting to high ISO settings, which can introduce noise into the image. To address this, many photographers turn to external flashes to provide adequate lighting.
However, even with these measures, achieving sharp results in macro photography can remain a significant challenge. Changing the aperture to f/22 in the example above raises the DoF to \(2 \cdot (0.30)^2 \cdot 22 \cdot 0.00003 / (0.105)^2 \approx 0.0108\) m, i.e., about 10.8 mm. On paper this just covers the 1 cm subject, but the simple approximation is optimistic at macro distances; a magnification-based calculation gives closer to 7 mm, which may still fall short of covering the entire subject in the z-plane.
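These numbers are easy to reproduce with the approximation formula from the top of this post; the helper below is a throwaway sketch, not part of any photography library:

```python
def dof(u, N, c, f):
    """Approximate depth of field (metres) via DoF ≈ 2*u²*N*c / f².
    u: subject distance (m), N: f-number,
    c: circle of confusion (m), f: focal length (m)."""
    return 2 * u**2 * N * c / f**2

# Full-frame macro example from the text: 105 mm lens, subject 30 cm away.
print(dof(0.30, 4, 0.00003, 0.105))   # roughly 0.002 m at f/4
print(dof(0.30, 22, 0.00003, 0.105))  # roughly 0.011 m at f/22
```

Keep in mind this is the thin-lens approximation; dedicated DoF calculators that account for magnification report somewhat smaller values at these close distances.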
Conclusion: Our ability to capture macro subjects in sharp focus is ultimately constrained by the limitations of depth of field, aperture, and available lighting.
Drum roll…
Focus-Stacking 📷
As the name suggests, focus stacking involves capturing multiple images with varying focus points in the z-plane, from the front to the back. These images are then merged using specialized software that identifies and combines the sharpest parts of each image, resulting in a final image where the subject is in complete focus.
Here’s a detailed list of the key equipment needed:
Equipment | Description |
---|---|
Full-Frame ILC | A full-frame Interchangeable Lens Camera (ILC) stands as a preferred choice for focus stacking in macro photography. The larger sensor size of full-frame ILCs provides greater control over depth of field, allowing you to capture fine details with precision. |
Macro Lens | A macro lens is specifically designed for close-up photography, making it ideal for capturing detailed shots of subjects at close distances. These lenses provide high magnification and are capable of rendering fine details. |
Tripod | A sturdy tripod is crucial and keeps your camera stable, eliminating camera shake and ensuring that each frame is perfectly aligned. Even the slightest movement between shots can disrupt the stacking process. |
Remote Shutter Release | A remote shutter release or cable release allows you to trigger your camera’s shutter without physically touching it. This minimizes the risk of camera shake caused by pressing the shutter button. Alternatively, you can use your camera’s built-in self-timer function to achieve a similar effect. |
Lighting Equipment | Depending on the shooting conditions, you may also need lighting equipment, such as a ring flash or external flashes. Controlling the lighting is crucial in focus stacking to ensure consistent exposure and avoid shadows. |
When using a camera without built-in focus stacking capabilities, capturing a series of images with varying focus points is a manual process that demands precision and attention to detail. Here are key considerations for taking the pictures:
Point | Description |
---|---|
Tripod | Begin by setting up your camera on a stable tripod. Even the slightest movement can impact the final result, so a sturdy support is crucial. |
Manual Focusing | Switch your camera or lens to manual focus mode so that you control the focus yourself, and begin by focusing on the closest part of the subject. |
Select Focus Points | Determine the range of focus points you want to capture. This usually means a sequence of focus points along the z-axis, from the closest to the farthest part of the subject. |
Consistent Settings | Maintain consistent camera settings throughout the shoot. Keep the same aperture, shutter speed, and ISO for all images. Consistency in exposure settings is vital. |
Capture a Series of Images | Begin taking a series of shots while adjusting the focus manually. Start at your initial focus point and gradually move the focus towards the ending point. It’s essential to overlap the focus points slightly to ensure a smooth transition when stacking. |
External Release or Timer | Minimize any potential camera shake by using an external shutter release or the camera’s built-in timer to trigger the shutter. Avoid touching the camera during the capture process. |
Proper Lighting | Maintain consistent lighting conditions throughout the shoot. Any changes in lighting can make it more challenging to merge the images seamlessly. |
Practice and Patience | Focus stacking can be a skill that requires practice and patience. Experiment with different subject distances, aperture settings, and focus point intervals to refine your technique. Over time, you’ll become more proficient in taking pictures for focus stacking. 😉 |
After being out in the field and capturing a stack of images of your preferred subject, it is time to invest some time in post-processing. Both Adobe Photoshop and Helicon Focus provide functionality to stack the images and do some post-processing. Adobe Lightroom can be used for the final touches regarding lighting and color grading. Because this is a very subjective task, I won’t cover it in this post. Rather, I will focus on Helicon Focus for stacking the images.
The first step is to open the stack of images in Helicon Focus. The software supports the following formats: JPEG, TIFF, and various RAW formats with 8 and 16 bits per channel. Helicon Focus is able to open RAW formats thanks to its integration of Adobe’s Digital Negative Converter. This enables you to convert camera-specific raw files, such as ARW for Sony cameras, into the more universal DNG raw format, with the emphasis lying on backward compatibility.
Helicon Focus provides you with three different stacking methods: a) weighted average, b) depth map, and c) pyramid. Which one of the methods (a, b, or c) will work best for you depends not only on the image itself but also on the number of images in the stack and whether these images have been taken in consecutive order. Helicon Focus suggests some of these for particular situations, but there is no guarantee that the stacked image will turn out without any artifacts.
Helicon Focus recommends the methods as follows:
Method A determines pixel weights by analyzing their contrast and then calculates the average of all pixels from the source images based on these weights.
Method B identifies the source image containing the sharpest pixel and generates a “depth map” using this data. This method assumes that the images are captured sequentially, either from front to back or vice versa.
Method C employs a pyramid-based approach for image representation. It yields promising outcomes in challenging scenarios, such as overlapping objects, edges, and deep image stacks, but may intensify contrast and glare.
For mushrooms where the stack has been captured in consecutive order, I personally recommend method B. Feel free to try out the other methods as well.
The most important parameter for processing is the Radius. It is advised to try different values, starting from its minimum of 0, and work your way up from there. Increasing the value will generally yield less noise or artifacts, which are particularly visible as halos along the edges of your subject.
It is especially advised to use a low radius level (3-5) if your images contain a lot of thin lines and very fine details. You will need to strike a balance between preserving fine details and reducing noise and the halo effect around the edges of the subject. Increasing the radius will always help you reduce or get rid of the halo effect. Generally, you can stop increasing this value once the halo is no longer noticeable: the higher you go, the more details you will lose.
How come increasing the radius parameter leads to a decrease in the halo and also to a decrease in the details?
The radius parameter specifies the pixel neighborhood size used for contrast calculations. Each pixel undergoes an assessment to determine its sharpness. Subsequently, the software combines those pixels that exhibit higher contrast, as they are presumed to be sharper.
Here is a visualization of the depth map method with different values for the radius parameter and how it affects the image:
This is generally how all focus-stacking works. The algorithm finds and combines sharp areas.
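A minimal sketch of this idea, assuming a naive depth-map approach (this is not Helicon Focus’s actual implementation): for every pixel, measure local contrast over a neighbourhood of the given radius, then copy the pixel from whichever source image is sharpest there.

```python
import numpy as np

def focus_stack(images, radius=2):
    """Naive depth-map focus stacking: for every pixel, pick the source
    image with the highest local contrast. Hypothetical sketch only."""
    stack = np.stack([img.astype(float) for img in images])  # (n, H, W)
    # Per-pixel contrast: absolute Laplacian (difference from 4 neighbours).
    lap = np.abs(4 * stack
                 - np.roll(stack, 1, axis=1) - np.roll(stack, -1, axis=1)
                 - np.roll(stack, 1, axis=2) - np.roll(stack, -1, axis=2))
    # Average the contrast over a (2*radius+1)^2 neighbourhood -- this is
    # the role the "radius" parameter plays.
    sharp = np.zeros_like(lap)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            sharp += np.roll(np.roll(lap, dy, axis=1), dx, axis=2)
    depth_map = np.argmax(sharp, axis=0)  # index of sharpest source image
    merged = np.take_along_axis(stack, depth_map[None], axis=0)[0]
    return merged, depth_map
```

A larger radius averages contrast over a wider area, which suppresses halos along edges but also blurs the depth map, which is exactly the detail-versus-halo trade-off described above.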
Here, another parameter is introduced: Smoothing. Smoothing defines how the sharp areas are combined. The relation here is that a lower value produces a sharper image. But you have to keep in mind that the transition areas may contain some artifacts, which can be mitigated when the smoothing value is increased. The advice here is to start with the lowest smoothing value and then work your way up. Not enough detail in the image? Decrease the smoothing value. Too much noise and too many artifacts? Set a higher value.
Here is a visualization of the depth map method with different values for the smoothing parameter and how it affects the image:
Here, you can observe the result of stacking 27 images of two mushroom specimens discovered on a dead tree trunk in a local forest. It is evident that not only the caps of the mushrooms are in focus, but also the gills and the stems, thanks to the focus stacking method.
The following introductory part is based on the Stanford CS229 lecture notes by Andrew Ng and Tengyu Ma.
The k-means clustering problem is an unsupervised learning problem. We are given a training set \(\{x^{(1)},...,x^{(n)}\}\) that we want to group into cohesive clusters, where \(x^{(i)} \in \mathbb{R}^{d}\), but no labels \(y^{(i)}\) are provided.
The k-means clustering algorithm is defined as follows:
Randomly initialize the cluster centroids \(\mu_1,\mu_2,...,\mu_k \in \mathbb{R}^{d}\)
Repeat until convergence:
i) For every i, set
\[c^{(i)} := \underset{j}{\text{arg min}} \, || x^{(i)} - \mu_j ||^2.\]ii) For every j, set
\[\mu_j := \frac{\sum_{i=1}^n 1\{c^{(i)}=j\}x^{(i)}}{\sum_{i=1}^n 1\{c^{(i)}=j\}}.\]In this algorithm, \(k\) is a parameter and denotes the number of clusters we want to find; the cluster centroids \(\mu_j\) denote our current guesses for the positions of the centers of the clusters. To set the initial cluster centroids (as described in step 1 of the algorithm above), one approach is to randomly select \(k\) training examples and then assign the cluster centroids to have the same values as these \(k\) selected examples. It’s worth noting that there are alternative methods for initializing cluster centroids as well. Check this paper out if you’re interested in cluster center initialization algorithms, or this one for big-data-optimized approaches to cluster initialization.
The inner loop of the algorithm iteratively performs two key steps:
i) Assigning each training example \(x^{(i)}\) to its closest cluster centroid \(\mu_j\).
ii) Updating each cluster centroid \(\mu_j\) to be the mean of the points assigned to it.
These two key steps are repeated until convergence. But will the k-means algorithm always converge?
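The two inner-loop steps above can be sketched in a few lines of NumPy (the `mu0` parameter is just a convenience for pinning the initialization and is not part of the classic formulation):

```python
import numpy as np

def kmeans(X, k, iters=100, seed=0, mu0=None):
    """Minimal k-means sketch. X: (n, d) data, k: number of clusters;
    mu0 optionally fixes the initial centroids."""
    rng = np.random.default_rng(seed)
    # Initialize centroids as k randomly chosen training examples.
    mu = (mu0 if mu0 is not None
          else X[rng.choice(len(X), size=k, replace=False)]).astype(float)
    for _ in range(iters):
        # Step i: assign every example to its closest centroid.
        c = ((X[:, None] - mu[None]) ** 2).sum(-1).argmin(1)
        # Step ii: move each centroid to the mean of its assigned points
        # (empty clusters keep their old centroid).
        new_mu = np.array([X[c == j].mean(axis=0) if np.any(c == j) else mu[j]
                           for j in range(k)])
        if np.allclose(new_mu, mu):  # converged
            break
        mu = new_mu
    return c, mu
```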
Using the vectorized form, which sums up the squared distances for all data points in a more compact way, the distortion function can be defined as:
\[J(c,\mu) = \sum_{i=1}^{N} ||x^{(i)} - \mu_{c^{(i)}}||^2\]\(J\) is a measure of how well the data points are clustered around their respective cluster centroids. For each training example \(x^{(i)}\), \(J\) measures the squared distance between \(x^{(i)}\) and the cluster centroid \(\mu_{c^{(i)}}\) to which it has been assigned, summed over all examples. This means that \(J\) quantifies how spread out or how far each data point is from its assigned centroid.
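In code, the distortion \(J\) is a one-liner (taking the assignments and centroids from any k-means run):

```python
import numpy as np

def distortion(X, c, mu):
    """Distortion J(c, mu): sum of squared distances from each data
    point X[i] to its assigned centroid mu[c[i]]."""
    return float(((X - mu[c]) ** 2).sum())
```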
It can be proven that k-means is exactly “coordinate descent” on \(J\). Coordinate descent is an optimization technique where you iteratively minimize a function with respect to one variable while holding the others fixed. In the context of k-means: The inner-loop of k-means refers to the iterative steps of the algorithm. The algorithm repeats these steps until it converges. In each iteration of the inner-loop, the algorithm alternates between two tasks:
Minimizing \(J\) with respect to the cluster assignments (\(c\)) while keeping the cluster centroids (\(\mu\)) fixed.
Minimizing \(J\) with respect to the cluster centroids (\(\mu\)) while keeping the cluster assignments (\(c\)) fixed.
Because k-means is “coordinate descent” \(J\) must monotonically decrease during the iterative process. The value of \(J\) must eventually converge to a stable value. This is because, in each iteration, the algorithm strives to improve the clustering, which should lead to a lower value of \(J\). The convergence of \(J\) implies that both the cluster assignments (\(c\)) and the cluster centroids (\(\mu\)) will also converge. As \(J\) decreases and converges, the algorithm refines the cluster assignments and centroids, moving them towards a better clustering solution.
Note, however, that \(J\) is a non-convex function, which means that it is susceptible to local optima. More often than not, \(k\)-means will work fine and this should be no reason to worry. But if you are afraid that it does not come up with very good clusterings, there is always the possibility to run \(k\)-means many times (with varying random initial values for the cluster centroids \(\mu_j\)). The best clustering can then be determined as the run with the lowest \(J(c,\mu)\).
Our subject study for this blog post will consist of this image:
Digital images consist of pixels arranged in a grid, with each pixel containing data about color and brightness. In color images, pixels use red, green, and blue (RGB) channels, each with values from 0 to 255, determining the pixel’s color. Grayscale images rely on a single brightness value per pixel, ranging from 0 (black) to 255 (white), producing various shades of gray.
Now let’s visualize this image in 3D. First we read the image and resize it. Then we split the image into red, green, and blue channels. After that we extract pixel data from each channel and plot the result:
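A sketch of the channel-splitting step is shown below; `img` stands for the photo after reading and resizing it, and the plotting part is left as comments since it assumes matplotlib is installed:

```python
import numpy as np

def channel_points(img):
    """Split an (H, W, 3) RGB array into flat R, G, B coordinate arrays
    for a 3D scatter plot. 'img' would normally come from reading and
    resizing the photo, e.g. with PIL."""
    return tuple(img[..., i].ravel() for i in range(3))

# Plotting sketch (assumes matplotlib):
# import matplotlib.pyplot as plt
# r, g, b = channel_points(img)
# ax = plt.figure().add_subplot(projection="3d")
# ax.scatter(r, g, b, c=img.reshape(-1, 3) / 255.0, s=2)
# ax.set_xlabel("Red"); ax.set_ylabel("Green"); ax.set_zlabel("Blue")
# plt.show()
```

Colouring each point by its own RGB value makes the clusters of similar colours visible directly in the scatter plot.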
Remember, in k-means clustering \(k\) is a parameter that denotes the number of well-defined clusters we want to find. There are various methods to help you determine the optimal value of \(k\), such as the “elbow method,” silhouette analysis, or even domain-specific knowledge. Each method has its strengths and limitations, and the choice of \(k\) often involves a balance between the desired granularity of clustering and the interpretability of the results.
The Gap statistic is one of the standard methods to determine the appropriate number of clusters in a data set. This method can be used for the output of the k-means clustering algorithm, where it compares the change in within-cluster dispersion \(W_k\) standardized as \(\log(W_k)\) with that expected under an appropriate reference null distribution.
Although this method outperforms others, it is sometimes not able to suggest the correct number of clusters, for example for data derived from exponential distributions. Mohajer et al. suggest using \(W_k\) instead of \(\log(W_k)\) and comparing the expectations of \(W_k\) under a null reference distribution.
The method goes back to speculations by Robert L. Thorndike in 1953.
The elbow method is a popular technique used to determine the optimal number of clusters \(k\). It helps you find a suitable value for \(k\) by analyzing the inertia as a function of the number of clusters.
In the process of determining the ideal number of clusters for your data, you start by running the algorithm with different cluster counts. Following this, you calculate the sum of squared distances, known as inertia, for each cluster count. These inertia values are then plotted against the corresponding cluster counts. Your goal is to pinpoint the ‘elbow point’ on the plot, which signifies the juncture where the rate of inertia decrease begins to slow down. This ‘elbow point’ serves as a crucial indicator, representing the most suitable number of clusters.
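The procedure can be sketched as follows. The `inertia` helper below runs a bare-bones k-means seeded with the first \(k\) points, which is a simplification compared to proper random or k-means++ initialization:

```python
import numpy as np

def inertia(X, k, iters=50):
    """Sum of squared distances to the closest centroid after a basic
    k-means run (centroids seeded with the first k points -- a
    simplification for illustration)."""
    mu = X[:k].astype(float)
    for _ in range(iters):
        c = ((X[:, None] - mu[None]) ** 2).sum(-1).argmin(1)
        mu = np.array([X[c == j].mean(axis=0) if (c == j).any() else mu[j]
                       for j in range(k)])
    d2 = ((X[:, None] - mu[None]) ** 2).sum(-1)
    return float(d2.min(1).sum())

# Elbow curve: compute inertia for a range of k and look for the bend.
# curve = [inertia(X, k) for k in range(1, 10)]
```

Plotting `curve` against \(k\) gives exactly the elbow plot described above: inertia always decreases as \(k\) grows, and the “elbow” is where the decrease slows down.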
It’s important to note that while the elbow method is a useful heuristic, it might not always yield a clear elbow point, especially for complex datasets or when clusters have irregular shapes and densities like we have in this example.
To calculate the elbow point, a package called kneedle was used. In this example, I have marked the optimal number of clusters using the KneeLocator method with parameters curve=’convex’ and direction=’decreasing’ from this package.
Logically, since several of the clusters are in close proximity to one another, the elbow method may suggest merging them. This is because placing a single centroid between these neighboring clusters would still result in short distances from the data points to it.
Consequently, a need arises for a more reliable approach to determine the optimal number of clusters for our clustering task. This is where the Silhouette score comes into play.
The Silhouette Score was proposed in 1987 by the Belgian statistician Peter Rousseeuw.
The Silhouette score can generally help to determine the best number of clusters, although it’s not its primary purpose. The Silhouette score is primarily used to assess the quality of existing clusters, but it can be used in a heuristic manner to guide your choice of the number of clusters. It provides a measure by gauging how similar an object is to its own cluster (cohesion) compared to other clusters (separation).
The silhouette coefficient ranges from -1 to 1, where: A coefficient close to +1 indicates that the object is well-matched to its own cluster and poorly-matched to neighboring clusters, implying a good separation. A coefficient around 0 indicates that the object is on or very close to the decision boundary between two neighboring clusters. A coefficient close to -1 indicates that the object is probably assigned to the wrong cluster.
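The coefficient can be computed from scratch in a few lines (libraries such as scikit-learn provide an equivalent `silhouette_score`); this sketch uses pairwise Euclidean distances and is only practical for small data sets:

```python
import numpy as np

def silhouette_coefficient(X, labels):
    """Mean silhouette coefficient: s(i) = (b(i) - a(i)) / max(a(i), b(i)),
    where a(i) is the mean distance from point i to its own cluster and
    b(i) the mean distance to the nearest other cluster."""
    D = np.sqrt(((X[:, None] - X[None]) ** 2).sum(-1))  # pairwise distances
    scores = []
    for i, li in enumerate(labels):
        same = (labels == li)
        same[i] = False                 # exclude the point itself from a(i)
        if not same.any():              # singleton cluster: s(i) := 0
            scores.append(0.0)
            continue
        a = D[i, same].mean()
        b = min(D[i, labels == lj].mean() for lj in set(labels) if lj != li)
        scores.append((b - a) / max(a, b))
    return float(np.mean(scores))
```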
Starting with the lower score, in our example it’s evident that when setting \(k\) to 3, we achieve a silhouette coefficient of 0.58. However, when we increase \(k\) to 6, the silhouette coefficient substantially improves to 0.76.
Combining the two methods for determining the optimal k, we can plot the silhouette score and inertia against the number of clusters. Here, we observe that 6 is a suitable choice for k according to both the elbow method and the silhouette score. We can confidently rule out 3 as a viable option for k.
Now that we have the appropriate k for our k-means clustering algorithm, we can proceed to cluster the data and plot it, assigning each point a color corresponding to its cluster. The resulting plot clearly displays the 6 clusters, each distinguished by its respective cluster centroid color.
After the successful clustering of our image data into 6 distinct groups, each representing one of the 6 most dominant colors, we can also pinpoint the primary color, the one that captures our visual perception.
This selection process is achieved by calculating the frequency of data points assigned to each cluster using the np.bincount function. The cluster index with the highest count is designated as the most dominant cluster, and the color at the centroid of this cluster represents the primary and most dominant color in the image.
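Sketched in code, with `labels` and `centroids` being the outputs of whatever k-means run was used:

```python
import numpy as np

def dominant_color(labels, centroids):
    """Return the centroid of the most populated cluster -- the image's
    dominant colour."""
    counts = np.bincount(labels)       # number of pixels per cluster
    return centroids[counts.argmax()]  # colour of the biggest cluster
```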
This blog post will cover the requirements that must be met for an AHVN13 number to be considered valid, using FHIRPath for validation. Additionally, we will explore the AHVN13 number and its potential applications as an identifier, such as in an Implementation Guide.
The AHV has used an insured person number since 1948. In 2008, a 13-digit AHVN was introduced, which can be systematically used under specific conditions. However, using the AHVN outside of the AHV is only permitted under certain circumstances. The AHVN does not provide any personal information about the holder, making it impossible to draw any conclusions about their personal characteristics. The unique patient identification number for the electronic patient dossier (EPD) is a register-specific number managed by the ZAS, which is linked to the AHVN.
Example: 756.2295.8830.70
Provided that AHVN13 is represented as a dotless string, the FHIRPath expressions below can be utilized to perform the corresponding validations:
The function startsWith(prefix : String) : Boolean can be used:
value.startsWith('756')
The function matches(regex : String) : Boolean can be used in combination with a simple regex:
value.matches('^[0-9]{13}$')
The FHIRPath expression used in these examples computes the check digit based on EAN-13 and verifies it against the given check digit.
(((10-(28+(value.substring(3,1).toInteger()*3)+(value.substring(4,1).toInteger()*1)+(value.substring(5,1).toInteger()*3)+(value.substring(6,1).toInteger()*1)+(value.substring(7,1).toInteger()*3)+(value.substring(8,1).toInteger()*1)+(value.substring(9,1).toInteger()*3)+(value.substring(10,1).toInteger()*1)+(value.substring(11,1).toInteger()*3))mod(10))mod(10))=value.substring(12,1).toInteger())
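For illustration, the three FHIRPath checks (prefix, format, and EAN-13 check digit) can be mirrored in Python; `is_valid_ahvn13` is a hypothetical helper, not part of any FHIR library:

```python
import re

def is_valid_ahvn13(value: str) -> bool:
    """Validate a dotless AHVN13 string: '756' prefix, exactly 13
    digits, and an EAN-13 check digit in the last position."""
    if not value.startswith('756'):
        return False
    if not re.fullmatch(r'[0-9]{13}', value):
        return False
    # EAN-13: weights alternate 1, 3, 1, 3, ... over the first 12 digits.
    s = sum(int(d) * (1 if i % 2 == 0 else 3)
            for i, d in enumerate(value[:12]))
    return (10 - s % 10) % 10 == int(value[12])
```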
In this blog post, two examples are provided regarding check digits: one with a valid AHVN13 and the other with an invalid one.
The CHCorePatient profile is an example of a resource that makes use of an identifier, i.e., amongst others, the AHVN13Identifier. This identifier profile has multiple constraints attached to it. In FHIR terms these are called invariants. Namely, they are ahvn13-length, ahvn13-startswith756 and ahvn13-digit-check.
Profile: AHVN13Identifier
Parent: Identifier
Id: ch-core-ahvn13-identifier
Title: "AHVN13 / NAVS13 Identifier"
Description: "Identifier holding a 13 digit social security number. The number shall have exactly 13 digits and shall not contain point characters for separation."
* system 1..
* system = "urn:oid:2.16.756.5.32" (exactly)
* value 1..
* value obeys ahvn13-length and ahvn13-startswith756 and ahvn13-digit-check
For deduplication purposes the invariants were not defined inline, but rather in separate .fsh files in a dedicated folder.
Invariant: ahvn13-digit-check
Description: "AHVN13 / NAVS13 must pass digit check - https://www.gs1.org/services/how-calculate-check-digit-manually"
Severity: #error
Expression: "(((10-(28+(value.substring(3,1).toInteger()*3)+(value.substring(4,1).toInteger()*1)+(value.substring(5,1).toInteger()*3)+(value.substring(6,1).toInteger()*1)+(value.substring(7,1).toInteger()*3)+(value.substring(8,1).toInteger()*1)+(value.substring(9,1).toInteger()*3)+(value.substring(10,1).toInteger()*1)+(value.substring(11,1).toInteger()*3))mod(10))mod(10))=value.substring(12,1).toInteger())"
Invariant: ahvn13-length
Description: "AHVN13 / NAVS13 must be exactly 13 characters long"
Severity: #error
Expression: "value.matches('^[0-9]{13}$')"
Invariant: ahvn13-startswith756
Description: "AHVN13 / NAVS13 must start with 756"
Severity: #error
Expression: "value.startsWith('756')"
The purpose of the check digit is to detect input errors and confirm the correctness of the social security number. How is the check digit created? To demonstrate, let’s examine the AHVN13 (756.2295.8830.70) and confirm that the correct check digit is indeed 0.
Format | D1 | D2 | D3 | D4 | D5 | D6 | D7 | D8 | D9 | D10 | D11 | D12 | D13 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
GTIN-13 | 7 | 5 | 6 | 2 | 2 | 9 | 5 | 8 | 8 | 3 | 0 | 7 | 0 |
\[\displaylines{ (D1 \cdot 1) + (D2 \cdot 3) + (D3 \cdot 1) + \\ (D4 \cdot 3) + (D5 \cdot 1) + (D6 \cdot 3)+ \\ (D7 \cdot 1) + (D8 \cdot 3) + (D9 \cdot 1) + \\ (D10 \cdot 3) + (D11 \cdot 1) + (D12 \cdot 3) \\ = sum }\]Step 1: The individual digits D1 - D12 are multiplied by 1 and by 3 alternately from left to right and the sum is calculated.
or with the numbers inserted:
\[\displaylines{ (7 \cdot 1) + (5 \cdot 3) + (6 \cdot 1) + \\ (2 \cdot 3) + (2 \cdot 1) + (5 \cdot 3) + \\ (9 \cdot 1) + (8 \cdot 3) + (8 \cdot 1) + \\ (3 \cdot 3) + (0 \cdot 1) + (7 \cdot 3) \\ = 130 }\]Since the country code (756) is always the same, we can shorten the expression as follows:
\[\displaylines{ (7 \cdot 1) + (5 \cdot 3) + (6 \cdot 1) \\ = 28 }\]Which leaves us with:
\[\displaylines{ 28 + \\ (2 \cdot 3) + (2 \cdot 1) + (5 \cdot 3)+ \\ (9 \cdot 1) + (8 \cdot 3) + (8 \cdot 1) + \\ (3 \cdot 3) + (0 \cdot 1) + (7 \cdot 3) \\ = 130 }\]\[130 - 130 = 0\]Step 2: Subtract the sum from the nearest equal or higher multiple of ten.
Result: In our case given the sum of 130, the number itself is the nearest equal or higher multiple of ten. By subtracting 130 from it, we therefore get the checksum 0.
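The two steps can be mirrored in a short sketch; note that `-total % 10` in Python is equivalent to subtracting the sum from the nearest equal-or-higher multiple of ten:

```python
def check_digit(d12: str) -> int:
    """EAN-13 check digit for the 12 data digits D1-D12.
    Step 1: multiply digits alternately by 1 and 3 and sum.
    Step 2: distance from the sum up to the next multiple of ten."""
    total = sum(int(d) * w for d, w in zip(d12, [1, 3] * 6))
    return -total % 10

print(check_digit("756229588307"))  # the worked example above: 0
```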
Here are two examples that demonstrate how the FHIRPath expression validates the AHVN13 with the check digit. The first example shows a valid AHVN13, while the second shows an invalid one.
AHVN13 is correct: 756.1234.5678.97
(((10-(28+(value.substring(3,1).toInteger()*3)+(value.substring(4,1).toInteger()*1)+(value.substring(5,1).toInteger()*3)+(value.substring(6,1).toInteger()*1)+(value.substring(7,1).toInteger()*3)+(value.substring(8,1).toInteger()*1)+(value.substring(9,1).toInteger()*3)+(value.substring(10,1).toInteger()*1)+(value.substring(11,1).toInteger()*3))mod(10))mod(10))=value.substring(12,1).toInteger())
The above FHIRPath expression is equivalent to:
\[\displaylines{ (((10-(28+(1\cdot3)+(2\cdot1)+(3\cdot3)+(4\cdot1)+(5\cdot3)+ \\ (6\cdot1)+(7\cdot3)+(8\cdot1)+(9\cdot3)) \bmod (10)) \bmod (10))=7) }\]Here the left-hand side evaluates to 7, which matches the given check digit, so the check passes.

AHVN13 is incorrect: 756.2435.3002.24

(((10-(28+(value.substring(3,1).toInteger()*3)+(value.substring(4,1).toInteger()*1)+(value.substring(5,1).toInteger()*3)+(value.substring(6,1).toInteger()*1)+(value.substring(7,1).toInteger()*3)+(value.substring(8,1).toInteger()*1)+(value.substring(9,1).toInteger()*3)+(value.substring(10,1).toInteger()*1)+(value.substring(11,1).toInteger()*3))mod(10))mod(10))=value.substring(12,1).toInteger())

The above FHIRPath expression is equivalent to:

\[\displaylines{ (((10-(28+(2\cdot3)+(4\cdot1)+(3\cdot3)+(5\cdot1)+(3\cdot3)+ \\ (0\cdot1)+(0\cdot3)+(2\cdot1)+(2\cdot3)) \bmod (10)) \bmod (10))=4) }\]Here the left-hand side evaluates to 1, which does not equal the given check digit 4, so the check fails.