Using Image Metadata in Formulas and Scripts
Most SEMs store additional information about an SEM image as image metadata, sometimes embedded within the image file itself and in other cases in separate "sidecar" files. ProSEM reads image metadata when available, and makes that information accessible to users either through Formulas in the Variables panel or within Scripts.
Seeing Image Metadata
Metadata associated with an image is visible in ProSEM's Image Information Panel. If this panel is not visible, it can be made so by selecting it in ProSEM's View menu.
This panel displays two sections of information about the image.
The first section exists for all images, and contains information about the image file itself, including:
- The file path
- The pixel count in X and Y
- The pixel size
- Image Rotation applied within ProSEM
- The image field of view
The second section exists only if the image file has associated metadata, either embedded within the image or in an adjunct sidecar file read by ProSEM. The contents depend entirely on the SEM software and settings, and vary by SEM manufacturer and model. The Image Information panel shows the available metadata. Metadata is stored in a dictionary-style data structure, with a name, or "key", and an associated value. In many cases, the meaning of the information is obvious; in some cases, however, users may need to consult their SEM manual or applications support for proper interpretation of metadata items.
For example, for SEM images from various manufacturers' tools, the Working Distance might be found in various forms, including:
WorkingDistance : 0.00786781
$$SM_WD : 3.6
WD : 5.3 mm
The interpretation is generally clear, but obviously depends strongly on the tool's software. Here, the first is expressed in meters, the second in millimeters, and the third has an explicit unit included. Depending on how the data is to be used, it may be necessary to convert to different units, or remove the units so that a numerical comparison can be made.
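As a rough sketch of the unit handling described above, the three example entries could be normalized to millimeters as follows. The function name working_distance_mm is hypothetical, and the key names and unit conventions are simply those of the three examples; other tools may well differ.

```python
import re

def working_distance_mm(key, value):
    """Return the working distance in millimeters as a float.

    Assumption: key names and units follow the three examples in the text.
    """
    if key == "WorkingDistance":   # value in meters, no unit in the string
        return float(value) * 1000.0
    if key == "$$SM_WD":           # value already in millimeters
        return float(value)
    if key == "WD":                # value carries an explicit "mm" unit
        match = re.fullmatch(r"\s*(-?\d+(?:\.\d+)?)\s*mm\s*", value)
        if match is None:
            raise ValueError(f"unexpected WD format: {value!r}")
        return float(match.group(1))
    raise KeyError(f"no working-distance rule for key {key!r}")
```

Raising an error for unknown keys or formats, rather than guessing, keeps a silent unit mistake from propagating into later comparisons.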
All metadata values are reported as strings, even when the data is numeric. For some uses the string representation is fine, for example if the task is just to print the metadata value for labeling or identification. In other applications, it is necessary to process the metadata value in one or more ways:
- Convert the data into a numeric type, for example for comparison, or bounds checking
- Extract just a portion of the metadata string
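The numeric conversion matters because strings compare character by character: as strings, "10.0" sorts before "9.5". A minimal sketch of a bounds check in a Python script might look like this; the helper name metadata_in_range is hypothetical and not part of ProSEM.

```python
def metadata_in_range(value, low, high):
    """Return True if the metadata string parses as a number within [low, high]."""
    try:
        number = float(value)      # fails for strings such as "5.3 mm"
    except ValueError:
        return False
    return low <= number <= high
```

A value that still carries a unit, such as "5.3 mm", is rejected here; the unit would have to be stripped first.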
|Description||Metadata Content||Desired Result||
|Extract numeric value from string, method 1: fixed position||5.32 mm||5.32||
|Simple but least flexible method; depends on the metadata string always having exactly the same number of digits|
|Extract numeric value from string, method 2: using regular expression||5.32 mm||5.32||
|Using regex, finds a sequence of digits, with an optional decimal point and an optional leading negative sign|
|Extract numeric stage position using regex||X=40.4550, Y=11.8194, R=356.5767, Z=4.00, T=0.00||11.8194||
|Using regex, finds a sequence of digits, with an optional decimal point or negative sign, immediately following the string "Y="|
|Extract data from image filename||Dose_140||140||
|Using regex, finds the digits following the underscore character, up to the end of the image name|
|Extract data from image filename||WFR_126A_DOS_0.70_DEN_025_25||0.70||
|Using regex, finds the digits following the literal string "_DOS_"|
In scripts, the regular expression library (the standard Python module re) must be imported in the header of the Python script.
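As a sketch of how the examples in the table above might look in a Python script, assuming the standard re module is the library meant here: the input strings are the ones from the table, while the variable names and the NUMBER pattern are illustrative choices, not ProSEM's own.

```python
import re

NUMBER = r"-?\d+\.?\d*"   # optional minus sign, digits, optional decimal part

# Method 1: fixed position. Works only if the value always has four characters.
wd_fixed = float("5.32 mm"[:4])

# Method 2: regular expression, first number found in the string.
wd_regex = float(re.search(NUMBER, "5.32 mm").group(0))

# Stage position: the number immediately following "Y=".
stage = "X=40.4550, Y=11.8194, R=356.5767, Z=4.00, T=0.00"
stage_y = float(re.search(r"Y=(" + NUMBER + r")", stage).group(1))

# Image name: digits after an underscore, up to the end of the name.
dose = int(re.search(r"_(\d+)$", "Dose_140").group(1))

# Image name: number following the literal string "_DOS_" (kept as a string).
dos = re.search(r"_DOS_(" + NUMBER + r")", "WFR_126A_DOS_0.70_DEN_025_25").group(1)
```

In real use, re.search returns None when nothing matches, so a script would normally check the result before calling group.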
Regular Expressions (regex): Really Brief Overview
Regular Expressions are a very common method for string processing, but can also be a bit obscure and complex. There are many online resources for learning about Regular Expressions, but for ProSEM use, many tasks can be accomplished with just a very small subset of the capabilities, summarized here:
General form for applying a regular expression to a string:
EXPR.exec(STRING)
This applies the regular expression EXPR to the string STRING, and returns the first match result.
EXPR is a regular expression literal, and generally has the pattern to be matched enclosed in slashes, for example:
/ProSEM/
will match the string "ProSEM" if contained in the character string STRING. When working with SEM metadata, it is often useful to extract just portions of the full metadata string. For this, one or more 'groups' are defined in the regex; groups are enclosed by parentheses, and usually match content using a set of metacharacters within the parentheses.
For example, the expression
/ProSEM v(\d\.\d\.\d)/
when applied to the string "ProSEM v2.8.4" will return the grouped match "2.8.4". The portion of the expression outside the grouping parentheses matches the literal 'ProSEM v', and the portion inside the parentheses consists of a series of special sequences, each starting with a backslash '\'. Here the pattern \d\.\d\.\d looks for a digit, then a period, then another digit, then another period, and then a final digit. Note that this simple example will only match single digits, so if the version string were 2.10.1, this would not match and no value would be returned. The expression can be modified to match one or more digits in each location by adding a quantifier character to the digit indicator. In this example, the '+' quantifier after the '\d' digit character indicates to match one or more digits in a row, so the expression:
/ProSEM v(\d+\.\d+\.\d+)/
will match either "ProSEM v2.8.4" or "ProSEM v2.10.1".
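The effect of the '+' quantifier can be checked in a Python script with the re module. This is only a sketch: the helper first_group is hypothetical, and re.search is assumed here to play the role that EXPR.exec plays in the text.

```python
import re

# Pattern without and with the '+' quantifier on each digit.
single_digits = re.compile(r"ProSEM v(\d\.\d\.\d)")
multi_digits = re.compile(r"ProSEM v(\d+\.\d+\.\d+)")

def first_group(pattern, text):
    """Return the first captured group, or None if the pattern does not match."""
    match = pattern.search(text)
    return match.group(1) if match else None
```

With these definitions, first_group(single_digits, "ProSEM v2.10.1") gives None, while first_group(multi_digits, "ProSEM v2.10.1") gives "2.10.1".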
This expression could also be modified to capture the three portions of the version number individually, so that this expression:
/ProSEM v(\d+)\.(\d+)\.(\d+)/
when matched against a version string of the correct form such as "ProSEM v2.8.4" will now return three distinct results. For the string above, indexing the result of EXPR.exec(metadata_string) gives 2 for the first grouped match, 8 for the second grouping, and 4 for the third.
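In a Python script, the same three-group idea might be sketched as below. Note that this uses Python's re conventions, where group(0) is the whole match and group(1), group(2), group(3) are the captured groups; the function name parse_version is hypothetical.

```python
import re

VERSION = re.compile(r"ProSEM v(\d+)\.(\d+)\.(\d+)")

def parse_version(text):
    """Return (major, minor, patch) as integers, or None if there is no match."""
    match = VERSION.search(text)
    if match is None:
        return None
    # groups() returns the three captured strings in order
    return tuple(int(part) for part in match.groups())
```

Converting each group to an integer allows a numeric comparison, so that (2, 10, 1) correctly compares as newer than (2, 8, 4), which a plain string comparison of "2.10.1" and "2.8.4" would get wrong.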
Regular Expressions (regex) Most Common Elements
|\s||Character: matches white space||
|\d||Character: matches a digit, 0-9||
|.||Character: matches any character||
|^||Anchor: start of string||
|$||Anchor: end of string||
|*||Quantifier: matches 0 or more of the preceding element||
|+||Quantifier: matches 1 or more of the preceding element||
|( )||Grouping: matches and captures the enclosed pattern. For multiple groups, the results are available by indexing the full result, one index per group in order||
|[abc]||Set: matches a or b or c. A range such as [a-j] matches a letter from a to j||
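The elements listed above can be combined in a few lines of Python. The following is only an illustrative sketch using the re module; the three helper functions are hypothetical names chosen for this example.

```python
import re

def is_whole_number(text):
    # ^ and $ anchor to the start and end; \s* allows surrounding white space
    return re.search(r"^\s*\d+\s*$", text) is not None

def split_value_and_unit(text):
    # (\d+\.?\d*) captures the number, \s* skips spaces,
    # ([a-z]+) captures a unit made of lowercase letters
    match = re.search(r"^(\d+\.?\d*)\s*([a-z]+)$", text)
    return match.groups() if match else None

def leading_letter_a_to_j(text):
    # [a-j] matches a single letter in the range a to j
    match = re.search(r"^[a-j]", text)
    return match.group(0) if match else None
```

For instance, split_value_and_unit("5.3 mm") separates the numeric part from the unit, which is often the first step before converting a metadata value to a number.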