Title: | Binary and Categorical Image Similarity Index |
---|---|
Description: | Computes a structural similarity metric (after the style of MS-SSIM for images) for binary and categorical 2D and 3D images. Can be based on accuracy (simple matching), Cohen's kappa, Rand index, adjusted Rand index, Jaccard index, Dice index, normalized mutual information, or adjusted mutual information. In addition, has fast computation of Cohen's kappa, the Rand indices, and the two mutual informations. Implements the methods of Thompson and Maitra (2020) <doi:10.48550/arXiv.2004.09073>. |
Authors: | Geoffrey Thompson [aut, cre] |
Maintainer: | Geoffrey Thompson <[email protected]> |
License: | GPL-3 |
Version: | 0.2.4 |
Built: | 2024-10-30 06:42:41 UTC |
Source: | https://github.com/gzt/catsim |
A matrix representing a two-color hand-drawn scene designed specifically to contain some awkward features for an image reconstruction method evaluated in the paper.
data(besag)
A matrix with entries 1 and 2 denoting the color of the corresponding pixels. The example code will produce the image as it is in the original paper. To use as a 0-1 binary dataset, either use besag - 1 or besag %% 2.
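The two recodings mentioned above do not give the same 0-1 coding. The following base R sketch uses a small toy matrix (a stand-in, not the real besag data) to show that subtracting 1 and taking the remainder modulo 2 produce complementary images.

```r
# Toy stand-in for a two-color image coded 1 and 2 (not the real besag data).
m <- matrix(c(1, 2, 2, 1, 1, 2), nrow = 2)

# Subtracting 1 maps color 1 -> 0 and color 2 -> 1.
shifted <- m - 1

# Taking the remainder modulo 2 maps color 1 -> 1 and color 2 -> 0.
parity <- m %% 2

# The two codings are complements of each other.
all(shifted + parity == 1)
```

Either coding is a valid binary image; which color ends up as 1 only matters for indices that treat the two classes asymmetrically.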
J. Besag, “On the statistical analysis of dirty pictures,” Journal of the Royal Statistical Society: Series B (Methodological), vol. 48, no. 3, pp. 259–279, 1986. doi:10.1111/j.2517-6161.1986.tb01412.x
image(besag[, 88:1])
This computes the categorical or binary structural similarity index metric on a whole-image scale. Unlike the default 2-D method, it considers the whole image at once and at a single scale, rather than computing the index over a sliding window and downsampling to consider it at other scales.
binssim( x, y, alpha = 1, beta = 1, gamma = 1, c1 = 0.01, c2 = 0.01, method = "Cohen", ... )
x, y | binary or categorical image |
---|---|
alpha | normalizing parameter, by default 1 |
beta | normalizing parameter, by default 1 |
gamma | normalizing parameter, by default 1 |
c1 | small normalization constant, by default 0.01 |
c2 | small normalization constant, by default 0.01 |
method | which similarity index to use; by default Cohen's kappa ("Cohen") |
... | Constants can be passed to the components of the index. |
Structural similarity index.
set.seed(20181207)
x <- matrix(sample(1:4, 10000, replace = TRUE), nrow = 100)
y <- x
for (i in 1:100) y[i, i] <- 1
for (i in 1:99) y[i, i + 1] <- 1
binssim(x, y)
The categorical structural similarity index measure for 2D categorical or binary images for multiple scales. The default is to compute over 5 scales.
catmssim_2d( x, y, levels = NULL, weights = NULL, window = 11, method = "Cohen", ..., random = "random" )
x, y | a binary or categorical image |
---|---|
levels | how many levels of downsampling to use. By default, 5. |
weights | a vector of weights for the different scales. Defaults to NULL. |
window | by default 11 for 2D and 5 for 3D images, but can be specified as a vector if the window sizes differ by dimension. The vector must have the same number of dimensions as the input images. |
method | which similarity index to use; by default Cohen's kappa ("Cohen") |
... | additional constants can be passed to internal functions. |
random | whether to use a deterministic PRNG; by default "random" |
a value less than 1 indicating the similarity between the images.
set.seed(20181207)
x <- matrix(sample(0:3, 128^2, replace = TRUE), nrow = 128)
y <- x
for (i in 1:128) y[i, i] <- 0
for (i in 1:127) y[i, i + 1] <- 0
# "Cohen" is the default method
catmssim_2d(x, y, method = "Cohen", levels = 2)
# now using a different similarity score (normalized mutual information)
catmssim_2d(x, y, method = "NMI")
The categorical structural similarity index measure for 3D categorical or binary images for multiple scales. The default is to compute over 5 scales. This computes a 3D measure based on 5 × 5 × 5 windows by default, with 5 levels of downsampling.
catmssim_3d_cube( x, y, levels = NULL, weights = NULL, window = 5, method = "Cohen", ..., random = "random" )
x, y | a binary or categorical image |
---|---|
levels | how many levels of downsampling to use. By default, 5. |
weights | a vector of weights for the different scales. Defaults to NULL. |
window | by default 11 for 2D and 5 for 3D images, but can be specified as a vector if the window sizes differ by dimension. The vector must have the same number of dimensions as the input images. |
method | which similarity index to use; by default Cohen's kappa ("Cohen") |
... | additional constants can be passed to internal functions. |
random | whether to use a deterministic PRNG; by default "random" |
a value less than 1 indicating the similarity between the images.
set.seed(20181207)
dim <- 16
x <- array(sample(0:4, dim^3, replace = TRUE), dim = c(dim, dim, dim))
y <- x
for (j in 1:dim) {
  for (i in 1:dim) y[i, i, j] <- 0
  for (i in 1:(dim - 1)) y[i, i + 1, j] <- 0
}
catmssim_3d_cube(x, y, weights = c(.75, .25))
# Now using a different similarity score
catmssim_3d_cube(x, y, weights = c(.75, .25), method = "Accuracy")
The categorical structural similarity index measure for 3D categorical or binary images for multiple scales. The default is to compute over 5 scales. This computes a 2D measure for each x-y slice of the z-axis and then averages over the z-axis.
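The slice-and-average idea can be illustrated in base R without the package. In this minimal sketch, slice_average() and accuracy_2d() are hypothetical names, and simple matching stands in for the multiscale 2D index, so the numbers are not what catmssim_3d_slice() would return.

```r
# Hypothetical helper: score each x-y slice with a 2D function, then average.
slice_average <- function(x, y, score_2d) {
  stopifnot(length(dim(x)) == 3, identical(dim(x), dim(y)))
  n_slices <- dim(x)[3]
  scores <- vapply(
    seq_len(n_slices),
    function(k) score_2d(x[, , k], y[, , k]),
    numeric(1)
  )
  mean(scores)
}

# Stand-in 2D score: simple matching (accuracy), not the package's index.
accuracy_2d <- function(a, b) mean(a == b)

set.seed(1)
x <- array(sample(0:1, 4 * 4 * 3, replace = TRUE), dim = c(4, 4, 3))
y <- x
y[1, 1, ] <- 1 - y[1, 1, ]  # flip one pixel in every slice
slice_average(x, y, accuracy_2d)  # 15 of 16 pixels match in each slice: 0.9375
```

Because each slice is scored on its own, structure that runs along the z-axis does not influence the result, which is the main difference from a true 3D window.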
catmssim_3d_slice( x, y, levels = NULL, weights = NULL, window = 11, method = "Cohen", ..., random = "random" )
x, y | a binary or categorical image |
---|---|
levels | how many levels of downsampling to use. By default, 5. |
weights | a vector of weights for the different scales. Defaults to NULL. |
window | by default 11 for 2D and 5 for 3D images, but can be specified as a vector if the window sizes differ by dimension. The vector must have the same number of dimensions as the input images. |
method | which similarity index to use; by default Cohen's kappa ("Cohen") |
... | additional constants can be passed to internal functions. |
random | whether to use a deterministic PRNG; by default "random" |
a value less than 1 indicating the similarity between the images.
set.seed(20181207)
dim <- 8
x <- array(sample(0:4, dim^5, replace = TRUE), dim = c(dim^2, dim^2, dim))
y <- x
for (j in 1:(dim)) {
  for (i in 1:(dim^2)) y[i, i, j] <- 0
  for (i in 1:(dim^2 - 1)) y[i, i + 1, j] <- 0
}
# by default method = "Cohen"
catmssim_3d_slice(x, y, weights = c(.75, .25))
# compare to some simple metric:
mean(x == y)
The categorical structural similarity index measure for 2D or 3D categorical or binary images for multiple scales. The default is to compute over 5 scales. This determines whether this is a 2D or 3D image and applies the appropriate windowing, weighting, and scaling. It is a wrapper for the catmssim_2d(), catmssim_3d_slice(), and catmssim_3d_cube() functions, whose functionality can be accessed through the ... arguments.
catsim( x, y, ..., cube = TRUE, levels = NULL, weights = NULL, method = "Cohen", window = NULL )
x, y | a binary or categorical image |
---|---|
... | additional arguments, such as window, can be passed as well as arguments for internal functions. |
cube | for the 3D method, whether to use the true 3D method (catmssim_3d_cube(), cube = TRUE, the default) or the slice method (catmssim_3d_slice(), cube = FALSE) |
levels | how many levels of downsampling to use. By default, 5. |
weights | a vector of weights for the different scales. Defaults to NULL. |
method | which similarity index to use; by default Cohen's kappa ("Cohen") |
window | by default 11 for 2D and 5 for 3D images, but can be specified as a vector if the window sizes differ by dimension. The vector must have the same number of dimensions as the input images. |
a value less than 1 indicating the similarity between the images.
set.seed(20181207)
dim <- 16
x <- array(sample(0:4, dim^3, replace = TRUE), dim = c(dim, dim, dim))
y <- x
for (j in 1:dim) {
  for (i in 1:dim) y[i, i, j] <- 0
  for (i in 1:(dim - 1)) y[i, i + 1, j] <- 0
}
catsim(x, y, weights = c(.75, .25))
# Now using a different similarity score
catsim(x, y, levels = 2, method = "accuracy")
# with the slice method:
catsim(x, y, weights = c(.75, .25), cube = FALSE, window = 8)
gini() is a measure of diversity that goes by a number of different names, such as the probability of interspecific encounter or the Gibbs-Martin index. It is $1 - \sum_i p_i^2$, where $p_i$ is the probability of observing class $i$.

The corrected Gini-Simpson index, ginicorr(), takes the index and corrects it so that the maximum possible is 1. If there are k categories, the maximum possible of the uncorrected index is $1 - 1/k$. It corrects the index by dividing by the maximum. k must be specified.

The modified Gini-Simpson index is similar to the unmodified, except it uses the square root of the summed squared probabilities, that is, $1 - \sqrt{\sum_i p_i^2}$, where $p_i$ is the probability of observing class $i$. The modified corrected Gini index then corrects the modified index for the number of categories, k.
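The four definitions can be written out in a few lines of base R. This is a minimal sketch, not the package's implementation: the function names are hypothetical, the index is assumed to be one minus the sum of squared class proportions, the maxima are assumed to be 1 - 1/k and 1 - 1/sqrt(k), and the input is assumed to contain no missing values.

```r
# Hypothetical helpers mirroring the four definitions; not the package's code.
class_probs <- function(x) as.vector(table(x)) / length(x)

gini_sketch <- function(x) 1 - sum(class_probs(x)^2)
ginicorr_sketch <- function(x, k) gini_sketch(x) / (1 - 1 / k)
sqrtgini_sketch <- function(x) 1 - sqrt(sum(class_probs(x)^2))
sqrtginicorr_sketch <- function(x, k) sqrtgini_sketch(x) / (1 - 1 / sqrt(k))

x <- rep(1:4, 5)            # four equally frequent classes
gini_sketch(x)              # 1 - 4 * (1/4)^2 = 0.75
ginicorr_sketch(x, 4)       # 0.75 / (1 - 1/4) = 1
sqrtgini_sketch(x)          # 1 - sqrt(1/4) = 0.5
sqrtginicorr_sketch(x, 4)   # 0.5 / (1 - 1/2) = 1
```

With equally frequent classes the uncorrected indices sit at their maxima, so both corrected versions come out as 1.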
gini(x)
ginicorr(x, k)
sqrtgini(x)
sqrtginicorr(x, k)
x | binary or categorical image or vector |
---|---|
k | number of categories |
The index (between 0 and 1), with 0 indicating no variation and 1 being maximal. The Gini index is bounded above by $1 - 1/k$ for a group with k categories. The modified index is bounded above by $1 - 1/\sqrt{k}$. The corrected indices fix this by dividing by the maximum.
x <- rep(c(1:4), 5)
gini(x)
ginicorr(x, 4)
sqrtgini(x)
sqrtginicorr(x, 4)
An activation map for a slice of an fMRI phantom and an anatomical reference.
data(hoffmanphantom)
An array with the first slice an activation map for an MRI phantom and the second an anatomical overlay. NA values are outside the surface. The activation map (hoffmanphantom[,,1]) is 1 if activated, 0 otherwise. The second layer (hoffmanphantom[,,2]) indicates the anatomical structure. Approximately 3.8 percent of the pixels are activated in this slice.
E. Hoffman, P. Cutler, W. Digby, and J. Mazziotta, “3-D phantom to simulate cerebral blood flow and metabolic images for PET,” Nuclear Science, IEEE Transactions on, vol. 37, pp. 616 – 620, 05 1990.
I. A. Almodóvar-Rivera and R. Maitra, “FAST adaptive smoothed thresholding for improved activation detection in low-signal fMRI,” IEEE Transactions on Medical Imaging, vol. 38, no. 12, pp. 2821–2828, 2019.
image(hoffmanphantom[, , 2], col = rev(gray(0:15 / 16))[1:4], axes = FALSE)
image(hoffmanphantom[, , 1],
  add = TRUE, zlim = c(0.01, 1),
  col = c("yellow", "maroon")
)
The Rand index, rand_index, computes the agreement between two different clusterings or partitions of the same set of objects. The inputs to the function should be binary or categorical and of the same length.
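As an illustration of what "agreement between partitions" means, the Rand index can be computed by brute force as the share of unordered pairs of objects that the two labelings treat the same way (both together or both apart). The helper rand_sketch() below is hypothetical, builds two n-by-n matrices, and is only meant for small inputs; it is not the package's fast computation.

```r
# Hypothetical helper: Rand index by direct pair counting (O(n^2) memory).
rand_sketch <- function(x, y) {
  stopifnot(length(x) == length(y), length(x) >= 2)
  same_x <- outer(x, x, "==")   # TRUE where x puts the pair together
  same_y <- outer(y, y, "==")   # TRUE where y puts the pair together
  pairs <- upper.tri(same_x)    # count each unordered pair once
  mean(same_x[pairs] == same_y[pairs])
}

x <- c(1, 1, 2, 2)
y <- c(1, 1, 2, 3)
rand_sketch(x, y)        # 5 of the 6 pairs agree
rand_sketch(x, 3 - x)    # relabeling the classes leaves the partition unchanged: 1
```

Only the grouping matters here, so swapping the class labels in one input does not change the result.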
The adjusted Rand index, adj_rand, computes a corrected version of the Rand index, adjusting for the probability of chance agreement of clusterings. Because the denominator of the adjusted Rand index can be zero, a small constant is added to its numerator and denominator to keep the computation stable.
Cohen's kappa, cohen_kappa, is an inter-rater agreement metric for two raters which corrects for the probability of chance agreement. Note that this measure differs from the Rand indices and mutual information: those consider the similarities of the groupings of points, while this considers how often the raters agreed on individual points.
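The point-by-point nature of Cohen's kappa can be seen in a small base R sketch using the textbook unweighted formula, (observed agreement - chance agreement) / (1 - chance agreement). The helper kappa_sketch() is hypothetical; it includes no stabilizing constants and does not guard against a chance agreement of exactly 1, so it may differ from cohen_kappa() in edge cases.

```r
# Hypothetical helper: unweighted Cohen's kappa from observed and chance agreement.
kappa_sketch <- function(x, y) {
  stopifnot(length(x) == length(y))
  lev <- sort(unique(c(x, y)))
  px <- as.vector(table(factor(x, levels = lev))) / length(x)
  py <- as.vector(table(factor(y, levels = lev))) / length(y)
  p_observed <- mean(x == y)   # share of points where the raters agree
  p_chance <- sum(px * py)     # agreement expected from the marginals alone
  (p_observed - p_chance) / (1 - p_chance)
}

x <- c(0, 0, 1, 1)
y <- c(0, 1, 1, 1)
kappa_sketch(x, y)       # (0.75 - 0.5) / (1 - 0.5) = 0.5
kappa_sketch(x, 1 - x)   # labels swapped everywhere: (0 - 0.5) / (1 - 0.5) = -1
```

The second call shows the contrast drawn above: swapping the labels keeps the grouping intact, yet kappa drops to -1 because the raters never agree on an individual point.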
Like the Rand index, the mutual information computes the agreement between two different clusterings or partitions of the same set of objects. If $H(X)$ is the entropy of some probability distribution $X$, then the mutual information of two distributions is $I(X;Y) = -H(X,Y) + H(X) + H(Y)$. The normalized mutual information, normalized_mi, is defined here as $2 I(X;Y) / (H(X) + H(Y))$, but is set to be 0 if both $H(X)$ and $H(Y)$ are 0.
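The entropy-based definition can be sketched in base R from the contingency table of the two labelings. The helper names below are hypothetical, entropies use the natural logarithm, and the normalization 2 I(X;Y) / (H(X) + H(Y)) is an assumption of this sketch, so results may not match normalized_mi() exactly.

```r
# Hypothetical helpers; entropies use the natural logarithm.
entropy_sketch <- function(p) {
  p <- p[p > 0]               # treat 0 * log(0) as 0
  -sum(p * log(p))
}

mi_sketch <- function(x, y) {
  joint <- table(x, y) / length(x)       # joint distribution of the labels
  h_x <- entropy_sketch(rowSums(joint))
  h_y <- entropy_sketch(colSums(joint))
  h_xy <- entropy_sketch(as.vector(joint))
  list(mi = h_x + h_y - h_xy, h_x = h_x, h_y = h_y)
}

nmi_sketch <- function(x, y) {
  parts <- mi_sketch(x, y)
  denom <- parts$h_x + parts$h_y
  if (denom == 0) return(0)              # both entropies are zero
  2 * parts$mi / denom
}

x <- rep(0:1, each = 4)
nmi_sketch(x, x)              # identical partitions: 1
nmi_sketch(x, 1 - x)          # same partition with swapped labels: 1
nmi_sketch(x, rep(0:1, 4))    # independent labels: approximately 0
```

As with the Rand index, only the grouping matters, so swapping labels leaves the value at 1.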
The adjusted mutual information, adjusted_mi, is a correction of the mutual information to account for the probability of chance agreement in a manner similar to the adjusted Rand index or Cohen's kappa.
rand_index(x, y, na.rm = FALSE)
adj_rand(x, y, na.rm = FALSE)
cohen_kappa(x, y, na.rm = FALSE)
normalized_mi(x, y, na.rm = FALSE)
adjusted_mi(x, y, na.rm = FALSE)
x, y | a numeric or factor vector or array |
---|---|
na.rm | whether to remove NA values; by default FALSE |
the similarity index, which is between 0 and 1 for most of the options. The adjusted Rand and Cohen's kappa can be negative, but are bounded above by 1.
W. M. Rand (1971). "Objective criteria for the evaluation of clustering methods". Journal of the American Statistical Association. American Statistical Association. 66 (336): 846–850. doi:10.2307/2284239
Lawrence Hubert and Phipps Arabie (1985). "Comparing partitions". Journal of Classification. 2 (1): 193–218. doi:10.1007/BF01908075
Cohen, Jacob (1960). "A coefficient of agreement for nominal scales". Educational and Psychological Measurement. 20 (1): 37–46. doi:10.1177/001316446002000104
Jaccard, Paul (1912). "The distribution of the flora in the alpine zone". New Phytologist. 11 (2): 37–50. doi:10.1111/j.1469-8137.1912.tb05611.x
Nguyen Xuan Vinh, Julien Epps, and James Bailey (2010). Information Theoretic Measures for Clusterings Comparison: Variants, Properties, Normalization and Correction for Chance. J. Mach. Learn. Res. 11 (December 2010), 2837–2854. https://jmlr.org/papers/v11/vinh10a.html
x <- rep(0:5, 5)
y <- c(rep(0:5, 4), rep(0, 6))
# Simple Matching, or Accuracy
mean(x == y)
# Hamming distance
sum(x != y)
rand_index(x, y)
adj_rand(x, y)
cohen_kappa(x, y)
normalized_mi(x, y)
adjusted_mi(x, y)