In bioinformatics, the more cutting-edge your research direction, the more problems you’ll face on the informatics side. Even when papers report excellent results and the original authors share their code or even provide ready-to-use software tools, that doesn’t mean we can easily use these resources for reproduction or further research. Chaotic environment setup is just one aspect. More often than not, since the authors aren’t professional software engineers, we should be grateful if the tool works at all. We can’t expect such software to be bug-free, nor can we expect decent performance (unless performance was a development goal). Even tools from well-established labs aren’t free of these issues, such as… Azimuth.

Introduction to Azimuth

Azimuth is a single-cell data annotation tool developed by the Satija Lab, designed to simplify the Label Transfer process in Seurat and quickly perform Label Transfer on cells to be classified.

Problem Description

In the current latest version, 0.5.0, running AzimuthReference in a Jupyter Notebook triggers an error: Error in ValidateAzimuthReference(object = object): Reference must contain an AzimuthData object in the tools slot.

A quick Google search reveals that this issue was first reported in April 2024 (issue #219). The user who reported the problem already provided a solution, but the issue remains open and unresolved (last week another user reported encountering the same problem).

Solution Approach

As mentioned earlier, issue reporter zacharyrs has already identified the problem. In commit b1b6895, the code author uses sys.calls() to determine the name of the currently called function and makes subsequent processing decisions based on this name.

tool.name <- as.character(x = sys.calls()) # <-- sys.calls() here returns a list; if not calling AzimuthReference directly, it deletes some information from the object
tool.name <- lapply(
  X = strsplit(x = tool.name, split = "(", fixed = TRUE),
  FUN = "[",
  1
)[[1]]
if (tool.name != "AzimuthReference") {
  slot(object, name = "tools")["AzimuthReference"] <- slot(object, name = "tools")[tool.name]
  slot(object, name = "tools")[tool.name] <- NULL
}
object <- DietSeurat(object = object,
                     counts = FALSE,
                     assays = c("refAssay", assays),
                     dimreducs = c("refDR", "refUMAP"))

However, in practice, we often need to wrap AzimuthReference within other functions. In such cases, the first element in the list returned by sys.calls() is the call to the outermost function rather than to AzimuthReference, so the name check fails and the data required for Azimuth to run is deleted, leading to errors in subsequent checks.

In my case, I learned for the first time that all code in notebooks is wrapped within IRkernel functions (thanks to AI for helping me troubleshoot). So even when I run AzimuthReference directly in my notebook, it triggers the same error that would occur when running AzimuthReference within a function.
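
The effect can be reproduced in isolation with a minimal base-R sketch. The functions `inner` and `wrapper` are hypothetical stand-ins (they are not part of Azimuth); `inner` plays the role of AzimuthReference and reports which names sit at the two ends of the call stack.

```r
# Hypothetical stand-in for AzimuthReference: inspects the call stack
# and returns the function names of the first and last calls.
inner <- function() {
  calls <- sys.calls()  # all active calls, outermost first
  # The first element of as.character(<call>) is the called function's name
  nms <- vapply(calls, function(cl) as.character(x = cl)[1], character(1))
  list(first = nms[[1]], last = nms[[length(nms)]])
}

# Hypothetical wrapper, standing in for user code or a notebook kernel
wrapper <- function() inner()

res <- wrapper()
# res$last is always "inner", the function that is currently running.
# res$first is whatever sits at the bottom of the stack: "wrapper" in a
# plain top-level session, or some outer function when the code is run
# through source() or a notebook kernel.
print(res)
```

Whatever the surrounding environment, only the last element reliably names the function that is currently executing.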

The solution is simple: modify the judgment logic. After sys.calls() returns the list, take the last element, which is the call to the function currently running, and then extract the function name from it with the same string split as before:

call_list <- sys.calls()
tool.name <- as.character(x = call_list[[length(call_list)]])
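
As a rough illustration of what the extraction step does with that last element, the sketch below applies the same strsplit logic to a hand-built call object. The call is created with quote() and `obj` is only a placeholder symbol, so nothing from Azimuth is evaluated. The namespace-qualified case is an extra observation about base R, not something the original issue discusses.

```r
# A call object shaped like the last element of sys.calls()
cl <- quote(AzimuthReference(object = obj, refUMAP = "umap"))

# as.character() on a call yields the function name followed by the arguments
parts <- as.character(x = cl)

# Same extraction as in the Azimuth source: split on "(" and keep the first piece
tool.name <- lapply(
  X = strsplit(x = parts, split = "(", fixed = TRUE),
  FUN = "[",
  1
)[[1]]
print(tool.name)  # "AzimuthReference"

# Assumption to keep in mind: a namespace-qualified call keeps its prefix,
# so the extracted name no longer equals the bare function name.
qualified <- as.character(x = quote(Azimuth::AzimuthReference(object = obj)))[1]
print(qualified == "AzimuthReference")  # FALSE
```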

In our actual code, we can load Azimuth and then override the original AzimuthReference function with our modified version, allowing subsequent code to run properly.

library(Azimuth)
AzimuthReference <- function (object, refUMAP = "umap", refDR = "spca", refAssay = "SCT",
    dims = 1:50, k.param = 31, plotref = "umap", plot.metadata = NULL,
    ori.index = NULL, colormap = NULL, assays = NULL, metadata = NULL,
    reference.version = "0.0.0", verbose = FALSE)
{
    if (!refUMAP %in% Reductions(object = object)) {
        stop("refUMAP (", refUMAP, ") not found in Seurat object provided")
    }
    if (is.null(x = Misc(object = object[[refUMAP]], slot = "model"))) {
        stop("refUMAP (", refUMAP, ") does not have the umap model info stored. ",
            "Please rerun RunUMAP with return.model = TRUE.")
    }
    if (!refDR %in% Reductions(object = object)) {
        stop("refDR (", refDR, ") not found in Seurat object provided")
    }
    if (is.null(x = metadata)) {
        stop("Please specify at least one metadata field (for transfer and plotting).")
    }
    for (i in metadata) {
        if (!i %in% colnames(x = object[[]])) {
            warning(i, " not found in Seurat object metadata")
            next
        }
        if (!is.factor(x = object[[i, drop = TRUE]])) {
            warning(i, " is not a factor. Converting to factor with alphabetical ",
                "levels.", call. = FALSE)
            object[[i, drop = TRUE]] <- factor(x = object[[i,
                drop = TRUE]], levels = sort(x = unique(object[[i,
                drop = TRUE]])))
        }
    }
    if (!refAssay %in% Assays(object = object)) {
        stop("Seurat object provided must have the SCT Assay stored.")
    }
    if (!inherits(x = object[[refAssay]], what = "SCTAssay")) {
        stop("refAssay (", refAssay, ") is not an SCTAssay.")
    }
    if (length(x = levels(x = object[[refAssay]])) != 1) {
        stop("refAssay (", refAssay, ") should contain a single SCT model.")
    }
    suppressWarnings(expr = object[["refUMAP"]] <- object[[refUMAP]])
    suppressWarnings(expr = object[["refDR"]] <- object[[refDR]])
    object <- FindNeighbors(object = object, reduction = "refDR",
        dims = dims, graph.name = "refdr.annoy.neighbors", k.param = k.param,
        nn.method = "annoy", annoy.metric = "cosine", cache.index = TRUE,
        return.neighbor = TRUE, l2.norm = FALSE, verbose = verbose)
    if (verbose) {
        message("Computing pseudobulk averages")
    }
    features <- rownames(x = Loadings(object = object[["refDR"]]))
    plot.metadata <- plot.metadata %||% object[[metadata]]
    if (inherits(x = plotref, what = "DimReduc")) {
        plot.metadata <- plot.metadata[Cells(x = plotref), ]
    }
    ad <- CreateAzimuthData(object = object, plotref = plotref,
        plot.metadata = plot.metadata, colormap = colormap, reference.version = reference.version)
    ori.index <- ori.index %||% match(Cells(x = object), Cells(x = object[["refUMAP"]]))
    object$ori.index <- ori.index
    DefaultAssay(object = object) <- refAssay
    object[[refAssay]] <- subset(x = object[[refAssay]], features = features)
    DefaultAssay(object = object[["refDR"]]) <- refAssay
    object <- DietSeurat(object = object, counts = FALSE, assays = c(refAssay,
        assays), dimreducs = c("refDR", "refUMAP"))
    metadata <- c(metadata, "ori.index")
    for (i in colnames(x = object[[]])) {
        if (!i %in% metadata) {
            object[[i]] <- NULL
        }
    }
    sct.model <- slot(object = object[[refAssay]], name = "SCTModel.list")[[1]]
    object[["refAssay"]] <- as(object = suppressWarnings(Seurat:::CreateDummyAssay(assay = object[[refAssay]])),
        Class = "SCTAssay")
    slot(object = object[["refAssay"]], name = "SCTModel.list") <- list(refmodel = sct.model)
    DefaultAssay(object = object) <- "refAssay"
    DefaultAssay(object = object[["refDR"]]) <- "refAssay"
    Tool(object = object) <- ad
    # Modified: use the last element of sys.calls() (the call to this function)
    # instead of the first (outermost) one to get the right function name
    call_list <- sys.calls()
    tool.name <- as.character(x = call_list[[length(call_list)]])
    tool.name <- lapply(X = strsplit(x = tool.name, split = "(",
        fixed = TRUE), FUN = "[", 1)[[1]]
    if (tool.name != "AzimuthReference") {
        slot(object, name = "tools")["AzimuthReference"] <- slot(object,
            name = "tools")[tool.name]
        slot(object, name = "tools")[tool.name] <- NULL
    }
    object <- DietSeurat(object = object, counts = FALSE, assays = c("refAssay",
        assays), dimreducs = c("refDR", "refUMAP"))
    ValidateAzimuthReference(object = object)
    return(object)
}

Azimuth Seems Abandoned

Although it comes from the Satija Lab, it feels like this project has been abandoned. There are nearly a hundred open issues on GitHub, and five of the PRs submitted since last year have been neither merged nor rejected. Recently, the lab has started a new Python-based, deep-learning project for universal cell type Label Transfer, suggesting that as research moves forward, older projects are no longer being maintained.

Moreover, for Label Transfer, one can always follow tutorials step by step - it’s not absolutely necessary to use Azimuth. Perhaps soon, this project will be archived…
