Category: Bioinformatics

Pitfalls When Plotting with Pathview

Recently, while doing differential gene expression analysis, I needed to map the expression values of DEGs onto KEGG pathways. I used Bioconductor’s pathview package, but the resulting nodes were ridiculously large and the colors were wrong. With the help of AI, I fixed the issues and also picked up a new patching technique along the way.

2026-05-28 Bioinformatics

Unveiling post-link: The Real Root Cause of Bioconductor Missing Dependencies in pixi

I previously wrote a blog post about using pixi’s tasks feature to fix missing dependency issues with Bioconductor packages (like GenomeInfoDbData, BSgenome.Hsapiens.UCSC.hg38, etc.) after installation. At the time, I only knew the problem existed but didn’t understand the root cause — I was just providing a less-than-ideal workaround.

But recently, with help from AI-generated answers, I finally figured out the real reason: it all comes down to the post-link script mechanism in the Conda ecosystem.

2026-05-22 Bioinformatics

Trying to Submit a New Package to Bioconda/Conda‑Forge

After my previous small contribution to a conda‑forge package, I wanted to take on something more advanced: trying to publish the singler‑py Python package into the conda ecosystem. Little did I know that it would turn out to be… somewhat troublesome.

2026-01-11 Bioinformatics

Rasterizing Single‑Cell Dot Plots in R

In bioinformatics visualization, we often need to handle plots containing tens of thousands of data points, such as scatter plots from single‑cell RNA‑seq data. Saved in a vector format like PDF, such graphics suffer from huge file sizes and slow rendering (most software other than Adobe Illustrator will simply freeze). A vector file records the coordinates, color, size, and other attributes of every single point, so the resulting PDF contains an enormous number of objects and is slow to view and edit.

2025-12-17 Bioinformatics

Hot-fixing R Functions When Discovering Bugs

In a short period of time, I’ve encountered two situations where I needed to fix bugs in R functions, and I’ve also learned how to perform hot replacements…

2025-11-26 Bioinformatics

To Reproduce Results, I Had to Reproduce Bugs Too

If you live long enough, you’ll encounter plenty of awkward situations.

2025-11-26 Bioinformatics

Even Well-Established Projects Can Be Buggy - Azimuth is Full of Bugs

In bioinformatics, the more cutting-edge your research direction, the more problems you’ll face on the informatics side. Even when papers are published with excellent results, and the original authors share their code or even provide ready-to-use software tools, it doesn’t mean we can easily use these resources for reproduction or further research. Chaotic environment setup is just one aspect. More often than not, since software authors aren’t professional software engineers, we should be grateful if the tool just works. We can’t expect this software to be bug-free, nor can we expect it to have decent performance (unless performance was a development goal). Even tools from well-established labs aren’t immune to these issues, such as… Azimuth.

2025-11-22 Bioinformatics

Fixing Missing Dependency Issues When Deploying Some Bioconductor R Packages with pixi

When using pixi to manage bioinformatics analysis environments, we often find that some Bioconductor R packages report missing dependencies after installation. The exact cause of this problem is currently unclear. I have been using pixi for a year, and the issue still hasn’t been fixed (as of October 2025). This article shows how to use pixi’s tasks feature to work around such problems.

2025-10-29 Bioinformatics

Fixing the Issue of Lost obs and var Index When Reading/Writing Loom Files with scanpy

In our current single-cell analysis pipeline based on scanpy, there’s a step that requires saving AnnData objects in loom format. However, unlike saving to h5ad format, when we write an AnnData object to a loom file without any special handling and then read it back, we find that the index information of obs and var (typically cell barcodes and gene names) is lost, and these indices become ordinary numeric identifiers.

2025-10-28 Bioinformatics

Comparing Milo Implementations in R and Python

Milo is a differential abundance analysis method for single-cell RNA sequencing data that can detect compositional changes in cell neighborhoods across different conditions.

2025-10-28 Bioinformatics