Recently, while doing differential gene expression analysis, I needed to map the expression values of DEGs onto KEGG pathways. I used Bioconductor’s pathview package, but the resulting nodes were ridiculously large and the colors were wrong. With the help of AI, I fixed the issues and also picked up a new patching technique along the way.

I previously wrote a blog post about using pixi’s tasks feature to fix missing dependency issues with Bioconductor packages (like GenomeInfoDbData, BSgenome.Hsapiens.UCSC.hg38, etc.) after installation. At the time, I only knew the problem existed but didn’t understand the root cause — I was just providing a less-than-ideal workaround.

But recently, with insights from AI responses, I finally figured out the real reason — it all comes down to the post-link script mechanism in the Conda ecosystem.

After my previous small contribution to a conda‑forge package, I wanted to take on something more advanced: trying to publish the singler‑py Python package into the conda ecosystem. Little did I know that it would turn out to be… somewhat troublesome.

In bioinformatics visualization, we often need to handle plots containing tens of thousands of data points, such as scatter plots from single‑cell RNA‑seq data. Saved in a vector format like PDF, such graphics suffer from huge file sizes and slow rendering (most software other than AI will simply freeze). The reason is that a vector file records the coordinates, color, size, and other attributes of every single point, so the PDF ends up with an enormous number of objects, which makes viewing and editing slow.
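To make the scaling concrete, here is a minimal sketch using only the Python standard library. It uses SVG rather than PDF as the vector format, purely because SVG is plain text and easy to generate by hand; the function name `scatter_svg`, the canvas size, and the point styling are all illustrative assumptions, not anything from the original post. It writes one element per point and prints how the document size grows with the number of points.

```python
import random


def scatter_svg(n_points, seed=0):
    """Return an SVG document containing one <circle> element per data point."""
    rng = random.Random(seed)
    circles = [
        # Every point carries its own coordinates, radius, and color.
        f'<circle cx="{rng.uniform(0, 500):.2f}" cy="{rng.uniform(0, 500):.2f}" '
        f'r="1.5" fill="#1f77b4"/>'
        for _ in range(n_points)
    ]
    header = '<svg xmlns="http://www.w3.org/2000/svg" width="500" height="500">'
    return "\n".join([header, *circles, "</svg>"])


# Size grows roughly linearly with the number of points drawn.
for n in (1_000, 10_000, 100_000):
    size_kib = len(scatter_svg(n).encode("utf-8")) / 1024
    print(f"{n:>7} points -> {size_kib:8.1f} KiB")
```

A raster image of the same plot would instead have a size bounded by its pixel dimensions, no matter how many points were drawn.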

In a short period of time, I’ve encountered two situations where I needed to fix bugs in R functions, and I’ve also learned how to perform hot replacements…

If you live long enough, you’ll encounter plenty of awkward situations.

In bioinformatics, the more cutting-edge your research direction, the more problems you’ll face on the informatics side. Even when papers are published with excellent results, and the original authors share their code or even provide ready-to-use software tools, it doesn’t mean we can easily use these resources for reproduction or further research. Chaotic environment setup is just one aspect. More often than not, since software authors aren’t professional software engineers, we should be grateful if the tool just works. We can’t expect this software to be bug-free, nor can we expect it to have decent performance (unless performance was a development goal). Even tools from well-established labs aren’t free of these issues, such as… Azimuth.

When using pixi to manage bioinformatics analysis environments, we often find that some Bioconductor R packages have missing dependencies after installation. The exact cause of this problem is currently unclear, and after a year of using pixi I still see it unfixed (as of October 2025). This article shows how to use pixi’s tasks feature to work around such problems.

In our current single-cell analysis pipeline based on scanpy, there’s a step that requires saving AnnData objects in loom format. However, unlike saving to h5ad format, when we write an AnnData object to a loom file without any special handling and then read it back, we find that the index information of obs and var (typically cell barcodes and gene names) is lost, and these indices become ordinary numeric identifiers.

Milo is a differential abundance analysis method for single-cell RNA sequencing data that can detect compositional changes in cell neighborhoods across different conditions.