It was a night when I felt overwhelmed by reading papers (March 29, 2022). As I browsed through the messy directory tree in Zotero… it suddenly occurred to me how many papers I had read over the past few years? Exporting from emails… I was shocked… 561 entries, 419 PDFs downloaded… really… If I had gone straight for a PhD, I probably wouldn’t have read so many papers.
So I decided to take a closer look at the information of these papers.
Steps Taken
It was a night when I felt overwhelmed by reading papers (March 29, 2022). As I browsed through the messy directory tree in Zotero… it suddenly occurred to me how many papers I had read over the past few years? Exporting from emails… I was shocked… 561 entries, 419 PDFs downloaded… really… If I had gone straight for a PhD, I probably wouldn’t have read so many papers.
- Exported the bibliography from Zotero as an
ris
file - Used
rispy
to parse theris
file into a dataframe, retaining the type of literature, authors, year of entry, abstract, and publication journal. - Retained entries where the type of literature was a journal article.
- Used
wordcloud
to create word clouds for authors and publishing journals. - Combined the abstracts into a single string and used
wordcloud
to self-segment and create a word cloud. - Created a bar chart for the year of entry.
The approximate results are shown below:
Author Word Cloud
From the graph, we can see a few prominent authors. Let’s take a brief look at them:
Griffith Malachi, Griffith Obi L.: These two Griffiths are twins and come from the Griffith Lab. This lab specializes in applying bioinformatics to various aspects of cancer. They have developed a series of related software/databases.
Sette Alessandro: From the Sette Lab, an expert in immunology. His research mainly focuses on immune studies of pathogens, especially viruses. Since the outbreak of COVID-19, most of his published research seems to be related to COVID.
Morten Nielsen: A researcher from the Technical University of Denmark (DTU). His main area of focus is likely algorithm development for immune recognition. The prominent appearance in the graph might be due to my collecting multiple papers related to affinity prediction. His university profile can be found here.
Nir Hacohen: A researcher at the Broad Institute, focusing on immunology according to his profile. I don’t have a particular impression of this author yet.
Bjoern Peters: From the Peters Lab, focusing on bioinformatics in immunology. He is from the same La Jolla Institute for Immunology as Sette Alessandro.
Eilon Barnea: From Meytal Landau’s Lab. His main research seems to be computational and experimental identification of immune peptides?
Arie Admon: From the Arie Admon Lab, mainly researching proteomics. I think I collected some papers related to mass spectrometry identification.
Michal Bassani-Sternberg: An expert in tumor biology and tumor immunology. According to her profile, she mainly uses mass spectrometry as a research tool.
Anthony Purcell: From the Monash University Research Profile. His research direction is bioinformatics applications in tumor immunity and autoimmune diseases.
Anne Searls De Groot: Co-founder of EpiVax. Such a big name… I think it was because there were many papers from their company that I read at the time…
Ugur Sahin: CEO of BioNTech… hmm, also brought in when reading company literature.
Catherine J Wu: Researcher focusing on neoantigens. Recently, several research results led by her have been included… Additionally, she has assisted the Broad Institute in tumor immunity-related research.
Journal Word Cloud
In terms of journals… it’s still dominated by bioinformatics, with a lot from biotechnology and immunology due to the application direction.
Abstract Word Cloud
The abstract word cloud clearly shows the directions of my recent work… peptides, neoantigens, T cells…
Year of Entry Statistics
As for the statistics on the year of entry… it’s clear why I feel overwhelmed by reading papers recently…
That’s all.