Research Computing develops software container tools for bioinformatics research

Purdue Research Computing now offers biocontainers: a collection of software containers, designed for Purdue’s community clusters, that package more than 400 commonly used bioinformatics tools.

Biocontainers isolate each program inside its own container, which keeps software organized, supports portability and reproducibility, and makes installation and use easier for researchers.

This project is an extension of the international Biocontainers initiative. Research Computing staff members Yucheng Zhang and Lev Gorenstein customized existing containers for Purdue community clusters using Singularity, a container application that was developed specifically with high-performance computing in mind.

Containers can be intimidating for novice users, but Gorenstein and Zhang have wrapped them into familiar module environments that hide the complexity of the software.

Nadia Atallah Lanman, a research assistant professor of comparative pathobiology, uses the biocontainers extensively to perform single-cell RNA velocity analyses.

“The packages that perform these analyses are often incredibly buggy and sometimes conflict with other Python packages that may be loaded,” says Lanman. “The biocontainers provide a really stellar way to run these analyses without having to deal with package conflicts.”

For Sagar Utturkar, senior bioinformatician at the Purdue Center for Cancer Research, reproducibility and version control are major benefits of using biocontainers.

“We use so many different tools for research, and each tool comes with various versions,” says Utturkar. “Biocontainers make it easy for someone to reproduce our work using the exact same version of the tool.”

To help users submit jobs using these biocontainer modules, Gorenstein and Zhang prepared user guides containing a software description, supported commands, notes, and example job scripts for each biocontainer tool. The user guides are currently hosted here.

More information about using biocontainers can also be found here.

To learn more about biocontainers or other Research Computing resources, contact rcac-help@purdue.edu.

Writer: Adrienne Miller, science and technology writer, Research Computing, mill2027@purdue.edu.

Last updated: May 12, 2022