BD Rhapsody™ Sequence Analysis Pipeline
Overview

The BD Rhapsody™ Sequence Analysis Pipeline is a versatile tool that lets you run your bioinformatics analysis either on the cloud-based Seven Bridges platform or on a local installation.


The BD Rhapsody™ Sequence Analysis Pipeline:

  • Provides a primary analysis of single-cell multiomics data by leveraging cutting-edge algorithms to deliver fast results and deep insights.
  • Provides an intuitive user interface on the cloud-based platform, making it easy to use regardless of the user's computational expertise.
  • Offers a choice between cloud-based and local installation, for convenient and accessible single-cell multiomics data analysis.
  • Provides broad compatibility of output data with downstream analysis tools such as Seurat and Scanpy.
 

The .Cellismo output files from the BD Rhapsody Sequence Analysis Pipeline can be imported into the BD Cellismo Data Visualization Tool for secondary analysis and visualization.

 

 

 

Pipeline Overview

After sequencing, the pipeline takes input from FASTQ files, a reference (Targeted panel or WTA / WTA+ATAC-Seq reference archive), an AbSeq reference (if required) and a supplemental reference (if required) to generate output files and metrics about the pipeline run.

 

BD Rhapsody Pipeline Workflow

Overview of the steps in the analysis pipeline.

Release Notes

 

v3 BD Rhapsody™ Sequence Analysis Pipeline   |   October 2025

 

Added

  • ATAC:
    • Gene Activity output: new modality in the .Cellismo output file and also a separate MEX output file. Gene activity is a gene-by-cell matrix, where counts are the number of transposase cut sites in the gene body or within 2,000 bases upstream of the gene start position
    • Transcription factor motif output: new modality in the .Cellismo output file and also a separate MEX output file. This is a TF motif-by-cell matrix, where values are z-scores of the enrichment of each TF motif
  • VDJ
    • New assembly algorithm improves the speed of this step by up to 23-fold (range 7x-23x), enabling the processing of billions of TCR/BCR reads. Metrics are generally equivalent or slightly better.
    • VDJ-only pipeline: you can now provide only TCR and/or BCR FASTQs and get a cell call and VDJ results. Sample multiplexing with VDJ only is also supported. VDJ in combination with an mRNA assay is still recommended for better cell calling and identification.
  • New pipeline node to downsample data to calculate a sequencing saturation curve and median genes per cell curve, which are output on the pipeline report
  • Make Rhapsody Reference tool:
    • Added an optional input for Transcription Factor Motif PFM file
    • Will now filter out readthrough transcripts and genes with only readthrough transcripts. Added optional parameter to turn off this filtering
    • Added optional parameter to filter out Y chromosome Pseudo-Autosomal Regions from Human reference build 38
  • Pipeline Report:
    • New Read Flow diagram: a Sankey diagram of the read filtering steps for each library and for each of the RNA and/or ATAC modalities
    • New Sequencing Saturation calculator to enable calculation of required total reads to achieve a target saturation value
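To make the Gene Activity definition in the ATAC item above concrete, here is a minimal sketch of counting cut sites per gene and cell. It is not the pipeline's implementation. The record layout, the names `gene_window` and `gene_activity`, and the strand-aware reading of "upstream" are assumptions made for illustration only; the 2,000-base window is the one value taken from the release notes.

```python
from collections import Counter

UPSTREAM_BASES = 2000  # window size stated in the release notes


def gene_window(gene, upstream=UPSTREAM_BASES):
    """Return the inclusive (low, high) interval of gene body plus upstream window.

    Assumption: "upstream" is strand-aware, so a minus-strand gene is
    extended beyond its highest coordinate rather than its lowest.
    """
    if gene["strand"] == "-":
        return gene["start"], gene["end"] + upstream
    return max(1, gene["start"] - upstream), gene["end"]


def gene_activity(genes, cut_sites):
    """Count transposase cut sites per (gene, cell) pair.

    `cut_sites` is an iterable of (chromosome, position, cell) tuples.
    A plain nested loop is used for clarity, not for speed.
    """
    counts = Counter()
    for chrom, pos, cell in cut_sites:
        for gene in genes:
            if gene["chrom"] != chrom:
                continue
            low, high = gene_window(gene)
            if low <= pos <= high:
                counts[(gene["name"], cell)] += 1
    return counts


# Hypothetical toy data, not taken from any real reference.
genes = [
    {"name": "GENE_A", "chrom": "chr1", "start": 5000, "end": 8000, "strand": "+"},
    {"name": "GENE_B", "chrom": "chr1", "start": 20000, "end": 24000, "strand": "-"},
]
cut_sites = [
    ("chr1", 3500, "cell_1"),   # within 2,000 bases upstream of GENE_A
    ("chr1", 6000, "cell_1"),   # inside the GENE_A body
    ("chr1", 2999, "cell_2"),   # one base outside the GENE_A window
    ("chr1", 25500, "cell_2"),  # upstream of minus-strand GENE_B
    ("chr2", 6000, "cell_1"),   # different chromosome, not counted
]
activity = gene_activity(genes, cut_sites)
print(dict(activity))
```

Each non-zero entry corresponds to one cell of the gene-by-cell matrix. A real implementation would use an interval index instead of scanning every gene for every cut site.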

 

Updated

  • VDJ
    • _VDJ_perCell.csv file CDR3 columns are updated to use CDR3 junction instead of CDR3 alone, resulting in the inclusion of canonical amino acids
    • _VDJ_perCell.csv file added full length pairing columns
    • New column in AIRR outputs, "junction_anchored_aa": a direct translation of only the CDR3 nucleotide sequence, not influenced by upstream frameshifts
    • Updated constant region gene identification to prevent mismatched chain types
    • Removed the PyIR wrapper; IgBlast is now called directly
  • Basic putative cell calling algorithm updated to fix several edge cases and get more precise cell calls. Increase in putative cell number of ~1% is typical. Use of the Expected Cell Count parameter is highly encouraged
  • Pipeline Report:
    • Various metric alert updates
    • Mean bioproducts per cell added to summary section
  • Gene expression _MolsPerCell MEX output now contains Ensembl IDs as well as Gene symbols
  • Improved library name determination from FASTQ file names
  • More aggressive cleanup of polyA sequence in reads to prevent spurious alignments
  • Make Rhapsody Reference tool: Extra Sequence input is now included in the BWA-Mem index
  • Seven Bridges CWL: Instance types updated to be more performant, and instance sizes increased for ATAC-related nodes
  • ATAC peak annotation now uses transcript features rather than gene features, which better classifies peaks when a gene has multiple transcription start sites
  • .Cellismo output file now contains GTF data for genes
  • Dimensionality reduction threshold updates: Below 100,000 cells, both t-SNE and UMAP coordinates are generated. Between 100,000 and 300,000 cells, only UMAP coordinates. Above 300,000 cells, a sub-sample of 300,000 cells will be selected and UMAP coordinates generated.
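The dimensionality reduction thresholds above can be summarized as a small decision function. This is only a sketch of the stated rule, not pipeline code. The function name `embedding_plan` is hypothetical, and the behavior at exactly 100,000 and exactly 300,000 cells is an assumption, since the release notes do not say which side of each boundary those counts fall on.

```python
TSNE_AND_UMAP_LIMIT = 100_000  # below this, both embeddings are generated
UMAP_LIMIT = 300_000           # above this, cells are sub-sampled first


def embedding_plan(n_cells):
    """Return (methods, cells_used) for a dataset with `n_cells` cells.

    Assumption: the boundary values themselves fall in the middle
    "UMAP only" band.
    """
    if n_cells < TSNE_AND_UMAP_LIMIT:
        return ["t-SNE", "UMAP"], n_cells
    if n_cells <= UMAP_LIMIT:
        return ["UMAP"], n_cells
    # Larger datasets: UMAP on a sub-sample of 300,000 cells.
    return ["UMAP"], UMAP_LIMIT


for n in (20_000, 150_000, 450_000):
    methods, used = embedding_plan(n)
    print(f"{n} cells -> {methods} computed on {used} cells")
```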

 

Fixed

  • AlignmentAnalysis node was not getting an early cell count estimate, which could cause downstream node scaling issues
  • TCR/BCR node failure when the number of valid TCR or BCR reads exceeded 2,147,483,647 reads
  • Pipeline Report error when exact cell count parameter specified
  • Pipeline Report error when CITE-seq/AbSeq only datasets are run
  • Targeted RNA pipeline did not output a DBEC MEX file
  • ATAC pipeline could get stuck in QualCLAlign_ATAC for some reference genomes with large numbers of contigs
  • Rare issue where an ATAC peak could exceed the length of the contig on which it resides
  • Improved handling of chromosome names with unexpected characters
  • Failure in GenerateSeurat node when there is only 1 AbSeq input
  • Rare failure caused by poor-quality read 1 data creating a race condition
  • Rare failure in ATAC node caused by incorrect BWA-MEM2 binary selection
  • ATAC pipeline failure when more than one ATAC library was present in the pipeline inputs
  • ATAC pipeline failure when using sample tags or an "Extra seqs" input
  • ATAC pipeline discrepancy in putative cell numbers in different output files.
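A side note on the TCR/BCR read-count item above: 2,147,483,647 equals 2^31 − 1, the largest value a signed 32-bit integer can hold. The release notes do not state the cause of the failure, so the sketch below only illustrates that numeric limit; the helper name `fits_in_int32` is hypothetical.

```python
import struct

INT32_MAX = 2**31 - 1  # 2,147,483,647, the threshold quoted in the release notes


def fits_in_int32(value):
    """Return True if `value` can be stored in a signed 32-bit integer."""
    try:
        struct.pack("<i", value)  # "<i" is a little-endian signed 32-bit int
        return True
    except struct.error:
        return False


print(INT32_MAX)
print(fits_in_int32(INT32_MAX))
print(fits_in_int32(INT32_MAX + 1))
```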

Get Free Access to the Pipeline

 

Cloud-Based Version

  • Go to Velsera.com
  • Click Request Access. In the request access window, enter your email address to receive an email invitation to the Seven Bridges Genomics platform within 24 hours.
  • Click the link in the email invitation and complete the registration. Seven Bridges Genomics displays the dashboard with the demo projects.

 

Local Version

  • Go to bitbucket.org/CRSwDev/cwl. If necessary, create a Bitbucket account.
  • In the left pane, click Downloads > Download Repository. The CWL and YML files will download.
  • Unzip the archive. Each folder within the archive is named after the pipeline version to which it corresponds.

   For Research Use Only. Not for use in diagnostic or therapeutic procedures.