BD Rhapsody™ Sequence Analysis Pipeline
Overview

The BD Rhapsody™ Sequence Analysis Pipeline is a versatile tool that lets you run your bioinformatics analysis either on the cloud-based Seven Bridges platform or on a local installation.


The BD Rhapsody™ Sequence Analysis Pipeline:

  • Provides a primary analysis of single-cell multiomics data by leveraging cutting-edge algorithms to deliver fast results and deep insights.
  • Utilizes an intuitive user interface via a cloud-based platform and is easy to use, regardless of the computational expertise of the user.
  • Offers a choice between cloud-based and local installation options, for convenient and accessible single-cell multiomics data analysis.
  • Provides broad compatibility of output data with downstream analysis tools such as Seurat and Scanpy.

Pipeline Overview

After sequencing, the pipeline takes as input FASTQ files, a reference (Targeted panel or WTA genome archive), an AbSeq reference (if required), and a supplemental reference (if required). It uses these to generate molecule counts per cell and metrics about the pipeline run.

 

[Figure: BD Rhapsody Pipeline Workflow. Overview of the steps in the analysis pipeline.]

Release notes

v2.1 BD Rhapsody™ Sequence Analysis Pipeline

 

New additions

  • TCR/BCR high-quality cell designation and associated metrics. This creates a new set of VDJ metrics, similar to those of products that make a putative cell call for VDJ libraries separately from the cell call for the associated gene expression libraries.
  • UMAP dimensionality reduction coordinates as an output file. The coordinates are also built into the pipeline report and the Seurat and Scanpy outputs.
  • Extra utility that only annotates the cell index and UMI from R1 and writes them to the header of R2
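
The last item above, carrying the cell index and UMI from R1 into the R2 header, can be pictured with a minimal Python sketch. This is not the pipeline's utility. The function names, the fixed offsets CELL_LEN and UMI_LEN, and the "_CB:..._UMI:..." header tag are all assumptions made for illustration, and the real BD Rhapsody bead structure is more involved than a fixed prefix.

```python
from typing import Iterable, Iterator, Tuple

# Hypothetical layout, NOT the real BD Rhapsody bead structure:
# the first CELL_LEN bases of R1 stand in for the cell index,
# and the next UMI_LEN bases stand in for the UMI.
CELL_LEN = 9
UMI_LEN = 8

Record = Tuple[str, str, str, str]  # (header, sequence, plus line, quality)


def read_fastq(lines: Iterable[str]) -> Iterator[Record]:
    """Yield 4-line FASTQ records; an incomplete trailing record is dropped."""
    it = iter(lines)
    for header, seq, plus, qual in zip(it, it, it, it):
        yield (header.rstrip("\n"), seq.rstrip("\n"),
               plus.rstrip("\n"), qual.rstrip("\n"))


def annotate_r2(r1_records: Iterable[Record],
                r2_records: Iterable[Record]) -> Iterator[Record]:
    """Copy the (assumed) cell index and UMI from R1 into the R2 header."""
    for (_, r1_seq, _, _), (h2, s2, p2, q2) in zip(r1_records, r2_records):
        cell = r1_seq[:CELL_LEN]
        umi = r1_seq[CELL_LEN:CELL_LEN + UMI_LEN]
        # Keep any comment after the first space in the R2 header.
        name, _, comment = h2.partition(" ")
        new_header = f"{name}_CB:{cell}_UMI:{umi}"  # tag format is made up
        if comment:
            new_header += " " + comment
        yield new_header, s2, p2, q2


if __name__ == "__main__":
    r1 = ["@read1 1:N:0\n", "ACGTACGTAGGCATTCAGTTTT\n", "+\n", "I" * 22 + "\n"]
    r2 = ["@read1 2:N:0\n", "GGGCCC\n", "+\n", "IIIIII\n"]
    for record in annotate_r2(read_fastq(r1), read_fastq(r2)):
        print("\n".join(record))
```

The R2 sequence and quality lines pass through unchanged; only the header is rewritten, so downstream tools that read R2 alone can still recover the cell and molecule identity.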

 

Updates

  • Seurat output to separate mRNA and AbSeq data into RNA and ADT assays, respectively
  • Scanpy output to use Muon (.h5mu) and store mRNA and AbSeq data in separate AnnData objects, rna and prot, respectively
  • TCR/BCR dominant contigs file to include AIRR compliant germline columns and to only retain cell type appropriate chains. All chains are still available in the unfiltered contigs file.
  • TCR/BCR dominant contigs file to rename the column 'duplicate_count' to 'umi_count', in accordance with AIRR's definition update in v1.4.1
  • TCR/BCR dominant contig selection process, elevating the importance of a productive contig with high relative read count and removing the CDR3 requirement
  • TCR/BCR DBEC algorithm to allow exceptions for CDR3 sequences not seen in any other cell and CDR3 paired chains seen in other cells
  • TCR/BCR contig_id to correspond with annotated chain type
  • Basic cell calling to scale better with small and large cell datasets and prevent most inappropriately high cell calls derived from noise signatures
  • Alignment Category 'No_Feature_Pct' metric to include targeted mRNA reads that are filtered out due to an invalid alignment
  • Cell label annotation to improve the speed of annotation for reads with cell label sequences that contain more than 1 error
  • RAM requirements for VDJ_preprocess_reads on local server runs
  • Error handling and reporting in read processing steps
  • Logging to capture errors during alignment with STAR
  • FASTQ handling to skip reads with empty sequence
  • Cell type classification model selection to better select an appropriate model when not all bioproducts are found in any one model
  • Pipeline report to show sub-sampled tSNE and UMAP plots in the case where the putative cell count exceeds 100,000 and to show details of refined cell calling when refined cell calling is selected
  • Bead version detection and read trimming
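
Because the dominant contigs file now uses 'umi_count' where earlier versions used 'duplicate_count', downstream scripts that read files from both older and newer runs may need to accept either name. The following is a minimal sketch of one way to do that, assuming the table is tab-separated with a single header row. The function name and the example table are hypothetical; only the two column names come from the release notes.

```python
import csv
import io
from typing import Dict, Iterable, List

OLD_NAME = "duplicate_count"  # column name before the rename
NEW_NAME = "umi_count"        # column name after the rename


def normalize_contig_rows(rows: Iterable[Dict[str, str]]) -> List[Dict[str, str]]:
    """Return copies of the rows in which OLD_NAME is renamed to NEW_NAME."""
    normalized = []
    for row in rows:
        row = dict(row)  # copy so the caller's data is not modified
        if NEW_NAME not in row and OLD_NAME in row:
            row[NEW_NAME] = row.pop(OLD_NAME)
        normalized.append(row)
    return normalized


if __name__ == "__main__":
    # Made-up two-column table in the older style.
    old_style = "sequence_id\tduplicate_count\ncontig_1\t7\n"
    reader = csv.DictReader(io.StringIO(old_style), delimiter="\t")
    for row in normalize_contig_rows(reader):
        print(row["sequence_id"], row["umi_count"])
```

Rows that already carry 'umi_count' are returned unchanged, so the same helper can be applied to files from either version.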

 

Fixes

  • Issue that caused failure when a gene symbol was named 'nan'
  • Issue with a quote mark in a gene symbol causing a failure in the Seurat output file generation
  • Rare division by zero issue in DBEC algorithm
  • Rare issue caused by including "SampleTag" in the Run_Name parameter

 

Experimental

  • Added a Docker-free version of the pipeline, available for local server installs as a tar.gz bundle. Tested on the following Linux versions: Ubuntu 16, 20, and 22; Red Hat 7; CentOS 7 and 9

 

make_rhapsody_reference tool:

  • Added an 'Extra_STAR_params' input to enable passing parameters to the STAR genomeGenerate process
  • Updated to automatically generate a GTF for sequences added in the 'Extra_sequences' FASTA input, which is useful for transgenes
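
To illustrate what generating a GTF for added sequences can involve, here is a minimal Python sketch that writes gene, transcript, and exon records spanning the full length of each FASTA entry. It is not the make_rhapsody_reference implementation. The function names, the "extra_sequences" source label, the forward strand, and the choice of attributes are assumptions; only the nine tab-separated GTF columns follow the standard format.

```python
from typing import Iterable, Iterator, Tuple


def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (name, sequence) pairs; the name is the first word after '>'."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:].split()[0], []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def gtf_lines_for_extra_sequences(fasta_lines: Iterable[str],
                                  source: str = "extra_sequences") -> Iterator[str]:
    """Yield GTF lines treating each sequence as one single-exon gene."""
    for name, seq in parse_fasta(fasta_lines):
        # Assumed attributes: the sequence name doubles as gene and transcript ID.
        attrs = f'gene_id "{name}"; transcript_id "{name}"; gene_name "{name}";'
        for feature in ("gene", "transcript", "exon"):
            # GTF coordinates are 1-based and inclusive.
            yield "\t".join([name, source, feature, "1", str(len(seq)),
                             ".", "+", ".", attrs])


if __name__ == "__main__":
    fasta = [">GFP transgene\n", "ATGGTG\n", "AGCAAG\n"]
    for gtf_line in gtf_lines_for_extra_sequences(fasta):
        print(gtf_line)
```

Treating the whole sequence as a single exon on the forward strand is the simplest possible annotation; a transgene with known structure would need a hand-written GTF instead.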

Get free access to the pipeline

Cloud-based version

  • Go to Velsera.com
  • Click Request Access. In the request access window, enter your email address to receive an email invitation to the Seven Bridges Genomics platform within 24 hours.
  • Click the link in the email invitation and complete the registration. Seven Bridges Genomics displays the dashboard with the demo projects.

 

Local version

  • Go to bitbucket.org/CRSwDev/cwl. If necessary, create a Bitbucket account.
  • In the left pane, click Downloads > Download Repository. The CWL and YML files will download.
  • Unzip the archive. Each folder within the archive is named after the pipeline version to which it corresponds.