For our clients engaged in biotechnology research, the sheer volume of data is often the biggest challenge. For example, identifying candidate genes and single nucleotide polymorphisms (SNPs) often requires aligning data from sources such as next-generation sequencing. Depending on the project and source, these data can run to terabytes.

The first stage of many bioinformatics analyses is the development of a data pipeline. Whether the sequencing is done in-house or at an external laboratory, we can build a direct connection so that experimental data are processed as soon as possible after sequencing is complete. The data can then be assessed for quality, and any necessary alignments can be performed.

Some sources of bioinformatics data we commonly encounter include: 

  • Next-generation sequencing data
  • Sanger sequencing data
  • Enzyme-linked immunosorbent assay (ELISA) and other assay data
  • DNA-Seq / RNA-Seq
  • Publicly available data from sources such as NCBI, including annotated genomes

Common data formats include FASTA, FASTQ, MSA, and plain text. 
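As an illustration of what receiving one of these formats can involve, here is a minimal sketch of a FASTQ reader in Python. It assumes the common four-line record layout and Phred+33 quality encoding; the names `FastqRecord`, `read_fastq`, and `mean_phred` are hypothetical and chosen for this example, not part of our platform.

```python
from typing import Iterator, NamedTuple, TextIO


class FastqRecord(NamedTuple):
    name: str
    sequence: str
    quality: str


def read_fastq(handle: TextIO) -> Iterator[FastqRecord]:
    """Yield records from a FASTQ stream (assumes four lines per record)."""
    while True:
        header = handle.readline()
        if not header:
            return  # end of file
        header = header.rstrip("\r\n")
        if not header:
            continue  # tolerate blank lines between records
        sequence = handle.readline().rstrip("\r\n")
        separator = handle.readline().rstrip("\r\n")
        quality = handle.readline().rstrip("\r\n")
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError(f"Malformed FASTQ record near: {header!r}")
        if len(sequence) != len(quality):
            raise ValueError(f"Sequence/quality length mismatch in: {header!r}")
        yield FastqRecord(header[1:], sequence, quality)


def mean_phred(quality: str, offset: int = 33) -> float:
    """Mean Phred score of a quality string (assumes Phred+33 by default)."""
    if not quality:
        return 0.0
    return sum(ord(char) - offset for char in quality) / len(quality)
```

Because the reader is a generator, records are handled one at a time rather than loaded into memory together, which matters when files are large.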

Our general bioinformatics pipeline consists of modules that receive and standardize genetic data, perform quality control, run standard analyses such as alignments, and output results to a user-friendly interface. By supporting analyses from start to finish in this way, we automate the most time-intensive tasks and work with the client to iterate on analyses.
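The modular structure described above can be sketched roughly as a chain of independent stages. The example below is a simplified illustration in Python, not our production code: `Read`, `standardize`, `quality_filter`, and `run_pipeline` are hypothetical names, the quality threshold is arbitrary, and the alignment and interface stages are omitted for brevity.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, List, Tuple


@dataclass(frozen=True)
class Read:
    name: str
    sequence: str
    qualities: Tuple[int, ...]  # per-base quality scores


# A stage consumes a stream of reads and produces a new stream.
Stage = Callable[[Iterable[Read]], Iterator[Read]]


def standardize(reads: Iterable[Read]) -> Iterator[Read]:
    """Normalize names and sequences (trim whitespace, uppercase bases)."""
    for read in reads:
        yield Read(read.name.strip(), read.sequence.strip().upper(), read.qualities)


def quality_filter(min_mean: float) -> Stage:
    """Build a stage that keeps reads whose mean quality is at least min_mean."""
    def stage(reads: Iterable[Read]) -> Iterator[Read]:
        for read in reads:
            if read.qualities and sum(read.qualities) / len(read.qualities) >= min_mean:
                yield read
    return stage


def run_pipeline(reads: Iterable[Read], stages: List[Stage]) -> List[Read]:
    """Pass reads through each stage in order and collect the results."""
    stream: Iterable[Read] = reads
    for stage in stages:
        stream = stage(stream)
    return list(stream)


if __name__ == "__main__":
    raw = [
        Read(" r1 ", "acgt", (40, 40, 40, 40)),
        Read("r2", "ACGT", (2, 2, 2, 2)),
    ]
    for read in run_pipeline(raw, [standardize, quality_filter(20.0)]):
        print(read.name, read.sequence)
```

Keeping each stage as a separate function means a step can be swapped or re-run without touching the others, which is one way to support iterating on an analysis.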

To learn more, get in touch.