Contract Analysis


Research & Development based on NGS data analysis

Our expert analysts, who have led genome / epigenome research projects using next-generation sequencing (NGS) at research institutions, offer contract analysis starting from the design stage, and carry out hands-on research and technology development based on NGS data analysis from scratch.

Research and development using NGS data analysis requires careful advance planning of four elements: funds, time, manpower, and technology. Which of these elements to emphasize depends on the purpose and circumstances, and R&D using NGS data also requires adequate know-how.

Our company not only offers contract analysis of genome / epigenome data obtained by NGS, but also supports customers from the initial research design through to the end of the project.


All-in-one genome / epigenome analysis, from base call to higher-order analysis

Our genome / epigenome contract analysis is not just a general primary analysis, but an all-in-one service that also includes higher-order analysis to find biological significance. At each phase of the analysis flow, professional NGS analysts evaluate accuracy and interpret the results, which lets your research and development proceed more smoothly.

The figure above shows an example analysis flow for transcription factor binding site analysis using next-generation sequencing (ChIP-seq data analysis using Illumina HiSeq2500). Part of its outline is introduced here.

  1. Base call (Illumina CASAVA, etc.): From the fluorescence images captured by the CCD camera, we determine the nucleotide sequence of each sequence read. The accuracy of sequencing and library preparation can be verified by checking how much the fluorescence signals for different bases overlap at the same position in the CCD image, and by checking the number of sequence reads that share the same nucleotide sequence. Because the criteria differ depending on the NGS sequencing protocol, base calling is not included in our basic services. If you need analysis starting from the base call, please consult us.
  2. Mapping (bwa, bowtie2, etc.): We map the sequence reads to the reference genome. You need to decide how many mismatches to tolerate during mapping; the best settings depend on the purpose, the required accuracy, the species, and the cell type. You also need to decide how to handle a read that maps to more than one genomic region, and how to handle multiple reads that have the same base sequence.
  3. Peak call (MACS, MACS2, SICER, etc.): By identifying regions where mapped reads are concentrated, it is possible to identify the genomic / epigenomic regions associated with the cell function of interest (in this example, transcription factor binding regions). The appropriate peak-calling method depends on the nature of the experimental procedure and the target factor. Many mammalian genomes, such as human and mouse, are rich in repeat sequences, so it is also important to decide how to handle peaks that appear to be concentrated in those repeats.
  4. Binding motif analysis (MODC, DME, MEME, TRAP, etc.): We identify which base sequence patterns are enriched in functional regions of the genome / epigenome (in this example, the DNA sequence patterns recognized by transcription factors). Multiple algorithms for predicting binding motifs have already been developed, and you need to understand the characteristics of each one when using it. There are two main approaches to identifying binding motifs: 1) predicting motifs from scratch from the data, and 2) searching for known motifs.
  5. Integrated analysis with gene expression data: Combining the results with gene expression data makes it possible to narrow down the function of specific genomic / epigenomic regions (in this example, the targets of the transcription factor). The positional relationship between a regulatory region and a gene is diverse: a gene may be regulated by a distal region 1 Mbp away, or jointly by multiple regions. The correspondence between regions and genes therefore has to be chosen to suit the data. In addition, whether a gene counts as "activated" is not always determined by its absolute expression level alone, so the choice of that criterion is also important in the integrated analysis.
  6. Co-localization analysis: A transcription factor does not necessarily function on its own; it may act on a genomic region together with a number of other factors, in a cell-type-specific or species-specific manner. In such cases the recognition mechanism depends on whether the factors interact directly or indirectly, and you need to decipher that mechanism from the obtained NGS data.
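
As a rough illustration of steps 5 and 6, the following is a minimal sketch and not the pipeline used in our service. It assumes peaks on a single chromosome given as half-open (start, end) intervals, a sorted list of transcription start sites (TSS), and a table of log2 fold changes. The function names, the nearest-TSS rule, and the thresholds are hypothetical choices made for this example; as noted in step 5, real regulatory relationships are more diverse than a nearest-gene rule.

```python
from bisect import bisect_left

MAX_DISTANCE = 1_000_000  # assumed search window (1 Mbp), following step 5


def nearest_gene(peak_pos, tss_list, max_distance=MAX_DISTANCE):
    """Return (gene, distance) for the TSS closest to peak_pos, or None.

    tss_list must be sorted by position: [(position, gene_name), ...].
    """
    positions = [pos for pos, _ in tss_list]
    i = bisect_left(positions, peak_pos)
    # Only the TSS just left and just right of the peak can be the nearest.
    candidates = tss_list[max(i - 1, 0):i + 1]
    if not candidates:
        return None
    pos, gene = min(candidates, key=lambda t: abs(t[0] - peak_pos))
    distance = abs(pos - peak_pos)
    return (gene, distance) if distance <= max_distance else None


def target_candidates(peaks, tss_list, log2_fold_change, min_abs_lfc=1.0):
    """Map each candidate target gene to the distance of its closest peak.

    A gene is kept only if a peak lies within the search window AND its
    expression change is large (hypothetical criterion: |log2FC| >= 1).
    """
    targets = {}
    for start, end in peaks:
        hit = nearest_gene((start + end) // 2, tss_list)
        if hit is None:
            continue
        gene, distance = hit
        if abs(log2_fold_change.get(gene, 0.0)) >= min_abs_lfc:
            targets[gene] = min(distance, targets.get(gene, distance))
    return targets


def overlaps(a, b):
    """True if half-open intervals a and b share at least one base."""
    return a[0] < b[1] and b[0] < a[1]


def colocalized_fraction(peaks_a, peaks_b):
    """Fraction of peaks in peaks_a overlapping at least one peak in peaks_b."""
    if not peaks_a:
        return 0.0
    hits = sum(any(overlaps(a, b) for b in peaks_b) for a in peaks_a)
    return hits / len(peaks_a)


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    tss = [(1_000, "GENE_A"), (50_000, "GENE_B")]
    peaks_tf1 = [(900, 1_100), (49_000, 49_200), (3_000_000, 3_000_200)]
    peaks_tf2 = [(1_050, 1_300)]
    lfc = {"GENE_A": 2.0, "GENE_B": 0.2}
    print(target_candidates(peaks_tf1, tss, lfc))
    print(colocalized_fraction(peaks_tf1[:2], peaks_tf2))
```

In practice the distance window, the expression criterion, and the overlap definition would each be tuned to the experiment, which is the kind of judgment described in the steps above.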

Fully customized commissioned analysis tailored to the project

We offer contract analysis focused on the NGS data types listed in the following table (mainly human, mouse, and microbiota). For data types not included in the table, please mention them via "Contact Us". On request, we also return visualized results after the analysis.

We propose the most appropriate contract analysis plan for your situation, whether before or after NGS data acquisition, or before or after the primary analysis. Please contact us if you need research and consulting services on technology development specifications prior to genome / epigenome data acquisition, interpretation of the biological significance of the obtained data, or creation of figures for papers and presentations.

| Purpose | Method | Outline | Contract analysis content |
| --- | --- | --- | --- |
| Identify transcription factors and their roles in cell function | ChIP-seq for transcription factor | Identify transcription factor binding regions genome-wide | Predict and identify transcription factor binding regions, distribution of binding regions around TSS and genome-wide, co-localized factors, binding motifs (recognition sequences), and target genes |
| | ChIP-seq for histone modification (H3K4me1, H3K4me3, H3K27ac, H3K27me3, H3K36me3, etc.) | Identify regions carrying specific histone modifications genome-wide | Predict / identify histone modification regions, TSS / genome-wide region distribution, binding motifs (recognition sequences) of transcription factors associated with the histone modifications, and comparison with public data |
| | FAIRE-seq | Identify open chromatin regions (DNA fragmentation by sonication) | Identify / predict open chromatin regions, TSS / genome-wide region distribution, binding motifs (recognition sequences) of transcription factors associated with open chromatin, and comparison with public data |
| | DNase-seq | Identify open chromatin regions genome-wide (DNA fragmentation by DNase I) | Same as FAIRE-seq, plus optional "footprint analysis" to identify transcription factor binding sites at 1 bp resolution |
| | ATAC-seq | Identify open chromatin regions genome-wide (DNA fragmentation by Tn5 transposase) | Same as FAIRE-seq, plus optional "footprint analysis" to identify transcription factor binding sites at 1 bp resolution |
| Identify genome-wide methylation patterns | MBD-seq | Identify DNA methylation regions genome-wide | DNA methylation regions, TSS / genome-wide region distribution, identification of methylation change patterns, integrated analysis with gene expression, and comparison with public data |
| | Methylation Array (e.g., Illumina 450K array) | Identify DNA methylation regions genome-wide (using microarrays) | Identify DNA methylation and methylation change patterns for specific CpG regions, integrated analysis with gene expression, and comparison with public data |
| | Bisulfite Sequencing | Identify DNA methylation genome-wide at 1 bp resolution by bisulfite treatment | Prediction of DNA methylation change patterns at 1 bp resolution, and detailed classification of CpG regions |
| Identify comprehensive gene expression patterns | mRNA-seq | Quantify the expression levels of all genes comprehensively (including lincRNA and the like) | Clustering analysis of genes / samples based on gene expression patterns, integrated analysis with epigenome data and the like, construction of transcriptional regulatory networks, and comparison with public data |
| | mRNA Expression Array | Quantify the expression levels of all genes comprehensively (using microarrays) | Quantification / clustering of expression patterns specific to particular gene regions; otherwise the same as mRNA-seq |
| | microRNA Expression Array | Quantify microRNA expression levels comprehensively (using microarrays) | Quantification of expression levels of known microRNAs |
| Identify genomic mutations in a specific population | Exome Sequencing | Identify genomic mutations in a particular population genome-wide | Identify genomic mutations such as CNVs and SNPs present in a particular sample population, comparison with public data, and integrated analysis with epigenome data |
| | SNP array | Identify genomic mutations in a particular population (using microarrays) | Genomic mutation analysis specialized for DNA mutations of known diseases |
| Determine the population distribution of microbiota | 16S rRNA Sequencing | Identify the population distribution of microbiota by phylogenetic analysis of 16S rRNA base sequences (available from the DNA extraction process) | Population distribution of microbiota, clustering between samples (extraction of common / specific microorganisms), and comparison with public microbiota data |

Information analysis of large-scale biochemical data

Information analysis of large-scale biochemical data is in demand in a variety of settings, not only for NGS data but also for molecular target prediction and the optimization of individual health management. As a professional team for large-scale biochemical data analysis, Rhelixa provides information analysis solutions by applying machine learning (Deep Learning, Neural Network, Support Vector Machine, Self-Organizing Map, EM algorithm, Simulated Annealing, etc.) and statistical analysis techniques.


Requesting contract analysis

We propose the contract analysis best suited to your R&D project. If you would like analysis of a single NGS dataset only, the price is from 2 to 10 thousand yen, depending on the type of higher-order analysis (delivery time in this case is one to two weeks). If you prefer to contract on a project-by-project basis, the minimum contract period is three months, and we charge a flat rate regardless of the amount of data newly obtained for analysis during that period. So that we can propose a contract tailored to your project, we arrange a meeting between you and our professional staff free of charge. For details and estimates of contract analysis, please send an inquiry or e-mail info@rhelixa.co.jp.

Access

  • 〒101-0032
  • 3-7-4 3F Iwamotocho, Chiyoda-ku, Tokyo, Japan

Contact

  • E-mail:info@rhelixa.co.jp
  • TEL:+81-3-6240-9330
  • FAX:+81-3-6240-9331