Genome annotation by scientists, for scientists


The founding team of GOENOMICS has been developing and researching genome annotation software for more than 35 years in academic research. Recognizing the limitations of current approaches, we have joined forces to develop new software. If you have questions about annotating a genome assembly or a particular genomic region, contact us and we will provide you with a detailed assessment.

Configure the genome annotation according to your needs

Your advantages

Order only the annotations you need: select from 7 annotation packages.

Considerably higher gene prediction quality through a new annotation approach.

Annotation of genomes at all assembly levels.

Short delivery time

Genome annotation is more than the prediction of protein-coding genes

Genome annotation means that each nucleotide in a genome sequence is assigned a function. Current genome annotations focus on the prediction of protein-coding genes. Untranslated regions (5'- and 3'-UTRs), non-coding regions, transposons and even RNA genes are rarely annotated. GOENOMICS provides all-encompassing genome annotations, but you can also select only the annotations you need. By looking at all types of genes and regions simultaneously, we significantly reduce false positives in which one type of gene is mistaken for another.
Structural gene prediction takes into account various features, such as open reading frames (ORFs), splice sites, start and stop codons, and consensus sequences, to distinguish coding regions from non-coding ones.
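
As a rough illustration of one of the signals named above, the following sketch scans the forward strand of a DNA sequence for open reading frames, each running from an ATG start codon to the first in-frame stop codon. It is a simplified teaching example and not the GOENOMICS prediction method: the function name `find_orfs`, the `min_codons` threshold, and the use of the standard stop codons are assumptions made here, and the reverse strand, splice sites, and consensus motifs are ignored.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # assumption: standard genetic code


def find_orfs(seq, min_codons=3):
    """Return forward-strand ORFs as (start, end) pairs.

    Coordinates are 0-based and end-exclusive. An ORF runs from an ATG
    to the first in-frame stop codon; min_codons counts both the start
    and the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens the ORF
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # look for the next ORF in this frame
    return sorted(orfs)


print(find_orfs("CCATGAAATAGGG"))  # [(2, 11)]
```

An ORF on its own is weak evidence for a gene, which is why gene prediction combines it with the other features listed above.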

Annotation process

Our work starts with the customer's genome assembly and optional supporting RNA-Seq and/or Iso-Seq data.

Depending on the packages ordered, we deliver the annotation results as GFF3, FASTA, and CSV/XLSX files (described below).

Check the enhanced analysis report of your genome assembly and annotation
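
For readers who want a first look at a delivered GFF3 file, the sketch below counts the entries per feature type. It relies only on the general GFF3 layout (nine tab-separated columns, feature type in the third column, comment lines starting with `#`, optional `##FASTA` section at the end). The function name `count_feature_types` and the example records are hypothetical and do not come from an actual delivery.

```python
from collections import Counter


def count_feature_types(gff3_text):
    """Count entries per feature type (column 3) in GFF3-formatted text."""
    counts = Counter()
    for line in gff3_text.splitlines():
        if line.startswith("##FASTA"):
            break  # embedded sequences follow; no more feature lines
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, directives, and comments
        columns = line.split("\t")
        if len(columns) == 9:
            counts[columns[2]] += 1
    return counts


# Hypothetical example records for illustration only.
EXAMPLE = "\n".join([
    "##gff-version 3",
    "chr1\texample\tgene\t100\t900\t.\t+\t.\tID=gene1",
    "chr1\texample\tmRNA\t100\t900\t.\t+\t.\tID=mrna1;Parent=gene1",
    "chr1\texample\texon\t100\t400\t.\t+\t.\tParent=mrna1",
    "chr1\texample\texon\t600\t900\t.\t+\t.\tParent=mrna1",
    "##FASTA",
    ">chr1",
    "ACGT",
])

for feature_type, n in sorted(count_feature_types(EXAMPLE).items()):
    print(feature_type, n)
```

To run it on a real file, read the file contents into a string and pass them to `count_feature_types`.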

Package: Protein-coding genes

Smarter Gene Reconstruction for Deeper Insight

Accurately identifying protein-coding genes is the foundation for understanding the biological blueprint of any organism. Our approach focuses on precisely mapping these genes within your DNA sequence, unlocking the functional elements that drive cellular processes, development, and evolution.

To separate coding from non-coding regions, we analyze key gene structure signals — including open reading frames (ORFs), splice sites, and consensus motifs — ensuring high-confidence predictions you can trust.

At the core of our solution is a proprietary, next-generation algorithm for homology-based gene reconstruction. Built from the ground up using thousands of in-house annotations and refined through iterative optimization, it delivers remarkably accurate and biologically relevant gene models that are ready to power your research.

Your benefits

Cleaner annotations, minimal contamination

Avoid the noise of false positives. Thanks to our proprietary transposon detection technology, your genome annotations are free from misleading transposon overlaps — responsible for up to 20% of inaccuracies in other tools like MAKER and BRAKER — ensuring you work with high-confidence protein-coding gene predictions.

Precision That Powers Discovery

Achieve unmatched annotation accuracy with exon-level precision. Our advanced genome annotation pipeline eliminates mis-predicted exons — common in public databases — that often lead to cascading errors in downstream analyses. By delivering cleaner data and maximizing completeness through the identification of lineage-specific genes, our annotations offer deeper biological insights tailored to your research goals.

Package: Functional annotation

Functional Annotation: Turning Sequences into Biological Insight

Functional annotation adds biological context to your gene predictions by identifying the roles of genes, transcripts, and proteins through comparison with curated reference databases. This includes detecting functional domains, sequence motifs, gene ontology (GO) terms, regulatory elements, and biological pathways. The result is a comprehensive understanding of how each gene contributes to cellular functions and broader biological processes.

Accurate Naming: Beyond Generic Labels

Protein and gene names are typically assigned based on similarity to known entries in reference databases. However, database content and versioning can significantly affect name accuracy. Unverified classification into subfamilies or classes without phylogenetic validation often leads to confusion and mislabeling. We also avoid ambiguous terms like “hypothetical,” “probable,” or “related,” which dilute the clarity of annotation. Our approach ensures reliable, meaningful names grounded in biological evidence.

Your benefits

Crystal-Clear Functional Annotations

Say goodbye to cluttered, confusing gene and protein names. We refine your functional annotations through rigorous post-processing — removing vague terms and unnecessary additions for cleaner, publication-ready naming. Our process follows GenBank best practices and eliminates false-positive domain assignments by using only carefully selected, up-to-date tools with high precision.

Tailored Database Matching for Maximum Clarity

Get precise, 1:1 protein matches by selecting from trusted reference databases aligned with your research focus (e.g. FlyBase, TAIR, ZFIN). Whether you're working with model organisms or niche species, this ensures consistent gene naming and easier communication in publications and scientific communities.

Package: RNA genes

RNA Genes and Their Cellular Roles

RNA genes produce functional RNA molecules that play key roles in cellular processes. Among the most critical are transfer RNAs (tRNAs) and ribosomal RNAs (rRNAs), which are central to the protein synthesis machinery. Beyond these, non-coding RNAs — such as microRNAs (miRNAs) and long non-coding RNAs (lncRNAs) — act as important regulators of gene expression, influencing mRNA stability and translation efficiency.

Your benefits

Accurate RNA Gene Detection, No False Signals

Get reliable results with our expertly curated and customized RNA gene set. We go beyond standard pipelines by focusing on biologically relevant RNA genes — including ribosomal and spliceosomal RNAs, the RNA component of the signal recognition particle, and ribonuclease P genes — while filtering out noise from random matches often found when using the full Rfam database. This means cleaner data and more meaningful insights for your downstream analyses.

tRNA Insights Aligned with the Genetic Code

Our enhanced analysis report gives you a comprehensive overview of your tRNA gene content, mapped directly onto the genetic code. See how wobble base pairing reduces the number of distinct tRNAs needed to read all codons, and assess the completeness of tRNA coverage. This powerful feature can help uncover potential causes of mistranslation or tRNA scarcity, which is valuable knowledge for evolutionary studies, synthetic biology, or translational optimization.
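
The idea of mapping tRNA genes onto the genetic code can be sketched as follows. The example uses a simplified form of Crick's wobble rules for unmodified bases only (anticodon 5' base G reads codon 3' C or U, U reads A or G, C reads G, A reads U) and the standard genetic code for stop codons. Base modifications such as inosine are ignored. The function names `decoded_codons` and `uncovered_codons` are hypothetical, and this is not a description of how the report itself is computed.

```python
from itertools import product

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}
# Simplified wobble rules: anticodon 5' base -> codon 3' bases it can read.
WOBBLE = {"C": "G", "A": "U", "U": "AG", "G": "CU"}
STOP_CODONS = {"UAA", "UAG", "UGA"}  # assumption: standard genetic code


def decoded_codons(anticodon):
    """Codons (5'->3') readable by an anticodon written 5'->3'."""
    first, second, third = anticodon.upper().replace("T", "U")
    # Codon positions 1 and 2 pair strictly with anticodon positions 3 and 2.
    prefix = COMPLEMENT[third] + COMPLEMENT[second]
    # Codon position 3 pairs with anticodon position 1 (the wobble position).
    return sorted(prefix + base for base in WOBBLE[first])


def uncovered_codons(anticodons):
    """Sense codons not readable by any of the given anticodons."""
    covered = {c for a in anticodons for c in decoded_codons(a)}
    sense = {"".join(p) for p in product("ACGU", repeat=3)} - STOP_CODONS
    return sorted(sense - covered)


print(decoded_codons("GAA"))  # ['UUC', 'UUU']
print(len(uncovered_codons(["GAA", "CAU"])))  # 58
```

In this sketch a single tRNA with anticodon GAA reads both phenylalanine codons, which is the reduction in required tRNAs that wobble pairing provides.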

Package: UTRs / Coverage

Untranslated Regions (UTRs): More Than Just Exons

The regions flanking the coding exons — known as untranslated regions or UTRs — play key roles in gene regulation. Located at the 5' and 3' ends of transcripts, these are referred to as 5'UTRs (leader sequences) and 3'UTRs (trailer sequences). Like coding exons, UTRs can be interrupted by introns or undergo alternative splicing.

How We Identify UTRs

UTRs are either defined by direct mapping of RNA-Seq data or predicted with models that are trained on real data or derived from closely related species. This dual approach ensures both accuracy and completeness across diverse genomes.

Your benefits

Confidence You Can Trust: RNA-Seq Coverage Insights

Get precise validation of your annotations with our dual RNA-Seq coverage analysis — designed to give you a deeper level of confidence in every gene model.

  • Full-Length Coverage shows how completely a transcript is captured, offering a high-level view of structural integrity from 5' to 3'.
  • Base-Level Coverage drills down to each individual nucleotide, revealing exactly how strongly every position of a gene model is supported by experimental evidence.

These metrics help you assess annotation quality with clarity and precision. With high RNA-Seq support, you can move forward knowing your gene models are built on solid, verifiable data. (Note: Coverage values are used for structural validation, not for differential expression analysis.)
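
The two views described above can be illustrated with a small sketch. The exact definitions used in the report are not given here, so the code assumes one plausible reading: base-level coverage is the read depth at each transcript position, and full-length coverage is the fraction of positions supported by at least one read. The function names and the toy alignment intervals are hypothetical.

```python
def base_level_coverage(transcript_length, alignments):
    """Per-nucleotide read depth along a transcript.

    alignments: (start, end) pairs in transcript coordinates,
    0-based and end-exclusive.
    """
    depth = [0] * transcript_length
    for start, end in alignments:
        # Clip each alignment to the transcript boundaries.
        for pos in range(max(start, 0), min(end, transcript_length)):
            depth[pos] += 1
    return depth


def full_length_coverage(depth):
    """Fraction of transcript positions covered by at least one read."""
    if not depth:
        return 0.0
    return sum(1 for d in depth if d > 0) / len(depth)


depth = base_level_coverage(10, [(0, 4), (2, 6)])
print(depth)                        # [1, 1, 2, 2, 1, 1, 0, 0, 0, 0]
print(full_length_coverage(depth))  # 0.6
```

In this toy case the last four positions have no read support, so a gene model ending there would warrant a closer look.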

Package: Transposons

High-Confidence Retrotransposon Annotations: Uncover the Hidden Drivers of Genome Evolution

Retrotransposons aren’t just genomic “junk” — they’re powerful elements that shape genome structure, regulation, and evolution. Yet, most annotation pipelines overlook them or mistake them for protein-coding genes, leading to confusion in gene models and lost biological insights.

At GOENOMICS, we treat retrotransposons as a critical feature of your genome assembly — not background noise. Using our proprietary mendle® technology, we deliver comprehensive and precise annotations of all major retrotransposon classes, including:

  • LINEs (Long Interspersed Nuclear Elements) – autonomous elements that encode reverse transcriptase
  • SINEs (Short Interspersed Nuclear Elements) – non-autonomous elements often piggybacking on LINE machinery
  • LTR retrotransposons (Long Terminal Repeat elements) – similar in structure to retroviruses, including Ty1-copia and Ty3-gypsy families
  • Retrovirus-like elements – endogenous viral elements that can be mistakenly classified as gene fragments

Your benefits

By accurately identifying these elements and distinguishing them from protein-coding genes, we empower you to:

  • Gain deeper insight into genome organization and regulatory complexity
  • Track species-specific retrotransposon activity and evolutionary signatures
  • Avoid false gene predictions caused by transposon interference
  • Support research in epigenetics, genome plasticity, and adaptation

Whether you’re working in plant, animal, or fungal genomics, our transposon annotations give you the clarity and resolution needed to explore the full potential of your genome.

Package: 5 Genes

Expert-Curated Gene Models — Tailored to Your Research Priorities

As part of this annotation package, you’ll have the opportunity to select up to five genes of particular interest for in-depth, manual curation by our expert team. This premium service ensures your most important targets receive the highest level of attention and accuracy.

Your benefits

  • Exon-by-exon validation using RNA-Seq data and comparative genomics, ensuring that each gene structure reflects biological reality — not just computational predictions.
  • Accurate identification of alternative splice variants that are often missed in automated workflows but are crucial for understanding gene function and regulatory diversity.
  • Full functional annotation, including protein domain insights, gene names, and ontology terms, carefully evaluated for consistency and clarity.
  • Clear visualization of RNA-Seq support where available, helping you confidently interpret expression patterns and experimental evidence.

Your final genome report will include a dedicated section for each manually annotated gene, providing publication-ready gene models with transparent evidence and interpretation — perfect for follow-up experiments, grants, or community database submissions.
Let your key genes stand out with annotations you can trust.

Package: Enhanced analysis report

Turn data into discovery with interactive, biology-driven analysis

Get access to our Enhanced Analysis Report — a powerful, interactive online tool that transforms raw annotation output into meaningful biological insight. Designed for researchers who want more than just files, this report offers an in-depth, visual breakdown of your assembly and annotation quality.
From genome-wide metrics to fine-grained structural insights, you’ll get intuitive charts, graphs, and tables that let you explore, interpret, and share your results effortlessly.

Whether you’re preparing for publication, evaluating a new genome, or comparing multiple assemblies, our Enhanced Analysis Report gives you a clear, data-rich foundation for confident decision-making and next-step planning.

Ready to go beyond the GFF? Dive deeper with your data.