Plugins are small software components that can be dynamically loaded by EuGène. Although it is completely transparent to the end-user, every plugin loaded by EuGène must be written in C++ and be a subclass of the Sensor class (for more details see the EuGène documentation).

Plugins are presented in 4 categories:
Signal plugins
Content plugins
Mixed signal/content plugins
Other plugins

Signal plugins

Name Description Information source Link
ATGpr (obsolete) Injects possible translation starts as predicted by the ATGpr program.
The plugin reads the program's predictions from two files whose names are derived from the sequence name by adding the .atgpr and .atgprR suffixes (predictions for the forward and reverse strands, respectively).
ATGpr
EuStop Predicts translation stops. It can deal with noisy sequences and will, e.g., predict a possible stop on TGN. EuGène dummy stop sensor -
FrameShift Predicts possible frameshifts (either insertions or deletions) at each position of the sequence with a uniform cost. EuGène dummy frameshift sensor -
GSplicer Injects possible splice sites as predicted by the GeneSplicer program.
The plugin reads the program's predictions from one file whose name is derived from the sequence name by adding the .Gsplicer suffix. This file describes the predicted splice sites for the forward and reverse strands.
GeneSplicer
NG2 Injects possible splice sites as predicted by the NetGene2 program.
The plugin reads the program's predictions from two files whose names are derived from the sequence name by adding the .splices and .splicesR suffixes (predictions for the forward and reverse strands, respectively).
NetGene2
NStart Injects possible translation starts as predicted by the NetStart program.
The plugin reads the program's predictions from two files whose names are derived from the sequence name by adding the .starts and .startsR suffixes (predictions for the forward and reverse strands, respectively).
NetStart
PatConst Predicts a specific, user-chosen type of signal at each occurrence of a given pattern in the sequence. - -
PepSignal Uses Predotar to predict peptide addressing sequences after every occurrence of an ATG and modifies ATG scoring accordingly.
Predotar
Predotar - INRA
SMachine Injects possible signals as predicted by the SpliceMachine program. SpliceMachine SpliceMachine
SpliceWAM Detects splice sites and gives them a score reflecting how well their context agrees with given models.
A score is assigned to each potential splice site (AG / GT) according to the Weight Array Method.
EuGène Window Array Model for splice -
SPred Injects possible splice sites as predicted by the SplicePredictor program.
The plugin reads the program's predictions from two files whose names are derived from the sequence name by adding the .spliceP and .splicePR suffixes (predictions for the forward and reverse strands, respectively).
SplicePredictor
StartWAM Detects translation start codons and gives them a score reflecting how well their context agrees with given models.
A score is assigned to each potential start codon (ATG) according to the Weight Array Method (see Zhang and Marr, Comput Appl Biosci. 1993 Oct;9(5):499-509), or Weighted Array Matrix models (Salzberg, Comput Appl Biosci 1997 Aug;13(4):365-76). A WAM describes a consensus motif of a functional signal and is composed of one Markov model per position of the motif. Here the motif is defined by the ATG (present in all start codons) plus its two flanking contexts (which are informative for the WAM). Overall, the score of a motif is a function of the emission probabilities of this motif given a true positive model and a false positive model.
EuGène Window Array Model for start -
Transcript Predicts a possible transcription start and stop at every position, all with the same uniform cost. EuGène dummy transcript start/stop sensor -
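The WAM scoring described for SpliceWAM and StartWAM can be illustrated with a minimal C++ sketch. It is not EuGène's implementation: it assumes first-order Markov models, assumes the score is the log-likelihood ratio between the true positive and false positive models, and assumes ambiguous bases contribute nothing. The names Transition, Wam, baseIndex, wamScore and uniformWam are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <string>
#include <vector>

// One first-order Markov model per motif position (assumed order):
// t[prev][cur] = P(base cur at this position | base prev at the previous one).
using Transition = std::array<std::array<double, 4>, 4>;
using Wam = std::vector<Transition>;

// Maps A, C, G, T to 0..3; returns -1 for any other character (e.g. N).
int baseIndex(char c) {
    switch (c) {
        case 'A': case 'a': return 0;
        case 'C': case 'c': return 1;
        case 'G': case 'g': return 2;
        case 'T': case 't': return 3;
        default: return -1;
    }
}

// Log-likelihood ratio of a window under the true positive and false
// positive models. window[0] only serves as context for window[1], so the
// window must hold one more base than the models have positions.
double wamScore(const std::string& window, const Wam& truePos, const Wam& falsePos) {
    assert(truePos.size() == falsePos.size());
    assert(window.size() == truePos.size() + 1);
    double score = 0.0;
    for (std::size_t i = 0; i < truePos.size(); ++i) {
        int prev = baseIndex(window[i]);
        int cur = baseIndex(window[i + 1]);
        if (prev < 0 || cur < 0) continue;  // assumption: ambiguous bases are skipped
        score += std::log(truePos[i][prev][cur]) - std::log(falsePos[i][prev][cur]);
    }
    return score;
}

// Builds a model in which every transition has probability 0.25.
Wam uniformWam(std::size_t positions) {
    Transition t;
    for (auto& row : t) row.fill(0.25);
    return Wam(positions, t);
}
```

With identical true positive and false positive models the score is zero for every window; a positive score means the window is better explained by the true positive model.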
Content plugins

Name Description Information source Link
BlastX Exploit similarities with homologous proteins.
The similarities influence exon and intron detection. Similarities from several databases can be exploited. Usually 3 databases are used: SwissProt, PIR and TrEMBL.
BlastX-based proteic similarity sensor
Est Take into account information from aligned transcribed sequences, both complete cDNA and EST.
The existence of a hit (resp. gap) in the spliced alignment will influence intergenic, exonic and intronic state costs by penalizing states that are incompatible with the alignment. The spliced alignments must be performed beforehand using a spliced aligner such as sim4 or spidey. The output of these aligners must be converted to the appropriate format.
Sim4-based transcription similarity sensor
Homology Exploit exon conservation. TBlastX-based exon conservation sensor
MarkovConst A simulated contents sensor that gives constant probabilities to all positions for each region type.
Two parameters (in the EuGène parameters file) define the GC range of the contents sensor. If the GC% of the sequence is outside this range, the plugin gives the same null log-likelihood to all region types. Used for testing purposes and for simulating the exponential length distributions of an HMM.
EuGène dummy constant probabilities -
MarkovIMM Injects coding/intronic/UTR/intergenic likelihood as modeled by interpolated Markov models (introduced in Glimmer).
These models are defined in a so-called matrices file. Depending on the matrices file, this may contain IMMs for exons, introns and intergenic data, and optionally for 5' and 3' UTR regions. If these last two IMMs are absent from the matrices file, intronic models are used for UTR.
EuGène DNA level interpolated markov models -
MarkovProt Injects coding/non-coding likelihood as modeled by proteic Markov models.
These models are defined in a matrices file. The order of the Markov model must be given in the EuGène parameters file.
EuGène amino acid level markov models -
Repeat Exploit the output of a repeated sequence detector such as RepeatMasker by penalizing exonic, intronic or UTR states when repeats are detected. RepeatMasker
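The GC range check described for MarkovConst can be sketched as follows. This is an illustration under stated assumptions, not the plugin's code: the Region list, the function names, and the reading of "null log-likelihood" as 0.0 are all hypothetical, and the two bounds stand in for the two parameters of the EuGène parameters file, whose actual names are not given here.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <string>

// Hypothetical region types; EuGène's own list of states may differ.
enum Region { Exon, Intron, Utr, Intergenic, RegionCount };

// Fraction of G and C among the unambiguous bases (A, C, G, T) of a sequence.
double gcFraction(const std::string& seq) {
    std::size_t gc = 0, total = 0;
    for (char ch : seq) {
        switch (ch) {
            case 'G': case 'g': case 'C': case 'c': ++gc; ++total; break;
            case 'A': case 'a': case 'T': case 't': ++total; break;
            default: break;  // ambiguous bases are ignored
        }
    }
    return total == 0 ? 0.0 : static_cast<double>(gc) / total;
}

// Returns one log-likelihood per region type, identical at every position.
// Outside [minGC, maxGC] every region type gets the same value (0.0 here,
// an assumed reading of "null log-likelihood"), so the sensor expresses no
// preference between region types.
std::array<double, RegionCount> constantContentScores(
        const std::string& seq, double minGC, double maxGC,
        const std::array<double, RegionCount>& probs) {
    assert(minGC <= maxGC);
    std::array<double, RegionCount> scores;
    scores.fill(0.0);
    double gc = gcFraction(seq);
    if (gc < minGC || gc > maxGC) return scores;
    for (std::size_t r = 0; r < scores.size(); ++r) scores[r] = std::log(probs[r]);
    return scores;
}
```

A sequence with 50% GC and a range of 0.4 to 0.6 gets the logarithm of each constant probability; a sequence made only of G and C falls outside that range and gets 0.0 for every region type.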
Mixed Signal/Content plugins

Name Description Information source Link
AnnotaStruct Take into account arbitrary user information using GFF files. High level information of either CDS or transcript elements can be given as well as low-level information on signals or regions.
GFF file
-
IfElse Combine the predictions of two existing plugins.
It listens to a first plugin. For each possible predictable item, if this plugin predicts something then this prediction is used. If the plugin does not predict anything, then the output of the second plugin is used.
EuGène information combination -
Riken Full-length mRNA.
A file with extension .riken is read. Each line must contain, in order: the positions of the extremities of the match of the 5' EST, the name of the 5' EST, the same information for the 3' EST, and the name of the clone.
Full-length mRNA -
User (obsolete) Take into account specific user information (useful to explore alternative gene structures).
The information is given in a simple language containing statements on signals and on the sequence itself.
EuGène arbitrary user information sensor (language based) -
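The fallback rule described for IfElse can be sketched in a few lines of C++. The Prediction and Predictor types and the ifElse function are hypothetical stand-ins for illustration; they are not EuGène's Sensor class or its actual interface.

```cpp
#include <cassert>
#include <functional>

// Hypothetical prediction for one predictable item at one position.
struct Prediction {
    bool present;  // false when the plugin predicts nothing here
    double score;  // meaningful only when present is true
};

// Hypothetical plugin interface: a position in, a prediction out.
using Predictor = std::function<Prediction(int)>;

// Combines two predictors: the first one wins whenever it predicts
// something; otherwise the output of the second one is used.
Predictor ifElse(Predictor first, Predictor second) {
    assert(first && second);
    return [first, second](int position) -> Prediction {
        Prediction p = first(position);
        return p.present ? p : second(position);
    };
}
```

The combination is decided item by item, so the second plugin only fills the gaps left by the first one and never overrides it.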
Other plugins

Name Description Information source Link
GCPlot Add to the graphical representation a plot of basic composition statistics on the sequence.
The composition statistics represented can be arbitrarily chosen, for example GC% = (G+C)/(A+T+G+C). Statistics on the 3rd base of each codon are automatically computed and plotted. The color (integer between 0 and 8), the smoothing window width and a zooming factor can be specified. The plot for the 3rd base of each codon is zoomed using a specific zooming factor.
EuGène basic composition statistics plot -
GFF Add to the graphical representation an annotation provided in GFF format.
Note that the provided GFF annotation could be an EuGène prediction given in GFF format (obtained using the -pg argument). This makes it possible to visualise two predictions on the same graph. For a sequence, the plugin reads the annotation from one file whose name is derived from the sequence name by adding the .gff suffix.
EuGène gff annotation plot -
Plotter Add to the graphical representation the GC%, the GC3% and the two quotients A/(A+T) and T/(A+T). EuGène GC%, GC3%, A/AT% T/AT% plot -
Tester Evaluate signal sensors.
For a sequence, the plugin reads the true gene coordinates from one file whose name is derived from the sequence name by adding the .gff suffix, and creates one file test.sensorName.gff for each sensor tested.
EuGène signal sensors test -
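The composition statistics plotted by GCPlot and Plotter can be computed as in the following C++ sketch. It is only an illustration: the Composition struct and the composition function are hypothetical, the statistics are computed over a whole sequence rather than over a smoothing window, and ambiguous bases are assumed to be ignored.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Composition statistics of a sequence, each expressed as a fraction in [0, 1].
struct Composition {
    double gc;       // (G+C)/(A+T+G+C)
    double gc3;      // same ratio restricted to the 3rd base of each codon
    double aOverAT;  // A/(A+T)
    double tOverAT;  // T/(A+T)
};

// Safe division: an empty denominator yields 0 instead of dividing by zero.
static double ratio(std::size_t num, std::size_t den) {
    return den == 0 ? 0.0 : static_cast<double>(num) / den;
}

// frame is the offset (0, 1 or 2) of the first complete codon in seq.
Composition composition(const std::string& seq, std::size_t frame = 0) {
    assert(frame < 3);
    std::size_t a = 0, c = 0, g = 0, t = 0;
    for (char ch : seq) {
        switch (std::toupper(static_cast<unsigned char>(ch))) {
            case 'A': ++a; break;
            case 'C': ++c; break;
            case 'G': ++g; break;
            case 'T': ++t; break;
            default: break;  // assumption: ambiguous bases are ignored
        }
    }
    std::size_t gc3 = 0, n3 = 0;
    // Third codon positions are frame + 2, frame + 5, and so on.
    for (std::size_t i = frame + 2; i < seq.size(); i += 3) {
        int b = std::toupper(static_cast<unsigned char>(seq[i]));
        if (b == 'G' || b == 'C') { ++gc3; ++n3; }
        else if (b == 'A' || b == 'T') { ++n3; }
    }
    return { ratio(g + c, a + c + g + t), ratio(gc3, n3), ratio(a, a + t), ratio(t, a + t) };
}
```

For the sequence ATGGCC in frame 0, the third codon positions hold G and C, so GC3 is 1 while the overall GC fraction is 4/6.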