MS PDA-UniQ : Yu-Cheng Huang, Bioinformatics Lab, CSIE, NTU

Minimum Set Primers and Unique Probes Design Algorithms for

Differential Detection of Symptom-Related Pathogens

HOME
Introduction
Methodology
  Tetra-Nucleotide Nucleation (TNN)
  Unique & Common Sections
  Nearest-Neighbor Model
  MCGA
  Linker Design
Computational Results
Bio-Experiment
Conclusion
Reference

     We have use a modified version of a compact genetic algorithm (MCGA). Compact genetic algorithm (CGA) represents the population as a probability distribution over the set of solutions (Harik, et al., 1999) . The fitness evaluation in CGA is how the individual approximates the solution. We introduced a local search strategy into CGA. This local search heuristic is based on edge replacement, which is inspired by other heuristic approaches (Fernandes and Skiena, 2002) .

The chromosome is represented as a vector for primers, such as:

(8)

     If primer j is selected, uj will be 1. Otherwise uj will be 0. The chromosome can be seen as a primer set which can satisfy the constraints of the set covering problem. A probability vector V is used to represent the population:

(9)

     The element vj is the probability to select primer j . At the initial stage, all the probabilities are set to 0.5 for random selection of primers. All chromosomes are generated according to the probability vector V. We used uniform crossover to generate two intermediates from two parents. Local search strategy is applied to these intermediates in order to produce valid individuals. These new children will compete with their parents and update the probability vector V.

     The local search mechanism first makes the intermediates become valid solutions by introducing some primer pairs to cover all sequences. Then each primer pair is replaced by another primer pair with the same target sequence if the replacement decreases the number of primers. The local search process repeats until the number of primers can not be reduced any further.

     The competition between two chromosomes results in a chromosome with better fitness and the other with an inferior fitness score. The probability vector is updated based on the result of competition. If primer j presents in the chromosome with a better fitness score but not in the inferior one, vj = vj + 1/n , where n is the size of population; if primer j presents in the inferior chromosome but not in the one with a better fitness score, the probability for selecting this primer will become vj = vj - 1/n ; if primer j presents in both chromosomes, then vj is not modified. This process will repeat t times for all primers.

     The entire MCGA process terminates when the probability vector V converges. Combining the local search heuristic and global search of CGA, MCGA can solve the transformed set covering problem efficiently.