We chose three preamplification reactions to enrich the target sequences for Fluidigm PCR based on our experience from a recent study using DNA samples from formalin-fixed, paraffin-embedded (FFPE) tissues [9]. Design multiplex PCR primers subpool by subpool using interactions that are below a set threshold. PrimerPooler is the only software able to divide the primer sets into a user-specified number of subpools.
An example screenshot of the program after genome scan is shown in, After the analysis, PrimerPooler displays the best results found. Common sequences such as barcode or index tags that are to be added at the 5′ end of each forward and reverse primer should be presented as a separate pair of primer sequences in the input text file, named tagF and tagR. Using the above PrimerPooler, 1153 primer pairs designed to cover 53 genes were assigned to three preamplification pools (388, 389 and 376 primer pairs each) and then 144 subpools of six- to nine-plex PCR for Fluidigm Access Array PCR.
Two different amounts of DNA sample (50 ng and 75 ng) from each specimen were used for preamplification, and 1 µl of each of two dilutions of the preamplified products (2.5- and 5-fold) was used for Fluidigm Access Array PCR.
Only the highest interaction of each inter-/intra-primer interaction is counted and scored according to the number of nucleotide matches and mismatches (matching score) or according to the estimated thermodynamic value ΔG (Gibbs free energy; kcal/mol). To overcome the low copy number of intact template DNA, a preamplification step is often used with a large number of primer pairs in a single reaction, followed by multiplex PCR to generate sufficient amplicons for next-generation sequencing. A list of the primer pairs in each subpool can then be written to a file. For (2) and (3), the index sequences for primers are represented by black and blue lines attached to the 5′ ends of forward and reverse primers, respectively. Ideally, each preamplification and multiplex PCR will be composed of primer pairs least likely to form any potential undesired interactions.
A degenerate primer contains at least one base whose value is uncertain. Allawi and SantaLucia [18]. This metaphor allows the code to reason about which amplicons are in progress at the time a new amplicon starts.
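To make the notion of a degenerate primer concrete, the following minimal sketch expands a primer written with standard IUPAC ambiguity codes into all of its non-degenerate variants. The function name `expand_degenerate` is hypothetical and is not taken from PrimerPooler; the sketch only illustrates what converting a degenerate primer into non-degenerate form involves.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def expand_degenerate(primer):
    """Return every non-degenerate sequence a degenerate primer can stand for.

    Hypothetical helper for illustration only.
    """
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]

variants = expand_degenerate("ACRT")  # R stands for A or G
```

The number of variants grows multiplicatively with each ambiguous base, which is one reason a tool may prefer to handle degenerate bases directly rather than expand them.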
PrimerPooler distributed the primers into a user-defined number of subpools with interaction stability scores lower than those produced by other primer grouping programs.
It is often necessary to allocate the set of primers into subpools, a common issue being potential cross-hybridization. We assessed PrimerPooler's interaction analysis by comparing it with two of the most commonly used alternatives: AutoDimer [19] (http://www.cstl.nist.gov/div831/strbase/AutoDimerHomepage/DownloadPage.htm) and Multiple Primer Analyzer from ThermoFisher Scientific (https://www.thermofisher.com/uk/en/home/brands/thermo-scientific/molecular-biology/molecular-biology-learning-center/molecular-biology-resource-library/thermo-scientific-web-tools/multiple-primer-analyzer.html; hereafter MPA).
We would like to thank Sarah Moody for proofreading an earlier version of the manuscript. Schematic showing the three parts of PrimerPooler. PrimerPooler consists of three steps (Fig. Incorrect input will lead to non-recognition of any primer pairs that form amplicons with length longer than the maximum input.
PrimerPooler outperformed BLAT by performing the genome mapping of 2306 primer sequences in a single run, taking under 3 min to complete the genome mapping and display the coordinates of the primers and overlap analysis result from scratch (Table 3). Since that state is not necessarily the best overall state, it is saved, and then a random number of random bad moves are made before starting the iteration again to see if it now converges on a better state.
This option reads a .2bit or FASTA genome file of any species, e.g. Mapping of large-scale short sequences onto a genome is still considered a great challenge. Once the text file has been successfully loaded, the software reads the whole file and displays the number of primers, the minimum and maximum primer lengths, the maximum length of the tag sequences and the expected maximum length of the tagged primers in the list. Swapping is stopped once the interactions of each subpool cannot be minimized any further, or at the user's request. The depth coverage and variant calling were also performed as previously described [9]. If some primers are longer than 64 bases, the inner loop is slowed by the need to manipulate multiple 64-bit registers to contain everything. The sequencing coverage data are shown in,

Direct selection of human genomic loci by microarray hybridization
Solution hybrid selection with ultra-long oligonucleotides for massively parallel targeted sequencing
Multiplex amplification enabled by selective circularization of large sets of genomic DNA fragments
Primer-BLAST: a tool to design target-specific primers for polymerase chain reaction
Primique: automatic design of specific PCR primers for each sequence in a family
QuantPrime: a flexible tool for reliable high-throughput primer design for quantitative PCR
MRPrimer: a MapReduce-based method for the thorough design of valid and ranked primers for PCR
Somatic mutation screening using archival formalin-fixed, paraffin-embedded tissues by Fluidigm multiplex PCR and Illumina sequencing
MPprimer: a program for a reliable multiplex PCR primer design
MultiPLX: automatic grouping and evaluation of PCR primers
Multiplex degenerate primer design for targeted whole genome amplification of many viral genomes
Predicting DNA duplex stability from the base sequence
Improved nearest-neighbor parameters for predicting DNA duplex stability
Improved thermodynamic parameters and helix initiation factor to predict stability of DNA duplexes
Thermodynamics and NMR of internal G·T mismatches in DNA
AutoDimer: a screening tool for primer-dimer and hairpin structures
Addressing print disabilities in adult foreign-language acquisition, Proceedings of the 10th International Conference on Human-Computer Interaction
Capillary electrophoresis as a tool for optimization of multiplex PCR reactions

However, pooling a number of primers together in the same reaction invariably creates chances for undesired primer interactions, which adversely affect amplification and consequently sequence coverage [9].
Richards [13] showed it is possible to solve chess board problems in parallel using bit fields: the 64 bits in a 64-bit register can represent the 64 squares of a chess board. This fundamental hill-climbing algorithm can quickly reach a local maximum: a state in which no further improvements can be made on the next move.
PrimerPooler performs inter-/intra-primer hybridization analysis to identify the adverse interactions, as well as simultaneous mapping of all primers onto a genome sequence in a single run without requiring a prior index of the genome. The detailed algorithms for genome mapping and overlapping analysis are described in the Algorithms section below. In our previous study, based on empirical tests, we successfully performed four- to six-plex PCR using the Fluidigm 48.48 Access Array and achieved adequate coverage following Illumina MiSeq sequencing. Using our smaller data set, the three tools gave the same inter-/intra-primer interactions, albeit with the need to convert degenerate primers into non-degenerate form for AutoDimer. Since this corresponds to the binary operators XNOR (. Swapping is stopped once the program empirically observes that the interactions of each subpool are not being minimized any further, or at the user's request.
PCR amplification-based target enrichment is commonly used when the number of regions to be studied is relatively low, and the DNA sample available is suboptimal in quantity (small tissue or cytology specimen after routine histological diagnosis) or quality (formalin-fixed paraffin-embedded tissue/cytology specimen, circulating tumour DNA). For smaller numbers of primer pairs, pool division can be done manually using trial and error, but this becomes impractical when dealing with large numbers of primer pairs. For (1), the common sequences for both forward and reverse primers used as indexing/barcoding are underlined.
The running time of each part of the program is noted to show its efficiency. Comparison of available tools for constructing libraries for next-generation sequencing.
The source code and executables are freely downloadable from http://ssb22.user.srcf.net/pooler/.
Each step of the program uses new algorithms to achieve fast and accurate performance: The inter-/intra-primer hybridization analysis uses bit-pattern techniques [13] for unprecedented speed. PrimerPooler also sorts and summarizes the number of interactions for each ΔG/score range, with optional bonding diagrams. This is a technical section provided by the corresponding author who wrote the code. Unrecognized pairs are displayed as a warning, as the program is not able to verify that they will not overlap with other pairs in their subpools. If no better state can be found within a fixed number of iterations, the previously saved state becomes the final answer. This is problematic for methodologies where it is the number of subpools, rather than the interaction threshold, that is fixed.
MPprimer uses a stringent cutoff value of 7 kcal/mol when grouping primer pairs into subgroups [10]. The software can recognize a primer pair as part of an amplicon as long as the primers are labelled with the same name apart from the last letter, where F is assumed to stand for the forward and R for the reverse primer in each pair. For smaller numbers of primers, pool division can be done manually using trial and error to minimize potential hybridization, but this becomes inefficient and time-consuming with increasing numbers of primer pairs. If all this searching is performed using 64-bit registers, it can be very fast indeed: the entire genome can be searched for thousands of primers within a few minutes. If one or more primers in an interaction are degenerate, they are represented using four registers instead of three: Two DNA samples from FFPE tissues were subjected to preamplification, Fluidigm Access Array multiplex PCR and Illumina MiSeq sequencing based on the primer combination suggested by the above PrimerPooler analyses. For example, consider the 2-bit format illustrated here. We detect overlapping amplicons by reading the .2bit or FASTA genome file.
PrimerPooler begins by analysing inter-/intra-primer interactions for all the user-supplied primers, taking into account the common sequence tagged at the 5′ end of each primer in the list, using bit-pattern techniques [13] for unprecedented speed (see Algorithms section below for details).
A similar technique can be applied to DNA bases. The presence of overlapping primer pairs can lead to the generation of short amplicons, preventing appropriate amplification of the full targeted sequences. The user can save the list of primer pairs in each subpool to a single text file or to separate files. Silas S. Brown, Yun-Wen Chen, Ming Wang, Alexandra Clipson, Eguzkine Ochoa, Ming-Qing Du, PrimerPooler: automated primer pooling to prepare library for targeted sequencing, Biology Methods and Protocols, Volume 2, Issue 1, January 2017, bpx006, https://doi.org/10.1093/biomethods/bpx006. A total of 1153 primer pairs were designed for mutation screening of 53 genes in lymphoma using Fluidigm Access Array PCR and Illumina MiSeq sequencing. The second set included an additional 815 PCR primer pairs, totaling 2324 primer sequences from 56 genes.
Based on the PrimerPooler analyses, these primer pairs were divided into three pools (388, 389 and 376 primer pairs each) to avoid any major potential undesired interaction. Example screenshot of PrimerPooler for genome mapping of primers.
Example screenshots of the program running the automated primer pair distribution are shown in, It can be seen that two bases will bond if and only if their A|T digits are equal and their G|T digits are not.
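The bonding rule stated above (A|T digits equal, G|T digits different) can be checked for every position of two aligned sequences at once with an XNOR and an XOR. The sketch below is an illustration only and is not PrimerPooler's code: the function names, the use of Python integers in place of 64-bit registers, and the third "valid" mask marking which bit positions hold real bases are all assumptions. Bases are compared position by position, so a caller would first put the two primers into the relative orientation and offset being tested.

```python
def encode(primer):
    """Encode a non-degenerate primer as (A|T bits, G|T bits, valid mask).

    Hypothetical encoding: bit i of `at` is 1 for A or T, bit i of `gt`
    is 1 for G or T, and `valid` marks which bit positions hold a base.
    """
    at = gt = 0
    for base in primer.upper():
        at = (at << 1) | (base in "AT")
        gt = (gt << 1) | (base in "GT")
    valid = (1 << len(primer)) - 1
    return at, gt, valid

def bond_mask(p, q):
    """Return a bit mask with 1 wherever the aligned bases of p and q bond."""
    at_p, gt_p, valid_p = p
    at_q, gt_q, valid_q = q
    same_at = ~(at_p ^ at_q)   # XNOR: A|T digits equal
    diff_gt = gt_p ^ gt_q      # XOR: G|T digits differ
    return same_at & diff_gt & valid_p & valid_q

# Count bonded positions between two aligned 4-base sequences.
mask = bond_mask(encode("ACGT"), encode("TGCA"))
bonds = bin(mask).count("1")
```

Because the whole comparison is a handful of bitwise operations, testing a different alignment offset only requires shifting one operand before calling `bond_mask` again.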
The four dimensions of the pool score are: the maximum matching score of any single interaction (if using ΔG, the program arbitrarily transforms it as score = int(2 ΔG) so that it is split into ranges of width 0.5); the number of interactions of that score/range; the number of interactions scoring 1 less than that score/range; and. We developed PrimerPooler, which automates swapping of primer pairs between any user-defined number of subpools to obtain combinations with low interaction potential. This allows detection of overlapping primer pairs and allocation of these primer pairs into separate subpools where tiling approaches are used. After being presented with a suggested number of subpools (determined by a single-pass threshold-based packing using a ΔG threshold of 7 and avoiding overlaps), the user is asked how many pools to use, and is given the option of setting a maximum size for each pool in order to obtain more even subpools. Targeted sequencing of genomic regions of interest is a common application in biomedical research. PrimerPooler takes a simple text file with multiple primer sequences in FASTA format as the primer list input. The PCR product purification, barcoding, library preparation and Illumina MiSeq sequencing were carried out as described previously [9].
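The single-pass threshold-based packing used to suggest a number of subpools can be pictured as a first-fit procedure. The sketch below is a simplified illustration under stated assumptions, not the program's actual routine: interaction strengths are given as non-negative scores where higher means a stronger unwanted interaction, pairs with no recorded interaction score 0, the amplicon-overlap check is omitted, and the name `suggest_pools` is hypothetical.

```python
def suggest_pools(items, inter, threshold):
    """First-fit packing: put each item in the first pool where every
    interaction with existing members is below `threshold`.

    `inter` maps frozenset({a, b}) to an interaction score (assumed scale:
    higher is worse). Returns the list of pools; its length is the
    suggested number of subpools.
    """
    pools = []
    for item in items:
        for pool in pools:
            if all(inter.get(frozenset((item, other)), 0) < threshold
                   for other in pool):
                pool.append(item)
                break
        else:
            pools.append([item])  # no existing pool is compatible
    return pools

inter = {frozenset(("a", "b")): 9, frozenset(("c", "d")): 8}
pools = suggest_pools(["a", "b", "c", "d"], inter, threshold=7)
```

A single pass like this gives only a starting suggestion; the result depends on input order, which is why the user is still asked to confirm the number of pools.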
MPA was able to show all potential interactions in a few seconds for both primer sets but does not rank them. PrimerPooler was written in C and is distributed under the GNU General Public License.
Users can obtain reports on overlapping primer pairs with coordinates of the corresponding amplicons. During the analysis, the user is able to see the summary of the total number of inter-primer interactions for each score/ΔG range and the number of primers in each subpool, for each swap. All primer pairs' 4D pool scores are updated the moment any pair is added to or removed from a pool; this can be done very quickly if the list of other primers affected by a pair (and their interaction values) is obtained from a lookup table created using the first part of the program, so that primer interactions do not need repeated recalculation.
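One reason to store a 4D pool score as four 16-bit fields in a single 64-bit integer is that two scores can then be ranked with one integer comparison. The sketch below illustrates that idea only. The field order (most significant field first), the saturation at 65535, and the function names are assumptions, and the fourth field is left as an unnamed placeholder because its definition is not given in the text above.

```python
FIELD_BITS = 16
FIELD_MAX = (1 << FIELD_BITS) - 1

def pack_pool_score(max_score, n_at_max, n_one_below, fourth=0):
    """Pack four non-negative fields into one integer, most significant first.

    Values above 65535 are clamped (an assumption made for this sketch).
    """
    packed = 0
    for field in (max_score, n_at_max, n_one_below, fourth):
        packed = (packed << FIELD_BITS) | min(field, FIELD_MAX)
    return packed

def unpack_pool_score(packed):
    """Recover the four fields from a packed score."""
    return tuple((packed >> shift) & FIELD_MAX for shift in (48, 32, 16, 0))

# A pool whose worst interaction scores 5 ranks better (lower) than one
# whose worst interaction scores 6, whatever the counts are.
better = pack_pool_score(5, 200, 7)
worse = pack_pool_score(6, 1, 0)
is_better = better < worse
```

With this layout, comparing packed values is equivalent to comparing the four fields lexicographically, so the worst single interaction dominates and the counts act as tie-breakers.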
The software, together with an optimized experimental protocol, offers a reliable and efficient way to choose and combine a large number of primers for PCR-based target enrichment, ensuring representative amplification and sequencing coverage.
This binary search is then applied at each position of the genome, to answer the question "which, if any, of our primer variants have their last N bases matching the N bases we've just seen in the genome?"
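The lookup described above can be sketched as a rolling 2-bit window over the genome combined with a binary search in a sorted table of packed primers. This is a simplified illustration, not the program's implementation: it handles only the bases A, C, G and T on one strand, makes one pass over the genome per distinct primer length rather than a single pass, and the names `pack` and `find_primer_hits` are hypothetical.

```python
from bisect import bisect_left

CODE = {"A": 0, "C": 1, "G": 2, "T": 3}

def pack(seq):
    """Pack a sequence into an integer, 2 bits per base."""
    value = 0
    for base in seq:
        value = (value << 2) | CODE[base]
    return value

def find_primer_hits(genome, primers):
    """Return (primer name, 0-based start) for every exact forward match."""
    by_len = {}
    for name, seq in primers.items():
        by_len.setdefault(len(seq), []).append((pack(seq), name))
    hits = []
    for n, table in by_len.items():
        table.sort()
        keys = [key for key, _ in table]
        mask = (1 << (2 * n)) - 1
        window = 0
        for pos, base in enumerate(genome):
            # Keep only the last n bases seen so far.
            window = ((window << 2) | CODE[base]) & mask
            if pos + 1 < n:
                continue
            i = bisect_left(keys, window)  # binary search in sorted table
            while i < len(keys) and keys[i] == window:
                hits.append((table[i][1], pos + 1 - n))
                i += 1
    return sorted(hits, key=lambda hit: hit[1])

hits = find_primer_hits("ACGTACGTTTGA",
                        {"p1": "CGTA", "p2": "TTGA", "p3": "GGGG"})
```

Reverse-strand matches could be found by adding each primer's reverse complement to the table as an extra variant, which is again an assumption about usage rather than a description of the tool.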
The results are displayed on screen and/or saved in text files. We used a hill-climbing algorithm from multiple random starting points, and parallelized on multicore CPUs, for quick performance. This includes (1) inter-/intra-primer hybridization analysis, (2) genome mapping and (3) the automated swapping system. The experiments were carried out essentially as described previously with the following modifications [9]. This is because the user can set a maximum amplicon length: on encountering an amplicon start event, the program needs to look ahead in the event list to find a corresponding end event and verify the amplicon length will fall within the limit before marking that amplicon as in progress. Initially, each amplicon is allocated a pool at random, and is also given a four-dimensional (4D) pool score, implemented as four 16-bit fields in a 64-bit integer, to assess the interactions between this amplicon's primers and every other primer in its pool. Using PrimerPooler, the 1153 primer pairs were assigned to 144 subpools containing six to nine primer pairs with ΔG values weaker than 1.5 at 60 °C in 10 min (the user can define the temperature and other parameters, Table 3), corresponding to three Fluidigm Access Array PCRs. To detect amplicon overlaps in the genome, the occurrences of primers that start and end the amplicons are treated as events in a hypothetical sequential read through each chromosome. In addition to tracking a pool score for each amplicon in its current pool, the program also tracks what its pool score would become if it were moved to each of the other pools. This requires careful primer design, ensuring all the primers have similar length, GC content, melting temperature, etc.
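The event-based overlap detection described above can be sketched as a sweep over sorted start and end events on one chromosome. The code below is a minimal illustration under assumptions: amplicon coordinates are already known as half-open (start, end) intervals, strand handling is ignored, amplicons longer than the user's maximum are simply skipped, and the name `find_overlaps` is hypothetical.

```python
def find_overlaps(amplicons, max_len):
    """Return pairs of amplicon names whose intervals overlap.

    `amplicons` maps a name to (start, end) on a single chromosome.
    Amplicons longer than `max_len` are not recognized and are skipped.
    """
    events = []
    for name, (start, end) in amplicons.items():
        if end - start > max_len:
            continue  # too long: treated as unrecognized in this sketch
        events.append((start, 1, name))  # 1 marks a start event
        events.append((end, 0, name))    # 0 marks an end event
    # At equal positions, end events sort before start events, so
    # amplicons that merely touch are not reported as overlapping.
    events.sort()
    in_progress = set()
    overlaps = []
    for _pos, is_start, name in events:
        if is_start:
            overlaps.extend((other, name) for other in sorted(in_progress))
            in_progress.add(name)
        else:
            in_progress.discard(name)
    return overlaps

overlaps = find_overlaps(
    {"a": (0, 100), "b": (80, 180), "c": (200, 300), "d": (150, 5000)},
    max_len=1000,
)
```

Here the length check is applied up front instead of by looking ahead in the event list, which keeps the sketch short but serves the same purpose of never marking an over-length amplicon as in progress.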
The program works from an ordered event list somewhat like that used in Gradint [20], and the in-progress array uses two bit fields to distinguish between negative-strand-first and positive-strand-first primer pairs and ensure that the opposite primer orientation is what ends the pair. The running time of the main functions of each step was measured for the two data sets (Table 3). However, the amplicon events are not actually processed while the genome is being read by the program; they are processed in a separate, simulated pass after the actual reading is complete. At each iteration, all primer pairs are searched for the best-ranking move to another pool, and, provided this move provides above-zero benefit, it is performed.
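The iteration described above (take the best single move while it helps, save the resulting state, perturb it, and try again) can be sketched as follows. This is a deliberately simplified illustration, not PrimerPooler's algorithm: it minimizes a plain sum of interaction scores within pools instead of the 4D pool score, recomputes the cost from scratch instead of using a lookup table, perturbs with random moves that are not necessarily bad ones, and all names and parameters are hypothetical.

```python
import random

def pool_cost(assign, inter):
    """Sum the interaction scores of all pairs that share a pool."""
    return sum(score for (a, b), score in inter.items()
               if assign[a] == assign[b])

def hill_climb(items, inter, n_pools, rounds=20, seed=0):
    """Greedy best-move descent with random perturbation between rounds."""
    rng = random.Random(seed)
    assign = {item: rng.randrange(n_pools) for item in items}
    best, best_cost = dict(assign), pool_cost(assign, inter)
    for _ in range(rounds):
        while True:
            move, move_cost = None, pool_cost(assign, inter)
            for item in items:
                home = assign[item]
                for pool in range(n_pools):
                    if pool == home:
                        continue
                    assign[item] = pool  # try the move
                    cost = pool_cost(assign, inter)
                    if cost < move_cost:
                        move, move_cost = (item, pool), cost
                assign[item] = home      # undo the trial
            if move is None:
                break                    # local optimum reached
            assign[move[0]] = move[1]    # perform the best move
        cost = pool_cost(assign, inter)
        if cost < best_cost:
            best, best_cost = dict(assign), cost
        # Perturb: a random number of random moves before the next round.
        for _ in range(rng.randint(1, 3)):
            assign[rng.choice(items)] = rng.randrange(n_pools)
    return best, best_cost

items = ["p1", "p2", "p3", "p4"]
inter = {("p1", "p2"): 9, ("p3", "p4"): 8, ("p1", "p3"): 1}
best, best_cost = hill_climb(items, inter, n_pools=2)
```

In this toy input the two strongly interacting pairs end up separated; whether the weak p1/p3 interaction is also avoided depends on which local optimum the perturbation rounds happen to reach.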