Note that a new beta version of Gambit is available; it has no expiration date.

AN INTRODUCTION TO BOOTSTRAPPERS GAMBIT

(Copyright 2006 J.A. Lake, All Rights Reserved)

          It's not generally appreciated that molecular sequence analysis is a field in its infancy. Thus it is an inexact science, in which there are few analytical tools that are based on general mathematical principles. As a result many, perhaps most, phylogenetic trees reconstructed from molecular sequences are incorrect, because they make mathematical assumptions that are not met by the data being analyzed. Frequently these incorrect assumptions lead to long branch attraction.

Long Branch Attraction

          Long branch attraction can be caused by one or more of three pitfalls of sequence analysis. For all three the effect is the same: in trees artifactually produced by long branch attraction, rapidly evolving sequences (represented by long branches on unrooted phylogenetic trees) are placed with other rapidly evolving sequences, even if the sequences are only distantly related. Unlike most problems in molecular biology, which can be solved by acquiring more data, long branch attractions are diabolical. When long branch attraction is present, using longer sequences only makes the incorrect solution more strongly supported.

          Specifically, the mathematical steps in sequence analysis that produce these pitfalls are: i. incorrect sequence alignments, caused by inadequate mathematical models and often related to biases created by progressive alignment algorithms when they are used to align more than three taxa (organisms); ii. the failure to account properly for site-to-site variation (sites within a sequence can evolve at different rates, yet most algorithms assume they all evolve at the same rate); and iii. unequal rate effects (the inability of most tree-building algorithms to produce good phylogenetic trees when genes from different taxa in the tree evolve at different rates). Of the three pitfalls, alignment artifacts are potentially the most serious, because even if the second and third problems are solved, misalignments can still produce incorrect trees. General algorithms are available for pitfalls two (site-to-site variation) and three (unequal rate effects) and are incorporated in the Gambit program, but none are available for the alignment problem. Gambit contains algorithms not significantly affected by site-to-site variation or by unequal rate effects. Specifically, paralinear (logdet) distances are a truly additive method for determining distances between sequences. Because paralinear distances are based on a very general Markov model, they are not significantly affected by unequal rate effects. Pattern Filtering is a demonstrably optimal method for estimating the variation of rates at different sequence sites and, as such, is not significantly affected by site-to-site variation. Both of these methods are available in Gambit.
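
          As an illustration of the paralinear distance mentioned above, the following is a minimal Python sketch. It assumes the commonly cited form d = -ln[ det(J) / sqrt(det(Dx) det(Dy)) ], where J is the 4 x 4 matrix of paired nucleotide counts for two aligned sequences and Dx, Dy are diagonal matrices of each sequence's nucleotide counts. The function names and the rule of skipping gaps and ambiguous characters are assumptions made for this example; this is not Gambit's own code.

```python
import math

NUCLEOTIDES = "ACGT"


def determinant(matrix):
    """Determinant of a square matrix by Gaussian elimination with pivoting."""
    m = [[float(v) for v in row] for row in matrix]
    n = len(m)
    det = 1.0
    for col in range(n):
        # Choose the largest remaining entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det  # a row swap flips the sign
        det *= m[col][col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n):
                m[row][k] -= factor * m[col][k]
    return det


def paralinear_distance(seq_x, seq_y):
    """Paralinear (logdet-style) distance between two aligned DNA sequences.

    Sites where either sequence has a character other than A, C, G, T
    are skipped (an assumption of this sketch).
    """
    if len(seq_x) != len(seq_y):
        raise ValueError("sequences must be aligned to the same length")
    index = {base: i for i, base in enumerate(NUCLEOTIDES)}
    joint = [[0] * 4 for _ in range(4)]  # J: counts of (base in x, base in y)
    for a, b in zip(seq_x.upper(), seq_y.upper()):
        if a in index and b in index:
            joint[index[a]][index[b]] += 1
    # Diagonal matrices Dx and Dy hold the row and column totals of J,
    # so their determinants are simply the products of those totals.
    det_dx = math.prod(sum(row) for row in joint)
    det_dy = math.prod(sum(joint[i][j] for i in range(4)) for j in range(4))
    det_j = determinant(joint)
    if det_j <= 0 or det_dx == 0 or det_dy == 0:
        raise ValueError("distance undefined for these sequences")
    return -math.log(det_j / math.sqrt(det_dx * det_dy))
```

          Identical sequences give a distance of zero, and the distance is undefined (the sketch raises an error) when the paired-count matrix has a non-positive determinant, for example when one of the four nucleotides is absent.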

Tree Reconstruction

          Determining globally optimal, multi-taxon phylogenetic trees is also computationally intensive because the number of possible trees increases rapidly with the number of taxa. (For four taxa, three unrooted trees must be compared, whereas for thirteen taxa, 13,749,310,575 trees must be compared.) Given such large numbers, it is difficult to search exhaustively trees of more than 12 to 13 taxa, even using the branch and bound algorithm (4). Gambit approaches this problem in a unique way. Once Gambit finds a solution (using heuristic methods), it uses the data to estimate the probability that a better solution exists (5). Gambit then accepts only solutions for which a better solution is unlikely (at either the 95% or 99% confidence level). With these methods it is possible to calculate "best" trees in reasonable times for 15 to 30 taxa, depending upon the sequence data.
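
          The tree counts quoted above follow from the standard result that n taxa have (2n-5)!! = 1 x 3 x 5 x ... x (2n-5) distinct unrooted, fully bifurcating trees. A short Python sketch (the function name is chosen for this example) reproduces the figures:

```python
def num_unrooted_trees(n_taxa):
    """Number of distinct unrooted bifurcating trees for n_taxa labeled taxa.

    Uses the double factorial (2n - 5)!! = 1 * 3 * 5 * ... * (2n - 5).
    """
    if n_taxa < 3:
        raise ValueError("at least three taxa are needed for an unrooted tree")
    count = 1
    for odd in range(3, 2 * n_taxa - 4, 2):  # odd factors 3, 5, ..., 2n - 5
        count *= odd
    return count


for n in (4, 13):
    print(n, num_unrooted_trees(n))
# 4 3
# 13 13749310575
```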

          An additional difficulty in constructing multi-taxon trees is that many different optimality criteria are used for evaluating the "best" tree. For example, distance trees can be reconstructed by searching for local minima using least-squares criteria, or by the criterion of minimum distance, whereas parsimony methods minimize the number of nucleotide changes, often using global searches. Bootstrappers Gambit is a multi-taxon tree reconstruction algorithm designed so that it can be used with most, if not all, phylogenetic methods. It uses a probability criterion as a common basis for comparing trees derived using diverse methods.

          Bootstrappers Gambit combines various algorithms for phylogenetic analysis into a single package. The program is designed for personal computers and runs on Windows operating systems. In addition to paralinear distances, the phylogenetic reconstruction methods accommodated in Gambit are: Jukes-Cantor distances, Kimura two-parameter distances, a six-parameter distance method based on the evolutionary parsimony assumptions (Lake, unpublished), maximum parsimony, evolutionary parsimony, and a symmetric transversion parsimony. Other algorithms, such as maximum likelihood, are being added.
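
          For readers unfamiliar with the simpler distance measures listed above, here is a minimal Python sketch of the textbook Jukes-Cantor and Kimura two-parameter formulas. With p the proportion of differing sites, P the proportion of transitions, and Q the proportion of transversions, they are d = -(3/4) ln(1 - 4p/3) and d = -(1/2) ln(1 - 2P - Q) - (1/4) ln(1 - 2Q). The function names and the handling of gaps (sites with characters other than A, C, G, T are skipped) are assumptions of this example, not a description of Gambit's implementation.

```python
import math

# Transitions are purine<->purine (A, G) or pyrimidine<->pyrimidine (C, T).
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def count_differences(seq_x, seq_y):
    """Return (sites compared, transitions, transversions)."""
    if len(seq_x) != len(seq_y):
        raise ValueError("sequences must be aligned to the same length")
    sites = transitions = transversions = 0
    for a, b in zip(seq_x.upper(), seq_y.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous characters
        sites += 1
        if a == b:
            continue
        if (a, b) in TRANSITIONS:
            transitions += 1
        else:
            transversions += 1
    if sites == 0:
        raise ValueError("no comparable sites")
    return sites, transitions, transversions


def jukes_cantor_distance(seq_x, seq_y):
    """Jukes-Cantor distance: d = -(3/4) ln(1 - 4p/3)."""
    sites, ts, tv = count_differences(seq_x, seq_y)
    p = (ts + tv) / sites
    arg = 1.0 - 4.0 * p / 3.0
    if arg <= 0:
        raise ValueError("sequences too divergent for Jukes-Cantor")
    return -0.75 * math.log(arg)


def kimura_2p_distance(seq_x, seq_y):
    """Kimura two-parameter distance from transition/transversion proportions."""
    sites, ts, tv = count_differences(seq_x, seq_y)
    P, Q = ts / sites, tv / sites
    arg1, arg2 = 1.0 - 2.0 * P - Q, 1.0 - 2.0 * Q
    if arg1 <= 0 or arg2 <= 0:
        raise ValueError("sequences too divergent for Kimura two-parameter")
    return -0.5 * math.log(arg1) - 0.25 * math.log(arg2)
```

          Both formulas assume that all sites evolve at the same rate, which is the assumption identified earlier as a source of site-to-site variation artifacts.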

Installing Gambit

          Installing Gambit onto your Windows personal computer is simple. You can download the four Gambit files as a single zipped file and unzip them; alternatively, you may choose to download the four files individually. (If you receive Gambit on disk, install it directly from the disk.)

          The four files of Gambit are:
  • ReadMe.pdf, which can be read with a variety of applications, including Adobe Acrobat Reader,
  • LOPH1294.CUT, a sample metazoan data set (slightly modified from Halanych, Bacheller, Aguinaldo, Hillis, and Lake, Science, 267, 1641-43, 1995),
  • GAM95.xyz, the Gambit program for phylogenetic analysis, and
  • SWAPC.xyz, which is useful for manipulating sequence files.

    You will need to replace the .xyz of the last two files with .exe after they are downloaded. You will then be able to run Gambit by double-clicking its icon.



              To obtain Gambit, please choose the appropriate option below.

    I am a commercial user.

    I am a non-commercial user.