Research Article: A Genome-Wide Analysis of FRT-Like Sequences in the Human Genome

Date Published: March 23, 2011

Publisher: Public Library of Science

Author(s): Jeffry L. Shultz, Eugenia Voziyanova, Jay H. Konieczka, Yuri Voziyanov, Robert Oshima.

Abstract: Efficient and precise genome manipulations can be achieved by the
Flp/FRT system of site-specific DNA recombination.
Applications of this system are limited, however, to cases when target sites for
Flp recombinase, FRT sites, are pre-introduced into a genome
locale of interest. To expand use of the Flp/FRT system in
genome engineering, variants of Flp recombinase can be evolved to recognize
pre-existing genomic sequences that resemble FRT and thus can
serve as recombination sites. To understand the distribution and sequence
properties of genomic FRT-like sites, we performed a
genome-wide analysis of FRT-like sites in the human genome
using experimentally derived parameters. Out of 642,151 identified
FRT-like sequences, 581,157 sequences were unique and
12,452 sequences had at least one exact duplicate. Duplicated
FRT-like sequences are located mostly within LINE1, but
also within LTRs of endogenous retroviruses, Alu repeats and other repetitive
DNA sequences. The unique FRT-like sequences were classified
based on the number of matches to FRT within the first four
proximal base pairs of the Flp binding elements of FRT and the
nature of mismatched base pairs in the same region. The data obtained will be
useful for the emerging field of genome engineering.
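
The two bookkeeping steps described in the abstract, separating unique from duplicated sequences and counting matches to FRT near the spacer, can be illustrated with a minimal Python sketch. It is not the authors' program. It assumes the widely cited 34-bp minimal FRT site (13-bp arm, 8-bp spacer, 13-bp arm), and it assumes that "proximal" refers to the four positions of each Flp binding element adjacent to the spacer. The function names are hypothetical.

```python
from collections import Counter

# Widely cited minimal wild-type FRT site: 13-bp arm + 8-bp spacer + 13-bp arm.
LEFT_ARM = "GAAGTTCCTATTC"
SPACER = "TCTAGAAA"
RIGHT_ARM = "GTATAGGAACTTC"
FRT = LEFT_ARM + SPACER + RIGHT_ARM

# Assumption: the "proximal" positions are the four bases of each arm
# that sit next to the spacer (indices 9-12 and 21-24 in the 34-bp site).
left_end = len(LEFT_ARM)
right_start = left_end + len(SPACER)
PROXIMAL = list(range(left_end - 4, left_end)) + list(range(right_start, right_start + 4))


def proximal_matches(site: str) -> int:
    """Count how many of the 8 spacer-proximal positions match wild-type FRT."""
    site = site.upper()
    if len(site) != len(FRT):
        raise ValueError("expected a site of the same length as FRT")
    return sum(site[i] == FRT[i] for i in PROXIMAL)


def split_unique_and_duplicated(sites):
    """Return (sequences seen exactly once, distinct sequences seen more than once)."""
    counts = Counter(s.upper() for s in sites)
    unique = [s for s, n in counts.items() if n == 1]
    duplicated = [s for s, n in counts.items() if n > 1]
    return unique, duplicated
```

Under these assumptions, a candidate site can be binned by its `proximal_matches` score after exact duplicates have been set aside with `split_unique_and_duplicated`.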

Partial Text: Site-specific tyrosine recombination systems, such as Flp/FRT,
Cre/loxP and related systems, catalyze conservative DNA rearrangements between their
cognate recombination target sites [1], [2], [3]. By manipulating the relative location and orientation of
the recombination target sites, genome rearrangements catalyzed by recombinases can
include integration, excision, inversion or recombinase-mediated cassette exchange
(RMCE). As these recombination systems are active in all cell types tested, they
have become popular molecular tools for directed genome rearrangements, including
specific DNA insertions or targeted DNA deletions in chromosomes, DNA translocation,
gene replacement as well as expression of proteins from selected chromosomal
locales, excision of large chromosomal DNA segments for sequencing, rescue of
pathogenic islands and production of biofactories [1], [2], [4], [5], [6].

The main goal of the present work was to gain insight into sequence properties and
distribution of FRT-like sequences in the human genome. To
accomplish this goal, we completed three tasks: (1) we developed a computer program able
to quickly scan a mammalian genome for target-like sequences for tyrosine
recombinases; (2) we analyzed genome-wide distribution of FRT-like
sequences in the human genome; and (3) we sorted the identified
FRT-like sequences into groups that have common sequence