ABBS 2005,37(12): cDNA Cloning, Sequence Analysis of the Porcine LIM and Cysteine-rich Domain 1 Gene

Research Paper

Acta Biochim Biophys Sin

2005, 37: 843–850

doi:10.1111/j.1745-7270.2005.00113.x

cDNA Cloning, Sequence Analysis of the Porcine LIM and Cysteine-rich Domain 1 Gene

Jun WANG, Chang-Yan DENG*,

Yuan-Zhu XIONG, Bo ZUO, Lei XING, Feng-E LI, Ming-Gang LEI, Rong ZHENG, and

Si-Wen JIANG

Key

Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of

Animal Sciences, Huazhong Agricultural University, Wuhan 430070, China

Received: June 20,

2005

Accepted: October

10, 2005

This work was

supported by the grants from the Major State Basic Research Development Program

of China (G2000016105), the National High Technology Development Program of

China and the Natural Science Foundation of Hubei Province (2005ABA142)

*Corresponding author: Tel,

86-27-87287390; Fax, 86-27-87394184; E-mail, [email protected]

Abstract        LIM domain proteins are important regulators in cell

growth, cell fate determination, cell differentiation and remodeling of the

cell cytoskeleton by their interaction with various structural proteins, kinases

and transcriptional regulators. Using molecular biology combined with in

silico cloning, we have cloned the complete coding sequence of pig LIM and

the cysteine-rich domain 1 gene (LMCD1) which encodes a 363 amino acid

protein. The estimated molecular weight of the LMCD1 protein is 40,788 Da with

a pI of 8.39. It was found to be highly expressed in both skeletal muscle and

cardiac muscle. Alignment analysis revealed that the deduced protein sequence

shares 86%, 91% and 93% homology with that of its human, mouse and rat

counterparts, respectively. The LMCD1 protein was predicted by bioinformatics

software to contain a novel cysteine-rich domain in the N-terminal region, two

LIM domains in the C-terminal region, nine potential protein kinase C

phosphorylation sites, seven casein kinase II phosphorylation sites, a

tyrosine kinase phosphorylation site, seven N-myristoylation sites and a

single potential N-glycosylation site, which is

similar to the protein’s human counterpart. A phylogenetic tree was constructed

by aligning the amino acid sequences of the LIM domain from different species.

In addition, four base mutations were detected by comparing the sequences of

Large White pigs with those of Chinese Meishan pigs. The G294A mutation site

was confirmed by polymerase chain reaction-single-strand conformation

polymorphism analysis. Its allele frequencies were studied in five pig breeds.

Key words        LIM and cysteine-rich domain 1 (LMCD1); SSCP; PCR; gene

expression; data analysis

LIM domain proteins are defined as proteins having a double zinc

finger motif with the consensus amino acid sequence C-X2-C-X16–23-H-X2-C-X2-C-X16–23-C-X2-C (where C represents

cysteine, H represents histidine and X represents any other amino acid) [1]. They are important

regulators in cell growth, cell fate determination, cell differentiation and

remodeling of the cell cytoskeleton [2]. Bespalova and Burmeister [3] isolated

the human complete LMCD1 coding region and mapped this gene to 3p26-p24

by radiation hybrid mapping. Then the mouse LMCD1 gene was mapped to the

central region of chromosome 6 [3]. The predicted 365-amino acid LMCD1 protein

contains a cysteine-rich domain in the N-terminal region and two LIM domains in

the C-terminal region. It also has several potential phosphorylation and

N-myristoylation sites and a single potential N-glycosylation site. Northern

blot analysis detected a major 1.7 kb LMCD1 transcript in most of the

human adult and fetal tissues tested, with highest expression in skeletal

muscle. Little or no LMCD1 expression was found in fetal brain and

liver, or in adult brain, liver, thymus, small intestine, or peripheral blood

[3]. Recently, Ota et al. [4] and Strausberg et al. [5] have

isolated the human complete LMCD1 cDNA sequence, which contains six

exons and five introns. Furthermore, LIM proteins have been shown to take part in

the control of muscle genes [6]. The presence of LIM domains in the LMCD1

gene implies it may be involved in skeletal muscle protein-protein interactions

[1,7,8].

Therefore, the LMCD1 gene was selected as a candidate gene

for pig meat quality traits.
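The LIM consensus quoted above translates directly into a regular expression. The following is a minimal sketch, assuming that X may be any of the 20 standard amino acids; the function name and the toy sequence are hypothetical and are not taken from the porcine LMCD1 sequence.

```python
import re

# X in the consensus: any of the 20 standard amino acids (one-letter codes).
# This is an assumption; the text only says "other amino acids".
X = "[ACDEFGHIKLMNPQRSTVWY]"

# C-X2-C-X16-23-H-X2-C-X2-C-X16-23-C-X2-C, as written in the text.
LIM_CONSENSUS = re.compile(
    f"C{X}{{2}}C{X}{{16,23}}H{X}{{2}}C{X}{{2}}C{X}{{16,23}}C{X}{{2}}C"
)


def find_lim_motifs(protein):
    """Return (start, end) spans (0-based, end-exclusive) of consensus matches."""
    return [m.span() for m in LIM_CONSENSUS.finditer(protein.upper())]


# Synthetic toy sequence built to satisfy the pattern (NOT a real protein).
toy = "CAAC" + "A" * 17 + "HAAC" + "AAC" + "A" * 17 + "CAAC"
print(find_lim_motifs(toy))
```

A pattern scan of this kind only flags candidate regions; dedicated domain databases such as Prosite apply curated profiles and should be preferred for annotation.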

Materials and Methods

Tissue and blood samples

The tissue samples (heart, liver, lung, kidney, spleen, longissimus

dorsi muscle, fat, stomach and small intestine) were collected from

Large White, Landrace and Chinese local breed Meishan pigs at Jingping Pig

Station (Huazhong Agricultural University, Wuhan, China).

The blood samples of Yorkshire, Landrace and Chinese local breed

Meishan pigs were collected from Jingping Pig Station. The Chinese local breeds

Exi Hei and Wannanhua pigs were from scattered farms in Enshi Municipality

(Enshi, China) and Anhui Province, respectively.

Total RNA extraction and

genomic DNA isolation

Total RNA was extracted from different tissues with a Trizol kit

(Gibco, Carlsbad, USA). In cases where the samples were seriously contaminated

with genomic DNA, the total RNA was treated with DNase I before

first-strand cDNA synthesis.

Genomic DNA was isolated from the blood samples of Yorkshire,

Landrace, Meishan, Exi Hei and Wannanhua pigs using phenol/chloroform

extraction and ethanol precipitation [9].

Primer design

A number of pig expressed sequence tags (ESTs) were initially identified by running a Blast search against the GenBank “EST-others” database, using the cDNA sequence of human LMCD1 (GenBank accession No. NM_014583) as the query. Two ESTs (GenBank accession Nos. BF198782 and BF442878) were assembled into one contig with Sequencher 4.14 software. Primer L1 (forward, 5′-AGCCTCTGTCTAAGAAGCAAA-3′; reverse, 5′-CACGGGCTGCTTCTCCTT-3′) was designed from the contig. Primer L2 (forward, 5′-CTCCAAGTATTCCACCCTCACA-3′, located in the contig; reverse, 5′-CCTCAGGATGGCCTTAGCAC-3′, located in the EST) was designed from the contig and a further EST (GenBank accession No. C94730). The amplified sequences cover the coding region and all the exons. Primer L3 (forward, 5′-CCTGGAAGATGATCGGAAAA-3′; reverse, 5′-TGATGGTGTCAAAGGTGGGA-3′) was designed from the porcine LMCD1 gene sequence that we had submitted to GenBank (accession No. AY821789). L3 was used to validate the mutation sites by single-strand conformation polymorphism (SSCP) analysis of polymerase chain reaction (PCR) products.

Primers were designed with Primer 5.0 (http://www.premierbiosoft.com).
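As a quick sanity check on the primer sequences listed above, basic properties can be computed in a few lines. This is a minimal sketch that uses the Wallace rule (2 °C per A/T, 4 °C per G/C) as a rough melting temperature estimate; it is not the algorithm used by Primer 5.0, and the function names are illustrative.

```python
# Primer sequences as given in the text (5' to 3').
PRIMERS = {
    "L1": ("AGCCTCTGTCTAAGAAGCAAA", "CACGGGCTGCTTCTCCTT"),
    "L2": ("CTCCAAGTATTCCACCCTCACA", "CCTCAGGATGGCCTTAGCAC"),
    "L3": ("CCTGGAAGATGATCGGAAAA", "TGATGGTGTCAAAGGTGGGA"),
}


def gc_count(seq):
    """Number of G and C bases in the sequence."""
    s = seq.upper()
    return s.count("G") + s.count("C")


def wallace_tm(seq):
    """Rough Tm estimate in deg C: 2*(A+T) + 4*(G+C). Only a rule of thumb."""
    s = seq.upper()
    at = s.count("A") + s.count("T")
    return 2 * at + 4 * gc_count(s)


def reverse_complement(seq):
    """Reverse complement of a DNA sequence containing only A, C, G, T."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


for name, (fwd, rev) in PRIMERS.items():
    for label, seq in (("forward", fwd), ("reverse", rev)):
        gc_pct = 100 * gc_count(seq) / len(seq)
        print(f"{name} {label}: {len(seq)} nt, GC {gc_pct:.1f}%, "
              f"Wallace Tm ~{wallace_tm(seq)} C")
```

The Wallace estimate ignores salt and primer concentration, so it will not necessarily match the annealing temperatures reported in the following section.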

Reverse transcription (RT)-PCR

First-strand cDNA synthesis was carried out in a final volume of 25 μl containing 5 μl of 5× reaction buffer, 1 μg of total RNA from the tissue of interest as the template, 0.5 mM of each dNTP, 25 U of RNasin (40 U/μl), 2 μl of 10 mM oligo(dT15) and 200 U of M-MuLV reverse transcriptase (200 U/μl; Promega, USA).

PCR amplification was carried out in a final volume of 25 μl containing standard 1× PCR buffer with Mg2+ and 1 U Taq polymerase (Biostar International, Toronto, Canada), 0.8 mM of each dNTP, 10 pmol of each primer and 1.0 μl of first-strand cDNA product as the template. The PCR conditions were as follows: denaturation at 94 °C for 3 min; 35 cycles of 94 °C for 50 s, 57 °C for 45 s for L1 (or 62 °C for 50 s for L2) and 72 °C for 50 s; and an additional extension step at 72 °C for 10 min.
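The cycling programs above can be written down as data, which makes the total programmed hold time easy to check. The sketch below is illustrative only: the function name is hypothetical, and ramp times between temperatures are ignored, so a real run on a thermal cycler takes longer.

```python
def program_seconds(initial_s, cycle_steps_s, cycles, final_s):
    """Total hold time in seconds: initial step + cycles * (steps) + final step."""
    return initial_s + cycles * sum(cycle_steps_s) + final_s


# Values taken from the text: 94 C 3 min; 35 cycles; 72 C 10 min.
l1_total = program_seconds(3 * 60, [50, 45, 50], 35, 10 * 60)  # L1: anneal 57 C, 45 s
l2_total = program_seconds(3 * 60, [50, 50, 50], 35, 10 * 60)  # L2: anneal 62 C, 50 s

print(f"L1 program: {l1_total} s ({l1_total / 60:.1f} min)")
print(f"L2 program: {l2_total} s ({l2_total / 60:.1f} min)")
```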

Cloning of PCR products and

sequencing

The PCR products were fractionated on 1.5% (w/v) agarose gel, and selected bands were purified using a

gel extraction kit (Sangon, Shanghai, China). The purified PCR products were

ligated into the pGEM-T vector (Promega) and transformed into DH5α competent

cells. Bacteria were grown in LB-ampicillin agar. Cloned PCR products were

sequenced by Sangon.

mRNA expression analysis

The tissue distribution of pig LMCD1 mRNA was determined by semiquantitative RT-PCR [10,11]. The house-keeping gene glyceraldehyde-3-phosphate dehydrogenase (GAPDH) was used as an internal control on the template level. Primers for GAPDH amplification were: forward, 5′-ACCACAGTCCATGCCATCAC-3′; reverse, 5′-TCCACCACCCTGTTGCTGTA-3′. The primer for LMCD1 was L1.

PCR was carried out in a final volume of 25 μl as above. The cycle number was chosen so that amplification remained in the exponential phase, as judged by the use of different concentrations of cDNA template, which gave a direct correlation between the amount of amplification product obtained and RNA template abundance in the samples. The conditions were: denaturation at 94 °C for 4 min; 28 cycles of 94 °C for 50 s, 57 °C for 45 s and 72 °C for 50 s; and an additional extension step at 72 °C for 10 min. To validate the results, we repeated the RT-PCR three times.

Prediction and analysis of

putative LMCD1 domain

The cDNA sequence prediction was conducted with GenScan software (http://genes.mit.edu/GENSCAN.html).

Sequence similarity analysis in GenBank was performed using the Blast 2.1

search tool (http://www.ncbi.nlm.nih.gov/blast/).

ClustalW software (http://www.ebi.ac.uk/clustalw/)

was used for alignment of multiple sequences. Phylogenetic and molecular

evolutionary analyses were conducted using MEGA 3.1 software [12]. To predict

the biophysical characteristics of the putative LMCD1 protein,

software on the ExPASy Proteomics Server (http://au.expasy.org/)

was used. The prediction and analysis of the protein structural domains and

functional sites were performed using Prosite software (http://www.expasy.org/prosite/).

The 3-D structure of the putative protein conserved domain was analyzed using

the 3-D Conserved Domain Architecture Retrieval Tool of Blast (http://www.ncbi.nlm.nih.gov/blast/).
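For readers unfamiliar with how a theoretical molecular weight is derived from a sequence, the calculation is a sum of average residue masses plus one water molecule. The sketch below is an illustration of that principle, not the ExPASy implementation; the function name is hypothetical, and the demonstration uses a short made-up peptide rather than the LMCD1 sequence.

```python
# Average residue masses in Da (amino acid minus one water).
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01524  # one water is added back for the free N- and C-termini


def molecular_weight(protein):
    """Average molecular weight (Da) of an unmodified linear peptide."""
    seq = protein.upper()
    if not seq:
        raise ValueError("empty sequence")
    unknown = set(seq) - set(AVERAGE_RESIDUE_MASS)
    if unknown:
        raise ValueError(f"unsupported residue codes: {sorted(unknown)}")
    return sum(AVERAGE_RESIDUE_MASS[aa] for aa in seq) + WATER


# Hypothetical peptide used only to exercise the function.
print(f"{molecular_weight('MKCGH'):.2f} Da")
```

Post-translational modifications such as glycosylation are not accounted for, so the value is a theoretical estimate for the bare polypeptide chain.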

SSCP analysis [13,14]

The DNA product amplified with primer L3 (the amplification procedure was the same as for L1 except that the annealing temperature was 60 °C) was mixed with four volumes of formamide loading dye [98% formamide, 0.3% bromophenol blue and xylene cyanol (Sangon), 10 mM EDTA and 7% glycerin (Sunbiotech, Beijing, China)], denatured at 98 °C for 10 min, loaded onto a non-denaturing gel [12% acrylamide/bisacrylamide (29:1, W/W) in 1× TBE (25 mM Tris, 25 mM boric acid, 0.5 mM EDTA)] containing 5% glycerol (Sunbiotech), and subjected to electrophoresis at 4 °C and 5 W for 10–25 h, depending on the size of the PCR product analyzed and the gel composition. The gels were processed by silver staining.

Results

RT-PCR, cloning of PCR

products and sequencing

Amplified cDNA products were fractionated on 1.5% (w/v) agarose gel, and clear bands were obtained for primers L1

and L2. RT-PCR products were cloned into vector pGEM-T and

sequenced. Sequencing results showed that the sizes of the L1 and L2 PCR

products were 479 bp and 945 bp, respectively (Fig. 1).

mRNA expression analysis

Fig. 2 shows that LMCD1 mRNA was

present at high levels in the Longissimus dorsi muscle and heart, and at medium

levels in the lung, kidney and spleen. The LMCD1 mRNA expression level

was low in fat and in the stomach, and lower in the liver and small intestine.

All expression patterns were, on the whole, in accordance with the results of

adult human LMCD1 mRNA expression [3].

Nucleotide sequence analysis

The sequences amplified by primers L1 and L2 were assembled into a 1216 bp sequence with Sequencher 4.14 software. A BlastN search of this sequence against the GenBank “nr” database revealed that it contains the complete coding sequence of LMCD1. It was then submitted to the GenBank database under the accession number AY821789. By Blast analysis, the porcine LMCD1 nucleotide sequence shared high homology with those of four species: mouse (86%), Pan troglodytes (91%), rat (82%) and human (89%). Comparative analysis revealed a six-base deletion in the coding sequence of porcine LMCD1, relative to positions 825–830 of the human gene (GenBank accession No. 14277673), which results in a two-amino-acid deletion in the porcine protein.

Four putative base substitutions (G294A as shown in Fig. 3,

C385T as shown in Fig. 4, A748G and A1099G) were detected in the exon

region by comparing sequences of Large White, Landrace and Meishan pig breeds.

The A748G substitution changes a codon for alanine into a codon for threonine.

Prediction and analysis of

protein structural domain and functional site

A comparison of the LMCD1 amino acid sequences of four species is shown in Fig. 5. The amino acid sequence of porcine LMCD1 shares 86% identity with that of human, 91% with that of mouse and 93% with that of rat. The amino acid sequence of LMCD1 shares significant identity with those of mouse Tes1, mouse Tes2, human LMO6, and human triple-LIM domain protein [3]. Based on the above results, Tes1, Tes2 and several kinds of LIM domain proteins were collected to construct a combined phylogenetic tree by the Neighbor-Joining method and the Unweighted Pair Group Method with Arithmetic Mean (UPGMA) using MEGA 3.1 software (http://www.megasoftware.net/), as shown in Fig. 6. The results revealed that porcine LMCD1 has a closer genetic relationship with mouse and rat LMCD1 than with that of human. All of the LIM domain proteins, together with Tes1 and Tes2, occupy similar positions in the trees built by the two methods; only their bootstrap values differ. These results validate the correctness of the current classification of the LIM domain protein.
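Percent identity figures such as those quoted above are computed from a pairwise alignment. The following minimal sketch assumes two already-aligned sequences of equal length with "-" marking gaps, and counts identity over gap-free columns only; other denominator conventions exist and give slightly different numbers. The toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identical residues over alignment columns without gaps."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only columns where neither sequence has a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if a != gap and b != gap]
    if not columns:
        return 0.0
    matches = sum(a == b for a, b in columns)
    return 100.0 * matches / len(columns)


# Toy alignment (NOT real LMCD1 data): one gap column, one mismatch.
print(f"{percent_identity('MKCA-GH', 'MKCSTGH'):.1f}%")
```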

Primary structure analysis revealed that the molecular weight of the putative LMCD1 protein is 40.788 kDa and its theoretical pI is 8.39. Topology prediction showed a novel cysteine-rich domain in the N-terminal region and two characteristic LIM domain sequences in the C-terminal region, but no transmembrane helices [15,16]. Nine potential protein kinase C phosphorylation sites, seven casein kinase II phosphorylation sites, a tyrosine kinase phosphorylation site, seven N-myristoylation sites and a single potential N-glycosylation site were also predicted (Fig. 6), similar to the structure of human LMCD1 [3]. Based on the single potential N-glycosylation site, it can be inferred that the LMCD1 protein may be a glycoprotein.

The putative domain of the protein encoded by porcine LMCD1 and the

3-D structure of the conserved domain of this putative protein are shown in Figs.

7 and 8, respectively. From these two figures, we can find two LIM

zinc-binding domains and a PET domain that is suggested to be involved in

protein-protein interactions, which further validates the correctness of the

current classification of the putative LMCD1 protein.

SSCP analysis

The G294A substitution was confirmed by PCR-SSCP. The size of the primer L3 PCR product was 164 bp. The PCR-SSCP results are shown in Fig. 9. The distribution of the polymorphism in five different pig breeds is given in Table 1. Only allele G was found in Large White and Landrace pigs. χ² analysis of the three genotypes in the different pig populations showed that the genotype frequencies differed significantly (χ² = 128.1200 > χ²0.01(8), P < 0.01) among Large White, Landrace, Meishan, Exi Hei and Wannanhua pigs.
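The eight degrees of freedom follow from a 5 breeds × 3 genotypes contingency table, (5 − 1) × (3 − 1) = 8. The sketch below shows how such a test statistic is computed; the genotype counts are invented placeholders, not the values of Table 1, so the printed statistic will not reproduce 128.12.

```python
def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table.

    Assumes every row total and column total is non-zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return stat, df


# HYPOTHETICAL genotype counts (GG, GA, AA) per breed, for illustration only.
counts = [
    [30, 0, 0],    # Large White
    [30, 0, 0],    # Landrace
    [5, 10, 15],   # Meishan
    [10, 10, 10],  # Exi Hei
    [10, 12, 8],   # Wannanhua
]
CRITICAL_0_01_DF8 = 20.09  # chi-square critical value for P = 0.01, df = 8

stat, df = chi_square_independence(counts)
print(f"chi2 = {stat:.2f}, df = {df}, significant at 0.01: {stat > CRITICAL_0_01_DF8}")
```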

Discussion

Muscle LIM protein is composed of two neighboring LIM domains and a

glycine-rich domain [6,17]. It is an important regulator in the development of

skeletal muscle and cardiac muscle. According to the current classification of

LIM domain proteins, LMCD1 belongs to the group 3 proteins, which

contain one or more LIM domains at the C-terminal region [3]. This gene family

is characterized by characteristic LIM domains and a conserved cysteine-rich motif

mostly expressed in musculature [3,8].

When the correct gene has been identified, there may be more than one polymorphism within that gene. A polymorphism in the coding sequence that does not change the amino acid (synonymous mutation) is unlikely to have an effect on phenotype [18,19]. Two clues are used to predict whether or not a non-synonymous polymorphism will affect phenotype. First, a mutation that causes a radical change in the amino acid is more likely to affect the properties of the protein than a conservative amino acid substitution. Second, amino acids that are conserved across species are likely to be needed for protein function, so mutations that change them are likely to affect phenotype [19,20]. Because single nucleotide polymorphisms (SNPs) have high density and stability in genomes, they are widely used in the identification of functional genes and the location of quantitative trait loci (QTL). A SNP occurring in the coding region of a gene (cSNP) may itself belong to a QTL and affect the level of gene expression and the protein structure, so a cSNP would be a more useful marker for marker-assisted selection [14,21]. In our study, three cSNPs were detected by aligning the cDNA sequences of different porcine breeds, and the G294A substitution was validated by PCR-SSCP. In addition, the distribution of the three genotypes, based on the G294A substitution, was studied in five pig breeds. The results showed that allele A did not exist in Large White and Landrace pigs, which may have been affected by long-term breeding and selection or by the limited number of animals in this study. It may also be that allele A has a special function that affects porcine meat quality. Further studies must be conducted to confirm whether this site can be regarded as a molecular marker or not.
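Allele frequencies at a biallelic site such as G294A follow from the genotype counts by simple allele counting: each homozygote contributes two copies and each heterozygote one. A minimal sketch, with a hypothetical function name and invented counts rather than the study's data:

```python
def allele_frequencies(n_gg, n_ga, n_aa):
    """Frequencies of alleles G and A from genotype counts at a biallelic site."""
    n = n_gg + n_ga + n_aa
    if n == 0:
        raise ValueError("no genotyped animals")
    freq_g = (2 * n_gg + n_ga) / (2 * n)  # G copies / total allele copies
    return {"G": freq_g, "A": 1 - freq_g}


# Invented example counts, for illustration only.
print(allele_frequencies(5, 10, 15))
print(allele_frequencies(30, 0, 0))  # a breed in which only allele G is observed
```

With small samples, an observed frequency of zero does not exclude a rare allele, which is consistent with the caution about animal numbers expressed above.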

Bioinformatics analysis showed that the LMCD1 protein was highly conserved among the different species in this study. Some potential functional sites predicted in pig LMCD1 were also found in human LMCD1. These results offer some evidence to further understand the function of porcine LMCD1. Although the two tree-building methods use different substitution models and gave dissimilar bootstrap values, all LIM domain protein genes occupy similar positions in the two trees, which further validates the correctness of the classification of the LMCD1 gene.

This is the first report on the LMCD1 gene in pig. We have obtained its complete coding sequence and will continue our research to obtain its full-length sequence and then carry out a functional analysis. Like other LIM domain proteins with predominant expression in skeletal muscle, the LMCD1 protein might be involved in protein-protein interactions during muscle development and remodeling [1,3,8], so further research based on these preliminary results is needed.

Acknowledgements

We would like to thank the staff at Jingping Pig Station (Huazhong Agricultural University) and the teachers and graduate students at the Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, for managing and slaughtering the experimental animals.

References

 1   Morgan MJ, Madgwick AJ. The LIM proteins FHL1 and FHL3 are expressed differently in skeletal muscle. Biochem Biophys Res Commun 1999, 255: 245–250

 2   Li HY, Ng EK, Lee SM, Kotaka M, Tsui SK, Lee CY, Fung KP et al. Protein–protein interaction of FHL3 with FHL2 and visualization of their interaction by green fluorescent proteins (GFP) two-fusion fluorescence resonance energy transfer (FRET). J Cell Biochem 2001, 80: 293–303

 3   Bespalova IN, Burmeister M. Identification of a novel LIM domain gene, LMCD1, and chromosomal localization in human and mouse. Genomics 2000, 63: 69–74

 4   Ota T, Suzuki Y, Nishikawa T, Otsuki T, Sugiyama T, Irie R, Wakamatsu A et al. Complete sequencing and characterization of 21,243 full-length human cDNAs. Nat Genet 2004, 36: 40–45

 5   Strausberg RL, Feingold EA, Grouse LH, Derge JG, Klausner RD, Collins FS, Wagner L. Generation and initial analysis of more than 15,000 full-length human and mouse cDNA sequences. Proc Natl Acad Sci USA 2002, 99: 16899–16903

 6   Jiang YL, Li N, Wu CX. Studies on the molecular biology of myogenesis. Chin J Agric Biotechnol 1999, 7: 201–204

 7   Morgan MJ, Madgwick AJ. Slim defines a novel family of LIM-proteins expressed in skeletal muscle. Biochem Biophys Res Commun 1996, 225: 632–638

 8   Brown S, Biben C, Ooms LM, Maimone M, McGrath MJ, Gurung R, Harvey RP et al. The cardiac expression of striated muscle LIM protein 1 (SLIM1) is restricted to the outflow tract of the developing heart. J Mol Cell Cardiol 1999, 31: 837–843

 9   Sambrook J, Fritsch EF, Maniatis T. Molecular Cloning: A Laboratory Manual. 2nd ed. New York: Cold Spring Harbor Laboratory Press 1989

10  Kousteni S, Tura-Kockar F, Ramji DP. Sequence and expression analysis of a novel Xenopus laevis cDNA that encodes a protein similar to bacterial and chloroplast ribosomal protein L24. Gene 1999, 235: 13–18

11  Liu YG, Xiong YZ, Deng CY. Isolation, sequence analysis and expression profile of a novel swine gene differentially expressed in the Longissimus dorsi muscle tissues from Landrace × Large White cross-combination. Acta Biochim Biophys Sin 2005, 37: 186–191

12  Kumar S, Tamura K, Nei M. MEGA3: Integrated software for molecular evolutionary genetics analysis and sequence alignment. Brief Bioinform 2004, 5: 150–163

13  Watterson SA, Wilson SM, Yates MD, Drobniewski FA. Comparison of three molecular assays for rapid detection of rifampin resistance in Mycobacterium tuberculosis. J Clin Microbiol 1998, 36: 1969–1973

14  Tu RJ, Deng CY, Xiong YZ. Sequencing analysis on partial translated region of uncoupling protein-3 gene and single nucleotide polymorphism associations with the porcine carcass and meat quality traits. Acta Genetica Sinica 2004, 31: 807–812

15  Tusnády GE, Simon I. Principles governing amino acid composition of integral membrane proteins: Applications to topology prediction. J Mol Biol 1998, 283: 489–506

16  Tusnády GE, Simon I. The HMMTOP transmembrane topology prediction server. Bioinformatics 2001, 17: 849–850

17  Yao X, Perez-Alvarado GC, Louis HA, Pomies P, Hatt C, Summers MF, Beckerle MC. Solution structure of the chicken cysteine-rich protein, CRP1, a double-LIM protein implicated in muscle differentiation. Biochemistry 1999, 38: 5701–5713

18  Urrutia AO, Hurst LD. Codon usage bias covaries with expression breadth and the rate of synonymous evolution in humans, but this is not evidence for selection. Genetics 2001, 159: 1191–1199

19  Goddard ME. Animal breeding in the (post-) genomic era. Animal Science 2003, 76: 353–365

20  Ng PC, Henikoff S. Predicting deleterious amino acid substitutions. Genome Res 2001, 11: 863–874

21  Zhang EP, Geng SM, Liu LL. Basic principles and statistic methods of QTL marker with SNPs. Animal Biotechnology Bulletin 2002, 8: 365–369