The long-term goals of the research in my lab are:

1) Developing novel algorithms for analyzing the regulatory regions of genomes and transcriptional regulation on a genomic scale. The genetic programs coded in the regulatory regions of a genome specify when and where different genes should be turned on or off. Such information is essential for understanding development, tissue specificity, and cellular response to the environment. However, the development of computational tools for analyzing regulatory regions has lagged behind that of tools for gene discovery and protein sequence comparison. Recently, we have developed several novel algorithms to identify multiple regulatory elements from genome sequences. We will further develop these algorithms to increase their sensitivity and specificity and to generalize them in several important ways. We will also develop methods for comparing the regulatory regions of orthologous genes across species. In the past, comparative study of proteins across species has yielded many insights into protein function and evolution. My lab is exploring the potential of comparative study of noncoding regions for deciphering regulatory information, and we will develop methods that work for closely related as well as distant species. We will also carry out quantitative analysis of genome-wide gene expression data to extract relevant regulatory elements and determine their logical interrelations.

2) Developing tools for analyzing gene regulatory networks using gene expression and protein-protein interaction data. DNA microarrays have been widely used to monitor genome-wide gene expression. They have also been used to probe biological pathways by measuring genome-wide changes in gene expression caused by various genetic and environmental perturbations. One ongoing project in my lab is to identify putative transcription factor binding sites and potential target genes using DNA microarray data. The long-term goal is to develop methods for reconstructing regulatory pathways using gene expression data in conjunction with large-scale protein-protein interaction data (e.g., from genome-scale two-hybrid screens).

3) Protein sequence and structure analysis. One important task in functional genomics is to determine the functions of novel genes. However, one still cannot reliably predict the 3D structure of a protein from its amino acid sequence. Previously, we analyzed the protein folding problem from a different perspective by asking why nature selects only about 1000 structures to use as protein folds. We have proposed a designability principle for protein structure selection based on simple model studies (see publications). The principle states that a protein structure should be designable by a huge number of sequences and therefore must satisfy strong constraints. My lab will investigate whether the designability principle is valid for real proteins and what its consequences are for protein design and structure prediction. We are also developing new approaches to predicting protein-protein interactions and protein binding interfaces using sequence and structure information.