OALib Journal

ISSN: 2333-9721






Search query: "Fengzhu Yao"; about 12,008 matching results found.
On Analysis of Tangshan Pre-Hospital Service Efficiency Based on Improved Fuzzy-DEA
Qiaozhi Sang, Fengzhu Yao, Shuo Hao, Zhennan Zhang
Open Access Library Journal (OALib Journal) , 2018, DOI: 10.4236/oalib.1104720
An allocation model based on improved Fuzzy-DEA is proposed for Tangshan's medical and health resources. A membership function is introduced to characterize the non-strict partial order relation of the fuzzy numbers, and the interval form of the unit efficiency values is converted into point values. The variance analysis index system is then used to find the factors leading to non-DEA effectiveness, and an allocation plan is recommended accordingly: the relevant departments should coordinate inter-regional resources and optimize the medical staffing level.
Testing gene set enrichment for subset of genes: Sub-GSE
Xiting Yan, Fengzhu Sun
BMC Bioinformatics , 2008, DOI: 10.1186/1471-2105-9-362
Abstract: In this paper, we develop a novel method, termed Sub-GSE, which measures the enrichment of a predefined gene set, or pathway, by testing its subsets. The application of Sub-GSE to two simulated and two real datasets shows Sub-GSE to be more sensitive than previous methods, such as GSEA, GSA, and SigPath, in detecting gene sets associated with a phenotype of interest. This is particularly true for cases in which only a fraction of the genes in the gene set are associated with the phenotypes. Furthermore, the application of Sub-GSE to two real data sets demonstrates that it can detect more biologically meaningful gene sets than GSEA.
We developed a new method to measure the gene set enrichment. Applications to two simulated datasets and two real datasets show that this method is sensitive to the associations between gene sets and phenotype. The program Sub-GSE can be downloaded from http://www-rcf.usc.edu/~fsun.
Genome-wide gene expression profiling using microarray technologies has been ubiquitously used in biological research. An important problem is to identify gene sets that are significantly changed under a certain treatment (for example, two different cell lines or tissues or the same cell line under different conditions). A gene set is basically a group of genes with related functions, e.g., genes in a biological process or in the same complex. There are a variety of ways by which genes, and, ultimately, gene sets may be defined. For example, gene sets can be defined according to the information provided by several databases, such as GeneOntology [1], KEGG [2], Biocarta (http://www.biocarta.com), and Pfam [3]. Gene sets may also be defined by cytogenetic bands, by region of genomic sequence or by establishing the functional relationships among them.
Importantly, by using a gene set-based approach, a high power can potentially be achieved for detecting differentially expressed gene sets by integrating expression changes of genes inside the same gene se
Assessing the power of tag SNPs in the mapping of quantitative trait loci (QTL) with extremal and random samples
Kui Zhang, Fengzhu Sun
BMC Genetics , 2005, DOI: 10.1186/1471-2156-6-51
Abstract: We design a simulation study to tackle these problems for a variety of quantitative association tests using either case-parent samples or unrelated population samples. First, the samples are generated based on the quantitative trait model with the assumption of either an extremal sampling scheme or a random sampling scheme. Second, a small number of samples are selected to determine the haplotype blocks and the tag SNPs. Third, the statistical power of the tests is evaluated using four kinds of data: (1) all the SNPs and the corresponding haplotypes, (2) the tag SNPs and the corresponding haplotypes, (3) the same number of evenly spaced SNPs with minor allele frequency greater than a threshold and the corresponding haplotypes, (4) the same number of randomly chosen SNPs and their corresponding haplotypes.
Our results suggest that in most situations genotyping efforts can be significantly reduced by using tag SNPs for mapping the QTL in association studies without much loss of power, which is consistent with previous studies on association mapping of qualitative traits. For all situations considered, two-locus haplotype analyses using tag SNPs are more powerful than those using the same number of randomly selected SNPs, but the degree of such power differences depends upon the sampling scheme and the population history.
Single-nucleotide polymorphism (SNP) markers are preferred over microsatellite markers in association studies because of their high abundance, low mutation rate, and suitability for high-throughput genotyping. The genome-wide association studies on dissection of human complex traits need to screen a large number of SNPs. However, it is prohibitively expensive to genotype all SNPs in an association study with the throughput of current technologies. Judicious selection of SNPs for association studies is therefore of paramount importance.
The observation that the human genome can be divided into regions of high linkage disequilibrium (LD) with limited haplo
A model-based approach to selection of tag SNPs
Pierre Nicolas, Fengzhu Sun, Lei M Li
BMC Bioinformatics , 2006, DOI: 10.1186/1471-2105-7-303
Abstract: Here, we compute the description code-lengths of SNP data for an array of models and we develop tag SNP selection methods based on these models and the strategy of entropy maximization. Using data sets from the HapMap and ENCODE projects, we show that the hidden Markov model introduced by Li and Stephens outperforms the other models in several aspects: description code-length of SNP data, information content of tag sets, and prediction of tagged SNPs. This is the first use of this model in the context of tag SNP selection.
Our study provides strong evidence that the tag sets selected by our best method, based on the Li and Stephens model, outperform those chosen by several existing methods. The results also suggest that information content evaluated with a good model is more sensitive for assessing the quality of a tagging set than the correct prediction rate of tagged SNPs. Besides, we show that haplotype phase uncertainty has an almost negligible impact on the ability of good tag sets to predict tagged SNPs. This justifies the selection of tag SNPs on the basis of haplotype informativeness, although genotyping studies do not directly assess haplotypes. Software that implements our approach is available.
Genetic association studies at the population level are one of the most promising ways to discover the genetic basis of subtle human phenotypes such as complex diseases or drug responses [1-3]. The aim of these studies is to map genetic factors underlying such phenotypes by comparing genetic information and phenotypes of individuals sampled from a population. As whole-genome sequencing for each individual remains currently impossible, the genetic information is typically assessed through a set of genetic markers that carry information on their neighborhoods due to Linkage Disequilibrium (LD). Single Nucleotide Polymorphisms (SNPs), the most common type of polymorphisms in the human genome, are markers of great interest in this context.
In fact, they are so common that
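As a rough illustration of the entropy-maximization strategy mentioned in the abstract above, the Python sketch below greedily adds the SNP that most increases the joint Shannon entropy of the selected columns. It relies on plain empirical haplotype frequencies from a toy panel, not on the Li and Stephens hidden Markov model, and the function names and data are hypothetical.

```python
from collections import Counter
from math import log2


def joint_entropy(haplotypes, snps):
    """Shannon entropy (bits) of the haplotypes restricted to the given SNP columns."""
    counts = Counter(tuple(h[i] for i in snps) for h in haplotypes)
    n = len(haplotypes)
    return -sum(c / n * log2(c / n) for c in counts.values())


def greedy_tag_snps(haplotypes, k):
    """Greedily select k SNP indices; each step adds the SNP maximizing joint entropy."""
    n_snps = len(haplotypes[0])
    chosen = []
    for _ in range(k):
        remaining = [i for i in range(n_snps) if i not in chosen]
        best = max(remaining, key=lambda i: joint_entropy(haplotypes, chosen + [i]))
        chosen.append(best)
    return chosen


# Toy panel (assumed data): SNPs 0 and 1 are in perfect LD, SNP 2 varies independently.
panel = ["000", "000", "110", "110", "001", "001", "111", "111"]
tags = greedy_tag_snps(panel, 2)
```

On this toy panel the second pick skips SNP 1, which is redundant with SNP 0, and takes SNP 2 instead, which is the behavior a tag set is meant to have.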
Variance adjusted weighted UniFrac: a powerful beta diversity measure for comparing communities based on phylogeny
Qin Chang, Yihui Luan, Fengzhu Sun
BMC Bioinformatics , 2011, DOI: 10.1186/1471-2105-12-118
Abstract: We develop a new statistic termed variance adjusted weighted UniFrac (VAW-UniFrac) to compare two communities based on the phylogenetic relationships of the individuals. The VAW-UniFrac is used to test if the two communities are different. To test the power of VAW-UniFrac, we first ran a series of simulations which revealed that it always outperforms W-UniFrac, as well as UniFrac when the individuals are not uniformly distributed. Next, all three methods were applied to analyze three large 16S rRNA sequence collections, including human skin bacteria, mouse gut microbial communities, microbial communities from hypersaline soil and sediments, and a tropical forest census data set. Both simulations and applications to real data show that VAW-UniFrac can satisfactorily measure differences between communities, considering not only the species composition but also abundance information.
VAW-UniFrac can recover biological insights that cannot be revealed by other beta diversity measures, and it provides a novel alternative for comparing communities.
The assessment of differences between communities is an important problem in ecological studies. By comparing the compositions of natural communities from different environments, locations or time periods, we can learn how specific factors affect community assembly and how species or individuals associate with each other [1-3]. The development of next-generation high-throughput sequencers, such as the 454 Life Sciences Genome Sequencer FLX System, the Illumina 1G Genome Analysis System, and Applied Biosystems SOLiD Sequencing, has profoundly changed our approaches to ecological studies. With the rapid development of sequencing technologies, it is now possible to sequence a particular gene, such as 16S rRNA sequences, at very high depth without culturing [2,4-6]. The new sequencing technologies also make it possible to efficiently and economically sequence the whole metagenome within a community [7,8].
These techniques have revealed h
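For orientation, the baseline that the abstract above compares against, weighted UniFrac (W-UniFrac), can be sketched in a few lines of Python. This is only the unadjusted raw statistic, the branch-length-weighted sum of differences in relative abundance; the variance adjustment that defines VAW-UniFrac is not given in the abstract and is not reproduced here. The tree encoding, function name, and counts are assumptions made for the example.

```python
def weighted_unifrac(branches, counts_a, counts_b):
    """Raw weighted UniFrac: sum over branches of length * |A_i/A_T - B_i/B_T|."""
    total_a = sum(counts_a.values())
    total_b = sum(counts_b.values())
    distance = 0.0
    for length, leaves in branches:
        # Individuals of each community that descend from this branch.
        a_i = sum(counts_a.get(leaf, 0) for leaf in leaves)
        b_i = sum(counts_b.get(leaf, 0) for leaf in leaves)
        distance += length * abs(a_i / total_a - b_i / total_b)
    return distance


# Hypothetical tree ((x:1, y:1):0.5, z:1.5), listed as (branch length, leaves below it).
branches = [(1.0, {"x"}), (1.0, {"y"}), (0.5, {"x", "y"}), (1.5, {"z"})]
community_a = {"x": 8, "y": 2}
community_b = {"x": 2, "y": 2, "z": 6}
d = weighted_unifrac(branches, community_a, community_b)
```

Because the statistic uses abundances rather than presence or absence, two communities with the same species but different proportions still receive a nonzero distance.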
Prioritizing functional modules mediating genetic perturbations and their phenotypic effects: a global strategy
Li Wang, Fengzhu Sun, Ting Chen
Genome Biology , 2008, DOI: 10.1186/gb-2008-9-12-r174
Abstract: How to interpret the nature of biological processes, which, when perturbed, cause certain phenotypes, such as human disease, is a major challenge. The completion of sequencing of many model organisms has made 'reverse genetic approaches' [1] efficient and comprehensive ways to identify causal genes for a given phenotype under investigation. For instance, genome-wide knockout strains are now available for Saccharomyces cerevisiae [2,3], and diverse high throughput RNA interference knockdown experiments have been performed, or are under development, for higher organisms, including C. elegans [4], D. melanogaster [5] and mammals [6,7].
Compared to the direct genotype-phenotype correlation observed in the above experiments, what is less obvious is how genetic perturbation leads to the change of phenotypes in the complex of biological systems. That is, we might perceive the cell or organism as a dynamic system composed of interacting functional modules that are defined as discrete entities whose functions are separable from those of other modules [8]. For example, protein complexes and pathways are two types of functional modules. Using this concept as a basis for hypothesis, it is tempting to conclude that it is the perturbation of individual genes that leads to the perturbation of certain functional modules and that this, in turn, causes the observed phenotype. Previous studies have reported this type of module-based interpretation of phenotypic effects [9-11]. For example, Hart and colleagues [12] showed the distribution of gene essentiality among protein complexes in S. cerevisiae and suggested that essentiality is the product of protein complexes rather than individual genes. Other studies have made use of the modular nature of phenotypes to predict unknown causal genes [13].
In a recent study, Lage and colleagues [14] mapped diverse human diseases to their corresponding protein complexes and used such mapping to prioritize unknown disease genes within linkage interv
The effects of protein interactions, gene essentiality and regulatory regions on expression variation
Linqi Zhou, Xiaotu Ma, Fengzhu Sun
BMC Systems Biology , 2008, DOI: 10.1186/1752-0509-2-54
Abstract: Using yeast as a model system, we evaluated the effects of four separate factors and their interactions on gene expression variation: protein interaction degree, toxicity degree, number of TFs, and the presence of TATA box. Results showed that 1) gene expression variation is negatively correlated with the protein interaction degree in the protein interaction network, 2) essential genes tend to have less expression variation than non-essential genes and gene expression variation decreases with toxicity degree, and 3) the number of TFs regulating a gene is the most important factor influencing gene expression variation ($R^2$ = 8–14%). In addition, the number of TFs regulating a gene was found to be an important factor influencing gene expression variation for both TATA-containing and non-TATA-containing genes, but with different association strength. Moreover, gene expression variation was significantly negatively correlated with toxicity degree only for TATA-containing genes.
The finding that distinct mechanisms may influence gene expression variation in TATA-containing and non-TATA-containing genes provides new insights into the mechanisms that underlie the evolution of gene expression.
Gene expression variation has been studied on three different levels: single cells across a common environment [1], within one species across a variety of different environments [2,3], and across different species/strains, which is often referred to as evolutionary variation [4-8]. In this paper, we study genetic factors affecting gene expression variation within one species across many different environmental conditions. Broadly, the genetic factors affecting gene expression primarily include the binding of regulatory proteins to cis-elements in the upstream of the gene, as well as physical and genetic interactions with other genes.
With the availability of many gene expression profiles, protein interaction networks, and gene regulatory networks, it is now possible to study how gene ex
A network-based integrative approach to prioritize reliable hits from multiple genome-wide RNAi screens in Drosophila
Li Wang, Zhidong Tu, Fengzhu Sun
BMC Genomics , 2009, DOI: 10.1186/1471-2164-10-220
Abstract: By analyzing 24 genome-wide RNAi screens interrogating various biological processes in Drosophila, we found that RNAi positive hits were significantly more connected to each other when analyzed within a protein-protein interaction network, as opposed to random cases, for nearly all screens. Based on this finding, we developed a network-based approach to identify false positives (FPs) and false negatives (FNs) in these screening results. This approach relied on a scoring function, which we termed NePhe, to integrate information obtained from both PPI network and RNAi screening results. Using a novel rank-based test, we compared the performance of different NePhe scoring functions and found that diffusion kernel-based methods generally outperformed others, such as direct neighbor-based methods. Using two genome-wide RNAi screens as examples, we validated our approach extensively from multiple aspects. We prioritized hits in the original screens that were more likely to be reproduced by the validation screen and recovered potential FNs whose involvements in the biological process were suggested by previous knowledge and mutant phenotypes. Finally, we demonstrated that the NePhe scoring system helped to biologically interpret RNAi results at the module level.
By comprehensively analyzing multiple genome-wide RNAi screens, we conclude that network information can be effectively integrated with RNAi results to produce suggestive FPs and FNs, and to bring biological insight to the screening results.
In the past few years, many groups have successfully conducted multiple genome-wide RNA interference (RNAi) screenings in C. elegans, D. melanogaster and mammals, using either whole animal or cell lines to investigate a full array of biological processes at the systems level [1-4].
Compared with classical genetic screens, such as transposon-mediated mutagenesis and somatic clonal analysis [5-7], RNAi technology is revolutionary in that it allows investigators to quickly interroga
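The abstract above does not define the NePhe scoring functions, so the Python sketch below shows only one plausible reading of the simpler "direct neighbor-based" idea: score each gene by the fraction of its interaction partners that were screen hits. The function name, the toy network, and the hit list are all hypothetical, and the diffusion kernel variant that the authors found to perform better is not attempted here.

```python
def neighbor_scores(edges, hits):
    """Score each gene by the fraction of its network neighbors that are screen hits."""
    neighbors = {}
    for u, v in edges:
        # Build an undirected adjacency map from the edge list.
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    return {gene: len(nbrs & hits) / len(nbrs) for gene, nbrs in neighbors.items()}


# Toy protein-protein interaction network and screen result (assumed data).
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"), ("d", "e")]
hits = {"a", "b", "e"}
scores = neighbor_scores(edges, hits)
```

Under this scoring, a non-hit surrounded by hits (gene `c`) looks like a candidate false negative, while a hit with no hit neighbors (gene `e`) looks like a candidate false positive.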
Extreme Value Distribution Based Gene Selection Criteria for Discriminant Microarray Data Analysis Using Logistic Regression
Wentian Li, Fengzhu Sun, Ivo Grosse
Quantitative Biology , 2004, DOI: 10.1089/1066527041410445
Abstract: One important issue commonly encountered in the analysis of microarray data is to decide which and how many genes should be selected for further studies. For discriminant microarray data analyses based on statistical models, such as the logistic regression models, gene selection can be accomplished by a comparison of the maximum likelihood of the model given the real data, $\hat{L}(D|M)$, and the expected maximum likelihood of the model given an ensemble of surrogate data with randomly permuted labels, $\hat{L}(D_0|M)$. Typically, the computational burden for obtaining $\hat{L}(D_0|M)$ is immense, often exceeding the limits of available computing resources by orders of magnitude. Here, we propose an approach that circumvents such heavy computations by mapping the simulation problem to an extreme-value problem. We present the derivation of an asymptotic distribution of the extreme-value as well as its mean, median, and variance. Using this distribution, we propose two gene selection criteria, and we apply them to two microarray datasets and three classification tasks for illustration.
Chromatin Regulation and Gene Centrality Are Essential for Controlling Fitness Pleiotropy in Yeast
Linqi Zhou, Xiaotu Ma, Michelle N. Arbeitman, Fengzhu Sun
PLOS ONE , 2012, DOI: 10.1371/journal.pone.0008086
Abstract: There is a wide range of phenotypes that are due to loss-of-function or null mutations. Previously, the functions of gene products that distinguish essential from nonessential genes were characterized. However, the functions of products of non-essential genes that contribute to fitness remain minimally understood.
