Monday, October 29, 2012

Bioinformatics BI_V0009


title : Bioinformatics with a French accent

author: LD Hurst, L Duret

year: 2005

place of publish : Department of Biology and Biochemistry, University of Bath, Bath BA2 7AY, UK. †Pôle BioInformatique Lyonnais, Laboratoire BBE - UMR CNRS 5558, Université Claude Bernard - Lyon 1, F-69622 Villeurbanne Cedex, France


abstract :

Bioinformatics BI_V0008


title : An international showcase of bioinformatics research

author: Todd Vision

year: 2003

place of publish : Department of Biology, University of North Carolina, Chapel Hill, NC 27599, USA


abstract :

Bioinformatics BI_V0007


title : Bioinformatics inspired by a tree

author: A.W. Dickerman

year: 2006

place of publish : USA

abstract :

Bioinformatics BI_V0006


title : UTILIZATION OF BIOINFORMATICS RESOURCES IN ISOLATION OF POD-SPECIFIC GENES IN THEOBROMA CACAO

author: CL. Tan, JA Verica, A. Young, S. Pishak, SN Maximova

year: 2006

place of publish : USA


abstract :

Bioinformatics BI_V0005


title : MINI-BLAST: Computer Systems to Search for the Pattern Sequences in the Bioinformatics Databases

author: Gennadiy Burlak, Christian Eduardo Martínez Guerrero, Enrique Merino Pérez

year: 2001

place of publish : IBT, Universidad Nacional Autónoma de México, av. Universidad 2001, Cuernavaca, Mor., CP 62210, México

abstract :

Bioinformatics focuses on developing and applying computationally intensive techniques to increase the understanding of biological processes. In this report we create the compact computer systems mini-blast and methagraph for finding DNA sequences in bioinformatics databases (DBs) placed in local or web configurations. Our system allows identifying gene sequences related to a new pattern (metagenome) that is not yet identified in such DBs containing data on known nucleotides. Such a task is a quite expensive and time-consuming operation; therefore, for large genomes parallel algorithms are required. We develop a user-friendly graphical interface (GUI) that allows simple input of the query data and a representative statistical analysis in the output. Additionally, the user can select particular DBs for cases when a specific alignment is required. Although the package is developed in the MS .NET 3.5/4.0 Visual C# system, it works with no limitations on Linux in the Mono framework.

Bioinformatics BI_V0004


title : Using Microbial Diversity to Teach Computational Biology and Bioinformatics

author: Sarah M. Boomer1*, Daniel P. Lodge2, Kelly Shipley1, Bryan E. Dutton1

year:

place of publish :
1Western Oregon University, Department of Biology, Monmouth, OR 97361
2Oregon State University, Department of Engineering, Corvallis, OR 97331

abstract :

Given that computational skills are central to many sub-disciplines in biology, we developed an undergraduate course called Computational Biology to better prepare students in this widely-applicable field. In this report, we have summarized available resources and original protocol for computational curriculum, all of which have applications beyond microbiology. We have also described specific microbial models that were uniquely selected and employed for class analysis. Using diverse microbial sequences and genomes, students navigated the National Center for Biotechnology Information (NCBI) with an emphasis on database structure, data annotations, effective database searching, understanding genome data archiving and display issues, and using analytical software to identify and rank similar sequences. Next, using original bacterial 16S rRNA sequences from our Red Layer Microbial Observatory project, students assembled and aligned multiple sequence datasets using several tools on the Biology Workbench (BW). Using resulting 16S rRNA alignments, students produced and statistically evaluated phylogenetic trees. Finally, students used a combination of software and data selected from NCBI and BW to analyze model microbial proteins, emphasizing how to view and analyze determined structure data, and how to predict protein structure using sequence information. Repeating all these methods, each student completed an original research project, comparing 20 homologous sequences to address a specific hypothesis of their own design. To complete this report, we summarized and discussed course impact and extensions.

Bioinformatics BI_V0003


title : Bioinformatics and Biomarker Discovery Part 3: Examples

author: L Wong

year: 2011

place of publish : Birkhäuser Verlag, Basel-Boston-Berlin

abstract :

Bioinformatics BI_V0002


title : Bioinformatik. Methoden zur Vorhersage von RNA- und
Proteinstrukturen (Bioinformatics. Methods for RNA
and protein structure prediction)

author: G. Steger

year: 2003

place of publish : Birkhäuser Verlag, Basel-Boston-Berlin

abstract :

Bioinformatics BI_V0001


title : Bioinformatics as Viewed by a Computer Scientist

author: Raymond Wan

year: 2011

place of publish : University of Tokyo

abstract :

Scientists have been interested in biology (including
genetics and molecular biology) for many centuries, both
to find out more about plants and animals and, of course,
to learn about human health.
Along with physics and chemistry, biology is one of the
natural sciences that many of us (probably) studied in
school.
Over the last decade or two, the amount and type of data
being generated has required computational methods for
data analysis. Simply put, this is the field of bioinformatics
or computational biology.

Bioinformatics BI_E0010


title : SHARE: A Semantic Web Query Engine for Bioinformatics

author: Ben P. Vandervalk, E. Luke McCarthy, Mark D. Wilkinson

year: 2009

place of publish : Springer Berlin Heidelberg

abstract :

Driven by the goal of automating data analyses in the field of bioinformatics, SHARE (Semantic Health and Research Environment) is a specialized SPARQL engine that resolves queries against Web Services and SPARQL endpoints. Developed in conjunction with SHARE, SADI (Semantic Automated Discovery and Integration) is a standard for native-RDF services that facilitates the automated assembly of services into workflows, thereby eliminating the need for ad hoc scripting in the construction of a bioinformatics analysis pipeline.

Bioinformatics BI_E0009


title : European Molecular Biology Organization Practical Course on COMPUTATIONAL MOLECULAR EVOLUTION

author: Giorgos Kotoulas, Antonis Magoulas, Stelios Kastrinakis, Eftichia Mironaki, Pelagia Petraki

year: 2006

place of publish : Germany


abstract :

Bioinformatics BI_E0008


title : A grid-oriented genetic algorithm framework for bioinformatics

author: Hiroaki Imade, Ryohei Morishita, Isao Ono, Norihiko Ono, Masahiro Okamoto

year: 2004

place of publish : Springer-Verlag

abstract :

In this paper, we propose a framework, named “Grid-Oriented Genetic Algorithms (GOGAs)”, that enables researchers of genetic algorithms (GAs) to easily develop GAs running on the Grid. To examine the usability of the proposed GOGA framework, we actually “Gridify” a GA for estimating genetic networks, which is being developed by our group. We also evaluate the scalability of the “Gridified” GA by applying it to a five-gene genetic network estimation problem on a grid testbed constructed in our laboratory.

Bioinformatics BI_E0007


title : Bioinformatics Visualization and Integration with Open Standards: The Bluejay Genomic Browser

author: Andrei L. Turinsky, Andrew C. Ah-Seng, Paul M.K. Gordon, Julie N. Stromer, Morgan L. Taschuk, Emily W. Xu, Christoph W. Sensen

year: 2005

place of publish : Canada

abstract :

We have created a new Java™-based integrated computational environment for the exploration of genomic data, called Bluejay. The system is capable of using almost any XML file related to genomic data. Non-XML data sources can be accessed via a proxy server. Bluejay has several features, which are new to Bioinformatics, including an unlimited semantic zoom capability, coupled with Scalable Vector Graphics (SVG) outputs; an implementation of the XLink standard, which features access to MAGPIE Genecards as well as any BioMOBY service accessible over the Internet; and the integration of gene chip analysis tools with the functional assignments. The system can be used as a signed web applet, Web Start, and a local stand-alone application, with or without connection to the Internet. It is available free of charge and as open source via http://bluejay.ucalgary.ca.

Bioinformatics BI_E0006


title : Bioinformatics approaches for the classification of G-protein-coupled receptors

author: Anna Gaulton and Teresa K Attwood

year: 2003

place of publish : School of Biological Sciences and Department of Computer Science,
University of Manchester, Oxford Road, Manchester M13 9PT, UK

abstract :

G-protein-coupled receptors are found abundantly in the human
genome, and are the targets of numerous prescribed drugs.
However, many receptors remain orphaned (i.e. with unknown
ligand specificity), and others remain poorly characterised, with
little structural information available. Consequently, there is often
a gulf between sequence data and structural and functional
knowledge of a receptor. Bioinformatics approaches may offer
one approach to bridging this gap. In particular, protein family
databases, which distil information from multiple sequence
alignments into characteristic signatures, could be used to
identify the families to which orphan receptors belong, and might
facilitate discovery of novel motifs associated with ligand binding
and G-protein-coupling.

Bioinformatics BI_E0005


title : Genetic Programming Neural Networks as a Bioinformatics Tool for Human Genetics

author: Marylyn D. Ritchie, Christopher S. Coffey, Jason H. Moore

year: 2004

place of publish : Springer Berlin Heidelberg

abstract :

The identification of genes that influence the risk of common, complex diseases primarily through interactions with other genes and environmental factors remains a statistical and computational challenge in genetic epidemiology. This challenge is partly due to the limitations of parametric statistical methods for detecting genetic effects that are dependent solely or partially on interactions. We have previously introduced a genetic programming neural network (GPNN) as a method for optimizing the architecture of a neural network to improve the identification of gene combinations associated with disease risk. Previous empirical studies suggest GPNN has excellent power for identifying gene-gene interactions. The goal of this study was to compare the power of GPNN and stepwise logistic regression (SLR) for identifying gene-gene interactions. Using simulated data, we show that GPNN has higher power to identify gene-gene interactions than SLR. These results indicate that GPNN may be a useful pattern recognition approach for detecting gene-gene interactions.

Bioinformatics BI_E0004


title : An Optimal Algorithm for Maximum-Sum Segment and Its Application in Bioinformatics

author: Tsai-Hung Fan, Shufen Lee, Hsueh-I Lu, Tsung-Shan Tsou, Tsai-Cheng Wang, Adam Yao

year: 2003

place of publish : Springer Berlin Heidelberg

abstract :

We study a fundamental sequence algorithm arising from bioinformatics. Given two integers L and U and a sequence A of n numbers, the maximum-sum segment problem is to find a segment A[i,j] of A with L ≤ j−i+1 ≤ U that maximizes A[i]+A[i+1]+···+A[j]. The problem finds applications in finding repeats, designing low-complexity filters, and locating segments with rich C+G content for biomolecular sequences. The best known algorithm, due to Lin, Jiang, and Chao, runs in O(n) time, based upon a clever technique called left-negative decomposition for A. In the present paper, we present a new O(n)-time algorithm that bypasses the left-negative decomposition. As a result, our algorithm has the capability to handle the input sequence in an online manner, which is clearly an important feature to cope with genome-scale sequences. We also show how to exploit the sparsity in the input sequence: If A is representable in O(k) space in some format, then our algorithm runs in O(k) time. Moreover, practical implementation of our algorithm running on the rice genome helps us to identify a very long repeat structure in rice chromosome 1 that was previously unknown.
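
To make the problem statement concrete, here is a minimal sketch of a length-constrained maximum-sum segment search, assuming the constraint reads L ≤ j−i+1 ≤ U. It is not the authors' algorithm and does not use the left-negative decomposition of Lin, Jiang, and Chao; it substitutes a standard technique (prefix sums plus a sliding-window minimum kept in a monotonic deque), which also runs in O(n) time. The function name `max_sum_segment` and the parameter names `lo` and `hi` are hypothetical, chosen only for this illustration.

```python
from collections import deque
from typing import List, Tuple


def max_sum_segment(a: List[float], lo: int, hi: int) -> Tuple[float, int, int]:
    """Return (best_sum, i, j) for the segment a[i..j] (0-based, inclusive)
    whose length j - i + 1 lies in [lo, hi] and whose sum is maximal.

    Illustrative sketch only: prefix sums + monotonic deque, not the
    algorithm described in the abstract.
    """
    n = len(a)
    if not (1 <= lo <= hi) or lo > n:
        raise ValueError("need 1 <= lo <= hi and lo <= len(a)")

    # prefix[k] is the sum of the first k elements, so
    # sum(a[s:e]) == prefix[e] - prefix[s].
    prefix = [0] * (n + 1)
    for k, x in enumerate(a):
        prefix[k + 1] = prefix[k] + x

    best = None
    window = deque()  # candidate start indices; their prefix values increase

    for end in range(lo, n + 1):  # 'end' is exclusive: the segment is a[start:end]
        newest_start = end - lo   # latest start that still gives length >= lo
        # Drop candidates that can never beat the newest start.
        while window and prefix[window[-1]] >= prefix[newest_start]:
            window.pop()
        window.append(newest_start)
        # Drop starts that would make the segment longer than hi.
        while window[0] < end - hi:
            window.popleft()

        start = window[0]  # start with the smallest prefix value in range
        total = prefix[end] - prefix[start]
        if best is None or total > best[0]:
            best = (total, start, end - 1)

    return best
```

For example, `max_sum_segment([2, -5, 3, 4, -1, 2, -6], 2, 3)` considers only segments of length 2 or 3 and selects the segment `[3, 4]` at indices 2 to 3. Each index enters and leaves the deque at most once, which is what keeps the running time linear.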

Bioinformatics BI_E0003


title : Ontology-based integration for bioinformatics

author: Vaida Jakonienė and Patrick Lambrix

year: 2005

place of publish : Department of Computer and Information Science, Linköpings universitet, Linköping, Sweden

abstract :

Information integration systems support researchers in bioinformatics to retrieve data from multiple biological data sources. In this paper we argue that the current approaches should be enhanced by ontological knowledge. We identify the different types of ontological knowledge that are available on the Web and propose an approach to use this knowledge to support integrated access to multiple biological data sources. We also show that current ontology-based integration approaches only cover parts of our approach.

Bioinformatics BI_E0002


title : Current bioinformatics tools in genomic biomedical research (Review)

author: ANDREAS TEUFEL, MARKUS KRUPP, ARNDT WEINMANN and PETER R. GALLE

year: 2006

place of publish : Department of Medicine I, Johannes Gutenberg University, Langenbeckstr. 1, D-55101 Mainz, Germany


abstract :

On the advent of a completely assembled human
genome, modern biology and molecular medicine stepped into
an era of increasingly rich sequence database information and
high-throughput genomic analysis. However, as sequence
entries in the major genomic databases currently rise exponentially,
the gap between available, deposited sequence data
and analysis by means of conventional molecular biology is
rapidly widening, making new approaches of high-throughput
genomic analysis necessary. At present, the only effective
way to keep abreast of the dramatic increase in sequence and
related information is to apply biocomputational approaches.
Thus, over recent years, the field of bioinformatics has rapidly
developed into an essential aid for genomic data analysis and
powerful bioinformatics tools have been developed, many of
them publicly available through the World Wide Web. In this
review, we summarize and describe the basic bioinformatics
tools for genomic research such as: genomic databases, genome
browsers, tools for sequence alignment, single nucleotide
polymorphism (SNP) databases, tools for ab initio gene
prediction, expression databases, and algorithms for promoter
prediction.

Bioinformatics BI_E0001


title : Bioinformatics-Guided Identification and Experimental Characterization of Novel RNA Methyltransferases
author: J.M. Bujnicki, L. Droogmans, H. Grosjean, S.K. Purushothaman, B. Lapeyre
year: 2008
place of publish : Springer Berlin Heidelberg
abstract :

Naturally occurring RNAs contain numerous chemically altered nucleosides. They are formed by enzymatic modification of the primary transcripts during the complex RNA maturation process. To date, a total of 96 structurally distinguishable modified nucleosides originating from different types of RNAs from many diverse organisms of the three major phylogenetic domains of life have been reported (Rozenski et al. 1999; http://medstat.med.utah.edu/RNAmods; and references therein). The pattern of modifications (type and location) depends on the RNA molecule considered, as well as on the organism or the organelle they originate from. However, the largest number of modified nucleosides with the greatest structural diversity (a total of 81) is found in transfer RNAs, especially in tRNAs from higher organisms (Sprinzl et al. 1998; http://www.uni-bayreuth.de/departments/biochemie/trna). Other types of RNA (snRNA, snoRNA, rRNA, mRNA) also contain modified nucleosides (see http://rna.wustl.edu/snoRNAdb); however, their occurrence and particularly their diversity are lower than in tRNAs (see, for example, Limbach et al. 1995; Motorin and Grosjean 1998).

Sunday, October 28, 2012

Phylogenetics PG_Q0007


Title : Advances in the phylogenesis of Agaricales and its higher ranks and strategies for establishing phylogenetic hypotheses
Author : Rui-lin Zhao, Dennis E. Desjardin, Kasem Soytong and Kevin D. Hyde
Year Publish : 2008
Place of Publish : Springer
Abstract :
We present an overview of previous research results on the molecular phylogenetic analyses in Agaricales and its higher ranks (Agaricomycetes/Agaricomycotina/Basidiomycota) along with the most recent treatments of taxonomic systems in these taxa. Establishing phylogenetic hypotheses using DNA sequences, from which an understanding of the natural evolutionary relationships amongst clades may be derived, requires a robust dataset. It has been recognized that single-gene phylogenies may not truly represent organismal phylogenies, but the concordant phylogenetic genealogies from multiple-gene datasets can resolve this problem. The genes commonly used in mushroom phylogenetic research are summarized.

Phylogenetics PG_Q0010


Title : Using phylogenies to reveal rare events
Author :   Hamish G. Spencer
Year Publish : 2009
Place of Publish : New Zealand Science Review
Abstract :

Evolutionary biology is sometimes described slightly pejoratively
as being an historical science, dealing in past events whose
uniqueness hinders the derivation of the explanatory generalities
that characterise first-class science. I want to argue here that
rare – even unique – historical events can be studied with full
scientific rigour, in such a way that we learn something quite
general about evolutionary processes. I discuss two examples
from my own work on marine snails from the family Trochidae,
although others could easily have been chosen. Crucially, my
examples rely on the use of phylogenetic methods, championed
so effectively by David Penny.

Phylogenetics PG_Q0009


Title : Combinatorial Optimization in Computational Biology
Author :   Dan Gusfield
Year Publish :
Place of Publish : aporc.org
Abstract :

Combinatorial Optimization is a central sub-area in Operations Research that has found many applications in computational biology. In this talk I will survey some of my research in computational biology that uses graph theory, matroid theory, and integer linear programming. The biological applications come from haplotyping, the study of recombination and recombination networks, and phylogenetics.

Phylogenetics PG_Q0008


Title : Applications of answer set programming in phylogenetic systematics
Author :   Esra Erdem
Year Publish : 2011
Place of Publish : Springer
URL :  http://www.springerlink.com/index/V725L050255387M2.pdf
Abstract :
We summarize some applications of Answer Set Programming (ASP) in phylogenetic systematics, focusing on the challenges, how they are handled using computational methods of ASP, and the usefulness of the ASP-based methods/tools both from the point of view of phylogenetic systematics and from the point of view of ASP.

Phylogenetics PG_Q0006


Title : Computational analysis of transposable element sequences
Author :   I. King Jordan and Nathan J. Bowen
Year Publish : 2004
Place of Publish : Springer
Abstract :

Phylogenetics PG_Q0005


Title : Neolindleya Kraenzl.(Orchidaceae), an enigmatic and largely overlooked autogamous genus from temperate East Asia
Author :   Peter G. Efimov, Robert K. Lauri and Richard M. Bateman
Year Publish : 2009
Place of Publish : Springer
Abstract :

The Asiatic orchid species Neolindleya camtschatica (Cham.) Nevski has been omitted from the majority of relevant taxonomic surveys, including the recent Genera Orchidacearum. In most studies where the species has been included, it has been assigned to the species-rich genera Gymnadenia or Platanthera. A few morphologists recognised a new monotypic genus, Neolindleya Kraenzl., to accommodate the species. More recent molecular phylogenetic studies showed barely sufficient molecular disparity to justify generic separation of this species, but demonstrated clearly that the genus Neolindleya is only distantly related to Gymnadenia. However, the molecular phylogenies show relationships of equal strength of Neolindleya with Galearis (including Amerorchis) on the one hand and Platanthera s.l. on the other. In an attempt to better resolve the phylogenetic placement of Neolindleya, and to more clearly understand its biology and ecology, we have re-examined the morphology of this enigmatic species. Our results, based partly on SEM studies, reinforce the validity of Neolindleya as a genus, indicate a closer relationship with Galearis s.l. than with Platanthera s.l., and strongly suggest that the species became an autogam following ‘accidental’ loss of a functional bursicle. Ecologically, N. camtschatica is an opportunistic species that benefits from anthropogenic habitat disturbance.


Phylogenetics PG_Q0004


Title : Detecting putative recombination events of hepatitis B virus: An updated comparative genome analysis
Author :   Lin Ye, Yuan Zhang, Yi Mei, Peng Nan and Yang Zhong
Year Publish : 2010
Place of Publish : Springer
URL :  http://www.springerlink.com/index/Y568814088585W2N.pdf
Abstract :
An updated collection of 791 human hepatitis B virus (HBV) genomes and 38 non-human primate HBV genomes was analyzed for identifying putative recombination events and their recombinants by using two bioinformatics software tools: Simplot and RDP3 with five algorithms (RDP, GENECONV, MaxChi, Chimaera, and SiScan). A total of 61 recombinants from nine putative recombination events were detected with RDP3, especially the breakpoints of six events which have both two parental sequences that can be determined precisely with Simplot. To our knowledge, 53 recombinants were found for the first time. Our study also suggests that a relatively high recombination frequency occurs in the PreC/C gene region and the position near gene boundaries.

Phylogenetics PG_Q0003


Title : A novel, combined approach to assessing species delimitation and biogeography within the well-known desmid species Micrasterias fimbriata and M. rotata (Desmidiales, Streptophyta)
Author :   Jiří Neustupa, Jan Šťastný, Katarína Nemjová, Petra Mazalová, Emma Goodyer, Aloisie Poulíčková and Pavel Škaloud
Year Publish : 2011
Place of Publish : Springer
Abstract :

Morphological species of freshwater microalgae often have broad geographic distribution. However, traditional species concepts have been challenged by the results of molecular phylogenetic analyses that mostly indicate higher diversity than was previously recognized by purely morphological approaches. A degree of phenotypic differentiation or different geographic distribution of species defined by molecular data remains largely unknown. In this study, we analyzed a pair of well-known and widely distributed desmid species (Micrasterias fimbriata and M. rotata) and tested for their phylogenetic and morphological homogeneity as well as their geographic distribution. Geometric morphometric and morphological attributes of cells were used in combination with genetic analysis of the trnG ucc sequences of 30 strains isolated from a variety of European locations and obtained from culture collections. Micrasterias rotata proved to be phylogenetically homogenous across Europe while M. fimbriata turned out to be composed of two firmly delimited lineages, differing by molecular as well as by morphometric and morphological data. Published records of traditional M. fimbriata were also included in the classification discrimination analysis and were placed into the newly identified lineages upon comparison to the morphometric data collected from living material. Largely disparate geographic patterns were revealed within traditional M. fimbriata. One phylogenetic lineage is frequent in central and eastern Europe, but occurs also in the British Isles. A second lineage has been recorded in North America and in Western Europe, where its distribution is possibly limited to the west of the Rhine River. Interestingly, the morphometric analyses of the published records illustrated that the geographic differences have remained largely unchanged since the 1850s, indicating a previously unknown distributional stability among microalgal species groups such as the desmids.

Phylogenetics PG_Q0002


Title : THE ARIID CATFISHES OF SINGAPORE
Author :   Heok Hee Ng
Year Publish : 2012
Place of Publish : National University of Singapore
Abstract :
This study verifies the presence of nine ariid catfish species from Singapore waters based on museum material. They are Arius cf. gagora, Arius leptonotacanthus, Arius oetik, Hemiarius sona, Hexanematichthys sagor, Netuma bilineata, Osteogeneiosus militaris, Plicofollis argyropleuron, and Plicofollis nella. Arius cf. gagora and Netuma bilineata are new records for Singapore, while Hemiarius sona is recorded for the first time in Singapore in more than a century. The occurrence of Cryptarius truncatus in Singapore waters is considered doubtful

Phylogenetics PG_Q0001


Title : Sotoa, a new genus of Spiranthinae (Orchidaceae) from Mexico and the southern United States
Author :  Gerardo A. Salazar & Claudia Ballesteros-Barrera
Year Publish : 2010
Place of Publish: LANKESTERIANA
Abstract :

Generic placement of “Deiregyne” confusa and “D.” durangensis has been inconsistent among several
recent classifications of subtribe Spiranthinae based mainly on floral characters. In this work, we assessed
the systematic position of these two species by means of cladistic parsimony analyses of nuclear (nrITS)
and plastid (trnL-trnF) DNA sequences of 36 species/21 genera of Spiranthinae. Additionally, perceived
differences in habitat preference between the two species were evaluated using geographic information
system and niche modeling tools. Our results show that, in spite of their striking similarity in overall flower
morphology, “D.” confusa and “D.” durangensis are only distantly related to one another. Instead, the former
species is strongly supported as sister to Svenkoeltzia, whereas the latter groups with Schiedeella. Niche
modeling revealed noticeable differences in the two species’ ecological preferences; no overlap of their
potential distribution areas (as inferred using the Maxent modeling method) was predicted. A new monotypic
genus, Sotoa, is proposed to accommodate “Deiregyne” confusa on the basis of genetic, morphological
and (inferred) reproductive differences from other genera of the subtribe. The main morphological feature
distinguishing Sotoa from other Spiranthinae is the folding of the bottom surface of the nectary, which is
deeply concave from outside, resulting in an internally convex surface that is covered by dense pubescence

Phylogenetics PG_J0010


Title : Molecular systematics of the skate subgenus Arctoraja (Bathyraja: Rajidae) and support for an undescribed species, the leopard skate, with comments on the phylogenetics of Bathyraja
Author :  Ingrid B. Spies, Duane E. Stevenson, James W. Orr and Gerald R. Hoff
Year Publish : 2011
Place of Publish : Springer
Abstract :

Sequence variability in the cytochrome c oxidase I (COI) gene from 226 samples of the species previously considered Bathyraja parmifera (Rajidae) revealed three distinct haplotypes, one of which represents an undescribed species, the leopard skate. Further genetic examination of four closely related North Pacific and Bering Sea skate species, Bathyraja parmifera, B. simoterus, B. smirnovi, and the leopard skate in comparison with 19 related species indicates that together these four species comprise the subgenus Arctoraja. Phylogenetic analysis suggests that Arctoraja is monophyletic, but that the genus Bathyraja may be paraphyletic due to the phylogenetic position of Rhinoraja.


Phylogenetics PG_J0009


Title : An overview of Quercus: classification and phylogenetics with comments on differences in wood anatomy
Author :  Kevin C. Nixon
Year Publish : 2007
Place of Publish : texasoakwilt.org
Abstract :

The oaks (genus Quercus) are one of the most important groups of flowering plants and
dominate large regions of the northern hemisphere. They are most prevalent in subtropical,
temperate, and montane tropical regions. Quercus is phylogenetically divided into at least five
major groups, of which three (the red oaks, white oaks, and intermediate oaks) are native to the
New World. Overall, there are more than 200 species of oak in the Western Hemisphere, and
probably a larger number in Asia, and relatively few in Europe. The center of diversity in the
Americas is in the highlands of Mexico, with a secondary center in the southern United States.
From the standpoint of susceptibility to disease, the phylogenetic groupings have some
predictive capability, and in some cases this may be related to differences in ecology,
physiology, and wood anatomy. White oaks in general are more diverse in the drier parts of
North America, and have heartwood that is typically blocked by tyloses, while red oaks generally
have fewer tyloses. Because tyloses block water flow through the heartwood, white oak wood
makes good wine barrels while red oak wood does not. Given the greater susceptibility of red
oaks to both oak wilt and sudden oak death (SOD), these differences in wood anatomy may be
relevant

Phylogenetics PG_J0008


Title : Rate variations, phylogenetics, and partial orders
Author : Sonja J. Prohaska, Guido Fritzsch and Peter F. Stadler
Year Publish : 2008
Place of Publish: bierinformatik.de
Abstract :

The systematic assessment of rate variations across large datasets requires a systematic approach for summarizing results from individual tests. Often, this is performed by coarse-graining the phylogeny to consider rate variations at the level of subclades. In a phylogeographic setting, however, one is often more interested in other partitions of the data, and in an exploratory mode a pre-specified subdivision of the data is often undesirable. We propose here to arrange rate variation data as the partially ordered set defined by the significant test results.

Phylogenetics PG_J0007


Title : Revisiting an equivalence between maximum parsimony and maximum likelihood methods in phylogenetics
Author :  Mareike Fischer and Bhalchandra Thatte
Year Publish : 2010
Place of Publish: Springer New York
Abstract :

Tuffley and Steel (Bull. Math. Biol. 59:581–607, 1997) proved that maximum likelihood and maximum parsimony methods in phylogenetics are equivalent for sequences of characters under a simple symmetric model of substitution with no common mechanism. This result has been widely cited ever since. We show that small changes to the model assumptions suffice to make the two methods inequivalent. In particular, we analyze the case of bounded substitution probabilities as well as the molecular clock assumption. We show that in these cases, even under no common mechanism, maximum parsimony and maximum likelihood might make conflicting choices. We also show that if there is an upper bound on the substitution probabilities which is ‘sufficiently small’, every maximum likelihood tree is also a maximum parsimony tree (but not vice versa).

Phylogenetics PG_J0006


Title : Sporal characters in Gomphales and their significance for phylogenetics
Author :  Margarita Villegas, Joaquín Cifuentes and Arturo Estrada-Torres
Year Publish : 2005
Place of Publish: Fungal Diversity
Abstract :

Traditionally, sporal characters, such as color, shape and ornamentation, have been important in differentiating the various genera within the Gomphales. In some instances, however, no precise analyses have been made that would allow us to build primary homologies between these and other spore features. For this study, the characteristics of the basidiospores of 14 taxa of Gomphales were examined, using both photonic and electronic microscopy. These examinations clearly demonstrated that spore ornamentation is a very variable character and data, such as the base shape of the spore and the hilar appendix, previously not considered in the taxonomy of this group, can be very informative at this level.

Phylogenetics PG_J0005


Title : Molecular phylogenetics employing modern and ancient DNA
Author : Carsten M. Pusch, Martina Broghammer, Nikolaus Blin
Year Publish : 2003
Place of Publish: J. Appl. Genet
Abstract :

Comparative studies of DNA in recent populations and characterisation of ancient hereditary material have contributed very interesting facts to our understanding of evolution of modern mankind. Analysis of DNA homology in related species, assessment of mutations and polymorphisms in various populations and new DNA sequence data from prehistoric finds allowed – via sophisticated DNA extraction techniques, PCR, sequencing and digitalised processing of genetic information – insights into possible roots of Homo sapiens and related species, migration patterns and ancient cultural habits, thus enriching the palaeoanthropological discipline. However, a presentation of this development would not be complete without pointing towards the methodological limitations and manifold presentations burdened with artifacts, data misinterpretation and unjustified conclusions. Presently, this modern field of research is in its consolidation phase and new parameters for quality control and authentication are being implemented to avoid spectacular but unfounded reports. It is expected that most of the problems connected to old biomolecules may be closely related to fossilisation parameters. The future challenge will be the full understanding of the complex and multi-faceted processes underlying diagenesis, including the elucidation of nucleic acid “postmortem damage”.

Phylogenetics PG_J0004


Title : From Haeckel’s phylogenetics and Hennig’s cladistics to the method of maximum likelihood: Advantages and limitations of modern and traditional approaches to phylogeny reconstruction
Author : V. A. Lukhtanov
Year Publish : 2010
Place of Publish: MAIK Nauka/Interperiodica distributed exclusively by Springer Science+Business Media LLC.
Abstract :
The maximum likelihood and Bayesian methods are based on parametric models of character evolution. They assume that if we know these models as well as distribution of character states in studied organisms, we can infer the probability of different phylogenetic trajectories leading from ancestors to modern forms. In fact, these methods are mathematized variants of the traditional Haeckel’s approach to phylogeny reconstruction. In contrast to classical and parsimonious cladistics, they infer phylogenies without such limitations as necessity of strictly dichotomous evolution, exclusion of plesiomorphic characters, and acceptance of only holophyletic taxa. They assume that evolution may be reticulated, any homologous characters—both apomorphic and plesiomorphic—can be used for inferring phylogenies, and interpretation of evolutionary lineages as taxa is optional. Thus, the main difference between the new and more traditional approaches to phylogeny reconstruction lies not in the characters used (molecular or morphological) but in the methodology of analysis. It must be admitted that a revolution began in phylogenetics 10–20 years ago. However, the fundamental changes in phylogenetics have been carried out so calmly and neatly by the people who started this revolution, that many systematists still do not realize their importance.

Phylogenetics PG_J0003


Title : Molecular phylogenetics and host ranges of the Melanterius weevils used as biocontrol agents of Australian acacias in South Africa
Author : GM Clarke
Year Publish : 2002
Place of Publish: CSIRO Entomology, Canberra
URL :
Abstract :

Phylogenetics PG_J0002


Title : Which was first, TSD or GSD
Author : Janzen, F. J. & J. G. Krenz
Year Publish : 2004
Place of Publish: Smithsonian Institution
Abstract :

Phylogenetics PG_J0001


Title : Morphological phylogenetics of the early Bivalvia
Author : Joseph G. Carter, David C. Campbell, Matthew R. Campbell
Year Publish : 2006
Place of Publish: International Congress on Bivalvia
URL :  http://www.senckenberg.uni-frankfurt.de/odes/06-16/Carter_et_al_Phylogeny-EarlyBiv.pdf
Abstract :

Genomics GM_Q0010


Title : Everything I need to know about genomics, I learned from Yogi Berra
Author : Gregory A Petsko
Year Publish : 2002
Place of Publish : BioMed Central Ltd
Abstract :

“I really didn’t say everything I said.” Lawrence Peter (Yogi) Berra

As 2002 draws to a close, I find myself contemplating the future of biology with hope, but also with uncertainty and some apprehension. Will biology continue to be the richest source of discoveries and ideas in all the sciences? Will it become just another Big Science, like nuclear physics, in which the many slave for the enrichment of the few and powerful? Will it bog down in its own arrogance? Will it lose the trust of the public as bioterrorism fears grow and genetically modified organisms try to pervade the marketplace? Or will it become the savior of mankind, ridding the world of disease and famine? And will I be able to get in on any of this?


Genomics GM_Q0009


Title : Beyond 100 genomes
Author : Paul Janssen, Benjamin Audit, Ildefonso Cases, Nikos Darzentas, Leon Goldovsky, Victor Kunin, Nuria Lopez-Bigas, José Manuel Peregrin-Alvarez, José B Pereira-Leal, Sophia Tsoka and Christos A Ouzounis
Year Publish : 2003
Place of Publish : BioMed Central Ltd
Abstract :

Since the publication of the first entire genome sequence seven years ago [1], a multitude of other genomes have been - or are in the process of being - sequenced [2]. By the end of 2002, we witnessed the landmark submission of the 100th complete genome sequence in the databases [3]. There are now 106 complete genomes in the public domain, thanks to advances in sequencing technology and sustained funding. An overview, and in particular the rank ordering, of these genomes reveals certain interesting trends and provides valuable insights into possible future developments.

Genomics GM_Q0008


Title : Approaching biomarker discovery through genomics
Author : Stephen S. Rich
Year Publish : 2008
Place of Publish : Springer New York
Abstract :

The promise of genomic medicine lies in the ability to identify those factors that modify risk of disease at the individual level and, once identified, to be able to provide a personalized treatment or intervention to ablate the disease process. This concept is based upon a number of assumptions and current limitations that genomic science has yet to address. Critical to the development of personalized medicine is the determination of the genetic and epidemiologic cause of complex human disease, such as coronary heart disease, diabetes, asthma, and stroke. The risk factors that predispose an individual to any one of these disorders may not be unique, and the genomic profiles may be similar. Increasing the complexity of understanding the pathogenesis of these disorders is the growing recognition that the genetic risk factors likely interact not only with each other but also with poorly understood environmental factors. Ultimately, the prediction of an individual’s risk for any disorder will be determined by their genotype and their environmental exposures; however, in the absence of a defined genomic fingerprint, a subset of confirmed genetic risk factors can be used to help define biomarkers of disease. Clinically validated biomarkers can then serve as surrogates for the combined effects of genotype and environment and provide insights into disease pathogenesis.

Genomics GM_Q0007


Title : Unassigned MURF1 of kinetoplastids codes for NADH dehydrogenase subunit 2
Author : Sivakumar Kannan and Gertraud Burger
Year Publish : 2008
Place of Publish : BioMed Central Ltd
Abstract :

Background: In a previous study, we conducted a large-scale similarity-free function prediction of mitochondrion-encoded hypothetical proteins, by which the hypothetical gene murf1 (maxicircle unidentified reading frame 1) was assigned as nad2, encoding subunit 2 of NADH dehydrogenase (Complex I of the respiratory chain). This hypothetical gene occurs in the mitochondrial genome of kinetoplastids, a group of unicellular eukaryotes including the causative agents of African sleeping sickness and leishmaniasis. In the present study, we test this assignment by using bioinformatics methods that are highly sensitive in identifying remote homologs and confront the prediction with available biological knowledge.
Results: Comparison of MURF1 profile Hidden Markov Model (HMM) against function-known profile HMMs in Pfam, Panther and TIGR shows that MURF1 is a Complex I protein, but without specifying the exact subunit. Therefore, we constructed profile HMMs for each individual subunit, using all available sequences clustered at various identity thresholds. HMM-HMM comparison of these individual NADH subunits against MURF1 clearly identifies this hypothetical protein as NAD2. Further, we collected the relevant experimental information about kinetoplastids, which provides additional evidence in support of this prediction.

Genomics GM_Q0006


Title : Automated multi-dimensional purification of tagged proteins
Author : Jill A. Sigrell, Pär Eklund, Markus Galin, Lotta Hedkvist, Pia Liljedahl, Christine Markeland Johansson, Thomas Pless and Karin Torstenson
Year Publish : 2003
Place of Publish : Springer Netherlands
Abstract :

The capacity for high throughput purification (HTP) is essential in fields such as structural genomics where large numbers of protein samples are routinely characterized in, for example, studies of structural determination, functionality and drug development. Proteins required for such analysis must be pure and homogenous and available in relatively large amounts. ÄKTA™ 3D system is a powerful automated protein purification system, which minimizes preparation, run-time and repetitive manual tasks. It has the capacity to purify up to 6 different His6- or GST-tagged proteins per day and can produce 1–50 mg protein per run at >90% purity. The success of automated protein purification increases with careful experimental planning. Protocol, columns and buffers need to be chosen with the final application area for the purified protein in mind.

Genomics GM_Q0005


Title : Proteomics technologies and challenges
Author : William C.S. Cho
Year Publish : 2007
Place of Publish : Geno. Prot. Bioinfo
Abstract :

Proteomics is the study of proteins and their interactions in a cell. With the completion of the Human Genome Project, the emphasis is shifting to the protein complement of the human organism. Because proteome reflects more accurately on the dynamic state of a cell, tissue, or organism, much is expected from proteomics to yield better disease markers for diagnosis and therapy monitoring. The advent of proteomics technologies for global detection and quantitation of proteins creates new opportunities and challenges for those seeking to gain greater understanding of diseases. High-throughput proteomics technologies combining with advanced bioinformatics are extensively used to identify molecular signatures of diseases based on protein pathways and signaling cascades. Mass spectrometry plays a vital role in proteomics and has become an indispensable tool for molecular and cellular biology. While the potential is great, many challenges and issues remain to be solved, such as mining low abundant proteins and integration of proteomics with genomics and metabolomics data. Nevertheless, proteomics is the foundation for constructing and extracting useful knowledge to biomedical research. In this review, a snapshot of contemporary issues in proteomics technologies is discussed.

Genomics GM_Q0004


Title : Individual gene cluster statistics in noisy maps
Author : Narayanan Raghupathy and Dannie Durand
Year Publish : 2005
Place of Publish : Springer Berlin / Heidelberg
Abstract :

Identification of homologous chromosomal regions is important for understanding evolutionary processes that shape genome evolution, such as genome rearrangements and large scale duplication events. If these chromosomal regions have diverged significantly, statistical tests to determine whether observed similarities in gene content are due to history or chance are imperative. Currently available methods are typically designed for genomic data and are appropriate for whole genome analyses. Statistical methods for estimating significance when a single pair of regions is under consideration are needed. We present a new statistical method, based on generating functions, for estimating the significance of orthologous gene clusters under the null hypothesis of random gene order. Our statistics is suitable for noisy comparative maps, in which a one-to-one homology mapping cannot be established. It is also designed for testing the significance of an individual gene cluster in isolation, in situations where whole genome data is not available. We implement our statistics in Mathematica and demonstrate its utility by applying it to the MHC homologous regions in human and fly.

Genomics GM_Q0003


Title : Involvement of potential pathways in malignant transformation from oral leukoplakia to oral squamous cell carcinoma revealed by proteomic analysis
Author : Zhi Wang, Xiaodong Feng, Xinyu Liu, Lu Jiang, Xin Zeng, Ning Ji, Jing Li, Longjiang Li and Qianming Chen
Year Publish : 2009
Place of Publish : BioMed Central Ltd
Abstract :

Background: Oral squamous cell carcinoma (OSCC) is one of the most common forms of cancer associated with the presence of precancerous oral leukoplakia. Given the poor prognosis associated with oral leukoplakia, and the difficulties in distinguishing it from cancer lesions, there is an urgent need to elucidate the molecular determinants and critical signal pathways underlying the malignant transformation of precancerous to cancerous tissue, and thus to identify novel diagnostic and therapeutic targets.
Results: We have utilized two dimensional electrophoresis (2-DE) followed by ESI-Q-TOF-LC-MS/MS to identify proteins differentially expressed in six pairs of oral leukoplakia tissues with dysplasia and oral squamous cancer tissues; each pair was collected from a single patient. Approximately 85 differentially and constantly expressed proteins (> two-fold change, P < 0.05) were identified, including 52 up-regulated and 33 down-regulated. Gene ontological methods were employed to identify the biological processes that were over-represented in this carcinogenic stage. Biological networks were also constructed to reveal the potential links between those protein candidates. Among them, three homologs of proteasome activator PA28 (α, β and γ) were shown to have up-regulated mRNA levels in OSCC cells relative to oral keratinocytes.
Conclusion: Varying levels of differentially expressed proteins were possibly involved in the malignant transformation of oral leukoplakia. Their expression levels, bioprocess, and interaction networks were analyzed using a bioinformatics approach. This study shows that the three homologs of PA28 may play an important role in malignant transformation and is an example of a systematic biology study, in which functional proteomics were constructed to help to elucidate mechanistic aspects and potential involvement of proteins. Our results provide new insights into the pathogenesis of oral cancer. These differentially expressed proteins may have utility as useful candidate markers of OSCC.

Genomics GM_Q0002


Title : The information nucleus–a new concept to enhance sheep industry genetic improvement
Author : N.M. Fogarty, R.G. Banks, J.H.J. van der Werf, A.J. Ball, J.P. Gibson
Year Publish : 2006
Place of Publish : Proc. Assoc. Advmt. Anim. Breed. Genet
Abstract :

The Information Nucleus is an innovation for sheep industry improvement that is a program of the new CRC for Sheep Industry Innovation. It will allow breeders to quickly exploit new technology and molecular information to achieve more rapid genetic improvement in the industry. Key young industry sires are progeny tested for an extensive range of traits in widely differing environments. Genetic information will be generated on new traits and those that are difficult to measure commercially. Some of the benefits will flow immediately to industry through enhanced accuracy of Australian Sheep Breeding Values (ASBVs) for current and new traits. The longer term benefits to industry will flow from development of sheep genomic technologies such as whole genome scans and molecular breeding values combined with existing quantitative ASBVs. The Information Nucleus involves mating 100 sires to 5000 ewes annually across 8 sites over the range of sheep production environments in Australia. The progeny represent the major Merino and crossbred types in the industry and will be evaluated for a wide range of growth, carcass, meat, wool, reproduction and parasite resistance traits.

Genomics GM_Q0001


Title : Direct sequencing and expression analysis of a large number of miRNAs in Aedes aegypti and a multi-species survey of novel mosquito miRNAs
Author : Song Li, Edward A Mead, Shaohui Liang and Zhijian Tu
Year Publish : 2009
Place of Publish : BioMed Central Ltd
Abstract :

Background: MicroRNAs (miRNAs) are a novel class of gene regulators whose biogenesis involves hairpin structures called precursor miRNAs, or pre-miRNAs. A pre-miRNA is processed to make a miRNA:miRNA* duplex, which is then separated to generate a mature miRNA and a miRNA*. The mature miRNAs play key regulatory roles during embryonic development as well as other cellular processes. They are also implicated in control of viral infection as well as innate immunity. Direct experimental evidence for mosquito miRNAs has been recently reported in anopheline mosquitoes based on small-scale cloning efforts.
Results: We obtained approximately 130,000 small RNA sequences from the yellow fever mosquito, Aedes aegypti, by 454 sequencing of samples that were isolated from mixed-age embryos and midguts from sugar-fed and blood-fed females, respectively. We also performed bioinformatics analysis on the Ae. aegypti genome assembly to identify evidence for additional miRNAs. The combination of these approaches uncovered 98 different pre-miRNAs in Ae. aegypti which could produce 86 distinct miRNAs. Thirteen miRNAs, including eight novel miRNAs identified in this study, are currently only found in mosquitoes. We also identified five potential revisions to previously annotated miRNAs at the miRNA termini, two cases of highly abundant miRNA* sequences, 14 miRNA clusters, and 17 cases where more than one pre-miRNA hairpin produces the same or highly similar mature miRNAs. A number of miRNAs showed higher levels in midgut from blood-fed female than that from sugar-fed female, which was confirmed by northern blots on two of these miRNAs. Northern blots also revealed several miRNAs that showed stage-specific expression. Detailed expression analysis of eight of the 13 mosquito-specific miRNAs in four divergent mosquito genera identified cases of clearly conserved expression patterns and obvious differences. Four of the 13 miRNAs are specific to certain lineage(s) within mosquitoes.
Conclusion: This study provides the first systematic analysis of miRNAs in Ae. aegypti and offers a substantially expanded list of miRNAs for all mosquitoes. New insights were gained on the evolution of conserved and lineage-specific miRNAs in mosquitoes. The expression profiles of a few miRNAs suggest stage-specific functions and functions related to embryonic development or blood feeding. A better understanding of the functions of these miRNAs will offer new insights in mosquito biology and may lead to novel approaches to combat mosquito-borne infectious diseases.

Genomics GM_J0010


Title : Immunology and functional genomics of Behcet's disease
Author : M. Zierhut, N. Mizuki, S. Ohno, H. Inoko, A. Gül, K. Onoé and E. Isogai
Year Publish : 2003
Place of Publish : Springer Berlin / Heidelberg
Abstract :

Behçet's disease (BD) is a multisystemic inflammatory disorder. Although the cause and pathogenesis of BD are still unclear, there is evidence for genetic, immunologic and infectious factors at the onset or in the course of BD. This review focusses on the functional genomics and immunology of BD. HLA-B51 is the major disease susceptibility gene locus in BD. An increased number of γδ T cells in the peripheral blood and in the involved tissues have been reported. However, the T cells at the sites of inflammation appear to be a phenotypically distinct subset. There is also a significant γδ T cell proliferative response to mycobacterial 65-kDa heat shock protein peptides. Homologous peptides derived from the human 60-kDa heat shock protein were observed in BD patients. There is evidence that natural killer T cells may also play a role in BD.
Behçet's disease - vasculitis - uveitis - experimental autoimmune uveitis - HLA - streptococcus - herpes simplex virus - heat shock protein - T cells - NK-T-cells - neutrophils - cytokines - endothelial dysfunction - coagulation and fibrinolytic pathway abnormalities

Genomics GM_J0009


Title : Pharmacogenetics/genomics of membrane transporters in cancer chemotherapy
Author : Ying Huang
Year Publish : 2007
Place of Publish : Springer Berlin / Heidelberg
Abstract :

Inter-individual variability in drug response and the emergence of adverse drug reactions are main causes of treatment failure in cancer therapy. Recently, membrane transporters have been recognized as an important determinant of drug disposition, thereby affecting chemosensitivity and -resistance. Genetic factors contribute to inter-individual variability in drug transport and targeting. Therefore, pharmacogenetic studies of membrane transporters can lead to new approaches for optimizing cancer therapy. This review discusses genetic variations in efflux transporters of the ATP-binding cassette (ABC) family such as ABCB1 (MDR1, P-glycoprotein), ABCC1 (MRP1), ABCC2 (MRP2) and ABCG2 (BCRP), and uptake transporters of the solute carrier (SLC) family such as SLC19A1 (RFC1) and SLCO1B1 (SLC21A6), and their relevance to cancer chemotherapy. Furthermore, a pharmacogenomic approach is outlined, which, using correlations between the growth inhibitory potency of anticancer drugs and transporter gene expression in multiple human cancer cell lines, has shown promise for determining the relevant transporters for any given drug and predicting anticancer drug response.


Genomics GM_J0008


Title : A rapid and efficient method for purifying high quality total RNA from peaches (Prunus persica) for functional genomics analyses
Author : LEE MEISEL, BEATRIZ FONSECA, SUSANA GONZÁLEZ, RICARDO BAEZA-YATES, VERONICA CAMBIAZO, REINALDO CAMPOS, MAURICIO GONZALEZ, ARIEL ORELLANA, JULIO RETAMALES and HERMAN SILVA
Year Publish : 2005
Place of Publish: Biol Res
Abstract :

Prunus persica has been proposed as a genomic model for deciduous trees and the Rosaceae family. Optimized protocols for RNA isolation are necessary to further advance studies in this model species such that functional genomics analyses may be performed. Here we present an optimized protocol to rapidly and efficiently purify high quality total RNA from peach fruits (Prunus persica). Isolating high-quality RNA from fruit tissue is often difficult due to large quantities of polysaccharides and polyphenolic compounds that accumulate in this tissue and co-purify with the RNA. Here we demonstrate that a modified version of the method used to isolate RNA from pine trees and the woody plant Cinnamomum tenuipilum is ideal for isolating high quality RNA from the fruits of Prunus persica. This RNA may be used for many functional genomic based experiments such as RT-PCR and the construction of large-insert cDNA libraries.

Genomics GM_J0007


Title : Informatics center for mouse genomics
Author : Glenn D. Rosen, Nathan T. La Porte, Boris Diechtiareff, Christopher J. Pung, Jonathan Nissanov, Carl Gustafson, Louise Bertrand, Smadar Gefen, Yingli Fan and Oleh J. Tretiak, et al.
Year Publish : 2003
Place of Publish: Springer Berlin / Heidelberg
Abstract :

In recent years, there has been an explosion in the number of tools and techniques available to researchers interested in exploring the genetic basis of all aspects of central nervous system (CNS) development and function. Here, we exploit a powerful new reductionist approach to explore the genetic basis of the very significant structural and molecular differences between the brains of different strains of mice, called either complex trait or quantitative trait loci (QTL) analysis. Our specific focus has been to provide universal access over the web to tools for the genetic dissection of complex traits of the CNS—tools that allow researchers to map genes that modulate phenotypes at a variety of levels ranging from the molecular all the way to the anatomy of the entire brain.
Our website, The Mouse Brain Library (MBL; http://mbl.org) is comprised of four interrelated components that are designed to support this goal: The Brain Library, iScope, Neurocartographer, and WebQTL. The centerpiece of the MBL is an image database of histologically prepared museum-quality slides representing nearly 2000 mice from over 120 strains—a library suitable for stereologic analysis of regional volume. The iScope provides fast access to the entire slide collection using streaming video technology, enabling neuroscientists to acquire high-magnification images of any CNS region for any of the mice in the MBL. Neurocartographer provides automatic segmentation of images from the MBL by warping precisely delineated boundaries from a 3D atlas of the mouse brain. Finally, WebQTL provides statistical and graphical analysis of linkage between phenotypes and genotypes.

Genomics GM_J0006


Title : From genetical genomics to systems genetics: potential applications in quantitative genomics and animal breeding
Author : Haja N. Kadarmideen, Peter von Rohr and Luc L.G. Janss
Year Publish : 2006
Place of Publish: Springer Berlin / Heidelberg
Abstract :

This article reviews methods of integration of transcriptomics (and equally proteomics and metabolomics), genetics, and genomics in the form of systems genetics into existing genome analyses and their potential use in animal breeding and quantitative genomic modeling of complex traits. Genetical genomics or the expression quantitative trait loci (eQTL) mapping method and key findings in this research are reviewed. Various procedures and potential uses of eQTL mapping, global linkage clustering, and systems genetics are illustrated using actual analysis on recombinant inbred lines of mice with data on gene expression (for diabetes- and obesity-related genes), pathway, and single nucleotide polymorphism (SNP) linkage maps. Experimental and bioinformatics difficulties and possible solutions are discussed. The main uses of this systems genetics approach in quantitative genomics were shown to be in refinement of the identified QTL, candidate gene and SNP discovery, understanding gene-environment and gene-gene interactions, detection of candidate regulator genes/eQTL, discriminating multiple QTL/eQTL, and detection of pleiotropic QTL/eQTL, in addition to its use in reconstructing regulatory networks. The potential uses in animal breeding are direct selection on heritable gene expression measures, termed “expression assisted selection,” and genetical genomic selection of both QTL and eQTL based on breeding values of the respective genes, termed “expression-assisted evaluation.”

Genomics GM_J0005


Title : Rapidly developing functional genomics in ecological model systems via 454 transcriptome sequencing
Author : Christopher W. Wheat
Year Publish : 2010
Place of Publish: Springer Berlin / Heidelberg
Abstract :

Next generation sequencing technology affords new opportunities in ecological genetics. This paper addresses how an ecological genetics research program focused on a phenotype of interest can quickly move from no genetic resources to having various functional genomic tools. 454 sequencing and its error rates are discussed, followed by a review of de novo transcriptome assemblies focused on the first successful de novo assembly which happens to be in an ecological model system (the Glanville fritillary butterfly). The potential future developments in 454 sequencing are also covered. Particular attention is paid to the difficulties ecological geneticists are likely to encounter through reviewing relevant studies in both model and non-model systems. Various post-sequencing issues and applications of 454 generated data are presented (e.g. database management, microarray construction, molecular marker and candidate gene development). How to use species with genomic resources to inform study of those without is also discussed. In closing, some of the drawbacks of 454 sequencing are presented along with future prospects of this technology.

Genomics GM_J0004


Title : Sparse statistical modelling in gene expression genomics
Author : Joe Lucas, Carlos Carvalho, Quanli Wang, Andrea Bild, Joe Nevins and Mike West
Year Publish : 2006
Place of Publish: University Press
Abstract :

The concept of sparsity is more and more central to practical data analysis and inference with increasingly high-dimensional data. Gene expression genomics is a key example context. As part of a series of projects that has developed Bayesian methodology for large-scale regression, ANOVA and latent factor models, we have extended traditional Bayesian “variable selection” priors and modelling ideas to new hierarchical sparsity priors that are providing substantial practical gains in addressing false discovery and isolating significant gene-specific parameters/effects in highly multivariate studies involving thousands of genes. We discuss and review these developments, in the contexts of multivariate regression, ANOVA and latent factor models for multivariate gene expression data arising in either observational or designed experimental studies. The development includes the use of sparse regression components to provide gene-sample specific normalisation/correction based on control and housekeeping factors, an important general issue and one that can be critical - and critically misleading if ignored - in many gene expression studies. Two rich data sets are used to provide context and illustration. The first data set arises from a gene expression experiment designed to investigate the transcriptional response - in terms of responsive gene subsets and their expression signatures - to interventions that up-regulate a series of key oncogenes. The second data set is observational, breast cancer tumour-derived data evaluated utilising a sparse latent factor model to define and isolate factors underlying the hugely complex patterns of association in gene expression patterns. We also mention software that implements these and other models and methods in one comprehensive framework.

Genomics GM_J0003


Title : Effects of tyrosine hydroxylase mutants on locomotor activity in Drosophila: a study in functional genomics
Author : Robert G. Pendleton, Aseel Rasheed, Thomas Sardina, Tim Tully and Ralph Hillman
Year Publish : 2002
Place of Publish: Springer Berlin / Heidelberg
Abstract :

The brain of the adult fruit fly, Drosophila melanogaster, contains tyrosine hydroxylase, the rate-limiting enzyme required for catecholamine biosynthesis, as well as dopa decarboxylase. Catecholamines, principally dopamine, are also present. We have previously shown that pharmacological inhibition of tyrosine hydroxylase with α-methyl-p-tyrosine results in a dose-related inhibition of locomotor activity in adult organisms. Similar results were found with reserpine, a well-known inhibitor of catecholamine uptake into storage granules. The drug-induced inhibition could be prevented in each case by the concomitant administration of l-dopa. The single-copy gene coding for tyrosine hydroxylase in Drosophila is pale (ple). Both null and temperature-sensitive loss of function mutant alleles of ple are recessive embryonic lethals. Heterozygous null mutant flies have normal locomotor activity demonstrating that only a single dose of the wild type form of ple is required to support normal function. Both hemizygous and homozygous temperature-sensitive ple mutants (ple ts1) also show normal locomotor activity at the permissive temperature for this mutant allele (18°C), which progressively declines as the temperature is increased to its restrictive level (29°C). These abnormal locomotor effects are reversible by l-dopa. Thus the effects on locomotor activity resulting from the pharmacological inhibition of catecholamine synthesis or storage are the same as those resulting from lack of tyrosine hydroxylase expression. These findings indicate that brain catecholamine loss decreases locomotor activity in the fly, as it does in mammals, and demonstrate the ability of functional genomic studies to mimic that of pharmacological inhibition of enzyme function or other similar processes.

Genomics GM_J0002


Title : Chromosome-based genomics in the cereals
Author : Jaroslav Doležel, Marie Kubaláková, Etienne Paux, Jan Bartoš and Catherine Feuillet
Year Publish : 2007
Place of Publish: Springer Berlin / Heidelberg
Abstract :

The cereals are of enormous importance to mankind. Many of the major cereal species – specifically, wheat, barley, oat, rye, and maize – have large genomes. Early cytogenetics, genome analysis and genetic mapping in the cereals benefited greatly from their large chromosomes, and the allopolyploidy of wheat and oats that has allowed for the development of many precise cytogenetic stocks. In the genomics era, however, large genomes are disadvantageous. Sequencing large and complex genomes is expensive, and the assembly of genome sequence is hampered by a significant content of repetitive DNA and, in allopolyploids, by the presence of homoeologous genomes. Dissection of the genome into its component chromosomes and chromosome arms provides an elegant solution to these problems. In this review we illustrate how this can be achieved by flow cytometric sorting. We describe the development of methods for the preparation of intact chromosome suspensions from the major cereals, and their analysis and sorting using flow cytometry. We explain how difficulties in the discrimination of specific chromosomes and their arms can be overcome by exploiting extant cytogenetic stocks of polyploid wheat and oats, in particular chromosome deletion and alien addition lines. Finally, we discuss some of the applications of flow-sorted chromosomes, and present some examples demonstrating that a chromosome-based approach is advantageous for the analysis of the complex genomes of cereals, and that it can offer significant potential for the delivery of genome sequencing and gene cloning in these crops.

Genomics GM_J0001


Title : Environmental genomics: exploring the unmined richness of microbes to degrade xenobiotics

Author : L. Eyers, I. George, L. Schuler, B. Stenuit, S. N. Agathos and Said El Fantroussi

Year Publish : 2004

Place of Publish: Springer Berlin / Heidelberg

Abstract :

Increasing pollution of water and soils by xenobiotic compounds has led in the last few decades to an acute need for understanding the impact of toxic compounds on microbial populations, the catabolic degradation pathways of xenobiotics and the set-up and improvement of bioremediation processes. Recent advances in molecular techniques, including high-throughput approaches such as microarrays and metagenomics, have opened up new perspectives and pointed towards new opportunities in pollution abatement and environmental management. Compared with traditional molecular techniques dependent on the isolation of pure cultures in the laboratory, microarrays and metagenomics allow specific environmental questions to be answered by exploring and using the phenomenal resources of uncultivable and uncharacterized micro-organisms. This paper reviews the current potential of microarrays and metagenomics to investigate the genetic diversity of environmentally relevant micro-organisms and identify new functional genes involved in the catabolism of xenobiotics.

Saturday, October 27, 2012

Computational Biology CB_Q0010


Title : Constrained LCS: hardness and approximation
Author : Zvi Gotthilf, Danny Hermelin and Moshe Lewenstein
Year Publish : 2008
Place of Publish : Springer Berlin / Heidelberg
Abstract :

The problem of finding the longest common subsequence (LCS) of two given strings A1 and A2 is a well-studied problem. The constrained longest common subsequence (C-LCS) for three strings A1, A2 and B1 is the longest common subsequence of A1 and A2 that contains B1 as a subsequence. The fastest algorithm solving the C-LCS problem has a time complexity of O(m1 m2 n1), where m1, m2 and n1 are the lengths of A1, A2 and B1 respectively. In this paper we consider two general variants of the C-LCS problem. First we show that in case of two input strings and an arbitrary number of constraint strings, it is NP-hard to approximate the C-LCS problem. Moreover, it is easy to see that in case of an arbitrary number of input strings and a single constraint, the problem of finding the constrained longest common subsequence is NP-hard. Therefore, we propose a linear time approximation algorithm for this variant; our algorithm yields a 1/√(m_min |Σ|) approximation factor, where m_min is the length of the shortest input string and |Σ| is the size of the alphabet.
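
The C-LCS definition in this abstract can be illustrated with a small dynamic program. The sketch below is not taken from the paper. It is a textbook-style cubic table over prefixes of the three strings, and the function name constrained_lcs_length is hypothetical. It fills one cell per (i, j, k) triple, so it runs in time proportional to m1 * m2 * n1, the same order as the bound quoted above.

```python
def constrained_lcs_length(a1, a2, b):
    """Length of the longest common subsequence of a1 and a2 that
    contains b as a subsequence, or None if no such subsequence exists.

    Illustrative sketch only; names and structure are assumptions.
    """
    NEG = float("-inf")  # marks "no feasible subsequence"
    m1, m2, n = len(a1), len(a2), len(b)
    # table[i][j][k]: best length using a1[:i], a2[:j] while covering b[:k]
    table = [[[NEG] * (n + 1) for _ in range(m2 + 1)] for _ in range(m1 + 1)]
    for i in range(m1 + 1):
        for j in range(m2 + 1):
            for k in range(n + 1):
                if i == 0 or j == 0:
                    # An empty prefix can only cover an empty constraint.
                    table[i][j][k] = 0 if k == 0 else NEG
                    continue
                # Skip the last character of a1 or of a2.
                best = max(table[i - 1][j][k], table[i][j - 1][k])
                if a1[i - 1] == a2[j - 1]:
                    # Match without advancing the constraint.
                    best = max(best, table[i - 1][j - 1][k] + 1)
                    if k > 0 and a1[i - 1] == b[k - 1]:
                        # Match that also consumes one constraint character.
                        best = max(best, table[i - 1][j - 1][k - 1] + 1)
                table[i][j][k] = best
    result = table[m1][m2][n]
    return None if result == NEG else result
```

With an empty constraint the function reduces to the ordinary LCS length. A constraint that cannot be embedded in both input strings yields None.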


Computational Biology CB_Q0009


Title : An interactive multi-user system for simultaneous graph drawing
Author : Stephen G. Kobourov and Chandan Pitta
Year Publish : 2005
Place of Publish : Springer Berlin / Heidelberg
Abstract :

In this paper we consider the problem of simultaneous drawing of two graphs. The goal is to produce aesthetically pleasing drawings for the two graphs by means of a heuristic algorithm and with human assistance. Our implementation uses the DiamondTouch table, a multi-user, touch-sensitive input device, to take advantage of direct physical interaction of several users working collaboratively. The system can be downloaded at http://dt.cs.arizona.edu where it is also available as an applet.

Computational Biology CB_Q0008


Title : Hurdles hardly have to be heeded
Author : Krister M. Swenson, Yu Lin, Vaibhav Rajan and Bernard M. E. Moret
Year Publish : 2008
Place of Publish : Springer Berlin / Heidelberg
Abstract :

As data about genomic architecture accumulates, genomic rearrangements have attracted increasing attention. One of the main rearrangement mechanisms, inversions (also called reversals), was characterized by Hannenhalli and Pevzner and this characterization in turn extended by various authors. The characterization relies on the concepts of breakpoints, cycles, and obstructions colorfully named hurdles and fortresses. In this paper, we study the probability of generating a hurdle in the process of sorting a permutation if one does not take special precautions to avoid them (as in a randomized algorithm, for instance). To do this we revisit and extend the work of Caprara and of Bergeron by providing simple and exact characterizations of the probability of encountering a hurdle in a random permutation. Using similar methods we, for the first time, find an asymptotically tight analysis of the probability that a fortress exists in a random permutation.

Computational Biology CB_Q0007


Title : Homology search with fragmented nucleic acid sequence patterns
Author : Axel Mosig, Julian J. -L. Chen and Peter F. Stadler
Year Publish : 2007
Place of Publish : Springer Berlin / Heidelberg
Abstract :

The comprehensive annotation of non-coding RNAs in newly sequenced genomes is still a largely unsolved problem because many functional RNAs exhibit not only poorly conserved sequences but also large variability in structure. In many cases, such as Y RNAs, vault RNAs, or telomerase RNAs, sequences differ by large insertions or deletions and have only a few small sequence patterns in common.
Here we present fragrep2, a purely sequence-based approach to detect such patterns in complete genomes. A fragrep2 pattern consists of an ordered list of position-specific weight matrices (PWMs) describing short, approximately conserved sequence elements that are separated by intervals of non-conserved regions of bounded length. The program uses a fractional programming approach to align the PWMs to genomic DNA in order to allow for a bounded number of insertions and deletions in the patterns. These patterns are then combined to significant combinations of PWMs. At this step, a subset of PWMs may be deleted, i.e., have no match in the current region of the genome. The program furthermore estimates p- and E-values for the matches.
We apply fragrep2 to homology searches for RNase MRP, unveiling two previously unidentified matches as well as reproducing the results of two previous surveys. Furthermore, we complement the picture of vertebrate vault RNAs, a class of ncRNAs that has not received much attention so far.
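
The idea of a fragmented pattern, an ordered list of PWMs separated by gaps of bounded length, can be sketched in a few lines. This is a deliberately simplified illustration and not fragrep2 itself: it uses plain score thresholds, allows no insertions or deletions inside an element, never drops a PWM, and computes no p- or E-values. All names (consensus_pwm, pwm_hits, find_fragmented) are hypothetical, and the sequence is assumed to contain only A, C, G and T.

```python
def consensus_pwm(consensus, match=1.0, mismatch=-1.0):
    """Toy PWM: one column per position, scoring the consensus base higher."""
    return [{b: (match if b == c else mismatch) for b in "ACGT"} for c in consensus]


def pwm_hits(pwm, seq, threshold):
    """Start positions where the summed column scores reach the threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        score = sum(col[base] for col, base in zip(pwm, window))
        if score >= threshold:
            hits.append(start)
    return hits


def find_fragmented(seq, elements, gaps):
    """Find ordered chains of PWM hits separated by bounded gaps.

    elements: non-empty list of (pwm, threshold) pairs, in pattern order.
    gaps: list of (min_gap, max_gap) pairs, one per adjacent element pair.
    Returns a list of tuples holding the start position of each element.
    """
    hits = [pwm_hits(pwm, seq, thr) for pwm, thr in elements]
    results = []

    def extend(idx, chain):
        if idx == len(elements):
            results.append(tuple(chain))
            return
        prev_end = chain[-1] + len(elements[idx - 1][0])
        min_gap, max_gap = gaps[idx - 1]
        for start in hits[idx]:
            if min_gap <= start - prev_end <= max_gap:
                extend(idx + 1, chain + [start])

    for start in hits[0]:
        extend(1, [start])
    return results
```

For example, searching "TTGGACCCCTGCATT" for the elements GGA and TGCA with a gap of 0 to 10 bases reports one chain starting at positions 2 and 9, while tightening the gap to at most 3 bases reports none.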

Computational Biology CB_Q0006


Title : Time and space efficient RNA-RNA interaction prediction via sparse folding
Author : Raheleh Salari, Mathias Möhl, Sebastian Will, S. Cenk Sahinalp and Rolf Backofen
Year Publish : 2010
Place of Publish : Springer Berlin / Heidelberg
Abstract :

In the past years, a large set of new regulatory ncRNAs have been identified, but the number of experimentally verified targets is considerably low. Thus, computational target prediction methods are in high demand. Whereas all previous approaches for predicting a general joint structure have a complexity of O(n^6) running time and O(n^4) space, a more time and space efficient interaction prediction that is able to handle complex joint structures is necessary for genome-wide target prediction problems. In this paper we show how to reduce both the time and space complexity of the RNA-RNA interaction prediction problem as described by Alkan et al. [1] via dynamic programming sparsification, which allows large portions of DP tables to be discarded without losing optimality. Applying sparsification techniques reduces the complexity of the original algorithm from O(n^6) time and O(n^4) space to O(n^4 ψ(n)) time and O(n^2 ψ(n) + n^3) space for some function ψ(n), which turns out to have small values for the range of n that we encounter in practice. Under the assumption that the polymer-zeta property holds for RNA-structures, we demonstrate that ψ(n) = O(n) on average, resulting in a linear time and space complexity improvement over the original algorithm. We evaluate our sparsified algorithm for RNA-RNA interaction prediction by total free energy minimization, based on the energy model of Chitsaz et al. [2], on a set of known interactions. Our results confirm the significant reduction of time and space requirements in practice.

Computational Biology CB_Q0005


Title : An agent-based domain specific framework for rapid prototyping of applications in evolutionary biology
Author : Tran Cao Son, Enrico Pontelli, Desh Ranjan, Brook Milligan and Gopal Gupta
Year Publish : 2004
Place of Publish : Springer Berlin / Heidelberg
Abstract :

In this paper we present a brief overview of the ΦLOG project, aimed at the development of a domain specific framework for the rapid prototyping of applications in evolutionary biology. This includes the development of a domain specific language, called ΦLOG, and an agent-based implementation for the monitoring and execution of ΦLOG’s programs. A ΦLOG program – representing an intended application from an evolutionary biologist – is a specification of what to do to achieve her/his goal. The execution and monitoring component of our system will automatically figure out how to do it. We achieve that by viewing the available bioinformatic tools and data repositories as web services and casting the problem of execution of a sequence of bioinformatic services (possibly with loops, branches, and conditionals, specified by biologists) as the web services composition problem.

Computational Biology CB_Q0004


Title : Regularization and noise injection for improving genetic network models
Author : Eugene van Someren, Lodewyk Wessels, Marcel Reinders and Eric Backer
Year Publish : 2006
Place of Publish : Springer Berlin / Heidelberg
Abstract :

Computational Biology CB_Q0003


Title : Discovering relationships among dispersed repeats using spatial association rule mining
Author : Surya Saha, Susan Bridges, Zenaida Magbanua and Daniel G Peterson
Year Publish : 2008
Place of Publish : BioMed Central Ltd
Abstract :

Background
DNA in eukaryotic genomes is characterized, and often dominated, by repetitive, non-genic DNA sequences. Initially thought to be non-functional, repeats have been found to influence gene expression [1] and provide diversity to the genome via mutation. Mobile repeat sequences [2] (transposons) have played a prominent role in the evolutionary histories of eukaryotic genomes [3,4], and their persistence in eukaryotic DNA indicates that they have, on the whole, been evolutionarily advantageous. While there are an increasing number of algorithms that have been developed for discovering novel dispersed repeats [5-7], significant analysis of the repeats and their relationships to other genome features will be required before we can truly understand the complex ways in which dispersed repeat sequences contribute to evolutionary fitness. We propose a spatial proximity rule based data mining technique to discover highly fragmented repeat regions for which only the conserved parts are reported by a computational repeat finder.


Computational Biology CB_Q0002


Title : The protein sequence design problem in canonical model on 2D and 3D lattices
Author : Piotr Berman, Bhaskar DasGupta, Dhruv Mubayi, Robert Sloan, György Turán and Yi Zhang
Year Publish : 2004
Place of Publish : Springer Berlin / Heidelberg
Abstract :

In this paper we investigate the protein sequence design (PSD) problem (also known as the inverse protein folding problem) under the Canonical model on 2D and 3D lattices [12,25]. The Canonical model is specified by (i) a geometric representation of a target protein structure with amino acid residues via its contact graph, (ii) a binary folding code in which the amino acids are classified as hydrophobic (H) or polar (P), (iii) an energy function Φ defined in terms of the target structure that should favor sequences with a dense hydrophobic core and penalize those with many solvent-exposed hydrophobic residues (in the Canonical model, the energy function Φ gives an H-H residue contact in the contact graph a value of –1 and all other contacts a value of 0), and (iv) to prevent the solution from being a biologically meaningless all H sequence, the number of H residues in the sequence S is limited by fixing an upper bound λ on the ratio between H and P amino acids. The sequence S is designed by specifying which residues are H and which ones are P in a way that realizes the global minima of the energy function Φ. In this paper, we prove the following results:
(1) An earlier proof of NP-completeness of finding the global energy minima for the PSD problem on 3D lattices in [12] was based on the NP-completeness of the same problem on 2D lattices. However, the reduction was not correct, and we show that the problem of finding the global energy minima for the PSD problem for 2D lattices can be solved efficiently in polynomial time. However, we show that the problem of finding the global energy minima for the PSD problem on 3D lattices is indeed NP-complete by providing a different reduction from the problem of finding the largest clique on graphs.
(2) Even though the problem of finding the global energy minima on 3D lattices is NP-complete, we show that an arbitrarily close approximation to the global energy minima can indeed be found efficiently by taking appropriate combinations of optimal global energy minima of substrings of the sequence S by providing a polynomial-time approximation scheme (PTAS). Our algorithmic technique to design such a PTAS for finding the global energy minima involves using the shifted slice-and-dice approach in [6,17,18]. This result improves the previous best polynomial-time approximation algorithm for finding the global energy minima in [12] with a performance ratio of 1/2.
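
The energy function of the Canonical model described above is simple enough to state as code. The sketch below is an illustration under assumptions, not the paper's algorithm: the contact graph is passed as a plain list of index pairs, the function names are hypothetical, and the search is an exponential brute force that is only usable on tiny examples (the paper's polynomial-time 2D algorithm and 3D PTAS are not reproduced here). The ratio constraint is checked as h <= max_ratio * p, which is one possible reading of "an upper bound on the ratio between H and P amino acids".

```python
from itertools import product


def canonical_energy(contacts, sequence):
    """Energy of an H/P sequence: -1 per H-H contact, 0 for every other contact."""
    return -sum(1 for u, v in contacts if sequence[u] == "H" and sequence[v] == "H")


def brute_force_design(n, contacts, max_ratio):
    """Try every H/P sequence of length n and keep a minimum-energy one.

    Only sequences whose H count h and P count p satisfy h <= max_ratio * p
    are considered. Exponential in n; for illustration only.
    """
    best_seq, best_energy = None, None
    for letters in product("HP", repeat=n):
        h = letters.count("H")
        p = n - h
        if h > max_ratio * p:
            continue  # too many hydrophobic residues for the allowed ratio
        energy = canonical_energy(contacts, letters)
        if best_energy is None or energy < best_energy:
            best_seq, best_energy = "".join(letters), energy
    return best_seq, best_energy
```

On a hypothetical four-residue contact graph forming a cycle, [(0, 1), (1, 2), (2, 3), (3, 0)], the all-H sequence would score -4 but is excluded when max_ratio is 1.0; the best admissible design places two H residues in contact and scores -1.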

Computational Biology CB_Q0001


Title : Overview of the entity relations (REL) supporting task of BioNLP Shared Task 2011
Author : Sampo Pyysalo, Tomoko Ohta and Jun’ichi Tsujii
Year Publish : 2011
Place of Publish : Omnipress, Inc
Abstract :

This paper presents the Entity Relations (REL) task, a supporting task of the BioNLP Shared Task 2011. The task concerns the extraction of two types of part-of relations between a gene/protein and an associated entity. Four teams submitted final results for the REL task, with the highest-performing system achieving 57.7% F-score. While experiments suggest use of the data can help improve event extraction performance, the task data has so far received only limited use in support of event extraction. The REL task continues as an open challenge, with all resources available from the shared task website.

Computational Biology CB_J0010


Title : Data mining the yeast genome in a lazy functional language
Author : Amanda Clare and Ross D. King
Year Publish : 2003
Place of Publish : Springer Berlin / Heidelberg
Abstract :

Critics of lazy functional languages contend that the languages are only suitable for toy problems and are not used for real systems. We present an application (PolyFARM) for distributed data mining in relational bioinformatics data, written in the lazy functional language Haskell. We describe the problem we wished to solve, the reasons we chose Haskell and relate our experiences. Laziness did cause many problems in controlling heap space usage, but these were solved by a variety of methods. The many advantages of writing software in Haskell outweighed these problems. These included clear expression of algorithms, good support for data structures, abstraction, modularity and generalisation leading to fast prototyping and code reuse, parsing tools, profiling tools, language features such as strong typing and referential transparency, and the support of an enthusiastic Haskell community. PolyFARM is currently in use mining data from the Saccharomyces cerevisiae genome and is freely available for non-commercial use at http://www.aber.ac.uk/compsci/Research/bio/dss/polyfarm/.


Computational Biology CB_J0009


Title : Agent-based modeling of ductal carcinoma in situ: application to patient-specific breast cancer modeling
Author : Paul Macklin, Jahun Kim, Giovanna Tomaiuolo, Mary E. Edgerton and Vittorio Cristini
Year Publish : 2010
Place of Publish : Springer Berlin / Heidelberg
Abstract :

Ductal carcinoma in situ (DCIS) of the breast is the most common precursor to invasive carcinoma (IC), the second-leading cause of death in women in the USA. There has been great progress in modeling DCIS at both the cellular scale (e.g., using cellular automata and agent-based models) and the population scale (e.g., using partial differential equations or systems of ordinary differential equations), but these past efforts have been difficult to calibrate with patient-specific molecular and cellular measurements. We develop a biophysically justified, agent-based cellular model of DCIS that is well-suited to patient-specific calibration. The model is modular in nature and can thus be readily extended to incorporate more advanced biology. We give an example of recently developed, patient-specific calibration of the model and conduct parameter studies that generate testable biological hypotheses.