An exchange format for sequences, features, alignments, references
Suitable for Web services and programmatic libraries
Enhanced and standardised provenance metadata, and a new minor release in progress
We are now improving the provenance metadata in BioXSD, by implementing a rich but simple model that will be aligned with the W3C PROV standard. Thanks go also to the OpenBio Codefest for great input and discussions.
A new minor release of BioXSD 1.1, fully backwards compatible with all valid BioXSD 1.1 data, will include the provenance enhancements and standardisation, and a few further improvements. It will be version 1.1.3.
For urgently desired additions, please contact firstname.lastname@example.org as soon as possible.
BioXSD version 1.1, release 1.1.2
A minor release of BioXSD 1.1, version 1.1.2, was released on 13 May 2013.
As every minor release of BioXSD is compatible with all previous minor releases within the same major release, BioXSD 1.1.2 is backwards compatible with BioXSD 1.1 data (i.e. it is less restrictive).
For the list of changes, please refer to the CHANGELOG in the BioXSD-1.1.xsd file.
The 50 BioXSD-compatible Web services challenge
Volunteers are still warmly welcome to join the 50 BioXSD-compatible Web services challenge!
The aim is to provide compatible Web services and other tools that adopt BioXSD as one of their input/output formats.
Implementations are already under way at CBS in Greater Copenhagen (Denmark), Rostlab in Greater Munich (Germany), IBCP in Lyon (France), CBU in Bergen (Norway), and at a few other sites.
For more details, help, and consultation, please contact email@example.com.
BioXSD defines exchange formats for basic types of bioinformatics data. BioXSD aims to serve as the common, canonical XML format for basic bioinformatics data.
A canonical data format does not mean "the only format", but an exchange format that can be common to several tools (as one of the multiple formats the tools support). Tools can produce and consume BioXSD directly, or BioXSD can be used as an intermediate canonical format rich enough to enable conversions among diverse formats. Using a common exchange format enables smooth integration of compatible tools into analysis workflows.
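The intermediate-format idea can be illustrated with a minimal sketch. With a canonical model in the middle, each format needs only one reader and one writer, instead of a dedicated converter for every pair of formats. Everything named here is an assumption for illustration: `SequenceRecord` is a plain in-memory stand-in, not the BioXSD data model, and the two formats (FASTA and a hypothetical two-column tab-separated layout) were chosen only because they are simple.

```python
from dataclasses import dataclass


@dataclass
class SequenceRecord:
    """Hypothetical canonical record; a stand-in, not the BioXSD model."""
    name: str
    sequence: str


def read_fasta(text):
    """Parse FASTA text into canonical records."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A header line starts a new record.
            records.append(SequenceRecord(name=line[1:].strip(), sequence=""))
        elif records:
            # Sequence lines are appended to the most recent record.
            records[-1].sequence += line
    return records


def read_tsv(text):
    """Parse 'name<TAB>sequence' lines into canonical records."""
    records = []
    for line in text.splitlines():
        if line.strip():
            name, sequence = line.split("\t")
            records.append(SequenceRecord(name.strip(), sequence.strip()))
    return records


def write_fasta(records):
    return "".join(f">{r.name}\n{r.sequence}\n" for r in records)


def write_tsv(records):
    return "".join(f"{r.name}\t{r.sequence}\n" for r in records)


# One reader and one writer per format: 2N functions cover all N*(N-1) conversions.
READERS = {"fasta": read_fasta, "tsv": read_tsv}
WRITERS = {"fasta": write_fasta, "tsv": write_tsv}


def convert(text, source, target):
    """Convert between any two registered formats via the canonical records."""
    return WRITERS[target](READERS[source](text))
```

Adding a third format to this sketch would mean registering one more reader and one more writer; the existing conversions stay untouched.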
BioXSD is a rich but not too complicated XML-Schema-based exchange format for sequences, alignments, feature records, and references to external resources. Specialised standard XML formats such as SBML, MAGE-ML, GCDML, PDBML, PSI MI MIF, PhyloXML or NeXML are orthogonal efforts and should be used where applicable. BioXSD, however, aims to fill the gap between these specialised XML formats.
BioXSD enables deployment of globally and smoothly interoperable bioinformatics tools on the World Wide Web of Services. BioXSD supports WS-I compliant Web services and interoperates with ordinary SOAP and XML libraries for common programming languages, and naturally also with the REST architecture. No infrastructure other than standard HTTP, XML, and, optionally, SOAP is necessary for using BioXSD-compatible Web services.
BioXSD is an initiative coming from the scientific community: from the EMBRACE project partners.
The EMBRACE standards:
BioXSD data-type definitions are annotated with the EDAM (EMBRACE Data And Methods) ontology and with the main Semantic Web vocabularies. BioXSD thus offers ready-made building blocks for Web-service interfaces with a globally defined, controlled meaning (semantics).
BioXSD has been developed by analysing existing requirements, tools, Web services, data formats, and ontologies. Feasibility was tested at different pilot providers, using diverse libraries and programming languages.
BioXSD types can be used directly where applicable, or they can be included in other standard or custom types, extended, or restricted. With services that use other or proprietary formats, BioXSD can be used as the canonical intermediate exchange format.
Open collaboration within the community: BioXSD welcomes feature requests and new collaborations!
To submit your requirements, please write to firstname.lastname@example.org. A request-tracking system will be available in the future.
Please reference this publication if you use BioXSD:
Kalaš, M., Puntervoll, P., Joseph, A.,
Töpfer, A., Venkataraman, P., Pettifer, S., Bryne, J.C.,
Ison, J., Blanchet, C., Rapacki, K. and Jonassen, I.
BioXSD: the common data-exchange format for everyday bioinformatics web services.
Bioinformatics, 26, i540-i546.
doi: 10.1093/bioinformatics/btq391 PMID: 20823319
If you make use of the optimised sequence/genome feature representation, please reference also:
Gundersen, S., Kalaš, M., Abul, O.,
Frigessi, A., Hovig, E. and Sandve, G.K.
Identifying elemental genomic track types and representing them uniformly.
BMC Bioinformatics, 12, 494.
doi: 10.1186/1471-2105-12-494 PMID: 22208806
BioXSD 1.1 is available at http://bioxsd.org/BioXSD-1.1.xsd. This stable version is available for implementations and open for additions and further requirements. Suggestions for changes are welcome and may be reflected in future versions. This is the canonical Schema location to be imported in document XSDs (such as in Web services' WSDLs) or to be synchronised with.
(There is no 'worked-around' version available yet for Web-service providers using Python for their SOAP stack. It may, however, become available soon. Please contact email@example.com with urgent inquiries.)
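As a rough illustration of importing the canonical Schema location into a document XSD, the following sketch builds a skeleton schema with an `xs:import` element using Python's standard library. The schema location is the one given above. The BioXSD namespace URI and the example target namespace are assumptions; the actual `targetNamespace` should be checked in the XSD file itself.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Canonical Schema location, as stated in the text.
BIOXSD_LOCATION = "http://bioxsd.org/BioXSD-1.1.xsd"
# Assumption: verify against the targetNamespace declared in the XSD.
BIOXSD_NAMESPACE = "http://bioxsd.org/BioXSD-1.1"


def build_importing_schema(target_namespace):
    """Build a skeleton document schema that imports BioXSD."""
    ET.register_namespace("xs", XS)
    schema = ET.Element(f"{{{XS}}}schema", {"targetNamespace": target_namespace})
    ET.SubElement(
        schema,
        f"{{{XS}}}import",
        {"namespace": BIOXSD_NAMESPACE, "schemaLocation": BIOXSD_LOCATION},
    )
    return schema


# Hypothetical target namespace of a service's own schema.
schema = build_importing_schema("http://example.org/my-service")
print(ET.tostring(schema, encoding="unicode"))
```

Element declarations for the service's own messages would then be added to this skeleton and would refer to the imported BioXSD types.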
BioXSD 1.0 is available at http://bioxsd.org/BioXSD-1.0.xsd. This is the canonical Schema location to be imported in document XSDs (such as in Web services' WSDLs) or to be synchronised with.
(For Web-service providers using Python for their SOAP stack: Because some important basic features are missing in the Python Zolera Soap Infrastructure (ZSI) library, a special version for generating ZSI code is at http://bioxsd.org/BioXSD-1.0.zsi.Workaround.xsd. Do not forget to get the ZSI patch for the empty-complexType bug. This xsd is "SOAP-compatible" with the normal xsd. This means that the services in Python should be generated from WSDLs importing the BioXSD-x.x.zsi.Workaround.xsd Schema, but the WSDLs of the deployed services should then import the normal BioXSD-x.x.xsd Schema.)
BioXSD Schemas are available under the Creative Commons BY-ND 3.0 license, with inclusion, extensions and restrictions additionally allowed in the user's own XML namespace. Contributions to new canonical versions, in the bioxsd.org XML namespace, are welcome under the supervision of the BioXSD consortium (in order to keep BioXSD a common, canonical data model).
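The switch between the two Schema locations described above could be scripted along these lines. This is only a sketch under assumptions: the helper name `swap_schema_location` and the sample import line are hypothetical, and the function does a textual rewrite of `schemaLocation` attributes rather than parsing the WSDL, so that prefixes and everything else in the file are left untouched.

```python
import re

NORMAL = "http://bioxsd.org/BioXSD-1.0.xsd"
WORKAROUND = "http://bioxsd.org/BioXSD-1.0.zsi.Workaround.xsd"


def swap_schema_location(wsdl_text, old, new):
    """Repoint every schemaLocation attribute equal to `old` at `new`."""
    pattern = re.compile(
        r'(schemaLocation\s*=\s*)(["\'])' + re.escape(old) + r'\2'
    )
    # Keep the original attribute spelling and quote character.
    return pattern.sub(
        lambda m: f"{m.group(1)}{m.group(2)}{new}{m.group(2)}", wsdl_text
    )


# Hypothetical import line from a WSDL used for generating ZSI code.
generation_wsdl = (
    '<xs:import namespace="urn:example" '
    'schemaLocation="http://bioxsd.org/BioXSD-1.0.zsi.Workaround.xsd"/>'
)

# The deployed WSDL should import the normal Schema instead.
deployed_wsdl = swap_schema_location(generation_wsdl, WORKAROUND, NORMAL)
```

Calling the function with the arguments reversed would produce the code-generation variant from a deployed WSDL.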
For release information including CHANGELOGs, please refer to the bottom of the XSD files.
A concise overview guide to the BioXSD format is at ./QuickReference.
Full technical reference of BioXSD version 1.1 is at ./technicalDocumentation/BioXSD-1.1.
Documentation of BioXSD version 1.0 is available at ./technicalDocumentation/BioXSD-1.0.
Examples of feature data represented in BioXSD 1.1 format.
Examples of diverse types of bioinformatics data represented in BioXSD 1.0 format are available here. This example file contains examples of sequence records, annotated sequences, and multiple-sequence alignments.
Example workflows (analysis pipelines) combine multiple bioinformatics Web services using BioXSD.
Workflows show that such services are smoothly compatible.
Thanks to the providers of bioinformatics Web services who started adopting BioXSD as pilot users, the number of services compatible with BioXSD is rising. Currently, the adapted Web services are:
If you have started using BioXSD for your services, libraries, or other tools, please let us know by sending an email to firstname.lastname@example.org. For maintenance and support purposes we need to know about the providers using BioXSD. A registration system will be available in the future.
Contributions from the community are warmly welcome and needed! (email@example.com)
Volunteers are especially welcome to join the 50 BioXSD-compatible Web services challenge.
BioXSD has been and is further being developed as a part of multiple collaborative projects. There has never been any funding directed exclusively to BioXSD.
CBU, University of Bergen, Norway:
Matúš Kalaš, Inge Jonassen, Pål Puntervoll (until 2013; now Uni Miljø, Bergen), Jan Christian Bryne (until 2010; now Oslo University Hospital), Armin Töpfer (until 2011; also CeBiTec, Bielefeld, Germany; now D-BSSE, ETH Zürich, Basel, Switzerland), Prabu Venkataraman (until 2012; now Fiskeridirektoratet, Bergen)
CBS, DTU, Greater Copenhagen, Denmark: Edita Karosiene, Kristoffer Rapacki
Oslo University Hospital, Norway: Sveinung Gundersen
IBCP, CNRS, Lyon, France: Christophe Blanchet, Alexandre Joseph (until 2010; now EFREI, Paris)
EBI, EMBL, Hinxton, U.K.: Jon Ison
Rostlab, TUM, Greater Munich, Germany: László Kaján
... and multiple supporters at diverse research institutions
Research Council of Norway (to eSysbio, to the FUGE Bioinformatics Platform, and ELIXIR.NO to the Norwegian Bioinformatics Platform)
Villum Foundation (to the Center for Disease Systems Biology)
l'Agence Nationale de la Recherche (to HIPCAL)
Alexander von Humboldt Foundation (through the German Ministry for Research and Education)
European Commission FP6 and FP7 (to EMBRACE and ELIXIR)
Last update: 2013-September-09