BioXSD: the common exchange format of everyday bioinformatics data

Contact: support at bioxsd.org


BioXSD defines exchange formats of everyday bioinformatics data types. BioXSD aims to serve as the common, canonical data model for bioinformatics Web services.

Everyday bioinformatics: lightweight to "medium-heavy-weight" data models of sequences, sequence annotations, alignments, and references to resources.
Specialised, "heavy-weight" standard XML data models such as SBML, MAGE-ML, GCDML, or PDBML should be used whenever applicable.

Interoperable Web services: BioXSD enables deployment of globally and smoothly interoperable bioinformatics tools on the World Wide Web of Services.
BioXSD supports WS-I compliant services and interoperates with ordinary SOAP libraries for common programming languages. No infrastructure other than the WWW and SOAP is necessary.

BioXSD is an initiative of the scientific community, originating from the EMBRACE project partners.

Developed by analysing existing requirements, tools, Web services, data formats, and ontologies
Feasibility tested at different pilot providers, using diverse libraries and programming languages

BioXSD data-type definitions are annotated with the EDAM (EMBRACE Data And Methods) ontology. BioXSD thus offers ready-made building blocks for Web-service interfaces with a globally defined, controlled meaning.

BioXSD types can be used directly where applicable, or they can be included in other standard or custom types, extended, or restricted. For services that use other or proprietary formats, BioXSD can serve as the canonical intermediate exchange format.

Poster

Abstract

Open collaboration within the community: BioXSD welcomes feature requests and new collaborations. We offer full user support and consulting!


BioXSD 1.0 is available at http://bioxsd.org/BioXSD-1.0.xsd. Services and custom Schemas should reference this location, or keep their own copies synchronised with it.
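
As a minimal sketch of checking that a custom Schema references the location above, the following Python snippet lists the `schemaLocation` values of all `xs:import` and `xs:include` elements. The helper name `schema_locations`, the sample Schema, and the namespace URI used in it are assumptions made for illustration; they are not taken from BioXSD itself.

```python
import xml.etree.ElementTree as ET

# Namespace of the W3C XML Schema language itself.
XS = "http://www.w3.org/2001/XMLSchema"

# Location of BioXSD 1.0 as stated on this page.
BIOXSD_LOCATION = "http://bioxsd.org/BioXSD-1.0.xsd"


def schema_locations(xsd_text):
    """Return the schemaLocation values of all xs:import and xs:include elements."""
    root = ET.fromstring(xsd_text)
    locations = []
    for tag in ("import", "include"):
        for element in root.iter("{%s}%s" % (XS, tag)):
            location = element.get("schemaLocation")
            if location is not None:
                locations.append(location)
    return locations


# Hypothetical custom Schema; the namespace URI is an assumed placeholder,
# not the real BioXSD target namespace.
CUSTOM_SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:import namespace="http://example.org/assumed-bioxsd-namespace"
             schemaLocation="http://bioxsd.org/BioXSD-1.0.xsd"/>
</xs:schema>
"""

if __name__ == "__main__":
    # Reports whether the custom Schema points at the published location.
    print(BIOXSD_LOCATION in schema_locations(CUSTOM_SCHEMA))
```

A check like this could run before deployment to catch Schemas that still point at a stale or local copy.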

(For service providers using Python: because the Python Zolera SOAP Infrastructure (ZSI) library lacks some important basic features, a special version of the Schema for generating ZSI code is available at http://bioxsd.org/BioXSD-1.0.zsi.Workaround.xsd. Do not forget to get the ZSI patch for an empty-complexType bug. This xsd is "SOAP-compatible" with the normal xsd: services in Python should be developed using the zsi.Workaround xsd, but the WSDLs of the deployed services should then be exposed with the normal xsd.)
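
The final step described above, exposing the WSDL with the normal xsd after developing against the zsi.Workaround xsd, could be sketched as a plain text substitution. This is only an illustration under stated assumptions: the function name `expose_with_normal_xsd` and the sample WSDL fragment are hypothetical, and a real deployment may handle its WSDL differently.

```python
# Both locations are the ones given on this page.
NORMAL_XSD = "http://bioxsd.org/BioXSD-1.0.xsd"
WORKAROUND_XSD = "http://bioxsd.org/BioXSD-1.0.zsi.Workaround.xsd"


def expose_with_normal_xsd(wsdl_text):
    """Return the WSDL text with references to the workaround xsd replaced by the normal xsd."""
    return wsdl_text.replace(WORKAROUND_XSD, NORMAL_XSD)


# Hypothetical fragment of a WSDL used during development with ZSI.
DEVELOPMENT_WSDL = (
    '<xs:import schemaLocation="http://bioxsd.org/BioXSD-1.0.zsi.Workaround.xsd"/>'
)

if __name__ == "__main__":
    print(expose_with_normal_xsd(DEVELOPMENT_WSDL))
```

Applying the function to a WSDL that already references the normal xsd leaves it unchanged, so it is safe to run more than once.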

Technical documentation of the BioXSD Schema is available at /technicalDocumentation/BioXSD-1.0


Existing services. Thanks to the providers of bioinformatics Web services who have started adopting BioXSD as pilot users, the number of BioXSD-compatible services is rising. The currently adapted Web services are:
(If you have started using BioXSD for your services or other tools, please let us know by sending an email to support at bioxsd.org)


Example workflows (analysis pipelines) combine multiple bioinformatics Web services using BioXSD.

The workflows show that these services are smoothly interoperable.


Examples of data represented in the BioXSD 1.0 format. The file contains examples of sequence records, annotated sequences, and multiple-sequence alignments.

Examples of data represented in beta BioXSD 0.4 format are here. (And examples in BioXSD 0.3 here)


Ongoing development:
improvements in the automatically generated comprehensive documentation (including ontology resolving)
additional example/test-case workflows, including multiple programming/workflow languages and SOAP/XML libraries
addition of some newly desired data types, and enhancement of the existing enumerations and simple-type patterns
additions of metadata (in particular for features, species, data resources, scores, etc.; mainly semantic annotations, but also other metadata)
improvements and refinements of the data examples
tools to translate between BioXSD and other common community formats (XML, textual, and tabular)
Volunteers are welcome! (developers at bioxsd.org)


Latest update: 2010-05-23