A proposal for a ‘Data Working Group’ as part of the ‘Data Web Forum’ (now called the RD-Alliance)

Correction: the “Data Web Forum” will now be called the “Research Data Alliance” or “RD-Alliance” (not to be confused with “Research Data Australia” or “Resource Description & Access”).  Three RDAs all in the same sector 😉

So what is the ‘Data Web Forum’?  It is a W3C-like governance group that wants to get IETF-like consensus on data “standards” (or in fluffier terms “rough consensus on data bridges”).

Minus the never-ending debates that will naturally occur around the governance, I think this is a good idea. The IETF is too busy with the lower end of the stack (e.g. packets, IP, VoIP and other engineering stuff), and the W3C is too busy with the code (e.g. HTML, CSS, XML, RDF).  So why not have a group that gets the good and the great of the data world together to talk about ‘data standards’?  Enter the Data Web Forum.  I’m not saying that this forum won’t eventually need to be more closely tied to the W3C or IETF (or Apache), but for the time being I think it is a way to start getting people *internationally* bootstrapped into talking about how we move data around so others can reuse it in a wide range of disciplines (the international cross-disciplinary problem being the hard nut to crack).

Accordingly, I’d like to put forward the first proposal to the DWF for one of their ‘Data Working Groups’.  Or rather, in the style of the IETF, I’d like to make a plea to my fellow developers and tech-savvy friends to join a Working Group on the topic of ‘Data Transport Protocols’.

Why am I proposing this working group?  I feel that the academic world has finally moved on (or at least is starting to, especially here in Australia), and that data has become a first-class research object: as important as, if not more important than, the research article itself (my words, not necessarily my employer’s).  Therefore, we need to start having discussions on how we can better enable researchers not only to publish their data on the Web, but also to have it in a reference format, so that any other researcher (from any other discipline) can come along, grab a link and import that data into their favourite discipline-specific tool as easily as they would import a news feed via RSS/Atom or the like.

Once we have the capability for researchers to:

  • a.) read research publications,
  • b.) be interested in the data that the research publication is talking about,
  • c.) have a visualisation or other means of easily looking at an overview of the data, and then
  • d.) import that data into their own preferred discipline-specific research tool for reuse; only then will we finally be in a world where research is actually ‘of the web’ (and not just ‘on the Web’).

Further information to come on the specifics of the proposal, but I believe there is a group of people who have previously participated in the likes of…

  • JISC’s SWORD deposit protocol,
  • developers from ANDS’ Data Capture projects, and
  • vendors like Microsoft Academic Research

…who have a lot to talk about when it comes to data transfer protocols.  More coming soon.

Watch this space and please ping me via Twitter @dfflanders if interested (or leave comments below).

~ by dfflanders on July 25, 2012.

8 Responses to “A proposal for a ‘Data Working Group’ as part of the ‘Data Web Forum’ (now called the RD-Alliance)”

  1. I think what you describe is already happening, but not in a new forum. There’s a little (not much) in the IETF, a fair amount in the W3C, and over the years a lot of spoliation standards have been developed in OASIS. I have to ask what yet another standards/consensus-forming body is likely to achieve, and how would it be funded?

    You also suggest this is an extension of the academic community. My sense is that academics are not, on average, the best people to do standardization. Academics are trained to present ideas that are novel, and push human understanding. A good standard is clear documentation of what might be considered the “bleedin’ obvious”; novelty in a standard is a disadvantage when it comes to adoption. This is, of course, a sweeping generalization, but to be effective, standardization activity has to be driven by a cross section of stakeholders, especially including those with a product/operations perspective as well as a technology perspective.

    Finally, I note that the IETF/W3C style of specification model is somewhat being questioned today (e.g. HTML5). As the community of use (of internet/web and related tech) becomes more mainstream, more diverse, is it reasonable to expect that a single group, however smart, will come up with specs that work for a significant

    • (crap Android interface… Hit post before i was done)

      Wanted to say that with ever more widespread use, is the “one specification body to rule them all” model really a workable model?

  2. Firstly, can you please change “Correction: the “Data Web Forum” will not be called the “Research Data Alliance”” to “Correction: the “Data Web Forum” will now be called the “Research Data Alliance””.

    As to the more substantive second comment, the intention of the RD-Alliance is to foster WGs on topics where the members care, and to provide a form of imprimatur on the results. It’s not intended as a unitary standards body.

  3. One of the issues is the fact that some (not all) research data is collected with the goal of getting the summaries for the paper (based on anecdotes from researchers).

    This sounds a bit like some of the crappy code I write in PHP for websites I build then forget about.

    At work, we’ve just introduced the concept of code peer-review: at least one colleague must sanity check production code, for all the little wins that brings (security, readability, “Dude, didn’t you know that all that could be done with library XYZ?”). But just the idea that other people will read and judge your code makes you write better code.

    In research, I think the next problem we’ll encounter is that it’s really hard to work with other people’s data, even if it’s “open”. Look at the kakky spreadsheets on data.gov.uk: these are not really “data” in a way a computer programmer understands, because they don’t allow immediate ingest and use. And this carries over into research data. Obviously some types of data are very bespoke, but there have got to be areas where just getting the community to agree to use a standard spreadsheet template would change everything.

    If, right from the start, people knew the peer review system would include their data as well as the final paper, then they would get into better (and more interoperable) habits with less trauma than they think. It’s the difference between indenting and commenting your software as you go, versus doing it as the last job before publication.

  4. Don’t create another working group or standards body.

    Build a community of people doing interesting things with moving research data using whatever specs and systems already exist. Run some hack days, and share what you learn with the world.

    Standards will follow.

    Governance is always harder than you think.

    And you’re really good at making hack days happen 🙂

  5. First data working group proposed by ‘Research Data Alliance’: http://forum.rd-alliance.org/viewtopic.php?f=3&t=31

  6. When you’re ready to turn in your Internet-Draft, you submit it to http://datatracker.ietf.org/submit . The instructions on that web page will walk you through the needed steps, and there is also an email address there in case you need personalized help.
