Starting a heron-i2b2-analytics repository

Steve Simon

2018-03-29

I am working on a CTSA grant to develop repeatable downstream pipelines that directly access i2b2 and CDM. To promote this work and encourage others to participate, I was given a repository on GitHub, kumc-bmi/heron-i2b2-analytics. Right now, it is just a shell, but here’s what I want to do with it in the short term and the long term.

The first thing is to model the directory structure closely on the one recommended in Wilson et al., “Good enough practices in scientific computing.” That means three files in particular.

README.md. This will give an overview of the project. I like the style and format of the README.md files used by the kumc-bmi/grouse as well as the kumc-bmi/bc_qa repositories, and will probably aim for a mix of those two styles. A nice model from an external source is the README.md file used by the parisk/skel repository.

CONTRIBUTING. This file, according to Wilson et al., “describes what people need to do in order to get the project going and use or contribute to it.” A nice model for this file is at the opengovernment/opengovernment repository.

LICENSE. This is already in the repository. We are using the MIT License, which allows others to use anything in the repository free of charge and without any substantial restrictions.

Eventually, I want to add a CITATION file, as well.

Wilson et al. suggest a standardized directory structure, with src for source code and doc for documentation. This repository will probably not need some of the other recommended directories, at least at first.
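
As a rough illustration of the layout described above, here is a minimal sketch in Python that creates the three top-level files and the two directories in an empty folder. The function name scaffold and the idea of creating empty placeholder files are my own assumptions for illustration; nothing like this script exists in the repository yet.

```python
from pathlib import Path

# Names taken from the layout described in this post.
# The files are created empty; their contents are left as placeholders.
FILES = ["README.md", "CONTRIBUTING", "LICENSE"]
DIRS = ["src", "doc"]


def scaffold(root):
    """Create the skeleton under root and return the names newly created.

    Hypothetical helper: existing files and directories are left untouched,
    so running it twice is harmless.
    """
    root = Path(root)
    root.mkdir(parents=True, exist_ok=True)
    created = []
    for name in DIRS:
        path = root / name
        if not path.exists():
            path.mkdir()
            created.append(name)
    for name in FILES:
        path = root / name
        if not path.exists():
            path.touch()
            created.append(name)
    return created
```

Calling scaffold("heron-i2b2-analytics") in a scratch folder would produce the skeleton. Note that git does not track empty directories, so src and doc will only show up in the repository once they contain at least one file.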

The other important thing is to complement rather than duplicate the very good documentation about HERON. This repository covers applications that access the data directly rather than through i2b2.