Open source software creeps into health care through clinical research

by Andrew Oram

This was originally published on O’Reilly Media’s Strata blog, June 25, 2013.

Although open source has not conquered the lucrative market for electronic health records (EHRs) used by hospital systems and increasingly by doctors, it is making strides in many other important areas of health care. One example is clinical research, as evidenced by OpenClinica in the field of Electronic Data Capture (EDC) and LabKey for data integration. Last week I attended a conference for people who use OpenClinica in their research or want to make their software work with it.

At any one time, hundreds of thousands of clinical trials are going on around the world, many listed on an FDA site. Many are low-budget and would be reduced to using Excel spreadsheets to store data if they didn’t have the Community edition of OpenClinica. Like most companies with open-source products, OpenClinica uses the "open core" model of an open Community edition and proprietary enhancements in an Enterprise edition. There are about 1200 OpenClinica installations around the world, although estimation is always hard to do with open source projects.

What is Electronic Data Capture? As the technologically archaic name indicates, the concept goes back to the 1970s and refers simply to the storage of data about patients and their clinical trials in a database. It has traditionally been useful for reporting results to funders, audit trails, printing in various formats, and similar tasks in data tracking.

Recently, EDC systems have responded to the current enthusiasm for patient engagement, offering portals. (A user experience designer recently mentioned patient engagement as a key task of clinical trial design.) Someday they may also allow patients to input data. For instance, which do you suppose gives clinical researchers more accurate information: allowing patients to select a side effect on their cell phone in order to report it as soon as they experience it, or waiting a month for the patient to turn up for her regular monthly report to the researcher and be asked to report symptoms?

Actually, many researchers prefer the latter. The issue came up even at the OpenClinica conference, along with all the familiar, classic arguments: patient data can’t be trusted, patients jump to conclusions about what causes a symptom, etc. But keynoter Doug Bain suggested that patient data be recorded along with its provenance (how the data was generated or who reported it). Researchers should then investigate the real significance of the report, which is more likely to happen if patients have more opportunities to report it.
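To make the idea of provenance concrete, here is a minimal sketch in Python of a record that keeps who reported a symptom and how it was captured alongside the symptom itself. The class name, field names, and the review rule are all hypothetical illustrations; they are not taken from OpenClinica or from Bain's talk.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Observation:
    """One reported symptom plus its provenance (hypothetical schema)."""
    subject_id: str
    symptom: str
    reported_at: datetime
    reported_by: str  # provenance: who reported it, e.g. "patient" or "clinician"
    method: str       # provenance: how it was captured, e.g. "mobile app"


def needs_review(obs: Observation) -> bool:
    # Assumed policy for illustration: patient-reported entries are kept,
    # but flagged so a researcher can investigate their significance.
    return obs.reported_by == "patient"


observations = [
    Observation("S-001", "nausea",
                datetime(2013, 6, 3, 8, 15, tzinfo=timezone.utc),
                "patient", "mobile app"),
    Observation("S-001", "nausea",
                datetime(2013, 7, 1, 10, 0, tzinfo=timezone.utc),
                "clinician", "clinic interview"),
]

# Nothing is discarded; provenance only decides what gets a second look.
review_queue = [o for o in observations if needs_review(o)]
```

Because the provenance travels with each record, an analyst could later include or exclude patient-reported entries without losing them.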

OpenClinica in resource-poor areas

Although OpenClinica sees plenty of use in well-endowed environments, some of the communities that use it need to shepherd every resource. They may lack Internet connectivity, people educated in system administration, and other infrastructure. The same creativity that goes into an open source community drives these sites to find clever solutions to EDC. Some of the researchers speaking at the conference carry out trials in Africa and rural China.

Andy Lin, of the University of Michigan, explained how his researchers created a bootable USB drive that ran Linux (the Lubuntu distribution) hosting OpenClinica. This allowed data collectors in China, many of whom were unfamiliar with OpenClinica and uncomfortable about loading the software on their computers, to run it on their laptops. Data was persisted so that laptops could be rebooted, and automatic backups were included. When the data collectors got to a place with Internet access, they uploaded the data to the project’s server.

One of the first constraints faced by the Chinese researchers was the need to collect data on paper forms. Not only do some sites lack computers, but some projects have requirements to keep paper records. But re-entering all the data by hand would be slow and error-prone. So they used Optical Mark Recognition (related to Optical Character Recognition) to take data from check-boxes on paper into OpenClinica.
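The core idea behind Optical Mark Recognition can be sketched in a few lines of Python: given a grayscale scan and the known position of each check-box on the form, count how much of each box is dark. This is only a toy illustration under stated assumptions, not the software the researchers used. The function names, the box coordinates, and both thresholds are hypothetical, and a real system would also have to align and deskew the scanned page.

```python
def fill_ratio(image, box, dark_threshold=128):
    """Fraction of pixels inside `box` darker than `dark_threshold`.

    `image` is a list of rows of grayscale values (0 = black, 255 = white).
    `box` is (top, left, bottom, right), with bottom and right exclusive.
    """
    top, left, bottom, right = box
    pixels = [image[r][c]
              for r in range(top, bottom)
              for c in range(left, right)]
    dark = sum(1 for p in pixels if p < dark_threshold)
    return dark / len(pixels)


def read_checkboxes(image, boxes, min_fill=0.4):
    """Map each named check-box to True if enough of it is filled in."""
    return {name: fill_ratio(image, box) >= min_fill
            for name, box in boxes.items()}


# Tiny synthetic "scan": 4 rows by 8 columns, left half dark, right half white.
scan = [[0] * 4 + [255] * 4 for _ in range(4)]

# Hypothetical form layout: where each check-box sits on the page.
boxes = {
    "headache": (0, 0, 4, 4),  # falls on the dark half: treated as checked
    "nausea": (0, 4, 4, 8),    # falls on the white half: treated as unchecked
}

answers = read_checkboxes(scan, boxes)
```

The resulting dictionary of field names and true/false values is the kind of structured output that could then be loaded into an EDC system instead of being retyped by hand.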

Finally, to provide information to research subjects, audio files were included. Many of the researchers were non-native Chinese speakers, and many of the research subjects were uneducated and had very limited mastery of Mandarin. In addition to bridging the language gap, the audio files ensured that information was presented in a uniform way to all subjects.

I should note that OpenClinica itself provides text-to-speech capabilities in its patient portal now.

Combining results of clinical trials

As I explained in another article, combining data from different experiments is well-nigh impossible because each experiment has its own parameters. OpenClinica does not help with this task, even if you use the Datamart. Formedix, one company that drew a roomful of attendees, offers one solution that claims "70% re-use of content across clinical trials." While not open source, they base their forms on the standard Operational Data Model put out by the Clinical Data Interchange Standards Consortium (CDISC).
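One way to read a re-use figure like this is as the share of a new trial's form items that already exist in a library of standardized definitions. The Python sketch below computes that share for made-up data. It is an assumption about what such a metric could mean, not a description of how Formedix measures it, and the item identifiers are invented for illustration.

```python
def reuse_fraction(library_items, new_study_items):
    """Share of a new study's items already defined in a shared library."""
    if not new_study_items:
        return 0.0
    return len(new_study_items & library_items) / len(new_study_items)


# Hypothetical identifiers for standardized item definitions.
library = {
    "IT.AGE", "IT.SEX", "IT.WEIGHT", "IT.SYSBP",
    "IT.DIABP", "IT.AE_TERM", "IT.AE_START",
}

# A new trial re-uses the seven library items and adds three of its own.
new_study = library | {"IT.TUMOR_SIZE", "IT.BIOPSY_DATE", "IT.ECOG"}

fraction = reuse_fraction(library, new_study)  # 7 of 10 items are re-used
```

The more items two trials draw from the same definitions, the more of their data can be compared directly, which is the practical payoff of basing forms on a common standard.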

Another way forward is LabKey, an open source project that claims to aid in moving data from an EDC system to other areas of clinical research. Both these projects have to deal with the proprietary, incompatible, and poorly structured formats of different EHR systems.
