Key Takeaways
- The open cloud model allows businesses to innovate at the infrastructure level. The Massachusetts Open Cloud enables top universities to partner with industry and build on this shared open cloud.
- Open Cloud is an emerging alternative to expensive closed proprietary clouds like AWS that creates an even playing field for research, innovation and collaboration.
- Medical research is perfectly suited for real-time collaboration via open hybrid cloud architecture. Applying open source scalability and shareability to medicine has the potential to save lives by cutting image processing time down from hours to minutes.
- Even above cost savings, use cases continue to be the driving factor behind infrastructure decision making.
- With an open cloud, not only can you adapt applications to the cloud, but you can adapt the cloud to the application’s needs.
With the seemingly endless popularity of open source software and the often higher operational cost and inflexibility of proprietary cloud contracts, hybrid and open cloud models are emerging as promising, innovative alternatives to giants like Amazon Web Services. Today we dive into a compelling end-to-end use case of how the radiology department at Boston Children’s Hospital is leveraging Red Hat open source, container-based technologies and the Mass Open Cloud to improve patient outcomes with more rapid diagnosis and data processing.
Open public cloud emerges as a viable way to cut costs and innovate at the infrastructure level
“The basic idea was today’s clouds are closed proprietary things controlled by one provider.”
This is how Orran Krieger, PhD, Boston University’s lead on the Mass Open Cloud, succinctly summarized the impetus behind this project in an interview.
Krieger says the vision is that there could be multiple providers standing up different services in a shared cloud to create a level playing field. Bringing together big names in Massachusetts academia like Harvard and MIT, some of the world’s biggest research institutions combined resources to build a 15-megawatt data center, in which they could stand up a shared, open cloud, called the Massachusetts Open Cloud, or Mass Open Cloud or MOC for short.
“The open cloud model is really advantageous for companies to innovate at the infrastructure level, but even to manage it,” says Krieger, who has seen services like those of Boston Children’s Hospital’s radiology programs built on top of it.
“There’s a huge cost in running experiments. When you’re a not-for-profit university, it’s even greater, with a huge capital investment to start anything on a cloud at sufficient scale.”
Krieger contends that it’s always cheaper for universities, at least in the United States, to purchase their own equipment than to rent servers from a commercial cloud provider.
But he said, “In the end the success will only come from something like the open cloud, if industry participates in it.”
Krieger says only an aggregate of all these universities could inject the necessary capital investment to start out a cloud at sufficient scale to allow for experimentation on top of it. Plus, he says that universities have a long history of standing up large-scale computational infrastructure long before today’s public clouds.
Once they dove in, compelling end-to-end use cases drove the project forward, including partnering with Red Hat to adapt the underlying technology and the cloud to the project’s feedback and needs.
End-to-end compelling use case drives project ahead
In the end, cost saving is important, but it’s the stories that drive a tech project forward. Quicker response time for children’s medical imaging is certainly that kind of compelling use case.
Rudolph Pienaar, PhD, staff scientist at Boston Children’s Hospital, instructor of radiology at Harvard Medical School, and technical architect of this project, brought Boston Children’s into a partnership with the Mass Open Cloud. That partnership, built on Red Hat infrastructure, led to an end-to-end use case.
Pienaar explained the problem they were trying to solve:
“How do we make analysis of radiology data very quickly, and how can we do that very simply, with the least amount of activation energy for the person using the system? Since we are architecting an infrastructure where data is analyzed on computers typically on the cloud, we need to hide this complexity as far as possible. A user need only ever choose a dataset and choose an analysis to perform on the data. The details of how the data is packaged, protected, de-identified, how it adheres to compliance standards, how it is sent to the cloud, analyzed, and then ultimately how the results are pulled back and presented to the user should be details that the end user need never have to think about. This of course, is still a complex problem.”
The answer became ChRIS Research Integration Service, a web-based medical imaging platform that combines home-grown software developed at Boston Children’s Hospital (BCH) with Red Hat technologies on the Mass Open Cloud, with which it was “seamlessly integrated.” ChRIS offers a standardized way of deploying imaging applications, which in turn lowers the barrier that currently exists between medical app developers and users like medical technicians, who need quick access to tools but have little technical knowledge. ChRIS runs on the Red Hat OpenShift Kubernetes container platform, so app containers built for ChRIS come stocked with all the required libraries. A user can install an app and start processing medical images in minutes instead of hours, a difference that can sometimes save lives.
Their imaging use case is an intensive computing process that requires access to GPUs (graphics processing units) using the MOC hardware and Red Hat OpenStack.
Pienaar says this meant Red Hat had to “change the cloud to adapt to the application and change the application to adapt to the cloud. I think it’s exciting to see the evolution of the cloud based on the feedback for this project.”
When you are building applications for users who do not have a strong technical background, it’s essential to build in a simple and intuitive way. Pienaar says this is an unusual endeavor for a children’s hospital because, while clinicians are interested in research, “a clinician is not very fluent in how to fire up a Linux terminal and how to analyze data.”
In order to accomplish this, they had to overcome some operational challenges, including varied and scattered access to data that is often complex and old. There’s a huge library of information available since the inception of the almost 150-year-old children’s hospital, but this data is usually scattered within isolated labs. The data needed to be cleaned up and anonymized so it could move across network boundaries and remain compliant. And it needed a user interface that enables a user like a radiologist to have a huge database for analysis.
Pienaar pointed to another problem: their software is frequently written by postdoctoral researchers in computational labs, who usually work on it only until results are gathered and papers are published, so there is little sharing across departments or over long periods of time.
Someone like a clinician who could dramatically benefit from comparing radiological results and especially leveraging the big data behind them usually has little to no access to all this research and images. And, don’t forget that the healthcare industry is one of the most regulated because of its highly sensitive, highly personal data. Part of this project is working on sharing this data in aggregate while specifically hiding those identifying data points. Right now ChRIS is about moving that data securely into the cloud to be collected there and then analyzed and visualized.
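To make the idea of hiding identifying data points concrete, here is a minimal Python sketch of de-identifying a record before it leaves the hospital network. It is an illustration only, not how ChRIS does it: the field names, the `deidentify` helper and the salted-hash pseudonym are all assumptions made for this example, and real compliance work covers far more than dropping a few fields.

```python
import hashlib

# Hypothetical field names chosen for illustration; real imaging metadata
# contains many more attributes that may identify a patient.
IDENTIFYING_FIELDS = {"patient_name", "birth_date", "address", "medical_record_number"}


def deidentify(record: dict, salt: str) -> dict:
    """Return a copy of the record without direct identifiers.

    The record number is replaced by a salted hash, so scans from the same
    patient can still be grouped without revealing who the patient is.
    """
    cleaned = {key: value for key, value in record.items() if key not in IDENTIFYING_FIELDS}
    record_number = str(record.get("medical_record_number", ""))
    digest = hashlib.sha256((salt + record_number).encode("utf-8")).hexdigest()
    cleaned["pseudonym"] = digest[:16]
    return cleaned


record = {
    "patient_name": "Jane Doe",
    "birth_date": "2017-03-01",
    "medical_record_number": "12345",
    "modality": "MR",
    "body_part": "BRAIN",
}
safe = deidentify(record, salt="site-secret")
print(sorted(safe))
```

Only the cleaned copy would be sent on to the cloud; the salt stays inside the hospital so the pseudonym cannot be reversed by simply hashing known record numbers.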
Open source technologies are key to building innovation on the cloud
Pienaar says that it’s very important that it is all open source and, again, not just because of the cost savings. Having been using Linux from the start of the project, he believes they wouldn’t have access to the different development environments and languages they’d want to use if they were tied to a proprietary cloud.
“I very much am inspired by the idea that, with these open source approaches, we can build things that really affect data that has real connections to the world behind it,” Pienaar said.
“Right now if we were trying to collaborate deep down into the Amazon cloud, I would imagine we would have to set up a licensing agreement with Amazon. I wouldn’t be able to download the Amazon Cloud to run up my own environment. And while the full power of ChRIS lies in its connection to the Mass Open Cloud, nothing stops you from downloading and running ChRIS right now on your laptop. The entire ChRIS is available. Your experience is identical — albeit your laptop might not quite muster the grade for heavy computing. Still, you can troubleshoot and develop to your own mini-but-complete ChRIS in totality and then with a click deploy to any number of other ‘ChRISes’ that live out on clouds.”
He says they, of course, also have ongoing projects with big-name proprietary healthcare technology providers, but in these cases the development cycle is an order of magnitude slower because of proprietary technology, with “very isolated sandboxes” and “very constraining boundaries.”
Pienaar gave the example of when he has needed to deploy an app or data on a proprietary cloud environment: “I wouldn’t be able to push it myself — I’d have to get permissions or use proprietary APIs. And once it’s deployed, I may not have access to it. And they may provide services on the cloud that I wouldn’t know what they are.”
He continued with a more specific example of Python libraries having to speak to an Amazon instance in the cloud, using credentials to authenticate.
“There’s always the fear in my head that Amazon will next week roll out another version of the API and change an aspect of the data or poll some stuff about it.” And Pienaar says it goes both ways: “If I don’t like Amazon anymore, I just want to be able to instantiate [to start up] my entire service on local machines.”
He says that, with open source, his team’s only restrictions are their own time and effort.
“With Red Hat’s OpenStack and OpenShift, I can do it anywhere. We can fire up our infrastructure in our own labs, and certain hospitals they may be extremely skeptical to put their data somewhere else, even the Mass Open Cloud, they can use the same experience and all the transparency control in-house. And then why don’t you take the step to consider going out to the Mass Open Cloud too.”
Pienaar contends that a fully open-source stack gives them more power, but with the same level of control.
Hugh Brock, director of Red Hat’s Boston University initiatives, spent the last nine months getting the Red Hat side of things up and running on the Mass Open Cloud. He said they were enthusiastic about this use case because it involves intensive computing, such as rapid image processing, that requires access to GPUs through Red Hat software like the Red Hat OpenStack Platform.
“At Red Hat, we want to show how we can help build a community around these kinds of problems, [so we can] try to make the software more consumable to community developers to try to help them understand how they can contribute,” he said.
Brock is referring to a community of users who work on medical image processing, like brain images, and who can do a little bit of code and want to test their stuff running on the cloud, as opposed to a machine running under their desks. These are also people looking to save money with the Mass Open Cloud and container infrastructure software, and the medical use case compels them to want to make it easier to distribute, something he says Red Hat OpenShift enables.
When asked if the people developing apps on top of the code must be on open source, Brock commented it wasn’t required, but everyone he knows in the neuro-imaging field uses open source anyway to not get caught on what he calls the “licensing hook.”
At Red Hat, “we already had the Mass Open Cloud out there and from its inception it has been running Red Hat OpenStack and Red Hat OpenShift. When we started learning about the ChRIS project, we realized it would be a really cool showcase for containers and OpenShift.”
Where will they take it next?
Brock continued, “We want to figure out not only how we can run the image processing codes faster on our cloud software, but also things like how we do group computation across different sets of medical data.” Examples include computing over encrypted, secure data without having full access to it, and leveraging open imaging data, such as images of metastasizing cancer cells.
“As data gets larger and larger and we generate more and more of it, it is going to become increasingly important for researchers to have access to that data in a safe way that does not violate anyone’s privacy.
“If you have to pay a large cloud for all of the access to the medical data that they have scooped up, then you cut out the possibilities of third parties getting access to data that is speculative that they may not want to be paying for,” Brock said, bringing up another compelling reason to build on a hybrid cloud.
At Red Hat, they are focusing on both open access to data and controlled access to privacy-encumbered data.
ChRIS runs on the Mass Open Cloud with the Red Hat OpenShift Kubernetes distribution. Brock says that ChRIS works on that stack out of the box; however, to make it go faster, they did some additional work to pass GPUs through from the hardware.
“We have one cluster in the Mass Open Cloud where all the machines have a GPU card — then that card can pass from the hardware to the virtual machines. When you run one of these image processing codes, if it’s optimized for GPU processing, it runs much faster — in terms of days to hours.”
Open cloud plus ChRIS leads to aggregate data that can affect policy
Computer science PhD Ata Turk leads the big data and healthcare analytics teams at Mass Open Cloud. His goal for this collaborative project is to make healthcare available to everyone. He said that containers have become a key part of this goal as app developers can use and run multiple containers, organize their inputs and outputs, and make them available to other developers.
“Red Hat OpenShift sits on top of Kubernetes; that lets us do multi-tenancy, communicate with storage solutions, build on an OpenStack. OpenShift nicely packages for DevOps and Kubernetes in terms of managing. Containers allow us to make it available to all app developers and allows them to run on top of real data in the hospital,” Turk said.
Turk offered the example of his wife, Esra Abaci Turk, a research scientist at Boston Children’s who is researching oxygenation in the uterus for women pregnant with twins, an important issue for fetal brain development and growth. She has captured nine example cases at BCH, and there are about six other cases at other Massachusetts hospitals, but there is currently no technical or regulatory way for these entities to share data.
In another potential use case (as referenced in accompanying image) this author’s 15-month-old is an anonymous contributor to the Developing Human Connectome Project (dHCP) and its goal to map the developing brain. So far this has involved an MRI at 32 weeks gestation and another on the second day of his life. If you’ve ever had an MRI, you know how loud it is even with headphones. They were able to silence this mechanical world for a newborn, but that was impossible when they were snapping images of my incredibly active and irritated fetus’s brain. Turk said it can take “hours and hours” to layer those MRI images on top of each other and remove the noise of that movement. ChRIS offers a way to process that all monumentally faster by exploiting parallel computing and leveraging the cloud.
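The speed-up from parallel computing comes from the fact that many pieces of an imaging job are independent of each other. The Python sketch below illustrates that idea only: it combines repeated noisy frames of each slice with a per-pixel median and hands each slice to a separate worker. The function names and the median rule are assumptions made for this example, not the ChRIS pipeline; real motion correction also involves aligning the images, and on a cloud the workers would be separate containers or nodes rather than threads in one process.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import median


def denoise_slice(frames):
    """Combine several noisy frames of the same slice, pixel by pixel.

    Each frame is a flat list of pixel intensities, all of equal length.
    The median discards an occasional spike caused by movement.
    """
    return [median(pixel_values) for pixel_values in zip(*frames)]


def denoise_volume(slices, max_workers=4):
    """Process every slice independently so the work can be spread out.

    Threads stand in here for the separate workers a cluster would provide.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(denoise_slice, slices))


# Two slices, each captured three times; one frame per slice has a motion spike.
volume = [
    [[10, 10, 10], [10, 90, 10], [10, 10, 10]],
    [[20, 20, 20], [20, 20, 20], [80, 20, 20]],
]
print(denoise_volume(volume))  # [[10, 10, 10], [20, 20, 20]]
```

Because no slice depends on another, doubling the number of workers roughly halves the waiting time until the workers outnumber the slices.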
Similarly, aggregating anonymous data around certain themes can save lives, whether it’s discovering patterns in metastasizing tumors or examining how different cases were treated and tracked at the five first-tier Massachusetts trauma centers that responded to the 2013 Boston Marathon bombing.
“These aggregate questions have meaning and values in driving policies and making decisions, and we are building a platform for that.” Turk explained that now “We have this framework that enables people to share data from different entities that don’t want to share the data itself,” in effect, meeting regulatory standards without compromising research.
And this partnership has use cases past healthcare too. Turk offered how companies want to know the discrepancy between male and female salaries, but don’t want to expose the salary itself. Applications built with ChRIS, Red Hat’s stack, and the Mass Open Cloud allowed for aggregation of salary data for more than 500 Boston-based companies. Solutions to close that pay gap now can be discussed since ample, reliable data is now accrued.
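One well-known way to compute an aggregate without any party revealing its own figure is additive secret sharing. The Python sketch below shows the technique in its simplest form; the article does not say which method the platform uses, so treat the function names, the modulus and the sample salaries as assumptions made for illustration.

```python
import secrets

# Public modulus, assumed to be larger than any total that could occur.
MODULUS = 2**61 - 1


def split_into_shares(value: int, n_parties: int) -> list[int]:
    """Split a private value into random shares that sum to it modulo MODULUS.

    Any subset of fewer than n_parties shares looks like random noise.
    """
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    last_share = (value - sum(shares)) % MODULUS
    return shares + [last_share]


def secure_sum(private_values: list[int]) -> int:
    """Simulate the protocol: only per-party subtotals are ever published."""
    n = len(private_values)
    # Row i holds the shares that party i hands out, one per party.
    distributed = [split_into_shares(value, n) for value in private_values]
    # Party j adds up the j-th share from everyone and publishes that subtotal.
    subtotals = [sum(row[j] for row in distributed) % MODULUS for j in range(n)]
    return sum(subtotals) % MODULUS


salaries = [72000, 85000, 64000]  # Made-up figures, one per company.
total = secure_sum(salaries)
print(total, total / len(salaries))
```

Each company sees only random-looking shares from the others, yet the published subtotals add up to the true total, from which an average can be derived.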
Both this use case and the medical ones follow an operational pattern of bringing in the data and code, running on the optimum numbers of data, and sharing the input data and the temporary data that are required for the application itself. ChRIS also has mechanisms that can facilitate visualizing the data for clinicians.
The end goal is not just to make applications run faster on a single machine, but to open-source the data itself, while still remaining compliant with regulations like the U.S.’s HIPAA and Europe’s GDPR.
Where have you seen the rise of the open cloud? Tell us in the comments below.
About the Author
Jennifer Riggins is a tech storyteller and writer, where digital transformation meets culture, hopefully changing the world for a better place. Follow her on Twitter @jkriggins.