Dr Pamela Gay from the Planetary Science Institute on Citizen Science

In this podcast recorded at QCon San Francisco 2019, Shane Hastie, Lead Editor for Culture & Methods, spoke to Dr Pamela Gay from the Planetary Science Institute about her work using human wetware and machine learning in scientific research, fostering citizen science and ethical behaviours.

Key Takeaways

  • Space science involves processing petabytes of data every day  
  • Neither human nor machine learning can analyze all of that data effectively 
  • There is a need to produce detailed maps of the lunar surface and other bodies in the solar system
  • The value of using human wetware to map other worlds 
  • The importance of ensuring appropriate permissions, attribution and ethical behaviour when conducting research 


00:21 Introductions

00:21 Shane Hastie: Good day folks. This is Shane Hastie for the InfoQ Engineering Culture Podcast. I'm at QCon, San Francisco, 2019, and I've got the privilege of sitting down with Dr. Pamela Gay. She's the senior scientist at the Planetary Science Institute. Pamela, thanks so much for taking the time to join us.

00:38 Pamela Gay: Thank you so much for asking me to join you. It's been amazing just getting to talk in the lead-up to doing this.

00:44 Shane Hastie: You are doing the closing keynote at the conference today, When Machine-Learning Can't Replace The Human. How does this come in with planetary science?

00:54 When Machine-Learning Can't Replace The Human

00:54 Pamela Gay: Well, we often forget that in space science, we are getting petabytes of data per day, raining down on our heads and attacking us through the internet from our telescopes on and above the world. And there's so much data that we can't analyze all of it. Just a few grad students here, a few undergrads there, and it turns out machine learning can't do it either. And so we have to look for novel solutions and sometimes our novel solution is saying, "Hey human, can I train you to be part of my algorithm?"

01:37 Shane Hastie: There's all sorts of problems with algorithms at the moment. Isn't that a dirty word today?

01:41 The need to develop detailed maps of the moon

01:41 Pamela Gay: I have no such prejudice against language, but we do have this situation where we have data coming in from our spacecraft, and simple problems like "Where is it safe to land my spacecraft on the moon?" require extraordinarily detailed maps that tell us, "Well, over here there's a crater, over there is a ridge. That thing over there is a former volcano." And we need to avoid all of those. And as we're landing more and more things in more and more places, we need to figure out how we develop detailed maps. We have images of the moon at tens-of-centimeters resolution, which means if a person lay down on the moon, we'd see them as a dark smudge on our image. Now, our global maps are at the kilometer scale, and a person's a whole lot smaller than a kilometer. And it would be amazing if we could figure out how to get to those Google Maps of the moon at the same resolution as the data we have.

02:48 Shane Hastie: So how do we do that?

02:49 The challenges of using machine learning to identify lunar topography and how people help improve the algorithms

02:49 Pamela Gay: Well, this is where we're still struggling. When you feed software training data of just craters, the large circular basins made in the ground when a rock from space hits a solid surface, those craters change radically in appearance as the sun moves through the sky. They look different in different kinds of soil. They look different at different sizes and different ages. And so what we've been doing is inviting the public to come in and mark, over the course of seven years now, crater after crater after millions of craters. And we've been systematically feeding this into our machine learning algorithms and learning what tricks we need to employ.

03:36 Pamela Gay: And the trick seems to be, we need to get people to, "Well, map a little bit of this kind of geology, map a little bit of this color of soil, map a little bit of every different combination of features." And we keep improving our machine learning one step at a time. And there's this symbiotic relationship with our volunteers around the world, who are getting to contribute to new understandings about the moon while we train our computers to take over the task, so that we can move on to Mars, Mercury and other worlds throughout our solar system.

04:11 Shane Hastie: You used a fascinating term earlier that I'd like to explore a little bit further. I think we're going there. You used "human wetware to map other worlds"?

04:20 Pamela Gay: Yes.

04:21 Shane Hastie: So you're presenting photographs?

04:23 Human wetware to map other worlds

04:23 Pamela Gay: Yes. So, in an ideal situation, the way we learn from each other is through conversation and through example, and with computers, we just give them examples. But with you, and pretty much humans in general, if I say, "Here are pictures of five craters," 20 craters if I'm feeling industrious, your brain can generalize what you've seen and look at a myriad of different surfaces and say, "That's a crater, that's a crater, that's a crater." And we do this with a neural network in our head, that soggy, glitchy, gelatinous mess we call a brain, and we're interfacing your brain through your eyeballs and your keyboard into the data that we're getting off of our hardware in space, off of our various cameras, that we're sending through the vacuum, through our atmosphere, processing with our software, feeding to you through a website. And it becomes this mix of hardware, software, wetware, that gets us an answer.

05:28 Shane Hastie: How many people are involved at the moment?

05:29 Pamela Gay: Well, in our projects, which are slow and tedious, we measure our audience in the tens of thousands. But across the globe, working on not just our mapping of rocks, but mapping of oceans, of proteins, of all the different kinds of image data that's out there, there are literally millions of volunteers, people who are using their spare time for scientific good.

05:53 Shane Hastie: For no reward?

05:54 The reward of discovery and community

05:54 Pamela Gay: Well, there's that reward of discovery, of community. We do give people badges if they do enough stuff, just like you might earn a badge in Wizards Unite or Pokémon, but all of us at a certain level as little kids were interested in space, we were interested in dinosaurs, and it was an asteroid that killed the dinosaurs, so space won. And there's an amazing excitement in looking at these images and knowing, "I might be the first human being to really look at this in detail," and not knowing what that next image is going to be, what unknown thing you're going to find, what lost spacecraft on the moon is just going to happen to be a click away. And so people are driven by knowing that they're helping. We just recently helped the OSIRIS-REx spacecraft figure out where it's safe to land on Bennu. And that is a reward, to watch the news and know, "I helped keep that little spacecraft safe."

06:53 Shane Hastie: You were saying, in fact, talking about that particular mission. At around about the time that this is coming up, that's probably going to be happening?

07:00 The mission to Bennu

07:00 Pamela Gay: Yes. So right now, as we're talking, my team of citizen scientists, we did an initial map of the entire asteroid. Similar work was done by professional scientists around the world. All the different teams looked at different factors. The data was combined. We zeroed in on four different, potentially safe places to put our spacecraft. We have a spacecraft that's about two and a half meters on a side. We needed to find an area on the surface that was boulder free for hopefully at least five meters across, don't want to be knocking our solar panels off on a rock or anything like that.

07:36 Shane Hastie: Probably not a good idea.

07:38 Pamela Gay: Really bad idea. So we went through and we did initial mapping of the entire world. We found four different regions that look reasonably safe, and our spacecraft dropped down to a lower orbit and is taking higher resolution images. We just had a group of eight super users, volunteers working from their homes who were found to be just as good as professional researchers, who went through and mapped those four regions in detail. Any day now, they're going to announce which of those four regions we're going to drop our spacecraft down to and grab a sample of soil. That will happen in December, and when the folks out there are listening, our spacecraft, if everything goes well, is going to be on its way back to Earth carrying soil from another world that we can do research on. We can figure out, "Is it worth mining asteroids? What is the composition?" There's so many things we can't learn without actually reaching out and touching something.

08:37 Shane Hastie: Exciting stuff. Let's come down a little bit, come down to earth, come down to telecommunications, internet, machine learning, and explore. I made the comment about algorithms earlier. What are we doing? I know that you have a deep personal and professional interest in machine learning and in the use of technology for good. What's happening?

09:04 Machine learning and the use of technology for good

09:04 Pamela Gay: Well, in my perfect future, which I'm hoping to build, we are working to define a way to take all of the image-based problems that are out there and take all the people who just want to contribute to expanding human knowledge and bring them together in a community where people who join us have a firm understanding of how their data is going to be used and how they're going to be credited for their efforts, where scientists are coming in, respecting the people that they're working with and understanding, "My research is only made possible thanks to these people." Through these interactions, we knock out the problems, like mapping tiny asteroids, where your asteroid isn't big enough that even the entire surface would train a machine learning algorithm. So really that just has to be done by humans. But then we're able to, with bigger objects, moon, Mars, Mercury, take these consensually given data points, use them to train machine learning and extend our ability to map other worlds.

10:08 Pamela Gay: And at this point we don't even know what we don't know. A number of years ago, there was a scientist working on the Mars Reconnaissance Orbiter's HiRISE camera, which is an extraordinarily high-resolution instrument. It takes beautiful photos. And quite often scientists will say, "I'm interested in this latitude and longitude," and we jump on the surface of Mars to the things we know are there. And she just wanted to see what all is coming off the spacecraft right now. She just sent a bunch of images to the printer and, in flipping through them, she found that the spacecraft had caught an image of an avalanche on Mars in the process of happening. That just happened to be in the data. We don't know what we don't know. You can't train software, no matter how hard you try, to say, "There's a weird thing here." And this is where the human mind can be such a powerful tool for guiding our exploration as, well, we set the software down to do all the things we do get and come in and correct it for the things it doesn't get.

11:12 Shane Hastie: You made a point there, "Consensual use of that data." But isn't it just pictures?

11:18 The importance of consensual use of data

11:18 Pamela Gay: Well, it is. But imagine you go to a website, and this is a potential with my own website and it's something that mortifies me and I desperately want to make sure this never ever happens. It's possible for someone to bounce in, not really sure what they're looking at. The tutorial explains it to them, but maybe they think it's a game instead of understanding it's science. And they go through mapping rocks and they get fed up because they don't know why they're doing it and they bounce away, but they've now contributed data to a science project that's going in an archive, that will go in a published catalog, and they may, three months down the line, be watching TV, see an announcement and realize, "I contributed data to that. Why wasn't I ever given credit? Where's my name in this discovery?" And I don't want anyone out there to ever feel they're being taken advantage of.

12:07 Pamela Gay: I've talked to people even here at this conference who, when they hit a CAPTCHA on a page and they're asked to identify the signs, the buses, the cars, we as programmers often recognize what we're actually doing is training computer vision to be used by self-driving cars in the future. We are part of Google's training that will someday bring in great profit to the company. And we recognize, as programmers, that we are gaining free services by giving up this data. It's a two-way street, but not everyone understands this handshake, this "I give, you give." That is what essentially pays for all the services we get. And I don't want anyone to ever feel used. I want them to feel instead, "I'm contributing to something greater than myself. I'm contributing to helping us understand this great universe," that when we're having a bad day can maybe inspire us to think beyond ourselves and think far into the future.

13:10 Shane Hastie: You're bringing up some pretty interesting conundrums here in terms of, there's the consent. Do we know? Do we realize? And as you say, as programmers we recognize that that's what's being done with a CAPTCHA, for instance. Probably most of the people who're using it to just authenticate, "I'm a real human, dammit," don't realize that that's what's being done with it.

13:33 Pamela Gay: Exactly.

13:34 Shane Hastie: This starts to touch into the ethical areas. How do we balance ethics and research, ethics and growth?

13:42 Balancing ethics and research, ethics and growth

13:42 Pamela Gay: This is where we really need to learn from people in other fields. The medical field has been struggling for a long time with how do you ethically give people experimental drugs, experimental treatments, and look to see, "Well, am I going to accidentally kill them, am I going to extend their life?" And in doing this, there's been protocols put together over the years, there's been ideals that have been put together. And most interesting to me personally is there's been a lot of effort lately to make sure that when people are going in for medical treatment, they actually understand the risks that they're going to be facing. And we all assume we're not going to be the 30% that experiences a side effect. We're going to be part of the 95% that don't experience a side effect. So we don't read. We've all faced the fine print on the prescription our doctor gives us, and we shove it in a drawer or sometimes just straight into the recycling bin.

14:41 Pamela Gay: I don't want people to do that. I want them to take the time for the important issues and understand, "Well, the things that I'm contributing, these are actually going to help people's jobs. These are going to promote careers. This is going to lead to research papers. There's side effects of what I am doing." And what I've seen to learn from is that there are doctors' offices now that actually have cartoon images that are a quick, "Here are the things that can happen. Here is what we're going to do. Let us explain through a cartoon, a video, multimedia of some sort." And this kind of thing captures our brain in a different way. It's not that wall of text. It's literally the TL;DR of consent.

15:25 Drawing influence from other disciplines, such as medicine

15:25 Pamela Gay: And this is also being experimented with in psychology and biomedical research, where people are being asked to click through and interact in a tutorial, leveling-up way to prove they actually understand what's going to happen. And I love this concept of learning from our colleagues in multimedia on what communicates most effectively, and learning from our colleagues in biomedical: What are the concerns we need to worry about? What are the sensitive populations that we need to be careful about? And often we don't understand what problems we're creating until we talk to people from other populations.

16:04 Shane Hastie: You mentioned in one of those sessions where we were in together yesterday, one of the open space conversations about working with an ethicist. Do tell.

16:15 Pamela Gay: Learning from an ethicist. Yes. I work with Alison Reiheld. She is a professor at Southern Illinois University Edwardsville, and throughout her career she's worked on a number of topics in scientific ethics, medical ethics, looking most recently at how people of different genders are treated in medicine, where sometimes being a transgender individual means you don't get access to normal healthcare due to bigotry inherent in human beings, and doctors unfortunately are human beings. And we got to talking, because she works with these at-risk communities, about how they feel when they're essentially experiments all the time. And I made a passing comment somewhere along the lines of, "I'm so glad I don't have to deal with all these issues with the work I do. We're just mapping other worlds." And I had my eyes opened. There's simple things we don't think about. I don't deal with kids a lot, but there are teachers out there who want to give their kids authentic science engagements, and this is an amazing concept.

17:21 Pamela Gay: But kids are by nature self-righteous. They want credit for what they do. They want to fix the world. They're very interested in justice and very black and white. And there are kids out there who are being asked by teachers to do citizen science as a mandatory classroom exercise. So the first time they're encountering doing science, they're being asked to do the busy work of professional scientists in an environment that just gets them a grade and gets them no credit.

17:51 Pamela Gay: And imagine if the first time you ever did something, it was essentially slave labor for this person in an ivory tower. And she pointed out that there's a very fine line between saying, "Come, do this amazing stuff," and sounding like Tom Sawyer, advertising, "Come paint this fence. It's an amazing opportunity," and taking advantage of people, versus actually breaking down the barriers and saying, "I want you as my peer. I want you as my colleague. I want to give to you, as a member of my community, as much knowledge and understanding as I possibly can in exchange for you helping me with this task I can't do any other way." And especially when we're dealing with children and classroom assignments, we need to make sure we aren't Tom Sawyer. We aren't painting that fence and disillusioning children about what the process of science is. And I would never have thought of that, except I made a stupid comment because I needed to be woken up. And this is what happens when we talk to other people in other fields and think interdisciplinarily.

19:03 Shane Hastie: Spanning boundaries, listening to others. Who would have thought? It will never work.

19:09 Pamela Gay: It's crazy. I know.

19:09 Shane Hastie: Crazy talk.

19:11 Citizen Science as a driver of inclusion 

19:11 Pamela Gay: And citizen science is one of those things where our universe is infinite. Our data is measured in petabytes, and we need to open the door to all kinds. And this is where we also need to figure out how do we make communities that are friendly to all kinds? We've realized that there are a lot of people who are on permanent disability, who find meaning in their day, by doing citizen science. By contributing to our community as moderators, as people in our forums, just keeping the conversation going.

19:44 Pamela Gay: There are a lot of people who live in remote areas and are the only geek where they are. And we're giving these people community. And here scientists are not necessarily trained to build community unless it's around being one of the 5,000 people who helped understand a neutron star-neutron star merger, and then everyone had a reason they needed to be there. Now, we're just inviting everyone in. Come as they are, and let's build a place where they're accepted. So we're now having to learn from the social scientists. And it turns out that if it takes a village to raise a child, it takes a global community to raise our understanding of the universe.

20:24 Shane Hastie: Inspiring stuff. Dr. Gay, if people want to continue the conversation, where do they find you?

20:30 Pamela Gay: is our website. And we are cosmoquestX on pretty much every platform out there. The X marks the science, and we invite you to join us on Twitch, Twitter, just on our website, on our Discord. We try and be all the places that people hang out in their spare time on the internet.

20:49 Shane Hastie: Thanks so much.

20:50 Pamela Gay: My pleasure.





More about our podcasts

You can keep up to date with the podcasts via our RSS Feed, and they are available via SoundCloud, Apple Podcasts, Spotify, Overcast and Google Podcasts. From this page you also have access to our recorded show notes. They all have clickable links that will take you directly to that part of the audio.
