Researchers continue to gather data from social media and other sources to assess information for businesses, universities and maybe public policy. And guess what? Some of the data may be about you.
A forum Wednesday in Milwaukee looked at whether enough ethical standards are in place to protect the public.
The forum was at Northwestern Mutual, which last year started a Data Science Institute. It's a $40 million collaboration with Marquette University and UW-Milwaukee that aims to advance southeastern Wisconsin as a technology hub and do more data-based research. As the institute ramps up, Northwestern Mutual and Marquette played host to an annual symposium on the ethics of so-called "big data."
Keynote speaker Michael Zimmer, a professor of information studies at UW-Milwaukee, began by quoting the late author and educator Neil Postman.
"'Technology is a Faustian bargain. Technology is something that gives and takes away, but not always in equal measure,'" Zimmer recalled Postman saying.
Zimmer says he supports responsible practices. But he says new technology keeps emerging that doesn't seem to protect users. Examples range from several controversies involving Facebook to one involving a creative programmer and the dating site OkCupid.
"You share information to try to find a match. All kinds of intimate information, personal information, to try to find the right partner for your life," Zimmer told the audience. "A researcher, actually a student researcher, created an OkCupid account, wrote a script to scrape 70,000 accounts, did some analysis and published all that data, without even doing any attempt to anonymize."
Computer ethicists say it's essential to keep research subjects anonymous, or to de-identify them by masking their demographic details. Zimmer also says some people choose to go on social media without understanding that their words and photos can persist for a long time and may be scrutinized by people outside their social network.
"All of a sudden you realize that, 'Wait, my tweets are being used in a data set, you know, to study whether or not you can predict if someone is suicidal? Wait, my picture is being used to determine whether someone is gay or straight by some weird 18th-century theory about the way that you look predicts something about you?'" Zimmer said.
Another computer expert, professor Kyle Jones of Indiana University-Indianapolis, says it's not just the private sector that wants to collect and analyze data. He says educational institutions do it, too, in a practice often called "learning analytics." Jones says the research is sometimes targeted.
"We could probably argue who's going to be surveilled more than other student populations are those who are already known to be disadvantaged. Minority students. Students who come from families who don't have an educational background. Students who are disenfranchised are going to feel, or the underprivileged are going to feel, greater surveillance than those who are privileged," Jones said.
He says schools may believe they are legitimately trying to boost student performance while allocating scarce resources in difficult political times.
"We're trying to reduce the political pressures from outside stakeholders. Perhaps not as much at Marquette, but Milwaukee and Madison feel the burn from state legislators down in Madison," Jones speculated.
He says it's vital that students be able to trust that universities won't release personal data to the public, and that schools obtain students' consent before collecting information or allow them to opt out of disclosure.
UW-Milwaukee Data Services Librarian Kristin Briney says researchers can also avoid a lot of professional headaches if they limit their data gathering, because information often leaks.
"Security isn't perfect. Things get out. Technology advances. Holes happen," she warned.
Briney says the leaks sometimes come from within the company or institution. She also encourages researchers to have a clear plan for deleting data once they have finished using it.
Keri McConnell, of the Data Science Institute, says the institute follows Northwestern Mutual's privacy policies and has a code of conduct. But she says carrying out those principles can be challenging.
"We are constantly trying to find that right balance between advancing our craft and getting insights from the data, and respecting the individuals that data represents. And that's not an easy line to walk," McConnell said.
McConnell says she's urging her team to understand that ethics are a core component of their work.
Do you have a question about innovation in Wisconsin that you'd like WUWM's Chuck Quirmbach to explore?