There is no standard of ethics in computer security research
August 3, 2014  

Research ethics has been in the news lately, beginning with Facebook’s experiments on causing unhappiness in its users and dating site OkCupid’s experiments on its own users. Some consider this just standard A/B testing, while others see it as actively harming users without informed consent.

The security angle comes up in the recently cancelled Black Hat talk by CERT researchers, which appears to have been about methods for de-anonymizing Tor users. The Tor Project found evidence that its users had been targeted, possibly by the researchers.

This has caused a bunch of security researchers to condemn the work as unethical. I think that’s a farce: plenty of respectable academic security researchers do work that could just as well be considered unethical.

Most academic security research does not go through a formal ethical approval process, unlike, for example, biomedical research in the US, which must pass an institutional review board, or IRB (this is mandated by the Department of Health and Human Services, which provides much of the funding). In security, ethics is left up to the researcher, which means there are no standards: researchers have wildly varying opinions on what is ethical.

Here are a few places where ethical questions come up in security research.

Botnets

A botnet is a collection of computers that have been infected with malware. The bots are controlled by someone who can use them to make money (e.g., by sending spam) or cause mischief (e.g., by launching distributed denial-of-service attacks).

Some botnet research involves interacting with bots: for example, to see how many bots are in the botnet you might try contacting them all, which involves asking a bot that you have identified to tell you who else is in the botnet.
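
In outline, such a crawl is just a breadth-first walk of the botnet’s peer graph. Here is a minimal sketch, in which the hypothetical query_peers() stands in for whatever peer-exchange protocol the botnet actually speaks:

    from collections import deque

    def crawl_botnet(seed_bot, query_peers):
        """Breadth-first walk of a botnet's peer graph.

        query_peers(bot) is a hypothetical stand-in for the botnet's
        actual peer-exchange protocol: it contacts a bot (a victim's
        machine!) and returns the peers that bot knows about.
        """
        seen = {seed_bot}
        frontier = deque([seed_bot])
        while frontier:
            bot = frontier.popleft()
            for peer in query_peers(bot):
                if peer not in seen:
                    seen.add(peer)
                    frontier.append(peer)
        return seen  # an estimate of the botnet's membership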

Crucially, this involves interacting with a victim’s computer. This is the point at which I would have to stop and think carefully. Is it OK to probe a victim’s machine without getting permission in advance? Is it possible that this could have adverse effects on their computer? If I have identified a victim do I have an ethical obligation to inform them of the problem?

These questions are typically ignored by researchers. Consider, for example, “Storm: when researchers collide,” which describes security research on the Storm botnet. It contains this remarkable passage:

The trouble with having so many easily performed attacks is that invariably they are used often. Indeed, at one point or another all of these attacks have been performed in the Storm network—often several of them concurrently. To successfully crawl the network a researcher must put extensive engineering time into detecting and reducing the effectiveness of each of these attacks. This, in turn, encourages other researchers to perform more stealthy and difficult to detect attacks. There was a joke at a recent security conference [7] that eventually the Storm network would shrink to a handful of real bots and there would still be an army of rabid researchers fighting with each other to measure whatever was left!

Notice that they describe the activities of some researchers as “attacks,” which must be countered by other researchers. When researchers find themselves facing questions like “Which attacks are ethical?”, I think someone has probably crossed the line.

Botnet victims are not the only ones affected by research. In another paper the same authors elaborate:

Before October 2007, Storm shared its overlay network with users of Overnet-based file-sharing programs such as MLDonkey. As a result, a crawler needs to differentiate between nodes participating in the Storm protocol and other applications.

In other words, the researchers not only interacted with bots, they interacted with perfectly ordinary machines that happened to use the same peer-to-peer network.

I don’t want to pick on these particular researchers. In fact, their paper appeared in a conference session titled “Measurements, Uncertainties, and Legal Issues,” and another paper in the same session was called “Conducting Cybersecurity Research Legally and Ethically.”

Peer-to-peer networks

P2P networks in general are popular in security research because they are often used by malware and are often used to commit (what some consider) copyright infringement. They are also used for legitimate purposes (downloading software packages, etc.). The ethical problem for security researchers is, as above, that research on the bad uses of a network can affect its legitimate users.

Reverse engineering

Reverse engineering is a useful tool for security researchers. Besides its use in understanding malware, it can be used to find security vulnerabilities in non-malware, e.g., commercial software.

The ethical problem here is that most commercial software comes with a licensing agreement that expressly forbids reverse engineering. Licensing agreements often impose unreasonable conditions on users, and this may be a good example. But that doesn’t mean that academic security researchers are free to ignore those conditions.

Web scraping

I’ve had to abandon some research ideas that involved web scraping. For example, I wanted to explore ways of identifying mobile malware by analyzing mobile app stores. Unfortunately, the only way to get information out of these stores is web scraping, which is either explicitly disallowed in the terms of service (Apple) or watched for and actively shut down (Google). Moreover, many web sites explicitly forbid scraping (e.g., via a robots.txt file).
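
Honoring robots.txt is at least mechanically easy. Here is a minimal sketch using Python’s standard library; the site and user-agent string are placeholders:

    from urllib.robotparser import RobotFileParser

    # Placeholder URL and user agent; substitute the site being studied.
    robots = RobotFileParser("https://example.com/robots.txt")
    robots.read()  # fetch and parse the site's robots.txt
    if robots.can_fetch("my-research-crawler", "https://example.com/apps/"):
        print("robots.txt permits fetching this page")
    else:
        print("robots.txt forbids it; time to find something else to do")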

If someone tells me explicitly that I’m not allowed to gather information through their web site, I’m going to find something else to do. But I’ve seen plenty of security research that appears to have relied on gathering information from app stores without the cooperation of Apple or Google.

Is that ethical?

Port scanning

Port scanning is the practice of connecting to computers on the network (potentially, all computers on the internet) to see what services they provide. It’s often used with ill intent, for example, to identify vulnerable computers to attack. It also has many good uses, for example, to identify vulnerable computers so that they can be fixed. In any case, it’s something that many internet service providers forbid: they don’t want their customers to scan other networks, and they don’t want anyone else scanning their network.
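
The technique itself is trivial. A toy “connect scan” of a single host might look like the sketch below (the target and ports are placeholders, and of course you should only point it at machines you are permitted to scan):

    import socket

    def connect_scan(host, ports, timeout=0.5):
        """Try a TCP connection to each port; report the ones that accept."""
        open_ports = []
        for port in ports:
            try:
                with socket.create_connection((host, port), timeout=timeout):
                    open_ports.append(port)  # something is listening here
            except OSError:
                pass  # closed, filtered, or unreachable
        return open_ports

    # Placeholder target: scanning your own machine is uncontroversial.
    print(connect_scan("127.0.0.1", [22, 80, 443]))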

Robert Graham, the author of masscan, says that

It’s not that scanning is intrinsically bad or illegal, it’s just that it’s associated with hackers/scammers/spammers.

And later,

The Internet is designed to be an “end-to-end” network, where such massive scanning is as normal as spidering websites for search engines. A lot of people aren’t happy about this, of course, but such scanning is intrinsic to the design of the Internet.

I see a lot of problems here, starting with the historical revisionism. More importantly, he admits that many people object very strongly to being scanned:

But frankly, being the good guy is a lot of hassle. I don’t mind being called a “fucking asshole” in the abuse complaints (which happens), but I do mind the legal threats. I’m extremely open and transparent about my scans, documenting when I do them, what I’m doing, the raw results, and the source code of the tool I use. Yet, all this can become evidence in a trial in the modern climate of over-prosecution of researchers.

Cybersecurity laws are overly broad, and prosecutions under them are often overzealous, but that doesn’t mean that one person or one community can unilaterally declare the laws null and void. When the disagreement is this strong, you are well into an ethical gray zone.

For a particularly apt example, he explains how he used masscan to look for computers vulnerable to Heartbleed; I’ve already written about someone who was arrested for similar behavior.

I happen to agree with Graham when he says,

Moreover, the more we know about the state of the Internet, the better we can secure it. It’s astonishing how little people know about what’s listening on the Internet. The more we do this, and publish our results, the better off the Internet will be.

but many people will see this as arguing that the ends justify the means, a classic ethical dilemma.

PR

Computer security is the first academic computer science field I have encountered that issues press releases to advertise publications. Remarkably, there is even a press presence at conferences and workshops.

This breeds sensationalism rather than truth, and it results in a focus on attacks rather than defenses. I don’t think this improves the field, and hype over security issues can even be unethical.

Disclosure

Within the field there are arguments over when and how security vulnerabilities should be disclosed to the public. Security researchers often point to vendors as being unethical for ignoring and hiding vulnerabilities, but it must also be said that vendors complain about the ethics of security researchers: for example, that releasing a vulnerability before a patch is available puts users at risk.

Retaliation by vendors has been so strong that some researchers find it more appropriate to sell vulnerability details on the open market.

This is another area where people simply don’t agree on what is ethical. (MailPoet vs. Sucuri is a recent example.)

Back to CERT

I see a lot of researchers with a holier-than-thou attitude towards the CERT research (“CERT has some explaining to do”).

But as I’ve shown, there really is no agreed-upon ethical standard for the field as a whole, and it’s doubtful that researchers would be happy if we tried to impose one. That would inevitably look like the IRBs that apply to medical research, of which danah boyd says:

IRBs are an abysmal mechanism for actually accounting for ethics in research. By and large, they’re structured to make certain that the university will not be liable. Ethics aren’t a checklist. Nor are they a universal. Navigating ethics involves a process of working through the benefits and costs of a research act and making a conscientious decision about how to move forward. Reasonable people differ on what they think is ethical. And disciplines have different standards for how to navigate ethics. But we’ve trained an entire generation of scholars that ethics equals “that which gets past the IRB” which is a travesty. We need researchers to systematically think about how their practices alter the world in ways that benefit and harm people. We need ethics to not just be tacked on, but to be an integral part of how everyone thinks about what they study, build, and do.

IRBs, in other words, are a lowest common denominator, equivalent to “no bad PR” and “no legal liability.” Most security researchers I know don’t want to go by those standards.