In an astonishing victory for companies whose business models depend on scraping data from the web, the U.S. Ninth Circuit Court of Appeals held this week that such activity does not violate the U.S. Computer Fraud and Abuse Act (CFAA). The decision allows hiQ Labs, which describes itself as “a data science company, informed by public data sources, applied to human capital,” to continue scraping publicly available profile information from LinkedIn for its own business purposes. It highlights the divide between Europe and the U.S. over the notion of privacy and data protection. It also brings into sharp view the fault lines between privacy and competition policy, particularly in the context of major tech platforms and the data ecosystems they nurture.
In this case, in addition to that “sign,” LinkedIn handed the “visitor” a personal note, the cease-and-desist letter, warning that they were not welcome on the property. But the court rejected such an argument, at least insofar as the encroachment is regarded as a violation of the criminally enforced CFAA.
The court distinguished between access to publicly available profile information on LinkedIn, which cannot be “unauthorized,” and access to information on Facebook, which is restricted to users who sign in to the platform with their username and password. Circumventing such password restrictions to scrape data could be a violation of the CFAA (Facebook v. Power Ventures).
To a European bystander, the decision may seem odd. How could hiQ possibly be allowed to scrape individuals’ personal data and use it for “people analytics”? What is the legal basis for this? After all, even if the information is publicly available, individuals have not consented to such a use, and they do not have a contract with hiQ.
That crosses GDPR Articles 6(1)(a) and (b), consent and contract, off the list of possible legal bases.
Herein lies the trans-Atlantic divide on privacy and data protection.
In Europe, a company needs a legal basis, that is, positive permission, to process data. You are allowed to do only what the law explicitly sanctions. In the U.S., the opposite is true: a company, or anyone really, is allowed to do anything with data as long as the law doesn’t prohibit it. And indeed, the hiQ court held that the law, or at least the CFAA, doesn’t prohibit access to an area that is open to the public. The difference in views on privacy is particularly stark in connection with publicly available information, since in the U.S. any limitation on the collection and use of such data also triggers First Amendment concerns.
The privacy implications of the decision were not lost on the Ninth Circuit. To LinkedIn’s argument that hiQ should be enjoined from accessing data to protect users’ privacy, the court replied that such privacy interests are outweighed by hiQ’s right to conduct business. The court stated that “there is little evidence that LinkedIn users who choose to make their profiles public actually maintain an expectation of privacy with respect to the information that they post publicly, and it is doubtful that they do.”
More saliently, the court questioned the bona fides of LinkedIn’s argument, given that LinkedIn itself offered recruiters analytics services similar to hiQ’s. To that effect, the court quoted a CBS interview with LinkedIn CEO Jeff Weiner, who expressed the platform’s intent to “leverage all this extraordinary data we’ve been able to collect by virtue of having 500 million people join the site.”
It’s worth quoting the court on this issue:
“HiQ points out that data scraping is a common method of gathering information, used by search engines, academic researchers, and many others. According to hiQ, letting established entities that already have accumulated large user data sets decide who can scrape that data from otherwise public websites gives those entities outsized control over how such data may be put to use.”
And: “We agree with the district court that giving companies like LinkedIn free rein to decide, on any basis, who can collect and use data—data that the companies do not own, that they otherwise make publicly available to viewers, and that the companies themselves collect and use—risks the possible creation of information monopolies that would disserve the public interest.”