Is Big Data Corrupting the U.S. Election Process? | Center for Digital Ethics & Policy

Molly Kozlowski

As the 2020 election cycle ramps up, voters can expect a flurry of targeted advertisements fueled by big data on their doorsteps and in their inboxes and social media feeds. Microtargeting based on demographic information is not a new trend in campaign strategy, but before big data was easily accessible, campaigns relied on analyzing voter behavior within broad categories such as age or gender. Now, much as the private sector employs data to target consumers, campaign strategists use data to microtarget specific voters, tailoring their advertising to the needs and goals of their candidate or platform. However, these practices may jeopardize the integrity of our electoral process.

To build their databases, campaigns rely on voter information collected by states when an individual registers to vote, such as phone numbers, gender and party affiliation. Another source is response data, which consists of information given voluntarily by voters through door-to-door canvassing, telephone conversations and websites. Both kinds of information are organized within digital databases such as the Voter Action Network (VAN), a database employed by Democratic candidates that uses 20 million voter profiles to support estimates, craft fundraising and volunteer requests and link voters’ emails to their social media profiles. Commercial firms also assist in data collection and analysis. Amazon Web Services (AWS), for example, partnered with the Obama 2012 campaign to develop models and analytics for targeted media advertisements.

During the current 2020 cycle, both parties have been expanding their data collection and analysis efforts. For example, President Trump’s re-election campaign is using the data of 1.6 million volunteers to build a large-scale ground-game operation. In 2016, the Trump campaign didn’t gain access to the RNC’s data and volunteer operations until after he secured the nomination midway through the year. Now, data is at the core of how the Trump team hopes to organize its campaign operations, using big data in real time to determine which cities are most effective for rallies or surrogate stops and which issues the campaign should focus on in those cities. Campaign events are also being designed to serve as data harvesting opportunities for SMS text contacts.

On the Democratic side of the race, billionaire and former New York City mayor Michael Bloomberg is leading an initiative to help transform the Democrats’ data infrastructure by building a new data-centric political operation or “tech stack” to help process and apply data. The team “wants to collect data about voters on an unprecedented scale, match those data with consumer data and then hire a team of engineers to do high-level analyses.” Meanwhile, the DNC is spearheading a new database called the Democratic Data Exchange, an “independent, for-profit enterprise that will allow the national party, state parties and some independent groups, like Planned Parenthood or labor unions, to share data.”

Although having access to this type of data allows candidates up and down the ballot to understand their voters better, it may be harmful to the security and integrity of the U.S. election process in the long term. During the 2016 election cycle, there were multiple examples of how the personal information of voters can easily be accessed and abused by campaigns.

Voter security, in particular, became an area of great concern when a high-profile voter data breach occurred during the Democratic primary in 2015. A glitch in the software’s security gave staffers from the Bernie Sanders campaign access to private campaign data from the Hillary Clinton campaign. Stored on the VAN network, this data included not only information about voters but also proprietary data collected throughout the campaign.

The Cambridge Analytica consulting firm was involved in another highly controversial use of data in the 2016 election cycle. Using the research of Aleksandr Kogan of the University of Cambridge, the firm was able to harvest data from 50 million Facebook users, which it sold to the Trump campaign to help make “targeted online ad buys” during the election. Kogan acquired the data through a survey taken by 270,000 Facebook users, which also enabled access to the data of the survey-takers’ Facebook friends through a loophole in the platform’s API.

Additionally, on Twitter, pro-Trump bots, or “autonomous software applications,” automatically “sent targeted messages to voters and generated one-quarter of all Twitter traffic,” outnumbering Clinton bots five to one. The data these bots collected enabled them to create strategic messages based on the fears and anxieties voters expressed about Second Amendment rights.

As 2020 candidates continue to rely on data to fuel their outreach and media, the risk of the same security and privacy violations we saw in 2016 remains, potentially on a greater scale. None of the revelations of the past cycle have led to serious action by public officials to restrict and regulate data collection and its use on the campaign trail.

Currently, most state voting laws fail to abide by the basic requirements laid out by the Fair Information Practices (FIPs), which establish appropriate controls over the collection, use, and disclosure of personal data, including protecting data against security threats and establishing accountability measures. These laws neither notify the public about the collection of Social Security numbers nor allow for optional data fields, and they offer few choices to citizens who may wish to redact certain information or limit secondary uses. Some states have taken action to address this issue, including Vermont, which implemented a new law that requires data brokers that buy and sell data from third parties to register with the state. California passed a law that would give residents the ability to opt out of having their data sold. A few other states have similar bills in the works that have yet to be passed.

Another avenue of regulation is closing the loopholes in current federal laws, like the Federal Election Campaign Act. One component of such a reform would require political actors to disclose their campaign data practices to the public, including the harvesting of voter information. A second component would require campaigns to provide a disclaimer identifying ads that were produced using targeted data practices.

Despite clear pathways towards regulating data usage, the true challenge in strengthening voters’ control over the content they consume is policymakers’ own reliance on data. Congress and other legislative bodies hesitate to restrict the tools and strategies that help them get elected, making it difficult to make significant strides in privacy policy. Evan Halper of The New Republic attributes this difficulty to a lack of understanding of the technology among legislators. Halper says this lack of understanding fuels unethical data usage and acts as a barrier to true reform, making it nearly impossible for lawmakers and the average voter to grasp the implications of unregulated data.

Additionally, as data collection continues to be profitable for media giants like Facebook, the platforms themselves have made few transformative changes to improve voters’ privacy and transparency. In March of 2018, Zuckerberg stated that Facebook would investigate “all apps that had access to large amounts of information,” such as the ones used by Cambridge Analytica. By the following August, the company reported that it had investigated thousands of third-party apps and suspended more than 400. However, Facebook has not reported any significant increase in suspensions since then.

Facebook has also done little to allow voters to choose what information campaigns can potentially mine from their profiles and activity. Last May, the platform announced that it was in the process of creating a “clear history” tool that would enable users to erase all the information Facebook had gathered about their personal browsing history. At the time, Facebook said the tool would “take a few months to build,” but it has yet to announce a hard release date. Facebook’s “pivot to privacy” plan also involves integrating all three of the company’s messaging apps (WhatsApp, Instagram, and Messenger) into a single platform with end-to-end encryption. However, as with the “clear history” tool, the project’s timeline has remained unclear. Thus, Facebook’s confusing and slow steps towards reform post-Cambridge Analytica raise concerns over whether tech companies are truly interested in giving their users control over their use of the platform.

As campaigns continue to increase their dependence on data, threats to the safety and privacy of voters’ information will grow if campaign databases and social media platforms are left unregulated and vulnerable to potential leaks. Thus, voters must challenge their legislators, commercial data vendors, candidates, and social media platforms to respond to the need for stricter security and transparency to protect the integrity of the American electoral process.

Molly Kozlowski

Molly Kozlowski was born and raised in Sonoma County, California. She is currently in her third year as a communications and political science student at Loyola University Chicago. In addition to her studies, she acts as the Communications Director for Indivisible Loyola and enjoys competing on LUC’s club cross-country team.
