Mar 28, 2019

Data, Rivalry and Government Power: Machine Learning Is Changing Everything

By Steven Weber and Gabriel Nicholas

It wasn’t that long ago that the digital economy was thought to float on a plane above conventional geopolitics and economics. The global Internet, aspiring to be “free and open” and surprisingly close to that in reality, was a general-purpose technology as revolutionary as the internal combustion engine, telephony, electricity and industrial machines all wrapped into one. Phrases such as “the death of distance,” “the end of geography” and “organizing without organizations” might have been perceived as a bit over-hyped, but not entirely mistaken. Directionally, at least, they signaled a profound new reality of political economy and social life. Competition would be immense and intense in this new reality, in part because the advantages of incumbency in the pre-Internet era had become the legacy burdens of the obsolete installed base in the new era, and everything was up for grabs from just about anywhere that had a decent bandwidth connection.

That kind of market hyper-competition left no room for rivalry: zero-sum confrontations buttressed by the power of governments. Why battle with rivals over a limited pie when new value could be created by simply routing around or transcending old conflicts? Why have governments steer development when the only limit on private innovation is Moore’s Law? This hopeful narrative about digital technology reached its apogee in John Perry Barlow’s 1996 A Declaration of the Independence of Cyberspace: “Governments of the Industrial World, you weary giants of flesh and steel, I come from Cyberspace, the new home of Mind. On behalf of the future, I ask you of the past to leave us alone. You are not welcome among us. You have no sovereignty where we gather.” It seems mildly absurd in 2019, but Barlow’s words were sufficiently mainstream to be taken seriously when he delivered them at the premier convening of incumbent power, the World Economic Forum’s Annual Meeting in Davos, Switzerland.

Fast forward to the present, and the narrative around digital technology seems almost entirely different. Data have become a strategic commodity that companies and governments are trying to protect, defend and even hoard to the exclusion of others. In 2018, it became standard practice for countries around the world (even countries that have very little research or commercial activity to speak of in this area) to write and announce national “AI strategies.” Developing human capital in data science and machine learning is becoming a strategic priority for governments. The multiple-winner optimism of market competition has receded in favor of a rivalrous clash for technological hegemony, and governments are no longer standing on the sidelines. The leaders of Russia, China and the United States have all said in one way or another that dominance in AI will over the next decade translate into dominant national power and leadership in the global economy and security.

How did we get here?

At the frontiers of technology, competition has transformed into rivalry. How did this happen? And is this a transition phase, or is it rather a new geopolitical reality?

One interpretation is that we’ve seen this story, or at least one very much like it, unfold before in digital technology, and that calm is warranted as we transit through a zero-sum, government-led rivalrous phase that will not become anything like a persistent condition. The battles over the semiconductor industry between the US and East Asia during the 1980s had some of these characteristics, but they didn’t last. After years of subsidization, the creation of national consortia like Sematech, contentious trade negotiations around business practices like the Structural Impediments Initiative, and much hand-wringing about losing the race for predominance, the world came to an intellectual and political synthesis that significantly turned down the heat of national rivalry in semiconductors.

Japan suffered through an economic collapse that was cushioned little, if at all, by its digital technology assets. Chinese manufacturing rose to the fore and made possible a hyper-efficient global supply chain for computers and other electronics (and later mobile phones) that massively accelerated the dissemination and deployment of today’s digital infrastructure around the world. The rise of open-source software and the globalization of the supply chain for software engineering (led by IBM, a US-based multinational that re-constituted itself in the early 2000s as a “globally integrated enterprise”) took that part of the digital value chain out of the national rivalry mindset.

When Thomas Friedman in 2005 proclaimed that the world was “flat,” he captured the mood of that moment accurately and succinctly — but almost precisely at the moment when this period of relative calm was coming to an end. It might have been natural to assume that the next generation of digital technology — machine learning above all — would continue to fit Friedman’s template for a globally competitive level playing field. But machine learning is different.

Machine learning broadly refers to the science and technology of machines capable of sophisticated information processing that are not “programmed” in a traditional sense by people writing sets of instructions (code) that the computer then executes. Machine learning instead uses a set of methods that enable computers themselves to extract patterns from large data sets and evolve their own algorithms (decision rules) that the machine then runs against new data, problems and questions. To recognize a face, for example, a machine-learning system does not run code that a programmer wrote to describe what eyes, ears and noses look like. Instead, it derives an algorithm that infers what faces look like from a training data set according to methods and rules that were the human contribution. The facial-recognition algorithm is a property of the machine-learning system, and even in a case that is only moderately sophisticated, may involve a set of parameters that are orders of magnitude larger and more complex than a human can comprehend.
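The contrast between a programmed rule and a learned one can be sketched in a few lines. This is a toy illustration, not a face recognizer: the function name `learn_threshold` and the six training pairs are hypothetical, and the "method" supplied by the human is simply "predict True when a value is at or above some cutoff." The cutoff itself is derived from the data.

```python
def learn_threshold(examples):
    """Return the cutoff that misclassifies the fewest training examples.

    examples: list of (value, label) pairs, where label is True or False.
    The human contribution is the form of the rule (value >= cutoff);
    the cutoff is extracted from the data rather than written by hand.
    """
    values = sorted(value for value, _ in examples)
    # Candidate cutoffs are the midpoints between neighbouring values.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]

    def errors(cutoff):
        # Count examples where the rule disagrees with the label.
        return sum((value >= cutoff) != label for value, label in examples)

    return min(candidates, key=errors)


# Hypothetical training data: small values are labeled False, large ones True.
training = [(1.0, False), (2.0, False), (3.0, False),
            (6.0, True), (7.0, True), (8.0, True)]

cutoff = learn_threshold(training)


def predict(value):
    """Apply the learned rule to a new, unseen value."""
    return value >= cutoff
```

Changing the training pairs changes the rule without anyone editing the code, which is the sense in which the resulting algorithm belongs to the system rather than to the programmer. A real system does the same thing with millions of parameters instead of one.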

The methods and rules that humans contribute to the system are its engine, sitting at the intersection of advanced statistics and computer science. Data is the fuel that makes the system run. But the analogy only goes so far, because the engine and the fuel both fade into the background as the system learns and improves its performance. And this is where the new dynamics of rivalry come to the fore, because data and machine learning systems together constitute a positive feedback loop where the leaders will tend to accelerate ahead of the laggards at an increasing rate. Those leaders and laggards can be countries just as easily as they can be companies, replacing corporate competition with national rivalry that itself can become a positive feedback loop and something of a self-fulfilling prophecy, at least for a time.

The logic of the positive feedback loop is conceptually simple. Consider this abstraction: Users in country X send “raw” data to machine-learning companies in country Y as they use digital products. Those companies use the “imported” data as inputs to their systems that, in turn, create higher value-added data products. These might be algorithms that tell farmers precisely when and where to plant a crop for top efficiency; business process re-engineering ideas; healthcare protocols; annotated maps; consumer predictive analytics; insights about how a government policy actually affects the behavior of businesses or individuals and more. These value-added data products are then exported from companies in country Y back to users in country X.

Here’s a concrete example: Imagine that a large number of Parisians use Uber on a regular basis to get around the city. Each passenger pays Uber a fee for her ride. Most of that money goes to the Uber driver in Paris. Uber itself takes a cut, but it’s not the money flow that really matters here. Focus instead on the data flow that Uber receives from all its Parisian “customers” (best thought of here as including both “sides” of the two-sided market; that is, Uber drivers and passengers are both customers in this simple model). Each Uber ride in Paris produces a quantum of raw data — for example, about traffic patterns, or about where people are going at what times of day — that Uber collects. This mass of raw data, over time and across geographies, feeds the further development of Uber’s algorithms. These, in turn, are more than just a support for a better Uber business model (although that effect in and of itself matters because it enhances and accelerates Uber’s competitive advantage over traditional taxi companies). Other, more ambitious data products will reveal highly valuable insights about transportation, commerce, life in the city, and potentially much more (what is possible stretches the imagination). Now, if the mayor of Paris in 2025 decides that she needs to launch a major re-configuration of public transit in the city to take account of changing travel patterns, who will have the data she’ll need to make good decisions? The answer is Uber, and the price for data products that could immediately help determine the optimal Parisian public transit investments would be (justifiably) high.

Stories like these could matter greatly for longer-term economic development prospects, particularly if there is a positive feedback loop that creates a tendency toward natural monopolies in data platform businesses. It’s easy to see how this could happen, and hard to see precisely why the process would slow down or reverse at any specific point. The more data that machine-learning companies absorb, the faster the improvement in the algorithms that transform raw materials into value-added data products. The better the data products, the higher the penetration of those products into markets around the world. And since data products generate more data as they are used, the data imbalance would grow over time. More raw data moves from country X to country Y, and more data products move from country Y back to country X, in a positive feedback loop.
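The loop described above can be made concrete with a toy simulation. Everything in it is an assumption chosen for illustration: two rivals `A` and `B`, product quality modeled as data stock raised to a power `gamma`, market share proportional to quality, and a fixed amount of new data each period divided by market share. It is a sketch of the logic, not an empirical model of any company or country.

```python
def simulate(data_a, data_b, gamma, periods, new_data_per_period=100.0):
    """Return A's market share in each period of a two-rival data loop.

    gamma > 1: quality rises faster than data (increasing returns).
    gamma = 1: quality is proportional to data.
    gamma < 1: quality rises more slowly than data (diminishing returns).
    """
    shares = []
    for _ in range(periods):
        quality_a = data_a ** gamma
        quality_b = data_b ** gamma
        share_a = quality_a / (quality_a + quality_b)
        shares.append(share_a)
        # Using a product generates data, so new data follows market share.
        data_a += new_data_per_period * share_a
        data_b += new_data_per_period * (1.0 - share_a)
    return shares


# A starts with a modest lead: 55 units of data against 45.
increasing = simulate(55.0, 45.0, gamma=2.0, periods=20)
proportional = simulate(55.0, 45.0, gamma=1.0, periods=20)
diminishing = simulate(55.0, 45.0, gamma=0.5, periods=20)
```

Under these assumptions the small initial lead grows every period when `gamma` exceeds 1, stays fixed when `gamma` equals 1, and erodes toward an even split when `gamma` is below 1. Which regime actual machine-learning markets occupy is exactly the open question; the sketch only shows why the answer matters so much.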

This simple logic doesn’t yet take account of the additional complementary growth effects that would further enable and likely accelerate the loop. Probably the most important is human capital. If the most sophisticated data products are being built in a few particular places, then it becomes much easier to attract the best data scientists and machine learning experts to those places, where their skills would then accelerate further ahead of would-be competitors in the rest of the world. Other complements (including basic research, venture capital, and other elements of the technology cluster ecosystem) would follow as well. The algorithm economy is almost the epitome of a “learn by doing” system, with spillovers and other cluster economy effects.

No positive feedback loop like this goes on forever. Machine learning systems may run into limits and diminishing marginal returns at some point. Bespoke hardware for machine learning systems may offer another way into positive feedback loops. Some machine learning technologies can become less dependent on data over time, as they create models of their environment which run in simulation and generate “data” endogenously. But none of these compensatory mechanisms is yet visible and viable enough to matter. Without a clear argument as to why, when and how positive feedback loops would diminish or reverse, there’s justification for concern about natural monopolies, with real consequences for national rivalry. It’s possible to imagine at the limit a vast preponderance of machine learning business being concentrated in one or a very few countries. These countries would then own the upside of data-enabled endogenous growth models. They would combine investments in human capital, innovation, and data-derived knowledge to create higher rates of economic growth, along with positive spillover effects into other sectors. In the parlance of US economist Paul Romer, these countries would be advantaged in both making and using ideas.

And they would almost certainly enjoy an even greater and more significant advantage in what Romer called “meta-ideas,” which are ideas about how to support the production and transmission of other ideas. What are the best means of managing the intellectual property around algorithms? What are the most effective labor market institutions that can support the growth of algorithm-driven labor demand? Even if they don’t have the same kind of exclusivity as raw data, these kinds of meta-ideas can keep the positive feedback loop going, and they are more likely to emerge in countries and societies that are already ahead in the data economy.

Put this together, and the stakes of national rivalry make a kind of unfortunate sense. What is now possible in machine learning, without yet appealing to science-fiction visions of artificial intelligence, hits directly at the sources of national power and social coherence. Powerful nations cannot afford the political or economic cost of being outside the positive feedback loop, and even a small gap behind a competing nation could turn into a technological chasm. And this is without yet addressing the military applications of machine learning advantage, which are considerable and could be decisive. A leader in autonomous vehicles, facial recognition, and predictive analytics for consumer behavior is also going to be a leader in autonomous weaponry applications and advanced battlefield artificial intelligence systems.

Big Rivalry

Technology rivalry is different from normal market competition. Rivalry invokes the power and interests of governments not simply as umpires and regulators but as stewards, principal users, direct funders, and sometimes full owners of technology. Machine learning is moving the digital environment overall closer to rivalry, with governments back at the center of the game. A notable indicator of this shift is simply the degree to which discourse around digital technology in national capitals and also in general-interest news and media has in the last few years become almost fully nationalized. Just two or three years ago, the “free and open Internet” narrative that placed governments squarely in the background was still robust (even if it was always somewhat naive). That ideology is mostly gone now, and the new narrative centers on digital technology firmly yoked to the goals of national power as seen through the eyes of governments.

That may be more historically familiar, but it is also a significant discontinuity for the Internet and the digital economy. The transformation itself sets up some thorny challenges that governments and businesses will have to navigate in the very near future. An example is “norm talk”: the notion that shared expectations and rules of the road for companies and governments in the digital environment can be identified and congealed through dialogue and negotiation. That seems unlikely at this very early stage of a new and robust rivalry, when the terms of advantage and disadvantage are still so nascent. Norm talk in more established domains has failed over less. Even though cyber conflicts have been a fixture of international relations for over two decades, the United Nations can barely get the major cyber powers to convene, much less agree on definitions (after nine years, its group of information security experts has agreed on little more than that norms should indeed be established). Norms and rules of the road are more likely to emerge right now (if at all) through highly visible action (and restraint of action) by the most prominent and powerful of players: the US, China and the data-platform companies in each. But that suggests normative power is also becoming more concentrated in the two leaders, for whom norms serve as another way of reinforcing the positive feedback loops that keep them racing ahead of others.

The emerging rivalry landscape won’t support the continuation of light-touch regulation and permission-less innovation that governments and business had carved out together as a foundation for the digital economy over the last 20 years. The freedom to develop and deploy new technologies, unless and until it is shown definitively that those technologies are dangerous, was a great formula for private sector innovation, but it is not a great formula for state-based rivalries and it has not shown itself to be a route to improved digital security. We should now expect more rapidly diverging experiments in new regulatory regimes around the world, which means additional space for countries to express their particular values in the digital economy and society, but less common market infrastructure for businesses at global scale.

This also suggests that digital geopolitics should not be seen as a layer superimposed on conventional geopolitics, but as a major geopolitical force itself that will create its own new alignments among new actors, and not only states. Concretely, if you now hold the belief (as many do) that “no one really goes to war over a cyberattack and if they do, it would be about the kinetic consequences of the attack, not about the cyber part of it,” it’s time to revisit that belief. Data and IP theft are now a foreign-policy problem for states, not just a business-model problem for companies to manage. The attribution of cyberattacks will focus as much on the US National Security Agency as on parastatal and criminal groups. States such as Denmark have already appointed a formal ambassador to the technology sector. Definitions of what constitutes criminal activity and who or what is a criminal are up for grabs. The boundaries are blurring along almost all the key dimensions that defined the core geopolitical alignments of the post-Cold War era, and it may just be the rise of machine learning rivalry that puts the final nail in the coffin of that old order.

A third significant manifestation of rivalry is the increasingly zero-sum nature of job displacement and inequality. Machine learning will make old jobs obsolete more quickly than it creates new ones, and the transition period to some new equilibrium will bring fundamental breakdowns and failures in labor markets, and consequently in politics. Many Asian countries seem to have a higher level of confidence that their societies can endure these changes, built on their experience of resilience and cohesion in the face of possibly comparable challenges (revolutionized labor markets) just a generation or so back in time with industrialization. That confidence could very well be tested by populist movements not unlike those in the US and Europe. More troubling still is the recognition that the success stories of the developing-country model (low wage manufacturing evolves toward higher value-added jobs along with capital accumulation) may now be dominated by data flows and machine learning products that make that ladder obsolete. Transnational movements of distressed labor could be a new factor — whether or not they cross borders, their ability to organize across borders would be an important part of the new security landscape.

Probably the most consequential decisions that immediately face states revolve around the status of the platform companies, whose relationships with governments, consumers, and societies need special assessment and possibly oversight and regulation. Geopolitical rivalry is coming to shape these debates as much as or more than privacy, consumer welfare or other competition policy concerns. Market power and oligopoly are now an assumption in most of the world, but the mood and views around these are different in the US, Europe, and throughout Asia.

When oligopolies serve the national interest, particularly both economic growth and national security interests as governments view them, the tolerance for anti-competitive behavior in markets and politics takes on a different significance. Nobody right now refers to the platform companies as national champions, and almost nobody would think of Google and Baidu as relating to their respective home governments the way major defense contractors do. Technology rivalries may surprise both sides by pushing them in that direction.

Rhetoric in the public sphere around technology, particularly machine learning, has been steadily calcifying over the last several years. “Permission-less innovation,” “XYZ-as-a-service,” “disruption” — these terms reflect a steadfast faith in market competition and a nimble private sector that can stay one step ahead of lumbering governments. But that rhetoric no longer reflects reality. As the consequences of falling behind in machine learning take on geopolitical dimensions, governments are no longer taking a back seat. The shift from competition to rivalry at the frontiers of technology is well under way, and it is changing not only the private sector, but the ways in which nations jostle for power.

Originally published as “Data, Rivalry and Government Power: Machine Learning Is Changing Everything” in GlobalAsia on March 28, 2019. Reprinted with permission.

Steven Weber is a professor at the School of Information and faculty director of the Center for Long Term Cybersecurity.

Gabriel Nicholas is a School of Information MIMS '18 graduate and is currently a fellow at the NYU Center for Cybersecurity.

The opinions expressed in this article are the authors' own.

Last updated: April 9, 2019