When I first came to understand Vietnam’s requirements surrounding data localization, I thought them far too rigid and expensive to be realistically enforced. They require data collectors to open a branch office or representative office and to refrain from transferring any data collected on Vietnamese citizens out of the country. I’m not the only one who saw this as an overly burdensome regulation and looked forward to guiding decrees that might relax its enforcement.
One such decree is currently in process, and the Ministry of Public Security has stated verbally that it will only enforce the provisions against e-commerce platforms that fail to comply with requests to take down content that violates the cybersecurity law. On paper, however, the authorities remain able to enforce the letter of the law to the most stringent standard. But the more I consider the regulations, the more I question my initial reaction.
Why, you may ask? And rightly so, for many people involved with high technology or internet companies suggest that the rule makes little sense and does little to protect against abuses.
As I’ve written over the past two weeks, a plethora of problems arise when sharing data internationally. From large organizations like Facebook or TikTok to governments like those of the United States or China, the potential for abuse is nearly unlimited. Given how far we are from a truly international forum in which to enforce data regulations against infringement, one might suggest that a strict statutory data localization regulation makes sense if for no other reason than to protect citizens’ rights to their own data. But another justification might also motivate regulators.
To train artificial intelligence through machine learning, computers need access to vast amounts of data. We’re talking about data on millions of users. Only with such access can the AI be properly calibrated to understand how to conduct itself and what conclusions it should draw. But what about the cultural, linguistic, and anthropological differences that arise when dealing with foreign countries?
Take, for instance, a self-driving car. In Arizona, where an entire town has been integrated with AI-steered automobiles, the traffic regulations are unique. Each state in the United States has its own peculiarities and specific regulations, so training an AI to drive in Arizona might not make the car capable of driving in Utah, or someplace with an even bigger population like New York or Los Angeles.
Now extend that obstacle across international boundaries and you find that the AI necessary to drive a car in a foreign country is vastly different from the AI in Arizona. And if the rules and the language spoken are different, can we really trust Elon Musk to guide our cars in such a way as to deliver us from point A to point B without accident or incident?
What is required instead is natively developed machine learning that is customized to each country and taught how to interact in that country’s environment. And that’s just one example of how data sets between countries might differ. What about data used to predict political associations or attitudes toward different brands of toothpaste? American-trained AI would come to Vietnam and stall simply by virtue of the large-scale anomalies that present themselves.
By keeping the country’s data within its borders, the government can maintain some level of control over what machine learning takes place and how AI might achieve its stated results. It also gives local mathematicians and programmers jobs that they would not be able to perform in America because of language barriers or cultural conflicts. And it allows the AI to properly learn about Vietnam and not some primrose-strewn community in Arizona.
Maybe data localization isn’t quite as bad as I first thought, and maybe the National Assembly knows more about technology than I sometimes give them credit for. Good on you, as the Aussies say.