A massive Chinese database storing millions of faces and vehicle license plates was left exposed on the internet for months before it quietly disappeared in August.
While its contents might seem unremarkable for China, where facial recognition is routine and state surveillance is ubiquitous, the sheer size of the exposed database is staggering. At its peak the database held over 800 million records, representing one of the biggest known data security lapses of the year by scale, second to a massive data leak of 1 billion records from a Shanghai police database in June. In both cases, the data was likely exposed inadvertently and as a result of human error.
The exposed data belongs to a tech company called Xinai Electronics based in Hangzhou on China's east coast. The company builds systems for controlling access for people and vehicles to workplaces, schools, construction sites and parking garages across China. Its website touts its use of facial recognition for a range of purposes beyond building access, including personnel management tasks such as payroll and monitoring employee attendance and performance, while its cloud-based vehicle license plate recognition system allows drivers to pay for parking in unattended garages that are managed by staff remotely.
It's through a vast network of cameras that Xinai has amassed millions of face prints and license plates, data that its website claims is "securely stored" on its servers.
But it wasn't.
Security researcher Anurag Sen found the company's exposed database on an Alibaba-hosted server in China and asked for TechCrunch's help in reporting the security lapse to Xinai.
Sen said the database contained an alarming amount of information that was rapidly growing by the day and included hundreds of millions of records and full web addresses of image files hosted on several domains owned by Xinai. But neither the database nor the hosted image files were protected by passwords, and both could be accessed from the web browser by anyone who knew where to look.
The database included links to high-resolution photos of faces, including construction workers entering building sites and office visitors checking in, and other personal information, such as the person's name, age and sex, along with resident ID numbers, which are China's answer to national identity cards. The database also had records of vehicle license plates collected by Xinai cameras in parking garages, driveways and other office entry points.
Photos of vehicle license plates tracked across China. Image Credits: TechCrunch (composite)
TechCrunch sent several messages about the exposed database to email addresses known to be associated with Xinai's founder, but our emails were not returned. The database was no longer accessible by mid-August.
But Sen is not the only person to have discovered the database while it was exposed. In an undated ransom note left behind in the database, a data extortionist claimed to have stolen its contents and said they would restore the data in exchange for a few hundred dollars' worth of cryptocurrency. It's not known if the extortionist stole or deleted any data, but the blockchain address left in the ransom note shows it hasn't yet received any funds.
China's surveillance state sprawls deep into the private sector, giving police and government authorities near-unfettered access and capabilities to track people and vehicles across the country. China uses facial recognition to track its vast population in smart cities but also uses the technology for mass surveillance of minority populations that Beijing has long been accused of oppressing.
China last year passed the Personal Information Protection Law, its first comprehensive data protection law, which is seen as China’s equivalent of Europe’s GDPR privacy rules. The law aims to limit the amount of data that companies collect but broadly exempts police and government agencies that make up China's vast surveillance state.
But now with two mass data exposures in recent months, both the Chinese government and tech companies are finding themselves ill-equipped to protect the vast amount of data that their surveillance systems collect.