On Wednesday (22 May) Transport for London (TfL) announced that from July passengers on the Underground will be tracked via their smartphones’ Wi-Fi signals.
The initiative will provide data about how passengers travel through stations and across different lines. While TfL can use ticket barriers to assess where passengers enter and exit the Underground, it has not been able to see which of the many possible routes people take in between.
The aim of the project is to give customers more precise information about the network so they can identify the easiest way to get around the capital. Crowding data will also be fed into TfL’s API, which is used by apps such as Citymapper and Google Maps to relay near real-time travel information to passengers.
Another advantage of being able to see how passengers travel through stations, TfL says, is that it will provide data about their exposure to different adverts, with the ultimate aim of increasing revenues.
The roll-out comes after a trial of the system sparked a privacy backlash in 2016, but TfL has said it worked closely with the Information Commissioner’s Office to ensure data protection issues were addressed in the full roll-out.
“While I am excited about the potential of this new dataset, I am equally mindful of the responsibility that comes with it,” said TfL’s chief digital officer Lauren Sager Weinstein. “We take our customers’ privacy extremely seriously and will not identify individuals from the Wi-Fi data collected.
“Transparency, privacy and ethics need to be at the forefront of data work in society and we recognise the trust that our customers place in us, and safeguarding our customers’ data is absolutely fundamental.”
One of the key differences between the trial and full deployment is that customers’ movements will now be tracked only if they have signed up to TfL’s free Wi-Fi service, which is provided by Virgin Media. But Eerke Boiten, a professor of cyber security at De Montfort University, said TfL should have gone a step further. “Once people sign up or connect to the Wi-Fi, they could give an option to sign out of being tracked.”
While TfL stressed that smartphones’ MAC addresses would be depersonalised, Boiten added that it may be possible to link the MAC address customers use when they sign up for Wi-Fi to the code that is generated to protect their identity.
Although the code cannot be reverse engineered, TfL uses the same salt key so that it can continue to track the same person through the network. “What they could do is try all the known MAC addresses [provided to Virgin at registration] and see if they give the same outcome,” said Boiten. Because a MAC address is unique to each smartphone, this may then enable the organisation to identify an individual.
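The attack Boiten describes can be illustrated with a short sketch. TfL has not published its hashing scheme, so the SHA-256 function, the salt value and the MAC addresses below are all assumptions made purely for illustration. The point is only that when the salt stays fixed, anyone holding it and a list of candidate MAC addresses can hash each candidate and look for a match.

```python
import hashlib

# Hypothetical secret salt; TfL's real key and hash function are not public.
SALT = b"example-secret-salt"

def pseudonymise(mac: str, salt: bytes = SALT) -> str:
    """Return a salted SHA-256 digest of a normalised MAC address."""
    normalised = mac.strip().lower().encode("ascii")
    return hashlib.sha256(salt + normalised).hexdigest()

def reidentify(target_code: str, known_macs, salt: bytes = SALT):
    """Hash every known MAC with the same salt and return the one that matches."""
    for mac in known_macs:
        if pseudonymise(mac, salt) == target_code:
            return mac
    return None  # no candidate produced the observed code

# Made-up MAC addresses standing in for a registration list.
known_macs = ["AA:BB:CC:00:00:01", "AA:BB:CC:00:00:02", "AA:BB:CC:00:00:03"]

# A code as it might appear in the depersonalised dataset.
observed_code = pseudonymise("aa:bb:cc:00:00:02")

# The hash is never reversed; the candidates are simply tried one by one.
match = reidentify(observed_code, known_macs)
```

In this sketch the lookup succeeds only because the attacker holds both the salt and the candidate list, which is why the security of the key and the relationship with the Wi-Fi provider matter.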
The longer an individual’s movements can be tracked through the network, the more identifying the data becomes. TfL should change the salt key “as often as possible beyond the level they need for identifying an underground trip,” Boiten added. “They should refresh it as often as possible, for example every night when the London Underground is closed.”
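A rough sketch of the nightly refresh Boiten suggests is below. It is not TfL’s implementation: the in-memory salt store, the function names and the dates are hypothetical, and SHA-256 is again an assumption. It shows that a fresh random salt per service day keeps codes consistent within a day, so a single trip can still be followed, while codes from different days no longer match.

```python
import hashlib
import secrets
from datetime import date

# Hypothetical in-memory store holding one random salt per service day.
_daily_salts = {}

def salt_for(day: date) -> bytes:
    """Create the day's salt on first use and reuse it for the rest of that day."""
    if day not in _daily_salts:
        _daily_salts[day] = secrets.token_bytes(32)
    return _daily_salts[day]

def pseudonymise(mac: str, day: date) -> str:
    """Return a SHA-256 digest of the MAC address salted with that day's key."""
    normalised = mac.strip().lower().encode("ascii")
    return hashlib.sha256(salt_for(day) + normalised).hexdigest()

mac = "aa:bb:cc:00:00:02"   # made-up MAC address
monday = date(2019, 7, 8)   # illustrative dates only
tuesday = date(2019, 7, 9)

# Same device, same day: the codes match, so one trip can be followed.
same_day_linked = pseudonymise(mac, monday) == pseudonymise(mac, monday)

# Same device, different days: the codes differ, so trips cannot be joined up.
cross_day_linked = pseudonymise(mac, monday) == pseudonymise(mac, tuesday)
```

For the protection to hold, each day’s salt would also need to be discarded once it is no longer needed; anyone who keeps old salts can still link the days together.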
A spokesperson for TfL told NS Tech that the organisation is “looking to understand travel patterns and changes over time to understand how regular and less common customers use our stations. For example, customers who are less familiar with our station layouts may use the station differently, and customers may change their travel patterns within stations over time as conditions change. So in order to understand patterns over time, we are preserving our hashing keys, and these keys will be kept extremely secure.”
In light of the revelation, Boiten added that “the picture rather changes with TfL now confirming that they are keeping the hashing keys in order to link different trips by the same person. Given the retention period, they will be able to construct two years’ worth of people’s tube travel history. Those histories, unlike the individual trips, will be highly unique for every individual.”
One of the risks, says Boiten, is that the data could be cross-referenced against other publicly available information. “The lawyer in a small office who is a Tottenham season ticket holder, maybe. Or the MP who travels by underground from their constituency to Westminster tube station, given that we can see which votes the MP turned up for and when they were. For any such case, we would also find out all other underground trips they have made. So this significantly increases the risks to location privacy.”
While the scenarios Boiten presents may be hypothetical, he says they have an impact on how the project complies with data protection legislation. “Would TfL in principle be able to reconstruct the MAC address through an attack? The answer is possibly or possibly in collaboration with Virgin who provide the WiFi. Why is that important? Because some interpretations of GDPR say that if it’s pseudonymised data, it’s still personal data and subject to all of the protections.”
The TfL spokesperson added: “Our approach relies on the ICO’s published guidance and has been reviewed by our internal Cyber Security Team. Depersonalised Wi-Fi connection data will be held for two years and when this retention period is over, only aggregated data will be kept.
“We take the privacy of our customers very seriously. A range of policies, processes and technical measures are in place to control and safeguard access to, and use of, Wi-Fi connection data. Anyone with access to this data must complete TfL’s privacy and data protection training every year.”
This article has been updated to incorporate TfL’s additional remarks and the response to them.