The retail event sounds straightforward enough: Amazon is to ban people from reviewing more than five items a week that they didn’t buy through the site. This seems reasonable, but its implications are serious for anyone involved in the Big Data business, because if you work in that area, some of your data may well come from reviews. And Amazon is tacitly admitting that not all of the reviews it publishes are genuine.
This will not be a shock to anyone who has been following the online retail business for any length of time, or any other business that runs on reviews. I’m typing this article in a hotel in Portugal which has either the friendliest or the coldest of staff, depending on who’s reviewing (for the record, they seem fine to me), and in which the rooms are allegedly pokey (if about 25ft x 40ft with free Internet is pokey, then fine, but the space looks very generous from here).
Restaurant and hotel managers have nightmares about fake reviews on TripAdvisor. They probably don’t allow those reviews to permeate their systems, though.
Big data with big fibs
This is the problem Amazon might well have been facing, as might some of the businesses that use it as a sales channel. There is simply no way to tell whether the review in front of you is genuine unless you happen to know the person who wrote it.
Let’s put it this way: many years ago, chef Gordon Ramsay was running one of his TV programmes in which he rescued ailing restaurants. One of the proprietors held up an online review that said the place was better than a Gordon Ramsay or Jamie Oliver establishment. Ramsay accused the owner of writing it himself, and the owner broke down and admitted to having done so.
So, what if it had been a larger organisation, and the review had come in electronically? Is it possible, if unlikely, that the data would have been fed back into the business and, given enough false reviews, treated as solid feedback? The answer can only be “maybe”.
Nonetheless, if people using Big Data are not checking the sources of their data and applying some sort of reality check, the scope for damage as people put nonsense in and get nonsense out is considerable.
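For what it is worth, a reality check of this kind need not be elaborate. The following is a minimal sketch in Python, not a description of how Amazon or anyone else actually does it: the Review record, the verified_purchase flag and the max_gap threshold are all hypothetical names invented for illustration. It simply compares the average rating across all reviews with the average across reviews tied to a confirmed purchase, and flags the data set when the two drift too far apart.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Review:
    reviewer: str
    rating: int              # 1 to 5 stars
    verified_purchase: bool  # hypothetical flag: reviewer is known to have bought the item


def summarise(reviews):
    """Return (mean of all ratings, mean of verified-only ratings)."""
    all_ratings = [r.rating for r in reviews]
    verified = [r.rating for r in reviews if r.verified_purchase]
    return (
        mean(all_ratings) if all_ratings else None,
        mean(verified) if verified else None,
    )


def looks_suspicious(reviews, max_gap=1.0):
    """Flag a review set whose overall average strays from the verified average.

    max_gap is an arbitrary illustrative threshold, not an industry figure.
    """
    overall, verified = summarise(reviews)
    if overall is None or verified is None:
        return True  # nothing verifiable to compare against
    return abs(overall - verified) > max_gap


# Made-up sample data: three glowing unverified reviews, two lukewarm verified ones.
reviews = [
    Review("a", 5, False),
    Review("b", 5, False),
    Review("c", 5, False),
    Review("d", 2, True),
    Review("e", 3, True),
]

if __name__ == "__main__":
    print(summarise(reviews))
    print(looks_suspicious(reviews))
```

A check like this would not catch a determined faker, and a gap between the two averages is not proof of anything on its own. The point is only that some comparison against data you have reason to trust should happen before the reviews reach the business.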