Our data team spent a few days last week at Big Data LDN, the annual data expo at the Olympia exhibition centre in London. This year it really delivered.
I for one left feeling inspired and amped up to take what we’d learned and help our clients, and us, use data in a more forward-thinking way.
Here are some of the top trends and take-outs worth noting:
AI and, specifically, Machine Learning was a key feature this year. There were some very cool examples of how manual tasks that used to take teams of people months to deliver can now be done in a matter of hours by a machine.
One of the most interesting was how London Zoo had teamed up with Google to use their AutoML technology in a bespoke way. It allowed them to identify species captured on conservation cameras quickly and accurately – a task that previously took analysts up to 9 months to carry out. This is just one example of how speed and automation are key for business success. And here at Dare, we’re always trying to improve on speed, even at a basic reporting level.
Both text and image ML are taking huge leaps, and are probably two of the more accessible forms of ML for brands to get on board with. However, it’s important to remember that although ML can save thousands of working hours in the long run, the results are only as good as the data that goes in, and cleansing that data can be a time-consuming process in itself. In fact, there’s a long-running joke (that’s heavily rooted in reality and not that funny) that 90% of a data scientist’s time is spent cleaning data sets, while 90% of the other 10% is spent checking that cleaning.
Please bear that in mind next time you ask your data analyst to “just pull some data on xyz before lunch” for you, as we might need until the end of the day at least…
The other big issue with ML is the shortage of experienced, strong talent. The consensus was that it’s not just down to data teams: developers also need to upskill and work in collaboration with them (along with IT) to build and run the best solutions. This means investment is needed not only in technology and data team infrastructure, but more importantly in the infrastructure of the whole organisation to allow for data-led action. And this leads me nicely to point 2.
One of the more inspiring talks we attended was by Bas Geerdink from ING Bank. He demonstrated the power of forecasting and using predictive analysis to not only benefit the customer but also the business. To do this they had to fundamentally change the structure of their business. They invested in people and built teams that worked across multiple disciplines. But the work speaks for itself, and proves them to be a data-driven business.
By looking at trends in customer incomings and outgoings, ING started to give simple nudges to their customers, such as letting them know that they have a bill due in the next week but they may not have enough funds in their account by then to pay it. Their aim is to boost customer credit scores. Indeed by helping customers to manage their money better, ING can improve these credit scores, and in turn get better quality loan applications (as well as more of them). Happy customer, happy bank.
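To make the nudge idea concrete, here is a minimal sketch of how a “you may not have enough funds for that bill” check could work. It is not ING’s method: the `Scheduled` type, the `bills_at_risk` function, the seven-day horizon and the sample figures are all hypothetical, and it assumes the expected incomings and outgoings have already been forecast from past transactions.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Scheduled:
    """A forecast transaction (hypothetical type for this sketch)."""
    due: date
    amount: float  # positive = money in, negative = money out
    label: str


def bills_at_risk(balance, scheduled, today, horizon_days=7):
    """Return labels of outgoings in the horizon that would leave the
    projected balance below zero once they are paid."""
    cutoff = today + timedelta(days=horizon_days)
    projected = balance
    at_risk = []
    # Walk through upcoming transactions in date order, keeping a running balance.
    for item in sorted(scheduled, key=lambda s: s.due):
        if not (today <= item.due <= cutoff):
            continue
        projected += item.amount
        if item.amount < 0 and projected < 0:
            at_risk.append(item.label)
    return at_risk


# Illustrative data only.
today = date(2018, 11, 19)
upcoming = [
    Scheduled(today + timedelta(days=3), -300.0, "rent"),
    Scheduled(today + timedelta(days=5), 1000.0, "salary"),
    Scheduled(today + timedelta(days=6), -50.0, "phone"),
    Scheduled(today + timedelta(days=10), -20.0, "gym"),
]
nudges = bills_at_risk(100.0, upcoming, today)
```

In this example the rent falls due before the salary lands, so it is the only bill flagged. A real system would also need to handle uncertainty in the forecast, which this sketch ignores.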
It’s also clear we need to stop looking only at past monthly, weekly and daily data for our reporting (although this still needs to happen, and in as close to real time as possible), and start using this historical data to look for trends and foresee possible issues (or opportunities) on the horizon.
However, it’s not just people and infrastructure that need investment to be able to do this; it’s technology too.
The issue many businesses have is that their data is often siloed, dirty and unformatted, and the term “legacy systems” comes up a lot. Because of this, the thought of trying to pull everything together into one harmonious set of data is daunting. However, if you don’t do this, then most of the above-mentioned trends, and even a lot of quite simple data analysis, become impossible.
The longer you wait to do it, the harder it becomes to transform into a truly data-driven business.
And it’s fine if you’re not a data-driven business. There’s still a lot of fun to be had with a more light-touch approach, and a lot of insight to be drawn and action to be taken (that’s our bread and butter after all).
But for those willing to take the plunge, you’re probably looking for a data lake with a data virtualisation layer. This means you can archive/aggregate data from multiple sources to create a logical, single point of access for information. Data virtualisation allows an application (dashboards, portals, apps, etc.) to retrieve and manipulate data without requiring technical details about the data itself. There are some great solutions out there like Denodo and Snowflake.
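As a rough illustration of the “single point of access” idea, the sketch below puts one small query interface in front of two very different sources: a CSV export and a SQLite table. Everything here is an assumption for illustration (the `VirtualLayer` class, the helper functions and the sample data), and it bears no relation to how products like Denodo or Snowflake are actually built.

```python
import csv
import io
import sqlite3


class VirtualLayer:
    """Hypothetical single access point over several data sources."""

    def __init__(self):
        self._sources = {}

    def register(self, name, fetch):
        # fetch is any zero-argument callable returning a list of dict rows.
        self._sources[name] = fetch

    def query(self, name, **filters):
        # The caller only knows the logical name, not where the data lives.
        rows = self._sources[name]()
        return [r for r in rows if all(r.get(k) == v for k, v in filters.items())]


def csv_source(text):
    """Wrap CSV text (e.g. a legacy export) as a row source."""
    def fetch():
        return list(csv.DictReader(io.StringIO(text)))
    return fetch


def sqlite_source(conn, table):
    """Wrap a SQLite table as a row source (table name is trusted here)."""
    def fetch():
        cur = conn.execute(f"SELECT * FROM {table}")
        cols = [c[0] for c in cur.description]
        return [dict(zip(cols, row)) for row in cur]
    return fetch


# Illustrative data only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer_id TEXT, total REAL)")
conn.execute("INSERT INTO orders VALUES ('1', 20.0), ('2', 35.5)")

layer = VirtualLayer()
layer.register("customers", csv_source("id,region\n1,UK\n2,NL\n"))
layer.register("orders", sqlite_source(conn, "orders"))

uk_customers = layer.query("customers", region="UK")
orders_for_1 = layer.query("orders", customer_id="1")
```

A dashboard calling `layer.query(...)` never needs to know which rows came from a file and which from a database, which is the point of the virtualisation layer.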
As Uncle Ben once said, “With great power comes great responsibility,” and never has this been more true than in the world of big data. With GDPR now in effect, the Facebook and Cambridge Analytica scandal still rumbling, and people simply becoming more knowledgeable about how their data is being collected and used, the ethics of data is a growing trend, and one nearly every speaker mentioned, even if just in passing.
Alan Mak, MP, believes data governance needs to be the responsibility of everyone.
This area is immense and not something I can really sum up here, although I encourage everyone to read the Data and Democracy in the Digital Age paper produced for the Constitution Society, and presented at the Houses of Parliament earlier this year. It gives a great many examples of the issues around regulating political advertising online, and how data was used/misused in both the last UK election and Brexit referendum.
Often the law can’t keep up with the speed at which new technologies are evolving (although GDPR is an interesting step forward), and this is often a cause for concern. It’s our job as an agency to help support our clients, and not only to use their data within existing guidelines, but to be forward-thinking as well as empathetic to customer needs.
Most inspiring of the two-day event was Hannah Fry, Associate Professor in the Mathematics of Cities at the Centre for Advanced Spatial Analysis at UCL. She came armed with a variety of charts and data visualisations showing everything from peaks in male babies born after world wars (earlier egg fertilisation means more male babies, a result of more sex), weird patterns in public bike hire schemes (people don’t climb hills to return bikes), and how to track down a serial killer by plotting where the bodies are found (killers have a travel distance sweet spot).
Everything she showed just highlighted that clean, simple visualisation of the most diverse data can speak to even the most data-illiterate amongst us. And with a multitude of data viz companies selling their wares at the event, such as Tableau, Qlik and Power BI, making data accessible to everyone has never been so easy.
More importantly Fry highlighted that data really is fun. Really! It’s everywhere, and can be used in innovative and striking ways. Methodologies used elsewhere, on completely different data sets, may work on your client data. It’s just about thinking differently and thinking bigger!
We’ll be heading back next year to see where this crazy industry takes us next, but for now there’s a lot to be cracking on with.