Questions to a Data Scientist
Q&A with Dan Kellet, Director of Data Science @CapitalOneUK
First up, could you explain the difference between structured and unstructured data?
A lot of the data we’re used to is structured. If you’ve ever used a spreadsheet with rows and columns, then you’ve seen structured data. Structured data is generally where you’re working with information that’s regular, predictable and in a standard format.
Unstructured data is everything else.
A simple example is free text. Our call centre agents make notes when they’re helping customers over the phone. We analyse these notes and use the insights to spot trends that help us to better support customers.
As you can see, the complexity comes in when you consider that each agent has their own style of writing and uses different words or phrases to describe the same thing. The unpredictable nature of this makes it much trickier to process.
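To make the contrast concrete, here is a minimal Python sketch. The customer records, the agent notes and the phrase patterns are all invented for illustration and are not Capital One's data or method. It shows that counting a structured field is a one-liner, while free text first needs rules that map different wordings onto the same topic.

```python
import re
from collections import Counter

# Structured data: every record has the same, predictable fields.
# (Hypothetical records, for illustration only.)
structured_rows = [
    {"customer_id": 101, "contact_reason": "lost_card"},
    {"customer_id": 102, "contact_reason": "payment_query"},
]

# Unstructured data: hypothetical free-text notes, each in the agent's own words.
agent_notes = [
    "Cust rang, card lost on holiday. Ordered replacement.",
    "Customer has misplaced her card, new one sent.",
    "Caller asking why payment hasn't shown up yet.",
]

# Hypothetical patterns that map different wordings onto one topic.
TOPIC_PATTERNS = {
    "lost_card": re.compile(
        r"\bcard\b.*\b(lost|misplaced)\b|\b(lost|misplaced)\b.*\bcard\b",
        re.IGNORECASE,
    ),
    "payment_query": re.compile(r"\bpayment\b", re.IGNORECASE),
}


def tag_note(note):
    """Return every topic whose pattern appears in the note."""
    return [topic for topic, pattern in TOPIC_PATTERNS.items() if pattern.search(note)]


# Structured: the field can be counted directly.
structured_counts = Counter(row["contact_reason"] for row in structured_rows)

# Unstructured: each note has to be interpreted before it can be counted.
note_counts = Counter(topic for note in agent_notes for topic in tag_note(note))

print(structured_counts)
print(note_counts)
```

Hand-written patterns like these break as soon as an agent uses a phrase nobody anticipated, which is exactly the unpredictability described above.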
Rich data like images and video take it to another level. Most of us would rather watch a video than read pages of text, right? But from an analyst’s point of view, extracting useful data from video can be a real challenge.
There’s a lot of analytic software in the market right now, each with its own pros and cons. Do you think there’ll ever be one super programme that does everything?
Ha! In short, no.
I don’t think so. At least not in the near future. There’s just too much complexity, and the needs of different businesses are too diverse for a single bit of software to be able to do it all.
We’re still debating whether open source or proprietary software is best. At Capital One we’ve decided to go down the open source road. But I do think there is a strong case for both. And as technology changes, so will we.
Ultimately, it all serves the same purpose – turning data into useful insight. Any analysis, however it’s done, should always stem from wanting to solve a problem, or to improve a product or service. You’ve got to understand the problem you want to solve in the first place.
Projects like Capital One’s Growth Labs are a great way to bring big businesses together with startups who are considered to be shaping the future of tech. If your team was a startup, what problem would you try to tackle?
One of the biggest challenges right now is efficiency. So if we were a startup, I’d be really interested to explore new ways for businesses to understand and translate data in real time.
Up until recently it was OK to work in batches. You know, you might complete a batch of work in a month, or a week – depending on the size of the project.
Now we want answers in a day. An hour. Or in real time. And that’s where the future is. It’s a big challenge for data scientists. That kind of work takes serious tech, computing power and resource.
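As a rough illustration of the shift from batch to real time, the Python sketch below computes the same average two ways. The readings and the `RunningMean` class are hypothetical, chosen only to show the idea. The batch version waits for all the data. The incremental version updates its answer as each value arrives and never stores the history.

```python
class RunningMean:
    """Keeps an up-to-date mean without storing past values."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def update(self, value):
        # Incremental update: move the mean towards the new value
        # by 1/count of the difference.
        self.count += 1
        self.mean += (value - self.mean) / self.count
        return self.mean


def batch_mean(values):
    """Batch approach: needs the complete list before it can answer."""
    return sum(values) / len(values)


# Hypothetical readings arriving one at a time.
readings = [10.0, 20.0, 30.0, 40.0]

stream = RunningMean()
running = [stream.update(r) for r in readings]

print(running)               # an answer is available after every reading
print(batch_mean(readings))  # one answer, only once all the data is in
```

Both approaches agree on the final figure. The difference is when an answer becomes available, and real systems need far more than this to get there, such as infrastructure that can ingest and process events continuously.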
Just look at the growth of the Internet of Things. It’s estimated there’ll be around 26 billion devices operational by 2020, each collecting huge amounts of real-time data, some of which will never have been observed before. It’s going to change the way digital businesses operate in big ways.
At Capital One we’re super excited about the potential of new technologies. It’s something we’re heavily invested in. Both because it’s exciting, and because we want our customers to be the first to benefit.
There’s a lot of competition right now for tech talent. Do you think there’s enough talent to go around?
Totally. I think the data science world is flourishing. Schools and universities have recognised the important role of STEM subjects, and we’re starting to see the benefits.
There’s no doubt there’s a lot of competition out there. We’ve done really well to attract some incredibly skilled people, and will continue to do so. A big part of this is the opportunities people get here. We offer people the chance to work on big scale projects, where their work could make real differences to the lives of millions of people.
Looking back on your 15-year career in data science, what’s the one thing you know now that you wish you knew sooner?
That’s a good question.
If I had to pick one thing, I’d say: collaboration. It’s a big part of our success at Capital One. We don’t have any one-man research projects. Sure, we have talented people who specialise in particular fields, but we’re not about silo working.
For the magic to happen, you need to bring people together.