The use of deep learning in retail and advertising is rapidly expanding and becoming an integral part of consumerism.

We’re almost at the end of day 1 of our Deep Learning in Retail and Advertising Summit in London, and we’ve brought together data scientists, engineers, CTOs, CEOs and leading retailers to explore the impact of deep learning and AI on the industry.

Ben Chamberlain, Senior Big Data Engineer at ASOS, kicked off this morning’s discussion by exploring the impact that deep learning has on predicting customer lifetime value (CLTV) in e-commerce.

Deep learning works really well for deterministic tasks, and as CLTV is absolutely not a deterministic task, it’s extremely difficult. I don’t even know my own value to ASOS; it’s a very different kind of problem.

In an ideal world, ASOS would ‘know every action a customer will make for the rest of time, but [they] can’t do that’. Identifying and distinguishing between high and low value customers allows companies to optimise market spend and minimise exposure to unprofitable customers. He explained how ‘a large percentage of customers churn and have a 0% CLTV, whilst some will spend millions each year on ASOS.’ This makes machine learning incredibly complex to implement, and when it was first implemented, it was ‘wrong for the vast majority of customers’. To overcome this, Chamberlain explained how they tested two models: the widened deep model, and the random forest model with neural embeddings, which combines automatic feature learning through deep neural models with hand-crafted features to produce CLTV estimations that outperform either paradigm used in isolation. The second model showed merit, and this implementation of deep neural networks was tested and adopted by ASOS.

Hear more from ASOS:
@b_p_chamberlain: Our new paper on neural embeddings in hyperbolic space. Talking about this at #reworkretail in London

In a CLTV paper published by ASOS, they expand on the implementation of this model, following on from the success ‘of DNNs in vision, speech recognition, and recommendation systems’, which was influenced by Yann LeCun, Yoshua Bengio, and Geoffrey Hinton's 2015 paper, Deep Learning. Hear from this trio of innovators ahead of their appearance at our Deep Learning Summit Montreal, where they have just been announced as the Panel of Pioneers: check out our blog post here.

Forecasting & Recommendations

The accuracy of probabilistic forecasts is integral to Amazon’s business optimisation process, and machine learning scientist Jan Gasthaus spoke about the methods they are currently using.

Why bother forecasting?

If I knew the future, I could make optimal decisions. However, I don’t know the future. But what’s the next best thing? Creating accurate predictions taking into account past data to capture the uncertainty and predict more accurately. This allows me to give estimates and quantify them.

People are currently using what Gasthaus called ‘the onion approach’, where ‘you peel away bits you understand and you’re left with the part of the problem that the probabilistic time series can digest.’ Whilst this method has several pros, such as being the de facto standard and decomposing the problem into parts, there are also several obstacles, such as the amount of manual work required and its inability to learn patterns across time series. Gasthaus explained that it is also problematic if they ‘care about the forecast in a three week window, for example, rather than a specific day.’

To overcome this, Amazon are building black box deep learning functions from simpler building blocks and learning them end to end, training neural networks to model the distribution directly. The novel method they are proposing to produce accurate forecasts is DeepAR, which is ‘based on training an auto-regressive recurrent network model on a large number of related time series.’ Here, the input is the time series of past values, and the output is the estimated joint distribution. ‘Deep learning methodology applied to forecasting yields flexible, accurate, and scalable forecasting systems where models can learn complex temporal patterns across time.’
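To make the idea of an auto-regressive model that outputs a distribution more concrete, here is a minimal sketch in Python. It is not Amazon's implementation: the function `next_step_distribution` is a hypothetical hand-written stand-in for the trained recurrent network, and the Gaussian output, the seven-day lookback and the sample demand history are all assumptions made for illustration. What it does show is the auto-regressive loop, where each sampled value is fed back in as input, and how sampled paths give quantiles for a whole three week window rather than a single day.

```python
import random
import statistics


def next_step_distribution(history):
    """Hypothetical stand-in for the trained recurrent network: maps past
    values to the (mean, std) of a Gaussian over the next value."""
    recent = history[-7:]  # assumed lookback of one week
    mean = sum(recent) / len(recent)
    std = max(1.0, statistics.pstdev(recent))
    return mean, std


def sample_paths(history, horizon, n_paths, seed=0):
    """Draw sample paths, feeding each sampled value back in as input."""
    rng = random.Random(seed)
    paths = []
    for _ in range(n_paths):
        window = list(history)
        path = []
        for _ in range(horizon):
            mean, std = next_step_distribution(window)
            value = max(0.0, rng.gauss(mean, std))  # demand cannot be negative
            path.append(value)
            window.append(value)  # the auto-regressive step
        paths.append(path)
    return paths


def window_quantiles(paths, probs=(0.1, 0.5, 0.9)):
    """Empirical quantiles of total demand over the whole forecast window."""
    totals = sorted(sum(path) for path in paths)
    last = len(totals) - 1
    return {q: totals[min(last, int(q * len(totals)))] for q in probs}


# Made-up daily demand for one product over the past two weeks
history = [12, 15, 11, 14, 18, 20, 9, 13, 16, 12, 15, 19, 22, 10]
paths = sample_paths(history, horizon=21, n_paths=500)
quantiles = window_quantiles(paths)
print(quantiles)
```

The gap between the 10% and 90% quantiles is one way to quantify the uncertainty of the estimate, which is the point Gasthaus made about not knowing the future.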

Dr Janet Bastiman @Yssybyl: Predicting the future @amazon by Jan Gasthaus with #DeepLearning - comparison to past approaches #ReworkRETAIL

We next heard from Rami Al-Salman, who explained how Trivago are ‘using a culmination of artificial neural networks, word embedding, deep learning and image search to optimise the results that users receive for each distinct search.’ Trivago serves millions of queries every day, and one of the biggest challenges is ‘predicting the intention of users’ queries’ and providing appropriately corresponding recommendations, for example ‘when a user types czech + currency we will want to recommend “koruna” as an additional search keyword’. Word embedding is ‘one of the most exciting topics nowadays, as it learns a low-dimensional vector representation of words from huge amounts of unconstructed data as well as capturing the semantics of data.’ Deep learning methods are progressing so rapidly in natural language processing and computer vision that Trivago have applied these advancements to their model to provide an improved user experience. Trivago took ‘6 million hotel reviews and put them through vectors, and it’s possible to use word2vec because it’s scalable. Where a normal categorisation would take days to produce, word2vec produces results in less than 30 minutes.’ Al-Salman explained that word2vec allows them to learn the representation of words and give accurate suggestions. He also revealed that in the near future, deep learning will be applied to classify hotels to provide better search facilities.
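As a rough illustration of how word vectors can turn ‘czech + currency’ into a suggestion like ‘koruna’, the Python sketch below averages the vectors of the query words and ranks the remaining vocabulary by cosine similarity. The three-dimensional vectors in `EMBEDDINGS` are invented for the example (real word2vec vectors are learned from text and have far more dimensions), and averaging followed by nearest-neighbour lookup is an assumed approach, not a description of Trivago's system.

```python
import math

# Hypothetical toy embeddings, hand-picked for illustration only
EMBEDDINGS = {
    "czech": [0.9, 0.1, 0.3],
    "currency": [0.2, 0.9, 0.1],
    "koruna": [0.7, 0.7, 0.2],
    "beach": [0.1, 0.1, 0.9],
    "euro": [0.1, 0.8, 0.0],
}


def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def suggest(query_words, k=1):
    """Suggest the k words closest to the average of the query vectors."""
    dims = len(next(iter(EMBEDDINGS.values())))
    query = [
        sum(EMBEDDINGS[word][i] for word in query_words) / len(query_words)
        for i in range(dims)
    ]
    candidates = [word for word in EMBEDDINGS if word not in query_words]
    ranked = sorted(
        candidates,
        key=lambda word: cosine(query, EMBEDDINGS[word]),
        reverse=True,
    )
    return ranked[:k]


print(suggest(["czech", "currency"]))  # ['koruna'] with these toy vectors
```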

Hear more from Trivago:
DeepTags: Integration of Various VGI Resources Towards Enhanced Data Quality

Warehouse & Stock Optimisation

Calvin Seward from Zalando spoke today about two issues they are working to overcome in warehouse and stock optimisation: the picker routing problem, and the order batching problem. In a physical warehouse there are rows and aisles of stock and ‘you’ve got a bunch of locations in the warehouse that pickers have to visit - it’s super inefficient.’

Once you come up with an optimal route, you can drive efficiency and save money. That’s the goal of our project.

To overcome this problem, research scientist Seward explained how they developed the OCaPi (Optimal Cart Pick) Algorithm to calculate the optimal route to walk. This algorithm, however, still has a runtime of around 1 second.

The second problem, order batching, arises when ‘customers have ordered a bunch of things. We want to split these into different pick tours, but we can’t assign one order to multiple pick lists because there’s no way to bring the order together.’

By implementing a group force optimisation strategy, Zalando saw an 8.4% increase in efficiency. Additionally, Seward explained that by using neural networks, they can estimate the pick route length, and by combining this with OCaPi they have created ‘a black box strategy where we get the neural network to learn from the examples’, which is a whole lot faster. With OCaPi running alone, the runtime was never better than 0.3 seconds, but on the same CPU with the addition of the neural network approach it gets down to milliseconds.
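The reason a fast route length estimate matters is that a batching search has to score a huge number of candidate pick lists. The Python sketch below illustrates that loop under loud assumptions: `estimate_route_length` is a hypothetical hand-written formula standing in for the trained neural network, the warehouse layout and orders are invented, and the random-swap search is a generic local search, not the strategy Zalando described. Orders are only ever moved whole, which reflects the constraint that one order cannot be split across pick lists.

```python
import random

AISLE_LENGTH = 10  # hypothetical length of one aisle, in metres


def estimate_route_length(locations):
    """Hypothetical cheap stand-in for the learned estimator: walk each
    visited aisle once, plus the cross-aisle distance there and back."""
    if not locations:
        return 0
    aisles = {aisle for aisle, _shelf in locations}
    return len(aisles) * AISLE_LENGTH + 2 * max(aisles)


def total_length(batches, orders):
    """Sum of estimated route lengths over all pick lists."""
    return sum(
        estimate_route_length([loc for order in batch for loc in orders[order]])
        for batch in batches
    )


def improve_batches(batches, orders, iterations=2000, seed=0):
    """Swap whole orders between pick lists, keeping only improving swaps."""
    rng = random.Random(seed)
    batches = [list(batch) for batch in batches]
    best = total_length(batches, orders)
    for _ in range(iterations):
        a, b = rng.sample(range(len(batches)), 2)
        i = rng.randrange(len(batches[a]))
        j = rng.randrange(len(batches[b]))
        batches[a][i], batches[b][j] = batches[b][j], batches[a][i]
        candidate = total_length(batches, orders)
        if candidate < best:
            best = candidate
        else:
            # Undo the swap: it did not shorten the estimated total
            batches[a][i], batches[b][j] = batches[b][j], batches[a][i]
    return batches, best


# Invented orders: each maps to the (aisle, shelf) locations of its items
orders = {
    "A": [(1, 3), (1, 7)],
    "B": [(8, 2)],
    "C": [(1, 5)],
    "D": [(8, 9), (9, 1)],
}
start = [["A", "B"], ["C", "D"]]
batches, best = improve_batches(start, orders)
print(total_length(start, orders), "->", best)
```

Each iteration calls the estimator once per pick list, so an estimator that answers in milliseconds rather than a second changes how many candidate batchings can be explored.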

Hear more from Zalando:
Zalando are using AI in several aspects of their business and have also created Fashion DNA to make the properties of their products more accessible, by collecting disjointed information in their catalog and mapping it into an abstract mathematical space - the "fashion space". We spoke to Roland Vollgraf, Research Lead at Zalando Research, who expanded on this.

Check out the interview here.

Couldn’t make it to London?

Register for on-demand post-event access to receive all the slides, presentation videos and interviews from the summit, or check out our calendar of upcoming events here.
