Ocado Uses TensorFlow and Google Cloud Platform for Novel Customer Service Approach

Ocado Technology announced a new approach to handling the pool of email requests from its base of more than 500,000 customers, through a partnership with Google and its Cloud Platform (GCP). The work focuses on automating email categorization using TensorFlow and its Python APIs, hosted on GCP.

Ocado decided that classifying its email pool was a good candidate for production-scale machine learning and automation, specifically natural language processing (NLP). In many support centers, people process the email queues manually to keep handling consistent and reliable. That workflow doesn't scale well if the business grows quickly or if overall support volume demands an ever-growing support staff, and it can lead to longer response latencies and greater customer dissatisfaction.

In the case of Ocado, all emails are sent to a single mailbox. Ocado processes the email content to determine the appropriate tags: customer complaints that require a quick response, general feedback that warrants a lower-priority tag and a longer response time, and tags for redelivery requests, refund claims, payment or website issues, and new product inquiries.

Ocado wants to minimize the number of fields and tags that customers must enter manually, as well as the categories that support staff must assign by hand. Manual entry of this kind is prone to bias and generates noisy data, and it also takes precious time away from what the support staff is there to do: following up with customers based on the priority of their request.

Marcin Druzkowski, senior software engineer at Ocado Technology, described the models Ocado trained in his talk at the Datasciencefest this past August. Ocado tested categorization with convolutional neural networks (CNN) and long short-term memory (LSTM) networks. The methods tested include logistic regression with bag-of-words, a CNN with embedding, and an LSTM with embedding.
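
To make the simplest of those methods concrete, here is a minimal sketch of multi-class logistic regression over bag-of-words counts. It is not Ocado's code: it uses only the Python standard library rather than TensorFlow, and the sample emails, the three tags, and every function name are hypothetical, chosen purely for illustration.

```python
import math
from collections import Counter

# Hypothetical toy data, invented for illustration; not Ocado's emails or tags.
TRAIN = [
    ("please refund the missing items from my order", "refund"),
    ("i was charged twice and want a refund", "refund"),
    ("my delivery never arrived please redeliver tomorrow", "redelivery"),
    ("the driver missed my slot can you redeliver the order", "redelivery"),
    ("great service the driver was friendly thank you", "feedback"),
    ("just wanted to say thank you for the great app", "feedback"),
]


def tokenize(text):
    """Lowercase and split on whitespace; a real pipeline would do more."""
    return text.lower().split()


def build_vocab(texts):
    """Map every distinct training word to a column index."""
    words = sorted({w for t in texts for w in tokenize(t)})
    return {w: i for i, w in enumerate(words)}


def vectorize(text, vocab):
    """Bag-of-words: one count per vocabulary word, word order discarded."""
    vec = [0.0] * len(vocab)
    for word, count in Counter(tokenize(text)).items():
        if word in vocab:  # words unseen in training are ignored
            vec[vocab[word]] = float(count)
    return vec


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def class_scores(x, weights, biases):
    """One linear score per class."""
    return [sum(wi * xi for wi, xi in zip(w, x)) + b
            for w, b in zip(weights, biases)]


def train(samples, epochs=200, lr=0.1):
    """Fit softmax (multinomial logistic) regression with plain SGD."""
    vocab = build_vocab([t for t, _ in samples])
    labels = sorted({label for _, label in samples})
    weights = [[0.0] * len(vocab) for _ in labels]
    biases = [0.0] * len(labels)
    data = [(vectorize(t, vocab), labels.index(label)) for t, label in samples]
    for _ in range(epochs):
        for x, y in data:
            probs = softmax(class_scores(x, weights, biases))
            for k in range(len(labels)):
                # Gradient of the cross-entropy loss w.r.t. the score of class k.
                grad = probs[k] - (1.0 if k == y else 0.0)
                biases[k] -= lr * grad
                for i, xi in enumerate(x):
                    if xi:
                        weights[k][i] -= lr * grad * xi
    return vocab, labels, weights, biases


def predict(text, model):
    """Return the most likely tag and its probability."""
    vocab, labels, weights, biases = model
    probs = softmax(class_scores(vectorize(text, vocab), weights, biases))
    best = max(range(len(labels)), key=probs.__getitem__)
    return labels[best], probs[best]
```

Calling `model = train(TRAIN)` and then `predict("can i get a refund", model)` returns a tag together with its probability. The CNN and LSTM variants mentioned in the talk would replace the count vector with learned word embeddings, which this sketch does not attempt.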

Druzkowski noted that GPUs weren't necessary: the relatively low cost of CPUs, the scalability of cloud computing, and parallelized model training made it unnecessary to write and train models specifically for GPU architectures. He also noted that their TensorFlow graphs are deployed as data matrices and graph definitions, following a software-engineering-centric approach to data science. This contrasts with some common practices in data science software that can make deployment and integration into a production environment a challenge. The properties he highlighted include portability and dependency management, code quality, test coverage, versioning, and continuous integration.

Other novel challenges include testing models that involve randomness and ranges of acceptable result values, and deciding objectively what constitutes good model performance. Another challenge is how to retrain and retest a model as the underlying data set changes. The rate of change in the datasets becomes yet another variable to account for in deciding whether or not a model is production-worthy. Tests are currently run using pytest and TensorFlow, but Ocado declined a request for code samples.
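
Since no code from Ocado is available, the following is only a generic sketch of how tests for a randomized training routine might look. It uses the Python standard library rather than TensorFlow; the synthetic data, the 0.90 accuracy floor, and all names are assumptions made for illustration. The `test_` functions use plain assertions, so pytest could collect them as written.

```python
import math
import random

# Assumed acceptance threshold for this sketch; not a figure from Ocado.
ACCURACY_FLOOR = 0.90


def make_blobs(rng, n):
    """Draw n labeled 2-D points from two noisy, overlapping clusters."""
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        center = 1.0 if label else -1.0
        point = (rng.gauss(center, 0.7), rng.gauss(center, 0.7))
        data.append((point, label))
    return data


def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def train(data, rng, epochs=20, lr=0.1):
    """Logistic regression with random initialization and shuffled SGD."""
    w = [rng.uniform(-0.5, 0.5), rng.uniform(-0.5, 0.5)]
    b = 0.0
    data = list(data)
    for _ in range(epochs):
        rng.shuffle(data)
        for (x1, x2), y in data:
            err = sigmoid(w[0] * x1 + w[1] * x2 + b) - y
            w[0] -= lr * err * x1
            w[1] -= lr * err * x2
            b -= lr * err
    return w, b


def accuracy(model, data):
    """Fraction of points whose predicted class matches the label."""
    w, b = model
    hits = sum(1 for (x1, x2), y in data
               if (w[0] * x1 + w[1] * x2 + b > 0) == bool(y))
    return hits / len(data)


def run_experiment(seed):
    """Every source of randomness flows from one seeded generator."""
    rng = random.Random(seed)
    train_set = make_blobs(rng, 200)
    test_set = make_blobs(rng, 200)
    return accuracy(train(train_set, rng), test_set)


def test_same_seed_is_reproducible():
    # With a fixed seed, a random pipeline must give the same result twice.
    assert run_experiment(seed=7) == run_experiment(seed=7)


def test_accuracy_stays_in_acceptable_range():
    # Across seeds, assert a range of acceptable values, not one exact number.
    scores = [run_experiment(seed) for seed in range(5)]
    assert all(ACCURACY_FLOOR <= score <= 1.0 for score in scores)
```

The first test pins down reproducibility by seeding; the second accepts any score inside a band, which is one way to handle the "ranges of acceptable result values" mentioned above. Rerunning the second test after each change to the data set is a simple way to notice when a retrained model drops below the agreed floor.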
