Machine Learning Newsletter

Data Products Made Easy

I read Data Jujitsu recently and enjoyed it a lot. DJ Patil presents a nice set of hard-learned lessons from his experience, mainly at LinkedIn. I prefer this kind of real-world experience to a set of rules that assumes an ideal world and asks you to behave accordingly.

A recurring theme of this short book is to divide the problems you have and attack them piece by piece. The term "Jujitsu" comes from the Japanese martial art and suggests making small attacks and turning the opponent's strength against her, because the opponent (the problem) is much stronger than you. The name reflects quite well what the book is about: developing solutions to hard problems with limited people and time. What surprised me a little is how consumer-oriented he is throughout the book. His approach to problem solving is roughly the following: if a problem is not something customers want (an enhancement to the product) or something customers are unhappy about (a bug), then there must be some other incentive (increased engagement, better sales and so on) that makes the problem worth solving. Solving a problem for its own sake benefits neither the customer nor the business.

Critical Question

  • Does anyone want or need your product?

I cannot stress enough how critical this question is. Startups, and businesses in general, usually do not fail because they could not solve a problem technically. They fail because they cannot create enough incentive for people to buy the product, or because the product is simply irrelevant to the market. If you cannot answer this question with confidence, it will not matter how hard a problem you solved. In the end, if the product gets used, then someone needs or wants it.

Minimum Viable Product

If you can answer the question above, you are ready to build your minimum viable product. It does not have to be your fully functional product, but it should be simple, it should prove that there is a need, and it should be good enough for you to decide whether it is worth improving. When you want to improve it, ask these two questions:

  1. "Does the customer care? Is there a market fit? If there isn't, there is no sense in building an application."
  2. "How long do we have to learn the answer to Question A?"

When you prototype and measure user engagement in step 1, you get a better sense of what you should build. This also acts as a validity check that you are not building something nobody wants.


Go back for a second: what is a "data product" to begin with? He gives a concrete definition:
"Data product is a product that facilitates an end goal through the use of data."

Design of Product

Considering how much development effort and funding have gone into big data technologies and data analysis tools in general, building data products should be easy by now, right? Not really.

Data is messy, especially if you are getting it from a variety of sources that do not share common interfaces. Data is messy if you are collecting it from text fields that customers fill in. How do you make sure that the data is in the right form? With product design. You can offer suggestions ahead of the user as Google search does, you can prompt "did you mean ..." to help the user, and you can arrange your dropdown menu based on the customer's input. This not only provides a much better experience for the user but also gives you much better, structured data in your back end (think about dropdowns and type-ahead support for a second). Patil presents this fact as: "I've found that trying to solve a problem on the back-end is 100-1000 times more expensive than on the front end".
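To make the front-end idea concrete, here is a minimal sketch of type-ahead and a "did you mean ..." prompt using only the Python standard library. The vocabulary list, the function names and the 0.6 similarity cutoff are my own assumptions for illustration, not something taken from the book.

```python
import difflib

# Hypothetical vocabulary of accepted values, e.g. company names already in the system.
KNOWN_COMPANIES = ["LinkedIn", "Google", "Facebook", "Twitter", "Microsoft"]


def type_ahead(prefix, vocabulary=KNOWN_COMPANIES, limit=5):
    """Return known values that start with what the user has typed so far."""
    p = prefix.strip().lower()
    if not p:
        return []
    return [v for v in vocabulary if v.lower().startswith(p)][:limit]


def did_you_mean(text, vocabulary=KNOWN_COMPANIES, cutoff=0.6):
    """Return the closest known value for a free-text entry, or None."""
    # Compare case-insensitively, but hand back the canonical spelling.
    canonical = {v.lower(): v for v in vocabulary}
    matches = difflib.get_close_matches(
        text.strip().lower(), list(canonical), n=1, cutoff=cutoff
    )
    return canonical[matches[0]] if matches else None


if __name__ == "__main__":
    print(type_ahead("go"))
    print(did_you_mean("Linkdin"))
```

If the user accepts the suggestion, the back end stores one canonical spelling instead of many variants, which is the cheap front-end fix the quote is about.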

Use humans when you have to, use technical solutions when you can

Generally, engineering looks for technical solutions that scale. This ensures that the solution stays profitable for a large number of users. However, when you are trying to become relevant in the market, or to find out whether there is a market at all, you need to use humans. A similar issue is the cold start problem, which is a significant problem in recommender systems. Think about a user who has just signed up for your e-commerce startup: you want to recommend things to him, but you do not have any history. Not only that, but you have just launched your product, so you have no prior knowledge of what people in general like. If you had beta users or Mechanical Turk workers before you launched your product, you are in luck. If the product is consumer-facing, you should have at least some data about your users to get a head start. Technical solutions are and will be more efficient, cheaper and more scalable in the long run, but if you cannot afford the time and effort to build a technical solution to the problem, do not hesitate to use humans.
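One simple way to use such human-gathered data for a brand-new user is a popularity fallback. The sketch below is my own illustration and not a method from the book: it assumes the seed data arrives as a list of baskets (lists of item names), and the function name is hypothetical.

```python
from collections import Counter


def recommend(user_history, seed_purchases, k=3):
    """Recommend up to k items by popularity in the seed data.

    user_history: items this user already bought (empty for a new user).
    seed_purchases: baskets gathered by hand before launch, for example
        from beta users or paid human raters (assumed input format).
    """
    popularity = Counter(item for basket in seed_purchases for item in basket)
    already_owned = set(user_history)
    # Most popular first, skipping anything the user already has.
    ranked = [item for item, _ in popularity.most_common() if item not in already_owned]
    return ranked[:k]


if __name__ == "__main__":
    seed = [["book", "lamp"], ["book", "mug"], ["book", "lamp", "pen"]]
    print(recommend([], seed, k=2))
```

Once real purchase history accumulates, this fallback can be replaced by a proper recommender; until then it at least shows the new user something other people liked.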

Always be opportunistic

  • If you are able to do data analysis to make the product better or increase sales, just do it!
  • If you cannot do some operation because you do not have the resources or technical expertise, try to divide the operation and offer a simple version of it instead of providing nothing!

Give the data back!

If your product is data-centric, you must be creating some value around the data, and you should already be providing it to the customer in some way. Give more of it! To increase both engagement and revenue, present the data in an understandable, clean and maybe even interactive way. Let users play with the data. If your data is timely and actionable (think Twitter for a second), it becomes addictive. Instead of hoarding it, share it. Only through sharing and giving back can you create more value around it.

But do not give too much!

What Patil calls "Data Vomit" is giving too much data without considering whether it makes sense or is valuable to the customer, which has a good chance of overwhelming the user. Confusion and frustration then replace engagement from day one. There is a sweet spot up to which more data generates more engagement; past it, more data causes less interaction and engagement.

Consider non-ideal cases

If you are building a product, think about the extreme and edge cases as well. Showing Spanish pages to a tourist visiting Mexico just because she is in Mexico may not make sense, especially if she repeatedly changes the browser language from Spanish to English!

Precision and Recall

If you are building a retrieval system, learn these concepts first. Precision is the fraction of the returned results that are relevant; recall is the fraction of all relevant items that are returned. Then decide what you want to compromise on, as the two generally work against each other. For a search engine, precision might be the single most important metric, whereas if you claim to be one of the most comprehensive news sources, you also need to increase your recall to be consistent with your claim. A rule of thumb is that if the data is exposed, aim for high precision first, because the first page may be the only page your customer sees. If the precision is not good, it may well be the last page.
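The trade-off is easier to see with numbers. Below is a minimal sketch that computes both metrics for a list of returned documents; the document ids and the two result lists are made up for illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = relevant results returned / all results returned
    recall    = relevant results returned / all relevant items
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


if __name__ == "__main__":
    relevant = ["d1", "d2", "d3", "d4"]  # hypothetical ground truth
    short_page = ["d1", "d2"]            # few results, all of them relevant
    long_page = ["d1", "d2", "d3", "d4", "d7", "d8", "d9", "d10"]
    print(precision_recall(short_page, relevant))  # high precision, lower recall
    print(precision_recall(long_page, relevant))   # full recall, lower precision
```

Returning more results can only keep or raise recall, but every irrelevant result added lowers precision, which is why the two usually pull in opposite directions.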

Social system for the win

If your recommender system gives terrible recommendations and customers are unhappy about it, use collaborative filtering first, so that if a customer is still unhappy with a recommendation, the blame falls on the preferences of other users. It is one thing for the customer to blame the product for terrible recommendations; it is a completely different thing for her to think "other people are buying these two unrelated things together, so I was recommended that product".
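A minimal sketch of this idea, assuming purchase data arrives as baskets of item names: it counts which items are bought together (a very simple item-to-item form of collaborative filtering, chosen by me for brevity) and attaches a "customers who bought X also bought Y" explanation to the recommendation. The function names and sample baskets are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import permutations


def build_cooccurrence(baskets):
    """Count, for every item, how often each other item shares a basket with it."""
    co = defaultdict(Counter)
    for basket in baskets:
        # sorted(set(...)) removes duplicates and keeps the iteration order stable.
        for a, b in permutations(sorted(set(basket)), 2):
            co[a][b] += 1
    return co


def recommend_with_reason(item, co):
    """Return (recommended_item, explanation), or None if nothing co-occurs."""
    if item not in co or not co[item]:
        return None
    other, count = co[item].most_common(1)[0]
    return other, f"{count} customers who bought {item} also bought {other}"


if __name__ == "__main__":
    baskets = [["tent", "stove"], ["tent", "stove"], ["tent", "novel"]]
    co = build_cooccurrence(baskets)
    print(recommend_with_reason("tent", co))
```

Showing the reason next to the recommendation is what shifts the perceived responsibility from the product to the behaviour of other customers.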

Get more data

Even if your domain is not advertising, knowing more about your users always pays off. As you learn more about a user, you can recommend better, personalize better, sell better and serve better. Asking for data, if done correctly, can be another way to engage the customer as well. Once you have the data, your only goal should be a better product. Do not abuse it!

The user is the most important

Features fail, products fail, nearly everything in the universe fails at some point. Get used to it, but try to preserve as much of the user experience as you can in the process. Data products generally empower the user in some way, and there is a high chance that the experience will not stay constant over time. Try not to let it degrade too much. If the ads that you show may be offensive, give the user an option to remove that type of ad when she visits the website (similar to Facebook). If the people you recommend are not very relevant, give the user a way to send negative feedback about people she does not know, so that she will not see them again (similar to LinkedIn). This not only gives control and value to the user; in the process you also learn user preferences and can build a better product.

Three Fundamental Questions that you should ask yourself

  1. What do you want the user to take away from this product?
  2. What action do you want the user to take because of the product?
  3. How should the user feel during and after using your product?

If your product gets its core and fundamentals right from day one, you have a lot of time to improve it. Use Jujitsu!
