Recommend items to members based on previous clicks/transactions, on-boarding info, and third-party data (Facebook).
Required for new members (using on-boarding, marketing attribution, and third-party (Facebook) data); optional for old members, for whom we can use transactional data to recommend on an individual level.
An additional difficulty here is item uniqueness. One possible way to solve this is to put items into buckets. We have size, price, brand and catalog readily available, but I doubt that those features contain enough information. Stitch Fix uses item descriptions and comments to extract additional topics from unstructured text (code available, but in a pretty alpha state: https://github.com/cemoody/lda2vec).
Instead of doing item segmentation, we could go a slightly different route and do item vectorisation. With a vectorised representation of our content we could identify the region of that vector space which is of most interest to our members and simply search for items within that region. Train a neural network to categorise images based on Vinted data (or take an already trained network and fine-tune it, which is much easier with most of the same benefits). Throw out the classification layer and you have item vectorisation based on images. Text could be used as well. https://github.com/AKSHAYUBHAT/VisualSearchServer
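A minimal sketch of the "search within a region of vector space" idea, assuming the item vectors already exist. The 3-dimensional vectors, the item ids and the function names below are all hypothetical; real vectors would come from the network with its classification layer removed. A member's region of interest is approximated here by the average of the vectors of items they interacted with, which is only one of several possible choices.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def interest_vector(item_vectors, viewed_ids):
    """Average the vectors of the items the member interacted with."""
    vectors = [item_vectors[i] for i in viewed_ids]
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]

def recommend(item_vectors, viewed_ids, k=2):
    """Rank unseen items by similarity to the member's interest vector."""
    centre = interest_vector(item_vectors, viewed_ids)
    seen = set(viewed_ids)
    candidates = [(cosine(centre, vec), item_id)
                  for item_id, vec in item_vectors.items()
                  if item_id not in seen]
    candidates.sort(reverse=True)
    return [item_id for _, item_id in candidates[:k]]

# Hypothetical embeddings, for illustration only.
ITEM_VECTORS = {
    "dress_a": [0.9, 0.1, 0.0],
    "dress_b": [0.8, 0.2, 0.1],
    "dress_c": [0.85, 0.15, 0.05],
    "boots_a": [0.0, 0.9, 0.3],
    "jacket_a": [0.1, 0.2, 0.9],
}

print(recommend(ITEM_VECTORS, ["dress_a", "dress_b"], k=1))
```

For a member who looked at dress_a and dress_b, the top suggestion is dress_c. The scan over all items is brute force; at catalogue scale it would need a proper nearest-neighbour index.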
Plenty of implementations are available that we could use as a starting point, for example http://spark.apache.org/docs/latest/mllib-collaborative-filtering.html
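To make the collaborative filtering idea concrete, here is a small neighbourhood-based sketch. Note that this is not the matrix factorisation approach from the Spark page linked above; it is a simpler item-to-item variant using Jaccard similarity, chosen only because it fits in a few lines. The member names, items and function names are hypothetical.

```python
from collections import defaultdict

def item_similarities(interactions):
    """Jaccard similarity between items, based on which members touched them."""
    members_by_item = defaultdict(set)
    for member, items in interactions.items():
        for item in items:
            members_by_item[item].add(member)
    sims = defaultdict(dict)
    items = list(members_by_item)
    for i, a in enumerate(items):
        for b in items[i + 1:]:
            overlap = len(members_by_item[a] & members_by_item[b])
            if overlap:
                union = len(members_by_item[a] | members_by_item[b])
                sims[a][b] = sims[b][a] = overlap / union
    return sims

def recommend_for(member, interactions, k=3):
    """Score unseen items by summed similarity to the member's own items."""
    sims = item_similarities(interactions)
    seen = interactions[member]
    scores = defaultdict(float)
    for item in seen:
        for other, sim in sims[item].items():
            if other not in seen:
                scores[other] += sim
    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:k]]

# Hypothetical click/transaction data: member -> items they interacted with.
INTERACTIONS = {
    "anna": {"dress", "heels", "bag"},
    "berta": {"dress", "heels"},
    "carl": {"sneakers", "hoodie"},
    "dana": {"dress"},
}

print(recommend_for("dana", INTERACTIONS))
```

For dana, who only touched the dress, this suggests heels first and the bag second, because those co-occur with the dress among other members. carl gets nothing, since nobody else shares his items. This is the cold-start limitation that makes on-boarding and third-party data necessary for new members.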
Do we need real-time, or is batch good enough? https://github.com/brkyvz/streaming-matrix-factorization Do we calculate suggestions for all registered members? (not sure about speed)
Requires some sort of data structure to query an n-dimensional space. Lucene has some support as of version 6.0, not sure about ElasticSearch. http://stackoverflow.com/questions/5751114/nearest-neighbors-in-high-dimensional-data
Broad application area. Increased value per impression should result in a sales uplift. Higher sales of niche items.
High effort, high impact case
Find similar style items.
For good-quality items we have a lot of buyers coming after them. As our transaction flow (or most of the flows) is pretty long, there can be a queue of members trying to get one item. At that moment we know those members have a high propensity to buy. We could suggest similar items to everyone who has a failed transaction.
As before, two approaches are possible.
This is how similar items currently work, I believe. We could expand the data that we use to define the same segment.
It seems natural that the most beneficial source of information for similar items should be item photos. If we vectorise item photos, we could search for similar items based on visual appearance.
Additional sales channel. Could be used to sell items which have low exposure elsewhere (old items). Sales uplift.
We already have this, but I think it could be greatly improved. Similar items could be used more extensively to sell items which were not sold through other channels.
Like an inverse auction
A very simple implementation could just calculate the mean and standard deviation per segment (brand/catalog) and suggest a price based on the corresponding normal distribution.
Reduce the number of incorrectly priced items, leading to higher sales.
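A rough sketch of that simple per-segment approach, assuming the two statistics are the mean and the standard deviation (what a normal distribution needs) and that we fit them on prices of sold items. The field names, the sample data and the choice of a one standard deviation band are all hypothetical.

```python
from collections import defaultdict
from statistics import mean, stdev

def segment_stats(sold_items):
    """Return {(brand, catalog): (mean, standard deviation)} of sold prices."""
    prices = defaultdict(list)
    for item in sold_items:
        prices[(item["brand"], item["catalog"])].append(item["price"])
    # stdev needs at least two observations, so smaller segments are skipped.
    return {seg: (mean(p), stdev(p)) for seg, p in prices.items() if len(p) >= 2}

def suggest_price(stats, brand, catalog, width=1.0):
    """Suggest (low, typical, high) as the mean -/+ `width` standard deviations."""
    if (brand, catalog) not in stats:
        return None  # not enough data for this segment
    mu, sigma = stats[(brand, catalog)]
    return (max(0.0, mu - width * sigma), mu, mu + width * sigma)

# Hypothetical sold items, for illustration only.
SOLD = [
    {"brand": "Zara", "catalog": "dresses", "price": 8.0},
    {"brand": "Zara", "catalog": "dresses", "price": 10.0},
    {"brand": "Zara", "catalog": "dresses", "price": 12.0},
    {"brand": "Nike", "catalog": "shoes", "price": 30.0},
]

STATS = segment_stats(SOLD)
print(suggest_price(STATS, "Zara", "dresses"))
print(suggest_price(STATS, "Nike", "shoes"))
```

The Zara dresses segment has a mean of 10 and a sample standard deviation of 2, so the suggested band runs from 8 to 12. The Nike shoes segment has a single sale and returns None. Real price distributions are usually skewed, so working on log prices may fit the normal assumption better.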
Provide realtime feedback during item photo taking in our apps.
We had this during our experiment with a third party (Vidmantas). Sadly we deleted the dataset :(
For this task we will likely have to fully train the network; a pretrained one will likely not work due to the different problem domain. There is a lot of information on this, and ready-made network architectures exist, but I think it is not a very easy task. (Bleeding-edge network, TensorFlow implementation: https://github.com/tensorflow/models/tree/master/inception)
There have been a few attempts to reduce NN memory and computational power requirements. Worth checking out if considering a phone implementation. http://songhan.github.io/SqueezeNet-Deep-Compression/
Take a photo with a phone, find items on Vinted https://github.com/AKSHAYUBHAT/VisualSearchServer
Not sure what to do if we know a member is likely to leave. http://spark.apache.org/docs/latest/ml-classification-regression.html#survival-regression
Subtask of anomaly detection
Monster from Etsy: https://anomaly.io/detect-anomalies-skyline/ They said they were working on a cleaner solution, not sure about the status.
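For a sense of what the simplest possible detector looks like before adopting something as heavy as Skyline, here is a sliding-window z-score check. This is not Skyline's code; the window size, the threshold and the sample series are assumptions picked for illustration.

```python
from statistics import mean, stdev

def is_anomalous(history, latest, threshold=3.0):
    """Flag `latest` if it lies more than `threshold` std devs from the history mean."""
    if len(history) < 2:
        return False  # not enough data to estimate spread
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return latest != mu
    return abs(latest - mu) / sigma > threshold

def scan(series, window=5, threshold=3.0):
    """Return indices whose value deviates strongly from the preceding window."""
    flagged = []
    for i in range(window, len(series)):
        if is_anomalous(series[i - window:i], series[i], threshold):
            flagged.append(i)
    return flagged

# Hypothetical metric, e.g. events per minute, with one obvious spike.
SERIES = [100, 102, 98, 101, 99, 100, 250, 101, 99]

print(scan(SERIES))
```

On this series only index 6 (the value 250) is flagged. The spike then inflates the window's standard deviation for the following points, which is one reason production detectors combine several tests instead of relying on a single one.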
Alert early about trending negative/positive topics in the forums. Use word2vec and logistic regression. Did some experiments earlier: easy for feedback, harder for the forum. https://github.com/linanqiu/word2vec-sentiments/blob/master/word2vec-sentiment.ipynb
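The word2vec plus logistic regression pipeline can be sketched roughly as follows: average the word vectors of a post, then fit a logistic regression on those averages. The tiny 2-dimensional vectors and the four training posts below are made up for illustration; real vectors would come from a word2vec model trained on our own feedback and forum text, and this is not the code from the linked notebook.

```python
import math

# Hypothetical 2-dimensional word vectors; real ones would come from a trained
# word2vec model and have hundreds of dimensions.
WORD_VECTORS = {
    "great": [1.0, 0.0], "love": [0.9, 0.1], "item": [0.0, 0.1],
    "bad": [-1.0, 0.0], "broken": [-0.9, -0.1],
}

def doc_vector(tokens):
    """Average the vectors of known words; unknown words are ignored."""
    known = [WORD_VECTORS[t] for t in tokens if t in WORD_VECTORS]
    if not known:
        return [0.0, 0.0]
    return [sum(v[d] for v in known) / len(known) for d in range(2)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(docs, labels, epochs=200, rate=1.0):
    """Fit logistic regression weights with plain batch gradient descent."""
    weights, bias = [0.0, 0.0], 0.0
    features = [doc_vector(d) for d in docs]
    for _ in range(epochs):
        grad_w, grad_b = [0.0, 0.0], 0.0
        for x, y in zip(features, labels):
            error = sigmoid(weights[0] * x[0] + weights[1] * x[1] + bias) - y
            grad_w = [g + error * xi for g, xi in zip(grad_w, x)]
            grad_b += error
        weights = [w - rate * g / len(docs) for w, g in zip(weights, grad_w)]
        bias -= rate * grad_b / len(docs)
    return weights, bias

def positive_probability(model, tokens):
    """Probability that a tokenised post is positive under the fitted model."""
    weights, bias = model
    x = doc_vector(tokens)
    return sigmoid(weights[0] * x[0] + weights[1] * x[1] + bias)

# Hypothetical labelled posts: 1 = positive, 0 = negative.
DOCS = [["great", "item"], ["love", "item"], ["bad", "item"], ["broken", "item"]]
LABELS = [1, 1, 0, 0]
MODEL = train(DOCS, LABELS)

print(positive_probability(MODEL, ["great", "love"]) > 0.5)
print(positive_probability(MODEL, ["bad", "broken"]) < 0.5)
```

Trend alerting would then sit on top of this, for example by tracking the share of negative posts per topic over time and flagging sudden changes.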