Overview of Text Similarity Metrics

Learn everything about data mining and user query understanding, for websites up to Amazon scale

 

Search and information retrieval form the backbone of every e-commerce website today. Understanding what information your users are looking for is critical to increasing engagement, click-through rates, and sales.

Working with user queries, I am humbled by how diverse and inventive people are in expressing their information needs. As creators of web search engines, the better we understand those needs, the better we can serve our users.

In this course, we will cover mistakes to avoid, tips and tricks, easy wins, and low-hanging fruit, all of which will increase user engagement and click-through rates. We will take a deep dive into these topics while building a big project that ties all the concepts together!

Come join me on this journey into the world of query understanding, information retrieval, and natural language processing!

Here is the breakdown of the course:

Section 1: Introduction to Query Understanding
1) Why query understanding?
2) Query Understanding Basics
3) Information Retrieval Techniques Review - tf-idf and text similarity metrics


Section 2: Building a Search Engine from scratch

4) ElasticSearch Review
5a) Data Preparation and Setup
5b) Building our own search engine from scratch


Section 3: Query Understanding Techniques - that work!

6) How to measure success? Search Relevance techniques
(Automated vs human relevance inputs)
7) Techniques for increasing precision of search results
8) Techniques for increasing recall of search results


Section 4: Machine Learning for Query Understanding

9) Using machine learning to rank products better
10) Using word embeddings for automatic synonym detection


Section 5: The Future
11) Using RNNs and other state-of-the-art deep learning models in text parsing and information retrieval


Your Instructor


Sanket Gupta

Sanket Gupta is a New York based engineer with experience building data science, statistics, and natural language processing projects that become products solving user needs.
He is an active contributor to the data science community, and his blog posts have been featured in various publications. He is passionate about teaching and educating in the area of data science. He enjoys travelling, writing, and listening to rock music. He also hosts a data science podcast, The Data Life Podcast, which is available on Apple Podcasts and Spotify.


Frequently Asked Questions


When does the course start and finish?
The course starts now and never ends! It is a completely self-paced online course - you decide when you start and when you finish.
How long do I have access to the course?
How does lifetime access sound? After enrolling, you have unlimited access to this course for as long as you like - across any and all devices you own.
What if I am unhappy with the course?
We would never want you to be unhappy! If you are unsatisfied with your purchase, contact us in the first 30 days and we will give you a full refund.
Can I read a blog written by the author?
Of course! Please check out Overview of Text Similarity Metrics (https://towardsdatascience.com/overview-of-text-similarity-metrics-3397c4601f50), a blog post written by the author that covers several text similarity metrics.
Will this class be more theoretical or practical?
That's a good question! This class focusses on the practical side of things: techniques you can use in your projects today. Some theory is covered, but the emphasis is on projects and hands-on work.

This course is closed for enrollment.