I recently worked on a job posting search engine and wanted to share how I approached it and some findings.
Motivation
I had a data set of job postings and wanted to provide a way to find jobs using natural language queries. So a user could say something like "job posting for remote Ruby on Rails engineer at a startup that values diversity" and the search engine would return relevant job postings.
This would let users search for jobs without having to know which filters to use. For example, to find remote jobs you would typically have to check the "remote" box. Just saying "remote" in your query would be much easier. You could also query for more abstract qualities like "has good work/life balance" or some of the attributes that something like { key: values } would give.
Approach
We could potentially use something like Elasticsearch or create our own job search engine with rules, but I wanted to see how well embeddings would work. Embedding models are typically trained on internet-scale data, so they might capture some nuances of job postings that would be difficult for us to model by hand.
When you embed a string of text, you get a vector that represents the meaning of the text. You can then compare the embeddings of two strings to see how similar they are. So my approach was to first get embeddings for a set of job postings. This could be done once per posting. Then, when a user enters a query, I would embed the user's query and find the job posting vectors that were closest using cosine similarity.
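To make the flow above concrete, here is a minimal sketch in plain Python. The toy_embed function is a hypothetical stand-in I made up for illustration: it just counts vocabulary words, whereas a real setup would call an actual embedding model. The sample postings and the search helper are also assumptions, not the code I ran. The part that carries over is the shape of the pipeline: embed each posting once, embed the query, then rank postings by cosine similarity.

```python
import math
from collections import Counter


def toy_embed(text, vocab):
    # Hypothetical placeholder for a real embedding model:
    # one dimension per vocabulary word, valued by word count.
    counts = Counter(text.lower().split())
    return [float(counts[word]) for word in vocab]


def cosine_similarity(a, b):
    # Dot product divided by the product of the vector lengths.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as matching nothing
    return dot / (norm_a * norm_b)


def search(query_vec, posting_vecs, top_k=3):
    # Score every posting against the query and keep the best top_k.
    scored = [
        (cosine_similarity(query_vec, vec), index)
        for index, vec in enumerate(posting_vecs)
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Made-up sample data for illustration only.
postings = [
    "remote ruby on rails engineer at a startup",
    "onsite java developer at a bank",
    "remote python data engineer",
]
vocab = sorted({word for posting in postings for word in posting.split()})

# Done once per posting, ahead of time.
posting_vecs = [toy_embed(posting, vocab) for posting in postings]

# Done at query time.
query_vec = toy_embed("remote ruby engineer", vocab)
results = search(query_vec, posting_vecs, top_k=2)
for score, index in results:
    print(round(score, 3), postings[index])
```

With a real embedding model, only toy_embed would change. The scan over every posting is fine for a small data set, but a larger one would probably want a vector index instead.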