Setting up NBoost for Elasticsearch¶
In this example we will set up a proxy to sit in between the client and Elasticsearch and boost the results!
Preliminaries¶
Install NBoost for Pytorch.
pip install nboost[pt]
Set up an Elasticsearch Server
🔔 If you already have an Elasticsearch server, you can skip this step!
If you don’t have Elasticsearch, not to worry! You can set up a local Elasticsearch cluster by using docker. First, get the ES image by running:
docker pull elasticsearch:7.4.2
Once you have the image, you can run an Elasticsearch server via:
docker run -d -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" elasticsearch:7.4.2
Index some data
NBoost has a handy indexing tool built in (
nboost-index
). For demonstration purposes, will be indexing a set of passages about traveling and hotels. You can add the index to your Elasticsearch server by running:travel.csv
comes with NBoostnboost-index --file travel.csv --name travel --delim ,
Deploying the proxy¶
Now we’re ready to deploy our Neural Proxy! There are three ways to configure NBoost.
Via command line.
On the command line, we can run:
nboost \ --uhost localhost \ --uport 9200 \ --search_route "/<index>/_search" \ --query_path url.query.q \ --topk_path url.query.size \ --default_topk 10 \ --topn 50 \ --choices_path body.hits.hits \ --cvalues_path _source.passage
📢 The
--uhost
and--uport
should be the same as the Elasticsearch server above! Uhost and uport are short for upstream-host and upstream-port (referring to the upstream server).Now let’s test it out! Hit the Elasticsearch with:
curl "http://localhost:8000/travel/_search?pretty&q=passage:vegas&size=2"
Via json.
On the command line, let’s run:
nboost --search_route "/<index>/_search"
In a python script, we can run:
import requests from pprint import pprint response = requests.get( url='http://localhost:8000/travel/_search', json={ 'nboost': { 'uhost': 'localhost', 'uport': 9200, 'query_path': 'body.query.match.passage', 'topk_path': 'body.size', 'default_topk': 10, 'topn': 50, 'choices_path': 'body.hits.hits', 'cvalues_path': '_source.passage' }, 'size': 2, 'query': { 'match': {'passage': 'I want a Louisiana hotel with a pool'} } } ) pprint(response.json())
Via query params.
On the command line, let’s run:
nboost --search_route "/<index>/_search"
In a python script, we can run:
import requests from pprint import pprint response = requests.get( url='http://localhost:8000/travel/_search', params={ 'uhost': 'localhost', 'uport': 9200, 'query_path': 'url.query.q', 'topk_path': 'url.query.size', 'default_topk': 10, 'topn': 50, 'choices_path': 'body.hits.hits', 'cvalues_path': '_source.passage', 'q': 'passage:I want a vegas hotel with a pool', 'size': 2 } ) pprint(response.json())
No matter how we configure NBoost, if the Elasticsearch result has the nboost
tag in it, congratulations it’s working!