Elastic Stack Lab 10

In this lab we’re going to play around with pagination, sorting and more advanced filtering. We’re also going to learn how to use fuzziness to match against mistyped searches.

Pagination

When a search matches a lot of documents, it’s very helpful to use pagination so that we return only a small number of results at a time.

We’ll start out by using the URI Search to return the first 2 results in the movies index, no sorting, no relevancy, just pull down the first 2.

curl -XGET '127.0.0.1:9200/movies/movie/_search?size=2&pretty'

Did your search only return 2 results? Is this what you expected?

You may have noticed there is no from specified. This is because by default it starts at 0, so since we want to see the first 2 results there’s no need to specify a from value.

If we were displaying this on a website and wanted the next 2 results, we would specify a from and set it to 2: we have already shown results 0 and 1, so the next one is 2.

Go ahead and run the above command but add from=2 to the query.

Did you get back the next 2?

You’ve queried the Elasticsearch API using URI Search. Now let’s use a JSON body to clean this up a bit and show how it can be done.

curl -XGET '127.0.0.1:9200/movies/movie/_search?pretty' -H 'Content-Type: application/json' -d '
{
   "from": 2,
   "size": 2,
   "query": {"match": {"genre": "Sci-Fi"}}
}'

Your query should return the 2 results on the 2nd page (if we were displaying them on a website). Go ahead and look at the results that would have been displayed on the first page, then query for the results for the 3rd page.
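
The relationship between a page number and from can be sketched in shell. This is a minimal sketch under a few assumptions: the page_offset and page_body helper names are made up for illustration, pages are numbered starting at 1, and nothing is sent to Elasticsearch; the body is only printed.

```shell
# Hypothetical helper: from offset for a 1-based page number.
# from = (page - 1) * size, so page 1 starts at offset 0.
page_offset() {
  echo $(( ($1 - 1) * $2 ))
}

# Hypothetical helper: print a search body for a given page and page size.
page_body() {
  printf '{"from": %d, "size": %d, "query": {"match": {"genre": "Sci-Fi"}}}\n' \
    "$(page_offset "$1" "$2")" "$2"
}

# Page 2 with 2 results per page reproduces the body used above.
page_body 2 2
```

The printed body can be passed to curl with -d in place of the hand-written one.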

Sorting

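Sorting is very easy! We can do it using URI Search or through a JSON body. Remember that a string field can only be sorted on if it does not use an analyzer, so it must be of keyword type. A field like year, which is not analyzed text, can be sorted on as is.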

Let’s sort all of the documents in the movies index by year.

curl -XGET '127.0.0.1:9200/movies/movie/_search?sort=year&pretty'

We can see that it did sort the results by year as expected!

Let’s try it again but now we’re going to sort by the title field.

curl -XGET '127.0.0.1:9200/movies/movie/_search?sort=title&pretty'

What happened?

This error occurs because the title field is analyzed, and you can’t sort on an analyzed field. Elasticsearch does not keep the contents of the field as one string to sort on; it stores the individual terms in the inverted index, so there is no single value per document to compare.
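
The way an analyzed field ends up as individual terms can be approximated locally. This is only a rough stand-in for Elasticsearch’s analysis (lowercase, then split on anything that is not a letter or digit). The rough_terms name is made up for illustration, the title is just sample input, and the real analyzer handles many more cases.

```shell
# Hypothetical helper: rough approximation of how an analyzed text field
# is broken into terms. This is NOT the real Elasticsearch analyzer.
rough_terms() {
  # First tr lowercases the input; second tr turns every run of
  # non-alphanumeric characters into a single newline.
  printf '%s\n' "$1" |
    tr '[:upper:]' '[:lower:]' |
    tr -cs '[:alnum:]' '\n'
}

# One title becomes several independent terms, one per line.
rough_terms 'Star Trek Beyond'
```

The title comes out as three separate terms (star, trek, beyond), which is why there is nothing obvious to sort the whole title by.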

To fix this, we get to delete the index, create a new mapping that adds a subfield of keyword type, and re-import the data.

curl -XDELETE 127.0.0.1:9200/movies
curl -XPUT 127.0.0.1:9200/movies -H 'Content-Type: application/json' -d '
{
    "mappings" : {
        "movie" : {
            "properties" : {
                "title" : {
                    "type" : "text",
                    "fields" : { "raw": {"type": "keyword"}}
                }
            }
        }
    }
}'

Great, you just created a title field of type text, which is analyzed, and also a title.raw field, which is of type keyword.

Now use steps from previous labs to reimport the movies.json data.

After importing, let’s try that sort query again using title.raw instead of the title field.

curl -XGET '127.0.0.1:9200/movies/movie/_search?sort=title.raw&pretty'

Are the movies sorted by title? Alphabetically?

A subfield of type keyword allows you to sort on any text field.
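
When checking whether the results look alphabetical, keep in mind that a keyword field holds the exact string, so, to the best of our understanding, values are ordered by their raw character codes and not by dictionary rules: uppercase letters come before lowercase ones. The sketch below mimics that ordering locally with sort under the C locale. The byte_sort name and the three titles are made up for illustration and are not necessarily documents in your index.

```shell
# Hypothetical helper: sort lines by raw byte value.
# LC_ALL=C makes sort compare bytes, which puts "Z" before "a".
byte_sort() {
  LC_ALL=C sort
}

# Sample input only; the lowercase title ends up last.
printf '%s\n' 'avatar' 'Star Trek Beyond' 'Interstellar' | byte_sort
```

If a lowercase title shows up after all the capitalized ones in your results, this ordering is the likely reason.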

Complex filter

Filters are used for almost everything within Elasticsearch. If you want to retrieve relevant data you will need to filter it.

Here is an example of a complex filter that will return all Sci-Fi movies released from 2010 up to, but not including, 2015 that do not have trek in the title.

curl -XGET '127.0.0.1:9200/movies/_search?pretty' -H 'Content-Type: application/json' -d '
{
    "query":{
        "bool": {
            "must": {"match": {"genre": "Sci-Fi"}},
            "must_not": {"match": {"title": "trek"}},
            "filter": {"range": {"year": {"gte": 2010, "lt": 2015}}}
        }
    }
}'
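
The three clauses of the bool query can be mimicked locally to see which documents survive each one. The sketch below is a stand-in built on assumptions, not how Elasticsearch evaluates the query: the bool_filter name and the genre|year|title rows are made up, awk substring tests replace the match clauses, and no relevance scoring takes place. It does show that lt excludes 2015 itself.

```shell
# Hypothetical helper: keep rows that satisfy all three clauses.
bool_filter() {
  awk -F'|' '
    # must: genre matches Sci-Fi; must_not: title contains trek;
    # filter: 2010 <= year < 2015 (gte 2010, lt 2015)
    $1 ~ /Sci-Fi/ && tolower($3) !~ /trek/ && $2 >= 2010 && $2 < 2015 {
      print $3 " (" $2 ")"
    }
  '
}

# Made-up sample rows in the form genre|year|title.
printf '%s\n' \
  'Sci-Fi|2009|Sample Movie A' \
  'Sci-Fi|2012|Sample Movie B' \
  'Sci-Fi|2013|Sample Trek C' \
  'Sci-Fi|2015|Sample Movie D' \
  'Drama|2012|Sample Movie E' |
  bool_filter
```

Only Sample Movie B (2012) is printed: A is too early, C has trek in the title, D falls on the excluded upper bound, and E is not Sci-Fi.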

Now, looking at the bool query above, go ahead and write one of your own that meets the following requirements.

Fuzzy queries

Let’s start with a non-fuzzy query and see what happens when we misspell something simple.

Start out by using a match query to search for intersteller, spelled with an e instead of an a.

curl -XGET '127.0.0.1:9200/movies/movie/_search?pretty' -H 'Content-Type: application/json' -d '
{
    "query": {
        "match": {
            "title": "intersteller"
        }
    }
}'

What were the results?

Since we didn’t get any results using a regular match query, let’s see if we can use fuzzy to help us out.

We are going to run a fuzzy query so that the search is more tolerant of mistakes. We need to specify the value we are searching for and the fuzziness we are willing to tolerate.

Let’s use the same query as above with a couple changes.

curl -XGET '127.0.0.1:9200/movies/movie/_search?pretty' -H 'Content-Type: application/json' -d '
{
    "query": {
        "fuzzy": {
            "title": {"value": "intersteller", "fuzziness":1 }
        }
    }
}'

Now that we’ve used a fuzzy query, we are able to handle typos and mistakes of up to 1 edit: one character inserted, deleted or substituted, or two adjacent characters swapped.
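
The fuzziness value is an edit distance, and it can help to compute one by hand when predicting whether a misspelling will match. The sketch below is a plain Levenshtein distance written in awk. The edit_distance name is made up for illustration, and unlike Elasticsearch’s default behaviour it counts a swap of two adjacent characters as 2 edits rather than 1.

```shell
# Hypothetical helper: plain Levenshtein distance between two words.
# Assumption: no special handling of swapped adjacent characters.
edit_distance() {
  awk -v a="$1" -v b="$2" 'BEGIN {
    la = length(a); lb = length(b)
    # prev[j] holds the distance from the empty prefix of a to b[1..j]
    for (j = 0; j <= lb; j++) prev[j] = j
    for (i = 1; i <= la; i++) {
      cur[0] = i
      for (j = 1; j <= lb; j++) {
        cost = (substr(a, i, 1) == substr(b, j, 1)) ? 0 : 1
        best = prev[j] + 1                                     # deletion
        if (cur[j - 1] + 1 < best) best = cur[j - 1] + 1       # insertion
        if (prev[j - 1] + cost < best) best = prev[j - 1] + cost  # substitution
        cur[j] = best
      }
      for (j = 0; j <= lb; j++) prev[j] = cur[j]
    }
    print prev[lb]
  }'
}

# The misspelling used in this lab is one substitution away (prints 1).
edit_distance intersteller interstellar
```

A distance of 1 is within "fuzziness": 1, which is why the fuzzy query can find the movie where the plain match query could not.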

Let’s play around with fuzzy queries a little bit. Try to figure out the following scenarios on your own, and if you need assistance, speak with the instructor.

Lab Complete