Getting Started with Elasticsearch


Hey Guys,

In this post we’re going to look at Elasticsearch. So what exactly is Elasticsearch? It’s basically an enterprise search engine that you can use to solve the data-related problems you typically come across on a large-scale, data-oriented project.

So when and where should you use Elasticsearch? I’m going to answer that with an example: the scenario we faced when we decided to use it.

We were working on a project that involved gigabytes of raw JSON logs that needed to be processed on a daily basis (covering the past 7 days). The logs were then analyzed for certain factors and trends, and based on the results we needed to send marketing emails and push notifications to a number of users. The daily logs ran into millions of records, and the whole process required at least 4 to 6 hours every day. After looking at several alternatives, we decided to use Elasticsearch to speed up this process while also making it less laborious.

The solution we came up with was to upload the logs generated each hour to the Elasticsearch server (which only required a minute or two of effort every hour), and then just query Elasticsearch whenever the data was required. Below are some of the benefits of Elasticsearch that we observed during our usage.
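To give a rough idea of what such an hourly upload could look like, here is a minimal sketch that packs a batch of log records into the newline-delimited JSON body accepted by the `_bulk` endpoint. The post does not say which upload mechanism we used, so treat the bulk approach, the `app-logs` index name, the `localhost:9200` address and the sample records as assumptions made for illustration. The sketch only builds the request; it does not send anything.

```python
import json
from urllib import request


def build_bulk_body(index_name, log_records):
    """Serialize records into the newline-delimited JSON used by the _bulk endpoint.

    Each record becomes two lines: an action line naming the target index,
    followed by the document itself. Every line ends with a newline.
    """
    lines = []
    for record in log_records:
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps(record))
    return "".join(line + "\n" for line in lines)


def build_bulk_request(base_url, index_name, log_records):
    """Build (but do not send) a POST request for the _bulk endpoint."""
    body = build_bulk_body(index_name, log_records).encode("utf-8")
    return request.Request(
        url=f"{base_url}/_bulk",
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-ndjson"},
    )


if __name__ == "__main__":
    # Hypothetical sample records; real logs would be read from the hourly files.
    hourly_logs = [
        {"event": "login", "user_id": 1, "timestamp": "2015-01-01T10:00:00Z"},
        {"event": "purchase", "user_id": 2, "timestamp": "2015-01-01T10:05:00Z"},
    ]
    req = build_bulk_request("http://localhost:9200", "app-logs", hourly_logs)
    print(req.get_method(), req.full_url)
    # To actually upload, a running server is needed: request.urlopen(req)
```

Sending one bulk request per batch instead of one request per record keeps the hourly upload short, which fits the "minute or two" of effort described above.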

  • The biggest benefit we observed was that Elasticsearch keeps the data distributed in separate chunks (or shards), which can be placed in different locations (servers). This means the data architecture scales readily to multiple servers. The upside is that Elasticsearch manages this multi-server architecture by itself: you only need to configure the servers once at the outset, and after that it’s largely hands-off.
  • Elasticsearch offers an extremely efficient full-text search option. It indexes all the fields you define in your index (or table, in SQL terminology), so query execution is extremely fast. In our case queries took only a few seconds, even though there were millions of records stored and we were querying on string columns.
  • Elasticsearch stores its data as schema-less JSON documents. In our case every record had its own structure based on the event type, with the exception of a few common fields. So it saved us the hassle of having to fit the data into an SQL database, since Elasticsearch provides out-of-the-box support for schema-less JSON data.
  • It’s really easy to configure and set up. We just needed to run the service, provide some basic configuration details such as the number of shards and the server architecture, create an index for our data using some common fields, and we were ready to go. Have a look at https://github.com/elastic/elasticsearch or http://joelabrahamsson.com/elasticsearch-101/ if you want to get started with Elasticsearch in 5 minutes.
  • Best of all, Elasticsearch is open source, and it provides built-in support for OS X, Windows and Linux environments. So you don’t need to worry about which OS your server or development environment is running in order to use Elasticsearch.
  • Plus, to integrate Elasticsearch into your project, you don’t need to learn any new protocol or API. All you need to know is how to use the REST interface. Simply use a PUT request to put data into Elasticsearch and a GET or POST request to query the data.
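The setup and REST points in the list above can be sketched with nothing more than Python’s standard library. The example below builds three requests: a PUT that creates an index with a chosen number of shards and replicas, a PUT that stores one JSON document, and a POST that runs a full-text search limited to the past 7 days. The index name `app-logs`, the `timestamp` and `message` fields, the shard counts and the `localhost:9200` address are all assumptions for illustration, and the `/_doc/` path follows recent Elasticsearch versions (older releases used a type name in that position). This is a sketch, not the exact setup we ran.

```python
import json
from urllib import request

BASE_URL = "http://localhost:9200"  # assumption: a local single-node server


def json_request(method, path, payload=None):
    """Build (but do not send) a JSON request against the REST interface."""
    data = None if payload is None else json.dumps(payload).encode("utf-8")
    return request.Request(
        url=BASE_URL + path,
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )


def create_index_request(index_name, shards, replicas):
    """PUT /<index> with shard and replica settings chosen up front."""
    settings = {
        "settings": {
            "number_of_shards": shards,
            "number_of_replicas": replicas,
        }
    }
    return json_request("PUT", f"/{index_name}", settings)


def index_document_request(index_name, doc_id, document):
    """PUT /<index>/_doc/<id> stores one schema-less JSON document."""
    return json_request("PUT", f"/{index_name}/_doc/{doc_id}", document)


def search_request(index_name, field, text, since="now-7d/d"):
    """POST /<index>/_search: full-text match, filtered to a recent time window.

    The "timestamp" field name is hypothetical; use whatever date field
    the documents actually carry.
    """
    query = {
        "query": {
            "bool": {
                "must": [{"match": {field: text}}],
                "filter": [{"range": {"timestamp": {"gte": since}}}],
            }
        }
    }
    return json_request("POST", f"/{index_name}/_search", query)


def send(req):
    """Send a request and decode the JSON reply (needs a running server)."""
    with request.urlopen(req) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    for req in (
        create_index_request("app-logs", shards=3, replicas=1),
        index_document_request("app-logs", 1, {"message": "user signed up"}),
        search_request("app-logs", "message", "signed up"),
    ):
        print(req.get_method(), req.full_url)
```

Keeping the request builders separate from `send` means the URLs and bodies can be inspected without a server; calling `send(...)` on any of them performs the actual HTTP call.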

To learn more about Elasticsearch and the REST APIs it uses, have a look at the Python client’s API documentation: http://elasticsearch-py.readthedocs.org/en/latest/api.html

There are also some free tools available online that let you manage and visualize the data in Elasticsearch, so you can see the active shards, the volume of data in your index and everything else you need to know. Elasticsearch-head is one such tool.

One problem that we did face with Elasticsearch is that it requires a lot of disk space. This is something you need to keep an eye on: because Elasticsearch maintains multiple copies of the data in order to perform an efficient full-text search, it also takes up a lot of disk space. So it’s a tradeoff between efficiency and storage.
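One part of that footprint is easy to estimate up front. Each replica configured for an index is a full copy of its primary shards, so the stored size grows with the replica count. The helper below is a back-of-the-envelope sketch under that single assumption; the function name and the sizes are made up for illustration, and it ignores other factors that also affect real disk usage, such as index structures and compression.

```python
def estimated_store_size_gb(primary_size_gb, number_of_replicas):
    """Estimate total on-disk size when every replica is a full copy of the primaries."""
    if primary_size_gb < 0 or number_of_replicas < 0:
        raise ValueError("size and replica count must be non-negative")
    return primary_size_gb * (1 + number_of_replicas)


if __name__ == "__main__":
    # Hypothetical numbers: 50 GB of primary data with one replica per shard.
    print(estimated_store_size_gb(50, 1))  # prints 100
```

Lowering the replica count saves disk but also reduces redundancy, which is another face of the same tradeoff.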

About Folio3

As a leading software development company, we specialize in developing enterprise and consumer oriented web applications and websites. We also provide website and web application UI and UX design services. If you have a website development or web application development project that you’d like to discuss or would like to know more about our web development expertise, please Contact Us
