Posted by Abel Avram on Dec 29, 2011
LinkedIn has open sourced IndexTank, a document indexing engine that runs in the cloud and lets users customize the indexing process and tweak the results.
IndexTank launched about a year ago, was acquired by LinkedIn in October, and was recently open sourced. It is a cloud service similar to Google Custom Search that runs on top of Amazon Web Services and lets websites index their own content so that their visitors can search it. IndexTank claims its users have complete control over what is indexed, when, and how the results are sorted. That means a website can promote the documents it prefers to the top of the search results rather than relying on Google’s search algorithm.
Unlike many search services, IndexTank does not crawl web pages in order to index them; instead, websites send the data to be indexed to the indexing engine. As a result, a document can be indexed right after its creation, providing live results. The service is also ad-free.
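The push model and site-controlled ordering described above can be illustrated with a minimal sketch. This is not IndexTank's API: the `PushIndex` class, its method names, and the `boost` parameter are all hypothetical, and the index is a toy in-memory structure. It only shows the idea that the website sends each document to the index when it is created, and decides which results appear first.

```python
from collections import defaultdict


class PushIndex:
    """Toy in-memory inverted index (hypothetical, not IndexTank's API)."""

    def __init__(self):
        self.docs = {}                    # doc_id -> (text, boost)
        self.postings = defaultdict(set)  # term -> ids of docs containing it

    def add_document(self, doc_id, text, boost=0):
        """Called by the website itself; no crawler is involved."""
        self.delete_document(doc_id)      # re-adding replaces the old version
        self.docs[doc_id] = (text, boost)
        for term in text.lower().split():
            self.postings[term].add(doc_id)

    def delete_document(self, doc_id):
        """Remove a document and its postings, if it is present."""
        if doc_id in self.docs:
            text, _ = self.docs.pop(doc_id)
            for term in text.lower().split():
                self.postings[term].discard(doc_id)

    def search(self, query):
        """Return ids of documents containing every query term."""
        terms = query.lower().split()
        if not terms:
            return []
        hits = set.intersection(*(self.postings.get(t, set()) for t in terms))
        # Site-controlled ordering: higher boost first, ties broken by id.
        return sorted(hits, key=lambda d: (-self.docs[d][1], d))


index = PushIndex()
index.add_document("a", "cloud search engine", boost=1)
index.add_document("b", "cloud search service", boost=5)
print(index.search("cloud search"))  # ['b', 'a']
```

A document becomes searchable as soon as `add_document` returns, which is the "live results" property the push model provides. A real service would add tokenization, persistence, and a network API on top of this.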
IndexTank has three main components:
IndexTank joins Zoie, a real-time search engine built on Apache Lucene, and open sourced by LinkedIn in 2008.
IndexTank claims to have attracted thousands of customers in one year, the most notable being Reddit, but the company was not yet profitable when it was acquired by LinkedIn.
The source code of IndexTank is available on GitHub: Index Engine, and API plus Nebulizer.