QCon SF Keynote: Techie VC's Talk About Trends & Opportunities
Kevin Efrusy and Salil Deshpande talk about what makes a business successful or not, presenting three actual cases they have been involved with: Hyperic, G2One, SpringSource.
Tracking change and innovation in the enterprise software development community
Posted by Abel Avram on Aug 16, 2008
In this presentation, Jinesh Varia, a Web Services Evangelist at Amazon, talks about the architecture of one of Amazon's web services called Alexa. Jinesh explains how Amazon has reached scalability, performance and reduced costs for the Alexa service.
Watch: Jinesh Varia About Amazon's Alexa Web Service (43 min)
The Alexa Web Service, backed by an application called internally as GrepTheWeb, gathers various information about web sites including traffic data, contact information, and more. The collected data is then made available to clients which can run specialized queries against it in order to find specific information.
Jinesh explains that GrepTheWeb uses Hadoop, a free Java software platform which can be used to run applications processing vast amounts of data which, in this case, are stored on Amazon's Simple Storage Service (S3), and are retrieved by Hadoop clusters when a client request is processed. Finally a result is returned to the customer. Hadoop runs inside Amazon's Elastic Compute Cloud (EC2).
The whole architecture is in a cloud whose internals are completely hidden from the service customer. When a request is issued, an entire framework is built on as many machines as is necessary in order to process it and generate a result, then the whole framework disappears. The cloud architecture makes the whole service highly scalable. By being able to extend it on theoretically unlimited number of nodes, the service has good performance. Since the entire service support is created on the fly and exists only while processing a request, the costs are low.
One of the main features of the Alexa's architecture is fault tolerance. The data is duplicated and stored on physically different locations to avoid data loss, and Hadoop takes care of spawning and controlling as many processes as necessary to process the large amounts of data involved.
Free $40 SOA Demystified Book Offer
Agile Development: A Manager's Roadmap for Success
JBoss versus IBM WebSphere: Cost, Performance, Efficiency, Innovation (IBM wins)
CollabNet presents Agile ALM virtual conference, April 15, 2010
Kevin Efrusy and Salil Deshpande talk about what makes a business successful or not, presenting three actual cases they have been involved with: Hyperic, G2One, SpringSource.
InfoQ talks to Mark Fisher, project lead for the Spring Integration project, about the framework.
Peter Lubbers explains in this article how HTML5 Web Sockets interact with proxy servers, and what proxy configuration or updates are needed for the Web Sockets traffic to go through.
Neal Ford shows what ThoughtWorks learned from scaling Rails development: infrastructure, testing, messaging, optimization, performance.
Stuart Halloway discusses Clojure and functional programing on the JVM in depth, and touches on the uses of a number of other modern JVM languages including JRuby, Groovy, Scala and Haskell.
Oren Teich and Blake Mizerany talk about the technology behind Heroku and the benefits of the new add-on system.
Chris Riley presents security issues threatening service based systems, examining security threats, presenting measures to reduce the risks, and mentioning available security frameworks.
This talk investigates technical issues encountered when moving to an Agile process.
2 comments
Watch Thread Reply