
Google 'simplifies web development' with AppEngine

by Geoffrey Wiseman on Apr 28, 2008

At Campfire One on April 7th, 2008, Google introduced Google App Engine as a way to simplify the job of creating, running and scaling web applications, to make it 'easy.' In essence, Google App Engine allows you to build web applications locally and then deploy them on Google's infrastructure.

This is a preview release: it is not feature complete, and a quota system limits the storage, CPU and bandwidth that applications can use for free during the preview period. Once the preview period is over, that quota will remain free, but developers will be able to purchase additional resources as needed. The cost for additional resources has not yet been shared (and possibly not even established).

The quotas in the preview release included: 3 apps per developer, 500MB storage per app, and per day (rolling 24 hour) quotas of 2000 emails, 10 GB bandwidth in, 10 GB bandwidth out, 200M CPU Megacycles, 650k HTTP Requests, 2.5M datastore API calls and 160k URLFetch API calls.
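To put the daily figures in perspective, the following is a small back-of-the-envelope sketch that converts the quotas listed above into average sustained rates. The helper name `average_per_second` is made up for illustration, and the bandwidth line assumes 1 GB means 2**30 bytes, which the source does not specify.

```python
# Daily quota figures quoted above for the preview release.
SECONDS_PER_DAY = 24 * 60 * 60

daily_quotas = {
    "http_requests": 650_000,
    "datastore_calls": 2_500_000,
    "urlfetch_calls": 160_000,
    "emails": 2_000,
}


def average_per_second(daily_limit):
    """Average sustained rate that would exactly use up a daily limit."""
    return daily_limit / SECONDS_PER_DAY


for name, limit in daily_quotas.items():
    print(f"{name}: {average_per_second(limit):.2f} per second")

# Average datastore calls available per HTTP request if both quotas are fully used.
calls_per_request = daily_quotas["datastore_calls"] / daily_quotas["http_requests"]
print(f"datastore calls per request: {calls_per_request:.1f}")

# Average outbound bytes per request (assumption: 1 GB = 2**30 bytes).
bytes_out_per_day = 10 * 2**30
print(f"bytes out per request: {bytes_out_per_day / daily_quotas['http_requests']:.0f}")
```

The HTTP request quota works out to roughly 7.5 requests per second averaged over a day, so the free tier is sized for small applications rather than sustained heavy traffic.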

Technology: Development Environment and APIs

The technology stack is currently based on Python, one of Google's sanctioned languages, although Google says that they 'look forward to supporting more languages in the future.' Google offers a Python runtime environment that runs in a secure sandbox which provides limited access to the underlying operating system, for the purposes of security and scale. That environment includes the standard library and can be extended through modules as long as they don't employ C:

The environment includes the Python standard library. Of course, calling a library method that violates a sandbox restriction, such as attempting to open a socket or write to a file, will not succeed. For convenience, several modules in the standard library whose core features are not supported by the runtime environment have been disabled, and code that imports them will raise an error.

Application code must be written exclusively in Python. Code with extensions written in C is not supported.

Other security limitations include outbound communication only through the supplied email and URL fetch APIs, inbound communication over HTTP and HTTPS on the standard ports, no filesystem write access and no sub-processes or code execution outside the request-response loop (e.g. background and batch processing).

In addition, Google offers APIs to access a Datastore, Google user accounts, URL fetch and email services. App Engine also includes a simplified web application framework and Django 0.96.1, although the App Engine Datastore is not relational, and can't be used with all Django APIs.
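As a rough illustration of code that lives entirely inside the request-response loop, here is a minimal sketch of a WSGI-style handler. It uses only the standard library's `wsgiref` and is not App Engine's bundled framework. The names `application` and `call_app` are hypothetical, and the assumption that a handler looks like a WSGI callable is mine rather than the article's.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """Handle one request: no sockets, no file writes, no background work."""
    body = b"Hello from a request handler\n"
    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


def call_app(app, path="/"):
    """Invoke a WSGI callable once with a fake request and return (status, body)."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the required WSGI keys
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], body


status, body = call_app(application)
print(status, body)
```

Everything the handler does happens between receiving the request and returning the response, which is the shape of program the sandbox restrictions above allow.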

The datastore API is backed by Google's BigTable, but has a lot in common with a simple object persistence API (or an object-relational mapping framework, even though Google takes care to point out that the datastore isn't relational):

For most of you, working with the Datastore will probably take a little getting used to: as I've said, it's not SQL. That's a big difference. However, we think that after a while, the Datastore may actually grow on you, because it makes some things easier. For one thing, our datastore is schema-less, meaning it can support arbitrary new properties or columns, which you can create as you code, without having to design everything up front and create a schema. This comes back to our goal of making writing a web app as easy as possible: just start coding. Your data model can evolve along with your app.

Even though the Datastore is a departure from SQL, we still support a lot of powerful functionality that you usually expect from a traditional database. The Datastore supports efficient queries on any single property or set of properties you provide. It supports sort orderings on your query results, including sort orders on multiple properties. It supports transactions for writes, with transactional groupings that you control. It supports batch operations for fetching or creating a large number of entities. It optionally allows you to control the primary key of your entities, for more efficient queries and shorter URLs.

And, even though the Datastore is not SQL, we're providing you with a SQL-like query language, called GQL, to make it easier to formulate queries. GQL is in the spirit of jQuery and FBQL: the underlying store is not SQL, but nearly all of the queries that you'd like to do can still be accomplished.

One big feature that you may have noticed that our Datastore doesn't have, though, is joins. The reason for this is that joins are usually a source of performance problems in a distributed system, when you go beyond a single machine: it's much harder to efficiently support a join on a distributed system that spans many computers and many hard disks.
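The schema-less model and property queries described in the quote can be illustrated with a toy in-memory store. This is only a sketch in plain Python: `ToyDatastore`, `put` and `query` are invented names, not the App Engine datastore API, and nothing here is distributed or persistent.

```python
class ToyDatastore:
    """In-memory stand-in for a schema-less entity store (not the real API)."""

    def __init__(self):
        self._entities = []

    def put(self, kind, **properties):
        # No schema: each entity carries whatever properties it is given.
        entity = dict(properties, kind=kind)
        self._entities.append(entity)
        return entity

    def query(self, kind, filters=None, order_by=None, limit=None):
        """Roughly what 'SELECT * FROM kind WHERE ... ORDER BY ... LIMIT n' expresses."""
        filters = filters or {}
        results = [
            e for e in self._entities
            if e["kind"] == kind
            and all(e.get(name) == value for name, value in filters.items())
        ]
        if order_by is not None:
            # Entities that lack the sort property are left out of sorted results.
            results = [e for e in results if order_by in e]
            results.sort(key=lambda e: e[order_by])
        return results[:limit]


store = ToyDatastore()
store.put("Greeting", author="alice", content="hello", score=2)
store.put("Greeting", author="bob", content="hi", score=1)
# A new property appears on one entity without any schema change.
store.put("Greeting", author="alice", content="again", score=3, pinned=True)

for greeting in store.query("Greeting", filters={"author": "alice"}, order_by="score"):
    print(greeting["content"])
```

Note that the sketch offers single-kind filters and sorting but nothing resembling a join, which matches the trade-off the quote describes.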

Although the datastore API supports transactions, they have strict limits and are tied to entity groups:

Every entity belongs to an entity group, a set of one or more entities that can be manipulated in a single transaction. Entity group relationships tell App Engine to store several entities in the same part of the distributed network. A transaction sets up datastore operations for an entity group, and all of the operations are applied as a group, or not at all if the transaction fails.

When the application creates an entity, it can assign another entity as the parent of the new entity. Assigning a parent to a new entity puts the new entity in the same entity group as the parent entity.

An entity without a parent is a root entity. An entity that is a parent for another entity can also have a parent. A chain of parent entities from an entity up to the root is the path for the entity, and members of the path are the entity's ancestors. The parent of an entity is defined when the entity is created, and cannot be changed later.

Every entity with a given root entity as an ancestor is in the same entity group. All entities in a group are stored in the same datastore node. A single transaction can modify multiple entities in a single group, or add new entities to the group by making the new entity's parent an existing entity in the group.
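The parent, path and entity-group rules quoted above can be modelled in a few lines of plain Python. This is a toy sketch, not the datastore API: `Entity` and `same_entity_group` are hypothetical names chosen for illustration.

```python
class Entity:
    """Toy entity whose parent is fixed at creation, mirroring the quoted rules."""

    def __init__(self, name, parent=None):
        self.name = name
        self._parent = parent  # no setter: the parent cannot be changed later

    @property
    def parent(self):
        return self._parent

    def path(self):
        """Chain of entities from the root entity down to this one."""
        chain = []
        node = self
        while node is not None:
            chain.append(node)
            node = node.parent
        return list(reversed(chain))

    def root(self):
        return self.path()[0]


def same_entity_group(*entities):
    """True when all entities share one root, so one transaction could cover them."""
    roots = {id(e.root()) for e in entities}
    return len(roots) == 1


blog = Entity("blog")                     # root entity (no parent)
post = Entity("post", parent=blog)        # joins blog's entity group
comment = Entity("comment", parent=post)  # ancestors: blog, post
other = Entity("other_blog")              # a separate root, separate group

print([e.name for e in comment.path()])
print(same_entity_group(blog, post, comment))
print(same_entity_group(comment, other))
```

In this model a transaction touching `post` and `comment` would be allowed because they share the root `blog`, while one touching `comment` and `other` would not.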

Because App Engine forces you to approach your development in a particular way (e.g. the Datastore on BigTable instead of a relational database), Google argues that your application will be easier to scale and can scale nearly transparently:

When a web app surges in popularity, the sudden increase in traffic can be overwhelming for applications of all sizes, from startups to large companies that find themselves rearchitecting their databases and entire systems several times a year. With automatic replication and load balancing, Google App Engine makes it easier to scale from one user to one million by taking advantage of Bigtable and other components of Google's scalable infrastructure.

The User API allows for user authentication / login via Google Account, and access to the account's nickname and email. Any further user information could be gathered directly from the user by the application and stored in the datastore.

The URL fetch API allows for retrieval of information from remote servers by fetching HTTP and HTTPS URLs (it supports GET, POST, HEAD, PUT and DELETE, so it appears capable of supporting REST-style interactions).

The Mail API allows for App Engine applications to send email asynchronously with retries if the mail server is unavailable.

The App Engine SDK includes a server to simulate the App Engine Python runtime environment, and:

  • reproduces the module import restrictions, and only allows handlers to import an allowed module from the standard library, the third-party libraries included in the App Engine Python environment, and modules in the application directory
  • reproduces the app caching behavior
  • emulates the App Engine datastore using local files
  • emulates Google Accounts with sign-in and sign-out pages that accept any email address
  • emulates the URL fetch service by fetching URLs directly from your computer
  • emulates the mail service using an SMTP server or Sendmail configuration of your choice

At first glance, most of the application configuration seems to be done in YAML.

Motive and Competition

Google's announcement describes their motives, to make it easier to build, deploy and scale out web applications:

Well, we built App Engine because we want more web apps to get created. What we noticed is that, today, it's pretty hard to create one: there are significant upfront challenges to deploying even the simplest of web applications. You've got a lot of tasks to do. First, you have to write the code for your app, of course.

But then, you also have to write your Apache web server configs and startup scripts, set up your SQL database, create all of its tables and hook up the passwords, set up monitoring so you can tell what's going on with your traffic and logs, decide how you'll push new versions of your code, and on, and on.

That's the technical setup challenge that we noticed. And then, once you've done all that sysadmin work, you have another challenge: you have to actually go find machines you can use somewhere, physically or from a virtual provider, to run your app somewhere. Right now, that costs money: even for the smallest app, which you use a few times a week, you have to pay a pretty big upfront fee to run that app with a traditional hosting provider.

So that's the financial or physical challenge. And then, once you've got the whole thing set up and working, and found and paid for a place to test it out, you've got another challenge: you've got to maintain it all as your app grows. Your machines crash, your configs have errors, your hard disks break, your traffic starts to grow, you have to re-shard your databases, set up more machines and on. Keeping everything going as your app grows is a hassle.

All of these hassles are what we're trying to abstract away with App Engine. They are the problems that we're trying to fix.

Others are already speculating about additional motives. Many point out potential competition with Amazon and Microsoft over the future of cloud computing and web services, often comparing App Engine to Amazon's web services EC2, S3, SQS and SimpleDB:

  • O'Reilly Radar said:

    After Amazon Web Services started doing so well we all knew it was just a matter of time (next will be Microsoft, we can safely assume). Though the obvious comparison is to AWS, they aren't really the same beast. Amazon has released a set of disparate services that can be used to create a general computing platform. The services, though they work together, do not come bundled.

    App Engine on the other hand is almost literally an engine for powering web applications. It bundles together many of the features that AWS offers into a singular package: storage like S3, auto-scaling and processing power like EC2, and a datastore like SimpleDB. App Engine also offers things that are not available on AWS like a Python runtime, Google-specific APIs and perhaps most notably a free portion of the service.

  • VentureBeat: "Google App Engine readies for brawl with Amazon"

Others suggest that Microsoft is heading in this direction as well with things like Ray Ozzie's Mesh strategy and SQL Server Data Services, but may already be too late.

Looking at another angle, some suggest that this could give Google a head-start on acquisitions, a form of venture infrastructure:

  • Business Week argued that the competition between Google and Amazon misses the point, that encouraging startups to develop their applications in Google's infrastructure gives Google "not only good visibility into the kinds of applications people want and the problems it may need to overcome with them, but also a bird's-eye view into the most promising new startups it might want to acquire".
  • ZDNet added that it could save Google money on acquisitions: "imagine how much time and effort could be saved if a company purchased by Google already uses Google's technology?"
  • GigaOM says, "This type of loss-leader service gets startups in the door with Google, giving the company access to the freshest ideas and an entrepreneurial talent pool that it can tap."
  • In "How Google Can Eat Amazon's Lunch," Kevin Kelleher calls this investing:

    In the interview I speculated aloud that what Amazon was doing was a lot like what corporate VC arms like Intel Capital do — invest in startups with which they will work — or buy — later on. Only instead of using hard cash, they were using infrastructure. Very shrewd, I said.

    The executive's response was that Amazon was not doing that at all, and that it would never do that with web services. I thought but didn't say: Well, if you don't do it someone else will.

    Now some pig is saying that Google is doing it. As valued Google workers pack up their desks and launch new startups, this is the single best strategy for Google to bring them back into the fold. And it's a great way to pull the rug out from under Amazon, strategy-wise and profit-wise.

Feedback, Analysis and Resources


User Base by Michael Prescott

One topic I haven't seen in the commentary is the connection with Google's user service. This seems one of the more insidious aspects of dependency on the GAE infrastructure - if you use Google's 'Users' API, then they actually have your user base.

Bend over for Google... by Jason Carreira

I can't understand who'd want this... With Amazon Web Services, you get a VM hosting service where you can run any AMI.. It can be anything that runs on Linux... it can be multiple services configured to work together including off-the-shelf software. As a software vendor, you get another channel for selling your solutions. You can set up and manage email servers, databases, etc. Anything that runs on Linux.

With Google, you get to rewrite anything you've written... in Python this time (yuck)... and write to Google's proprietary APIs. Sure, you can use Django on the front end (still Python, yuck), but on the backend it needs to use Google's data store API. You can't take off-the-shelf software and run it, you can't use it as a reseller channel for selling your existing software. Everything has to be built from scratch, and it's just web apps, nothing else.

With Amazon, if you get sick of their service or they raise their rates too high, you take your VM's and run them somewhere else. There's tools to convert them to other VM formats, including VMWare. It's just a Linux install, after all.

With Google, if you get sick of their service or they raise their rates too high... You don't really have an option.... You're tied to their API's. You're stuck on their service unless you rewrite.

I'm sure people will play around with this, and I'm sure some startups will be foolish enough to tie themselves into it, but consider yourself warned.

I also like my wife's take over at Profy.com on how it's basically another "me too" offering from Google...

This marks an important new trend in cloud computing I think by Floyd Marinescu

I had a feeling Google was going to do this ever since they bought JOT, which was trying to do something similar (application hosting), but I'm surprised they are starting with Python and not Java.

I think Google App engine is the beginning of a whole new category of cloud computing offerings, making the total set in my view loosely similar to the following:
  • Grid or Master/Worker implementation clusters. This is the traditional view of cloud computing, where you just have a master/worker type programming model with the workers being executed transparently across a grid of computers. This type of stuff is happening behind the firewall; I don't know of any internet/publicly exposed services that do this.

  • The internet becoming a new middleware platform. This is where Amazon, Microsoft, and Yahoo are playing. Middleware APIs/products that previously were installed in the datacenter and charged on a license fee basis are now moving online with web service accessible APIs, charged on a utility computing model. Things like compute (Amazon EC2), storage (S3, Microsoft SQL Data Services), queuing, and domain-specific data sources (the realm of mashups and RSS) are part of this 'trend'. The internet is starting to provide middleware building blocks that are mashed up to build applications. Users can see massive cost savings compared to buying similar tools and maintaining them themselves in the data center.

  • The internet as an application hosting platform. So whereas the previous trend is about exposing middleware constructs on the net, this trend is about exposing a whole end-to-end application stack with APIs for everything from MVC/web to messaging to data storage (entities) built in. As a developer now you don't even need to think about things such as scalability, about data storage format, etc. You just deploy your app to this 'platform cloud' and it just works. Google's cloud computing offering is thus a higher-up-the-stack offering compared to Amazon's.

    I'm very interested to see how all this plays out in a few years... Jason's comments about vendor-lockin are quite important as well...

    Re: Bend over for Google... by Geoffrey Wiseman

    Well, the low barrier to entry (e.g. cost, in dollars) is probably one of the more compelling things to many potential candidates. Amazon's services aren't terribly priced, but they do still add up.

    There's definitely a pretty strong lock-in model here, deliberate or not. You've gotta be willing to accept the App Engine restrictions, and trust Google not to pull the rug out from under you -- and that's a lot of trust. That said, if you're willing to go that far, you get an app that may scale with far less effort than something you assemble yourself on S3, EC2 and SimpleDB, where you have to configure your own images, monitor your own systems for load and provision new instances yourself.

    Don't get me wrong -- Amazon's model has a lot going for it, particularly if you've already got an application developed or you're not willing to live within the App Engine restrictions. But at the same time, I can see there are some advantages to what Google offers -- and if that model were adopted and standardized such that there were more than one provider of App Engine grid space, I think you'd find that people find the simplicity appealing.

    Interesting Stuff by Kevin Teague

    Free, easy Python web application hosting? Wow. This is likely to give a very large boost to Python for web apps, especially for people just learning the craft who may have instead chosen PHP because of its advantages of cost and ease of deployment.


    "For one thing, our datastore is schema-less, meaning it can support arbitrary new properties or columns, which you can create as you code, without having to design everything up front and create a schema."


    I suppose the hardcore DBA guys are gonna have another fit as people write Google web apps where data persistence is just treated as a "big hash". Although we've been doing schemaless data persistence in Python for over a decade now with the ZODB, and there are all sorts of pain you can get yourself into when you "create as you code, without having to ... create a schema", it can also be quite nice for prototyping. Packages such as zope.schema are very nice for maintaining sanity without losing some of the advantages possible in a schemaless persistence system though.

    BTW, InfoQ should have a Python Community section :)

    Additional Analysis by Geoffrey Wiseman

    Some more analysis from this evening:

    Re: Additional Analysis by Geoffrey Wiseman

    Dave Winer's take deserved a quote of its own:
    Now, what Google announced is really exciting! I'm not kidding. It's even better than I hoped. Yes, it's only Python, but IBM's PC-DOS was only BASIC and Pascal when it first came out, and it didn't matter. Yeah, I preferred C, but I coded in Pascal because that's what you had to do to get an app running. What you're going to see here that you've never seen before is shrinkwrap net apps that scale that can be deployed by civilians. That's a mouthful, but that's what's coming. Why? Because here is a standardized platform that can be stamped out in the billions of units. Maybe Google can't do it, but the perception is that they can. Who is willing to stand up and say Google hasn't nailed scaling? What PCs did in the 80s, Google is doing now. PCs took the black magic out of owning a computer. Now Google is taking the black magic out of operating a scalable web app. Python is the new BASIC.

    Re: Additional Analysis by Jason Carreira

    Dave Winer is, as always, clueless.

    Re: Additional Analysis by Nati Shalom

    I'd also suggest taking a look at the following article: What cloud computing really means

    It seems that the trend of offering proprietary platforms is becoming common among cloud providers:


    Platform as a service
    Another SaaS variation, this form of cloud computing delivers development environments as a service. You build your own applications that run on the provider's infrastructure and are delivered to your users via the Internet from the provider's servers. Like Legos, these services are constrained by the vendor's design and capabilities, so you don't get complete freedom, but you do get predictability and pre-integration. Prime examples include Salesforce.com's Force.com and Coghead. For extremely lightweight development, cloud-based mashup platforms abound, such as Yahoo Pipes or Dapper.net.


    What does that mean for those that wrote their application in Java or JEE?

    I tend to agree with Jason Carreira that EC2 offers more flexibility in that area, which will enable existing applications to run in the cloud.

    I see an interesting ecosystem being built around their offering that aims to cover the development platform question as well as management, monitoring and the other aspects that one would need to deal with when it comes to deploying applications on the cloud.



    Nati S.

    GigaSpaces

    Re: Additional Analysis by Jacques du Preez

    App Engine looks like an option worth exploring. I am however irritated by Google's lack of consistency: I mean they started with Java with Google Web Toolkit, so why not continue along that line? Then vendors at least know that when they come to Google, Java will be a primary language. Small companies, partnering with Google, can't just hire new Python developers every time Google decides to use a new language.



    The other thing that is a source of unease is how the actual development process and environment is supported? I mean it's not like I can take my app off-line to test and debug when it depends on Google Datastore.



    I believe however that App Engine's advantage over Amazon's is that it provides a complete package to serve web apps, at the cost of closer coupling & less control. Amazon gives more control, but requires better technical ability.

    Re: Python and InfoQ by Zeev B

    Pedro,
    I totally agree with you. InfoQ has been ignoring Python's success and popularity for a long time now (although there were some exceptions). It seems that they were heavily influenced by the Ruby-on-Rails hype that flourished in the Java camp some time ago. I hope this will change in the near future now that Sun have hired two prominent Python/Jython developers and that Python is gaining a lot of public attention.
    Don't get me wrong - I'm a regular reader of this site and I like the content but as a veteran Java programmer and a recent Python enthusiast I'm a bit disappointed.

    Ze'ev

    Re: Additional Analysis by Jacques du Preez

    I stand to be corrected on my 2nd concern: Google has released an open source Development Web Server that mimics the actual App Engine server environment.




    The App Engine Dev Web Server is licensed under the Apache License 2.0.

    Re: This marks an important new trend in cloud computing I think by Kurt Christensen

    I think in five years we may look back and see that this was a major turning point against the widespread use of Java for web app development. It seems to me that Google is saying to the world: "for web app development, we - Google, internally - view the Python technology stack as being so superior to the Java technology stack in terms of individual programmer productivity, that we're not even offering you an option". Starting now, if you want to develop a web app using a Java technology stack, more than ever before will you be required to explain what you understand about web app development that Google does not.



    Now if they would just release a version that supported Lisp... :-)

    Re: This marks an important new trend in cloud computing I think by Jason Carreira

    Funny that they, Google, use Java for a lot of their most popular web applications. Maybe they just thought Java people were too smart to fall for this.

    Re: Bend over for Google... by Cyndy Aleo-Carreira

    Nothing in life is free. The EULA for GAE is frightening. Google retains the right to take down your app at any time for any reason as they see fit. And that's just the tip of the iceberg.

    Re: Bend over for Google... by Geoffrey Wiseman

    Yes and no - that scares me precisely because of the vendor-lock-in. I mean, I wouldn't necessarily be shocked to have an ISP reserve that kind of right -- only I would retain the option of hosting it somewhere else. With GAE, if Google decides they want to take down your app, what's your next option?

    That said, thus far, Google doesn't seem like the kind of company to abuse that kind of power. Still, it's a lot of trust to put in one company, and a lot of power to cede.

    Re: Python and InfoQ by Ryan Slobojan

    How to fit in Python (amongst other interesting technologies/languages/concepts) is actually an issue we have been struggling with on the internal editorial mailing lists for many months now - Jython and IronPython stories have obvious homes, but the rest don't really fit into our set of communities. InfoQ is formed around the idea of communities - self-identifying, good-size communities of people such as Agile, Java and .Net people. Being able to start up an entire new queue requires quite a bit of effort and support, and my own personal take (not speaking for InfoQ here, speaking as an individual) is that Python isn't yet at the critical mass where we can consider it a good-size, self-identifying community of developers. I also personally don't feel that there's enough news, events and advances in the Python arena to keep a team of editors busy week after week after week. Python's interesting, but so are a whole lot of other things...

    Step in the right direction .. by Sony Mathew

    I think it's a step up in the right direction. Some of the concerns of privacy and control will need to be resolved eventually - which I have no doubt they will be - these are not new problems. Not sure about Python - I expect another language to be adopted before App Engine gets real success.

    A Web Proxy based on Google Appengine by Lin Shilai

    I have developed a web proxy based on Google App Engine. Can you give me some suggestions?
    Apollo Web Proxy - quick-proxy.appspot.com

    InfoQ.com and all content copyright © 2006-2013 C4Media Inc. InfoQ.com hosted at Contegix, the best ISP we've ever worked with.