Key Takeaway Points and Lessons Learned from QCon London 2013

Apr 05, 2013

Going into its seventh year, QCon London 2013 featured thought-provoking and engaging keynotes from MIT’s Barbara Liskov, Perl Boffin Damian Conway, as well as Wiki Creator Ward Cunningham, and Greg Young.

Over 1,100 team leads, architects, and project managers attended 75 technical sessions across 6 concurrent tracks, 12 in-depth tutorials, and facilitated open spaces, and, for the first time this year, had instant access to all filmed presentations from the event on InfoQ.

This article summarizes the key takeaways and highlights from QCon London 2013 as blogged and tweeted by attendees. Over the course of the coming months, InfoQ will be publishing most of the conference sessions online, including video interviews that were recorded by the InfoQ editorial team. The publishing schedule can be found on the QCon London web site.

You can also see numerous attendee-taken photos of QCon on Flickr.

Keynotes

The power of abstraction by Barbara Liskov
Fun With Dead Languages by Damian Conway
8 Lines of Code by Greg Young
Instantly Better Presentations by Damian Conway
A Forward Look at Federated Wiki by Ward Cunningham

Distributed Systems / REST

A Platform for all that we know by Savas Parastatidis
No Link Left Behind by Paul Downey
Road to REST by Rickard Oberg
HTTP/2.0: Challenges and Opportunities by Mark Nottingham

The Java Developer Track

The Java EE 7 Platform: Higher Productivity & Embracing HTML 5 by Arun Gupta
Garbage Collection - The Useful Parts by Martijn Verburg

The Developer Track

Web Development: You're Doing it Wrong by Stefan Tilkov
How to rescue our kids: fixing the ICT crisis at school by Simon Peyton Jones
You are not a software developer! - Simplicity in practice by Russell Miles
Performance Testing Java Applications by Martin Thompson

Building for Clouds

Clouds in Government - Perils of Portability by Gareth Rushgrove
Extending CloudFoundry with new Services by Chris Hedley, Andrew Crump
Racing Thru the Last Mile: Cloud Delivery Web-Scale Deployment by Alex Papadimoulis

Real Startups

How to turn startup ideas into reality by taking money from strangers by Ian Brookes

Creative Thinking & Visual Problem-solving

Ideas, not Art: Drawing Out Solutions by Heather Willems
Machine Me by Fernando Orellana

Handheld Banking

Put a UI Developer in a Bank; See what happens by Horia Dragomir
Testing iOS Apps by Graham Lee
The Future of Mobile Banking by Michael Nuciforo

Building Web Apis: Opening & Linking Your Data

Introducing the BBC's Linked Data Platform and APIs by David Rogers
The Why, What and How of Open Data by Jeni Tennison
Building APIs by building on APIs by Paul Downey, David Heath
Building Hypermedia APIs with HTML by Jon Moore
Generic Hypermedia and Domain-Specific APIs: RESTing in the ALPS by Mike Amundsen

Schadenfreude - War Stories

The inevitability of failure by Dave Cliff
Painful success - lessons learned while scaling up by Jesper Richter-Reichhelm

Architectural Hangover Cure

Deleting Code at Nokia by Tom Coupland

Agile in Actuality: Stories from the Front Line

People over Process: Applying it in real world software development by Glen Ford
Climbing out of a crisis loop: How a critical BBC back-end team reigned in a workflow crisis-to-crisis cycle by Rafiq Gemmail, Katherine Kirk
Between Fluffy Bunnies and Command & Control: Agile Adoption in Practice by Benjamin Mitchell
Accelerating Agile: hyper-performing without the hype by Dan North
Yanking business into testing - with lots of vegetables by Gojko Adzic, Lukas Oberhuber

Next Generation Mobile Apps

New capabilities of HTML5 browsers by Maximiliano Firtman
Architecting PhoneGap Applications by Christophe Coenraets

Finance (Design & Architecture)

High Performance Messaging for Web-Based Trading Systems by Frank Greco
How NOT to Measure Latency by Gil Tene

Architectures of the Small & Beautiful

Startup Architecture: how to lean on others to get stuff done by Robbie Clutton
Inside Lanyrd's Architecture by Andrew Godwin
Green shoots in the brownest field: Being a startup in Government by Mat Wall
How we scaled Songkick for more traffic and more productive development by Marc Pacheco
Architecture of the Triposo travel guide by Jon Tirsen, Douwe Osinga

The Modern Web Stack

Visualizing Information with HTML5 by Dio Synodinos
Rich HTML/JS applications with knockout.js and no server by Steven Sanderson

Finance, Technology & Implementation

In-Memory Message & Trade repositories by John T Davies
Consumerisation - what does it mean to a developer? by Chris Swan
The technology behind an Equity Trade by John O'Hara

Big Data NoSQL

Big Data: Making Sense of it all! by Jamie Engesser
The Past, Present, and Future of NoSQL by Matt Asay
A little graph theory for the busy developer by Jim Webber
Approximate methods for scalable data mining by Andrew Clegg

Making the Future

Physical Pi by Romilly Cocking, Steve Freeman
Here Comes Wearable Technology! by Rain Ashford

Attracting Great People

Hire Education - making interviews rock by Trisha Gee, Dan North
NoHR Hiring by Martijn Verburg, Zoe Slattery

NoSQL Solutions Track

Moderated NoSQL Panel by Alvin Richards, Chris Molozian, Andrew Elmore, Ian Robinson
Scaling for Humongous amounts of data with MongoDB by Alvin Richards
Becoming Polyglot; Putting Neo4j into production and what happened next by Toby O'Rourke
Eventual Consistency in the Real World by Chris Molozian
Financial Big Data - Loosely Coupled, Highly Structured by Andrew Elmore

Solution Track Thursday 2

Big Data @ Skype by Bryan Dove

Keynotes

The power of abstraction by Barbara Liskov

Alex Blewitt attended this keynote:

Professor Barbara Liskov from MIT opened the conference, covering a historical retrospective on the evolution of programming. … She also described the Liskov Substitution Principle (which I’ve written about before), a name she notes she didn’t coin, as a desirable property that allows subtypes to replace functionally equivalent object types. Of course, as she noted in her presentation, this doesn’t always hold; for example, both a Queue and a Stack have the same signature types but different semantic behaviour, and clearly the substitution rules only apply for those with semantic compatibility.
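The Queue and Stack point can be made concrete with a minimal sketch. The class and function names below are illustrative assumptions, not code from the keynote: both containers expose identical `put` and `get` signatures, yet client code that relies on first-in, first-out ordering gets a different result when a Stack is substituted.

```python
from collections import deque


class Queue:
    """First-in, first-out container."""

    def __init__(self):
        self._items = deque()

    def put(self, item):
        self._items.append(item)

    def get(self):
        return self._items.popleft()


class Stack:
    """Last-in, first-out container with the same method signatures as Queue."""

    def __init__(self):
        self._items = []

    def put(self, item):
        self._items.append(item)

    def get(self):
        return self._items.pop()


def drain_in_arrival_order(container, items):
    """Hypothetical client code that silently relies on FIFO behaviour."""
    for item in items:
        container.put(item)
    return [container.get() for _ in items]
```

Passing a `Queue` returns the items in arrival order, while passing a `Stack` returns them reversed, so matching signatures alone do not make the two types substitutable.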

Twitter feedback on this session included:

@m4tthall: program readability is much more important than program writability - Liskov, so true and often forgotten #QConLondon

@teropa: In a language, it's not the processing power that matters, it's the expressive power #liskov #qconlondon

@andypiper: People still don't know how to handle concurrency *ripple of laughter* #qconlondon

@PopCatalin: #qconlondon Simplicity matters tremendously - Barbara Liskov

@alblue: Polymorphism in 1974 #qconlondon https://t.co/NPZesEdrF5

@alblue: Duck typing in 1974 #qconlondon https://t.co/MYcsTEtX3S

@teropa: Liskov substitution principle explained by Liskov! She calls it common sense #qconlondon

@sbisson: Fascinating keynote by Turing Award winner Barbara Liskov on the history of programming abstraction. #qconlondon

@m4tthall: Liskov - students MIT tend to start with Python rather than Java and C# as they aren't as easy to start with #QConLondon

@andypiper: Liskov - we need a language that meets needs of advanced and beginner users they can grow with #qconlondon

@fauna5: How to get people to remember your talk? Have a live artist creating a mural as you speak #badass #qconlondon http://t.co/VDlrCRHm7Z

@RobertMircea: Great infographic live drawing at #qconlondon keynote by @ImageThink http://t.co/CAJZHzP81s

@jgrodziski: All the links to free copies of articles Barbara Liskov referred to in her talk at #qconlondon 2013 http://t.co/WK0XpLKjUV

Fun With Dead Languages by Damian Conway

Alex Blewitt attended this keynote:

The gist of the presentation was avoiding language monoculture. By way of example, his first slide showed a type of banana that is now extinct, wiped out by a particular disease some time in the past, and he noted that the current crop of bananas, being genetically identical, may suffer the same fate in the future. Applied to languages, and more specifically to developers of those languages, his argument was that working only in a single language necessarily increases the possibility that external changes may make the language redundant or enforce a particular mindset.

His examples of both C++ and Latin were well received; he used operator overloading in C++ to render a text file with a series of lines like state1 ------> state2 (having overloaded the -- operator to return a partial function, and the > operator to apply that function) to represent the source of a state machine. Importantly this also gave the source file the ability to re-order the lines of text in the source file and not change the meaning of it, introducing the point of a position independent language.

The final part of the presentation was exploring that point further, by using the concept of Latin’s expressive tenses to represent whether a variable’s value was being assigned or referred to.

Will Hamill attended this keynote:

Damian demonstrated some diversity in style by using PostScript, a stack-based language (old school!), to write a program that determines the value of Pi by declaring functions and using the stack to store function results, as it has no variables or methods (much like how compiled code actually works deep down in the plumbing, underneath all the abstraction). As a bonus, because the only common use of PostScript these days is in printers, you can still write this program and send its source file to most office printers and have them actually print out increasingly accurate estimates of Pi.

Damian also illustrated translating a C++ implementation of Eratosthenes’ Sieve into Latin (yes, Latin) to show how the need for ordering of variables and method operators could be removed through the use of different inflexions on the nouns being used: when you have a suffix at the end of a variable telling you it is being operated on, and a suffix on another telling you that it is operating on something, then the ordering doesn’t matter any more. This part of the talk was fantastically entertaining and ended with a demonstration of the actual program written in Latin and executing to produce the prime numbers up to CCLV (that’s 255 for us non-Latin types).
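For readers who want to see the algorithm itself, here is a minimal sketch of the Sieve of Eratosthenes with Roman-numeral output. It is ordinary Python, not a reconstruction of the Latin program shown on stage, and the function names are assumptions made for illustration.

```python
ROMAN_SYMBOLS = [
    (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
    (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
    (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
]


def to_roman(number):
    """Convert a positive integer to a Roman numeral string."""
    parts = []
    for value, symbol in ROMAN_SYMBOLS:
        count, number = divmod(number, value)
        parts.append(symbol * count)
    return "".join(parts)


def primes_up_to(limit):
    """Sieve of Eratosthenes: return every prime less than or equal to limit."""
    is_prime = [True] * (limit + 1)
    is_prime[0:2] = [False, False]  # 0 and 1 are not prime
    for n in range(2, int(limit ** 0.5) + 1):
        if is_prime[n]:
            # Cross out every multiple of n, starting at n squared.
            for multiple in range(n * n, limit + 1, n):
                is_prime[multiple] = False
    return [n for n, prime in enumerate(is_prime) if prime]


def roman_primes(limit):
    """Return the primes up to limit, written as Roman numerals."""
    return [to_roman(p) for p in primes_up_to(limit)]
```

Calling `roman_primes(255)` yields the list from II up to CCLI (251), the largest prime not exceeding CCLV.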

Twitter feedback on this session included:

@JamesEdwardsUk: Fun with dead languages by Damien Conway - good start, funny guy! #qconlondon

@timanderson: People who can only code in Java are "sterile clones" says Damian Conway #qconlondon

@h269: At the very entertaining Keynote: Fun With Dead Languages #qconlondon

@janerikcarlsen: #QConLondon Lovely keynote to conclude an awesome day of thought-provoking stuff.

@alblue: Nextstep terminal makes an appearance at #qconlondon https://t.co/Mt3rDunJRE

@CaplinTech: We're learning how to code PostScript. Not entirely sure why, but it's very amusing #qconlondon

@garethr: Now wondering if I want to write postscript or Java for my next project. I blame Damian Conway #qconlondon

@secboffin: Postscript looks odd now, but there are very few conventions to learn. A bit like LISP. #qconlondon

@secboffin: Seeing C++ called a dead language at #qconlondon. Had Stroustrup not nailed it to the perch etc.

@Frank_Scholten: LOL "Instead of a cluster with a java web app use recycled postscript printers. You get persistant storage too" ~ Damian Conway #QConLondon

@sbisson: Damien Conway turning Latin into a programming language at #qconlondon

@dthume: Move over lady lovelace; romans may have been the world's first programmers; Damien Conway at #qconlondon

@Frank_Scholten: Never learned latin in school. You can change the order without changing meaning. So let's program in latin! D. Conway keynote #QConLondon

@dgheath21: #qconlondon http://t.co/gJgKvQNIEn

@Frank_Scholten: #perl $s=count() in #latin becomes countementum Damian Conway's keynote #QConLondon

@pablojimeno: Programming in Latin. Simply amazing Damian Conway #QconLondon http://t.co/XrBac2ZWHP

@secboffin: Hilarious, but shows power of declension in programming. Sieve of Eratosthenes in Latin at #qconlondon. And it runs! http://t.co/doQ1qm9eOG

@alblue: I’ve never seen the sieve of Eratosthenes written in Latin before #qconlondon https://t.co/5f1SGF5sWh

@BlackPepperLtd: Learning to program in Latin at #QConLondon. Interesting keynote about lessons that can be learned from dead programming languages.

@dthume: Programming languages define the way in which programmers think Damien Conway at #qconlondon

@alblue: Probably the best keynote I have ever seen. #qconlondon https://t.co/MIBKT5J7zf

@markhobson: Amazing, keynote defines a Latin-based programming language live on stage. #qconlondon @BlackPepperLtd

8 Lines of Code by Greg Young

Richard Smith attended this session:

The particular example that he picks is a seemingly simple command on an object repository; something like:
[Transactional]
public class DeactivateCommand {
	private readonly ItemRepository repository;

	public DeactivateCommand(ItemRepository repository){
		this.repository = repository;
	}

	public virtual void Deactivate(Item item){
		repository.Deactivate(item);
	}
}
What complexity lurks behind that [Transactional]? If you are using a typical aspect-oriented programming (AOP) framework, a dynamic proxy (i.e. a runtime extension of your class) will be involved to perform the interception and wrapping of methods to implement the transactional behaviour; not only is that complex, difficult to understand and extremely difficult to track down problems with, but it also introduces 'just because' rules to our coding that don't make sense: what happens if we forget to add that virtual (answer: the proxy won't work), or if we return this; from a proxied method (answer: you lose the proxy; this is known as the 'leaky this problem'). How do we explain this to a new team member?

If this command object is instantiated through a dependency injection or IOC container, finding out what it is using as its repository requires looking through magic non-code configuration, too.

Greg says, and I agree with this: "Frameworks have a tendency to introduce magic into my system". They do so in order to hide the complexity involved in solving the complex problem they are designed to address – but we should look at our particular problem and ask whether our problem requires us to solve that one. Can we rephrase our problem to avoid the need for magic? Using the example of dynamic proxies again, we see that the problem they are designed to solve is intercepting method calls with variable parameters; if we control the whole codebase, we can change the problem so that method calls don't have variable parameters, and the AOP framework becomes unnecessary!
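One way to picture the "change the problem" idea is the sketch below. It is not Greg Young's code: the repository class, the dictionary-shaped command and the list used as a transaction log are all hypothetical stand-ins. Once every handler takes exactly one argument, transactional behaviour can be added by plain function composition, and partial application supplies the dependency without a container or a proxy.

```python
from functools import partial


class InMemoryItemRepository:
    """Hypothetical stand-in for the ItemRepository in the excerpt."""

    def __init__(self):
        self.deactivated = []

    def deactivate(self, item_id):
        self.deactivated.append(item_id)


def deactivate(repository, command):
    """Handler with its dependency passed explicitly, then the command."""
    repository.deactivate(command["item_id"])


def transactional(log, handler):
    """Wrap a one-argument handler in begin/commit/rollback steps.

    `log` is an assumed transaction log; a real system would talk to a
    database connection or unit of work here.
    """
    def wrapped(command):
        log.append("begin")
        try:
            handler(command)
        except Exception:
            log.append("rollback")
            raise
        log.append("commit")
    return wrapped


def build_handler(repository, log):
    # Partial application fixes the repository, leaving a function of one
    # argument that a generic wrapper can decorate without any proxying.
    return transactional(log, partial(deactivate, repository))
```

Everything a new team member needs to follow is visible in `build_handler`: which repository is used, and that the call is wrapped in a transaction.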

Mike Salsbury attended this session:

This was a talk about Simplicity and Magic. Frameworks contain magic, and IoC is like magic. The problem is that the more magic there is in your code the harder it is for anyone new to ramp up and be able to contribute. So you are only able to hire people who already know how to do magic.

It is much easier (and more useful) to explain Composition to a junior than the magic of Dynamic Proxies. Along the way we also considered whether single-method interfaces should be interfaces at all, or maybe just functions. There were also some examples of using lambdas, of how Factory can be an anti-pattern as well as a pattern, and of the partial application pattern.

Overall the take home message was the same one I got from an AI professor. Don’t make it harder than it is. Do you really need a framework? Or is it just easier to mask the real problem if you use a framework. If the solution to your problem doesn’t require a framework, don’t use one. Keep it Simple.

Twitter feedback on this session included:

@octoberclub: get rid of magic in yr code base to achieve simplicity. functional style. @gregyoung #qconlondon #IoC #AOP

@m4tthall: Greg Young "IOC Containers make it very easy to do things you shouldn't be doing" #QConLondon

@jgrodziski: #qconlondon @gregyoung "beware of the magic frameworks bring with them"

@octoberclub: IoC containers make it all too easy to introduce massive complexity @gregyoung #qconlondon

@m4tthall: #gregyoung very often tools hide problems, tools can add complexity, you own all the code, regardless if it is in ext. library #QConLondon

@matlockx: Greg young: "you own all code in your project...your boss doesn't care if the bug was in someone else's library #QConLondon

Instantly Better Presentations by Damian Conway

Alex Blewitt attended this keynote:

Thursday started off with another classic presentation from Damian Conway, this time with a crash course on how to deliver “Instantly Better Presentations”…

Be passionate about what you do – and sound like you’re enjoying it

Be knowledgeable about your subject – if that means learning it, so be it

Slides are for the most important point only – don’t read along text from it

If you have to demo running code, use animations to show code flow

If you can’t use animation tools, use multiple slides and animate manually

Handouts are for afterwards, not a copy of the slides

Get rid of all unnecessary backgrounds/graphics/bullets

Rehearse, rehearse, rehearse (at least 3 times before giving it)

Tell the audience when questions are OK (at the end, during)

Talk to a picture of a large audience if you can’t find one

Tell a story by choosing points carefully and threading a narrative through

Twitter feedback on this session included:

@Gshtrifork: Stories help us manage complexity and help us remember information! #qconlondon Tell it as a story (Damian Conway)

@Gshtrifork: Less is More #qconlondon

@Gshtrifork: Show less on more slides... #qconlondon

@ravinar: Every unneeded decoration obscures your message, show less and on more slides - Damian Conway #qconlondon

@chrismadelin: Better technical presentations - Practice in front of your cat, it's harder to hold their attention #qconlondon http://t.co/yaPSPKpRbQ

@mjpt777: Totally awesome keynote by Damian Conway on how to give better presentations. Step 1 - know your subject and be passionate. #QConLondon

A Forward Look at Federated Wiki by Ward Cunningham

Alex Blewitt attended this keynote:

Today’s keynote was the master of the wiki, Ward Cunningham (@WardCunningham) on Federated Wiki.

The point of a federated wiki is to expose data through markup languages (think CSV meets markdown) and for a page to render data from whichever source it has come from. Not only that, but data from other sources or pages can be combined, so that blended views of data can be built.

There’s an example federated wiki at ward.fed.wiki.org along with some Vimeo videos demonstrating what it is like (http://vimeo.com/27671065,http://vimeo.com/27671347 and http://vimeo.com/27673743).

The use of d3js to provide the chart rendering and graphics in a browser was a pretty neat trick. It uses HTML5 Canvas if available, and if not falls back to SVG and even VML to render the graphics, so it works on almost every browser that has a graphical interface.

Richard Smith attended this session:

His newest idea is that of a federated wiki which can pull in information from an entire information ecosystem. What does he mean by that? Just as a normal wiki allows collaborative accumulation and interpretation of information about one subject, a federated wiki allows access to information (held on individual wikis, naturally) from a variety of sources. For example, a federation about cars could consist of individual wikis held by manufacturers, parts suppliers, garages, amateur enthusiasts and traffic laws.

The actual federation mechanism would act in a similar way to inheritance, with wikis being able to specify others as their base, with pages within the base being overridden by those in the higher level one if there is a conflict. Because content can be data or calculations, as well as text, and data can be passed between pages, that means that calculations or visualisations from one wiki can operate on data and results in another, or even override data sources in another.
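The inheritance-like lookup described here can be sketched with a layered dictionary. This is only an illustration of the override rule under assumed data, not the federated wiki's actual implementation; the wiki contents and the `federate` helper are hypothetical.

```python
from collections import ChainMap


def federate(*wikis):
    """Layer wikis so that pages in earlier wikis override later (base) ones."""
    return ChainMap(*wikis)


# Hypothetical wikis, each mapping a page title to its content.
manufacturer = {"Engine": "Official engine spec", "Tyres": "Factory tyre sizes"}
enthusiast = {"Tyres": "Track-day tyre recommendations"}

# The enthusiast wiki names the manufacturer wiki as its base.
view = federate(enthusiast, manufacturer)
```

Looking up `view["Tyres"]` returns the enthusiast's page, while `view["Engine"]` falls through to the manufacturer's base wiki.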

A traditional wiki only shows one page at a time, in one browser tab. Ward showed how a view into a federated wiki opens several panels, each showing a different page, and how data flows 'across' the view between pages – so by changing which pages are open to their left, visualisation pages can show a graph of something different, or calculations can operate on different data sources. Information referred to in a calculation or visualisation is looked for up the current page, and if it isn't there, it's looked for to the left.

The wiki itself allows for client- and server-side plugins to interpret the markup, allowing a particular wiki to interpret its own markup as a domain-specific language (DSL). Ward showed a demo of a federated wiki controlling a microcontroller chip over USB, via a server-side plugin which translates wiki markup commands into chip messages.

Twitter feedback on this session included:

@benjaminm: Scaling Agile is often 'How can I make this team of 100 act like a team of 15, but I have to have 100 people @WardCunningham #qconlondon

@tastapod: Lovely! @WardCunningham at #QConLondon: "I want to elevate plagiarism to a virtue, and call it collaboration." #federatedwiki

@teropa: Don't be afraid to recycle the things you've learned... And each time to do it a little less well @WardCunningham #qconlondon

@EdMcBane: Federated wiki is no about why, it is about "why not?". Aperture-Science-style keynote from @WardCunningham at #QConLondon

@chickoo75: Wiki interactions with a micro controller! Never expected to see a live demo of this stuff. #qconlondon Only @wardcunningham!!!

@sbisson: Federated Wikis as collaborative dynamic endpoints for the Internet of Things. Fascinating talk by Ward Cunningham at #qconlondon

@alblue: The great @WardCunningham at #qconlondon https://t.co/O5cF3g5uUP

Distributed Systems / REST

A Platform for all that we know by Savas Parastatidis

Twitter feedback on this session included:

@reteganc: Web is the data and information platform, but not yet the knowledge, intelligence or wisdom platform. Knowledge is next. @qconlondon

No Link Left Behind by Paul Downey

Twitter feedback on this session included:

@robb1e: gov.uk hand rolled CMS uses markdown with https://t.co/A0WCfZftr2 #qconlondon

@AgileSteveSmith: Great gov.uk talk by Paul Downey @psd "a dashboard is useless without a call to action" #qconlondon

@mahemoff: Govspeak is @GovUK's markdown-derived markdown language /via @robb1e #qconlondon https://t.co/5YXSbt1gY9

Road to REST by Rickard Oberg

Mark Hobson attended this session:

He described the design evolution of a RESTful service that provided the back-end to various client platforms. The lessons learnt were two-fold: firstly, the resources exposed by the service should correlate to use-cases, rather than entities; and secondly, the often neglected HATEOAS constraint of REST allows clients to discover, and adapt to, server changes. Embracing these ideas again blurs the boundary between RESTful services and their UI, or as Rickard aptly put it, “a good REST API is like an ugly website”.

Will Hamill attended this session:

Rickard made an impassioned defence of the principles underlying REST and their usefulness, and asked one thing of the attendees: if your application communicates with your API in a way that isn’t actually RESTful, please stop calling it REST!

RESTful APIs require resource linking in order to actually meet the requirement for using hypermedia as the engine of application state (HATEOAS), but most common implementations don’t do this. Rickard demonstrated how most websites are actually better REST API clients of the server resources than hand-rolled REST clients for web applications, as they only interact by following links or submitting forms and do so with properly described resource links and using the HTTP GET/POST verbs as intended with meaningful URLs.

Rickard showed how, rather than exposing the domain model via the API, a more semantically meaningful and truly RESTful implementation is reached by exposing use cases: actions within the application described as URLs and acted upon with HTTP verbs. This also simplifies client development and documentation. For example, instead of a URL such as /users/willhamill/changepassword, from a use-case perspective this could be /accountmanagement/changepassword/willhamill for changing my own password and /administration/changepassword/username/willhamill to change the password of another user.

In terms of simplifying server side implementation, this means that we can easily determine that actions in account management act upon the authenticated current user, but the actions performed in the administration use cases must require the permission (or role or whatever) to participate in these use cases. It also means that we can have this check a requirement for any resource after /administration but not need to specify it on each single resource. We could also do something like have /administration paths only available within the internal network.

URLs exposed in the system start with higher level use cases and each / in the path represents a sub-use case, and doing a HTTP GET on / will list the actions available to the current user within that use case. I think this approach of exposing use cases rather than data or domain models makes a lot of sense and can make it easier to arrange the resources and actions in a web application. Rickard tied his example back into the point he made about a website - if we wanted to display a simple resource on a web page in this hierarchy we would likely be constructing a URL of this nature to make a meaningful link to the resource.
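The use-case layout and the prefix-level permission check can be sketched as follows. The routing table, role name and response shape are assumptions made for illustration and are not taken from Rickard's implementation.

```python
# Hypothetical use cases: each path prefix lists the actions beneath it.
ACTIONS = {
    "/accountmanagement/": ["changepassword"],
    "/administration/": ["changepassword"],
}

# The role check is declared once per prefix rather than on every resource.
REQUIRED_ROLE = {"/administration/": "admin"}


def handle_get(path, roles):
    """Answer a GET on a use-case path with links to the available actions."""
    for prefix, role in REQUIRED_ROLE.items():
        if path.startswith(prefix) and role not in roles:
            return {"status": 403, "links": []}
    links = [path + action for action in ACTIONS.get(path, [])]
    return {"status": 200, "links": links}
```

A client that only follows the returned links never needs to hard-code the URL structure, and a user without the assumed `admin` role simply never sees the administration links.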

Kevin Hodges attended this session:

Figure your use cases and expose them on your API
The server is in charge of providing a list of links based on the state of the resource and what services are available (simplifies the client)
Client decides columns and filtering from fixed set
RESTful clients should follow links and submit forms, if not, you are doing something wrong

Twitter feedback on this session included:

@stilkov: A good REST API is like an ugly website; RESTful clients should follow links and submit forms @rickardoberg at #qconlondon

@NuagenIT: Loved how @rickardoberg apologised about RMI and JBoss during his talk on REST #qconlondon

@grantjforrester: Good session. Even the worst websites are probably better than most REST APIs today. #qconlondon

@PopCatalin: #QConLondon if you do REST right then the client becomes a fancy browser (renderer)

HTTP/2.0: Challenges and Opportunities by Mark Nottingham

Alex Blewitt attended this session:

Mark Nottingham (@mnot) gave an overview of what’s coming up with HTTP/2.0, including some of the rationale for providing new protocol layers to save data and reduce the number of packets required between web requests. The main principle is that between subsequent requests to the same server, the client transmits most of the same set of request headers (User-Agent, Accept, Host etc.), which can be avoided on subsequent requests with a sufficient encoding pass.

However, the transport layer is responsible for stripping and then re-assembling the headers on the other end; the server still sees the same full set of headers from the client that it is expecting to see, even if the interim layers don’t actually send those bits.
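The idea of not resending unchanged headers can be illustrated with a simple delta scheme. This is a toy sketch and not the header compression format that HTTP/2.0 actually specifies; the `encode` and `decode` functions and the delta layout are assumptions chosen for clarity.

```python
def encode(previous, current):
    """Describe `current` headers as a difference from the `previous` request."""
    changed = {k: v for k, v in current.items() if previous.get(k) != v}
    removed = [k for k in previous if k not in current]
    return {"set": changed, "remove": removed}


def decode(previous, delta):
    """Rebuild the full header set on the receiving side."""
    headers = dict(previous)
    for name in delta["remove"]:
        del headers[name]
    headers.update(delta["set"])
    return headers


first = {"Host": "example.com", "User-Agent": "demo/1.0", "Accept": "text/html"}
second = {"Host": "example.com", "User-Agent": "demo/1.0", "Accept": "text/css"}
delta = encode(first, second)
```

Here `delta` carries only the changed `Accept` header, yet `decode(first, delta)` hands the application the same complete header set it would have seen without the optimisation.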

David Arno attended this session:

The web has a problem: HTTP. Web pages are growing larger and more complex, involving many request/response cycles to complete the page and TCP works badly with HTTP. These problems are particularly severe with the fastest growing sector of web use: mobile. The mobile Opera browser even seeks to solve this by bypassing HTTP completely if it can by using its Opera Turbo proxy mechanism. Google have been working on another alternative: SPDY. So the IETF have set up an HTTP working group to look into a v2 of HTTP. The group is considering some pretty radical ideas, such as it being a binary format (no more using telnet to examine HTTP responses, new tools will be required) and having the server automatically push CSS, JavaScript and image files associated with a page, rather than waiting for requests. The migration won’t be easy though, and it’s possible that the http and https protocols will have to be replaced with new ones. Even this might not be enough to solve all problems, so the group is even considering alternatives to TCP.

Twitter feedback on this session included:

@bluefloydlaci: #QConLondon great presentation on HTTP/2.0's challenges, opportunities, roadmap. HTTP 2.0 is the next great thing to come

@kevdude: HTTP/2.0: Challenges and Opportunities interesting talk on header compression, latency & reducing http request #QConLondon

@DavidArno: Also attending interesting talk on HTTP v2 at #qconlondon. Boy, is it different to v1.1, eg binary based & server push built in.

@sandropaganotti: HTTP 2.0 will be no longer text based !! bye bye telnet testing :D #qcon

The Java Developer Track

The Java EE 7 Platform: Higher Productivity & Embracing HTML 5 by Arun Gupta

Alex Blewitt attended this session:

Arun (developer evangelist from Oracle) provided a view on what’s coming up with Java EE 7, including a new annotation based message driven bean and simpler setup for enterprise Java applications. Most of the upgrades are evolutionary rather than revolutionary, although a couple of new APIs will provide standard JSON parsing (both in streaming and later object form).

Twitter feedback on this session included:

@gurkein: J2EE 7 standardizes spring batch #qconlondon

Garbage Collection - The Useful Parts by Martijn Verburg

Richard Smith attended this session:

The first point he made is that, although the name would suggest otherwise, garbage collection is not really about the garbage (dead objects); it is about live objects. Garbage collection tracks the tree of object references from each root node (an entry point or currently active stack reference) each generation; what is available for collection is simply those objects that have not been located, and the Java GC doesn't need to know anything about them. (The .Net GC is a little different as it supports finalisers, so it needs to know what has died, but the main point remains the same.)
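The "it is about live objects" point can be shown with a toy tracing pass. This sketch models the heap as a plain dictionary of references, which is an assumption for illustration and bears no relation to how the JVM stores objects.

```python
def reachable(roots, references):
    """Return the set of objects reachable from the given roots.

    `references` maps each object name to the names it refers to.
    """
    live = set()
    pending = list(roots)
    while pending:
        obj = pending.pop()
        if obj in live:
            continue  # already visited, which also handles cycles
        live.add(obj)
        pending.extend(references.get(obj, ()))
    return live


# Hypothetical heap: "d" and "e" refer to each other but to nothing live.
heap = {"a": ["b"], "b": ["c"], "c": [], "d": ["e"], "e": ["d"]}
live = reachable(["a"], heap)
garbage = set(heap) - live
```

The collector never inspects `d` or `e`; they count as garbage simply because the walk from the root `a` never reached them.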

Java splits its memory space into three separate areas, known as memory pools:

Younggen space, which is further subdivided into 'Eden', the space for new objects, and 'Survivor 0' and 'Survivor 1'. Short lifespan objects never leave these pools. Almost all new objects are created in the Eden pool, and move to the current Survivor pool if they are still alive when younggen GC operates; that happens whenever Eden reaches a threshold size. The two Survivor pools alternate being the one in use; during a GC, objects are moved into the current Survivor pool (either from the previous Survivor pool or from Eden), and their previous location dereferenced. (Similar things happen in .Net, which is why objects that need a fixed memory location must be given a fixed-location GCHandle.) Younggen collection typically happens multiple times per second in a running application.

Oldgen or tenured space. Once objects reach a certain lifetime (4 collections by default), they are instead moved into this memory pool. Objects too large to fit into the appropriate younggen space are also moved here, so rapid allocation of large amounts of memory can pollute tenured space with young objects, leading to inefficient collection. Collection in this space is done by a concurrent mark-and-sweep algorithm, pausing execution for a minimum amount of time, until the space becomes too full or too fragmented, when a full compacting, pausing collection is run.

Perma-Gen space, for objects which have an expected lifetime of the entire process duration, e.g. Spring configuration and services. Garbage collection doesn't run at all here, but references from perma-gen objects can act as GC roots (see below) into other pools.
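The movement of objects between these pools can be sketched with a toy model. This is a simplification under stated assumptions: `ToyHeap` and its methods are hypothetical names, liveness is supplied by the caller instead of being traced, the two Survivor pools are modelled as one dictionary that is rebuilt on every collection, and the promotion threshold of 4 is simply the figure quoted above.

```python
TENURE_AFTER = 4  # young collections survived before promotion (figure from the text)

class ToyHeap:
    def __init__(self):
        self.eden = []       # newly allocated objects
        self.survivor = {}   # object -> number of young collections survived
        self.tenured = []    # promoted, long-lived objects

    def allocate(self, obj):
        self.eden.append(obj)

    def young_collect(self, is_live):
        """Copy live young objects into a fresh survivor space, promoting old ones."""
        next_survivor = {}
        for obj in self.eden:
            if is_live(obj):
                next_survivor[obj] = 1
        for obj, age in self.survivor.items():
            if not is_live(obj):
                continue
            if age + 1 >= TENURE_AFTER:
                self.tenured.append(obj)
            else:
                next_survivor[obj] = age + 1
        # Anything not copied is simply dropped; dead objects are never inspected.
        self.eden = []
        self.survivor = next_survivor

toy = ToyHeap()
toy.allocate("cache")      # stays referenced for the whole run
toy.allocate("temp")       # dies before the first collection
for _ in range(4):
    toy.young_collect(lambda obj: obj == "cache")
# After four young collections "cache" has been promoted to the tenured pool.
```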

Why split up the memory we manage into pools like this? The answer comes from the Weak Generational Hypothesis – really more of a Theory as it is empirically backed – which states that most objects have a very short lifespan, a few objects have a long lifespan, and few objects are in between. …

Finally, we learnt what OutOfMemoryError actually means in Java. It means that one of the following is true: 98% of execution time has been taking place in garbage collection; a full GC execution freed up less than 2% of the heap; a single allocation is larger than the available heap memory (the one we all understand); or a thread failed to spawn.

Twitter feedback on this session included:

@wsbfg: My notes from Martjin Verburg's Garbage Collection, the useful parts. https://t.co/Lev5KFCUAl #qconlondon

The Developer Track

Web Development: You're Doing it Wrong by Stefan Tilkov

Kevin Hodges attended this session:

How to tell if you are doing “web” wrong

1. Your back button doesn’t work
2. On page refresh, you get the home page
3. You need to open a second browser window with the app
4. Your content doesn’t load first
5. Can’t bookmark stuff
6. URI doesn’t represent a single meaningful concept
7. Javascript is intrusive
8. Your HTML doesn’t make sense without javascript

Point being, the web already does most of this stuff for us. Stick to how the web works and don’t fight it.

Mark Hobson attended this session:

He outlined a number of typical UI smells, such as the back button not working as expected and the inability to open multiple windows, that indicate that perhaps your architecture is fighting the model of the web. These problems tend to arise when we use a higher-level web framework to abstract ourselves away from the underlying web technologies (HTML, CSS, JavaScript) and the properties of HTTP (statelessness, client-server). Stefan argued that by attempting to overcome these problems, web frameworks ultimately evolve into primitive web-like architectures that foolishly try to re-solve the problems that the web itself has already solved. It’s much easier to work with the web rather than fight against it.

Stefan proposed a hybrid style between traditional server-side UI components and modern single-page applications (SPA) that takes the best characteristics of each, which he dubbed ‘Resource-Oriented Client Architecture’ (ROCA). ROCA is a set of recommendations that describe how your application can be of the web, rather than just on the web. Central to this style is the subtle concept that the UI becomes merely a semantic HTML representation of its RESTful service.

Will Hamill attended this session:

Stefan listed common antipatterns and what he saw as irrational complaints about HTTP and the Web. This included working around its statelessness, preventing browser functionality (e.g. opening multiple windows of the same site, breaking back button use, preventing refresh of the page, etc), making ads and images load before article content, and so on. …

Stefan argued that we need to focus on the power and capabilities of the Web because, though it is not perfect, no other system comes close - imagine trying to spec out and create a system so pervasive and flexible from scratch! Stefan also argued that pushing all session-state and logic down into a heavy client-side JavaScript page was not the ideal solution, even though it seems the logical opposite of server-side frameworks and things like applets that give the illusion of statefulness.

Stefan briefly mentioned ROCA (Resource-oriented Client Architecture) as a series of recommendations for more rational use of the web’s intended functionality….

We should let people use their browser features! Don’t prevent them from bookmarking pages in your system; use properly constructed and meaningful URLs for resources. When everything in the site is created dynamically and the URL doesn’t change, we’re violating users’ expectations that their browser will behave the same across the web. Browser behaviour shouldn’t be broken by the implementation of one particular web app.

HTTP is stateless; embrace it! Not dealing with server-side session stickiness and its associated nightmares frees you up greatly when it comes to horizontal scalability. Stefan described a few approaches for splitting large seemingly-stateful interactions into single interactions and thus maintaining statelessness. This was quite interesting, and I thought of the myriad of briefly-stateful interactions we make with many systems that could be changed in this way (most recently when booking my car’s MOT online, it is an 8-step process, but that need not be necessarily stateful in server session terms).
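The summary above does not say which approaches Stefan described. One common technique, shown here purely as an illustration, is to hand the accumulated state of a multi-step form back to the client in a signed token, so that each step is a self-contained request and no server session is needed. The function names, the `SECRET` value and the field names are all hypothetical.

```python
import base64
import hashlib
import hmac
import json

SECRET = b"demo-secret"  # hypothetical key; a real service would load this from configuration

def pack(state):
    """Serialise the state and append an HMAC so the client cannot alter it."""
    payload = base64.urlsafe_b64encode(json.dumps(state, sort_keys=True).encode())
    signature = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return payload.decode() + "." + signature

def unpack(token):
    """Verify the signature and return the state carried by the token."""
    payload, signature = token.rsplit(".", 1)
    expected = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature, expected):
        raise ValueError("token was tampered with")
    return json.loads(base64.urlsafe_b64decode(payload))

def handle_step(token, field, value):
    """Process one step: read prior state from the token, add a field, return a new token."""
    state = unpack(token) if token else {}
    state[field] = value
    return pack(state)

# Two steps of a booking flow; any server can handle either request.
step1 = handle_step(None, "registration", "AB12 CDE")
step2 = handle_step(step1, "date", "2013-04-05")
```

Because the token travels with each request (for example in a hidden form field), any server behind the load balancer can process any step, which is the horizontal-scalability benefit mentioned above. The token is signed but not encrypted, so it should not carry secrets.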

The use of unobtrusive JavaScript enables progressive enhancement, separation of concerns and a baseline good experience for the lowest common denominator. Leaving business logic on the server, to the most reasonable extent, avoids the duplication of logic, and the resulting low cohesion, found in JS-heavy clients.

Twitter feedback on this session included:

@timanderson: If the browser back and forward buttons don't work right, you've done your web app wrong says Stefan Tilkov #qconlondon

@teropa: Trying to solve a design problem by adding more devs is as effective as trying to solve an algebra problem by chewing bubblegum #qconlondon

@timanderson: I like JavaScript better than Java says Tilkov #qconlondon, wonder how many agree?

@Squire_Matt: Single page apps are just as bad as webforms type server components. Use the web as it was meant to be used. #QConLondon

@nixer65: Doing web wrong/right at #qconlondon - like the ideas about going back to HTML serving and not endless client side json processing

@Squire_Matt: If you are building a web app, build a web app, not a desktop app in a browser... #qconlondon

@carlalindarte: Not all browsers are equal so your UI shouldn't look equal in all browsers. It only needs to look good. #qconlondon

@austin_bingham: ROCA: Resource Oriented Client Architecture, Stefan Tilkov's new FLA for doing the web right. #qconlondon

@dthume: Just realised that 90% of everything I believe about Web apis came from the various @stilkov talks I've watched. #qconlondon

@steinsli: Loved how @stilkov separated content from design at his web dev speak. #QConLondon #noReadingFromSlides

@stilkov: FWIW, uploaded slides for my #qconlondon “Web Development: You're Doing it Wrong” talk http://t.co/6iqklmyhOO

How to rescue our kids: fixing the ICT crisis at school by Simon Peyton Jones

David Arno attended this session:

Software development has a crisis in the making, in terms of future developers: our kids in school. Whilst computers are fun, the way they are taught in schools, through the ICT curriculum, is dreadfully dull and lacks anything remotely resembling computer science. Children are taught how to create spreadsheets, use a web browser and the like. And that’s it. Over the last five years, there has been a drop in both boys and girls doing IT A levels. In 2009, none of the English exam boards offered a computer science GCSE. So a group called the Computing at School Working Group decided to try and fix this. Their aim was both simple and huge: reintroduce proper computer science as a subject to be taught from primary school. Through lobbying government, and with help from the likes of the Royal Institute and the Raspberry Pi catching the media’s attention to the matter, things have gone from dire to exciting in just a few years. In 2013, all five awarding bodies now offer CS GCSEs. The Department for Education is in the process of replacing the ICT curriculum with a Computing curriculum for KS1-4 (5 to 16 year olds). As Simon explained though, this just means we’ve won the air war. We now need to fight the ground war, to get teachers trained to a level where they can teach computing properly. And this really is a “we”. Every developer in the country can help by joining CAS, running after school clubs, helping to mentor teachers, and by offering to help teach computing to primary age children in the classroom. I recently started running a Code Club after school club at my daughters’ school. What about you?

Twitter feedback on this session included:

@andypiper: Loving Simon Peyton Jones analysis of why we need to teach Computer Science – it’s like any elementary science - important base. #qconlondon

@austin_bingham: Holy cow! If Simon Peyton Jones were presenting on watching paint dry, I bet it would still be fascinating! #QConLondon

@jmdarley: Simon Peyton Jones is one of the most passionate people I've ever seen speak. #QconLondon

@charleshumble: Simon Peyton Jones mentioned http://t.co/2uVAp8AyCK - really interesting ideas on how to teach CS to children #qconlondon

@steinsli: How to rescue our kids: fixing the ICT crisis at school at @qconlondon. Slides styled in Comic Sans, -teacher's most beloved style. Irony?

@andypiper: So much love in the room over Computing in schools! Simon Peyton Jones #qconlondon great community http://t.co/bAU8i8BPgq

@capotribu: How to rescue our kids: fixing the ICT crisis at school http://t.co/TmYxE0sUtR < was great, will speak to my daughter's school #qconlondon

You are not a software developer! - Simplicity in practice by Russell Miles

Will Hamill attended this session:

Russell’s talk was quite narrative and described common mistakes made when development teams get bogged down in implementation and the drive for productivity. Russell used an impact mapping approach throughout the talk to derive the valuable goals for our efforts, and urged that we draw out assumptions between our current state and the desired end goal, so that they can be tested.

Russell made the point that often as developers we question only our implementation options rather than the goal itself. For example, when someone tasks you with writing a mobile app because their goal is “We need an app on the app store” - question that goal rather than getting straight into cracking open your favourite IDE. Why do we need an app? What do they actually mean by that? Will any kind of app do? Seek to understand the business decisions in terms of the value you need to create rather than the productivity you need to have - are we creating an app so that we can increase customer retention? Or are we trying to meet the functionality that a competitor has?

Fundamentally, Russell made the point that we are involved not to produce software but to produce valuable change for the business. This resonated with points made in the pre-conference training session on ‘Accelerated Agile’ by Dan North that I attended, where Dan described how value to the business is what we need to deliver rather than just the output of software.

Frank Scholten attended this session:

His main message: in the last decade we learned to deliver software quite well and now face a different problem: overproduction. Problems can often be solved much more easily or without writing software at all. Russell argues that software developers find requirements boring, yet they have the drive to code, hence they sometimes create complex, over-engineered solutions.

He also warns of oversimplifying: a solution so simple that the value we seek is lost. His concluding remark relates to a key tenet of Agile development: delivering valuable software frequently. He proposes to instead focus on 'delivering valuable change frequently'. Work on the change you want to accomplish rather than cranking out new features.

Twitter feedback on this session included:

@alexbutcher: Being an early adopter: you have a poor sense of risk. - Russell Miles #qconLondon

Performance Testing Java Applications by Martin Thompson

Frank Scholten attended this session:

Informative talk about performance testing Java applications. Starts with fundamental definitions and covers tools and approaches on how to do all sorts of performance testing. Martin proposes to use a red-green-debug-profile-refactor cycle in order to really know what is happening with your code and how it performs. Another takeaway is the difference between performance testing and optimization. Yes, defer optimization until you need it. But this is not a reason not to know the boundaries of your system. When load testing, use a framework that spends little time on parsing requests and responses.

Twitter feedback on this session included:

@jon_moore: #qconlondon @mjpt777 : "Best way to take out the Death Star would've been to make it parse a bunch of XML..."

@giltene: When you want to make load happen, XML is much better than JSON. (paraphrasing @mjpt777 in his #qconlondon performance testing talk)

@trisha_gee: People make lame excuses for poor code @mjpt777 #QConLondon

Building for Clouds

Clouds in Government - Perils of Portability by Gareth Rushgrove

Will Hamill attended this session:

Gareth described the deployment process for GOV.UK, the new front page for the UK government designed to replace and improve upon DirectGov and BusinessLink. The deployment pipeline treats configuration and networking as code, including configuration management and automation using Puppet. It’s clear that automation has resulted in a very safe and low-friction deployment process as evidenced by the frequency and stability of GOV.UK releases. Check out the GDS Badger of Deploy on Twitter to see that they’ve made 848 production releases since launch on the 17th of October. Impressive!

Avoiding platform lock-in is a major concern for GDS, as they’re tasked with spending taxpayer money in order to get the best deal for the requirement to host various sites, transactions and platforms and they do take it seriously. Lock-in imposes risks due to the reliance on a single vendor and gives the vendor power in the supplier-consumer relationship. Having learned lessons from government being treated as a feeding trough by the likes of £Billion contracts with Big Named vendors (my words here, not his ;) ), portability in the software solution is very important.

Gareth compared the various capabilities of cloud service providers such as Amazon and Rackspace, and noted how difficult comparison is when each provider’s APIs use such different language or, in some cases, the same language to talk about distinct things. Avoiding platform lock-in is a big thing for GDS, and lock-in comes in flavours other than the most obvious technology support ones: capability lock-in, when you rely on services or features that only one vendor provides, and (implicit) capacity lock-in, when you’re so big that there aren’t really any other options than the likes of Amazon (e.g. Netflix aren’t realistically going to be able to move their systems to another provider).

Twitter feedback on this session included:

@bluefloydlaci: Portability matters. Cloud providers. Vendor lock-in. Common denominator not enough, standards needed. At #QConLondon

Extending CloudFoundry with new Services by Chris Hedley, Andrew Crump

Twitter feedback on this session included:

@grantjforrester: HA support in CloudFoundry? In roadmap. #qconlondon

@davidlaing: Notes on how to add your own custom service to your cloud foundry PaaS instance - https://t.co/Le0i21wEil #qconlondon

Racing Thru the Last Mile: Cloud Delivery Web-Scale Deployment by Alex Papadimoulis

Will Hamill attended this session:

Alex’s talk was an interesting look at the difference between typical, smaller, enterprise deployment and ‘web-scale’ deployment. Web-scale is the term used to describe the scope and size of the challenges encountered by platforms and products used by and served to vastly more users than typical applications. Think Facebook, Google, Twitter and Netflix - the problems they encounter trying to design and maintain systems for millions or billions of people are quite different to those encountered by an in-house enterprise application developer responsible for deploying their ‘customer portal’ or the likes to another thousand users.

In all web-scale systems with publicised details, the most obvious difference from other applications is that they are inevitably decomposed into a multitude of services and interacting components. This decomposition reduces dependency and enables parts of the product to be upgraded or deployed separately. When you reach the scale of having hundreds, thousands or hundreds of thousands of servers with deployed components, the probability that a component somewhere has failed asymptotically approaches 1. One of the most interesting approaches to handling this in a service-oriented architecture that I’ve read about is the Netflix Chaos Monkey, which Netflix use to test the robustness and durability of their systems.
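The claim about failure probability follows from simple arithmetic. Assuming, purely for illustration, that each server is down independently with the same probability p, the chance that at least one of n servers is down is 1 - (1 - p)^n. The value p = 0.001 below is an arbitrary example figure, not one from the talk.

```python
def p_any_failed(p_single, n):
    """Probability that at least one of n independent components has failed."""
    return 1 - (1 - p_single) ** n

# Arbitrary illustrative failure probability of 0.1% per server.
for n in (1, 100, 1_000, 100_000):
    print(f"{n:>7} servers: {p_any_failed(0.001, n):.4f}")
```

With these assumed numbers the probability is already above one half at a thousand servers and is indistinguishable from 1 at a hundred thousand, which is why such systems are designed to expect failure.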

Alex described other problems that web-scale systems have in deployment, including strategies for rollout of upgrades (half and half, random selection of servers, rolling wave) in order to ensure zero downtime. The web-scale systems involved have very different constraints compared to your office’s SharePoint document system, since going down for half an hour at 1am on Sunday is going to be noticed by a million users (think any time Twitter is down, for example). Interestingly Alex mentioned how Twitter use a BitTorrent-like system to push out updates to their servers in chunks and spread updates throughout their network of machines.
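The three rollout strategies are only named in the summary above, so the following sketch is one plausible reading of them and not a description of what Alex presented. Each function turns a list of servers into an ordered list of batches; a deployment tool would upgrade one batch at a time while the remaining servers keep serving traffic. The function names and server names are hypothetical.

```python
import random

def rolling_wave(servers, batch_size):
    """Upgrade servers in fixed-size batches, in their existing order."""
    return [servers[i:i + batch_size] for i in range(0, len(servers), batch_size)]

def half_and_half(servers):
    """Upgrade one half of the fleet, then the other."""
    mid = len(servers) // 2
    return [servers[:mid], servers[mid:]]

def random_selection(servers, batch_size, seed=None):
    """Upgrade randomly chosen batches so no rack or region is hit all at once."""
    shuffled = list(servers)
    random.Random(seed).shuffle(shuffled)
    return rolling_wave(shuffled, batch_size)

fleet = ["web-1", "web-2", "web-3", "web-4", "web-5"]
waves = rolling_wave(fleet, 2)
# waves == [["web-1", "web-2"], ["web-3", "web-4"], ["web-5"]]
```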

Alex argued that the best way to reduce pain in deployment is to do it often, and the best way to test your failure plans is to actually enact them. You don’t want to be one of those nightmare scenarios you’ve read about where someone has to restore from a backup tape for the first time and then realises that the wrong thing has been being backed up for months. This is similar to the principles proposed in Continuous Delivery - if something is fragile, dangerous and difficult then don’t postpone it; do it more often. When you do deployments frequently then you become more proficient, you tend to automate things and you tend to have fewer changes between deployments.

Rollbacks are often a cause of great pain in deployments, but I agree with Alex that the best way to do a rollback is not to create custom rollback scripts and reverse-deltas, but instead to run the deployment process for the previous, working version of the app (and that way you know the process is tested much more often).

Real Startups

How to turn startup ideas into reality by taking money from strangers by Ian Brookes

Kevin Hodges attended this session:

When presenting, care but don’t be evangelical
Need to sell “FUND ME”, they are investing in you, not necessarily the product
Stick to a single page, if you can’t “sell it” with that then they are not the investor for you
First impressions are everything, need to be WOW
Code faster, measure faster, learn faster
Tell stories, woven with details of success and failure
Make sure you chase them afterwards, bordering on stalking, they need to see enthusiasm
Know where you sit in the market
Get everything into 12 minutes:
1. Problem
2. Attractive market
3. Unique advantage
4. Compelling investment

Twitter feedback on this session included:

@carlalindarte: Listening to Ian Brookes on how to turn your IT business ideas into reality. Interesting insights of this entrepreneur #qconlondon

@teropa: Pitching is not about the product. It's about you. It's also not about the stats, it's about stories. "You got to wow them" #qconlondon

@teropa: Reason why 68% of investors turned founders down: Indifferent first meeting. The speed dating analogy seems apt #qconlondon

@carlalindarte: It's not about the technology you'll use for your business idea, it's about how you will take it to the market. #qconlondon

@teropa: Innovative technology may make investors wary, because they don't know if it works. "There's a balance there" #qconlondon

@carlalindarte: How successfully sell business ideas?1Focus on the real problem 2Know your market 3Your unique advantage 4How you'll get money! #qconlondon

Creative Thinking & Visual Problem-solving

Ideas, not Art: Drawing Out Solutions by Heather Willems

Trisha Gee attended this session:

Heather Willems' session encouraged us to doodle throughout. In fact, forced us to. Right up front she addresses the fact that doodling is seen as a lack of attention, as a waste of time. And I realised, sitting there in the audience with my iPad and stylus, that I did feel guilty drawing away while someone talked at me. But it was a brilliant exercise in unblocking some of those creative juices, and letting us see the power in visual information. Perfect is not important, pictures are powerful.

Frank Scholten attended this session:

Heather Willems shows us the value of communicating ideas visually. She started the talk with an entertaining discussion of the benefits of drawing in knowledge work. Diagrams and visuals help us to retain information and help group discussion. The short of it: it's OK to doodle. In fact it is encouraged!

The second part of the talk was a mini-workshop where we learned how to create our own icons and draw faces expressing basic emotions. These icons can form the building blocks of bigger diagrams.

Machine Me by Fernando Orellana

Dušan Omercevic attended this session:

One thing that I see frequently is people willing to do something but being incapable of taking the first step. They have the ambition, skill, and knowledge to accomplish something great but they keep pondering what their first step should be, while months and years pass by. At QCon London Fernando Orellana presented a very simple, but highly effective approach to the kick-start problem. Fernando's suggestion is to take a piece of paper and just draw a random doodle on it. Then you take a deep look at the doodle until you start seeing things in it, and then you just complete the picture.

Twitter feedback on this session included:

@charleshumble: A robot that visualizes our dreams. http://t.co/jygwQNQtfw #qconlondon - by @polyfluid http://t.co/hOXVrowuir

Handheld Banking

Put a UI Developer in a Bank; See what happens by Horia Dragomir

Richard Chamberlain outlined the main ideas in the talk:

Banks have top developers

If you want a good UX you need to hire a good UI developer

Build apps for your customers

Huge release cycles for teams providing services for your apps are a pain

QA need to work with the team and not to the spec

Tools and equipment. There’s little point in developing a web application if you don’t have Firefox and Chrome installed. Your web devs need to test on all versions of browsers and all the devices they are going to support. They also need an IDE that isn’t Eclipse or Visual Studio. They will also need Stack Overflow to google for weird IE work-arounds.

Twitter feedback on this session included:

@carlalindarte: Banks build super reliable apps that look awful Horia Dragomir @hdragomir speaker at #qconlondon UI developer

@teropa: People who go from startups to corporate environments make themselves redundant because they optimize everything. #qconlondon @hdragomir

Testing iOS Apps by Graham Lee

Alex Blewitt attended this session:

Graham Lee (@secboffin) gave a high-level talk through the testing frameworks available to iOS developers. The tools included Calabash, Cucumber, OCUnit (built into Xcode and also known as SenTestingKit) as well as browser-based tools such as Safari’s web inspector.

Mike Salsbury attended this session:

The talk highlighted several different testing frameworks that could be used to test iOS applications and via WebView. The last was probably the most interesting to me, as it highlighted a way of using WebView so that you can utilise your existing JavaScript testing framework to test iOS apps embedded within web views. This isn’t necessarily exactly what we want to do, but could be a very interesting approach for CI.

Another CI-friendly approach would be to use Catch, which is available on GitHub. The other approach was to use the normal embedded tools, e.g. OCUnit and Instruments, although one twist was downloading the Network Link Conditioner tool to simulate network latency.

Calabash was a BDD style approach using Ruby and the spec approach. That would fit in nicely with some of our other frameworks in style.

The Future of Mobile Banking by Michael Nuciforo

Twitter feedback on this session included:

@teropa: For banking mobile will replace your memory. Like it did for phone numbers #qconlondon

@carlalindarte: Mobile has achieved things one third faster than desktop Internet. #qconlondon 'The future of mobile banking'

@teropa: Banks are becoming mobile operators. Operators are becoming banks. #qconlondon

@carlalindarte: On average an active mobile banking user will log in 20 times a month. They have quickly moved from once a month to once a day. #qconlondon

@teropa: Banks losing sales because they've forgotten to do sales in the mobile channels, and that's where users are going. #qconlondon

@teropa: Banking in 2015 #qconlondon http://t.co/Ku7wlAmN0m

@teropa: Banks are waking up to the fact that they have a lot of data about what people are buying ->collab with retailers on offers etc #qconlondon

@carlalindarte: One of the constrains of mobile banking: still treated as a project and not as a channel. That causes lack of investment #qconlondon

@teropa: UK/europe is basically replicating what has been done in Asia in mobile banking #qconlondon

@teropa: Key takeaways in "The Future of Mobile Banking" #qconlondon http://t.co/evmKZgUHpR

Building Web APIs: Opening & Linking Your Data

Introducing the BBC's Linked Data Platform and APIs by David Rogers

Alex Blewitt attended this session:

This covered how the BBC are using graph databases and RDF to link data and events for their news and sports platforms, so that when an author is writing an article it will auto-suggest tags based on the content of the text, and those tags will then semantically link the story with other areas and parts of the BBC.

Some of the problems included how to uniquely identify the people involved – for example, there were many athletes which shared a name with another athlete at the games – as well as how to regionally associate stories. For categories such as counties the boundaries are known, but for voting areas (which change over time and can be quite complex) the problem is largely unsolved.

Mike Salsbury attended this session:

There was World Cup 2010, Olympics 2012, all towards creating a Platform with semantic roots, that might be available as an open API sometime in the near future. We got a full overview of the development and thinking behind the API, and where they’d like to take it next.

There was Scala, the triple store graph database and lots about linked data. The database is full of Subject, Predicate, Object triples, and no tables or rows. You can access it with SPARQL construct graphs, and these queries can be represented as web service endpoints (I think).
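The subject, predicate, object model is easy to see in miniature. The sketch below is an in-memory toy and not the BBC's platform or a real SPARQL engine; the `TripleStore` class, the identifiers and the predicates are invented for the example. Passing `None` for a position acts as a wildcard, loosely like a variable in a query pattern.

```python
class TripleStore:
    """A minimal set of (subject, predicate, object) facts with pattern matching."""

    def __init__(self):
        self.triples = set()

    def add(self, subject, predicate, obj):
        self.triples.add((subject, predicate, obj))

    def match(self, subject=None, predicate=None, obj=None):
        """Return all triples matching the pattern; None matches anything."""
        pattern = (subject, predicate, obj)
        return sorted(
            t for t in self.triples
            if all(want is None or want == got for want, got in zip(pattern, t))
        )

store = TripleStore()
# Hypothetical identifiers: a story tagged with an event, and athletes linked to it.
store.add("story:42", "about", "event:100m-final")
store.add("athlete:a", "competesIn", "event:100m-final")
store.add("athlete:b", "competesIn", "event:100m-final")
store.add("athlete:a", "memberOf", "team:gb")

# Everything linked to the event, regardless of how it is linked.
linked = store.match(obj="event:100m-final")
```

Tagging a story with an event is a single new triple, and every page that queries for things linked to that event picks it up without any schema change.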

Kevin Hodges attended this session:

Concept abstraction (maybe use for Exodus)
Use of triples to link common concepts or content (check out “Triplestore”)
Pages are generated dynamically from the most recent tags in the triple store, the subject and object get tagged with a “relation”
Content is tagged when it is generated
Statistical stuff is still done separately
Triplestore holds canonical data, “facts” on products
Reads need to be scaled => caching
Not big data, just well organised
GeoNames, check this out
Use of Mashery to make the data “open”, aids rate limiting and manage clients

Twitter feedback on this session included:

@stilkov: Interesting idea for being able to provide a SPARQL endpoint without risk: an EC2 AMI that you can use @daverog #qconlondon

The Why, What and How of Open Data by Jeni Tennison

Richard Smith attended this session:

Jeni is an advocate of open data, and in this talk she laid out some reasons for us to join in with our own data.

Good data should be reusable – consumable by several different applications or modules – and combinable – an application should be able to read data from multiple sources and work with all of it. Most services are designed to be linked to other services, via data streams, and offer data through application-specific APIs. Using a well defined standard format makes it easy to pass data between services, whether that data is open or not.

Most current data, even that which is publicly available, is not open. Open data has to be available to everyone, to do anything with it, for example using a Creative Commons attribution licence. (Share-alike licences, more like the GPL, are also available, but they restrict use cases to some degree.)

Why would a company which generates or provides data want to make it open? The benefits for everyone else are clear, but in the case of a non-altruistic business entity, there must be an incentive for the company too. Providing open data allows other companies or individuals to provide additional services, for example mobile applications or visualisations. As long as the data source is required to be attributed, this can extend brand awareness and user base. Collaborative editing and updating of data can also produce excellent and accurate output, if the user base has an interest in keeping it up to date; for example Wikipedia or OpenStreetMap. Offloading some of that data maintenance onto users lowers the cost of maintaining the same quality.

Whether to open up, and what data to open, involves several considerations. Primary data, which is generated at high cost or effort, can have commercial value high enough that it may not make sense to open it up. But most companies generate large amounts of 'secondary data', which is a side effect of other processes (for example transactional data, information in CRMs etc), which can be opened up if it doesn't contain personal data. Any data referring to individual people is likely to have data protection concerns and again may not be eligible for open distribution.

Open data is still an experiment: we don't know exactly which business model works the best, how best to measure the usage of data or how to find open data when we want some. But Jeni asks us to consider the benefits that opening some data up can provide to our own businesses, as well as society at large.

Building APIs by building on APIs by Paul Downey, David Heath

Twitter feedback on this session included:

@alblue: Design principles by @govuk https://t.co/Mv4DwFLHkn #qconlondon

@alblue: Putting APIs first: http://t.co/KrADZkD0Jc from @GovUK at #qconlondon

@alblue: Government digital service design manual (work in progress) https://t.co/bNm54x1d2z #qconlondon

Building Hypermedia APIs with HTML by Jon Moore

Alex Blewitt attended this session:

The presentation sounds much more buzzwordy than it actually was; in fact, it was largely a set of common sense and careful use of HTML5 attributes for annotating structured data, along with a demonstration tool to read pages with suitably annotated data.

The general format is to encourage the use of HTML5 microdata, which is a set of attribute names that can be applied to existing HTML elements, and then processed by a tool that understands how to parse them. These can either be attached to semantic nodes in the structure (such as h1 elements) or wrapped with a standalone span tag with an associated attribute value. In this way, it’s possible to encode information in the same representation that a user will use to read the content.
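As a rough illustration of the idea, and not Jon's tool, the following sketch uses Python's standard `html.parser` to pull `itemprop` values out of a small page. It is deliberately naive: it only handles flat text content and ignores nested items, `itemscope` boundaries and the other rules of the full microdata algorithm. The sample markup and the `ItempropParser` name are made up.

```python
from html.parser import HTMLParser

class ItempropParser(HTMLParser):
    """Collect the text content of elements carrying an itemprop attribute."""

    def __init__(self):
        super().__init__()
        self.items = {}
        self._current = None  # itemprop name of the element being read, if any

    def handle_starttag(self, tag, attrs):
        prop = dict(attrs).get("itemprop")
        if prop:
            self._current = prop
            self.items[prop] = ""

    def handle_data(self, data):
        if self._current:
            self.items[self._current] += data

    def handle_endtag(self, tag):
        # Naive: any closing tag ends the current property.
        self._current = None

PAGE = """
<div itemscope itemtype="http://schema.org/Person">
  <h1 itemprop="name">Ada Example</h1>
  <p>Works as <span itemprop="jobTitle">Engineer</span></p>
</div>
"""

parser = ItempropParser()
parser.feed(PAGE)
parser.close()
person = parser.items  # {"name": "Ada Example", "jobTitle": "Engineer"}
```

The same markup renders as an ordinary heading and paragraph in a browser, which is the point made above: one representation serves both the human reader and the program.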

Jon had a tool written in a scripting language (python or ruby; I forget which) that took an example HTML page and used it to generate a list of elements with semantic data in them. Furthermore, these were then exposed as dynamic properties on an object returned by the caller.

The key addition to this talk (over and above being just an HTML5 tutorial) was the use of introspecting forms and the ability to submit standard HTML forms, with ‘arguments’ for the form values that needed to be submitted. This allowed the tool to reach into the HTML and use it to represent not just state (the objects) but also the transitions to other states (by submitting forms and returning the data in that HTML page).
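The idea of reading both state and transitions out of annotated HTML can be sketched in a few lines. What follows is not Jon's tool, whose API is not described here. It is a minimal illustration using Python's standard `html.parser` module, with a made-up page, made-up property names and a hypothetical `MicrodataParser` class, and it ignores nested items and most of the microdata specification.

```python
from html.parser import HTMLParser

# Hypothetical page: the markup, URLs and property names are invented.
PAGE = """
<div itemscope itemtype="http://schema.org/Person">
  <h1 itemprop="name">Ada Lovelace</h1>
  <span itemprop="jobTitle">Analyst</span>
  <form action="/people/ada/notes" method="post">
    <input type="text" name="note">
    <input type="submit" value="Add">
  </form>
</div>
"""


class MicrodataParser(HTMLParser):
    """Collects itemprop text values (state) and forms (transitions)."""

    def __init__(self):
        super().__init__()
        self.properties = {}    # itemprop name -> text content
        self.forms = []         # one dict per <form>: action, method, fields
        self._open_prop = None  # itemprop whose text is currently being read

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "itemprop" in attrs:
            self._open_prop = attrs["itemprop"]
            self.properties[self._open_prop] = ""
        if tag == "form":
            self.forms.append({
                "action": attrs.get("action", ""),
                "method": attrs.get("method", "get").lower(),
                "fields": [],
            })
        elif tag == "input" and self.forms and "name" in attrs:
            # Simplification: a named input belongs to the most recent form.
            self.forms[-1]["fields"].append(attrs["name"])

    def handle_data(self, data):
        if self._open_prop is not None:
            self.properties[self._open_prop] += data.strip()

    def handle_endtag(self, tag):
        # Simplification: any closing tag ends the current property.
        self._open_prop = None


parser = MicrodataParser()
parser.feed(PAGE)
print(parser.properties)  # the annotated state found on the page
print(parser.forms)       # the forms a client could fill in and submit
```

A client built this way would read `parser.properties` as the resource's data and treat each entry in `parser.forms` as an available state transition, supplying values for the listed fields when it submits the form.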

Mark Hobson attended this session:

He proposed using HTML itself as the hypermedia representation for RESTful services. This approach has many advantages over JSON or other XML representations, for example: web browsers implicitly become clients of your API; HTML already has comprehensive hypermedia support; and HTML5 provides semantic metadata in the form of HTML Microdata. He demonstrated a simple command line tool that was able to programmatically explore and use any REST API written to these principles, much like a user can navigate any website. Once again we witness the trend of unifying human and computer interaction with web services.

Twitter feedback on this session included:

@RichardLundDev: Building Hypermedia APIs with HTML #qconlondon API doesn't always have to be JSON format. Wise words.

@raymcdermott: html5 replaces json as a data format - discoverability FTW #qconlondon

@BlackPepperLtd: Using #html5 as the mediatype for your RESTful APIs is an interesting idea covered by @Jon_moore @QConLondon. Human discoverable apis

Generic Hypermedia and Domain-Specific APIs: RESTing in the ALPS by Mike Amundsen

Mark Hobson attended this session:

Looking into the future, Mike Amundsen hypothesised how these ideas may evolve in his talk, “Generic Hypermedia and Domain-Specific APIs: RESTing in the ALPS”. He highlighted concern over the recent explosion in web service APIs, specifically as they tend to be proprietary rather than domain-specific. For example, there are hundreds of shopping APIs but there is no single standardised API to access them all through. Mike proposed that we need a common language to standardise domain-specific APIs, much like schema.org does for domain-specific data, which he calls Application-Level Profile Semantics (ALPS). It is very much a work-in-progress but it has great potential to take us towards the fabled semantic web.

Schadenfreude - War Stories

The inevitability of failure by Dave Cliff

Richard Smith attended this session:

Dave comes from a banking background and this presentation talked about failure in real and software systems from a mostly financial perspective. Failures will happen, eventually, in any complex system. He started with several examples of the failure of the financial markets, from the tulip and South Seas bubbles through to the failure of Long Term Capital Management and the May 2010 trough in the Dow Jones index – momentarily the worst one day performance in US market history, although shortly afterwards followed by the best intra-day performance in history as the market recovered.

This type of market failure is happening harder and faster due to the rise of algorithmic trading. In the last decade, algorithmic trading programs ('robot traders') have risen from a small, specialist part of the market into the norm; over 70% of all trades are now performed by computers, often with millisecond response times. A small bug in one of these systems can cause a very fast and serious failure; Knight Capital was seriously damaged by mistakenly deploying development environment market simulation to production, costing them $400m, and on a less serious level, automated book pricers on Amazon.com resulted in a second hand book being offered for over $20m.

Stock trading has always been based on the latest technology and speed of information: first horse messengers, then pigeons, then telegram and telephone communication, and now the Internet. And technology has always had occasional failures, particularly when users become involved, as people can't be rigorously modeled, so even a perfect engineering solution can fail once people are included in the picture: the Millennium Bridge in London was engineered correctly, but was not good for users.

Catastrophic failures often happen because of the normalization of deviance. We start out doing something in a safe and controlled way, and make some guesses about what the safe operating parameters are. But every time we go outside those parameters and there is no failure, even though it triggered all our warning alarms and processes, it is natural to expand the 'safe' operating zone to include the new conditions ... even though the risk of failure is greatly increased. The Challenger and Columbia shuttles were both lost to events which were known to be a potential problem, but for which the deviant parameters had become normalized so that the increased risk was repeatedly taken, until a catastrophic failure did occur.

This problem is also prevalent in financial trading software engineering. Risk management in a new algorithmic trading program is extremely tight, but as the algorithm gets away with making risky decisions, risk management is relaxed until a catastrophic failure (in financial terms this time) occurs. As we see more algorithmic trading in the markets, we are likely to see more technology-created catastrophic market failures like that one in May 2010 (and Dave lists several other examples of individual markets being destabilised by a failure of an algorithm).

Frank Scholten attended this session:

Dave Cliff of the Large Scale Complex IT systems group at University of Bristol warns us about the ever-growing complexity in large scale software systems, especially automated traders in financial markets. Dave mentions recent stock market crashes as failures. These failures did not make big waves in the news, but could have had catastrophic effects if the market did not recover properly. He discusses an interesting concept, normalization of deviance.

Every time a safety margin is crossed without problems, it becomes more likely that the safety margin will be ignored in the future. He argues that we were quite lucky with the temporary market crashes. Because of 'normalization of deviance' it's only a matter of time before a serious failure occurs. Unfortunately I missed an overview of ways to prevent these kinds of problems, if they can be prevented at all. A principle from cybernetics, Ashby's law of requisite variety, says that a system can only be controlled if the controller has enough variety in its actions to compensate for any behaviour of the system to be controlled. In a financial market, with many interacting traders, human or not, this isn't the case.

Painful success - lessons learned while scaling up by Jesper Richter-Reichhelm

Kevin Hodges attended this session:

1. Always check back on reality
2. You will make mistakes
3. Software is easy, data is hard
Ultimately, they looked back having realised they weren’t actually building the right stack, only what they thought was the right stack.

Architectural Hangover Cure

Deleting Code at Nokia by Tom Coupland

Tom Coupland replied to some of the questions the session attendees asked him:

Have you considered datomic to replace mongodb?

In a word ‘yes’, although i wouldn’t use the word ‘replace’ as really it’s about adding tools to your toolbox or weapons to your war chest, depending on your favoured analogy. Datomic’s really interesting and comes up in our conversations pretty frequently. What hasn’t happened yet is enough investment in learning and experimentation for us to begin the real adoption push.

How have you found refactoring and maintenance of Clojure across developers for code they didn’t write without static typing and IDE tooling?

I certainly had some reservations in this area and it was a big concern for us. What i think we’ve found is that it’s not as big a problem as feared. The result of having such a small amount of easily understandable focused code is that maintenance isn’t causing us problems. I think a big part of that is our level of acceptance testing, sometimes it’s a little over the top, but too much is better than too little at the end of the day. Refactoring code, if i’m honest, i’m still slow at and it’s something i really want to improve, it’s not caused too much personal irritation yet though, there’s so much less code ergo there’s less refactoring you want to do.

Any challenges regarding transactions when moving from an ejb db stack to clojure and mongodb?

Not as much as you might think. Part of our desire to move away from what we were using in that area was that we didn’t have any complex transactional needs and yet had to pay the cost of tools that provided those abilities. This was one of those realisations that comes out of thinking about what our software ‘really’ does and what’s the minimal tooling we need to get that done.

Was the 71% decrease compared with the beginning ejb or the spring stack?

That comparison was made between an EJB3 service and the clojure rewrite.

Are you saying that unit tests are not necessary if you know for sure what the code does?

Uh, not really. When writing clojure, particularly when repl driving the code, you’re testing it all the time as you create it. Now you could argue that at the end of that process you could codify the testing you’ve done, but i’ve started to view unit tests as a bit of a straitjacket for code, they slow down your ability to change it. What i prefer is acceptance testing, taking the view that i don’t really care how the service goes about doing what i want, just that it does do what i want.

Do you have a legacy of all the technologies you use along the way? How do you deal with that?

There is legacy that gets built up along the way, but there’s an advantage of the small service approach that these things tend not to need much change, after a while they just sit there reliably doing what they do. Of course occasionally you have to go there and that’s when it’s worth considering a rewrite of the service to a newer style. However you can’t always justify the cost and that’s when you just have to put your professional hat on and do what needs doing. Of course each of those occasions adds weight to the rewrite side of the scales and eventually they’ll tip in your favour.

How did you convince your senior managers to make these changes?

Persuasion. There’s an interesting book about driving technical change from the pragmatic programmers. In brief, you have to relate to their worries and fears, pre-empt them as much as possible, accept the ones that are real and tell them you’re going to take on the responsibility to see the worries don’t come true and make them believe (in) you. You’ve got to sell it to them basically and selling things to people is a bit of an art.

Did you find any problems/issues when adopting clojure?

Obviously there’s loads that could be said here because issues is a very broad term. Problems, though, we didn’t have many. The big problems are mental ones, you’re learning a new way to think and persuading other people of the virtue of new ideas, but that’s not so much to do with clojure, those problems exist in any adoption.

Did you find the tooling and the ecosystem around clojure mature enough?

Yes. The ecosystem is vibrant, and full of cheerfully helpful people that are passionate about the language. If i had to choose one thing that really made a big difference it’s Sam Aaron’s Emacs Live project, this just took away a lot of the emacs learning required and gave the whole endeavour a huge boost.

How hard was the clojure learning curve for you and junior devs adoption time in terms of keeping a clean and concise codebase?

Hard. It’s not easy, it’s just simple. Collaboration and communication are key for a successful adoption, sharing your new clojure code with as many people as possible. Something personal that i realised was that with all my efforts to improve my OO code, by breaking things down (i liked the decorator pattern a lot!), keeping my state nicely contained and not mucking about with it too much, all the good things you’re supposed to do, i was actually laying the ground work for learning a functional style. Then when someone showed me clojure it just seemed like a better way to express those ideas; same concepts, far better way of writing them!

How did you train up all your existing Java developers with Clojure? How did you sell this idea to business, given that this would take out many man-days to transition to a new language?

We bought a lot of books and spread them around. Started having the lunch time meet ups and showed it really working for us. A lot of it people self learned in their own time, others used their 20% time to experiment. We didn’t really do any formal ‘training’, it was mooted and agreed to be a good idea, but in the end it hasn’t happened and the point where it would be useful is passing.

On the man-days point you have to give yourself (and them) a dead line. Essentially saying ‘we think this will work, but if by this point it isn’t, we’ll still have to produce the goods with our current stuff’. You have to earn their trust and make them believe in your ability to pull this thing off.

Has the number of lines of test code increased with the adoption of clojure? Has the number of bugs found in production increased?

Not really, we had a strong acceptance testing ideal before, we still do now. The lines of test code have decreased massively as there are very few unit tests and the acceptance tests are really concise. Bugs in production haven’t changed at all.

Why didn’t you like java in the first place? Why were you trying to substitute it with another language?

In a nutshell i wanted to reduce the amount of time i spent typing out my solution to problems. I want to solve the business’s problems and deliver value, so reducing the amount of pure manual labour involved in expressing solutions is a good thing. Also i started to feel that java was actually hiding the solution to problems, not making them clear. Once you start to see your system as just a flow of data that you apply a few modifications to, you see that java (and objects in general) is not a great tool for expressing that.

Twitter feedback on this session included:

@trisha_gee: Make sure people know what new technology is coming, so they're ready for it, and therefore less resistent @tcoupland #QConLondon

@nemanjavuk: From 5751 #java #ejb3 #hibernate code lines to 1674 #clojure lines with 71% more efficiency! @tcoupland #QConLondon

Agile in Actuality: Stories from the Front Line

People over Process: Applying it in real world software development by Glen Ford

Will Hamill attended this session:

Glen Ford’s talk was about applying the principle of individuals and interactions over processes and tools in real terms, and how the impact of considering the human factors involved in development can make a real difference in team performance. Glen began by recounting from his experience as a team lead of a time when he was given feedback illustrating that his impression of how he led the team was different from how the team were experiencing it and that he wished he had been given the feedback sooner. Glen encouraged us to constantly seek feedback rather than waiting for it, and actually apply it to ourselves.

In a high performing team (especially a team with many experienced members) there is often significant inertia to change, so overpowering that inertia really requires making people emotionally invested in change. In order to give your people direction, you need to sell them on the vision and relate your short term plans back to that vision to demonstrate its relevance. Glen said that we must have a stated motivation to work effectively, and that if you can’t understand the reason for doing something (in terms of the vision) then perhaps you shouldn’t be doing it.

The processes used to guide the team are a set of concepts and not the law; the better your interactions with the people involved then the less you will require the process to instruct or control. This was very much an emphasis of McGregor’s Theory Y over Theory X. Giving your people the right reasons to do something means they’ll usually make the right decision.

Sven Johann attended this session:

He shared his experience from being a tech lead at a start-up. He recognized that his team was doing Scrum as a ritual act, without asking why they were doing certain things. They discovered that a process isn't a rule of law, but rather a set of concepts. Instead of following rules, they formed a team vision and a why for everything they do. If you don't find a why, don't do it. In their specific context, they couldn't find a why for estimations, so they skipped it. Finding a why also encourages communication, and the more communication they had, the less process they needed. The best and most open communication is among team members who know each other's strengths, weaknesses and quirks. So they decided not to break teams apart, but rather to form long-running teams, which eventually got hyper-productive.

Climbing out of a crisis loop: How a critical BBC back-end team reigned in a workflow crisis-to-crisis cycle by Rafiq Gemmail, Katherine Kirk

Will Hamill attended this session:

Katherine began her talk by describing the situation of one of the teams in the BBC working on a high-demand backend media service. When she was brought in after previous managers had quit, she knew that there was a massive percentage of time spent firefighting and a huge gulf between the expectations the team were setting and their ability to deliver, given the quality of the system, the time spent on urgent fixes and communication issues.

Katherine described how her first action was actually to absorb the situation rather than diving in and proclaiming new strategies as some higherups had expected of a new manager. This seems like a personally risky but very wise move, as better decisions can be made with deeper understanding rather than a knee-jerk reaction. She collaborated with the team to understand their problems and their frustration with estimation & planning work that could never all be delivered in the required time, and then ensured that the expectations of the team were reset so that they instead could under-promise and over-deliver.

Planning was reduced and capacity was projected based on a more realistic understanding of what can actually be tackled and how much time must be spent on urgent fixes. Team members were rotated through dev/ops/test/firefighting workstreams on a two-week basis, which I think is a great idea. This spread knowledge around and also reduced the perception amongst the team that some people got to do the “new” stuff and some people just got to fix bugs.

The team used finer grained boards to display more accurate progress - ‘done’ became two columns of ‘development complete’ and ‘in review’ in order to eliminate the typical progress update of “I’m nearly done”. Being open and truthful both within the team and in the team’s capacity and progress to others was important to improve communication. Katherine also described how they defined all the implicit and assumed parts of the process in order to ensure they could properly track what work needed to go into particular actions. This is one of Kanban’s core practices: “make policies explicit”.

Katherine’s main push was to improve the communication and to try to empower the team that they could solve the problems on their own; that they would essentially become self-managing. A good test of this would be if a manager of the team could take a few days’ leave during a week without having to necessarily have a replacement step in for the entire time. A common theme during this talk as with most of the talks on Thursday was of following the agile principles rather than any particular strict process.

Between Fluffy Bunnies and Command & Control: Agile Adoption in Practice by Benjamin Mitchell

Will Hamill attended this session:

Ben started by describing an agile team he had led in an organisation that had previously had bad experience with agile, so when running his team they were actually having their standups in secret. I can understand why this was done, in order to ensure the team can actually be productive without a top-down command being used to force them into ineffective practices, but as Ben said, you can’t be open and transparent when you’re hiding in the stairwell to have a meeting.

Ben described a few of the things he had done to try and encourage open communication; people avoid embarrassing or threatening truths so it was important to make negative views discussable. Reducing the barriers to people raising problems was important - for example having the question “what wasted your time today?” being a stock question at a standup meeting gets people being more direct about it. One nice touch was to implement the ‘two hand rule’ at standup meetings to ensure people are being concise and relevant: if at standup someone is saying something you think is too long, detailed, irrelevant, etc then just put your hand up. When two people have put their hands up, the person currently blabbering on will take their issue offline, no questions asked. This is a good suggestion to ensure smoother flow of the standup and prevents it from turning into a 20-minute affair.

Ben had in one team noticed poor morale regarding progress and productivity, which turned out to be because the entire product backlog had been stuck up to the left of the board. This meant that team members looked at their progress for that sprint and at a glance only saw that there was a huge amount of work still to do. I’ve heard this before on the Agile NYC podcast when a team was depressed by the insanely long and detailed product backlog. I think there are two issues there in terms of the need only being to illustrate the current sprint backlog in terms of glance & go progress on the board, but also a hugely detailed and long product backlog may be a symptom of too much analysis.

When it came to making decisions about the process and the project, Ben said that we should explain the observations and inferences that lead us to make suggestions; say what you have seen and ask what others think about it in order to draw out your own assumptions and to encourage communication. When it comes to negotiating with business owners, it is best to show people rather than just telling them. Demonstrate your progress and negotiate based on the real data to hand.

Twitter feedback on this session included:

@chickoo75: People blame others, systems and "deny that they are denying" #qconlondon @benjaminm

@pablojimeno: How we think we act and how we tell others we act is different to how we actually act. @benjaminm #QconLondon

@janerikcarlsen: Everyone is for the truth, as long as the truth is not embarrasing or frightening #qconlondon

Accelerating Agile: hyper-performing without the hype by Dan North

Will Hamill attended this session:

I’ve just made a few bullets under the points that Dan described in his talk:

1. Learn the domain (use BA to educate devs not as permanent conduit)
   - Devs sent on same domain course as real business people!
2. Prioritise risky over valuable
   - Opportunity cost - when doing X be aware of every other Y you could be doing and if Y is more valuable, change
   - Within MVP, order doesn't really matter so do risky first
3. Plan as far as you need
   - Review your planning horizon
4. Try something different
   - Assume there is something you haven't tried that could benefit you
   - Different languages to get different perspectives, likewise programming styles
5. Fire, aim, ready
   - Get something in front of actual users for actual feedback
   - Showcase frequently
6. Build small, separate pieces
   - DRY is the enemy of decoupled (counterintuitive perhaps but be aware)
7. Deploy small separate pieces
   - Make component deployment quick
   - Make product deployment consistent
   - Make components self describing and environments unsurprising
8. Prefer simple over easy
   - Don't always just bring in big, complex things just because it's a one line dependency
   - A cool little dependency/package management tool called Fig
9. Make the trade-offs
   - build v buy v oss
   - Framework v roll your own (e.g logging via System.out.println)
10. Share the love
   - Code Reviews - keep quality up, spread knowledge
   - Learning lunches
   - Great On-boarding
11. Be okay with failure
   - Be broad-minded, see bigger picture of business significance of actions
   - Think about the product rather than the project
   - Progress via experiments
12. There are always 12 steps
   - Delivering in this fashion can be addictive!

Twitter feedback on this session included:

@teropa: Prioritise risky over valuable. Find where the dragons lie. #qconlondon @tastapod

@jgrodziski: #qconlondon "learn the domain" : seed the team with a domain expert, study trading like a trader, practise trading with the traders

@benjaminm: I have sat in a room for 2 days, coming up w 400 stories, with ratings of risk and other made up stuff @tastapod's confession #qconlondon

@AgileSteveSmith: DRY is the enemy of decoupled "DRY within a bounded context" Amen! @tastapod at #qconlondon

@jgrodziski: #qconlondon @tastapod step 6 of hyper-performing agile: "build small, separate pieces" and share memory by communicating

@laura_jagger: If you never rollback, you never have to solve the problems of a rollback - the way forward may look like a revert #qconlondon @tastapod

@teropa: Make the tradeoffs. "Does logging really need a framework?" #qconlondon @tastapod

@klangberater: My key learning from @tastapod talk: as soon as you are religious about something, you are on the wrong path #tdd #qconlondon

@octoberclub: think product development not project delivery @tastapod #qconlondon

@shuttlebrad: May need to lock my team in a room to watch @tastapod’s #QConLondon talk. Urgently.

@portixol: Another theme of this years #qconlondon is using functional programming paradigms in Java. @tastapod recommended http://t.co/duM7FoCevx

Yanking business into testing - with lots of vegetables by Gojko Adzic, Lukas Oberhuber

Will Hamill attended this session:

Lukas and Gojko made observations about the difficulties in the team structure and organisation. One of the biggest problems was that separating development from testing leads to much longer delivery cycles. This happens even when you pretend that you’re agile “because we’re doing sprints” and you do all your development in one sprint and throw it over the wall to be tested in the next. The grief you’re causing yourself here is that it’s at least one sprint before you can tell that your work isn’t actually done and needs fixing. Seems obvious to some but some organisations have a real problem in bringing testing within the development process and leave it after the fact.

Testing is not done to prevent business risk, because this at best just creates inertia. Testing is done to enable change - it should provide a safety net and feedback. Tests should be isolated from unnecessarily testing implementation when they should be testing outcome. Sure, unit tests are going to be testing implementation details but higher level tests shouldn’t be relying on really low level stuff. If your test steps describe just how to test something and not what it is you’re actually testing then you’re doing it wrong. The test should be for the action and should not in most cases describe a long workflow.

Instability in automated testing is like kryptonite for applications. “Just run it again, that one always fails” means you’ve got a test that is worth less than no test, because it’s giving false negatives. Ensure that your integration tests are testing how components interact with each other, not the business logic. This should be done at a higher semantic level of acceptance testing.

Next Generation Mobile Apps

New capabilities of HTML5 browsers by Maximiliano Firtman

David Arno attended this session:

Developing for mobile has three key problems:

There are hundreds of browser variations out there, with different behaviours and different features. Max cited various examples, such as different versions of the Android browser baked into different versions of Android, most of which cannot be upgraded; the fact that Chrome on iOS is effectively a skin on top of Safari; and how in iOS, the “web view” version of Safari runs a completely different JavaScript engine to the browser. Then there’s browsers like Silk (on Kindle devices) and the variations in IE between Windows 8, Windows RT and Windows Phone. And so the list goes on.

HTML5 is in draft and in flux. Many browsers can claim to be HTML5 compatible, yet offer completely different behaviours, when experimental feature APIs are used for example.

Screen-size hell. Different devices offer wildly varying resolutions and screen sizes. Rendering an HTML app well in all these resolutions is a massive challenge.

Max offered some great advice on dealing with these issues, the absolutely most important of which is never, ever, ever try and detect the device and simply serve up a fully featured experience to those browsers you’ve chosen to properly support and a reduced experience for everything else. It’s a lazy and stupid approach that will frustrate users when newer versions of their browser/OS appear that fully support your app, which you haven’t tested against. Instead, use feature detection (via JavaScript frameworks like Modernizr if you wish) and the responsive web design & progressive enhancement patterns to serve an adaptive experience that utilises as many of the user’s browser capabilities as possible. Oh, did I mention that you should never use device detection?

Twitter feedback on this session included:

@reteganc: There are around 200 mobile browsers .. wow. @qconlondon

@reteganc: All HTML5 features impl. in mobile browsers are in draft .. because HTML5 specs are not ready/final. @qconlondon

@reteganc: Chrome for iOS is actually a Safari with Chrome look. @qconlondon

@reteganc: HTML5 for mobile.. Don't use browser detection but feature detection. @qconlondon

Architecting PhoneGap Applications by Christophe Coenraets

Twitter feedback on this session included:

@teropa: When building on #phonegap, keep your app browser-runnable -> when you have a problem, you can use chrome dev tools. #qconlondon

@teropa: Nice to see the horrible 300ms click delay problem on mobile web getting some attention. #qconlondon

@teropa: Some performance numbers for options in removing elements from a web page #qconlondon http://t.co/WnG0gXEqnW

Finance (Design & Architecture)

High Performance Messaging for Web-Based Trading Systems by Frank Greco

Mike Salsbury attended this session:

This was all about the history of ajax, comet and websocket. We’ve been part of and innovating in this space for over a decade. His vision for websocket is that with a secure, standard port, full duplex way of communicating over the web, we’re moving to SOA going outside corporate firewalls. Food for thought. He also said to “not program at the socket level” but build your application-level logic (resend, failover, etc..) on top of websocket.

How NOT to Measure Latency by Gil Tene

Richard Smith attended this session:

Gil is the CTO of Azul, who make a fast, low latency and low pause time JVM (Zing). In this presentation he explained how naive measurements of latency and response time can lead to incorrect results and poor decision making.

Before deciding what measurements to take, it's important to consider why we want to measure response times. What features of the response time distribution do we care about? When a system is loaded, it doesn't have a fixed response time as a function of load; typically, the distribution of 'hiccups' (pauses where response times are anomalously long) is multimodal as different types of 'freeze' take effect. These distributions can't be modelled accurately by average and standard deviation; the whole shape of the distribution is important.
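A small worked example makes the point about averages concrete. The sample below is invented for illustration (it is not data from the talk): most responses are fast, some hit a short pause and a few hit a one-second freeze. The `percentile` helper is a hypothetical nearest-rank implementation, one of several common percentile definitions.

```python
import statistics

# Hypothetical response times in milliseconds with three distinct modes.
samples = [2] * 950 + [50] * 40 + [1000] * 10


def percentile(values, pct):
    """Nearest-rank percentile; `pct` is an integer from 1 to 100."""
    ordered = sorted(values)
    rank = (pct * len(ordered) + 99) // 100  # integer ceiling of pct% of n
    return ordered[rank - 1]


mean = statistics.mean(samples)
stdev = statistics.pstdev(samples)
p50 = percentile(samples, 50)
p99 = percentile(samples, 99)
worst = max(samples)

print(f"mean={mean:.1f}ms stdev={stdev:.1f}ms")
print(f"p50={p50}ms p99={p99}ms max={worst}ms")
```

In this made-up sample the mean is a value that almost no request experienced, while the median, a high percentile and the maximum each describe a different mode of the distribution.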

Different applications have different requirements for latency behaviour. A critical system may have absolute limits on what the worst case response time can be, which in a way makes measuring performance easy: the only factor you care about is the maximum time. But for 'soft' real time applications, like algorithmic trading, or interactive systems where the requirement boils down to 'don't annoy the users', the performance percentiles under projected maximum load are what matter. So before investing time into measuring response times under load, it's important to establish the actual performance percentile requirements of the application. The idea of 'sustainable throughput' is the maximum frequency of requests that can be serviced while satisfying the latency requirements, so it makes no sense without knowing the requirements.

One of the most common problems in measuring response times is the Coordinated Omission Problem: observations don't get missed at random, and it's disproportionately the bad answers that get missed out. Most load testing frameworks create lots of threads or processes, each of which streams requests at the target. That means that if a request takes an unusually long time, the thread or process is waiting for it to return, and not submitting more requests – thereby failing to record as many results during a bad time! This can seriously affect the accuracy of measurements; if you are submitting requests every 10ms, and there is a 'hiccup' of 1 second every 10 seconds, you are failing to record about 100 bad results in that time. The 99% latency in this scenario is really close to 1 second, but a measuring tool will record it as 10ms! An unreasonable difference between the 99% value and the maximum value can be a good indication that your load test has this problem.
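The effect described here can be reproduced with a small simulation. The sketch below is only an illustration and not Gil's tooling: it assumes a server that answers in 1 ms when healthy and freezes for 1 second every 10 seconds, and it compares a single client that waits for each response with the latencies that would have been seen had every request gone out on its 10 ms schedule. All names and constants are made up for the example.

```python
# All times are in whole milliseconds to avoid floating-point drift.
INTERVAL = 10        # intended gap between requests
SERVICE = 1          # assumed service time when the server is healthy
PERIOD = 10_000      # a hiccup starts every 10 s ...
PAUSE = 1_000        # ... and freezes the server for 1 s
DURATION = 100_000   # length of the simulated test


def completion_time(start):
    """Time at which a request issued at `start` gets its response."""
    offset = start % PERIOD
    if offset < PAUSE:               # issued during a freeze: wait for it to end
        start += PAUSE - offset
    return start + SERVICE


def percentile(samples, pct):
    """Nearest-rank percentile for an integer `pct` between 1 and 100."""
    ordered = sorted(samples)
    rank = (pct * len(ordered) + 99) // 100   # ceiling division
    return ordered[rank - 1]


def closed_loop_latencies():
    """One client thread that waits for each response before sending again."""
    latencies, now = [], 0
    while now < DURATION:
        done = completion_time(now)
        latencies.append(done - now)
        now = max(done, now + INTERVAL)   # a slow response delays the next send
    return latencies


def scheduled_latencies():
    """Latency measured from the time each request was supposed to be sent."""
    return [completion_time(t) - t for t in range(0, DURATION, INTERVAL)]


closed = closed_loop_latencies()
scheduled = scheduled_latencies()
print("closed loop: p99 =", percentile(closed, 99), "ms, max =", max(closed), "ms")
print("on schedule: p99 =", percentile(scheduled, 99), "ms, max =", max(scheduled), "ms")
```

Under these assumptions the waiting client records only one slow sample per hiccup, so its 99th percentile stays at the healthy service time, while measuring against the intended schedule puts the 99th percentile at roughly 0.9 seconds. Both runs see the same maximum, which is why a large gap between the 99% value and the maximum is a useful warning sign.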

Before running a measuring tool against a real system that you're interested in, it's a good idea to create a synthetic system with known hiccup behaviour (for example deliberately turning it off for some time), and make sure that the monitoring tool you are using correctly characterises that system. If it doesn't, Gil offers the HdrHistogram library which can characterise response time results correctly.

Finally, Gil ended with some comparisons of servers running Azul's Zing JVM against those using the standard one – using non-normalised charts because, as he puts it, "it's really hard to depict being 1000× better in 100 pixels".

Mike Salsbury attended this session:

When measuring latency, don’t measure the average and standard deviation. He showed that a dataset with latency spikes – or “hiccups” as he put it – gets smoothed by average and standard deviation. All systems have hiccups, whether it’s garbage collection, database re-indexing or resizing memory allocations. They are all things you have to pay for on a regular basis and they introduce latency. You need to measure max latency and percentiles. With all this data you can tell if there are hiccups.

Another thing missed in latency testing is “co-ordinated omission”, where previous requests take longer and the test clients wait for the previous request to complete before requesting again. This creates a smaller, less accurate dataset.

In reality, if you’re latency testing you should try a test where you create a hiccup by pausing the machine and see if your results can pick it up.

He also showed jHiccup (http://www.azulsystems.com/jHiccup), a tool that adds a thread to a running JVM, sleeps for a millisecond, wakes up and measures whether it actually was a millisecond since it slept. If it was longer, there will have been a hiccup and we can now measure that.
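The sleep-and-compare idea is easy to try outside the JVM. The following is a minimal Python sketch of the same principle, not jHiccup itself; the function name and defaults are assumptions chosen for the example.

```python
import time


def measure_hiccups(samples=1000, interval=0.001):
    """Sleep for `interval` seconds repeatedly and record how late each wake-up was."""
    overshoots = []
    for _ in range(samples):
        before = time.perf_counter()
        time.sleep(interval)
        elapsed = time.perf_counter() - before
        # Anything beyond the requested sleep is time the platform "stole".
        overshoots.append(max(0.0, elapsed - interval))
    return overshoots
```

Sorting the returned list and reading off the maximum and the high percentiles gives a rough picture of the pauses the platform imposes on any thread, independent of the application's own code.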

Architectures of the Small & Beautiful

Startup Architecture: how to lean on others to get stuff done by Robbie Clutton

Kevin Hodges attended this session:

Small beautiful architectures
“Don’t allow your codebase to evolve into a big ball of mud”
Make it work, make it right, make it fast – Kent Beck
Create features with a hypothesis around how they impact
Make sure the hypothesis is easily validated
Write code that is always production ready and easy to change
Lesson 1, simple user testing is simple, don’t assume anything and always be validating
Lesson 2, Use tools to discover simple mistakes, passing tests doesn’t ensure production ready, use SQL Explain on slow queries
Lesson 3, Shorten the request/response cycle, do the minimal amount of stuff possible. “perceived performance is more important than actual performance” (conditional loading etc)
Lesson 4, Focus on your differentiators, don’t over engineer stuff
Lesson 5, Simple elegant design can prevent complex architecture creep
Lesson 6, Feature flags, offer resilience as well as a way to offer features
“complex should just be lots of simple”

Richard Smith attended this session:

A common problem that almost all systems run into at some point is complaints that the system is 'too slow'. This can be addressed by using profiling tools to find the slowest part of the application, and then concentrating efforts on that section. Caching can improve performance, but it can be difficult to do it correctly, for example finding all the places where a cache item should be invalidated is not easy. Resorting to a cache can also hide poorly written code, as it will still be slow, it just won't get run as often. Mark Pacheco made a similar point about the pre-rebuild Songkick architecture in his talk later on Friday.

If you have dependencies on other services in your application request-response cycle (or the equivalent in an interactive application), they should be 'weak dependencies'. Robbie shared an example of a web application which had a bug recorded against it as users not being able to register; it turned out that the problem was with the mailing list provider used for signing people up to newsletters, but it had been coded in as a hard dependency, and being unable to connect to it was causing the whole registration process to fail. Wherever possible, calls to third party services should be done in the background or deferred if they can't be executed immediately, and the main application flow be allowed to continue.
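The registration example can be sketched in a few lines. This is a hypothetical illustration rather than the code Robbie showed: `register_user`, `MailingListUnavailable`, the `subscribe` callable and the retry queue are all invented names.

```python
import queue


class MailingListUnavailable(Exception):
    """Raised by the (hypothetical) newsletter client when the provider is down."""


def register_user(email, users, subscribe, retry_queue):
    """Create the account first; treat the newsletter signup as a weak dependency."""
    users.add(email)                  # core step: this is what the user asked for
    try:
        subscribe(email)              # secondary step: call to a third party
    except MailingListUnavailable:
        retry_queue.put(email)        # defer the signup instead of failing registration
    return True
```

A background worker could later drain `retry_queue` and retry the subscription, so an outage at the mailing list provider never blocks sign-up.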

A related point is that web applications should load the primary content first, and then pull in secondary content (ads, links to recent news, Twitter feeds etc), so if some piece of secondary data isn't available, the main content is still shown to the user. The perceived performance of a system (for a website, receiving the content that you want to read) is often more important than the actual performance (completed page load time).

The most expensive resource you have is time, particularly in a small agile team, so it doesn't make sense to spend time replicating something that another tool or library already does, particularly if you only have time to make a poor implementation. Just buy and use the tool! It's generally not possible to build an adequate replacement in the time that is available for the cost of the tool, and it makes sense to concentrate on the functionality that will differentiate your application from the rest of the market, not the functionality that everyone has. And only implement what you really need ... which means you need to ask questions of the customer to find out what that minimum set actually is.

One approach to maintain performance of a system under load is to follow the lead of The Guardian with their emergency mode: if load starts to get too high to offer the full experience to all requests, instead of rejecting requests, disable expensive but secondary features so that the main content can always be served.
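As a rough sketch of the idea (not The Guardian's implementation), a page renderer can simply skip secondary features once a load measure crosses a threshold. The load metric, the threshold value and the function names here are assumptions.

```python
def render_page(main_content, secondary_features, current_load, emergency_threshold=0.8):
    """Always return the main content; add secondary features only when load allows."""
    parts = [main_content]
    if current_load < emergency_threshold:
        # Each feature is a callable that produces an optional page fragment.
        parts.extend(feature() for feature in secondary_features)
    return parts
```

The request is never rejected: under heavy load the reader still gets the article, just without the expensive extras.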

Architecture should be allowed to evolve: refactoring should occur on a design level as well as a code level. Don't create abstractions and general solutions until you're sure you will need them – creating an abstract general solution takes work, and if it isn't needed then that is wasted effort. A specific case can always be refactored into a general one later if it is needed.

Twitter feedback on this session included:

@teropa: #guardian achieves zero downtime deployment by serving static content from a CDN during deployment breaks. @robb1e #qconlondon

@matlockx: robb1e "refactor continuously" yay, all know that but how many r doing that? a must! ;) #qconlondon

@jgrodziski: #qconlondon @robb1e "spend your time wisely" work on your core domain, use off-the-shelf components anywhere else. don't reinvent the wheel.

@teropa: when we see a puzzle to solve,it tends to get the better of us.Should spend our time more wisely-can I buy it instead? #qconlondon @robb1e

@pjwalstrom: good quotes from @robb1e at #qconlondon: "Only two hard problems in Computer Science: cache invalidation and naming things." Phil Karlton

Inside Lanyrd's Architecture by Andrew Godwin

Alex Blewitt attended this session:

They have a highly available architecture which allows them to evolve the site, and have the ability to put the site into read-only mode with cached content should the back ends suffer from any issues (or upgrades). They deploy content continuously (several times a day) and roll out new features with feature switches that allow some parts of the stack to be switched on or off depending on the user or group that they are associated with. This doesn’t do A/B testing in the strict sense, but does mean that the beta users get to see the new features before the wider audience against production data, so that they can get a feel for what works and what doesn’t work before it goes live.
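A feature switch keyed on user or group can be as small as the sketch below. It is an illustration only, not Lanyrd's code; the class and field names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class FeatureSwitch:
    name: str
    enabled_for_all: bool = False
    groups: set = field(default_factory=set)   # groups that see the feature early
    users: set = field(default_factory=set)    # individual users that see it early

    def is_on(self, user, user_groups=()):
        """True if the feature should be shown to this user."""
        return (
            self.enabled_for_all
            or user in self.users
            or bool(self.groups.intersection(user_groups))
        )
```

For example, `FeatureSwitch("new_calendar", groups={"beta"})` would show the feature to beta testers against production data, and flipping `enabled_for_all` rolls it out to everyone without a deploy.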

Green shoots in the brownest field: Being a startup in Government by Mat Wall

Will Hamill attended this session:

Mat described the evolution of the tech stack of the GOV.UK platform, and how it was not necessarily intended from the start to be a platform but rather a solution to a problem. Mat described that letting the developers involved make most of the technical decisions rather than a traditional approach to ‘strategic architecture’ had enabled them to solve problems faster and with the most suitable tool for each part of the problem rather than being prescriptive and restrictive about implementation details.

Mat illustrated how the architecture had changed most in terms of the interactions between the components within the publishing platform, and how simplicity has been important in solving just the problem at hand. Mat argued that the involvement of the developers as the actual problem solvers and as trained professionals rather than just keyboard bashers was critical to good communication and the kind of working environment that was necessary to keep productive, talented people at work as opposed to the kind of bureaucratic environment that has in the past often driven many people away from the public sector. Having the developers well integrated with the rest of the team (architects, product owners, delivery managers) gave them the context and information needed to make the right decisions, rather than simply isolating the developers from the outside entirely.

Mat also described how making some tradeoffs or using a ‘good enough for now’ approach was involved in keeping momentum, rather than stopping work and getting mired down in external dependencies and having to work to outside parties’ deadlines. Mat’s example of this was a project currently in progress to develop a system for individual online electoral registration, which relies on integration with approximately 400 local authority based Electoral Registration Officers (EROs), and also with the Department for Work & Pensions (DWP) in order to provide confirmation of identity data. As an aside, Mat made the point that this kind of integration was happening across many new government transaction projects because, contrary to what you may infer from the tabloids, there is in fact no one ‘big government database’!

In the above project, integration with EROs and DWP was necessary for the team to progress with integrating and testing their services but the dependency on an actual electrical datalink between these third parties would be a major impediment to the team’s momentum and would significantly delay progress. In the interim, instead, the arrows on the diagram (as it were) were in fact connected not yet with a secure connection but with “high bandwidth, long-latency” transport: a secure motorcycle courier. Data was delivered encrypted via a vehicle in order for progress to be made. The motorbike diagram got a laugh from the audience but the real point is that making the trade-off allowed the team to maintain momentum. This example was particularly poignant for me as the team that I’ve had the privilege to be working with for the last 9 months has been on this project, and the effect on us was that at the time we could make significant progress without having to seriously delay the integration phase of the project.

How we scaled Songkick for more traffic and more productive development by Marc Pacheco

Richard Smith attended this session:

Their initial architecture was a typical one for a web startup: a MySQL database, fronted by a web application layer that did everything else – not just the web site itself but also auxiliary functions like their scraping and data ingest tools. That meant that a change to any part of the site's functionality required them to redeploy everything, and the unknown dependencies within the web layer meant that changing one part of the site could break things in a completely different part. Their builds would take hours, even using an Amazon EC2 cluster, there were complex relationships between supposedly disparate parts of the system, and the dependency graph became very unwieldy (Mark showed it to us, and it was so dense you could hardly see the lines).

They decided they needed to re-architect it to allow for scalability, to allow their development effort to be applied to functionality, not fixing bugs, and to speed up their release cycle so that they could get new features into production faster. That is a big decision: re-architecting takes a long time, and it isn't guaranteed that the outcome will be better than what already exists. There is also the consideration of whether to re-architect within the framework of the existing system, or whether it would be faster to simply start again with the knowledge gained from the first system.

If you are going to do a major redesign, there are some important points. The design work should be collaborative across the whole team, just as development is. Clear names need to be chosen for objects within the design, and agreed upon by everybody (from developers through to managers, product people and salesmen), so that discussions about what work needs doing are clear for everyone. The existing feature set needs to be looked at and you need to decide what can be cut out from the new system, and what the minimum acceptable set of features is for the first release of the new system. And, if it's possible, the redesign should be done piecewise, so that the whole system is always in a working state of some kind – an application that is taken down so it can be rebuilt, or for which development stops and the old version is left up until the new one is finished, is likely to lose custom.

Songkick decided to move to a strongly decoupled service model. They created a collection of Sinatra (Ruby server technology) applications which deal with the database, and accept requests via HTTP containing either JSON or form-encoded data, returning JSON. Their web application, a Ruby on Rails app, acts as a client to these services, and doesn't have a direct connection to the database. They also redesigned the object model for individual pages, moving to an MVC approach, and their page model now consists of components, which themselves may be made up of elements. Elements can be used in different contexts, and they pick up data from the context they are in (from the page model or the component model).

They also chose to link assets together by convention. A component model class name matches the name of the CSS to be applied to that component, to any JavaScript file that needs to be included with that component, and to a directory for any static assets (e.g. images) it uses. That means that it is very clear what needs to be looked at if a component needs changing or removing.
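To make the convention concrete, here is a small sketch of how a component name could map to its assets. Songkick's real layout was not shown in this summary, so the directory names, the dash-separated file naming and the `asset_paths` helper are all assumptions (and the sketch is in Python rather than their Ruby stack).

```python
import re
from pathlib import PurePosixPath


def asset_paths(component_class_name, root="assets"):
    """Derive the CSS, JavaScript and static-asset locations from a component name."""
    # "EventListing" -> "event-listing"
    slug = re.sub(r"(?<!^)(?=[A-Z])", "-", component_class_name).lower()
    base = PurePosixPath(root)
    return {
        "css": str(base / "stylesheets" / f"{slug}.css"),
        "js": str(base / "javascripts" / f"{slug}.js"),
        "static": str(base / "images" / slug),
    }
```

Because every path is derived from the one name, removing a component means deleting exactly the files this function returns.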

The result was a radical improvement in productivity, with a release cycle ten times faster; much better performing code (they removed all of their internal cache code and page load times didn't change); a code base that halved in size, partly due to having fewer but better targeted tests; much faster builds, down from over an hour on a cluster to under 10 minutes on a single machine; and a more collaborative, evidence-based development process.

Architecture of the Triposo travel guide by Jon Tirsen, Douwe Osinga

Will Hamill attended this session:

Triposo’s main difference is that their travel guides are algorithmically generated rather than by hand, which means that they can collate and aggregate incredible amounts of data, gaining insights into sights, restaurants, events etc from a larger scale view….

The details of Triposo’s application development and deployment process were an interesting insight into how a company of their small size can regularly push updates for ~80 apps to the iTunes App Store.

The team is distributed and communicates mostly via Google Hangouts or similar, but it was very important to have the team come together often in order to build social connections and camaraderie. Douwe showed photos from a project kick-off meetup when the development team had all gotten together in Gran Canaria. I’ve read a few other articles about distributed teams and it seems that having the ability to get everyone in the one place for dinner, drinks and a bit of fun does have a positive impact on communication and relationships in the team (where often the distance and separation can be detrimental). Triposo have a focus on expanding their product through experimentation which is done at company hackathons and the results are either built into the product or given away to the community via the Triposo Labs site.

Triposo’s data aggregation and build process (for they are one; each app must work offline) is based largely upon crawling open data such as OpenStreetMap, Wikitravel and Wikipedia for places and other sources for inclusion in the app, but also makes use of knowledge gained from crawling across closed data that can’t be used directly within the app. For a simple example, the number of photos found on Flickr of a given place can be used to determine how popular a site is with tourists. Much more detailed analysis and other inferences are made with these kinds of data sets and while my notes here are unfortunately sparse it was very interesting.

The data, once aggregated, is sync’d with Dropbox and accessed by the build server. Build, signing, testing and app store submission of the ~80 apps are orchestrated by a set of VMs as the process for each app takes over an hour and would be infeasible to do one at a time. This enables Triposo to send an app to the app store for each major city/region which is necessary for SEO reasons because their competitors all sell single region apps and otherwise a global Triposo app won’t end up in the search results.

Sven Johann attended this session:

The presenters, former Googlers and ThoughtWorkers, are avid travelers and wanted to know if they could do better than the common travel guides like Lonely Planet & Co. So they started doing what they learned at Google: crawl the web, aggregate, match, and rank. They send their crawlers to fetch gigabytes of travel related content from all kinds of sources like Wikitravel, Wikipedia, Facebook, OpenStreetMap, Flickr and some more.

Once they have all the data, it's time to parse. From each source they extract information about the places like villages, cities and countries, and the points of interest (restaurants, museums, shops, trees, etc). They're looking for patterns to create one bucket of information for a particular place from all the various sources they crawled. After this phase they end up with exactly one record for each place or point of interest that has all the information from any of the sources they've used. Now it is time to rank and these ideas were pretty cool. Among other things, they extract metadata from Flickr pictures like where and when the pictures were taken. That brings them interesting information about possible events, e.g. there are many pictures around 52°38'N 4°45'E, but only from April to September and only on Fridays between 10.00 and 12.30. There must be something interesting! That's the cheese market in Alkmaar. So, if you're on a trip in Amsterdam, your Triposo travel app suggests a day trip to Alkmaar on Friday (with my Lonely Planet book I usually see that only when it is already too late).
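The Alkmaar example boils down to bucketing photo timestamps for one location by weekday and hour and looking for a slot that dominates. The sketch below shows that idea only; it is not Triposo's algorithm, and the function name is invented.

```python
from collections import Counter
from datetime import datetime, timedelta


def busiest_slot(timestamps):
    """Return ((weekday, hour), share) for the most photographed weekly slot.

    `weekday` follows datetime.weekday(): Monday is 0, Friday is 4.
    """
    counts = Counter((t.weekday(), t.hour) for t in timestamps)
    slot, hits = counts.most_common(1)[0]
    return slot, hits / len(timestamps)
```

If one slot holds most of the photos taken near a place, as Friday mornings would for a weekly market, that is a hint that a recurring event happens there; a real pipeline would also need to look at the months involved and at how many distinct photographers contributed.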

The Modern Web Stack

Visualizing Information with HTML5 by Dio Synodinos

Alex Blewitt attended this session:

He covered using techniques like CSS3 transitions and 3D transformations, as well as Canvas and WebGL drawing. There were also some cool demos, such as http://graphicpeel.com/cssiosicons which shows a screenshot that looks very much like the iOS home screen, but with icons that are entirely CSS based. There were also some Canvas demos, such as BrowserQuest, a graphical adventure rendered in a Canvas.

There were also some other JavaScript libraries mentioned, such as Raphaël, D3.js (again), and Fabric.js (providing Canvas with fallback to SVG).

Richard Chamberlain attended this session:

35,000 years ago the cave painters used their cutting edge technology to express themselves: fingerpainting. Today we have a number of low and high level tools to visualize with on the web: CSS, SVG, Canvas, WebGL; as well as Raphael, processing.js and D3.

We walked through examples of consuming and repackaging data with pure CSS3, or mediated by JavaScript, via heatmaps of the most forked and most watched projects on GitHub. We were then shown a comparison of the relative merits of SVG, Canvas and WebGL as the low level options. Something new for me here: Google happily indexes SVG, which is useful if you’ve got data in a diagram you need indexed.

Raphael produces SVG output which automatically falls back to VML on older browsers.

Processing can be implemented in pure JavaScript or Processing compiled to JavaScript.

D3 was the most interesting though, as it binds data to the DOM and then applies data driven transformations from then on. It also includes beautiful ready to use layouts. We were shown visualizations for navigating into graphs, and Radar charts.

Rich HTML/JS applications with knockout.js and no server by Steven Sanderson

Alex Blewitt attended this session:

Steven Sanderson (@stevensanderson) from Microsoft showed a demo of using Knockout.js to wire up UI models with underlying events, so that an object model could be used in the browser to render content whilst at the same time reacting to changes in the model. I’ve seen this approach before in other libraries like Objective-J and SproutCore (which seems to have passed on, much like iCloud has). The difference with Knockout.js seemed to be that it was a much more natural fit for JavaScript, without having to go into many details as to the underlying framework.

The second part of the demo was how to wire up Microsoft Azure services so that they could drive the JavaScript app, by exposing a NoSQL type DB with REST CRUD operations directly from the JavaScript app itself. The fact that all this existed within a ready-to-go Azure console was a pretty slick part of the demo (and subliminal advertising) – but Steven also had a demo in the Apple app store using PhoneGap as a web view for the same JavaScript app that he had developed (or at least, a previously created version of the same).

Finance, Technology & Implementation

In-Memory Message & Trade repositories by John T Davies

Richard Chamberlain attended this session:

We started with a brief history and the increased regulation required post 2008 by Dodd-Frank and EC legislation. This, coupled with the explosion in the volume and complexity of trades to be stored, means a rethink of the data layer is necessary. Derivatives are now ubiquitous and the FpML they are based on can be Byzantine in its complexity.

The established data storage is all relational in structure, but now the majority of data that is required to be stored is hierarchical. In a sense there are two options, but with either a bridge has to be built. Either the relational databases stay and they do little more than become indices to CLOBs of hierarchical data, or there is a need to look at Graph and NoSQL databases. The RDBMSs will become difficult to search, and performance is the key here. Alternatively the Graph and NoSQL databases may be a better architectural solution, but they do not yet have a good search query structure.

Consumerisation - what does it mean to a developer? by Chris Swan

Twitter feedback on this session included:

@andypiper: “Hybrid is NOT the best of both worlds” - end up doing more coding to tailor for platforms @cpswan #qconlondon

@andypiper: heh - not a hybrid strategy http://t.co/NzukgHgvy3 @cpswan #qconlondon

The technology behind an Equity Trade by John O'Hara

Richard Chamberlain attended this session:

This was a great and detailed talk about how the Equity markets work across the world. Apparently equities are the second simplest asset class after FX. NYSE was the biggest market by value, but India the biggest by volume. A pointer to the way exchanges may change in the future.

Then we were introduced to the range of trading types within equities from Single Orders through to High Frequency Trading, the UAVs of the financial world, via Program, basket trades and algorithmic trading engines. No doubt these are all meat and drink to the Business, but the talk started with the observation that the IT guys are often told ‘what’ to do, rather than ‘why’. …

We were then taken through the lifecycle of a trade, and also made aware of the quote from Goldman Sachs that described one of its goals as to be a ‘low cost provider’. McKinsey also had a report on ‘The Triple Transformation’. All of this encourages investment banks to take risks with technology to cope with the sea change in their volumes of trades and the reduced profits to them of managing those trades.

Some of the same NoSQL and in memory solutions were name checked again when the actual technologies in use were displayed. The conclusion was that we may be entering a new era of banking technology.

Twitter feedback on this session included:

@m4tthall: large global banks have approx 10 billion dollars annual technology costs #QConLondon

@m4tthall: we are embarking on a new era of banking technology, some of the biggest challenges in IT will lie in banks John O'Hara #QConLondon

Big Data NoSQL

Big Data: Making Sense of it all! by Jamie Engesser

Sven Johann attended this session:

Jamie Engesser from Hortonworks pointed out that we should really, really do Big Data for a reason and not because it's cool.

The Past, Present, and Future of NoSQL by Matt Asay

Will Hamill attended this session:

Matt is employed by 10gen, the makers of MongoDB, and gave a talk on the emergence of NoSQL, the state of the union and briefly commented on where he thinks the future for these technologies lies. Matt described the history of data storage without SQL (which is not new!) and how the introduction of SQL in the 1970s was a great leap forward in decoupling the data storage schema design from the query design.

However more and more companies are discovering the lessons learned by web-scale systems like Facebook, Google and Craigslist: that traditional SQL based relational data stores can’t scale to cope with today’s huge data sets. The NoSQL paradigm emerged from a resistance to hammer the square peg of RDBMS into the round hole of loosely structured, sometimes complexly linked, sometimes unlinked, huge scale data….

Matt described how for many modern organisations, NoSQL is the new normal. For example at The Guardian their data persistence technology of choice is now MongoDB for ease of use and scalability, and to choose something different on a new tech project requires justifying why not to use it (which I’m sure is quite different to many organisations’ approach in this area).

Matt suggested that the future of data storage lies in the ‘polyglot persistence’ paradigm; to have simultaneously multiple data storage styles in use for different parts of the business’ data as per what best fits that data and the way it is used. For example, storing highly related data like recommendations or travel itineraries in a graph database, website user comments in a document store and HR records in the RDBMS. Horses for courses - not just the current flavour of the month for every problem!

Richard Smith attended this session:

Like many 'big new ideas', that of the schemaless, non-relational database is not entirely new, but takes inspiration from the past. Before the introduction of SQL, IBM developed IMS for NASA's Apollo program in the late 1960s. The schemaless approach meant that thought needed to be put into query design up front, at the same time as application planning. What SQL brought to the table was the ability to think about the schema and the data structure up front, but allow query design to be done later, speeding up the planning phase.

So what has changed to provide pressure in the present day for a change to the SQL/relational model? Very large relational databases (very large amounts of data, or numbers of relational connections) become hard to update or change – it took Craigslist 3 months to complete a tiny schema change in their RDB! The advance of storage technology, bandwidth and Internet connectivity means that large web companies now receive more data and load than an RDB can cope with; the big data revolution began with companies like Google and Facebook trying to find something that would deal with their needs. And as business systems become more people-oriented, the data that needs to be stored becomes more flexible and less suitable for storage in a fixed record format.

NoSQL is moving into the mainstream. Major online companies like Amazon and Netflix use NoSQL databases for their user-oriented data storage, e.g. for their recommendation system, and this type of user-centric data is where their focus and innovation are directed, not their record-based billing systems and accounts (where a relational model will always continue to be the right answer). News media is an area where companies need to be flexible and agile, as nobody really knows what the future of news media will be, and several news sites use NoSQL databases. And more companies are reaching the limits of a single relational database on a single server, so the ability of NoSQL solutions to be scalable on commodity servers is a big advantage.

So what of the future? Matt sees a future where NoSQL becomes the 'new normal': in the majority of cases, the choice of data storage mechanism is quite an open one, and both a relational and non-relational database would be a valid option; in those cases, he sees companies choosing NoSQL databases as their standard. A few mainstream organisations have already made this decision; The Guardian is one of them. NoSQL database implementations have come of age in the present day, and are now general purpose, high performance and easy to use. They are not suitable for everything – Matt was very clear throughout the presentation that the relational database will always have its place – but they are already a reasonable choice for most situations. As they become more mature, they will become the default choice for the majority case for a lot more organisations.

A little graph theory for the busy developer by Jim Webber

Richard Smith attended this session:

Until recently there was a trend of not only storing data in a database, but performing calculations there too, via stored procedures. But as consumer hardware gets faster and more powerful, complex queries aren't necessarily run against the database any more; instead, data is extracted from the database by simple lookups and processed elsewhere. This means that we are free to optimise databases only for storage and simple reading operations, not complex joins and queries.

Several other talks in this stream were about non-relational systems that acted as some kind of key-indexed storage, which is easy to look up by its index, but hard to write cross-cutting queries for. Jim introduced the idea of a different way of looking at data storage – have the data model store not only data points, but connections between data objects as a graph. Graph traversal allows for individual queries to return very quickly, but the maximum throughput of queries per time period is lower than for a NoSQL database.

In Neo4j, they use a 'property graph model': nodes have arbitrary properties associated with them, and relationships connect two nodes in a directed way (for a two way link, two relationships are created) with a label and other properties. There is no schema applied to the nodes or edges within a graph; each one can have different properties.
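To make the property graph model concrete, here is a minimal in-memory sketch in Python. It is not Neo4j's API; the class and method names (`PropertyGraph`, `add_node`, `relate`, `outgoing`) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A node with arbitrary, schema-free properties."""
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    """A directed, labelled link from one node to another."""
    start: Node
    end: Node
    label: str
    properties: dict = field(default_factory=dict)


class PropertyGraph:
    """Hypothetical container; real graph databases index this far better."""

    def __init__(self):
        self.nodes = []
        self.relationships = []

    def add_node(self, **properties):
        node = Node(dict(properties))
        self.nodes.append(node)
        return node

    def relate(self, start, end, label, **properties):
        rel = Relationship(start, end, label, dict(properties))
        self.relationships.append(rel)
        return rel

    def outgoing(self, node, label):
        """Follow relationships with the given label that start at `node`."""
        return [r.end for r in self.relationships
                if r.start is node and r.label == label]


graph = PropertyGraph()
alice = graph.add_node(name="Alice", age=34)
bob = graph.add_node(name="Bob")              # different properties, no schema
graph.relate(alice, bob, "KNOWS", since=2010)
graph.relate(bob, alice, "KNOWS")             # a two-way link is two relationships
```

The linear scan in `outgoing` is only there to keep the sketch short; it says nothing about how a real graph database achieves fast traversal.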

Graph theory tells us some useful properties of dynamic graphs, especially social ones, that allow us to make predictions and perform retrospective analysis of a network. The first is that a dynamic graph naturally closes triangles: if A has some relationship with B and also with C, then B and C will naturally develop some kind of relationship, and the relationship between B and C will be such as to maintain structural balance within the triangle. These concepts of 'triadic closure' and 'structural balance' are powerful in making predictions: by constructing a graph of existing relationships, and closing each triangle with an appropriate type of link to maintain structural balance, a representation of all the implicit relationships can be made. Jim demonstrated this with a graph of allies and enemies in the mid 19th century, closed all the triangles and showed that it matched the sides in WW1.
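The triangle-closing idea can be sketched in a few lines of Python. This is an illustration under simple assumptions, not the method shown in the talk: relationships are signed (+1 for a friend, -1 for an enemy), and a missing edge is closed with the product of the two existing signs, which is the sign that keeps the triangle balanced. The node names and the `close_triangles` function are hypothetical.

```python
from itertools import combinations

FRIEND, ENEMY = 1, -1


def close_triangles(edges):
    """Return the links implied by one round of triadic closure.

    `edges` maps frozenset({a, b}) to FRIEND or ENEMY. For every node with
    two neighbours that are not yet linked, the missing link gets the product
    of the two existing signs (friend of a friend is a friend, enemy of an
    enemy is a friend, friend of an enemy is an enemy). If two different
    triangles imply different signs for the same link, the first one found
    wins; a real analysis would need a better tie-break.
    """
    neighbours = {}
    for pair in edges:
        a, b = tuple(pair)
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    inferred = {}
    for node in sorted(neighbours):
        for b, c in combinations(sorted(neighbours[node]), 2):
            pair = frozenset((b, c))
            if pair in edges or pair in inferred:
                continue
            sign_b = edges[frozenset((node, b))]
            sign_c = edges[frozenset((node, c))]
            inferred[pair] = sign_b * sign_c
    return inferred


known = {
    frozenset(("A", "B")): FRIEND,
    frozenset(("A", "C")): ENEMY,
}
implied = close_triangles(known)   # B and C are predicted to be enemies
```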

Another important property is known as the 'Local Bridge Property': if a node has two or more strong links, any further connections it makes will be 'weak relationships' or 'bridges' to other parts of the network. Bridges are useful predictive tools when considering how to partition a network, as they will usually break first.

A new way of representing data requires a new type of query language in order to get useful results from it. Neo4j uses a query language called Cypher, which has matching clauses based on relationships between nodes in the network and given starting nodes. By choosing appropriate starting nodes for a query, different questions can be asked of the database, but unlike a traditional indexed model, all queries of the same complexity will be roughly equally fast (and the numbers for graph traversal speed Jim claimed were impressive, of the order of microseconds for simple relationship following). A graph-based data model is applicable for a wide variety of applications so this type of storage deserves a closer look, in my opinion.

Approximate methods for scalable data mining by Andrew Clegg

Richard Smith attended this session:

We don't necessarily think about it, but characterising large data sets is difficult. select distinct or frequency analysis type queries on a large distributed data source are hard and expensive, particularly when the distinct list doesn't fit in memory. Approximate methods are ways of answering this type of question in an approximate, probabilistic way, usually parallelisably, and with a predictable error, while only storing a small summary of the data which is many times smaller than the full data set. Applying approximate methods to a data set will generally increase the CPU load, although not necessarily in the case of a distributed database where CPU time is used to serialise and deserialise information sent between shards, but database servers are often not running at full CPU load in any case.

The first example was that of set membership: given an index on a table, and an element, is the element in the table? In a small data case it is simple to set up a hashtable of the index and call hashtable.contains(element), but this breaks down when the set becomes large. Instead, we can store with the table a Bloom filter of the index, which is a bitfield of size n, and define k hash functions which each return 0 to n-1. An item has a characteristic set of bits, with up to k bits turned on, those being the result of running each hash function on that item. When an item is added to the table, the filter is updated with that item's bits; when an item is looked up, if any of the bits in the index's filter corresponding to that item is not set, it is definitely not in the index, and if they are all set, it probably is (false positives are possible, false negatives are not). Existing databases already use this method as a preliminary filter to return false quickly on a lookup that fails.
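A minimal Bloom filter along these lines might look as follows. This is a sketch, not code from the talk: the k hash functions are simulated by salting SHA-256 with the function's index, and one byte is used per bit to keep the code readable. Both choices are assumptions made for brevity.

```python
import hashlib


class BloomFilter:
    def __init__(self, n_bits, k_hashes):
        self.n = n_bits
        self.k = k_hashes
        self.bits = bytearray(n_bits)   # one byte per bit, for clarity only

    def _positions(self, item):
        # Simulate k independent hash functions by salting one strong hash.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.n

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        # False means "definitely absent"; True means "probably present".
        return all(self.bits[pos] for pos in self._positions(item))


bloom = BloomFilter(n_bits=1024, k_hashes=3)
for key in ["alice", "bob"]:
    bloom.add(key)

print(bloom.might_contain("alice"))   # True: added items are never missed
```

Larger n and a well-chosen k lower the false positive rate, at the cost of memory and hashing time.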

Next he talked about an approximate cardinality measure, i.e. how many distinct elements a set has. The simplest version of probabilistic counting runs a hash function on every data item and records the longest run of trailing zeros seen in any of the results; if that run has length n, the estimated cardinality is simply 2ⁿ. As simple as this method appears, by using several hash functions and combining their results into a final answer, an estimate with a low error can be produced.
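The single-hash version of this estimator is short enough to sketch. The code below assumes a 32-bit hash taken from SHA-256; the function names are hypothetical, and the multi-hash averaging that brings the error down is deliberately left out.

```python
import hashlib


def trailing_zeros(x, width=32):
    """Number of trailing zero bits in x (width if x is zero)."""
    if x == 0:
        return width
    return (x & -x).bit_length() - 1


def estimate_cardinality(items):
    """Crude distinct-count estimate: 2 ** (longest run of trailing zeros)."""
    longest = 0
    for item in items:
        digest = hashlib.sha256(str(item).encode()).digest()
        hashed = int.from_bytes(digest[:4], "big")   # 32-bit hash value
        longest = max(longest, trailing_zeros(hashed))
    return 2 ** longest


# Duplicates hash to the same value, so they cannot change the estimate.
with_duplicates = estimate_cardinality(["a", "b", "a", "b", "a"])
without_duplicates = estimate_cardinality(["a", "b"])
```

Only a single integer is stored, however large the input, which is the point of the technique. On its own, one hash function gives a very noisy estimate that is always a power of two.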

The next example was of frequency estimation. Similar to the Bloom filter, the count-min sketch algorithm uses several hash functions which return indices into an array. This time, the array is of integers, initialised to 0; when an item is added to the index list, the values at each of the indices returned by the hash functions are incremented. When looking an item up to find its approximate frequency, each value in the count array corresponding to returns from the hash functions is looked up, and the lowest is returned (each of those values is at least the number of times this item was added, but may be higher because other items can hash to the same index, so the minimum is the closest estimate). As well as having a tunable error rate (based on the number of hash functions and the size of the array), this method has the useful property that it is more accurate for high counts, which are likely to be the items that are more important to the application.
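A small count-min sketch can be written as below. Note one departure from the description above: this version uses the common layout of one row of counters per hash function rather than a single shared array. The hash functions are again simulated by salting SHA-256, which is an assumption made for brevity.

```python
import hashlib


class CountMinSketch:
    def __init__(self, width, depth):
        self.width = width    # counters per row
        self.depth = depth    # number of hash functions (one row each)
        self.rows = [[0] * width for _ in range(depth)]

    def _indices(self, item):
        for row in range(self.depth):
            digest = hashlib.sha256(f"{row}:{item}".encode()).digest()
            yield row, int.from_bytes(digest[:8], "big") % self.width

    def add(self, item, count=1):
        for row, col in self._indices(item):
            self.rows[row][col] += count

    def estimate(self, item):
        # Collisions can only inflate a counter, so the minimum is the
        # tightest available upper bound on the true frequency.
        return min(self.rows[row][col] for row, col in self._indices(item))


sketch = CountMinSketch(width=256, depth=4)
for word in ["spam", "spam", "spam", "eggs"]:
    sketch.add(word)

spam_count = sketch.estimate("spam")   # at least 3, never an undercount
```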

Finally Andrew talked about the similarity search: finding items that are similar to the one we are looking at. Similar is usually defined as having a low distance in some multi-dimensional distance space, and calculating exact values for that distance can often be expensive. An approximate method requires a suitable locality-sensitive hash function, which (unlike a good normal hash function) will return similar values for items which are nearby in terms of that distance. An example for cosine distance is the 'random hyperplane' algorithm: for a hash of n bits, n hyperplanes are constructed in the space, and for each one that a point is above, the appropriate bit is set to 1. There are other hash functions that approximate other real distance algorithms.
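The random hyperplane scheme can be sketched with the standard library alone. The code assumes hyperplanes through the origin with normals drawn from a Gaussian distribution, and treats "above" as a positive dot product with the normal; the function names are hypothetical.

```python
import random


def make_hyperplanes(n_bits, dims, seed=0):
    """One random normal vector per output bit."""
    rng = random.Random(seed)
    return [[rng.gauss(0, 1) for _ in range(dims)] for _ in range(n_bits)]


def lsh_hash(point, hyperplanes):
    """Set bit i when the point lies on the positive side of hyperplane i."""
    bits = 0
    for plane in hyperplanes:
        dot = sum(p * w for p, w in zip(point, plane))
        bits = (bits << 1) | (1 if dot > 0 else 0)
    return bits


def hamming(a, b):
    """Number of differing bits; small for points separated by a small angle."""
    return bin(a ^ b).count("1")


planes = make_hyperplanes(n_bits=16, dims=3)
v = [1.0, 2.0, 3.0]
same_direction = [2.0, 4.0, 6.0]   # cosine distance 0 from v

# Scaling a vector does not move it across any hyperplane through the origin.
distance = hamming(lsh_hash(v, planes), lsh_hash(same_direction, planes))   # 0
```

Candidate neighbours are then found by comparing short bit strings instead of computing the exact cosine distance for every pair.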

Making the Future

Physical Pi by Romilly Cocking, Steve Freeman

Alex Blewitt attended this session:

Steve and Romilly had created a Pi-driven robot … They didn’t only have a Pi but also a connection with Arduino boards using the I2C communications protocol, with the Pi being the brains to the Arduino’s limited but functional slave system. With all the software being coded in Python, it makes interacting with other devices much easier than it was in the past.

Here Comes Wearable Technology! by Rain Ashford

Romilly Cocking reported on this session:

Wearable tech has some very significant applications. She told us about research that's helping people on the Autistic spectrum to better understand the mood of others. She also talked about her work in the area of the Quantified Self (QS) movement.

Richard Chamberlain attended this session:

Started off talking about history of cyborgs and going over the work of early wearable computing genius Steve Mann with a brief detour via Robocop and also noting that watches were the first wearable tech. Picked up a lot of tech to investigate later.

Quantified self – http://quantifiedself.com/ – community about measuring everything about yourself for better understanding

Lilypad – http://arduino.cc/en/Main/ArduinoBoardLilyPad – arduino for textiles

Shrimping – http://shrimping.it/blog/ – Super cheap arduino substitute

She also demoed a lot of her work, some of which can be seen here: http://rainycatz.wordpress.com/2012/10/22/baroesque-barometric-skirt/ . Most of it was achieved with a lilypad and some conductive thread to make circuits in clothing.

Attracting Great People

Hire Education - making interviews rock by Trisha Gee, Dan North

Kevin Hodges attended this session:

Recruiter
Message on the job advert must be inclusive
Where are you advertising?
What are we actually hiring for?
Interviewer
Try and give the interviewee a good experience
Both you and the candidate must be sure; gear the conversation towards both parties’ happiness
Be clear that it’s a 2-way conversation
Offer opportunities for feedback
Is this person smart enough?
Don’t hire yourself…
Look for evidence, experiential, hypothetical, credential, opinion

Twitter feedback on this session included:

@teropa: In an interview: "Write me a singleton" "I can't, on moral grounds" #qconlondon

@vwggolf3: If you hire someone because you don't have reasons for saying NO you must improve the way you recruit #qconlondon #hr advice by @trisha_gee

NoHR Hiring by Martijn Verburg, Zoe Slattery

Will Hamill attended this session:

In this discussion Kim and Martijn described important ways to focus on seeking out and getting the right people.

When assessing a CV, often given the template nature and the somewhat boring list of technology acronyms it can be easier to tell whether the person fits and actually cares about applying for the job from their cover letter. Unfortunately when using external recruiters this won’t be included and the CV may be reformatted (or worse) to fit their template. However a good cover letter can tell the business that a candidate has actually investigated what it is that the business does, what they expect from the job and in plain English tells a little more than the ‘10 years Java, 5 years .Net’ does.

Analysing a candidate’s CV is a great way to derive questions for an interview. Use the technologies they’ve listed or the projects they’ve worked on to ask open ended and relevant questions. By relevant, Kim and Martijn mean to avoid the kind of gimmicky “how would you implement a merge sort” basic algorithmic questions (unless the job in question actually relies upon this kind of very low level algorithmic knowledge), or something that any real developer will likely just turn to Google to answer when doing their day job.

In a telephone interview Kim warned not to ask closed questions that could be quickly Googled while the candidate stalls, which is a good suggestion. ….

Martijn mentioned the utility of checking public profiles on Github and Stack Overflow to see candidates’ contributions, though not holding it against the candidate if they didn’t get involved in such public-facing activities. Martijn also implored the audience not to use anything they read on the candidate’s Facebook page to influence their impressions (unless they’re clearly doing something awful like kicking a load of babies), both because of the inappropriate sway an opposing opinion may have on you and out of sheer respect for privacy.

In terms of judging the candidate at interview it’s very important to have a technical person in the interview (most tech companies do this already) and if possible, someone on the team for which the opening is being advertised. Fundamentally after all the logic tests, CV questions, “tell us a time you had to deal with problems on a project” style stuff you should be asking yourself the most important question: “would I want to work with this person?”…

Kim and Martijn also argued that many businesses don’t make it easy on themselves by way of creating uselessly vague or acronym-drenched job specs, without going into much effort to actually attract, inform and convince candidates why they should work there….

Kim and Martijn also described how as a business getting involved in the developer community has no drawbacks but great advantages in terms of getting your brand and your work out in front of exactly the kind of people you want to hire. Attending or sponsoring user groups and conferences is especially a good way to get exposure from the most talented and engaged people - who are inevitably already employed!

NoSQL Solutions Track

Moderated NoSQL Panel by Alvin Richards, Chris Molozian, Andrew Elmore, Ian Robinson

Twitter feedback on this session included:

@m4tthall: NoSql - "seeing a lot of companies moving from batch reporting to real time analytics" easier said than done :-) #QConLondon

Scaling for Humongous amounts of data with MongoDB by Alvin Richards

Richard Chamberlain attended this session:

Before this talk I was skeptical about NoSQL databases in a production environment. Alvin showed off some good features of MongoDB that make it something to investigate further. Auto-sharding and balancing of data between shards is a cool feature. Also the theory of “don’t love each piece of data equally” was good. With MongoDB you can specify when writing how much replication your data needs to have before you consider the operation complete. Non-important data can just be written to memory on the primary server; for data you mustn’t lose, you can specify that it needs to be replicated to a secondary server in a different data centre before the write is complete. This increases latency, but you’re pretty sure you won’t lose it. This is available via the insert API.

Twitter feedback on this session included:

@matlockx: #MongoDB #QConLondon auto balancing data between shards is a +1

Becoming Polyglot; Putting Neo4j into production and what happened next by Toby O'Rourke

Twitter feedback on this session included:

@wsbfg: Notes on Becoming polyglot - putting neo4j into production. Toby O'Rourke. https://t.co/MDUH6qia0X #qconlondon

Eventual Consistency in the Real World by Chris Molozian

Richard Smith attended this session:

The core idea of his talk is eventual consistency: the idea that a distributed database will eventually contain the same data on every node, and that temporary inconsistency is okay. The CAP Theorem proves that you cannot have all three of Consistent, Available and Partition-tolerant data; a distributed case is by definition partitioned, so that means that a distributed solution must trade consistency off against availability. (Availability in this context means the degree to which the system is able to accept select and update requests on any node.) If we choose consistency, then we have 'eventual availability' – and in reality, if a system is not highly available (i.e. if any node of the distributed system we connect to cannot accept our request), the system is not usable, so we must prioritise availability and sacrifice some consistency. That is, for an available distributed data solution, eventual consistency is inevitable. Fortunately, eventual consistency is usually okay in the real world, and it is already used in well known distributed systems like Facebook.

Eventual consistency works by allowing any node to accept an update at any time, and pushing that update to a number of replicating nodes. Eventually the update will propagate through the entire distributed database. However, this means that updates can happen in a different order on different nodes. The ideal solution is a data model where providing the same updates in a different order results in the same final state; Chris calls this outcome "Strong Eventual Consistency" and it is currently the subject of Basho's research effort. However, until this is perfected, accepting eventual consistency also means accepting data divergence or conflict from differently ordered updates.
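One well-known family of data types with this order-independent property is the grow-only counter, sketched below. It is offered only as a generic illustration of a merge that gives the same result in any order. The write-up does not say which data models Basho was researching, and the `GCounter` class and replica names are hypothetical.

```python
class GCounter:
    """Grow-only counter: one slot per node, merged by element-wise maximum."""

    def __init__(self, counts=None):
        self.counts = dict(counts or {})

    def increment(self, node_id, amount=1):
        # Each node only ever increases its own slot.
        self.counts[node_id] = self.counts.get(node_id, 0) + amount

    def merge(self, other):
        # max() is commutative, associative and idempotent, so replicas
        # converge regardless of the order in which merges arrive.
        merged = dict(self.counts)
        for node_id, count in other.counts.items():
            merged[node_id] = max(merged.get(node_id, 0), count)
        return GCounter(merged)

    def value(self):
        return sum(self.counts.values())


replica_a = GCounter()
replica_a.increment("a")
replica_a.increment("a")

replica_b = GCounter()
replica_b.increment("b", 3)

ab = replica_a.merge(replica_b)
ba = replica_b.merge(replica_a)
print(ab.value(), ba.value())   # 5 5: same state in either merge order
```

Not every kind of update can be modelled this way, which is why conflict resolution remains necessary in general.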

The question then becomes: how best to resolve these conflicts? Riak uses a concept they call a vector clock, which acts rather like version control revision numbers. When submitting an update, the update is tagged with the identifier of the last known state for that record. If two different nodes try to update the same record with two different updates that have the same parent state, that means that two clients have tried to update the record at the same time, and conflict resolution will be triggered. (Conflicting updates can also occur if a network has been split due to infrastructure failure, and then connection is re-established; this situation, and the state mismatches it can cause, is well known to IRC users as a 'netsplit'.) This 'sibling' resolution is context dependent so Riak's API lets the application layer define how to resolve conflicts in different cases.
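The comparison at the heart of a vector clock can be sketched as follows. This is a generic illustration, not Riak's API: each clock is assumed to be a plain dictionary mapping a node identifier to an update counter, and the function names are hypothetical.

```python
def descends(a, b):
    """True if clock `a` has seen every update that clock `b` has seen."""
    return all(a.get(node, 0) >= count for node, count in b.items())


def compare(a, b):
    """Classify two versions of a record by their vector clocks."""
    if descends(a, b) and descends(b, a):
        return "equal"
    if descends(a, b):
        return "a is newer"
    if descends(b, a):
        return "b is newer"
    return "conflict"   # neither descends from the other: siblings


parent = {"node1": 1}
via_node1 = {"node1": 2}               # client X updated through node1
via_node2 = {"node1": 1, "node2": 1}   # client Y updated through node2

print(compare(via_node1, parent))      # a is newer
print(compare(via_node1, via_node2))   # conflict
```

Both updates share the same parent state, so neither clock dominates the other. That is the case in which the application is asked to resolve the siblings.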

The overall message is that eventual consistency is acceptable, it is in fact unavoidable for a distributed database, and we should use the appropriate methods to get the best out of it.

Financial Big Data - Loosely Coupled, Highly Structured by Andrew Elmore

Richard Smith attended this session:

This one was about how relational databases don't fit the bill for storing information about financial transactions that come in a very non-tabular format.

The SWIFT message formats used for recording inter-bank end user transactions (e.g. sending money to a business abroad to pay for services) are tree structured. The message contains several blocks; the main content block contains sections, some of which may be optional or repeated; they contain fields which similarly may not always be present or which may be present several times; and so on. The format is several levels deep, and at no point is there a guarantee that a particular field will be present in every message. It's also versioned, as the formats are updated frequently. This is obviously a poor fit for a traditional RDB and its idea of records containing values for a fixed set of fields (columns in a tabular view).

Several solutions have been tried in order to manage this data. Firstly, some consumers simply extract only the fields that they are interested in, and store that in an RDB. This works, but data is permanently lost, and with it any value that future re-analysis of the stored data with new ideas might have yielded. Others tried storing the message records in XML, but although this preserves the tree nature of the data, it is not easily indexable and it is not the format that the consuming code will want to use. The ideal world would be an object tree in the target environment (i.e. Java objects for a Java program, populated structures for a C++ program, etc), in an indexable database type container to permit querying.

Naturally the presenter's company sells just such a solution, particularly targeted at financial messaging! Their system has predefined type mappings for the financial message types used in the talk. The general concept is that a grammar is specified for the input format, and incoming messages are parsed and translated into an object tree, which is then stored in the data source.

Andrew then moved on to the advantages of using an event-driven data processing paradigm. In a typical flow-based architecture, scalability only happens in one place (the number of streamed processing units you have), even though some parts of the process are probably more resource intensive than others. By having each processing unit small and self-contained, taking data from one storage pool, processing it in some way and returning it to another, and having the data storage pools dispatch events to request processing to occur, each step of the processing becomes scalable and potentially distributable. This is very similar in idea to the pub-sub (publish/subscribe) model of multiple worker processes typically seen in the Q world, and in my time at First Derivatives I saw how easy that makes it to scale and distribute the processing units.

Storing objects in your database has some big advantages. You can use any object properties in indexing and querying, and you can store exception objects or other state-holding rich failure objects in it to get the most possible information about unsuccessful execution cases. But it can also result in very large record sizes (and of course the data objects must be serialisable in order to distribute the database).

Distributing an object database also raises the questions of how to manage notification (i.e. dispatching events to request that a processing agent take on a data object), and how to deal with exceptions, when the system is spread across multiple instances. These problems have already been looked at in solutions that use conventional RDBs, but they are more immediately obvious when the object tree is integrated with the data storage and when the database and application are distributed together.

Solution Track Thursday 2

Big Data @ Skype by Bryan Dove

@sean_wilkes: #bigdata at Skype #QConLondon 4533. Embrace failure. We're doing a bunch of tough things and you learn more this way. An #agile principal!

@tkalkosinski: #skype big data team is 7 people total! Impressive by Bryan Dove on #QConLondon

@portixol: Good question at the Big Data @skype session #qconlondon - How do you educate the stakeholders on how to embrace failure?

@portixol: Answer: Even when we fail the time to fix is shorter - push code every day #qconlondon

Events

Alex Blewitt attended the Atlassian Angry Nerds Party:

Atlassian sponsored the end-of-day party in the associated hall, which has the benefit of being right next door to the conference venue as well as a fairly sizable space in which to congregate. There were also several games-style tables, including a football table and an air hockey table. The beer was probably more plentiful than the finger food, but none the less it was a pleasant way to end the evening, as I talked for some time with Graham Lee about the evolution of the Nextstep platform and reminiscing about ye olden times.

Opinions about QCon

Opinions expressed on Twitter included:

@charleshumble: #qconlondon are making videos of all the talks available to attendees within hours. That's awesome!

@tomayac: Conference location matters! Analyzing #QConLondon imgs/vids from social networks→5 out of top10 about view. #TomsPhD http://t.co/IZUF9U8YLm

@andypiper: Dateline: London #qconlondon @ Queen Elizabeth II Conference Centre http://t.co/HiTeKEc6EX

@hatofmonkeys: Fantastic day on the cloud track at #qconlondon . Great speakers, talks and questions. Many thanks to all involved!

@toughplacetogo: Free beer! #qconlondon #boom

@Kanzo007: #QConlondon is fantastic, and now provided with my amazing free t-shirt from #atlassian it's even better! :D http://t.co/RYlZqlLmZp

@arungupta: Beerapalooza and snacks keeping lots of geeks happy at #QConLondon Enjoy courtesy of @OTN_Events_EMEA http://t.co/C51kP8QWkQ

@madspbuch: Enjoying my time at #QConLondon , delicious food :-D

Takeaways

Romilly Cocking shared his impression on QCon:

QCon was well organised, very well attended, and packed with interesting talks. If you're a software professional, or interested in what top professionals do, then QCon is a must. And if you can't wait till next year, you can find this year's talks (and other excellent material) available on InfoQ.

Alex Blewitt shared his impression on QCon London 2013:

New this year was the immediate availability of the videos on the same day as the conference. Prior years have drip-fed them out from the conference via InfoQ over the next six months, but this time raw video footage was available as early as the same evening. This was great for the conference goers who missed out an opportunity to see something (and would otherwise have forgotten had it not been immediately available). Once they’re edited with the slides in situ, they’ll be made available on InfoQ as well. The immediate access videos were only available to paying conference guests, though there was some interest in making a separate video pack available to purchase – if you have any thoughts on that, contact Floyd or I can pass messages on to him. …

What is clear is that it’s QCon’s biggest year ever. We expanded to more floors than before and had a higher footfall than any previous year I recall. And one of the things that makes QCon great is the diversity of talks with a wide range of industries (and government!) represented. Which other conference can you go from a robotic Pi to a massively distributed architecture talk and meet some of Computer Science’s greats into the bargain?

See you next year.

Twitter was flooded with impressions on QCon, including:

@jaumejornet: Each #QConLondon I got a book from @developerfocus, probably the best gift anyone can give you Thanks guys, u rock! http://t.co/xNu5cMZGma

@markgibaud: Awesome day at #qconlondon. So inspired, narrowed it down to about 8402658 things I want to try out next.

@floydmarinescu: Feels surreal to already be able to watch the presentations from #qconlondon from today - in video already :) http://t.co/LWGdJxvLB0

@skillsmatter: Thank you #qconlondon for 3 fantastic days!

@dthume: #qconlondon draws to a close once more. Props to all the staff; the organisation, and the nosh, were as top notch as ever.

@BlackPepperLtd: The #QConLondon conference has been a great few days. Lots of good content. Lots of great people. Thanks to all involved.

@CaplinTech: Aaaaand it's over. Bye bye #qconlondon thanks for the knowledge, new tools/techniques, new connections and free beer.

@matlockx: Great open space talk with @pniederw Thanks again Peter! #QConLondon

@EdMcBane: Great open space with @benjaminm @trisha_gee @IsraKaos and many others, talking about hiring great people. Too bad #QConLondon is over

@garethr: One thing I've taken away from #qconlondon is a realisation that Erlang has got popular in certain interesting circles

@dthume: #qconlondon - over for another year :( 11.8 months 'til I can talk about interesting stuff again with folk who share the love.

@PauliciPop: Awesome #qconlondon was awesome http://t.co/lpmkdUN2Sz

Conclusion

The seventh annual QCon London brought together over 1,100 attendees - including more than 100 speakers – that are spreading innovation in software development projects across the enterprise. QCon's focus on practitioner-driven content is reflected in the fact that the program committee that selects the talks and speakers is itself comprised of technical practitioners from the software development community.

QCon London 2013 was co-produced by InfoQ.com and Trifork – producer of the GOTO conference in Denmark. QCon will continue to run in London around March of every year. QCon also returns to Beijing and Tokyo this month and in August will be held in sunny Sao Paulo, Brazil.

In June 2013, InfoQ.com will be holding the second annual QCon New York which already has over 70/100 speakers confirmed. QCon New York 2013 will feature the same calibre of in-depth, practitioner-driven content presented at all QCon events worldwide, and will take place at the New York Marriott just outside of Manhattan (at the Brooklyn Bridge) on June 10-14, 2013.
