InfoQ

Architecture of a $7 Billion Loss: Causes and Remedies

Posted by Jean-Jacques Dubray on May 26, 2008

Sections: Development, Architecture & Design
Topics: Security, Architecture

Société Générale last week released the "Green Report", prepared by PWC, which details how a trader, Jerome Kerviel, lost 4.9 GEUR [1] (over 7 billion USD) on behalf of the bank where he worked.

The Profit & Loss curve best summarizes the debacle that led to such a loss:

[Figure: Profit & Loss curve]

The dashed lines in the middle represent the "Official" P&L, while the gray line represents the real P&L of the trader.

By the end of Q2 2007, a loss of 2 GEUR (3 billion USD) had already accumulated undetected.

How could this ever happen? Was it security? Processes? A lack of controls? Olivier Rafal, columnist at Le Monde Informatique, does not think so. He reports that the trader was subjected to controls 75 times during that period, and he believes that Jerome's management simply looked the other way. The report mentions nearly 1,000 fake trades that were used to cover his P&L.

Eurex had actually warned the Société Générale of unusual positions as early as November 2007 and stated:

the risk management at its exchanges had functioned correctly, "also in this case".

Jean-Pierre Mustier, the head of investment banking, responded:

When controls came up, most of the time he admitted that it was not a proper transaction and that it was a mistake. He was replacing it with another transaction of a different nature that would be checked by another department.

The report mentions an accomplice who entered compensating transactions to hide Jerome's real position. Because this matter is relevant to the judicial investigation, the report cannot comment on it. It notes, however, that about 15% of the fraudulent transactions were entered by a trader's assistant.

From an IT perspective, the report explains (page 1):

The fraudulent activity resulted in a massive position [49 GEUR (roughly 75 billion USD) in January 2008], which was masked, along with the associated risk and P&L, using three types of techniques:

  • Entry, followed by cancellation, of fake operations hiding the risks and the P&L. The trader entered one or several fake operations in the systems so that they would be taken into account in the risk calculation and the valuation of the portfolio.... We have identified 947 transactions of this type.
  • Entry of fake compensated transactions (buy/sell) for identical quantities at different prices "outside the market", with the goal of masking the P&L when the transactions become effective... We have identified 115 transactions of this type.
  • Entry of provisions that would temporarily cancel his P&L. The trader used the ability to correct model biases, normally reserved for trader assistants (no access rights prevented traders from entering them), to enter positive or negative provisions [in the middle-office system] that modified the value [of a position] calculated by the front-office system. We have identified 9 operations of this type.
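
To make the second technique concrete, here is a minimal sketch of how paired buy/sell entries with identical quantities and an off-market price might be flagged. It is an illustration only, not the bank's or the report's method: the `Trade` record, the `off_market_pairs` function, the 2% tolerance, the instrument name and the reference price are all assumptions made for this example.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Trade:
    # Hypothetical trade record; field names are assumptions for this sketch.
    trade_id: str
    instrument: str
    side: str       # "buy" or "sell"
    quantity: int
    price: float


def off_market_pairs(trades, market_prices, tolerance=0.02):
    """Return (buy, sell) pairs on the same instrument with identical
    quantities where at least one leg is priced more than `tolerance`
    (as a fraction) away from the reference market price."""
    flagged = []
    for a, b in combinations(trades, 2):
        if a.instrument != b.instrument or a.quantity != b.quantity:
            continue
        if {a.side, b.side} != {"buy", "sell"}:
            continue
        ref = market_prices[a.instrument]
        if any(abs(t.price - ref) / ref > tolerance for t in (a, b)):
            buy, sell = (a, b) if a.side == "buy" else (b, a)
            flagged.append((buy, sell))
    return flagged


# Made-up sample data: T1/T2 offset each other, and T2 is 10% off the market.
trades = [
    Trade("T1", "IDX-FUT", "buy", 100, 7000.0),
    Trade("T2", "IDX-FUT", "sell", 100, 7700.0),
    Trade("T3", "IDX-FUT", "buy", 50, 7010.0),
    Trade("T4", "IDX-FUT", "sell", 50, 7005.0),
]
flagged = off_market_pairs(trades, {"IDX-FUT": 7000.0})
print([(buy.trade_id, sell.trade_id) for buy, sell in flagged])
```

The pairwise comparison is quadratic in the number of trades, which is acceptable for a sketch; a production control would presumably group trades by instrument and quantity first.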

The CEO of Société Générale explained last April that the bank had uncovered a major design flaw in its processes:

Controls were in place, but we were missing something that we have been doing manually since January 24 and that we are currently automating. This is what is called the "cross-control", which makes it possible to detect when someone is cancelling too many operations.
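
The source does not describe how this "cross-control" is implemented. As a rough sketch of the idea, one could count entries and cancellations per trader and flag anyone whose cancellation ratio exceeds a threshold. The function name `cancellation_alerts`, the event format, and the `max_ratio` and `min_entries` thresholds below are assumptions chosen for illustration.

```python
from collections import Counter


def cancellation_alerts(events, max_ratio=0.2, min_entries=10):
    """events: iterable of (trader, action) tuples, where action is
    "enter" or "cancel". Returns {trader: ratio} for traders whose
    cancelled/entered ratio exceeds `max_ratio`. Thresholds are
    illustrative assumptions, not values from the report."""
    entered = Counter()
    cancelled = Counter()
    for trader, action in events:
        if action == "enter":
            entered[trader] += 1
        elif action == "cancel":
            cancelled[trader] += 1

    alerts = {}
    for trader, count in entered.items():
        if count < min_entries:
            continue  # too few operations to judge
        ratio = cancelled[trader] / count
        if ratio > max_ratio:
            alerts[trader] = ratio
    return alerts


# Made-up activity: "bob" cancels 12 of his 20 entries.
events = (
    [("alice", "enter")] * 20 + [("alice", "cancel")] * 1
    + [("bob", "enter")] * 20 + [("bob", "cancel")] * 12
)
alerts = cancellation_alerts(events)
print(alerts)  # {'bob': 0.6}
```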

The PWC report recommends:

  • using biometric authentication instead of Windows authentication for the most sensitive applications
  • forbidding any transaction from the front-office onto middle-office applications
  • considering forbidding any XL connection where the password is stored in the spreadsheet
  • securing reporting applications (the report notes that many reporting feeds have been insufficiently tested)
  • checking whether the workstation matches the potential user of an application

Some wonder if these fixes will have any effect on future frauds since the report explains that the trader was able to justify his fraudulent activities with seven fake emails.

[1] GEUR means "one billion Euros"

    Too much Noise, Not enough Signal

    by Gavin Terrill

    Soc Gen had a GRC program in place, but management were inundated with routine, dare I say "mundane", alerts. Is this really management's fault? Perhaps a case of too much noise and not enough signal? In any case, you have to wonder how effectively their KRI component was operating. To me, an effective KRI program needs to incorporate temporal concerns so that anomalies, such as an accomplice placing a counterbalancing but bogus trade to offset a large position quite soon after the initial breach, are also considered. I believe there is great promise in utilizing tools such as Complex Event Processing, in conjunction with Predictive Analytics, to better help detect and prevent these types of losses in the future.
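
The kind of temporal rule described here can be sketched in a few lines. This is not a real Complex Event Processing engine, only a hand-written illustration of one time-window correlation; the event kinds `limit_breach` and `offsetting_trade`, the dictionary keys and the 30-minute window are assumptions.

```python
from datetime import datetime, timedelta


def offsets_after_breach(events, window=timedelta(minutes=30)):
    """events: time-ordered list of dicts with keys "time", "kind", "desk".
    Returns (breach, trade) pairs where an offsetting trade appears on the
    same desk within `window` after a limit breach."""
    matches = []
    breaches = []
    for event in events:
        if event["kind"] == "limit_breach":
            breaches.append(event)
        elif event["kind"] == "offsetting_trade":
            for breach in breaches:
                delay = event["time"] - breach["time"]
                if breach["desk"] == event["desk"] and timedelta(0) <= delay <= window:
                    matches.append((breach, event))
    return matches


# Made-up event stream: only the D1 offset falls inside the window.
events = [
    {"time": datetime(2008, 1, 2, 10, 0), "kind": "limit_breach", "desk": "D1"},
    {"time": datetime(2008, 1, 2, 10, 12), "kind": "offsetting_trade", "desk": "D1"},
    {"time": datetime(2008, 1, 2, 11, 0), "kind": "limit_breach", "desk": "D2"},
    {"time": datetime(2008, 1, 2, 14, 0), "kind": "offsetting_trade", "desk": "D2"},
]
matches = offsets_after_breach(events)
for breach, trade in matches:
    print(breach["desk"], trade["time"] - breach["time"])
```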

    Re: Too much Noise, Not enough Signal

    by Jean-Jacques Dubray

    Gavin:

    Yes, I agree with you. It looks like a lot of people are quickly concluding that "a few people were the reason for this massive loss" and "IT is not at fault".

    If I understand the explanation correctly, it looks like the trader was able to defeat the KRI system by entering fake operations. So, IMHO, it does not matter how well the component was working because, at the end of the day, you are just a CRUD operation away from a disaster.

    However, I disagree with you about locating the "temporal concerns" in the KRI system. Architecturally, there is a much better way to deal with "temporal concerns", or should I say long-running asynchronous behavior?

    If you think in terms of "Resource Lifecycles" and states/transitions, you realize that enabling people to "transition" resources to arbitrary states is the wrong thing to do. For instance, a provision should "expire".
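
As a minimal sketch of what such a resource lifecycle could look like, the example below models a provision whose state can only move along explicitly allowed transitions and which expires on its own. The state names, the `Provision` class and the five-day lifetime are assumptions made for illustration, not details from the report.

```python
from datetime import date, timedelta
from enum import Enum


class State(Enum):
    DRAFT = "draft"
    ACTIVE = "active"
    EXPIRED = "expired"
    CANCELLED = "cancelled"


# Explicit lifecycle: any transition not listed here is rejected.
ALLOWED = {
    State.DRAFT: {State.ACTIVE, State.CANCELLED},
    State.ACTIVE: {State.EXPIRED, State.CANCELLED},
    State.EXPIRED: set(),
    State.CANCELLED: set(),
}


class IllegalTransition(Exception):
    pass


class Provision:
    def __init__(self, amount, created, ttl_days=5):
        self.amount = amount
        self.expires_on = created + timedelta(days=ttl_days)
        self.state = State.DRAFT

    def transition(self, target):
        if target not in ALLOWED[self.state]:
            raise IllegalTransition(f"{self.state.name} -> {target.name}")
        self.state = target

    def tick(self, today):
        """Expire an active provision once its lifetime has passed."""
        if self.state is State.ACTIVE and today >= self.expires_on:
            self.transition(State.EXPIRED)

    def effective_amount(self):
        # Only an active provision affects the valuation.
        return self.amount if self.state is State.ACTIVE else 0.0


provision = Provision(1_000_000.0, created=date(2008, 1, 2))
provision.transition(State.ACTIVE)
provision.tick(date(2008, 1, 10))
print(provision.state.name, provision.effective_amount())

try:
    provision.transition(State.ACTIVE)  # cannot be revived by an arbitrary update
except IllegalTransition as exc:
    print("rejected:", exc)
```

The point of the design is that the expired provision cannot be put back into effect by a plain update; any further change has to go through a transition the lifecycle allows.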



    Of course, when we are trying to build solutions with a CRUD-oriented Synchronous Client/Server Programming Model for a world that is Peer-to-Peer, Asynchronous and Long-Running, it is no surprise to me that this kind of thing can happen.

    The lifecycle of a resource is something that cannot be violated, regardless of the "human tasks" that are performed. It is enterprise-wide business logic, and it is independent of the "business process", i.e. the workflow of activities (automated or not) that are performed to advance the state of resources.

    >> believe there is great promise in utilizing tools such as Complex Event Processing,
    Yes, of course, provided that you can define events easily, clearly and properly, which is not the case in a CRUD-oriented Synchronous Client/Server programming model. Since the states of a resource are explicit in a resource lifecycle definition, you are in a much better position to define events (i.e. the occurrence of a state) and monitor the actions that result in state transitions.
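
One way to read this point: if every state change goes through an explicit lifecycle, each transition can publish an event naming the new state and the actor, and a monitoring tool can subscribe to that stream. The sketch below assumes hypothetical state names and a hypothetical `TradeLifecycle` class; it illustrates the idea only and does not describe any particular product.

```python
# Hypothetical trade lifecycle; state names are assumptions for this sketch.
ALLOWED = {
    "entered": {"confirmed", "cancelled"},
    "confirmed": {"settled"},
    "cancelled": set(),
    "settled": set(),
}


class TradeLifecycle:
    def __init__(self):
        self.states = {}
        self.listeners = []

    def subscribe(self, listener):
        """Register a callable that receives one event dict per state change."""
        self.listeners.append(listener)

    def _emit(self, trade_id, state, actor):
        event = {"trade": trade_id, "state": state, "actor": actor}
        for listener in self.listeners:
            listener(event)

    def enter(self, trade_id, actor):
        self.states[trade_id] = "entered"
        self._emit(trade_id, "entered", actor)

    def transition(self, trade_id, target, actor):
        current = self.states[trade_id]
        if target not in ALLOWED[current]:
            raise ValueError(f"{current} -> {target} is not allowed")
        self.states[trade_id] = target
        self._emit(trade_id, target, actor)


log = []
lifecycle = TradeLifecycle()
lifecycle.subscribe(log.append)  # a monitor could subscribe the same way
lifecycle.enter("T1", "trader_a")
lifecycle.transition("T1", "cancelled", "trader_a")
print([(event["trade"], event["state"]) for event in log])
```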
