From Cloud to Cloudlets: a New Approach to Data Processing?

Key Takeaways

  • There is a growing consensus that shipping the growing volume of end-device data to large, monolithic clouds for processing is too resource-intensive, time-consuming, and inefficient.
  • It’s simply not feasible to shift data from every IoT device to cloud-based neural networks for voice recognition. However, by placing interstitial cloudlets between these devices and the central cloud, neural networks can start informing the operation of even the smallest IoT devices.
  • By far the biggest challenge that cloudlets present is security: as soon as data is fragmented across multiple storage and processing devices, it becomes more difficult to manage securely. The most effective security measure is to encrypt data as it moves between cloudlet, device, and cloud.
  • Despite all of the opportunities that cloudlet models present, they remain fairly rare as of 2020.

Even though new technologies are appearing all the time, the aggregate direction of travel of the past decade has been clear: away from local, distributed data processing and toward cloud storage.

This shift was partially motivated by the explosion in the numbers of connected devices over the past ten years, which has also necessitated that data be brought together in one place for processing and storage. 

Now, however, a strange reversion might be occurring. The growing popularity of small, distributed clouds, or “cloudlets” in the nomenclature, is an implicit recognition of the limitations of the “traditional” cloud model, and could signal a major shift in the way that data is collected, stored, and processed.

In this article, we’ll look at the rise of cloudlets: what they are, the challenges they present, and whether they are a more viable way of networking devices than the clouds we’ve grown used to.

Hyperscale vs. localscale

Though the term “cloudlet” is still relatively new (and relatively obscure), the central concept is not. Even in the earliest days of cloud computing, it was recognized that sending large amounts of data to the cloud for processing raises bandwidth issues. Over much of the past decade, this issue has been masked by the relatively small amounts of data that devices have shared with the cloud.

Now, however, the limitations of the standard cloud model are becoming all too clear. There is a growing consensus that shipping the growing volume of end-device data to large, monolithic clouds for processing is too resource-intensive, time-consuming, and inefficient.

Instead, say some analysts, these data are better processed locally. This processing will either need to take place on the device that generates the data, or in a semi-local cloud that sits between the device and an organization's central cloud storage.

This is where the "cloudlet" comes in, forming the middle tier of a three-tier hierarchy: intelligent device, cloudlet, and cloud. A cloudlet can be viewed as a datacenter in a box, with the goal of bringing the cloud closer to the device by giving it the ability to process at least some data locally.
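To make the three-tier idea concrete, here is a minimal sketch of how a request might be routed across device, cloudlet, and cloud. The tier size limits, round-trip times, and the choose_tier helper are illustrative assumptions, not part of any particular cloudlet platform.

```python
# Minimal sketch of the device -> cloudlet -> cloud hierarchy described above.
# All thresholds and latencies are illustrative assumptions.

from dataclasses import dataclass

@dataclass
class Request:
    payload_bytes: int        # size of the data to process
    max_latency_ms: float     # how quickly the caller needs an answer

DEVICE_LIMIT_BYTES = 64 * 1024           # what the device itself can handle
CLOUDLET_LIMIT_BYTES = 16 * 1024 * 1024  # what the nearby cloudlet can handle
CLOUDLET_RTT_MS = 10                     # assumed round trip to the cloudlet
CLOUD_RTT_MS = 120                       # assumed round trip to the central cloud

def choose_tier(req: Request) -> str:
    """Pick the closest tier that can satisfy both size and latency needs."""
    if req.payload_bytes <= DEVICE_LIMIT_BYTES:
        return "device"
    if req.payload_bytes <= CLOUDLET_LIMIT_BYTES and req.max_latency_ms >= CLOUDLET_RTT_MS:
        return "cloudlet"
    return "cloud"

if __name__ == "__main__":
    print(choose_tier(Request(payload_bytes=4_000, max_latency_ms=5)))           # device
    print(choose_tier(Request(payload_bytes=2_000_000, max_latency_ms=50)))      # cloudlet
    print(choose_tier(Request(payload_bytes=500_000_000, max_latency_ms=1000)))  # cloud
```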

It’s worth noting that this approach is significantly different from the way in which IoT infrastructure has been assumed to work. Even a few years ago, it was assumed that 5G connectivity would allow all of the data processing required for IoT devices – up to and including autonomous vehicles – to be performed in the cloud. 

It’s becoming increasingly clear, however, that the computational requirements of these devices – and particularly the desire to equip them with voice-recognition interfaces – are far greater than monolithic clouds can handle.

The opportunities and the challenges

At the same time, organizations and engineers alike will be extremely hesitant to lose all of the advantages of cloud storage and processing. The added convenience and security that cloud models provide are among the major reasons why 36% more companies are using cloud-based applications this year than did last year. But by moving some of these applications closer to their data sources, it is hoped, they can be made more efficient.

The opportunities that could flow from doing so could be huge. At the moment, one of the biggest limitations on the use of neural networks is the sheer amount of data that they must be fed in order to be able to work effectively. 

It is simply not feasible, for instance, to shift data from every IoT device to cloud-based neural networks for voice recognition. However, by placing interstitial cloudlets between these devices and the central cloud, neural networks could start informing the operation of even the smallest IoT devices. Moving data streams through the public cloud’s edge services means moving that data through pipes linked to service areas that span entire countries or provinces, which is why transmitting data via cloudlets can be more efficient.
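Some rough arithmetic helps show why streaming raw audio from a large device fleet to one central cloud is impractical; the fleet size and audio format below are assumptions chosen purely for illustration.

```python
# Back-of-the-envelope arithmetic for why streaming raw audio from every
# device to a central cloud does not scale. Device count and audio format
# are illustrative assumptions.

SAMPLE_RATE_HZ = 16_000      # typical speech-recognition sample rate
BYTES_PER_SAMPLE = 2         # 16-bit mono PCM
DEVICES = 1_000_000          # assumed fleet size

per_device_bps = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE       # 32,000 bytes/s per device
aggregate_gbps = per_device_bps * DEVICES * 8 / 1e9      # convert to gigabits/s

print(f"Per device: {per_device_bps / 1000:.0f} kB/s")
print(f"Aggregate for {DEVICES:,} devices: {aggregate_gbps:.0f} Gbit/s")
# ~256 Gbit/s of raw audio heading to one central cloud, before any video.
```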

This will have several knock-on effects. One, already mentioned, will be a greatly increased capability to use voice recognition systems in IoT devices. Another, equally important, effect will be an enhanced ability to use visual AI systems (such as facial recognition) in distributed locations.

That said, the move to cloudlets is also creating some new challenges involving data distribution, storage, and security. The most fundamental of these relates to the architectural models around which cloudlets should be designed. The fundamental assumption of the cloudlet model is that these smaller, distributed clouds can process data more quickly than their larger, centralized analogues.

However, localized data processing of this type requires that sufficient power is available to local processing units, which is a challenge for inherently portable IoT devices. This issue is compounded if data must be moved over long geographic distances.

Secondly, researchers investigating the viability of the cloudlet model face another issue: keeping data synced and coherent not only across multiple devices, but also across multiple sub-clouds. Ensuring that data are consistent in this way doesn't just have implications for the reliability of cloud systems themselves: in the case of autonomous vehicles, it could be critical to their safe functioning.
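As a very rough illustration of that consistency problem, the sketch below merges two replicas using a simple last-writer-wins rule keyed on a logical timestamp. This is an assumed, deliberately weak scheme for illustration only; safety-critical data such as vehicle telemetry would require far stronger consistency guarantees.

```python
# Minimal sketch of one (weak) way to keep replicas coherent across a device,
# a cloudlet, and the cloud: last-writer-wins keyed on a logical timestamp.
# Illustrative only; not a recommendation for safety-critical data.

from typing import Dict, Tuple

Replica = Dict[str, Tuple[int, str]]   # key -> (logical_timestamp, value)

def merge(local: Replica, remote: Replica) -> Replica:
    """Merge two replicas, keeping the entry with the newer timestamp."""
    merged = dict(local)
    for key, (ts, value) in remote.items():
        if key not in merged or ts > merged[key][0]:
            merged[key] = (ts, value)
    return merged

device   = {"traffic/segment-12": (3, "congested")}
cloudlet = {"traffic/segment-12": (5, "clear"), "traffic/segment-14": (2, "congested")}

print(merge(device, cloudlet))
# {'traffic/segment-12': (5, 'clear'), 'traffic/segment-14': (2, 'congested')}
```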

Security and fragmentation

By far the biggest challenge that cloudlets present, however, is security. One of the major driving forces behind the move to cloud infrastructure has been that these systems are significantly more secure than their distributed counterparts, because they allow all data to be brought together and managed under a centralized system for access and control. This has led to cloud storage systems becoming extremely popular among security-conscious organizations and individuals. 

Moving to a model in which data is stored and processed in cloudlets may undermine security, because as soon as data is fragmented across multiple storage and processing devices, it becomes more difficult to manage securely. To make matters worse, many of the proposed uses for cloudlet infrastructure call for data to be collected and stored on devices that hackers may be able to gain physical access to.

The clearest example of this is the type of cloudlet being proposed for autonomous vehicles. It’s not hard to imagine, for instance, that these vehicles could be connected together in cloudlets that allow them to share information on local traffic conditions without relying on a centralized infrastructure. The problem is that storing too much data in these decentralized clouds leaves it vulnerable to physical attack, because a car is much easier to gain physical access to than a data center.

A second issue with the security of cloudlet systems is that – by their very nature – they are bespoke, individually customized systems. There is little point, in other words, in investing in an interstitial cloud only to have it run the same resource-intensive software as a centralized system.

Designing bespoke cloudlet systems to handle the specific needs of equally bespoke IoT devices may make these more efficient, but it may also make them less secure. Large, monolithic cloud systems may have hundreds of security engineers looking out for potential threats: cloudlet systems that have been developed in-house cannot match this level of oversight. Recent advances in application security testing have addressed many of the security holes in cloudlet systems, but much work remains to be done.

On the other hand, there are some reasons to believe that cloudlet systems may be more secure than their full-cloud brethren. This is because the data collected by these smaller systems is inherently less exhaustive, and therefore inherently less valuable, than the full-spectrum data that resides on many cloud systems.

It’s also possible, of course, that in building systems which share data through a larger number of interstitial devices, more layers of protection can be put in place. 

In reality, however, the most effective way of doing this would be to encrypt data as it moves between cloudlet, device, and cloud. One of the simplest ways to accomplish this would be to rely on a virtual private network that uses the L2TP or IKEv2 protocols, both of which offer excellent security and reliability when negotiating a new tunnel session. At the moment, though, neither devices nor cloudlets possess the computational power to deploy strong encryption while retaining respectable performance.
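As a small illustration of what encrypting data in motion between the tiers might look like at the application layer, the sketch below uses symmetric Fernet encryption from the third-party Python cryptography package. It is a stand-in for illustration only, not the L2TP or IKEv2 VPN approach described above, and the key-distribution step is assumed to happen out of band.

```python
# Illustration of encrypting payloads as they move between device, cloudlet,
# and cloud, using symmetric (Fernet) encryption from the third-party
# "cryptography" package. A VPN would instead encrypt at the network layer.

from cryptography.fernet import Fernet

# In practice the key would be provisioned to device and cloudlet out of band.
key = Fernet.generate_key()
cipher = Fernet(key)

reading = b'{"sensor": "cam-04", "event": "pedestrian-detected"}'

# The device encrypts before handing the reading to the cloudlet...
token = cipher.encrypt(reading)

# ...and the cloudlet (holding the same key) decrypts before processing.
assert cipher.decrypt(token) == reading
print("payload protected in transit:", token[:32], b"...")
```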

Communications

Despite all of the opportunities that cloudlet models present, they remain fairly rare as of 2020. This is partially because the infrastructure required for the large-scale deployment of these systems is still being rolled out. 

One of these requirements is likely to be 5G connectivity. However, the way in which the 5G standard has been adopted in various parts of the world could have a major impact on the feasibility of the cloudlet model. 

Specifically, one of the major applications for cloudlet systems is likely to be in industrial (and particularly automotive) manufacturing, where construction robots may be linked together in a hierarchical “pyramid” of cloudlets and clouds. 

However, in the US, 5G has been standardized on the higher S band, which is also used for radar. As a result, it’s extremely difficult to allow these robots to communicate using 5G. In other countries, which have implemented a different 5G standard, using this extra connectivity to implement cloudlets in manufacturing may be easier.

It’s also worth noting, however, that 5G comes with its own security issues. At the moment, the vast majority of data passed over 5G networks is encrypted, and this is possible because it is only processed and stored in the cloud. Cloudlets increase the attack surface of these systems, raising concerns that these data could be stolen or surveilled.

The future

Whether cloudlets have a viable future remains to be seen. In many ways, this model may offer the best of both worlds for network engineers: a way of deploying advanced functionality on distributed devices without the associated bandwidth problems. On the other hand, it could be argued that the cloudlet model, by trying to make this compromise work, does neither task particularly well. Specifically, cloudlets may undermine the security inherent in centralized systems, while not offering that much more computational capacity.

Seen in this way, cloudlets are likely to be a key “battleground” over the next few years. On one side of this battle are companies keen to push their edge computing technologies, and to shift the location of computation “further left” in the data flow – toward end devices and edge data centers. On the other side are large, well-resourced cloud storage and security companies who have a vested interest in keeping computational tasks in the cloud.

In reality, we are likely to see the emergence of a mixed model over the course of the next decade, and this will be motivated more by necessity than desire. After all, 90% of all data that exists today was generated in the past two years, and 80% of that is video or images. Given this, it’s simply not feasible to continue to store and process all data either solely on devices or solely in the cloud.

What is required, in other words, is a set of interstitial computation and storage systems – and, most importantly, systems that are able to manage requests in an intelligent way.

Conclusion

The growing momentum behind CXL – which can be seen as a hardware-implemented cloudlet system – shows the value of having dynamic, assignable resources, and this insight is likely to have a huge impact on the way that cloudlets are used in the coming few years.

This will represent something of a reversal, of course, because in the 1970s the very first corporate networks, running on hierarchical mainframe models, used a system very similar to cloudlets. It may be, therefore, that we’ve come full circle. 

About the Author

Sam Bocetta is a former security analyst, having spent the bulk of his career as a network engineer for the Navy. He is now semi-retired, and educates the public about security and privacy technology. Much of Sam’s work involved penetration testing of ballistic systems. He analyzed Navy networks looking for entry points, then created security-vulnerability assessments based on his findings. Further, he helped plan, manage, and execute sophisticated "ethical" hacking exercises to identify vulnerabilities and reduce the risk posture of enterprise systems used by the Navy (both on land and at sea). The bulk of his work focused on identifying and preventing application and network threats, lowering attack vector areas, removing vulnerabilities, and general reporting. He was able to identify weak points and create new strategies that bolstered those networks against a range of cyber threats. Sam worked in close partnership with architects and developers to identify mitigating controls for vulnerabilities identified across applications and performed security assessments to emulate the tactics, techniques, and procedures of a variety of threats.
