Publications

[NSDI 2025] GRANNY: Granular Management of Compute-Intensive Applications in The Cloud (PDF)
C. Segarra, S. Shillaker, G. Li, E. Mappoura, R. Bruno, L. Vilanova, P. Pietzuch

Abstract: Parallel applications are typically implemented using multi-threading (with shared memory, e.g., OpenMP) or multi-processing (with message passing, e.g., MPI). While it seems attractive to deploy such applications in cloud VMs, existing cloud schedulers fail to manage these applications efficiently: they cannot scale multi-threaded applications dynamically when more vCPUs in a VM become available, and they cause fragmentation over time because of the static allocation of multi-process applications to VMs.

We describe GRANNY, a new distributed runtime that enables the fine-granular management of multi-threaded/process applications in cloud environments. GRANNY supports the vertical scaling of multi-threaded applications within a VM and the horizontal migration of multi-process applications between VMs. GRANNY achieves both through a single WebAssembly-based execution abstraction: Granules can execute application code with thread or process semantics and allow for efficient snapshotting. GRANNY scales up applications by adding more Granules at runtime, and de-fragments applications by migrating Granules between VMs. In both cases, it launches new Granules from snapshots efficiently. We evaluate GRANNY with dynamic scheduling policies and show that, compared to current schedulers, it reduces the makespan for OpenMP workloads by up to 60% and the fragmentation for MPI workloads by up to 25%.
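
The core mechanism can be pictured with a toy scheduling model. The sketch below uses assumed names (VM, scale_up, defragment, string granule IDs) that are not the actual GRANNY API: scale-up adds thread-granules to spare vCPUs in a VM, and de-fragmentation migrates process-granules so that a VM can be drained and released.

    # Illustrative sketch (not the GRANNY API): a toy scheduler that scales a
    # multi-threaded app up when vCPUs free up, and migrates "granules" of a
    # multi-process app to de-fragment VMs. All names here are hypothetical.

    from dataclasses import dataclass, field

    @dataclass
    class VM:
        name: str
        vcpus: int
        granules: list = field(default_factory=list)  # granule IDs placed here

        @property
        def free(self):
            return self.vcpus - len(self.granules)

    def scale_up(vm, app_id, n_new):
        """Add up to n_new thread-granules of app_id to the VM's spare vCPUs."""
        added = min(n_new, vm.free)
        vm.granules += [f"{app_id}-t{i}" for i in range(added)]
        return added

    def defragment(vms):
        """Greedily move granules from the emptiest VM into other VMs' spare slots."""
        by_load = sorted(vms, key=lambda v: len(v.granules))
        donor, receivers = by_load[0], by_load[1:]
        for recv in receivers:
            while donor.granules and recv.free > 0:
                recv.granules.append(donor.granules.pop())
        return donor  # a fully drained donor VM can be released

    vms = [VM("vm0", 4, ["mpi-r0", "mpi-r1"]), VM("vm1", 4, ["mpi-r2"]), VM("vm2", 4)]
    scale_up(vms[2], "omp-app", 3)
    drained = defragment(vms)
    print([v.granules for v in vms], "drained:", drained.name)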

[SoCC 2024] Is It Time To Put Cold Starts In The Deep Freeze? (PDF)
C. Segarra, I. Durev, P. Pietzuch

Abstract: Cold-start times have been the "end-all, be-all" metric for research in serverless cloud computing over the past decade. Reducing the impact of cold starts matters, because they can be the biggest contributor to a serverless function's end-to-end execution time. Recent studies from cloud providers, however, indicate that, in practice, a majority of serverless functions are triggered by non-interactive workloads. To substantiate this, we study the types of serverless functions used in 35 publications and find that over 80% of functions are not semantically latency sensitive. If a function is non-interactive and latency insensitive, is end-to-end execution time the right metric to optimize in serverless? What if cold starts do not matter that much, after all?

In this vision paper, we explore what serverless environments in which cold starts do not matter would look like. We make the case that serverless research should focus on supporting latency-insensitive, batch workloads. Based on this, we explore the design space for DFaaS, a serverless framework with an execution model in which functions can be arbitrarily delayed. DFaaS users annotate each function with a delay tolerance and, as long as the deadline has not passed, the runtime may interrupt or migrate function execution. Our micro-benchmarks suggest that, by targeting batch workloads, DFaaS can substantially improve the resource usage of serverless clouds and lower costs for users.
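
A minimal sketch of the execution model described above, with hypothetical names (delay_tolerant, DeferredQueue) that are not the DFaaS API: each function carries a delay tolerance, and the runtime is free to hold invocations back as long as their deadline has not passed.

    # Illustrative sketch (not the DFaaS API): annotating a function with a delay
    # tolerance, and a scheduler that may defer it as long as the deadline holds.

    import heapq, itertools, time

    _registry = {}            # function name -> delay tolerance in seconds
    _seq = itertools.count()  # tie-breaker for heap ordering

    def delay_tolerant(seconds):
        """Decorator: tag a function with the maximum delay its callers tolerate."""
        def wrap(fn):
            _registry[fn.__name__] = seconds
            return fn
        return wrap

    @delay_tolerant(seconds=3600)   # hypothetical batch job: fine to run any time within an hour
    def nightly_report(payload):
        return f"report over {len(payload)} records"

    class DeferredQueue:
        """Order invocations by deadline; the runtime may hold them while resources are busy."""
        def __init__(self):
            self.q = []

        def submit(self, fn, payload):
            deadline = time.time() + _registry.get(fn.__name__, 0)
            heapq.heappush(self.q, (deadline, next(_seq), fn, payload))

        def run_due(self, horizon_s=60):
            """Run every queued invocation whose deadline falls within the next horizon."""
            now = time.time()
            while self.q and self.q[0][0] <= now + horizon_s:
                _, _, fn, payload = heapq.heappop(self.q)
                print(fn.__name__, "->", fn(payload))

    dq = DeferredQueue()
    dq.submit(nightly_report, list(range(10)))
    dq.run_due(horizon_s=3600)      # a real scheduler would instead wait for idle resources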

[SESAME @ EuroSys 2024] Serverless Confidential Containers: Challenges and Opportunities (PDF)
C. Segarra, T. Feldman-Fitzthum, D. Buono, P. Pietzuch

Abstract: Serverless computing allows users to execute pieces of code (so-called functions) on demand in the cloud without having to provision any hardware resources. However, by executing in the cloud and delegating control over hardware resources, the integrity of the execution and the confidentiality of function code and data are at the mercy of the cloud provider and serverless runtime. Confidential computing aims to remove trust from the cloud provider by executing applications inside hardware enclaves. Despite the increasing adoption of confidential computing, designing a confidential serverless runtime with moderate performance overhead remains an open challenge.

In this short article, we present our experience porting the Knative serverless runtime to a confidential setting using Confidential Containers (CoCo), a technology that allows the execution of unmodified (encrypted) container images inside confidential VMs (cVMs). Our results show that cVMs are not ready to execute container-based serverless functions. Starting a serverless function in a CoCo from an encrypted container image with attestation takes up to 17 seconds. Starting 16 serverless functions concurrently takes more than three minutes, 20× slower than the non-confidential counterpart. We analyze the main sources of overhead and outline the research challenges to bridge the gap between confidential and serverless computing.
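
To make the phase attribution concrete, the following is a minimal timing-harness sketch: the phase names (cVM boot, remote attestation, encrypted image pull, container start) follow the breakdown discussed above, but the phase bodies are placeholders rather than Knative or CoCo calls.

    # Illustrative sketch: a tiny timing harness one could use to attribute a
    # confidential cold start to its phases. The sleeps below are stand-ins for
    # the real work; none of this is Knative or CoCo code.

    import time
    from contextlib import contextmanager

    timings = {}

    @contextmanager
    def phase(name):
        start = time.monotonic()
        try:
            yield
        finally:
            timings[name] = time.monotonic() - start

    def start_confidential_function():
        with phase("cvm-boot"):
            time.sleep(0.05)          # placeholder: launch the confidential VM
        with phase("attestation"):
            time.sleep(0.02)          # placeholder: remote-attestation round trip
        with phase("image-pull+decrypt"):
            time.sleep(0.08)          # placeholder: pull and decrypt the container image
        with phase("container-start"):
            time.sleep(0.01)          # placeholder: start the container inside the cVM

    start_confidential_function()
    total = sum(timings.values())
    for name, t in timings.items():
        print(f"{name:20s} {t*1000:6.1f} ms ({100*t/total:4.1f}%)")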

[BITCOIN 2021] A Novel Framework for the Analysis of Unknown Transactions in Bitcoin: Theory, Model, and Experimental Results (PDF)
M. Caprolu, M. Pontecorvi, M. Signorini, C. Segarra, R. Pietro

Abstract: Bitcoin (BTC) is probably the most transparent payment network in the world, thanks to the full history of transactions available to the public. Bitcoin is, however, not a fully anonymous environment but a pseudonymous one, which has motivated a number of attempts to break its pseudonymity using clustering techniques. There is a recurring assumption in all of these deanonymization techniques: that each transaction output has an address attached to it. That assumption is false. As evidence, as of block height 591,872, there are several million transactions with at least one output for which the Bitcoin Core client cannot infer an address.

In this paper, we present a novel approach based on sound graph theory for identifying transaction inputs and outputs. Our solution implements two simple yet innovative features: it does not rely on BTC addresses, and it explores all the transactions stored in the blockchain. All other existing solutions fail with respect to one or both of these features. In detail, we first introduce the concept of Unknown Transaction and provide a new framework to parse the Bitcoin blockchain while taking them into account. Then, we introduce a theoretical model to detect, study, and classify, for the first time in the literature, unknown transaction patterns in the user network. Further, in an extensive experimental campaign, we apply our model to the Bitcoin network to uncover hidden transaction patterns within the Bitcoin user network. The results are striking: we discovered more than 30,000 unknown transaction DAGs, with a few of them exhibiting a complex yet ordered topology and potentially connected to automated payment services. To the best of our knowledge, the proposed framework is the only one that enables a complete study of unknown transaction patterns, hence enabling further research in the field, for which we provide some directions.
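
The notion of an unknown output can be illustrated with a short sketch: an output is unknown when no address can be inferred from its output script. The transaction below is a hand-written dict whose field names merely resemble a decoded Bitcoin transaction; it is not tied to any particular client version.

    # Illustrative sketch: flag "unknown" outputs, i.e. outputs for which no
    # address can be inferred from the output script. Field names are
    # illustrative, not a guarantee of any specific client's JSON layout.

    def unknown_outputs(tx):
        """Return the indices of outputs that carry no inferable address."""
        unknown = []
        for i, out in enumerate(tx.get("vout", [])):
            spk = out.get("scriptPubKey", {})
            has_addr = bool(spk.get("address") or spk.get("addresses"))
            if not has_addr:
                unknown.append(i)
        return unknown

    tx = {
        "txid": "deadbeef",
        "vout": [
            {"value": 0.5, "scriptPubKey": {"type": "pubkeyhash",
                                            "address": "1ExampleAddr..."}},
            {"value": 0.0, "scriptPubKey": {"type": "nulldata"}},      # OP_RETURN: no address
            {"value": 0.1, "scriptPubKey": {"type": "nonstandard"}},   # raw script: no address
        ],
    }

    print(unknown_outputs(tx))   # -> [1, 2]

    # A transaction with at least one unknown output becomes a node in the
    # unknown-transaction graph, which the framework then parses and classifies.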

[SRDS 2020] MQT-TZ: Hardening IoT Brokers Using ARM TrustZone (PDF)
C. Segarra, R. Delgado-Gonzalo, V. Schiavoni

Abstract: The publish-subscribe paradigm is an efficient communication scheme with strong decoupling between nodes, which makes it especially well suited to large-scale deployments. It adapts natively to very dynamic settings and is used in a variety of real-world scenarios, including finance, smart cities, medical environments, and IoT sensors. Several of these application scenarios require increasingly stringent security guarantees due to the sensitive nature of the exchanged messages as well as the privacy demands of the clients, stakeholders, and receivers. MQTT is a lightweight topic-based publish-subscribe protocol popular in edge and IoT settings, and a de-facto standard widely adopted by industry and researchers. However, MQTT brokers must process data in the clear, hence exposing a large attack surface.

This paper presents MQT-TZ, a secure MQTT broker leveraging ARM TRUSTZONE, a trusted execution environment (TEE) commonly found even on inexpensive devices widely available on the market (such as Raspberry Pi units). We define a mutual TLS-based handshake and a two-layer encryption scheme for end-to-end security, using the TEE as a trusted proxy. The experimental evaluation of our fully implemented prototype with micro- and macro-benchmarks, as well as with real-world industrial workloads from a MedTech use case, highlights several trade-offs of using the TRUSTZONE TEE. We report several lessons learned while building and evaluating our system. We release MQT-TZ as open source.
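
The two-layer encryption can be sketched as a re-encryption step performed by the broker acting as a trusted proxy. The snippet below is a plain-Python illustration of the data flow, assuming per-client AES-GCM keys established during the mutual TLS handshake; it is not the TrustZone implementation.

    # Illustrative sketch of the two-layer encryption idea: the publisher encrypts
    # the payload with its own client key; the broker, acting as a trusted proxy
    # inside the TEE, decrypts it and re-encrypts it for each subscriber.

    import os
    from cryptography.hazmat.primitives.ciphers.aead import AESGCM

    def encrypt(key, plaintext):
        nonce = os.urandom(12)
        return nonce + AESGCM(key).encrypt(nonce, plaintext, None)

    def decrypt(key, blob):
        nonce, ct = blob[:12], blob[12:]
        return AESGCM(key).decrypt(nonce, ct, None)

    # Per-client symmetric keys, assumed to come out of the mutual-TLS handshake.
    publisher_key = AESGCM.generate_key(bit_length=128)
    subscriber_key = AESGCM.generate_key(bit_length=128)

    # Publisher side: first encryption layer.
    wire_msg = encrypt(publisher_key, b'{"hr": 72, "spo2": 98}')

    # Broker side (inside the TEE): decrypt with the publisher key,
    # re-encrypt with the key of each subscriber of the topic.
    plaintext = decrypt(publisher_key, wire_msg)
    for_subscriber = encrypt(subscriber_key, plaintext)

    # Subscriber side: second layer removed with the subscriber's own key.
    assert decrypt(subscriber_key, for_subscriber) == plaintext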

[EuroSys 2020] Kollaps: Decentralized and Dynamic Topology Emulation (PDF)
P. Gouveia, J. Neves, C. Segarra, L. Liechti, S. Issa, V. Schiavoni, M. Matos

Abstract: The performance and behavior of large-scale distributed applications are highly influenced by network properties such as latency, bandwidth, packet loss, and jitter. For instance, an engineer might need to answer questions such as: What is the impact of an increase in network latency on application response time? How does moving a cluster between geographical regions affect application throughput? How do network dynamics affect application stability? Answering these questions in a systematic and reproducible way is very hard, given the variability and lack of control over the underlying network. Unfortunately, state-of-the-art network emulators and testbeds scale poorly (e.g., MiniNet), focus exclusively on the control plane (e.g., CrystalNet), or ignore network dynamics (e.g., EmuLab).

Kollaps is a fully distributed network emulator that addresses these limitations. Kollaps hinges on two key observations. First, from an application's perspective, what matters are the emergent end-to-end properties (e.g., latency, bandwidth, packet loss, and jitter) rather than the internal state of the routers and switches leading to those properties. This premise allows us to build a simpler, dynamically adaptable emulation model that circumvents maintaining the full network state. Second, this simplified model is maintainable in a fully decentralized way, allowing the emulation to scale with the number of machines running the application.
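
The first observation can be illustrated with a small worked example (a sketch with made-up link values, not Kollaps code): the end-to-end latency of a path is the sum of the link latencies, the end-to-end bandwidth is the minimum link bandwidth, and the end-to-end loss compounds across links.

    # Illustrative sketch of the "emergent end-to-end properties" idea: instead of
    # emulating every router and switch, collapse a path into the properties the
    # application actually observes. Link values below are made up.

    from dataclasses import dataclass

    @dataclass
    class Link:
        latency_ms: float
        bandwidth_mbps: float
        loss: float          # drop probability in [0, 1]

    def end_to_end(path):
        """Collapse a list of links into the properties seen by the application."""
        latency = sum(l.latency_ms for l in path)
        bandwidth = min(l.bandwidth_mbps for l in path)
        delivery = 1.0
        for l in path:
            delivery *= (1.0 - l.loss)
        return latency, bandwidth, 1.0 - delivery

    path = [Link(5, 1000, 0.001), Link(40, 100, 0.01), Link(2, 10000, 0.0)]
    lat, bw, loss = end_to_end(path)
    print(f"latency={lat} ms, bandwidth={bw} Mbps, loss={loss:.3%}")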

Kollaps is fully decentralized, agnostic of the application language and transport protocol, scales to thousands of processes and is accurate when compared against a bare-metal deployment or state-of-the-art approaches that emulate the full state of the network. We showcase how Kollaps can accurately reproduce results from the literature and predict the behaviour of a complex unmodified distributed key-value store (i.e., Cassandra) under different deployments.

[MIE 2020] MQT-TZ: Secure MQTT Broker for Biomedical Signal Processing on the Edge (PDF)
C. Segarra, R. Delgado-Gonzalo, V. Schiavoni

Abstract: Physical health records belong to healthcare providers, but the information contained within belongs to each patient. Increasingly, health-related data is being acquired by wearables and other IoT devices, following the ever-growing trend of the Quantified Self. Even though data protection regulations (e.g., GDPR) encourage the usage of privacy-preserving processing techniques, most of the current IoT infrastructure was not originally conceived for such purposes. MQTT, one of the most widely used communication protocols, is a lightweight publish-subscribe protocol common in edge and IoT applications. In MQTT, the broker must process data in clear text, hence exposing a large attack surface for a malicious agent to steal or tamper with this health-related data. In this paper, we introduce MQT-TZ, a secure MQTT broker leveraging Arm TRUSTZONE, a popular Trusted Execution Environment (TEE). We define a mutual TLS-based handshake and a two-layer encryption scheme for end-to-end security, using the TEE as a trusted proxy. We provide a quantitative evaluation of our open-source PoC on streaming ECGs in real time and highlight the trade-offs.

[IEEE EMBC 2019] Secure Stream Processing for Medical Data (PDF)
C. Segarra, E. Muntane, M. Lemay, V. Schiavoni, R. Delgado-Gonzalo

Abstract: Medical data belongs to those who produce it. Increasingly, this data is processed in unauthorized third-party clouds that should never have the opportunity to access it. Moreover, recent data protection regulations (e.g., GDPR) pave the way towards the development of privacy-preserving processing techniques. In this paper, we present a proof of concept of a streaming IoT architecture that securely processes cardiac data in the cloud by combining trusted hardware and Spark. The additional security guarantees come with no changes to the application's code on the server. We tested the system with a database containing ECGs from wearable devices worn by 8 healthy males performing a standardized range of in-lab physical activities (e.g., run, walk, bike). We show that, compared with standard SPARK STREAMING, the addition of privacy comes at the cost of doubling the execution time.

[DAIS 2019] Using Trusted Execution Environments for Secure Stream Processing of Medical Data (PDF)
C. Segarra, R. Delgado-Gonzalo, M. Lemay, P. Aublin, P. Pietzuch, V. Schiavoni

Abstract: Processing sensitive data, such as that produced by body sensors, on third-party untrusted clouds is particularly challenging without compromising the privacy of the users generating it. Typically, these sensors generate large quantities of continuous data in a streaming fashion. Such vast amounts of data must be processed efficiently and securely, even under strong adversarial models. The recent introduction in the mass market of consumer-grade processors with Trusted Execution Environments (TEEs), such as Intel SGX, paves the way to implement solutions that overcome less flexible approaches, such as those atop homomorphic encryption. We present a secure stream processing system built on top of Intel SGX to showcase the viability of this approach with a system specifically fitted for medical data. We design and fully implement a prototype system that we evaluate with several realistic datasets. Our experimental results show that the proposed system achieves modest overhead compared to vanilla Spark while offering additional protection guarantees under powerful attacker and threat models.

Last update: 16/12/2024