Inria Rennes - Bretagne Atlantique Research Center

Job Opening

One Ph.D. position on energy-efficient data management in the Cloud is available. Interested students are invited to contact me for further information.

One Post-doc position is available (see below)


Stream data processing in shared Fog environments
Advisors: Shadi Ibrahim (Myriads team)
Main contacts: shadi.ibrahim (at) inria.fr
Application deadline: as early as possible!

The mutual low-latency objective of Data Stream Processing (DSP) and Fog environments has driven a continuous growth of DSP deployments in the Fog. The success of these deployments relies on operator placement and the ability to sustain low latency. Accordingly, much work has focused on placement strategies across Edge servers or across hybrid Cloud and Edge environments. Previous efforts have focused on reducing the communication overhead between nodes (inter-node communication) and on dividing the computation between Edge servers and Clouds. Unfortunately, they are oblivious to (1) the dynamic nature of data streams (i.e., data volatility and bursts) and (2) the bandwidth and resource heterogeneity in the Edge, both of which negatively affect the performance of stream data applications.
In recent work, we addressed the problem of data stream dynamicity. In particular, we showed that Maximum Sustainable Throughput (MST) -- the amount of data that a DSP system can ingest while keeping stable performance -- should be considered as an optimization objective for operator placement in the Edge. Accordingly, we designed and evaluated an MST-driven operator placement strategy (based on constraint programming) for stream data applications [1].
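To make the MST objective concrete, here is a minimal sketch (all operator names, costs, and capacities are hypothetical; the approach in [1] uses a constraint-programming solver rather than the exhaustive search shown here) of choosing a placement for a linear operator chain that maximizes an estimated MST, bounded by both compute capacity and inter-node bandwidth:

```python
from itertools import product

# Hypothetical toy model: each operator has a per-record CPU cost; each
# Edge node has a compute capacity and an uplink bandwidth (records/s).
# A placement's MST is bounded by its slowest node and slowest link.

operators = {"parse": 1.0, "filter": 0.5, "aggregate": 2.0}  # cost/record
nodes = {"edge-a": {"cpu": 100.0, "bw": 80.0},
         "edge-b": {"cpu": 60.0, "bw": 120.0}}

pipeline = ["parse", "filter", "aggregate"]  # linear operator chain

def estimated_mst(placement):
    """Throughput bound of a placement: compute bound, then bandwidth bound."""
    # Compute bound: a node ingests at most cpu / (sum of hosted op costs).
    load = {}
    for op, node in placement.items():
        load[node] = load.get(node, 0.0) + operators[op]
    mst = min(nodes[n]["cpu"] / l for n, l in load.items())
    # Bandwidth bound: a chain edge crossing nodes is capped by the
    # sending node's uplink bandwidth.
    for a, b in zip(pipeline, pipeline[1:]):
        if placement[a] != placement[b]:
            mst = min(mst, nodes[placement[a]]["bw"])
    return mst

# Exhaustive search stands in for the constraint solver at this toy size.
best = max(
    (dict(zip(pipeline, assign))
     for assign in product(nodes, repeat=len(pipeline))),
    key=estimated_mst,
)
print(best, estimated_mst(best))
```

Note how the best placement is not the one that minimizes inter-node traffic: splitting the chain across both nodes can yield a higher sustainable throughput than co-locating all operators on the fastest node.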

The goal of this project is to investigate how to enable dynamic operator placement in heterogeneous and dynamic environments like Fogs while meeting the requirements of diverse stream data applications. Accordingly, we will develop a new scheduling (operator placement) framework that allows a stream data application to receive the compute and I/O resources it requires to process, transfer, and store data when running in a shared Fog environment. The proposed framework will be integrated into one of the state-of-the-art data stream engines such as Flink [2], Storm [3], or Spark [4] and evaluated at large scale using synthetic applications and real-world stream data applications.

[1] Thomas Lambert, David Guyon, and Shadi Ibrahim. 2020. Rethinking Operators Placement of Stream Data Application in the Edge. In The 29th ACM International Conference on Information and Knowledge Management (CIKM ’20), October 19–23, 2020, Virtual Event, Ireland.
[2] “Apache Flink,” https://flink.apache.org.
[3] “Apache Storm,” https://storm.apache.org/.
[4] M. Zaharia, T. Das, H. Li, T. Hunter, S. Shenker, and I. Stoica, “Discretized streams: Fault-tolerant streaming computation at scale,” in Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles, ser. SOSP ’13, 2013, pp. 423–438.

Scheduling stream data executions in heterogeneous environments
Advisors: Shadi Ibrahim (Myriads team)
Main contacts: shadi.ibrahim (at) inria.fr
Application deadline: Filled

Stream data processing applications are emerging as first-class citizens in many enterprises and academia. Several stream data engines have been developed to meet the low-latency requirements of these applications, including Flink [1], Spark Streaming [2], Storm [3], and Google MillWheel [4]. The input rates of data streams do not usually remain unchanged. In addition, the multi-tenant nature of Clouds and Edge environments makes resource heterogeneity the norm. Therefore, it is important to consider both resource heterogeneity and data volatility when scheduling stream data applications.
The goal of this work is to investigate scheduling techniques (e.g., task cloning and replication), operator placements, and scheduling policies that improve the performance of stream data applications in the presence of resource heterogeneity in distributed data centers. Furthermore, we will extend the work to different distributed environments, including the Edge and disaggregated computing.
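As a rough illustration of why task cloning helps under heterogeneity, the following sketch (node speeds and costs are invented for the example) runs the same task on several randomly chosen nodes and keeps the first finisher, cutting the latency tail caused by slow, contended nodes:

```python
import random

random.seed(0)  # deterministic for the example

# Hypothetical model: node speeds vary due to multi-tenant contention.
# Cloning runs the same task on k nodes and takes the first finisher,
# trading extra resource use for lower tail latency.

def runtime(base_cost, node_speed):
    return base_cost / node_speed

def scheduled_latency(base_cost, speeds, clones):
    """Latency when the task is cloned on `clones` randomly chosen nodes."""
    chosen = random.sample(speeds, clones)
    return min(runtime(base_cost, s) for s in chosen)

speeds = [1.0, 1.0, 0.25, 0.9, 0.3, 1.1]  # a few slow (contended) nodes
single = [scheduled_latency(10.0, speeds, clones=1) for _ in range(1000)]
cloned = [scheduled_latency(10.0, speeds, clones=3) for _ in range(1000)]
print(sum(single) / len(single), sum(cloned) / len(cloned))
```

In this toy setting the cloned runs have a clearly lower average latency, at the cost of roughly tripling the compute spent per task; the research question is when and how aggressively to clone.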
[1] “Apache Flink,” https://flink.apache.org.
[2] M. Zaharia, T. Das, H. Li, T. Hunter, S. Shenker, and I. Stoica, “Discretized streams: Fault-tolerant streaming computation at scale,” in Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles, ser. SOSP ’13, 2013, pp. 423–438.
[3] “Storm,” http://storm-project.net/.
[4] T. Akidau, A. Balikov, K. Bekiroglu, S. Chernyak, J. Haberman, R. Lax, S. McVeety, D. Mills, P. Nordstrom, and S. Whittle, “Millwheel: fault-tolerant stream processing at internet scale,” Proceedings of the VLDB Endowment, vol. 6, no. 11, pp. 1033–1044, 2013.

Scalable and fast in-memory stream data processing middleware
Advisors: Shadi Ibrahim (STACK team)
Main contacts: shadi.ibrahim (at) inria.fr
Application deadline: Filled
Location: Inria, Nantes

Stream data engines like Flink [1], Spark Streaming [2], Storm [3], and Google MillWheel [4] usually operate under one of two paradigms: (1) the data stream paradigm, where streams are processed record by record to favor low latency, and (2) the operation stream paradigm, where streams are processed in micro-batches to achieve high throughput. This limits their performance in terms of throughput and latency when data streams exhibit high variation in load.
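The two paradigms can be contrasted on a toy word-counting stream (all names here are illustrative, not an actual engine API): record-at-a-time emits an updated result after every record, while micro-batching buffers records and emits once per batch.

```python
# Hypothetical toy contrast between the two stream-processing paradigms.

stream = ["a", "b", "a", "c", "b", "a", "c", "c"]

def record_at_a_time(records):
    """Data stream paradigm: update and emit state after every record."""
    counts, emissions = {}, []
    for r in records:
        counts[r] = counts.get(r, 0) + 1
        emissions.append((r, counts[r]))   # one result per record: low latency
    return emissions

def micro_batch(records, batch_size):
    """Operation stream paradigm: buffer records, process a batch at once."""
    counts, emissions = {}, []
    for i in range(0, len(records), batch_size):
        for r in records[i:i + batch_size]:
            counts[r] = counts.get(r, 0) + 1
        emissions.append(dict(counts))     # one result per batch: higher latency
    return emissions

print(len(record_at_a_time(stream)))  # 8 emissions, one per record
print(len(micro_batch(stream, 4)))    # 2 emissions, one per batch
```

The record-at-a-time variant pays per-record overhead for immediacy; the micro-batch variant amortizes that overhead but delays results by up to one batch, which is exactly the throughput/latency tension the middleware above targets.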
The goal of this work is to provide scalable and fast in-memory stream data processing middleware. We will define a software architecture for fast on-line data processing, the natural approach being in-memory data processing, and implement it as a middleware-level prototype. Moreover, we will introduce a pluggable task scheduling framework that provides the most appropriate allocation of tasks to nodes considering the current load of the streams, the resource heterogeneity (caused by resource contention in Clouds), and the network latency, thus reducing the gap between the high performance of in-memory access and the limited yet varying network latency.
[1] “Apache Flink,” https://flink.apache.org.
[2] M. Zaharia, T. Das, H. Li, T. Hunter, S. Shenker, and I. Stoica, “Discretized streams: Fault-tolerant streaming computation at scale,” in Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles, ser. SOSP ’13, 2013, pp. 423–438.
[3] “Storm,” http://storm-project.net/.
[4] T. Akidau, A. Balikov, K. Bekiroglu, S. Chernyak, J. Haberman, R. Lax, S. McVeety, D. Mills, P. Nordstrom, and S. Whittle, “Millwheel: fault-tolerant stream processing at internet scale,” Proceedings of the VLDB Endowment, vol. 6, no. 11, pp. 1033–1044, 2013.

Towards scalable and reliable Big Data stream computations on Clouds
Advisors: Shadi Ibrahim (STACK team)
Main contacts: shadi.ibrahim (at) inria.fr
Application deadline: Filled
Location: Inria, Nantes

Stream data processing applications are emerging as first-class citizens in many enterprises and academia; examples include computing click-through rates of links in social networks and abuse prevention. Hadoop MapReduce cannot handle stream data applications, as it requires the data to be stored in a distributed file system before processing. Several systems have been introduced for stream data processing, such as Flink [1], Spark Streaming [2], Storm [3], and Google MillWheel [4]. These systems keep computation in-memory for low latency and preserve scalability by partitioning the data or dividing the streams into a set of deterministic batch computations. However, they are designed to work in dedicated environments and do not consider the performance variability (i.e., network, I/O, etc.) caused by resource contention in the Cloud. This variability may in turn cause high and unpredictable latency when output streams are transmitted for further analysis. Moreover, they overlook the dynamic nature of data streams and the volatility of their computation requirements. Finally, they still address failures in a best-effort manner.
The goal of this work is to propose new approaches for reliable Big Data stream processing on Clouds by (1) exploring new mechanisms that expose resource heterogeneity (variability in resource utilization observed at runtime) when scheduling stream data applications, and (2) investigating how to automatically adapt to node failures and tailor failure-handling techniques to the characteristics of the running application and to the root cause of failures.
[1] “Apache Flink,” https://flink.apache.org.
[2] M. Zaharia, T. Das, H. Li, T. Hunter, S. Shenker, and I. Stoica, “Discretized streams: Fault-tolerant streaming computation at scale,” in Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles, ser. SOSP ’13, 2013, pp. 423–438.
[3] “Storm,” http://storm-project.net/.
[4] T. Akidau, A. Balikov, K. Bekiroglu, S. Chernyak, J. Haberman, R. Lax, S. McVeety, D. Mills, P. Nordstrom, and S. Whittle, “Millwheel: fault-tolerant stream processing at internet scale,” Proceedings of the VLDB Endowment, vol. 6, no. 11, pp. 1033–1044, 2013.

Fast and efficient data-intensive workflow executions in distributed Data-centers
Advisors: Shadi Ibrahim (STACK team)
Main contacts: shadi.ibrahim (at) inria.fr
Application deadline: Filled
Location: Inria, Nantes

Many large companies now deploy their services globally to guarantee low latency to users around the world and to ensure high availability at low cost. For example, social networks like Facebook and Twitter store their data in geo-distributed DCs to provide services worldwide with low latency. However, running data-intensive workflows on top of these geo-distributed environments (e.g., video stream processing, geo-distributed scientific data analytics, etc.) raises several challenges for existing data-intensive distributed workflow frameworks (e.g., MapReduce, Hadoop, Spark, TensorFlow), due to the low capacity of WAN links, the heterogeneity of resources, and the multiple levels of network heterogeneity in geo-distributed DCs. The goal of this work is to investigate novel scheduling policies and mechanisms for fast data-intensive workflow executions in massively distributed environments. In particular, we will study how to improve overall performance by considering the nature and size of workflow inputs (e.g., streams, fixed datasets, etc.) and intermediate data, the number of iterations, and the heterogeneity of resources when allocating tasks/jobs inside and between DCs.

Resource management and scheduling for Stream Data Applications on Clouds
Advisors: Shadi Ibrahim (STACK team)
Main contacts: shadi.ibrahim (at) inria.fr
Application deadline: Filled
Location: Inria, Nantes

Stream data processing applications are emerging as first-class citizens in many enterprises and academia. However, it is now commonplace for an organization to use the same infrastructure for a variety of data-intensive applications. While some work has focused on scheduling multiple data-intensive applications with mixed requirements (e.g., deadlines), little work has considered workloads that mix in different types of applications (i.e., stream data applications). Moreover, when sharing Cloud resources, fairness, consolidation, and performance come into question: for instance, how can we preserve high system utilization while avoiding QoS violations for the diverse applications? To address these challenges, an adaptive job scheduling framework will be developed. The framework will be equipped with several scheduling policies that can be adaptively tuned in response to each application's behavior and requirements.
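One way to picture such an adaptive framework (a deliberately simplified sketch; the policy names, job fields, and switching threshold are all invented for illustration) is a scheduler that swaps its ordering policy based on observed load, falling back to a deadline-aware order only when the system is near saturation:

```python
# Hypothetical sketch of adaptive policy selection in a shared cluster.

def fifo(jobs):
    """Light load: simple arrival-order fairness."""
    return sorted(jobs, key=lambda j: j["arrival"])

def earliest_deadline_first(jobs):
    """Heavy load: prioritize jobs at risk of QoS (deadline) violation."""
    return sorted(jobs, key=lambda j: j["deadline"])

def adaptive_order(jobs, utilization):
    # The 0.8 threshold is an arbitrary example; a real framework would
    # tune this in response to observed application behavior.
    policy = earliest_deadline_first if utilization > 0.8 else fifo
    return policy(jobs)

jobs = [{"id": 1, "arrival": 0, "deadline": 9},
        {"id": 2, "arrival": 1, "deadline": 3}]
print([j["id"] for j in adaptive_order(jobs, utilization=0.9)])  # [2, 1]
print([j["id"] for j in adaptive_order(jobs, utilization=0.4)])  # [1, 2]
```

The research challenge is in the switching logic itself: detecting when stream-application behavior changes and picking the policy that preserves utilization without violating QoS.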

