Charmed Apache Spark Components Overview
The Charmed Apache Spark Solution bundles the following components:
- spark8t, a Python package that extends Apache Spark with tooling to manage Spark jobs and service accounts, with hierarchical levels of configuration
- Charmed Apache Spark Rock, an OCI-compliant image that bundles Apache Spark binaries together with Canonical tooling. Use it to start your Apache Spark workloads on Kubernetes, to run the Charmed Apache Spark CLI tooling, or to derive your own images from secured and supported bases
- Apache Spark Client Snap, which simplifies Apache Spark installation on edge nodes or local machines by leveraging confined snaps and exposing simple snap commands to run and manage Spark jobs
- Charmed Bundle to deploy, manage and operate Charmed Apache Spark using Juju. This includes:
  - Spark History Server to expose a web UI for analysing the logs of previous Spark jobs
  - Charmed Apache Kyuubi to provide a JDBC/ODBC endpoint for running Hive queries powered by Apache Spark engines
  - Integration Hub for Apache Spark to enable easy configuration of Apache Spark service accounts. It provides native Juju integrations with S3 Integrator and Azure Storage Integrator for object-storage persistence, and with the Canonical Observability Stack (COS) for resource usage monitoring and alerting
The following diagram shows how the different artifacts interact with each other:
flowchart TD
spark8t["`**spark8t**
(*Python package*)
exposes functionalities to create, configure and manage Apache Spark users via a Python SDK`"]
spark-rock["`**Charmed Apache Spark Rock**
(*OCI Image*)
provides a reliable Apache Spark image to run Apache Spark applications and Apache Spark CLI tooling`"]
spark-client["`**Spark Client Snap**
(*snap*)
simplifies client integration with an Apache Spark Kubernetes cluster via a snap package to be installed on edge nodes or locally`"]
spark-k8s-bundle["`**Charmed Apache Spark**
(*Charmed Operator*)
manages the entire lifecycle of Spark jobs`"]
spark8t --> spark-rock
spark8t --> spark-client
spark-rock --> spark-client
spark-rock --> spark-k8s-bundle
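The relationships in the diagram can also be written down as a small dependency graph. The following is a minimal Python sketch, not part of the Charmed Apache Spark tooling: it assumes each arrow means "is used by", and the names `BUILDS_ON` and `transitive_dependencies` are hypothetical, introduced only for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Assumption: an arrow "a --> b" in the diagram means "b builds on a".
# Each key maps an artifact to the artifacts it builds on directly.
BUILDS_ON = {
    "spark8t": set(),
    "spark-rock": {"spark8t"},
    "spark-client": {"spark8t", "spark-rock"},
    "spark-k8s-bundle": {"spark-rock"},
}


def transitive_dependencies(artifact):
    """Return every artifact that `artifact` builds on, directly or indirectly."""
    seen = set()
    stack = list(BUILDS_ON[artifact])
    while stack:
        dep = stack.pop()
        if dep not in seen:
            seen.add(dep)
            stack.extend(BUILDS_ON[dep])
    return seen


# An order in which every artifact appears after the artifacts it builds on.
order = list(TopologicalSorter(BUILDS_ON).static_order())

if __name__ == "__main__":
    print(order)
    print(sorted(transitive_dependencies("spark-k8s-bundle")))
```

Read this way, the charmed bundle reaches spark8t only through the rock image, while the client snap uses both spark8t and the rock directly.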
The Charmed Apache Spark solution can be used to deploy and manage Apache Spark workloads, using the provided distribution, on any conformant Kubernetes (version 1.26 and above), such as:
- MicroK8s, which is the simplest production-grade conformant K8s: lightweight and focused, with a single-command install on Linux, Windows and macOS. Refer to the MicroK8s documentation for more information.
- Charmed Kubernetes, which is a platform-independent, model-driven distribution of Kubernetes powered by Juju
- AWS EKS, which is the managed Kubernetes service provided by Amazon Web Services to run Kubernetes in the AWS cloud and in on-premises data centers.
Setup instructions are available in the "Set up the environment" chapter of the Apache Spark Tutorial.
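The Kubernetes version requirement mentioned above (1.26 and above) can be checked programmatically before deploying. The following is a minimal Python sketch under stated assumptions: it expects a version string in the style Kubernetes reports (for example the `gitVersion` field, such as `v1.28.3`), and the helper names `parse_k8s_version` and `is_supported` are hypothetical, not part of any Charmed Apache Spark tool.

```python
import re

# Minimum Kubernetes version stated in this document.
MIN_SUPPORTED = (1, 26)


def parse_k8s_version(git_version):
    """Extract (major, minor) from a string such as 'v1.28.3' or 'v1.27.4-eks-abc'."""
    match = re.match(r"v?(\d+)\.(\d+)", git_version.strip())
    if match is None:
        raise ValueError(f"unrecognised Kubernetes version: {git_version!r}")
    return int(match.group(1)), int(match.group(2))


def is_supported(git_version):
    """Return True if the cluster version meets the documented minimum."""
    return parse_k8s_version(git_version) >= MIN_SUPPORTED


if __name__ == "__main__":
    for version in ("v1.25.9", "v1.26.0", "v1.29.2-eks-abc"):
        print(version, is_supported(version))
```

Comparing `(major, minor)` tuples rather than raw strings avoids the pitfall where `"1.9"` sorts after `"1.26"` lexicographically. How the version string is obtained from the cluster is left outside the sketch.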