Protecting Databricks cluster init scripts

This blog was co-authored by Elia Florio, Sr. Director of Detection & Response at Databricks, and Florian Roth and Marius Bartholdy, security researchers with SEC Consult.

Securing the Databricks platform and continually raising the bar with security enhancements is the objective of our Security team and the primary reason we invest in our bug bounty program. Through this program, we encourage (and reward) submissions from talented industry experts who bring potential issues to our attention. By collaborating with the broader security community, we can uncover and remediate newly discovered product issues and make the Databricks platform an even more secure and safe place.

When possible and interesting to the security community, we also share success stories from collaborations that come out of our bug bounty program. Today we want to showcase how a well-written report from SEC Consult helped accelerate the sunsetting of certain deprecated legacy features and the adoption of our new feature, workspace files.

This blog consists of separate sections authored by Databricks and SEC Consult respectively. It goes into the technical details of the discovered security issues, the affected configurations, the impact, and the solutions implemented to address the vulnerability. We want to thank SEC Consult for their professionalism and cooperation on this disclosure.


At the end of January 2023, Databricks received a report from SEC Consult about a potential privilege escalation issue that could allow an authenticated, low-privileged user of a cluster to elevate privileges and gain admin-level access to other clusters within the boundary of the same workspace and organization.

Our initial investigation aligned with the finder's report and showed that exploitation of this issue required (a) a potential attacker to be in possession of a valid authenticated account, and (b) the relevant workspace to have either legacy global init scripts for clusters enabled or, alternatively, a preconfigured init script (cluster-named or cluster-scoped) stored on DBFS. In contrast to the case of cluster init scripts stored on DBFS, where the vulnerability can only be exploited where a script exists, enabling legacy global init scripts (even without a script file) is sufficient to be exposed to this issue.

In both cases (legacy global init scripts enabled or cluster init scripts stored on DBFS), an authenticated low-privileged user could add or take over an init script and execute additional commands using the elevated privileges associated with running init scripts. Databricks has not found evidence of such privilege escalations occurring in practice.

It is important to note that legacy global init scripts reached deprecation status almost 3 years ago and that customers can disable such legacy scripts with the simple switch of a toggle already present in the product UI (AWS | Azure).
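For admins who prefer automation over the UI toggle, the same setting can be flipped through the workspace configuration REST API. The sketch below only builds the request and never sends it; the `/api/2.0/workspace-conf` endpoint and the `enableDeprecatedGlobalInitScripts` key follow the public Databricks documentation, while the host and token values are placeholders.

```python
import json
import urllib.request

def build_disable_legacy_init_request(host: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a PATCH that disables legacy global init scripts.

    Mirrors the documented workspace-conf endpoint; host/token are
    caller-supplied placeholders, not real credentials.
    """
    body = json.dumps({"enableDeprecatedGlobalInitScripts": "false"}).encode()
    return urllib.request.Request(
        url=f"https://{host}/api/2.0/workspace-conf",
        data=body,
        method="PATCH",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )

# Inspect the request we would issue (nothing is sent anywhere).
req = build_disable_legacy_init_request("adb-1234.5.azuredatabricks.net", "dapi-EXAMPLE")
print(req.method, req.full_url)
```

Sending the same payload with the key set to `"true"` would re-enable the feature, which is why recommendation 1 below pairs disabling with an audit of existing scripts.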

Figure 1: Easy toggle available to disable legacy global init scripts

The following table summarizes the most common scenarios for the different types of init scripts (AWS | Azure | GCP):

| Init script type | Applicable cloud | Vulnerability status | Previously deprecated |
| --- | --- | --- | --- |
| Legacy global | AWS, Azure | Vulnerable | Yes |
| Cluster-named | AWS, Azure | Vulnerable | Yes |
| Global | AWS, Azure, GCP | Not vulnerable | No |
| Cluster-scoped (stored on DBFS) | AWS, Azure, GCP | Vulnerable | No |
| Cluster-scoped (stored as workspace files) | AWS, Azure, GCP | Not vulnerable | No |
| Cluster-scoped (stored on AWS/Azure/GCP cloud storage) | AWS, Azure, GCP | Not vulnerable | No |


In response to this report from SEC Consult, we took the opportunity to harden our platform and keep customers safe with a series of additional actions and new product features:

  • We immediately disabled the creation of new workspaces using the deprecated init script types (specifically, legacy global init scripts and cluster-named scripts);
  • We announced a strict End-Of-Life deadline (September 1, 2023) for all deprecated init script types to further accelerate the migration to safer alternatives;
  • We engaged remaining customers who had not followed our earlier recommendation to disable deprecated init scripts and helped them migrate to safer alternatives by providing tools to automate the process for both legacy global init scripts and cluster-named init scripts;
  • Our Product and Engineering teams added support for cluster-scoped init scripts stored in workspace files (AWS | Azure | GCP), a more secure alternative recently made generally available. We also changed the default location of cluster-scoped init scripts in the product UI to workspace files and added a visible message for users who still attempt to use DBFS to store init scripts.
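In the Clusters API, the difference between the old and new storage locations is just the key used under `init_scripts`. The fragment below sketches both shapes; the cluster names and script paths are invented for illustration and do not come from the report.

```python
# Hypothetical cluster specs; only the init_scripts stanza matters here.
dbfs_spec = {
    "cluster_name": "example-dbfs",
    "init_scripts": [
        # Deprecated pattern: script lives on the workspace-wide DBFS.
        {"dbfs": {"destination": "dbfs:/databricks/scripts/setup.sh"}}
    ],
}

workspace_spec = {
    "cluster_name": "example-workspace-files",
    "init_scripts": [
        # Recommended pattern: script is a workspace file protected by ACLs.
        {"workspace": {"destination": "/Users/someone@example.com/setup.sh"}}
    ],
}

def uses_dbfs_init_scripts(spec: dict) -> bool:
    """Return True if any init script in the spec is stored on DBFS."""
    return any("dbfs" in script for script in spec.get("init_scripts", []))

print(uses_dbfs_init_scripts(dbfs_spec), uses_dbfs_init_scripts(workspace_spec))
```

The security difference is that a DBFS destination is writable by any workspace user, while a workspace-file destination can be locked down with per-object ACLs.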
Figure 2: New recommendation to use workspace files instead of DBFS

Files support in the workspace allows Databricks users to store Python source code, reference datasets, or any other kind of file content (including init scripts) directly alongside their notebooks (AWS | Azure | GCP). Workspace files extend capabilities previously available in Databricks Repos across the entire platform, even if users are not working with version control systems. Workspace files also allow you to secure access to individual files or folders using that object's Access Control Lists (ACLs) (AWS | Azure | GCP), which can be configured to restrict access to specific users or groups.

Figure 3: Workspace files is the new default location to store init scripts

Guidance and recommendations

We have been encouraging customers to move away from legacy and deprecated init scripts for the past 3 years, and this security finding recently reported by SEC Consult only highlights why customers should complete this migration journey as soon as possible. At the same time, the introduction of workspace files for init scripts (AWS | Azure | GCP) marks the initial milestone of our plan to deliver a modern and more secure storage alternative to DBFS.

Customers can increase the security of their Databricks deployments and mitigate the security issue discussed in this blog by doing the following:

  1. Immediately disable legacy global init scripts (AWS | Azure) if not actively used: it is a safe, easy, and immediate action to close this potential attack vector.
  2. Customers with legacy global init scripts deployed should first migrate the legacy scripts to the new global init script type (this notebook can be used to automate the migration work) and, after this migration step, proceed to disable the legacy version as indicated in the previous step.
  3. Cluster-named init scripts are also affected by the issue and are likewise deprecated: customers still using this type of init script should disable cluster-named init scripts (AWS | Azure), migrate them to cluster-scoped scripts, and make sure that the scripts are stored in the new workspace files storage location (AWS | Azure | GCP). This notebook can be used to automate the migration work.
  4. Existing cluster-scoped init scripts stored on DBFS should be migrated to the alternative, safer workspace files location (AWS | Azure | GCP).
  5. Use the Databricks Security Analysis Tool (SAT) to automate security health checks of your Databricks workspace configurations against Databricks security best practices.
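To see where a workspace stands on steps 3 and 4, one way is to scan the output of the Clusters API (`GET /api/2.0/clusters/list`) for init scripts still stored on DBFS. A minimal sketch, run here against a hand-written sample payload rather than a live workspace:

```python
def clusters_with_dbfs_init_scripts(clusters_response: dict) -> list[str]:
    """Return names of clusters whose init scripts still live on DBFS.

    `clusters_response` mirrors the JSON shape of GET /api/2.0/clusters/list.
    """
    flagged = []
    for cluster in clusters_response.get("clusters", []):
        for script in cluster.get("init_scripts", []):
            if "dbfs" in script:  # DBFS-backed scripts are the migration target
                flagged.append(cluster.get("cluster_name", "<unnamed>"))
                break
    return flagged

# Hand-written sample payload for illustration only.
sample = {
    "clusters": [
        {"cluster_name": "etl-prod",
         "init_scripts": [{"dbfs": {"destination": "dbfs:/databricks/scripts/a.sh"}}]},
        {"cluster_name": "ml-dev",
         "init_scripts": [{"workspace": {"destination": "/Users/someone@example.com/a.sh"}}]},
        {"cluster_name": "adhoc"},
    ]
}
print(clusters_with_dbfs_init_scripts(sample))  # ['etl-prod']
```

Any flagged cluster is a candidate for the DBFS-to-workspace-files migration described above; SAT performs broader checks of this kind automatically.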

The following section is a reproduction of the technical report authored by SEC Consult researchers Florian Roth and Marius Bartholdy. While the research described below was carried out and tested against Azure Databricks as an example, the findings related to the deprecated init script types affect other cloud providers, as stated in the table above.

Thank you again to SEC Consult, and to all of the security researchers who are working with us to make Databricks more secure every day. If you are a security researcher, we will see you at

Investigating Databricks init scripts security

By Florian Roth and Marius Bartholdy, SEC Consult

A low-privileged user was able to break the isolation between Databricks compute clusters within the boundary of the same workspace and organization by gaining remote code execution. This subsequently would have allowed an attacker to access all files and secrets in the workspace, as well as to escalate their privileges to those of a workspace administrator.

The Databricks File System (DBFS) is fully accessible to every user in a workspace. Since cluster-scoped and legacy global init scripts were stored there as well, an authenticated attacker with default permissions could:

  1. Find and modify an existing cluster-scoped init script.
  2. Place a new script in the default location for legacy global init scripts.

1) Attack chain using an existing init script
The default option for supplying deprecated init script types (such as legacy global or cluster-named) was to upload them to DBFS. Because DBFS is shared between all compute clusters inside the same workspace, it was possible to find or guess any pre-existing init scripts that had previously been configured on a cluster and stored on DBFS. This could be achieved by listing the contents of existing DBFS directories:

display(dbutils.fs.ls("dbfs:/databricks/scripts"))

All of these files could potentially be cluster-scoped init scripts, so the goal was to replace one of them somehow. While it was not possible to directly overwrite the file, with the following code it could be renamed and a new script with the old name could be created. The new malicious script contained a simple reverse shell that would be launched periodically. Since the cluster configuration was only aware of the script names, as soon as the init script was triggered again, a reverse shell with root privileges on the compute cluster was obtained:
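The rename-and-replace step can be illustrated with ordinary file operations. The sketch below reproduces the logic against a local temporary directory standing in for `dbfs:/databricks/scripts`; the script name and payload are invented for illustration and are not the researchers' actual exploit code.

```python
import os
import tempfile

def rename_and_replace(script_dir: str, script_name: str, payload: str) -> str:
    """Move the original script aside and plant a new one under its old name.

    Mirrors the reported technique: the cluster config only remembers the
    script *name*, so whatever file carries that name runs at the next init.
    """
    original = os.path.join(script_dir, script_name)
    backup = original + ".bak"
    os.rename(original, backup)      # direct overwrite was blocked; renaming was not
    with open(original, "w") as f:   # a new file takes over the trusted name
        f.write(payload)
    return backup

# Local stand-in for the shared DBFS scripts directory.
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, "setup.sh"), "w") as f:
    f.write("echo legitimate init\n")

rename_and_replace(demo_dir, "setup.sh", "# attacker-controlled contents\n")
print(sorted(os.listdir(demo_dir)))  # ['setup.sh', 'setup.sh.bak']
```

On Databricks the same sequence would go through `dbutils.fs` against DBFS paths; the point is that write access to the script's directory is equivalent to code execution at cluster init.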

Figure 4: New init script and shell

Secrets can only be retrieved at runtime by the compute instance itself via a managed identity. Even workspace administrators cannot read them. Since they are nevertheless available to the compute cluster as soon as it is initialized, it was possible to retrieve their clear-text values. Spark configuration secrets can be found at /tmp/custom-spark.conf, while secrets in environment variables are accessible by reading the /proc/<process-id>/environ file of the relevant process.
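The `/proc/<pid>/environ` file is simply a NUL-separated list of KEY=VALUE pairs, so recovering secrets from it takes only a few lines. A small sketch, parsing a hand-made byte string rather than a live process (the variable names are illustrative):

```python
def parse_environ(raw: bytes) -> dict[str, str]:
    """Parse the NUL-separated KEY=VALUE format used by /proc/<pid>/environ."""
    env = {}
    for entry in raw.split(b"\x00"):
        if b"=" in entry:
            key, _, value = entry.partition(b"=")
            env[key.decode()] = value.decode()
    return env

# Illustrative bytes; an attacker with a root shell would instead read
# open("/proc/<pid>/environ", "rb").read() for the target process.
sample = b"PATH=/usr/bin\x00MY_SECRET=s3cr3t\x00"
print(parse_environ(sample)["MY_SECRET"])  # s3cr3t
```

This is why root-level code execution on the cluster defeats the "secrets are hidden even from admins" property: the secrets must exist in clear text somewhere in the running processes' environment.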


Using a vulnerability originally discovered by Joosua Santasalo from Secureworks, it is possible to leak the Databricks API tokens of other users, including administrators, if they run on the same instance. The original finding was remediated by isolating users from each other, and especially from administrators. However, with the vulnerability presented here, the isolation could be broken by executing attacker-controlled scripts, and the old exploit became valid again.

Using the previously established reverse shell it was possible to capture control-plane traffic. As soon as we started a task with the administrative user, for example running a simple notebook, the token was sent unencrypted and could be leaked:

Figure 5: packet capture on the backdoored cluster revealing the apiToken

The captured token could then be used to authenticate requests to the Databricks REST API. The following example allowed listing secret scopes and thereby confirmed that the token had administrative privileges:
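A request of that shape can be sketched as follows. The secret-scopes endpoint is the documented `GET /api/2.0/secrets/scopes/list`; the host and token values are placeholders, and the request is only constructed, never sent.

```python
import urllib.request

def build_list_scopes_request(host: str, stolen_token: str) -> urllib.request.Request:
    """Build (but do not send) an authenticated secret-scopes listing request."""
    return urllib.request.Request(
        url=f"https://{host}/api/2.0/secrets/scopes/list",
        headers={"Authorization": f"Bearer {stolen_token}"},
        method="GET",
    )

req = build_list_scopes_request("adb-1234.5.azuredatabricks.net", "dapi-EXAMPLE")
print(req.full_url)
```

A successful response to such a request is enough to confirm the token works against admin-gated APIs, which is the check the researchers describe.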


2) Attack chain using legacy global init scripts
The same attack vector affected legacy global init scripts. These were deprecated in 2020, but left enabled by default in all workspaces, and were also stored on DBFS, specifically at dbfs:/databricks/init/. Any cluster would execute their contents on initialization. Therefore, simply creating a new script in that directory would eventually lead to code execution on all clusters.
