This article describes the discovery of the first "in-the-wild" Spark REST API Remote Code Execution (RCE) vulnerability, made by Fengwei Zhang and the team at Alibaba Cloud Security on July 7, 2018.
I am posting on behalf of the Apache Spark PMC. This is *not* considered a security vulnerability and should not be advertised as such. The report simply says that if one runs an inherently private service (the Spark standalone master) without enabling any ACLs or network security to block public access to it, then it can be accessed publicly. Of course it can.

There is in general no expectation that a Spark cluster is publicly accessible. (The standalone master is also intended for 'simple' usage, and secure environments typically use another resource manager with its own security mechanisms.) Undoubtedly, someone somewhere has left one running on the public internet. However, these are not software problems, but unreasonably poor choices in individual deployments.

The remedy is indeed to not provide public access to these services, or otherwise to adopt other Spark resource managers with more elaborate security integrations. We can improve documentation to make this very clear.

However, this should not be described as a security vulnerability. That label suggests that there will be a CVE and that a software patch is required. Neither is the case. Normal network security practices, and other Spark resource manager mechanisms, provide security to prevent this.

Of course, we would not normally post about this in public. I do so because this blog was posted publicly at the same time it was raised with the Apache Spark PMC. If it were a vulnerability requiring a fix, this disclosure would be considered highly irresponsible. We encourage anyone reporting a vulnerability to follow the standard protocol at https://apache.org/security/
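For readers who want to check their own deployment against the remedy described above, the following is a minimal sketch, not an official Spark tool. It assumes the standalone master still listens on the usual default ports (7077 for the master, 6066 for the REST submission server, 8080 for the web UI); the function name `reachable_ports` is hypothetical. Run it from a machine outside the cluster's network and only against hosts you operate.

```python
import socket
import sys

# Assumed Spark standalone defaults; adjust if your deployment overrides them.
# 7077 = master RPC, 6066 = REST submission server, 8080 = master web UI.
DEFAULT_MASTER_PORTS = (7077, 6066, 8080)


def reachable_ports(host, ports=DEFAULT_MASTER_PORTS, timeout=2.0):
    """Return the subset of `ports` on `host` that accept a TCP connection."""
    open_ports = []
    for port in ports:
        try:
            # A successful connect only shows the port is reachable from here;
            # it does not prove that a Spark service is behind it.
            with socket.create_connection((host, port), timeout=timeout):
                open_ports.append(port)
        except OSError:
            # Refused, filtered, timed out, or unresolvable: treat as not reachable.
            pass
    return open_ports


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_ports.py <your-master-hostname>
    found = reachable_ports(sys.argv[1])
    print("reachable ports:", found if found else "none")
```

An empty result from an external vantage point is consistent with the master being shielded by network controls; a non-empty result suggests the firewall or security-group rules deserve a closer look.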