Security Posts

Infocon: green

ISC Stormcast For Friday, June 14th, 2024
Categories: Security Posts

Understanding Apple’s On-Device and Server Foundation Models release

By Artem Dinaburg

Earlier this week, at Apple’s WWDC, we finally witnessed Apple’s AI strategy. The videos and live demos were accompanied by two long-form releases: Apple’s Private Cloud Compute and Apple’s On-Device and Server Foundation Models. This blog post is about the latter.

So, what is Apple releasing, and how does it compare to the current open-source ecosystem? We integrate the video and long-form releases and parse through the marketing speak to bring you the nuggets of information within.

The sound of silence

No NVIDIA/CUDA Tax. What’s unsaid is as important as what is, and the unsaid words here are CUDA and NVIDIA. Apple goes out of its way to specify that it is not dependent on NVIDIA hardware or CUDA APIs for anything. The training uses Apple’s AXLearn (which runs on TPUs and Apple Silicon), Server model inference runs on Apple Silicon (!), and the on-device APIs are CoreML and Metal.

Why? Apple hates NVIDIA with the heat of a thousand suns. Tim Cook would rather sit in a data center and do matrix multiplication with an abacus than spend millions on NVIDIA hardware. Aside from personal enmity, it is a good business idea. Apple has its own ML stack from the hardware on up and is not hobbled by GPU supply shortages. Apple also gets to dogfood its hardware and software for ML tasks, ensuring that it’s something ML developers want.

What’s the downside? Apple’s hardware and software ML engineers must learn new frameworks and may accidentally repeat prior mistakes. For example, Apple devices were originally vulnerable to LeftoverLocals, but NVIDIA devices were not.

If anyone from Apple is reading this, we’d love to audit AXLearn, MLX, and anything else you have cooking! Our interests are in the intersection of ML, program analysis, and application security, and your frameworks pique our interest.

The models

There are (at least) five models being released. Let’s count them:
  1. The ~3B parameter on-device model used for language tasks like summarization and Writing Tools.
  2. The large Server model used for language tasks too complex to do on-device.
  3. The small on-device code model built into Xcode used for Swift code completion.
  4. The large Server code model (“Swift Assist”) that is used for complex code generation and understanding tasks.
  5. The diffusion model powering Genmoji and Image Playground.
There may be more; these aren’t explicitly stated but are plausible: a re-ranking model for working with Semantic Search and a model for instruction following that will use app intents (although this could just be the normal on-device model).

The ~3B parameter on-device model. Apple devices are getting an approximately 3B parameter on-device language model trained on web crawl and synthetic data and specially tuned for instruction following. The model is similar in size to Microsoft’s Phi-3-mini (3.8B parameters) and Google’s Gemini Nano-2 (3.25B parameters). The on-device model will be continually updated and pushed to devices as Apple trains it with new data.

What model is it? A reasonable guess is a derivative of Apple’s OpenELM. The parameter count fits (3B), the training data is similar, and there is extensive discussion of LoRA and DoRA support in the paper, which only makes sense if you’re planning a system like the one Apple has deployed. It is almost certainly not directly OpenELM since the vocabulary sizes do not match and OpenELM has not undergone safety tuning.

Figure: Apple’s on-device and server model architectures.

A large (we’re guessing 130B-180B) Mixture-of-Experts Server model. For tasks that can’t be completed on a device, there is a large model running on Apple Silicon servers in Apple’s Private Cloud Compute. This model is similar in size and capability to GPT-3.5 and is likely implemented as a Mixture-of-Experts (MoE). Why are we so confident about the size and MoE architecture? The open-source comparison models in the cited benchmarks (DBRX, Mixtral) are MoE and approximately of that size; it’s too much for a mere coincidence.

Figure: Apple’s Server model compared to open-source alternatives and the GPT series from OpenAI.

The on-device code model is cited in the Platforms State of the Union; several examples of GitHub Copilot-like behavior integrated into Xcode are shown.
There are no specifics about the model, but a reasonable guess would be a 2B-7B code model fine-tuned for a specific task: fill-in-middle (FIM) for Swift. The model is trained on Swift code and Apple SDKs (likely both code and documentation). From the demo video, the integration into Xcode looks well done; Xcode gathers local symbols and proper context for the model to better predict the correct text.

Figure: Apple’s on-device code model doing FIM completions for Swift code via Xcode.

The server code model is branded as “Swift Assist” and also appears in the Platforms State of the Union. It looks to be Apple’s answer to GitHub Copilot Chat. Not much detail is given regarding the model, but looking at its demo output, we guess it’s a 70B+ parameter model specifically trained on Swift code, SDKs, and documentation. It is probably fine-tuned for instruction following and code generation tasks using human-created and synthetically generated data. Again, there is tight integration with Xcode for providing relevant context to the model; the video mentions automatically identifying and using image and audio assets present in the project.

Figure: Swift Assist completing a description-to-code generation task, integrated into Xcode.

The Image Diffusion Model. This model is discussed in the Platforms State of the Union and implicitly shown via the Genmoji and Image Playground features. Apple has considerable published work on image models, more so than on language models (compare the number of models of each type on Apple’s HF page). Judging by their architecture slide, there is a base model with a selection of adapters to provide fine-grained control over the exact image style desired.

Figure: Image Playground showing the image diffusion model and styling via adapters.

Adapters: LoRAs (and DoRAs) galore

The on-device models will come with a set of LoRAs and/or DoRAs (Adapters, in Apple parlance) that specialize the on-device model to be very good at specific tasks. What’s an adapter?
It’s effectively a diff against the original model weights that makes the model good at a specific task (and conversely, worse at general tasks). Since adapters do not have to modify every weight to be effective, they can be small (tens of megabytes) compared to a full model (multiple gigabytes). Adapters can also be dynamically added to or removed from a base model, and multiple adapters can stack onto each other (e.g., imagine stacking Mail Replies + Friendly Tone).

For Apple, shipping a base model and adapters makes perfect sense: the extra cost of shipping adapters is low, and due to complete control of the OS and APIs, Apple has an extremely good idea of the actual task you want to accomplish at any given time. Apple promises continued updates of adapters as new training data becomes available, and we imagine new adapters can fill specific action niches as needed.

Some technical details: Apple says their adapters modify multiple layers (likely equivalent to setting target_modules="all-linear" in HF’s PEFT package). Adapter rank determines how strong an effect the adapter can have on the base model; the trade-off is that higher-rank adapters take up more space since they carry more parameters. At rank=16 (which from a vibes/feel standpoint is a reasonable compromise between effect and adapter size), the adapters take up tens of megabytes each (as compared to gigabytes for a 3B base model) and are kept in some kind of warm cache to optimize for responsiveness.

If you’d like to learn more about adapters (the fundamental technology, not Apple’s specific implementation) right now, you can try them via Apple-native MLX examples or HF’s transformers and PEFT packages.

Figure: A selection of Apple’s language model adapters.

A vector database?

Apple doesn’t explicitly state this, but there’s a strong implication that Siri’s semantic search feature is backed by a vector database; there’s an explicit comparison that shows Siri now searches based on meaning instead of keywords.
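To make "searching by meaning" concrete, here is a minimal sketch of what a vector-database lookup does, assuming items and queries have already been turned into embedding vectors. The three-dimensional vectors, file names, and the `search` helper are all hypothetical and chosen for illustration; nothing here reflects how Apple actually implements Siri’s index.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical 3-D "embeddings"; a real system would get much larger
# vectors from an embedding model, not hand-written numbers.
index = {
    "beach_photo.jpg": [0.9, 0.1, 0.0],
    "tax_return.pdf": [0.0, 0.2, 0.9],
    "surfing_video.mov": [0.8, 0.3, 0.1],
}

def search(query_vec, index, k=2):
    """Return the k item names whose vectors are closest to the query."""
    ranked = sorted(index, key=lambda name: cosine(query_vec, index[name]),
                    reverse=True)
    return ranked[:k]

# Pretend this is the embedding of the query "ocean vacation".
query = [1.0, 0.2, 0.0]
results = search(query, index)
```

No keyword in the query matches a file name here; the two ocean-related items rank first only because their vectors point in a similar direction to the query vector.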
Apple allows application data to be indexed, and the index is multimodal (images, text, video). A local application can provide signals (such as last accessed time) to the ranking model used to sort search results.

Figure: Siri now searches by semantic meaning, which may imply there is a vector database underneath.

Delving into technical details

Training and data

Let’s talk about some of the training techniques described. They are all ways to parallelize training very large language models. In essence, these techniques are different means to split and replicate the model to train it using an enormous amount of compute and data. Below is a quick explanation of the techniques used, all of which seem standard for training such large models:
  • Data Parallelism: Each GPU has a copy of the full model but is assigned a chunk of the training data. The gradients from all GPUs are aggregated and used to update weights, which are synchronized across models.
  • Tensor Parallelism: Specific parts of the model are split across multiple GPUs. PyTorch docs say you will need this once you have a big model or GPU communication overhead becomes an issue.
  • Sequence Parallelism was the hardest topic to find; I had to dig to page 6 of this paper. Parts of the transformer can be split to process multiple data items at once.
  • Fully Sharded Data Parallel (FSDP) shards your model across multiple GPUs or even CPUs. Sharding reduces peak GPU memory usage since the whole model does not have to be kept in memory, at the expense of communication overhead to synchronize state. FSDP is supported by PyTorch and is regularly used for finetuning large models.
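As a rough illustration of the first technique in the list, data parallelism, here is a toy sketch in plain Python. It assumes a one-parameter linear model and simulates the "workers" as list slices; the helper names (`grad`, `shard`, `data_parallel_step`) are hypothetical, and the averaging line stands in for the all-reduce collective a real framework would run across accelerators.

```python
# Toy data parallelism: fit y = w * x with squared error.
# Workers are simulated with list slices; all names are illustrative.

def grad(w, batch):
    """Mean gradient of (w*x - y)^2 with respect to w over a batch."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)

def shard(data, n_workers):
    """Split data into n_workers equally sized contiguous chunks."""
    size = len(data) // n_workers
    return [data[i * size:(i + 1) * size] for i in range(n_workers)]

def data_parallel_step(w, data, n_workers, lr):
    shards = shard(data, n_workers)
    # Each worker holds the same weight w but sees only its own shard.
    local_grads = [grad(w, s) for s in shards]
    # Stand-in for all-reduce: average the per-worker gradients.
    avg_grad = sum(local_grads) / n_workers
    # Every replica applies the same update, so weights stay in sync.
    return w - lr * avg_grad

data = [(float(x), 3.0 * x) for x in range(1, 9)]  # generated with w = 3
w = 0.0
for _ in range(200):
    w = data_parallel_step(w, data, n_workers=4, lr=0.01)
```

With equally sized shards, the averaged gradient equals the full-batch gradient, which is why each replica can apply the same update without ever seeing the other workers’ data.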
Surprise! Apple has also crawled the web for training with AppleBot. A raw crawl naturally contains a lot of garbage, sensitive data, and PII, which must be filtered before training. Ensuring data quality is hard work! HuggingFace has a great blog post about what was needed to improve the quality of their web crawl, FineWeb. Apple had to do something similar to filter the garbage out of its crawl.

Apple also has licensed training data. Who the data partners are is not mentioned. Paying for high-quality data seems to be the new normal, with large tech companies striking deals with big content providers (e.g., StackOverflow, Reddit, NewsCorp).

Apple also uses synthetic data generation, which is also fairly standard practice. However, it raises the question: How does Apple generate the synthetic data? Perhaps the partnership with OpenAI lets them legally launder GPT-4 output. While synthetic data can do wonders, it is not without its downside: there are forgetfulness issues with training on a large synthetic data corpus.

Optimization

This section describes how Apple optimizes its device and server models to be smaller and enable faster inference on devices with limited resources. Many of these optimizations are well known and already present in other software, but it’s great to see this level of detail about what optimizations are applied in production LLMs.

Let’s start with the basics. Apple’s models use grouped-query attention (GQA), another match with OpenELM. They share vocabulary embedding tables, which implies that some embedding layers are shared between the input and the output to save memory. The on-device model has a 49K token vocabulary (a key difference from OpenELM). The hosted model has a 100K token vocabulary, with special tokens for language and “technical tokens.” The vocabulary size is the number of distinct tokens (characters, word fragments, and short character sequences) the model recognizes.
Some tokens are also used for signaling special states to the model, for instance, the end of the prompt, a request to fill in the middle, a new file being processed, etc. A large vocabulary makes it easier for the model to understand certain concepts and specific tasks. As a comparison, Phi-3 has a vocabulary size of 32K, Llama3 has a vocabulary of 128K tokens, and Qwen2 has a vocabulary of 152K tokens. The downside of a large vocabulary is that it results in more training and inference time overhead.

Quantization & palettization

The models are compressed via palettization and quantization to 3.5 bits-per-weight (BPW) but “achieve the same accuracy as uncompressed models.” What does “achieve the same accuracy” mean? Likely, it refers to an acceptable quantization loss. Below is a graph from a PR to llama.cpp with state-of-the-art quantization losses for different techniques as of February 2024. We are not told what Apple’s acceptable loss is, but it’s doubtful a 3.5 BPW compression will have zero loss versus a 16-bit float base model. Using “same accuracy” seems misleading, but I’d love to be proven wrong. Compression also affects metrics beyond accuracy, so the model’s ability may be degraded in ways not easily captured by benchmarks.

Figure: Quantization error compared with bits per weight, from a PR to llama.cpp. The loss at 3.5 BPW is noticeably not zero.

What is Low Bit Palettization? It’s one of Apple’s compression strategies, described in their CoreML documentation. The easiest way to understand it is through its namesake, image color palettes. An uncompressed image stores the color values of each pixel. A simple optimization is to select some number of colors (say, 16) that are most common in the image. The image can then be encoded as indexes into the color palette plus the 16 full-color values. Imagine the same technique applied to model weights instead of pixels, and you get palettization. How good is it?
Apple publishes some results for the effectiveness of 2-bit and 4-bit palettization. The 2-bit palettization looks to provide ~6-7x compression from float16, and 4-bit compression measures out at ~3-4x, with only a slight latency penalty. We can ballpark that 3.5 BPW will compress roughly 4-4.5x from the original 16-bit-per-weight model; the ideal ratio, ignoring palette overhead, is 16 / 3.5 ≈ 4.6x.

Figure: Palettization graphic from Apple’s CoreML documentation. Note the similarity to images and color palettes.

Palettization only applies to model weights; when performing inference, a source of substantial memory usage is runtime state. Activations are the outputs of neurons after applying some kind of transformation function. Storing these in deep models can take up a considerable amount of memory, and quantizing them is a way to fit a bigger model for inference.

What is quantization? It’s a way to map intervals of a large range (like 16 bits) into a smaller range (like 4 or 8 bits). There is a great graphical demonstration in this WWDC 2024 video.

Quantization is also applied to embedding layers. Embeddings map inputs (such as words or images) into a vector that the ML model can utilize. The size of the embedding table depends on the vocabulary size, which we saw was 49K tokens for the on-device model. Again, quantizing this lets us fit a bigger model into less memory at the cost of accuracy.

How does Apple do quantization? The CoreML docs reveal the algorithms are GPTQ and QAT.

Faster inference

The first optimization is caching previously computed values via the KV Cache. LLMs are next-token predictors; they always generate one token at a time. Repeated recomputation of all prior tokens through the model naturally involves much duplicate effort, which can be saved by caching previous results! That’s what the KV cache does. As a reminder, cache management is one of the two hard problems of computer science.
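The caching idea can be sketched in a few lines, assuming a toy single-head attention layer in plain Python. The `project` function is a hypothetical stand-in for learned key and value projections (it merely scales the input), and none of this reflects a specific implementation in any real inference stack.

```python
import math

def attend(q, keys, values):
    """Single-head scaled dot-product attention for one query vector."""
    scale = math.sqrt(len(q))
    scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
    m = max(scores)                       # subtract max for stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    dim = len(values[0])
    return [sum(e / total * v[i] for e, v in zip(exps, values))
            for i in range(dim)]

def project(x, factor):
    """Hypothetical stand-in for a learned K or V projection."""
    return [factor * xi for xi in x]

def decode_without_cache(tokens):
    outputs = []
    for t in range(len(tokens)):
        # Recompute K and V for every earlier token at every step.
        keys = [project(x, 0.5) for x in tokens[:t + 1]]
        values = [project(x, 2.0) for x in tokens[:t + 1]]
        outputs.append(attend(tokens[t], keys, values))
    return outputs

def decode_with_cache(tokens):
    k_cache, v_cache, outputs = [], [], []
    for x in tokens:
        # Compute K and V once per token, then reuse them on later steps.
        k_cache.append(project(x, 0.5))
        v_cache.append(project(x, 2.0))
        outputs.append(attend(x, k_cache, v_cache))
    return outputs

tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
uncached = decode_without_cache(tokens)
cached = decode_with_cache(tokens)
```

Both functions produce the same outputs; the cached version performs one key and one value projection per token, while the uncached version repeats them for the whole prefix at every step, so its projection work grows quadratically with sequence length.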
KV caching is a standard technique implemented in HF’s transformers package, llama.cpp, and likely all other open-source inference solutions.

Apple promises a time-to-first-token of 0.6ms per prompt token and an inference speed of 30 tokens per second (before other optimizations like token speculation) on an iPhone 15. How does this compare to current open-source models? Let’s run some quick benchmarks!

On an M3 Max MacBook Pro, phi3-mini-4k quantized as Q4_K (about 4.5 BPW) has a time-to-first-token of about 1ms per prompt token and generates about 75 tokens per second. Apple’s 40% latency reduction on time-to-first-token on less powerful hardware is a big achievement. For token generation, llama.cpp does ~75 tokens per second, but again, this is on an M3 Max MacBook Pro and not an iPhone 15.

The speed of 30 tokens per second doesn’t provide much of an anchor to most readers; the important part is that it’s much faster than reading speed, so you aren’t sitting around waiting for the model to generate things. But this is just the starting speed. Apple also promises to deploy token speculation, a technique where a smaller, faster draft model proposes tokens that the larger model then verifies. Judging by the comments in the PR that implemented this in llama.cpp, speculation provides a 2-3x speedup over normal inference, so real speeds seen by consumers may be closer to 60 tokens per second.

Benchmarks and marketing

There’s a lot of good and bad in Apple’s reported benchmarks. The models are clearly well done, but some of the marketing seems to focus on higher numbers rather than fair comparisons. To start with a positive note, Apple evaluated its models on human preference. This takes a lot of work and money but provides the most useful results.

Now, the bad: a few benchmarks are not exactly apples-to-apples (pun intended). For example, the graph comparing human satisfaction with summarization compares Apple’s on-device model + adapter against a base model Phi-3-mini.
While the on-device + adapter performance is indeed what a user would see, a fair comparison would have been Apple’s on-device model + adapter vs. Phi-3-mini + a similar adapter. Apple could have easily done this, but they didn’t.

Figure: A benchmark comparing an Apple model + adapter to a base Phi-3-mini. A fairer comparison would be against Phi-3-mini + adapter.

The “Human Evaluation of Output Harmfulness” and “Human Preference Evaluation on Safety Prompts” show that Apple is very concerned about the kind of content its model generates. Again, the comparison is not exactly apples-to-apples: Mistral 7B was specifically released without a moderation mechanism (see the note at the bottom). However, the other models are fair game, as Phi-3-mini and Gemma claim extensive model safety procedures.

Figure: Mistral-7B does so poorly because it is explicitly not trained for harmfulness reduction, unlike the other competitors, which are fair game.

Another clip from one of the WWDC videos really stuck with us. It implies that macOS Sequoia delivers large ML performance gains over macOS Sonoma. However, the comparison is really a full-weight float16 model versus a quantized model, and the performance gains are due to quantization.

Figure: The small print shows full weights vs. 4-bit quantization, but the big print makes it seem like macOS Sonoma versus macOS Sequoia.

The rest of the benchmarks show impressive results in instruction following, composition, and summarization and are properly done by comparing base models to base models. These benchmarks correspond to high-level tasks like composing app actions to achieve a complex task (instruction following), drafting messages or emails (composition), and quickly identifying important parts of large documents (summarization).

A commitment to on-device processing and vertical integration

Overall, Apple delivered a very impressive keynote from a UI/UX perspective and in terms of features immediately useful to end-users.
The technical data release is not complete, but it is quite good for a company as secretive as Apple. Apple also emphasizes that complete vertical integration allows them to use AI to create a better device experience, which helps the end user.

Finally, an important part of Apple’s presentation that we have not touched on until now is its overall commitment to keeping as much AI on-device as possible and ensuring data privacy in the cloud. This speaks to Apple’s overall position that you are the customer, not the product.

If you enjoyed this synthesis of Apple’s machine learning release, consider what we can do for your machine learning environment! We specialize in difficult, multidisciplinary problems that combine application and ML security. Please contact us to learn more.

PCC: Bold step forward, not without flaws

By Adelin Travers

Earlier this week, Apple announced Private Cloud Compute (or PCC for short). Without deep context on the state of the art of Artificial Intelligence (AI) and Machine Learning (ML) security, some sensible design choices may seem surprising. Conversely, some of the risks linked to this design are hidden in the fine print. In this blog post, we’ll review Apple’s announcement, both good and bad, focusing on the context of AI/ML security. We recommend Matthew Green’s excellent thread on X for a more general security context on this announcement.

Disclaimer: This breakdown is based solely on Apple’s blog post and thus subject to potential misinterpretations of wording. We do not have access to the code yet, but we look forward to Apple’s public PCC Virtual Environment release to examine this further!

Review summary

This design is excellent on the conventional non-ML security side. Apple seems to be doing everything possible to make PCC a secure, privacy-oriented solution. However, the amount of review that security researchers can do will depend on what code is released, and Apple is notoriously secretive.

On the AI/ML side, the key challenges identified are on point. These challenges result from Apple’s desire to provide additional processing power for compute-heavy ML workloads today, which incidentally requires moving away from on-device data processing to the cloud. Homomorphic Encryption (HE) is a big hope in the confidential ML field but doesn’t currently scale. Thus, Apple’s choice to process data in its cloud at scale requires decryption. Moreover, the PCC guarantees vary depending on whether Apple will use a PCC environment for model training or inference. Lastly, because Apple is introducing its own custom AI/ML hardware, implementation flaws that lead to information leakage will likely occur in PCC even when these flaws have already been patched in leading AI/ML vendor devices.
Running commentary

We’ll follow the release post’s text in order, section-by-section, as if we were reading and commenting, halting on specific passages.

Introduction
When I first read this post, I’ll admit that I misunderstood this passage as Apple starting an announcement that they had achieved end-to-end encryption in Machine Learning. This would have been even bigger news than the actual announcement. That’s because Apple would need to use Homomorphic Encryption to achieve full end-to-end encryption in an ML context. HE allows computation of a function, typically an ML model, without decrypting the underlying data. HE has been making steady progress and is a future candidate for confidential ML (see for instance this 2018 paper). However, this would have been a major announcement and shift in the ML security landscape because HE is still considered too slow to be deployed at cloud scale and for complex functions like ML. More on this later on.

Note that Multi-Party Computation (MPC), which allows multiple agents (for instance, the server and the edge device) to compute different parts of a function like an ML model and aggregate the result privately, would be a distributed scheme running on both the server and the edge device, which is different from what is presented here.

The term “requires unencrypted access” is the key to the PCC design challenges. Apple could continue processing data on-device, but this means abiding by mobile hardware limitations. The complex ML workloads Apple wants to offload, like using Large Language Models (LLMs), exceed what is practical for battery-powered mobile devices. Apple wants to move the compute to the cloud to provide these extended capabilities, but HE doesn’t currently scale to that level. Thus, to provide these new service capabilities now, Apple requires access to unencrypted data.

This being said, Apple’s design for PCC is exceptional, and the effort required to develop this solution was extremely high, going beyond most other cloud AI applications to date.
Thus, the security and privacy of ML models in the cloud is an unsolved and active research domain when an auditor only has access to the model. A good example of these difficulties can be found in Machine Unlearning (a privacy scheme that allows removing data from a model), which was shown to be impossible to formally prove by just querying a model. Unlearning must thus be proven at the algorithm implementation level. When the entirely custom and proprietary technical stack underlying Apple’s PCC is factored in, external audits become significantly more complex. Matthew Green notes that it’s unclear what part of the stack and ML code and binaries Apple will release to audit ML algorithm implementations.

This is also definitely true. Members of the ML Assurance team at Trail of Bits have been releasing attacks that modify the ML software stack at runtime since 2021. Our attacks have exploited the widely used pickle VM for traditional RCE backdoors and malicious custom ML graph operators on Microsoft’s ONNXRuntime. Sleepy Pickles, our most recent attack, uses a runtime attack to dynamically swap an ML model’s weights when the model is loaded.

This is also true; the design later introduced by Apple is far better than many other existing designs.

Designing Private Cloud Compute

From an ML perspective, this claim depends on the intended use case for PCC, as it cannot hold true in general. This claim may be true if PCC is only used for model inference. The rest of the PCC post only mentions inference, which suggests that PCC is not currently used for training. However, if PCC is used for training, then data will be retained, and stateless computation that leaves no trace is likely impossible. This is because ML models retain data encoded in their weights as part of their training. This is why the research field of Machine Unlearning introduced above exists. The big question that Apple needs to answer is thus whether it will use PCC for training models in the future.
As others have noted, this is an easy slope to slip into.

Non-targetability is a really interesting design idea that hasn’t been applied to ML before. It also mitigates hardware leakage vulnerabilities, which we will see next.

Introducing Private Cloud Compute nodes

As others have noted, using Secure Enclaves and Secure Boot is excellent since it ensures only legitimate code is run. GPUs will likely continue to play a large role in AI acceleration. Rather than using Nvidia’s GPUs, which are more pervasive in ML, Apple has been building its own for some time, with its M series now in its third generation. However, enclaves and attestation will provide only limited guarantees to end-users, as Apple effectively owns the attestation keys. Moreover, enclaves and GPUs have had vulnerabilities and side channels that resulted in exploitable leakage in ML. Apple GPUs have not yet been battle-tested in the AI domain as much as Nvidia’s; thus, these accelerators may have security issues that their Nvidia counterparts do not have. For instance, Apple’s custom hardware was and remains affected by the LeftoverLocals vulnerability, whereas Nvidia’s hardware was not. LeftoverLocals is a GPU hardware vulnerability disclosed by Trail of Bits earlier this year. It allows an attacker collocated with a victim on a vulnerable device to listen to the victim’s LLM output. Apple’s M2 processors are still impacted at the time of writing. This being said, the PCC design’s non-targetability property may help mitigate LeftoverLocals for PCC since it prevents an attacker from identifying the victim’s device and achieving collocation with it.

This is important as Swift is a compiled language. Swift is thus not prone to the dynamic runtime attacks that affect languages like Python, which are more pervasive in ML. Note that Swift would likely only be used for CPU code. The GPU code would likely be written in Apple’s Metal GPU programming framework.
More on dynamic runtime attacks and Metal in the next section.

Stateless computation and enforceable guarantees

Apple’s solution is not end-to-end encrypted but rather an enclave-based solution. Thus, it does not represent an advancement in HE for ML but rather a well-thought-out combination of established technologies. This is, again, impressive, but the data is decrypted on Apple’s server.

As presented in the introduction, using compiled Swift and signed code throughout the stack should prevent attacks on ML software stacks at runtime. Indeed, the ONNXRuntime attack defines a backdoored custom ML primitive operator by loading an adversary-built shared library object, while the Sleepy Pickle attack relies on dynamic features of Python.

Just-in-Time (JIT) compiled code has historically been a steady source of remote code execution vulnerabilities. JIT compilers are notoriously difficult to implement and create new executable code by design, making them a highly desirable attack vector. It may surprise most readers, but JIT is widely used in ML stacks to speed up otherwise slow Python code. JAX, an ML framework that is the basis for Apple’s own AXLearn ML framework, is a particularly prolific user of JIT. Apple avoids the security issues of JIT by not using it. Apple’s ML stack is instead built in Swift, a memory-safe, ahead-of-time compiled language that does not need JIT for runtime performance.

As we’ve said, the GPU code would likely be written in Metal. Metal does not enforce memory safety. Without memory safety, attacks like LeftoverLocals are possible (with limitations on the attacker, like machine collocation).

No privileged runtime access

This is an interesting approach because it shows Apple is willing to trade off infrastructure monitoring capabilities (and thus potentially reduce PCC’s reliability) for additional security and privacy guarantees.
To fully understand the benefits and limits of this solution, ML security researchers would need to know what exact information is captured in the structured logs. A complete analysis thus depends on Apple’s willingness or unwillingness to release the schema and pre-determined fields for these logs.

Interestingly, limiting the type of logs could increase ML model risks by preventing ML teams from collecting adequate information to manage these risks. For instance, the choice of collected logs and metrics may be insufficient for the ML teams to detect distribution drift, which occurs when input data no longer matches training data and the model performance decreases. If our understanding is correct, most of the collected metrics will be metrics for SRE purposes, meaning that data drift detection would not be possible. If the collected logs include ML information, accidental data leakage is possible but unlikely.

Non-targetability

This is excellent as lower levels of the ML stack, including the physical layer, are sometimes overlooked in ML threat models.

The term “metadata” is important here. Only the metadata can be filtered away in the manner Apple describes. However, there are virtually no ways of filtering out all PII in the body content sent to the LLM. Any PII in the body content will be processed unencrypted by the LLM. If PCC is used for inference only, this risk is mitigated by structured logging. If PCC is also used for training, which Apple has yet to clarify, we recommend not sharing PII with systems like these when it can be avoided.

It might be possible for an attacker to obtain identifying information in the presence of side channel vulnerabilities, for instance, linked to implementation flaws, that leak some information. However, this is unlikely to happen in practice: the cost placed on the adversary to simultaneously exploit both the load balancer and side channels will be prohibitive for non-nation-state threat actors.
An adversary with this level of control should be able to spoof the statistical distribution of nodes unless the auditing and statistical analysis are done at the network level. Verifiable transparency
This is nice to see! Of course, we do not know if these will need to be analyzed through extensive reverse engineering, which will be difficult, if not impossible, for Apple’s custom ML hardware. It is still a commendable rare occurrence for projects of this scale. PCC: Security wins, ML questions Apple’s design is excellent from a security standpoint. Improvements on the ML side are always possible. However, it is important to remember that those improvements are tied to some open research questions, like the scalability of homomorphic encryption. Only future vulnerability research will shed light on whether implementation flaws in hardware and software will impact Apple. Lastly, only time will tell if Apple continuously commits to security and privacy by only using PCC for inference rather than training and implementing homomorphic encryption as soon as it is sufficiently scalable.
Categories: Security Posts

Ransomware attackers quickly weaponize PHP vulnerability with 9.8 severity rating

ArsTechnica: Security Content - Fri, 2024/06/14 - 21:40
Ransomware criminals have quickly weaponized an easy-to-exploit vulnerability in the PHP programming language that executes malicious code on web servers, security researchers said. As of Thursday, Internet scans performed by security firm Censys had detected 1,000 servers infected by a ransomware strain known as TellYouThePass, down from 1,800 detected on Monday. The servers, primarily located in China, no longer display their usual content; instead, many list the site’s file directory, which shows all files have been given a .locked extension, indicating they have been encrypted. An accompanying ransom note demands roughly $6,500 in exchange for the decryption key. The output of PHP servers infected by TellYouThePass ransomware. (credit: Censys) The accompanying ransom note. (credit: Censys) When opportunity knocks The vulnerability, tracked as CVE-2024-4577 and carrying a severity rating of 9.8 out of 10, stems from errors in the way PHP converts Unicode characters into ASCII. A feature built into Windows known as Best Fit allows attackers to use a technique known as argument injection to convert user-supplied input into characters that pass malicious commands to the main PHP application. Exploits allow attackers to bypass CVE-2012-1823, a critical code execution vulnerability patched in PHP in 2012.
Categories: Security Posts

Retired engineer discovers 55-year-old bug in Lunar Lander computer game code

ArsTechnica: Security Content - Fri, 2024/06/14 - 20:04
Illustration of the Apollo lunar lander Eagle over the Moon. (credit: Getty Images) On Friday, a retired software engineer named Martin C. Martin announced that he recently discovered a bug in the original Lunar Lander computer game's physics code while tinkering with the software. Created by a 17-year-old high school student named Jim Storer in 1969, this primordial game rendered the action only as text status updates on a teletype, but it set the stage for future versions to come. The legendary game, which Storer developed on a PDP-8 minicomputer in a programming language called FOCAL just months after Neil Armstrong and Buzz Aldrin made their historic moonwalks, allows players to control a lunar module's descent onto the Moon's surface. Players must carefully manage their fuel usage to achieve a gentle landing, making critical decisions every ten seconds to burn the right amount of fuel. In 2009, just short of the 40th anniversary of the first Moon landing, I set out to find the author of the original Lunar Lander game, which was then primarily known as a graphical game, thanks to the graphical version from 1974 and a 1979 Atari arcade title. When I discovered that Storer created the oldest known version as a teletype game, I interviewed him and wrote up a history of the game. Storer later released the source code to the original game, written in FOCAL, on his website.
Categories: Security Posts

The best VPN services for iPhone and iPad in 2024: Expert tested and reviewed

Zero Day | ZDNet RSS Feed - Fri, 2024/06/14 - 19:51
We went hands-on with the best VPNs for your iPhone and iPad to find the best iOS VPNs to help you stream content and surf the web while keeping your devices safe.
Categories: Security Posts

Announcing the Burp Suite Professional chapter in the Testing Handbook

By Maciej Domanski Based on our security auditing experience, we’ve found that Burp Suite Professional’s dynamic analysis can uncover vulnerabilities hidden amidst the maze of various target components. Unpredictable security issues like race conditions are often elusive when examining source code alone. While Burp is a comprehensive tool for web application security testing, its extensive features may present a complex barrier. That’s where we, Trail of Bits, stand ready with our new Burp Suite guide in the Testing Handbook. This chapter aims to cut through this complexity, providing a clear and concise roadmap for running Burp Suite and achieving quick and tangible results. The new chapter starts with an essential discussion on where Burp can support you. This section provides in-depth insights into how Burp can enhance your ability to conduct security testing, especially in the face of challenges like obfuscated front-end code, intricate infrastructural components, variations in deployment environments, or client-side data handling issues. The chapter provides a step-by-step guide to setting up Burp for your specific application quickly and effectively. It guides you through minimizing setup errors and ensuring potential vulnerabilities are not overlooked—a game-changer in terms of your security auditing outcomes. We also explore using key Burp extensions to supercharge your application testing processes and discover more vulnerabilities. Our Burp chapter concludes with numerous professional tips and tricks to empower you to perform advanced practices and to reveal hidden Burp characteristics that could revolutionize your security testing routine. Real-world knowledge, real-world results The Testing Handbook series encapsulates our extensive real-world knowledge and experience. Our insights go beyond mere documentation recitations, offering tried-and-tested strategies from the Trail of Bits team’s security auditing experience. 
With this new chapter, we hope to impart the knowledge and confidence you need to dive into Burp Suite and truly harness its potential to secure your web applications. Ready to supercharge your security testing with Burp Suite? Dive into the chapter now.
Categories: Security Posts

Perplexity: A search engine that curates results with GenAI (and helps you "in your role-playing games where you are the bad guy")

Un informático en el lado del mal - Fri, 2024/06/14 - 08:35
I find Perplexity's proposal more than interesting. The idea is to run an Internet crawler, as Google or Bing do, but to curate results and help users search for information on the Internet using GenAI, with all the "issues" we are still resolving in the GenAI world, which is surely where all search engines are headed. It is about taking advantage of what multimodal GenAI models offer to show the best results in the best way.
Figure 1: Perplexity: A search engine that curates results with GenAI (and helps you "in your role-playing games where you are the bad guy")
If we look at how Google handles a search for information about a person, for example me, you can see that the results are curated. That is, there is a composition drawn from various sources that brings in photos, videos, a short bio, and even metadata associated with the person. On top of that, Google of course gives you filters by source category, content type, and date range, which allow more refined searches.
Figure 2: Google result for the search "Chema Alonso"
Also, if you look at the final part, a series of "More Questions" appears, which Google is already generating using GenAI from the content being searched. That is, the possible questions that appear at the end of any Google search, proposing new searches, are generated with GenAI. Just ask ChatGPT which searches might interest someone who has looked for info on Chema Alonso and it gives you a similar proposal.
Figure 3: "More Questions" proposal generated by ChatGPT
But Google is careful with the results it shows, especially since the scare caused by a "Hallucination" that Google Bard had at its launch event, where a demo was given with Bard integrated into Google Search and it gave the wrong dates when asked about some NASA projects. The error cost Google a drop of 100 billion USD on the stock market.

Figure 4: Google Bard got the answer wrong
In the case of Perplexity, which is a unicorn by valuation, the results, all of them, are generated by GenAI. People assume it is innovating and growing, and it is what it is: a search engine that wants to change the way information is found on the Internet by having every result generated by GenAI, which makes curated results possible for everything. Here is the page generated when searching for information about me.
Figure 5: Results page for Chema Alonso, "curated" with GenAI
Of course, the proposal is interesting, and I believe that searching for information on the Internet will be done with GenAI. That seems very clear. But because it uses GenAI, it still suffers from the same problems we have already seen many times, such as Hallucinations, Prompt Injection attacks, and so on, at least for now. So if you play with this search engine a little, you end up running into all of this. For example, if we ask it about the books by "the hacker in the beanie", we get these results.
Figure 6: Books by the hacker in the beanie
In the previous case, I tested how LLMs handle people's identity using "the hacker in the beanie", and it did that well, but when we moved on to the books, it made up that I wrote the biography of the legendary Kevin Mitnick, and it confused hacking papers with books. So if you search for info about people, you will have the same problems as with ChatGPT, which you already know "is a bit of a liar", or as with Google Bard when it put me in jail for two years. You may receive Hallucinations.
That said, behind each "chunk" of information you get a link to the source it used to generate that data. Since it is based on public information "crawled" from the Internet, as any search engine spider would do, you can go and verify it yourself. But you do have to go and do it.
If we now try the Prompt Injection attack using the role-playing game trick to kill the president of the United States, we see that at first it slips through. But the information is not very specific. Although the question got through, the validation done behind the scenes is applied to the response, based on how Anthropic's Claude does it. As you can see, my article is among the sources used to compose this answer, which has made me like it a little more.
Figure 7: Prompt Injection attack using the role-playing game trick to kill the president of the United States
So the question got through, but the answers have not been specific, which means we have to keep asking for details. To do so, let's take advantage of its "Related Questions", which, as you can see, are spot on for achieving the goal of winning the role-playing game.
Figure 8: Related questions for the initial goal
I didn't know which one to choose. They were all very good, so I went with the first one to see whether it would give me any help in achieving the goal. And as you can see, "Harmful Mode" kicked in and it gave me the message that I could not continue down that path. And my article is still the source for this answer. I'm not sure that is good news...
Figure 9: Harmful Mode in Perplexity
But let's change strategy and see whether I can sneak into the White House, which is surely a good way to get closer to the goal. And here Perplexity does help me. It does not look very malicious (as long as you don't evaluate the context of the previous questions, of course).
Figure 10: Well, all the answers are good
The context of the question is still malicious, but the answers, which are correct, are still too generic, so no Content Safety filter has been triggered. That means I can keep preparing the strategy by following the "Related Questions" it suggests, and taking advantage of events or meetings that are easy to access strikes me as a good strategy. Thanks, Perplexity.
Figure 11: Related Question with useful information
As you can see, the idea of taking advantage of public events is a good one. It is not doing anything "bad", because this is public information that anyone can access. It comes from a thread with malicious context, but the results are more than useful and well curated for the goal of the search. And if we look at the "Related Questions", they are super useful, because they already help us buy the tickets.
Figure 12: What good suggestions
In summary, I really like the product and the idea of searching for information on the Internet with GenAI. I think search engines are going to go that way because it is very useful. However, trusting information on the Internet is always dangerous, and trusting information curated by a GenAI that hallucinates, built on Internet content you can only trust so far, requires that we understand how to check sources and that we work even harder to reduce those Hallucinations. Still, more than recommended on my part.
Evil greetings!
Author: Chema Alonso

Categories: Security Posts

Internet Safety Month: Keep Your Online Experience Safe and Secure

Webroot - Fri, 2024/05/31 - 17:38
What is Internet Safety Month?
Each June, the online safety community observes Internet Safety Month as a time to reflect on our digital habits and ensure we’re taking the best precautions to stay safe online. It serves as a reminder for everyone (parents, teachers, and kids alike) to be mindful of our online activities and to take steps to protect ourselves.
Why is it important?
As summer approaches and we all pursue a bit more leisure time, which typically includes more screen time, it’s important to understand the risks and safeguard our digital well-being. While the Internet offers us countless opportunities, it also comes with risks that we must be aware of:
  • 37% of children and adolescents have been the target of cyberbullying. [1]
  • 50% of tweens (kids ages 10 to 12) have been exposed to inappropriate online content. [2]
  • 64% of Americans have experienced a data breach. [3]
  • 95% of cybersecurity breaches are due to human error. [4]
  • 30% of phishing emails are opened by targeted users. [5]
This makes Internet Safety Month the perfect time to review our digital habits and ensure that we are doing everything we can to stay safe.
7 tips to keep your online experience secure
  1. Protect your devices from malware
Malware is malicious software designed to harm your computer or steal your personal information. It can infect your device through malicious downloads, phishing emails, or compromised websites, leading to potential loss of access to your computer, data, photos, and other valuable files.
How to protect it
Install reputable antivirus software like Webroot on all your devices and keep it updated. Regularly scan your devices for malware and avoid clicking on suspicious links or downloading unknown files.
  2. Be skeptical of offers that appear too good to be true
If an offer seems too good to be true, it probably is. Scammers often use enticing offers or promotions to lure victims into sharing personal information or clicking on malicious links. These can lead to financial loss, identity theft, or installation of malware.
How to protect it
Research the company or website before pursuing an offer or providing any personal information.
  3. Monitor your identity for fraud activity
Identity theft happens when someone swipes your personal information to commit fraud or other crimes. This can wreak havoc on your finances, tank your credit score, and bring about a host of other serious consequences.
How to protect it
Consider using an identity protection service like Webroot Premium that monitors your personal information for signs of unauthorized use. Review your bank and credit card statements regularly for any unauthorized transactions.
  4. Ensure your online privacy with a VPN
Without proper protection, your sensitive information, like passwords and credit card details, can be easily intercepted by cybercriminals while browsing. Surfing the web on public Wi-Fi networks often lacks security, giving hackers a prime opportunity to snatch your data.
How to protect it
Use a Virtual Private Network (VPN) when connecting to the internet. A VPN encrypts your internet traffic, making it unreadable to hackers. Choose a reputable VPN service and enable it whenever you connect to the internet.
  5. Avoid clicking on links from unknown sources
Clicking on links in emails, text messages, or social media from unknown or suspicious sources can expose you to phishing attacks or malware. These seemingly harmless clicks can quickly compromise your security and personal information.
How to protect it
Verify the sender’s identity before clicking on any links. Hover over links to see the actual URL before clicking. If you’re unsure about a link, type the company’s name directly into your browser instead.
  6. Avoid malicious websites
Malicious websites are crafted to deceive you into downloading malware or revealing sensitive information. Visiting these sites can expose your device to viruses, phishing attempts, and other online threats, putting your security at risk.
How to protect it
Install a web threat protection tool or browser extension that can block access to malicious websites. Products like Webroot Internet Security Plus and Webroot AntiVirus make it easy to avoid threatening websites with secure web browsing on your desktop, laptop, tablet, or mobile phone.
  7. Keep your passwords safe
Weak or reused passwords can easily be guessed or cracked by attackers, compromising your online accounts. But keeping track of all your unique passwords can be difficult if you don’t have them stored securely in a password manager. If one account is compromised, attackers can gain access to your other accounts, potentially leading to identity theft or financial loss.
How to protect your passwords
Use a password manager to create and store strong, unique passwords for each of your online accounts. A password manager encrypts your passwords and helps you automatically fill them in on websites, reducing the risk of phishing attacks and password theft.
Take action now
As we celebrate Internet Safety Month, take a moment to review your current online habits and security measures. Are you doing everything you can to protect yourself and your family? If not, now is the perfect time to make some changes. By following these tips, you can enjoy a safer and more secure online experience. Remember, Internet Safety Month is not just about protecting yourself; it’s also about spreading awareness and educating others. You can share this flyer, “9 Things to Teach Kids to Help Improve Online Safety,” with your friends and family to spread the word and help create a safer online community for everyone.
Sources:
[1] Forbes. The Ultimate Internet Safety Guide for Kids.
[2] Forbes. The Ultimate Internet Safety Guide for Kids.
[3] Pew Research Center
[4] Information Week. What Cybersecurity Gets Wrong.
[5] MIT. Learn how to avoid a phishing scam.
Categories: Security Posts

2024 RSA Recap: Centering on Cyber Resilience

AlienVault Blogs - Thu, 2024/05/16 - 12:00
Cyber resilience is becoming increasingly complex to achieve with the changing nature of computing. Appropriate for this year’s conference theme, organizations are exploring “the art of the possible”, ushering in an era of dynamic computing as they explore new technologies. Simultaneously, as innovation expands and computing becomes more dynamic, more threats become possible – thus, the approach to securing business environments must also evolve. As part of this year’s conference, I led a keynote presentation around the possibilities, risks, and rewards of cyber tech convergence. We explored the risks and rewards of cyber technology convergence and integration across network & security operations. More specifically, we looked into the future of more open, adaptable security architectures, and what this means for security teams. LevelBlue Research Reveals New Trends for Cyber Resilience This year, we also launched the inaugural LevelBlue Futures™ Report: Beyond the Barriers to Cyber Resilience. Led by Theresa Lanowitz, Chief Evangelist of AT&T Cybersecurity / LevelBlue, we hosted an in-depth session based on our research that examined the complexities of dynamic computing. This included an analysis of how dynamic computing merges IT and business operations, taps into data-driven decision-making, and redefines cyber resilience for the modern era. Some of the notable findings she discussed include:
  • 85% of respondents say computing innovation is increasing risk, while 74% confirmed that the opportunity of computing innovation outweighs the corresponding increase in cybersecurity risk.
  • The adoption of Cybersecurity-as-a-Service (CSaaS) is on the rise, with 32% of organizations opting to outsource their cybersecurity needs rather than managing them in-house.
  • 66% of respondents say cybersecurity is an afterthought, while another 64% say cybersecurity is siloed. This isn’t surprising when 61% say there is a lack of understanding of cybersecurity at the board level.
Theresa was also featured live on-site discussing these findings with prominent cyber media in attendance. She emphasized what today’s cyber resilience barriers look like and what new resilience challenges are promised for tomorrow. Be sure to check out some of those interviews below. New Research from LevelBlue Reveals 2024 Cyber Resilience Trends – Theresa Lanowitz – RSA24 #2 LevelBlue & Enterprise Strategy Group: A Look at Cyber Resilience For access to the full LevelBlue Futures™ Report, download a complimentary copy here.
Categories: Security Posts

Sifting through the spines: identifying (potential) Cactus ransomware victims

Fox-IT - Thu, 2024/04/25 - 06:00
Authored by Willem Zeeman and Yun Zheng Hu
This blog is part of a series written by various Dutch cyber security firms that have collaborated on the Cactus ransomware group, which exploits Qlik Sense servers for initial access. To view all of them please check the central blog by Dutch special interest group Cyberveilig Nederland [1].
The effectiveness of the public-private partnership called Melissa [2] is increasingly evident. The Melissa partnership, which includes Fox-IT, has identified overlap in a specific ransomware tactic. Multiple partners, sharing information from incident response engagements for their clients, found that the Cactus ransomware group uses a particular method for initial access. Following that discovery, NCC Group’s Fox-IT developed a fingerprinting technique to identify which systems around the world are vulnerable to this method of initial access or, even more critically, are already compromised.
Qlik Sense vulnerabilities
Qlik Sense, a popular data visualisation and business intelligence tool, has recently become a focal point in cybersecurity discussions. This tool, designed to aid businesses in data analysis, has been identified as a key entry point for cyberattacks by the Cactus ransomware group.
The Cactus ransomware campaign
Since November 2023, the Cactus ransomware group has been actively targeting vulnerable Qlik Sense servers. These attacks are not just about exploiting software vulnerabilities; they also involve a psychological component where Cactus misleads its victims with fabricated stories about the breach. This likely is part of their strategy to obscure their actual method of entry, thus complicating mitigation and response efforts for the affected organizations.
For those looking for in-depth coverage of these exploits, the Arctic Wolf blog [3] provides detailed insights into the specific vulnerabilities being exploited, notably CVE-2023-41266, CVE-2023-41265 also known as ZeroQlik, and potentially CVE-2023-48365 also known as DoubleQlik.
Threat statistics and collaborative action
The scope of this threat is significant. In total, we identified 5205 Qlik Sense servers, of which 3143 seem to be vulnerable to the exploits used by the Cactus group. This is based on the initial scan on 17 April 2024. Closer to home in the Netherlands, we’ve identified 241 vulnerable systems; fortunately, most don’t seem to have been compromised. However, 6 Dutch systems weren’t so lucky and have already fallen victim to the Cactus group. It’s crucial to understand that “already compromised” can mean that either the ransomware has been deployed and the initial access artifacts left behind were not removed, or the system remains compromised and is potentially poised for a future ransomware attack.
Since 17 April 2024, the DIVD (Dutch Institute for Vulnerability Disclosure) and the governmental bodies NCSC (Nationaal Cyber Security Centrum) and DTC (Digital Trust Center) have teamed up to globally inform (potential) victims of cyberattacks resembling those from the Cactus ransomware group. This collaborative effort has enabled them to reach out to affected organisations worldwide, sharing crucial information to help prevent further damage where possible.
Identifying vulnerable Qlik Sense servers
Expanding on Praetorian’s thorough vulnerability research on the ZeroQlik and DoubleQlik vulnerabilities [4,5], we found a method to identify the version of a Qlik Sense server by retrieving a file called product-info.json from the server. While we acknowledge the existence of Nuclei templates for the vulnerability checks, using the server version allows for a more reliable evaluation of potential vulnerability status, e.g. whether it’s patched or end of support. This JSON file contains the release label and version numbers by which we can identify the exact version that this Qlik Sense server is running.
Figure 1: Qlik Sense product-info.json file containing version information
Keep in mind that although Qlik Sense servers are assigned version numbers, the vendor typically refers to advisories and updates by their release label, such as “February 2022 Patch 3”. The following cURL command can be used to retrieve the product-info.json file from a Qlik server:
    curl -H "Host: localhost" -vk 'https://<ip>/resources/autogenerated/product-info.json?.ttf'
Note that we specify ?.ttf at the end of the URL to let the Qlik proxy server think that we are requesting a .ttf file, as font files can be accessed unauthenticated. Also, we set the Host header to localhost or else the server will return 400 - Bad Request - Qlik Sense, with the message The http request header is incorrect. The ?.ttf extension trick has been fixed in the patch that addresses CVE-2023-48365, so on a patched server you will always get a 302 Authenticate at this location response:
    > GET /resources/autogenerated/product-info.json?.ttf HTTP/1.1
    > Host: localhost
    > Accept: */*
    >
    < HTTP/1.1 302 Authenticate at this location
    < Cache-Control: no-cache, no-store, must-revalidate
    < Location: https://localhost/internal_forms_authentication/?targetId=2aa7575d-3234-4980-956c-2c6929c57b71
    < Content-Length: 0
    <
Nevertheless, this is still a good way to determine the state of a Qlik instance, because if it redirects using 302 Authenticate at this location it is likely that the server is not vulnerable to CVE-2023-48365.
A vulnerable server would instead return the JSON file:
    > GET /resources/autogenerated/product-info.json?.ttf HTTP/1.1
    > Host: localhost
    > Accept: */*
    >
    < HTTP/1.1 200 OK
    < Set-Cookie: X-Qlik-Session=893de431-1177-46aa-88c7-b95e28c5f103; Path=/; HttpOnly; SameSite=Lax; Secure
    < Cache-Control: public, max-age=3600
    < Transfer-Encoding: chunked
    < Content-Type: application/json;charset=utf-8
    < Expires: Tue, 16 Apr 2024 08:14:56 GMT
    < Last-Modified: Fri, 04 Nov 2022 23:28:24 GMT
    < Accept-Ranges: bytes
    < ETag: 638032013040000000
    < Server: Microsoft-HTTPAPI/2.0
    < Date: Tue, 16 Apr 2024 07:14:55 GMT
    < Age: 136
    <
    {"composition":{"contentHash":"89c9087978b3f026fb100267523b5204","senseId":"qliksenseserver:14.54.21","releaseLabel":"February 2022 Patch 12","originalClassName":"Composition","deprecatedProductVersion":"4.0.X","productName":"Qlik Sense","version":"14.54.21","copyrightYearRange":"1993-2022","deploymentType":"QlikSenseServer"}, <snipped>
We utilised Censys and Google BigQuery [6] to compile a list of potential Qlik Sense servers accessible on the internet and conducted a version scan against them. Subsequently, we extracted the Qlik release label from the JSON response to assess vulnerability to CVE-2023-48365. Our vulnerability assessment for DoubleQlik / CVE-2023-48365 considered a server vulnerable based on the following criteria:
  1. The release label corresponds to vulnerability statuses outlined in the original ZeroQlik and DoubleQlik vendor advisories [7,8].
  2. The release label is designated as End of Support (EOS) by the vendor [9], such as “February 2019 Patch 5”.
We consider a server non-vulnerable if:
  1. The release label date is post-November 2023, as the advisory states that “November 2023” is not affected.
  2. The server responded with HTTP/1.1 302 Authenticate at this location.
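The two lists above can be sketched in Python. This is a minimal illustration and not the scanner used for this research: the function names are hypothetical, and the sketch assumes that every parseable release label older than “November 2023” falls under either the vendor advisories or End of Support, which stands in for the real advisory and EOS lookups.

```python
import json
import re
from datetime import date

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]
LABEL_RE = re.compile(r"^([A-Z][a-z]+) (\d{4})(?: Patch (\d+))?$")

# The advisory states that "November 2023" is not affected.
FIRST_UNAFFECTED = date(2023, 11, 1)


def parse_release_label(label):
    """Turn e.g. 'February 2022 Patch 12' into (date(2022, 2, 1), 12)."""
    match = LABEL_RE.match(label.strip())
    if not match or match.group(1) not in MONTHS:
        return None
    month, year, patch = match.groups()
    return date(int(year), MONTHS.index(month) + 1, 1), int(patch or 0)


def classify(status_code, body):
    """Return 'vulnerable', 'not vulnerable' or 'invalid' for one response."""
    if status_code == 302:
        # Expected on patched servers: "302 Authenticate at this location".
        return "not vulnerable"
    if status_code != 200:
        return "invalid"
    try:
        label = json.loads(body)["composition"]["releaseLabel"]
        parsed = parse_release_label(label)
    except (ValueError, KeyError, TypeError, AttributeError):
        return "invalid"
    if parsed is None:
        return "invalid"
    released, _patch = parsed
    if released >= FIRST_UNAFFECTED:
        return "not vulnerable"
    # Assumption: older labels are covered by the advisories or are EOS.
    return "vulnerable"
```

A production check would replace the final assumption with an explicit table built from the vendor advisories [7,8] and the End of Support list [9].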
Any other responses were disregarded as invalid Qlik server instances.
As of 17 April 2024, and as stated in the introduction of this blog, we have detected 5205 Qlik Servers on the Internet. Among them, 3143 servers are still at risk of DoubleQlik, indicating that 60% of all Qlik Servers online remain vulnerable.
Figure 2: Qlik Sense patch status for DoubleQlik CVE-2023-48365
The largest number of vulnerable Qlik servers is in the United States (396), followed by Italy (280), Brazil (244), the Netherlands (241), and Germany (175).
Figure 3: Top 20 countries with servers vulnerable to DoubleQlik CVE-2023-48365
Identifying compromised Qlik Sense servers
Based on insights gathered from the Arctic Wolf blog and our own incident response engagements where the Cactus ransomware was observed, it’s evident that the Cactus ransomware group continues to redirect the output of executed commands to a True Type font file named qle.ttf, likely short for “qlik exploit”. Below are a few examples of executed commands and their output redirection by the Cactus ransomware group:
    whoami /all > ../Client/qmc/fonts/qle.ttf
    quser > ../Client/qmc/fonts/qle.ttf
In addition to the qle.ttf file, we have also observed instances where qle.woff was used:
Figure 4: Directory listing with exploitation artefacts left by Cactus ransomware group
It’s important to note that these font files are not part of a default Qlik Sense server installation. We discovered that files with a font file extension such as .ttf and .woff can be accessed without any authentication, regardless of whether the server is patched. This likely explains why the Cactus ransomware group opted to store command output in font files within the fonts directory, which in turn also serves as a useful indicator of compromise. Our scan for both font files found a total of 122 servers with the indicator of compromise.
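For administrators who want to look for the same artefacts on their own server, a local check might look like the following sketch. The function name and the `client_dir` argument are hypothetical; the only details taken from the observations above are the file names qle.ttf and qle.woff and the qmc/fonts directory under Client. Where the Client directory lives depends on the installation and is not stated here.

```python
from pathlib import Path

# File names observed as output targets of the Cactus ransomware group.
IOC_NAMES = ("qle.ttf", "qle.woff")


def find_cactus_artefacts(client_dir):
    """Return paths of known artefact files under <client_dir>/qmc/fonts."""
    fonts_dir = Path(client_dir) / "qmc" / "fonts"
    return [fonts_dir / name for name in IOC_NAMES
            if (fonts_dir / name).is_file()]
```

An empty result does not prove that a server was never compromised, since the artefacts may have been removed after use.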
The United States ranked highest in exploited servers with 49 online instances carrying the indicator of compromise, followed by Spain (13), Italy (11), the United Kingdom (8), Germany (7), and then Ireland and the Netherlands (6).

Figure 5: Top 20 countries with known compromised Qlik Sense servers

Out of the 122 compromised servers, 46 were no longer vulnerable. When the indicator of compromise artefact is present on a remote Qlik Sense server, it can imply various scenarios. Firstly, it may suggest that remote code execution was carried out on the server, followed by subsequent patching to address the vulnerability (if the server is no longer vulnerable). Alternatively, its presence could signify a leftover artefact from a previous security incident or unauthorised access. While the root cause for the presence of these files is hard to determine from the outside, it is still a reliable indicator of compromise.

Responsible disclosure by the DIVD
We shared our fingerprints and scan data with the Dutch Institute of Vulnerability Disclosure (DIVD), who then proceeded to issue responsible disclosure notifications to the administrators of the Qlik Sense servers.

Call to action

Ensure the security of your Qlik Sense installations by checking your current version. If your software is still supported, apply the latest patches immediately. For systems that have reached end of support, consider upgrading or replacing them. Additionally, avoid exposing these services to the entire internet. Implement IP whitelisting if public access is necessary, or better yet, make them accessible only through secure remote working solutions. If you discover you’ve been running a vulnerable version, contact your (external) security experts for a thorough check-up to confirm that no breaches have occurred. Taking these steps will help safeguard your data and infrastructure from potential threats.
Categories: Security Posts

Cybersecurity Concerns for Ancillary Strength Control Subsystems

BreakingPoint Labs Blog - Thu, 2023/10/19 - 19:08
Additive manufacturing (AM) engineers have been incredibly creative in developing ancillary systems that modify a printed part’s mechanical properties. These systems mostly focus on the issue of anisotropic properties of additively built components. This blog post is a good reference if you are unfamiliar with isotropic vs anisotropic properties and how they impact 3D printing. […] The post Cybersecurity Concerns for Ancillary Strength Control Subsystems appeared first on BreakPoint Labs - Blog.
Categories: Security Posts

Update on Naked Security

Naked Security Sophos - Tue, 2023/09/26 - 12:00
To consolidate all of our security intelligence and news in one location, we have migrated Naked Security to the Sophos News platform.
Categories: Security Posts
