Security Posts

Infocon: green

An infection from Rig exploit kit
Categories: Security Posts

Vision X Best of Show Special Prize at Interop Tokyo 2019

BreakingPoint Labs Blog - 49 mins 55 secs ago
Ixia's Vision X - 2019 Tokyo Interop Best of Show Special Prize Winner  There are a number of…
Categories: Security Posts

How The New TLS1.3 Standard Will Affect Your Encryption Tactics

BreakingPoint Labs Blog - 49 mins 55 secs ago
The IETF released a new version of their encryption standard called RFC 8446 (Transport Layer…
Categories: Security Posts

Why SPAN when you can Tap?

BreakingPoint Labs Blog - 49 mins 55 secs ago
In networking, as is the case with life, there are usually multiple ways of trying to get to the…
Categories: Security Posts

Introducing Ixia’s Newest Packet Broker: Vision X

BreakingPoint Labs Blog - 49 mins 55 secs ago
As the FIFA Women’s World Cup matches start up, you can bet I will be live-streaming games — at…
Categories: Security Posts

Investigating Windows Graphics Vulnerabilities: A Reverse Engineering and Fuzzing Story

BreakingPoint Labs Blog - 49 mins 55 secs ago
Introduction It is not surprising that vulnerabilities targeting Windows applications and…
Categories: Security Posts

Join us at Cisco Live ‘19

BreakingPoint Labs Blog - 49 mins 55 secs ago
The one constant in networking is change, and that usually means a little added complexity, at…
Categories: Security Posts

Game of Vulnerabilities: Bluekeep

BreakingPoint Labs Blog - 49 mins 55 secs ago
If you have been following what’s happening in the field of computer security, or perhaps even if…
Categories: Security Posts

Dynamic Analysis of a Windows Malicious Self-Propagating Binary

BreakingPoint Labs Blog - 49 mins 55 secs ago
Dynamic analysis (execution of malware in a controlled, supervised environment) is one of the most…
Categories: Security Posts

GDPR is here to stay

BreakingPoint Labs Blog - 49 mins 55 secs ago
What is GDPR? General Data Protection Regulation, or GDPR, is a European regulatory package for…
Categories: Security Posts

How To Optimize Your Security Defenses

BreakingPoint Labs Blog - 49 mins 55 secs ago
As I mentioned in a blog a couple months ago, there is an absolute myriad of security architectures…
Categories: Security Posts

Your Google Calendar Isn't Safe, an Eye-Controlled TV, and More News

Wired: Security - 1 hour 26 mins ago
Catch up on the most important news from today in two minutes or less.
Categories: Security Posts

Vulnerability Spotlight: Two bugs in KCodes NetUSB affect some NETGEAR routers

Cisco Talos - Mon, 2019/06/17 - 19:17


Dave McDaniel of Cisco Talos discovered these vulnerabilities.
Executive summary

KCodes’ NetUSB kernel module contains two vulnerabilities that could allow an attacker to inappropriately access information on some NETGEAR wireless routers. Specific models of these routers utilize the kernel module from KCodes, a Taiwanese company. The module is custom-made for each device, but they all contain similar functions.

The module shares USB devices over TCP, allowing clients to use various vendor-made drivers and software to connect to these devices. An attacker could send specific packets on the local network to exploit vulnerabilities in NetUSB, forcing the routers to disclose sensitive information and even giving the attacker the ability to remotely execute code.

In accordance with our coordinated disclosure policy, Cisco Talos reached out to KCodes and NETGEAR regarding these vulnerabilities. After working with Talos, KCodes provided an updated module to NETGEAR, which is scheduled to release a firmware update. Talos decided to release the details of these vulnerabilities after our 90-day disclosure deadline passed.
Vulnerability details

KCodes NetUSB unauthenticated remote kernel arbitrary memory read vulnerability (TALOS-2018-0775/CVE-2019-5016)

An exploitable arbitrary memory read vulnerability exists in the KCodes NetUSB.ko kernel module, which enables the ReadySHARE Printer functionality of at least two NETGEAR Nighthawk routers and potentially several other vendors/products. A specially crafted index value can cause an invalid memory read, resulting in a denial of service or remote information disclosure. An unauthenticated attacker can send a crafted packet on the local network to trigger this vulnerability.

Read the complete vulnerability advisory here for additional information.

KCodes NetUSB unauthenticated remote kernel information disclosure vulnerability (TALOS-2018-0776/CVE-2019-5017)

An exploitable information disclosure vulnerability exists in the KCodes NetUSB.ko kernel module that enables the ReadySHARE Printer functionality of at least two NETGEAR Nighthawk routers and potentially several other vendors/products. An unauthenticated, remote attacker can craft and send a packet containing an opcode that will trigger the kernel module to return several addresses, one of which can be used to calculate the dynamic base address of the module for further exploitation.

Read the complete vulnerability advisory here for additional information.
Versions tested

Talos tested and confirmed that TALOS-2018-0776 and TALOS-2018-0775 affect the NETGEAR Nighthawk AC3200 (R8000), firmware version 1.0.4.28_10.1.54 — NetUSB.ko 1.0.2.66. The NETGEAR Nighthawk AC3000 (R7900), firmware version 1.0.3.8_10.0.37 (11/1/18) — NetUSB.ko 1.0.2.69 is also affected by TALOS-2018-0775.



Coverage

The following SNORT® rules will detect exploitation attempts. Note that additional rules may be released at a future date and current rules are subject to change pending additional vulnerability information. For the most current rule information, please refer to your Firepower Management Center or Snort.org.

Snort Rules: 49087

Categories: Security Posts

Practical security recommendations – for you and your business

AlienVault Blogs - Mon, 2019/06/17 - 15:00
Cybercrime is costing UK businesses billions every year. Small businesses in particular are under threat, as they often take a more relaxed approach and adopt a ‘not much to steal’ mindset. However, this lack of diligence has caused many companies to close permanently. Let’s ensure yours isn’t one of them. It’s time to make the issue a priority. Here are some practical security recommendations for you and your business.

Monitor and identify possible threats

First things first, you need to analyse how secure your systems are. Take a proactive rather than reactive approach to cybercrime. Do a thorough risk assessment, analysing all areas of your business and paying close attention to any weak spots. Instead of waiting for an attack to happen and then reacting, reduce the risk up front. This involves:
  • Being aware of all the latest cyber threats (from phishing to hacking, there are many out there, constantly evolving and taking on new forms)
  • Keeping your operating systems up to date
  • Backing up data
  • Protecting all software
  • Using an effective password policy
Remember: this isn’t a one-off concern, but an ongoing issue. So, keep cybercrime a priority and keep monitoring all potential threats.

Educate your employees

Whether you’re a team of two or one hundred, every employee needs to be educated on the steps you’re taking to mitigate cybercrime. Bear in mind, this includes anyone who works from home; ensure all laptops and tablets have the necessary endpoint security software. It also includes any third parties or contractors who have access to files on your system. Dedicate at least one employee to being responsible for the issue: keeping everyone informed and taking the required actions to improve your security posture.

Consider all lines of defence

A firewall is often the first line of defence in protecting you against attacks, both internal and external. Employees should consider installing one on their home computers, for example. However, this isn’t the only line of defence to consider. Ask yourself questions such as:
  • Is your password policy robust?
  • Do you have the necessary cybersecurity insurance in place?
  • Do you have a record of everyone with administrative privileges?
  • Is your customer data safe?
  • How would your business cope in a temporary downtime period?
  • Could you consider multi-factor authentication?
Find a partner

Finding a business partner is essential for many reasons, such as growth or the development of new products. It can also help you develop your security policy. A partner could have access to new software that could benefit your business. Alternatively, they could identify weak areas within your company you may not have considered. There are many cybersecurity businesses looking to collaborate, so use this to your advantage!

Consider mobile devices

It’s not just computers, laptops and tablets you need to worry about. Ensure that you include mobile phones in your cybersecurity policy. This could mean requiring that the company’s password policy applies to all devices using the network, for example.

Install anti-malware software

There’s an abundance of anti-malware software out there. Should anyone introduce malware onto your systems, this software can protect your company data and reduce the chance of a minor incident becoming a full-blown attack.

Consider new technologies

Technology is constantly evolving, and so are the ways to breach it. The world of cybercrime is an ever-adapting one! New business innovation and growth sometimes bring new risks, so ensure your security policy factors this in. Don’t stop monitoring potential attacks: make the issue a priority.
Categories: Security Posts

Tricky Scam Plants Phishing Links in Your Google Calendar

Wired: Security - Mon, 2019/06/17 - 13:00
Scammers are taking advantage of default calendar settings to try to trick users into clicking malicious links.
Categories: Security Posts

Leaves of Hash

Trail of Bits has released Indurative, a cryptographic library that enables authentication of a wide variety of data structures without requiring users to write much code. Indurative is useful for everything from data integrity to trustless distributed systems. For instance, developers can use Indurative to add Binary Transparency to a package manager — so users can verify the authenticity of downloaded binaries — in only eight lines of code. Under the hood, Indurative uses Haskell’s new DerivingVia language extension to automatically map types that instantiate FoldableWithIndex to sparse Merkle tree representations, then uses those representations to create and verify inclusion (or exclusion) proofs. If you understood what that means, kudos, you can download Indurative and get started. If not, then you’re in luck! The whole rest of this blog post is written for you.

“That looks like a tree, let’s call it a tree”

In 1979, Ralph Merkle filed a patent for a hash-based signature scheme. This patent introduced several novel ideas, perhaps most notably that of an “authentication tree,” or, as it’s now known, a Merkle tree. This data structure is now almost certainly Merkle’s most famous work, even if it was almost incidental to the patent in which it was published, as it vastly improves efficiency for an incredible variety of cryptographic problems.

Hash-based signatures require a “commitment scheme” in which one party sends a commitment to a future message such that i) there is exactly one message they can send that satisfies the commitment, ii) given a message, it is easy to check if it satisfies the commitment, and iii) the commitment doesn’t give away the message’s contents. Commitment schemes are used everywhere from Twitter to multi-party computation. Typically, a commitment is just a hash (or “digest”) of the message. Anyone can hash a message and see if it’s equal to the commitment. Finding a different message with the same hash is a big deal.

That didn’t quite work for Merkle’s scheme though: he wanted to commit to a whole set of different messages, then give an inclusion proof that a message was in the set without revealing the whole thing. To do that, he came up with this data structure:

An example of a binary hash tree. Hashes 0-0 and 0-1 are the hash values of data blocks L1 and L2, respectively, and hash 0 is the hash of the concatenation of hashes 0-0 and 0-1 (Image and caption from Wikimedia)

Think of a binary tree where each node has an associated hash. The leaves are each associated with the hash of a message in the set. Each branch is associated with the hash of its children’s hashes, concatenated. In this scheme, we can then just publish the top hash as a commitment. To prove some message is included in the set, we start at the leaf associated with its hash and walk up the tree. Every time we walk up to a branch, we keep track of the side we entered from and the hash associated with the node on the other side of that branch. We can then check proofs by redoing the concatenation and hashing at each step, and making sure the result is equal to our earlier commitment.

This is a lot easier to understand by example. In the image above, to prove L3’s inclusion, our proof consists of [(Left, Hash 1-1), (Right, Hash 0)] because we enter Hash 1 from the left, with Hash 1-1 on the other side, then Top Hash from the right, with Hash 0 on the other side. To check this proof, we evaluate hash(Hash 0 + hash(hash(L3) + Hash 1-1)). If this is equal to Top Hash, the proof checks!
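To make that walk-up concrete, here is a minimal Python sketch of the check just described (the function names and the choice of SHA-256 are my own illustration, not anything from Indurative): each proof step records the side we entered the branch from and the sibling hash on the other side.

```python
import hashlib

def h(data: bytes) -> bytes:
    """SHA-256, standing in for the post's unspecified hash function."""
    return hashlib.sha256(data).digest()

def verify_inclusion(leaf: bytes, proof, top_hash: bytes) -> bool:
    """Walk up the tree, re-doing the concatenation and hashing at each step."""
    current = h(leaf)
    for side, sibling in proof:
        if side == "left":                 # we came in from the left; sibling sits on the right
            current = h(current + sibling)
        else:                              # we came in from the right; sibling sits on the left
            current = h(sibling + current)
    return current == top_hash

# Rebuild the diagram's tree and check the proof for L3 described above.
L1, L2, L3, L4 = b"L1", b"L2", b"L3", b"L4"
hash_00, hash_01, hash_10, hash_11 = h(L1), h(L2), h(L3), h(L4)
hash_0, hash_1 = h(hash_00 + hash_01), h(hash_10 + hash_11)
top_hash = h(hash_0 + hash_1)

proof_for_L3 = [("left", hash_11), ("right", hash_0)]
assert verify_inclusion(L3, proof_for_L3, top_hash)
```

Running it reproduces exactly the hash(Hash 0 + hash(hash(L3) + Hash 1-1)) evaluation from the worked example.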
Forging these proofs is, at each step, as hard as finding a hash collision, and proof size is logarithmic in message set size. This has all kinds of applications. Tahoe-LAFS, Git, and ZFS (see: Wikipedia) all use it for ensuring data integrity. It appears in decentralization applications from IPFS to Bitcoin to Ethereum (see again: Wikipedia). Lastly, it makes certificate transparency possible (more on that later). The ability to authenticate a data structure turns out to solve all kinds of hard computer science problems.

“You meet your metaphor, and it’s good”

Of course, a Merkle tree is not the only authenticated data structure possible. It’s not hard to imagine generalizing the approach above to trees of arbitrary branch width, and even trees with optional components. We can construct authenticated versions of pretty much any DAG-like data structure, or just map elements of the structure onto a Merkle tree. In fact, as Miller et al. found in 2014, we can construct a programming language where all data types are authenticated. In Authenticated Data Structures, Generically, the authors create a fork of the OCaml compiler to do exactly that, and prove it to be both sound and performant. The mechanics for doing so are fascinating, but beyond the scope of this post. I highly recommend reading the paper.

One interesting thing to note in Miller et al.’s paper is that they re-contextualize the motivation for authenticated data structures. Earlier in this post, we talked about Merkle trees as useful for commitment schemes and data integrity guarantees, but Miller et al. instead choose to frame them as useful for delegation of data. Specifically, the paper defines an authenticated data structure as one “whose operations can be carried out by an untrusted prover, the results of which a verifier can efficiently check as authentic.”

If we take a moment to think, we can see that this is indeed true. If I have a Merkle tree with millions of elements in it, I can hand it over to a third party, retaining only the top hash, then make queries to this data expecting both a value and an inclusion proof. As long as the proof checks, I know that my data hasn’t been tampered with. In the context of trustless distributed systems, this is significant (we’ll come back to exactly why later, I promise).

In fact, I can authenticate not just reads, but writes! When I evaluate an inclusion proof, the result is a hash that I check against the digest I have saved. If I request the value at some index in the tree, save the proof, then request to write to that same index, then by evaluating the old proof with the value I’m writing, I can learn what the digest will be after the write has taken place.

Once again, an example may be helpful. Recall our earlier (diagrammed) example, where to prove L3’s inclusion, our proof consists of [(Left, Hash 1-1), (Right, Hash 0)]. If we want to write a new value, we first retrieve L3 and the associated proof. Then, just as we checked our proof by calculating hash(Hash 0 + hash(hash(L3) + Hash 1-1)) and ensured it was equal to the root hash, we calculate hash(Hash 0 + hash(hash(new_L3) + Hash 1-1)) and update our saved digest to the result. If this isn’t intuitive, looking back at the diagram can be really helpful. The combination of authenticated reads and writes allows for some very powerful new constructions.
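A short continuation of the earlier sketch (reusing h, verify_inclusion, and proof_for_L3 from above; again my own illustration rather than the post's code) shows this authenticated-write idea: replaying the old proof path with the new value tells us what the digest must be after the write, without holding the rest of the tree.

```python
def digest_after_write(new_leaf: bytes, proof) -> bytes:
    """Replay the saved proof path with the new leaf value to learn what
    the root digest must be once the write has been applied."""
    current = h(new_leaf)
    for side, sibling in proof:
        current = h(current + sibling) if side == "left" else h(sibling + current)
    return current

# Continuing the example: overwrite L3 and update the digest we retain.
new_L3 = b"new L3"
new_top_hash = digest_after_write(new_L3, proof_for_L3)
# On the next read of that index, the untrusted prover's proof must now
# verify against new_top_hash rather than against the old digest.
assert verify_inclusion(new_L3, proof_for_L3, new_top_hash)
```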
Specifically, by judiciously adding authentication “checkpoints” to a program in Miller et al.’s new language, we can cryptographically ensure that a client and server always agree on program state, even if the client doesn’t retain any of the data a program operates on! This is game-changing for systems that distribute computation to semi-trusted nodes (yes, like blockchains). This sounds like a wild guarantee with all manner of caveats, but it’s much less exciting than that. Programs ultimately run on overcomplicated Turing machines. Program state is just what’s written to the tape. Once you’ve accepted that all reads and writes can be authenticated for whatever data structure you’d like, the rest is trivial. Much of Miller et al.’s contribution is ultimately just nicer semantics!

“We love the things we love for what they are”

So far, we’ve achieved some fairly fantastical results. We can write code as usual, and cryptographically ensure client and server states are synchronized without one of them even having the data operated upon. This is a powerful idea, and it’s hard not to read it and seek to expand on it or apply it to new domains. Consequently, there have been some extremely impressive developments in the field of authenticated data structures even since 2014.

One work I find particularly notable is Authenticated Data Structures, as a Library, for Free! by Bob Atkins, written in 2016. Atkins builds upon Miller et al.’s work so that it no longer requires a custom compiler, a huge step towards practical adoption. It does require that developers provide an explicit serialization for their data type, as well as a custom retrieval function. It now works with real production code in OCaml relatively seamlessly.

There is still, however, the problem of indexing. Up until now we’ve been describing our access in terms of Merkle tree leaves. This works pretty well for data structures like an array, but it’s much harder to figure out how to authenticate something like a hashmap. Mapping the keys to leaves is trivial, but how do you verify that there was a defined value for a given key in the first place? Consider a simple hashmap from strings to integers. If the custodian of the authenticated hashmap claims that some key “hello” has no defined value, how do we verify that? The delegator could keep a list of all keys and authenticate that, but that’s ugly and inelegant, it effectively grows our digest size linearly with dataset size, and synchronizing this key list between client and server is fertile breeding ground for bugs. Ideally, we’d still like to save only one hash.

Fortunately, Ben Laurie and Emilia Kasper of Google developed a novel solution for this in 2016. Their work is part of Trillian, the library that enables certificate transparency in Chrome. In Revocation Transparency, they introduce the notion of a sparse Merkle tree: a Merkle tree of infeasible size (in their example, depth 256, so a node per thousand atoms in the universe) where we exploit the fact that almost all leaves in this tree have the same value to compute proofs and digests in efficient time. I won’t go too far into the technical details, but essentially, with 2^256 leaves, each leaf can be assigned a 256-bit index. That means that given some set of key/value data, we can hash each key (yielding a 256-bit digest) and get a unique index into the tree. We associate the hash of the value with that leaf, and have a special null hash for leaves not associated with any value.
There’s another diagram below I found very helpful: “An example sparse Merkle tree of height=4 (4-bit keys) containing 3 keys. The 3 keys are shown in red and blue. Default nodes are shown in green. Non-default nodes are shown in purple.” (Image and caption from AergoBlog)

Now we know the hash of every layer-two branch that isn’t directly above one of our defined nodes as well, since it’s just hash(hash(null) + hash(null)). Extending this further, for a given computation we only need to keep track of nodes above at least one of our defined nodes; every other value can be calculated quickly on demand (see the short sketch at the end of this post). Calculating a digest, generating a proof, and checking a proof are all logarithmic in the size of our dataset. Also, we can verify that a key has no associated value by simply returning a retrieval proof valid for a null hash.

Sparse Merkle trees, while relatively young, have already seen serious interest from industry. Obviously, they are behind Revocation Transparency, but they’re also being considered for Ethereum and Loom. There are more than a few libraries (Trillian being the most notable) that just implement a sparse Merkle tree data store. Building tooling on top of them isn’t particularly hard (check out this cool example).

“Give me a land of boughs in leaf”

As exciting as all these developments are, one might still wish for a “best of all worlds” solution: authenticated semantics for data structures as easy to use as Miller et al.’s, implemented as a lightweight library like Atkins’s, and with the support for natural indexing and exclusion proofs of Laurie and Kasper’s. That’s exactly what Indurative implements.

Indurative uses a new GHC feature called DerivingVia that landed in GHC 8.6 last summer. DerivingVia is designed to allow for instantiating polymorphic functions without either bug-prone handwritten instances or hacky, unsound templating and quasiquotes. It uses Haskell’s newtype system so that library authors can write one general instance which developers can automatically specialize to their type. DerivingVia means that Indurative can offer authenticated semantics for essentially any indexed type that can be iterated through with binary-serializable keys and values. Indurative works out of the box on containers from the standard library, containers and unordered-containers. It can derive these semantics for any container meeting these constraints, with any hash function (and tree depth), and any serializable keys and values, without the user writing a line of code.

Earlier we briefly discussed the example of adding binary transparency to a package-management server in less than ten lines of code. If developers don’t have to maintain parallel state between the data structures they already work with and their Merkle-tree-authenticated store, we hope that they can focus on shipping features without giving up cryptographic authenticity guarantees.

Indurative is still alpha software. It’s not very fast yet (it can be made waaaay faster), it may have bugs, and it uses kind of sketchy Haskell (UndecidableInstances, but I think we do so soundly). It’s also new and untested cryptographic software, so you might not want to rely on it for production use just yet. But we’ve worked hard on commenting all the code and writing tests, because we think that even if it isn’t mature, it’s really interesting. Please try it, let us know how it works, and let us know what you want to see.
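To make the default-hash trick from the sparse Merkle tree section above concrete, here is a tiny sketch (my own illustration, reusing the h helper from the earlier Merkle example; it is not Trillian's or Indurative's API): precompute the digest of an empty subtree at every height, and any subtree containing no defined keys simply hashes to the default value for its height.

```python
def default_hashes(depth: int):
    """defaults[i] is the digest of an empty subtree of height i: the null
    hash at the leaves, then hash(child + child) for two identical empty
    children at every level above."""
    defaults = [h(b"")]                        # the "null hash" for an undefined leaf
    for _ in range(depth):
        defaults.append(h(defaults[-1] * 2))   # bytes * 2 == null_i + null_i
    return defaults

# A depth-256 tree has 2**256 leaves, yet only 257 default values ever need computing.
defaults = default_hashes(256)
```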
If you have hard cryptographic engineering problems, and you think something like Indurative might be the solution, drop us a line.
Categories: Security Posts

The new adventures of Doctor Tech on YouTube #MovistarHome

Un informático en el lado del mal - Mon, 2019/06/17 - 06:43
As I have been telling you, our dear colleague Nikotxan is making a web series about Doctor Tech to explain technology in general, and Movistar Home in particular, which is the first product he is covering. Over the last few months we have released the episodes one by one, up to a total of four.
Figure 1: The new adventures of Doctor Tech on YouTube
We have now put them all together in a YouTube playlist you can subscribe to in order to stay up to date, within the Telefónica channel, where all the episodes published so far are already available.
Figure 2: YouTube playlist of The new adventures of Dr. Tech
The latest episode, inspired by my beloved superheroes, came out just recently, so here they all are in case you want to watch them back to back, the way we binge series these days. Have a great week!
Figure 3: Un oso, el genio de la lámpara y Movistar Home

Figure 4: Osovisión

Figure 5: Unboxing Movistar Home

Figure 6: Apaga la luz

Saludos Malignos!
Categories: Security Posts

An infection from Rig exploit kit, (Mon, Jun 17th)

SANS Internet Storm Center, InfoCON: green - Mon, 2019/06/17 - 05:49
Introduction

Rig exploit kit (EK) is one of a handful of EKs still active, as reported in May 2019 by Malwarebytes.  Even though EKs are far less active than in previous years, EK traffic is still sometimes noted in the wild.  Twitter accounts like @nao_sec, @david_jursa, @jeromesegura, and @tkanalyst occasionally tweet about EK activity.  Today's diary reviews a recent example of infection traffic caused by Rig EK.

Recent developments

For the past year, Rig EK has been using Flash exploits based on CVE-2018-8174, as noted in this May 2018 blog post from @kafeine.  Since then, other sources have reported Rig EK delivering a variety of malware like the Grobios Trojan or malware based on a Monero cryptocurrency miner.  Like other EKs, Rig EK is most often used in malvertising distribution campaigns.  In today's infection, Rig EK delivered AZORult, and the infection followed up with other malware I was unable to identify.

Infection traffic

I used a gate from malvertising traffic in a recent tweet from @nao_sec.  See images below for details.
Shown above:  Traffic from the infection filtered in Wireshark.
Shown above:  A closer look at the Rig EK traffic.
Shown above:  Rig EK landing page.

Shown above:  Rig EK sends a Flash exploit.

Shown above:  Rig EK sending its malware payload (encrypted over the network, but decoded on the infected host).

Shown above:  An example of AZORult post-infection traffic.

Shown above:  Follow-up malware EXE retrieved by my infected Windows host.
Indicators of Compromise (IOCs)

Redirect domain that led to Rig EK:
  • 194.113.104[.]153 port 80 - makemoneyeasy[.]live - GET /
Rig EK:
  • 5.23.55[.]246 port 80 - 5.23.55[.]246 - various URLs
AZORult post-infection traffic:
  • 104.28.8[.]132 port 80 - mixworld1[.]tk - POST /mix1/index.php
Infected Windows host retrieved follow-up malware:
  • 209.217.225[.]74 port 80 - hotelesmeflo[.]com - GET /chachapoyas/wp-content/themes/sketch/msr.exe
SHA256 hash: a666f74574207444739d9c896bc010b3fb59437099a825441e6c745d65807dfc
  • File size: 9,261 bytes
  • File description: Flash exploit used by Rig EK on 2019-06-17
SHA256 hash: 2de435b78240c20dca9ae4c278417f2364849a5d134f5bb1ed1fd5791e3e36c5
  • File size: 354,304 bytes
  • File description: Payload sent by Rig EK on 2019-06-17 (AZORult)
SHA256 hash: a4f9ba5fce183d2dfc4dba4c40155c1a3a1b9427d7e4718ac56e76b278eb10d8
  • File size: 2,952,704 bytes
  • File description: Follow-up malware hosted on URL at hotelesmeflo[.]com on 2019-06-17
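If you want to check local samples against the hashes above, a quick sketch like the following should do (assuming Python 3; the file paths are simply whatever samples you pass on the command line):

```python
import hashlib
import sys

# SHA256 IOCs listed above.
IOC_SHA256 = {
    "a666f74574207444739d9c896bc010b3fb59437099a825441e6c745d65807dfc",  # Flash exploit
    "2de435b78240c20dca9ae4c278417f2364849a5d134f5bb1ed1fd5791e3e36c5",  # Rig EK payload (AZORult)
    "a4f9ba5fce183d2dfc4dba4c40155c1a3a1b9427d7e4718ac56e76b278eb10d8",  # follow-up malware
}

def sha256_of(path: str) -> str:
    """Hash a file in chunks so large samples don't need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()

for path in sys.argv[1:]:
    verdict = "MATCHES IOC" if sha256_of(path) in IOC_SHA256 else "no match"
    print(f"{path}: {verdict}")
```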
Final words

My infected Windows host retrieved follow-up malware after the initial AZORult infection.  However, I was using a virtual environment, and I didn't see any further post-infection traffic, so I could not identify the follow-up malware.

A pcap of the infection traffic along with the associated malware and artifacts can be found here.

---
Brad Duncan
brad [at] malware-traffic-analysis.net

(c) SANS Internet Storm Center. https://isc.sans.edu Creative Commons Attribution-Noncommercial 3.0 United States License.
Categories: Security Posts

ISC Stormcast For Monday, June 17th 2019 https://isc.sans.edu/podcastdetail.html?id=6542, (Mon, Jun 17th)

SANS Internet Storm Center, InfoCON: green - Mon, 2019/06/17 - 03:40
(c) SANS Internet Storm Center. https://isc.sans.edu Creative Commons Attribution-Noncommercial 3.0 United States License.
Categories: Security Posts

Using Anomaly Detection to find malicious domains

Fox-IT - Tue, 2019/06/11 - 15:00
Applying unsupervised machine learning to find ‘randomly generated’ domains.

Authors: Ruud van Luijk and Anne Postma

At Fox-IT we perform a variety of research and investigation projects to detect malicious activity and improve the service of our Security Operations Center. One of these areas is applying data science techniques to real-world data in real-world production environments, looking for anomalous SMB sequences, beaconing patterns, and other unexpected patterns. This blog entry shares an application of machine learning to detect random-like patterns that indicate possible malicious activity.

Attackers use a domain generation algorithm[1] (DGA) to build a resilient Command and Control[2] (C2) infrastructure. Automated, large-scale malware operations put a strain on a malware family's C2 infrastructure: if defenders identify its key domains, these can be taken down or sinkholed, weakening the C2. To overcome this challenge, attackers may use a domain generation algorithm. A DGA dynamically generates a large number of seemingly random domain names and then selects a small subset of these domains for C2 communication. The generated domains are computed from a given seed, which can consist of numeric constants, the current date, or even the Twitter trend of the day. Based on this same seed, each infected device will produce the same domains. The rapid change of C2 domains in use allows attackers to create a large network of servers that is resilient to sinkholing, takedowns, and blacklisting. If you sinkhole one domain, another pops up the next day or the next minute. This technique is commonly used by multiple malware families and actors; for example, Ramnit, Gozi, and Quakbot all use generated domains.

Methods for detection

Machine-learning approaches have proven effective at detecting DGA domains, in contrast to static rules. The input of these approaches may, for example, consist of the entropy, frequency of occurrence, top-level domain, number of dictionary words, length of the domain, and n-grams. However, many of these approaches need labelled data: you need to know a lot of ‘good’ domains and a lot of DGA domains. Good domains can be taken, for example, from the Alexa and Majestic million sets, and DGA domains can be generated from known malicious algorithms. While these DGA domains are valid, they are only valid for as long as that specific algorithm remains in use. If there is a new type of DGA, chances are your model is no longer correct and does not detect the newly generated domains. Language regions also pose a challenge for the ‘good’ domains: each language has different structures and combinations, and taking the Alexa or Majestic million is a one-size-fits-all approach in which nuances may get lost.

To overcome the challenges of labelled data, unsupervised machine learning might be a solution. These approaches do not need an explicit DGA training set – you only need to know what is normal or expected. A majority of research moves to variants of neural networks, which require a lot of computational power to train and predict. With the amount of network data involved this is not necessarily a deal-breaker if there is ample computing power, but it certainly is a factor to consider. An easier-to-implement solution is to look solely at the occurrences of n-grams to define what is normal.
N-grams are sequences of N consecutive elements such as words or letters, where bi-grams (2-grams) are sequences of two, tri-grams (3-grams) are sequences of three, etc. To illustrate with the domain ‘google.com’: ignoring the top-level domain, its letter tri-grams are ‘goo’, ‘oog’, ‘ogl’, and ‘gle’. This is an intuitive way to dissect language. Because, what are the odds you see a ‘kzp’ in a domain? And what are the odds you see ‘oog’ in a domain?

We calculate the domain probability by multiplying the probability of each of the tri-grams and normalise by dividing it by the length of the domain (a minimal sketch of this scoring appears at the end of this post). We chose an unconditional probability, meaning we ignore the dependency between n-grams, as this speeds up training and calculation times. We also ignored the top-level domain (e.g. “.co.uk”, “.org”), as these are common in both normal and DGA domains, to focus our model on the parts of the domain that are distinctive. If the domain probability is below a predefined threshold, the domain deviates from the baseline and is likely a DGA domain.

Results

To evaluate this technique we trained on roughly 8 million non-unique common names of a network, thereby creating a baseline of what is normal for this network. We evaluated the model by scoring one million non-unique common names and roughly 125,000 DGA domains over multiple algorithms, provided by Johannes Bader[3]. We excluded from both the training and evaluation set some domains that are known to use randomly generated (sub)domains, such as content delivery networks.

The figure below illustrates the log probability distributions of the blue baseline domains, i.e. the domains you would expect to see, and the red DGA domains. Although a clear distinction between the two distributions can be seen, there is also a small overlap visible between -10 and -7.5. This is because some DGA domains are much like regular domains, some baseline domains are random-like, and for some domains our model wasn’t able to correctly distinguish them from DGA domains.

For our detection to be practically useful in large operations, such as Security Operations Centers, we need a very low false positive rate. We also assumed that every baseline has a small contamination ratio; we chose a ratio of 0.001% and use this as the cut-off value between predicting a domain as DGA or not. During hunting this threshold may be increased or completely ignored.

                    True DGA    True Normal
Predicted DGA       94.67%      ~0
Predicted Normal    6.33%       ~100%
Total               100%        100%

If we take the cut-off value at this point, we get an accuracy (the percentage correct) of 99.35% and an F1-score of 97.26.

Conclusion

DGA domains are a tactic used by various malware families. Machine learning approaches have proven useful in detecting this tactic, but often fail to generalize into a simple, robust solution for production. By relaxing some restrictions on the math and compensating for this with a lot of baseline data, a simple and effective solution can be found. This solution does not rely on labelled data, is on par with scientific research, and has the benefit of taking into account the common language of the regular domains used in the network. We demonstrated this solution with hostnames in common names, but it is also applicable to HTTP and DNS. Moreover, a wide range of applications is possible, since it detects deviations from the expected: for example, randomly generated file names, deviating hostnames, unexpected sequences of connections, etc.
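As a rough illustration of the tri-gram scoring described above (my own sketch of the approach, not Fox-IT's production code), the following trains unconditional tri-gram frequencies on a set of 'normal' names and scores new domains by their length-normalised log probability; summing log probabilities is equivalent to multiplying the raw probabilities and matches the log-probability axis of the figure.

```python
import math
from collections import Counter

def trigrams(name: str):
    """Overlapping letter tri-grams of a (TLD-stripped, lowercased) name."""
    name = name.lower()
    return [name[i:i + 3] for i in range(len(name) - 2)]

def train(baseline_names, smoothing=1e-6):
    """Unconditional tri-gram probabilities estimated from baseline (benign) names."""
    counts = Counter(t for name in baseline_names for t in trigrams(name))
    total = sum(counts.values())
    return lambda t: (counts[t] + smoothing) / (total + smoothing)

def score(name: str, p) -> float:
    """Length-normalised log probability; lower means more 'random-looking'."""
    grams = trigrams(name)
    if not grams:
        return 0.0
    return sum(math.log(p(t)) for t in grams) / len(name)

# Toy usage: a tiny baseline and an obviously random-looking candidate.
p = train(["google", "wikipedia", "microsoft", "github", "stackoverflow"])
for candidate in ["googel", "xkqzpvhr"]:
    print(candidate, round(score(candidate, p), 2))
```

Anything scoring below a threshold chosen from the baseline distribution (the 0.001% contamination cut-off in the post) would then be flagged as a likely DGA domain.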
  1. This technique was recently added to the MITRE ATT&CK framework. https://attack.mitre.org/techniques/T1483/
  2. For more information about C2, see: https://attack.mitre.org/tactics/TA0011/
  3. https://github.com/baderj/domain_generation_algorithms
Categories: Security Posts