Current Previous Research

  • Standard components of the semantic web include a format for storing data (the Resource Description Framework, RDF), a language for querying that data (SPARQL), and a logic that supports rule-based inference over that data [9]. Our framework leverages this foundation: access control policies are specified as rules which, when matched, automatically generate RDF triples that encode policy facts. In contrast to relational databases, where a value’s location (i.e., table, column) provides its context, the meaning of an RDF triple is encoded in the triple itself. Hence, access control policy must be expressible in terms of triple contents.

    Motivated by an effort to give users a single logical view of their on-line information (e.g., pictures, tweets, virtual identities, social networks), referred to as a Personal Index or Pix, we present a framework for specifying and enforcing access control policies for data stored using RDF and queried via SPARQL. Since a user’s on-line information may be diverse, extensive, and ever-growing, sharing policies over that data must be intuitive for the non-expert and easy to maintain. Hence, we allow users to specify policy via high-level rules. Policy is enforced by a query rewriting module which, in contrast to previous approaches, does not need visibility into the set of configured policies. PixACL policies are easily composed and can be queried with the same expressive language used to query data.

  • Some slides here. Early draft of paper here.
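The idea behind the access control framework above (rules that emit policy triples, and a query rewriter that only consults those triples) can be illustrated with a small sketch. This is not PixACL's implementation: the `pix:` vocabulary, the sample data, the rule, and every function name are hypothetical, and plain Python tuples stand in for an RDF store and SPARQL.

```python
# Illustrative sketch only. The "pix:" predicates, the rule, and the data are
# assumptions made for this example, not PixACL's actual schema or API.

# A triple is (subject, predicate, object); its meaning lives in its contents.
data = {
    ("pic1", "rdf:type", "pix:Photo"),
    ("pic1", "pix:tag", "family"),
    ("pic2", "rdf:type", "pix:Photo"),
    ("pic2", "pix:tag", "work"),
    ("alice", "pix:memberOf", "family_group"),
}

def rule_family_photos(store):
    """High-level rule: members of family_group may read items tagged 'family'."""
    members = {s for (s, p, o) in store if p == "pix:memberOf" and o == "family_group"}
    tagged = {s for (s, p, o) in store if p == "pix:tag" and o == "family"}
    # The rule's output is itself a set of triples encoding policy facts.
    return {(m, "pix:canRead", r) for m in members for r in tagged}

def apply_rules(store, rules):
    """Materialize policy facts as ordinary triples alongside the data."""
    derived = set(store)
    for rule in rules:
        derived |= rule(derived)
    return derived

def query(store, requester, predicate):
    """'Rewritten' query: the original pattern joined with pix:canRead.

    The function never inspects the rules, only the generated policy triples.
    """
    readable = {o for (s, p, o) in store if s == requester and p == "pix:canRead"}
    return sorted((s, o) for (s, p, o) in store if p == predicate and s in readable)

store = apply_rules(data, [rule_family_photos])
print(query(store, "alice", "pix:tag"))  # [('pic1', 'family')]
print(query(store, "bob", "pix:tag"))    # []
```

Because the policy facts are ordinary triples, the same query machinery can be pointed at them, for example to ask which resources a given requester can read.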


We address the semantic gap problem in behavioral monitoring by using hierarchical behavior graphs to infer high-level behaviors from myriad low-level events. Our experimental system traces the execution of a process, performing data-flow analysis to identify meaningful actions such as "proxying," "keystroke logging," "data leaking," and "downloading and executing a program" from complex combinations of rudimentary system calls. To preemptively address evasive malware behavior, our specifications are carefully crafted to detect alternative sequences of events that achieve the same high-level goal. We tested eleven benign programs, variants from seven malicious bot families, four trojans, and three mass-mailing worms and found that we were able to thoroughly identify high-level behaviors across this diverse code base. Moreover, we effectively distinguished malicious execution of high-level behaviors from benign execution by identifying remotely-initiated actions.

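As a rough illustration of how a high-level behavior might be recognized from low-level events linked by data flow, consider the sketch below. It is an assumption-laden toy, not the system's specification language: the `Event` record, the call names, the data labels, and the single `download_and_execute` behavior are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Event:
    call: str  # low-level call name (hypothetical labels)
    src: str   # label of the data the call consumed
    dst: str   # label of the data or object the call produced

# Each abstract step lists alternative low-level calls that achieve it, so
# different event sequences map onto the same high-level goal.
STEP_ALTERNATIVES = {
    "net_read": {"recv", "recvfrom"},
    "file_write": {"write", "writev"},
    "execute": {"execve", "CreateProcess"},
}

# A high-level behavior is an ordered chain of steps connected by data flow.
BEHAVIORS = {
    "download_and_execute": ["net_read", "file_write", "execute"],
}

def exhibits(behavior, trace):
    """Return True if the trace contains the behavior's steps, in order,
    with each step consuming data produced by the previous step."""
    steps = BEHAVIORS[behavior]
    # reached[i] holds labels produced by completing steps[0..i] in order.
    reached = [set() for _ in steps]
    for ev in trace:
        for i in reversed(range(len(steps))):
            if ev.call not in STEP_ALTERNATIVES[steps[i]]:
                continue
            if i == 0 or ev.src in reached[i - 1]:
                reached[i].add(ev.dst)
    return bool(reached[-1])

bot_trace = [
    Event("recvfrom", "socket:4", "buf1"),
    Event("writev", "buf1", "file:/tmp/x"),
    Event("execve", "file:/tmp/x", "proc:99"),
]
unrelated_trace = [
    Event("recv", "socket:4", "buf1"),
    Event("write", "buf2", "file:/tmp/notes"),       # data not from the network
    Event("execve", "file:/usr/bin/ls", "proc:7"),   # runs a pre-existing file
]
print(exhibits("download_and_execute", bot_trace))        # True
print(exhibits("download_and_execute", unrelated_trace))  # False
```

The second trace contains the same kinds of calls but no data-flow chain between them, which is why it does not match.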
Automated bot/botnet detection is a difficult problem given the high level of attacker power. We propose a systematic approach for evaluating the evadability of detection methods. An evasion tactic has two associated costs: implementation complexity and effect on botnet utility. An evasion tactic's implementation complexity is based on the ease with which bot writers can incrementally modify current bots to evade detection. Modifying a bot in order to evade a detection method may result in a less useful botnet; to explore this, we identify aspects of botnets that impact their revenue-generating capability. For concreteness, we survey some leading automated bot/botnet detection methods, identify evasion tactics for each, and assess the costs of these tactics. We also reconsider assumptions about botnet control that underlie many botnet detection methods.

BotSwat is a behavior-based malware detector which targets the command-execution behavior of malicious bots on their infected hosts. In response to receiving a command over the command-and-control network, a bot performs some actions, thus transforming the infected computer into a platform from which attacks are launched. A bot command often consists of some keyword (identifying the target action) along with parameters which determine how the action should be performed; e.g., a bot web-download command typically takes two arguments: one identifying the URL from which to download and another identifying the local file path at which to store the downloaded data. Botnets have been used for phishing, distributed denial-of-service attacks, malware distribution, spamming, scanning, and harvesting license or product keys from the host system.

BotSwat tracks data received over the network as a program processes this tainted data. When a process uses tainted data in a system call argument, BotSwat identifies bot command execution. BotSwat detects bots from families that together constitute 98.2% of known variants. Moreover, BotSwat produces very few false positives since it is able to identify locally-initiated actions, i.e., those behaviors performed by a program at the behest of its local user. BotSwat's generality enables it to identify novel bot commands and behaviors and to provide detection across diverse bot implementations.
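The detection principle described above (network data is tainted, and a tainted system call argument signals command execution) can be mimicked at a very small scale. The sketch below is only an analogy under stated assumptions: BotSwat monitors real processes, whereas here a hypothetical `Tainted` string subclass, a made-up command format, and a placeholder URL stand in for that machinery.

```python
# Toy analogy, not BotSwat's mechanism: taint is modeled as a str subclass,
# and "system calls" are ordinary function calls that get logged.

class Tainted(str):
    """A string marked as derived from data received over the network."""

def recv_from_network(payload):
    # Everything arriving over the network starts out tainted.
    return Tainted(payload)

def split_command(data):
    # str.split returns plain str objects, so taint must be propagated
    # explicitly to each piece derived from tainted input.
    parts = data.split()
    if isinstance(data, Tainted):
        return [Tainted(p) for p in parts]
    return parts

def monitored_syscall(name, *args, log):
    # Flag the call if any argument carries taint (remotely-initiated action).
    tainted_args = [a for a in args if isinstance(a, Tainted)]
    if tainted_args:
        log.append((name, tainted_args))

alerts = []

# Remotely-initiated: keyword plus parameters, as in a web-download command.
command = recv_from_network("download http://example.invalid/a.bin /tmp/a.bin")
keyword, url, path = split_command(command)
monitored_syscall("connect", url, log=alerts)
monitored_syscall("create_file", path, log=alerts)

# Locally-initiated: the argument did not come from the network.
monitored_syscall("create_file", "/home/user/notes.txt", log=alerts)

print([name for name, _ in alerts])  # ['connect', 'create_file']
```

Only the two calls whose arguments derive from the received command are flagged; the locally-initiated file creation passes silently, which mirrors how identifying local initiation keeps false positives low.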