Research

Overview

I study methods and build tools for making computers and computer programs more secure. Most recently, I have concentrated on the security of web browsers and web applications; these programs are particularly vulnerable due to the inherent insecurities of the open Internet, and to the proliferation of malicious web sites and malicious content republished at reputable sites.

Much of my work focuses on building general defenses against classes of attacks, rather than fixing a single vulnerability in a single program. For example, I work on secure programming languages because they can give all programs written in the language protection against a class of threats.

Jump to: Computer security | Programming languages | Distributed computing

Computer security

Web application security

The number one reported class of security vulnerabilities today is script injection. It has affected all major sites (Yahoo, Google, MySpace, Facebook, etc.) and it occurs when a web site mistakenly publishes user-generated content that includes a malicious program (script). This is a pervasive problem on the web, since many web sites exist solely to publish user-generated content (blog postings, comments, home pages, photos, etc.).

Working with colleagues at the University of Maryland, I showed that it is almost impossible for a web site, working alone, to prevent script injection. Instead, we proposed that web sites communicate to web browsers policies that tell which scripts are trusted by the site, and rely on the web browser to enforce the policies. Our method, called BEEP, can prevent script injection with 100% accuracy.

resources

BEEP home page
Defeating Script Injection Attacks with Browser-Enforced Embedded Policies. Trevor Jim, Nikhil Swamy, and Michael Hicks. 16th International World Wide Web Conference (WWW 2007), May 2007.

Application-specific firewalls

The second largest class of security vulnerabilities being reported today are buffer overflows. These are severe vulnerabilities: an attacker can crash or even take over an application by exploiting a buffer overflow.

I have built a tool called Yakker that protects applications against buffer overflows caused by malicious inputs. Yakker acts as an application-specific firewall: it examines the network protocol messages that are the inputs to network servers and filters out malformed messages as well as other messages that can trigger overflows.

The main novelty of Yakker is that it is highly automated, producing a simple firewall directly from existing human-readable specifications, namely, the “Request for Comments” documents that define most of the basic protocols of the Internet.

resources

Yakker is distributed as part of the Cyclone project.
Yakker: A parser generator for network protocol messages. Trevor Jim. Slides for a talk given at New Jersey Programming Languages and Systems Seminar, November, 2005.

Trust management

A trust management system implements security policies that are constructed by composing the policies of multiple parties. Carl Gunter and I implemented QCM, the first trust management system based on a declarative query language; this is now the dominant approach to trust management systems. QCM was also the first distributed trust management system: it uses local policy information (e.g., local databases or locally available digital certicates) but can also retrieve remote policy information automatically, as necessary. Carl and I also showed how revocation of the certificates used in a trust management system could be expressed entirely within the same trust management system.

I built a second trust management system, SD3, that extended QCM with recursive policies. SD3 was the first trust management system based on Datalog. SD3 featured a self-certifying evaluator, which produces machine-checkable proofs of its output, reducing the size of the trusted computing base significantly. Dan Suciu and I studied the theory of SD3, showing that it has many possible distributed evaluation strategies.

resources

QCM home page

SD3 home page

Design of an Application-Level Security Infrastructure. Carl A. Gunter and Trevor Jim. DIMACS Workshop on Design and Formal Verification of Security Protocols, September 3–5, 1997.
Policy-Directed Certificate Retrieval. Carl A. Gunter and Trevor Jim. Technical Report MS–CIS–99–07, Department of Computer and Information Science, University of Pennsylvania. September 1998. A revised version is published in Software Practice and Experience, 30(15):1609–1640, 2000.
The SwitchWare Active Network Implementation. D. Scott Alexander, Michael W. Hicks, Pankaj Kakkar, Angelos D. Keromytis, Marianne Shaw, Jonathan T. Moore, Carl A. Gunter, Trevor Jim, Scott M. Nettles, and Jonathan M. Smith. Notes of the ACM SIGPLAN Workshop on ML, 67–76, September 1998.
What is QCM? Carl A. Gunter and Trevor Jim. August 1999.
Generalized Certificate Revocation. Carl A. Gunter and Trevor Jim. ACM Symposium on Principles of Programming Languages, January 2000.
Certificate Distribution with Local Autonomy. Pankaj Kakkar, Michael McDougall, Carl A. Gunter, and Trevor Jim. The Second International Working Conference on Active Networks, October 2000.
SD3: A Trust Management System with Certified Evaluation. Trevor Jim. IEEE Symposium on Security and Privacy, May 2001.
Dynamically Distributed Query Evaluation. Trevor Jim and Dan Suciu. ACM Symposium on Principles of Database Systems, May 2001.

System call monitoring

A system call monitor watches the sequence of system calls made by an application, looking for “abnormal” system calls. This can indicate that the application has been compromised, and is being controlled by an attacker. System call monitors are an old idea, but are not used often, because they can be slow, and because errors in the model of “abnormal” behavior can cause an administrative nightmare of false positives.

Authenticated system calls address both of these drawbacks. First, they use a novel and very efficient means of enforcement: the notion of “abnormal” is embedded in the application itself, with cryptographic protections, and the application passes this in to the kernel with each system call. This eliminates the need for a separate (and slow) watchdog process. Second, they have no false positives because we define the notion of “abnormal” using a conservative static program analysis.

Our conference paper won a best student paper award.

resources

System Call Monitoring Using Authenticated System Calls. Mohan Rajagopalan, Matti Hiltunen, Trevor Jim, and Richard Schlichting. IEEE Transactions on Secure and Dependable Computing, 3(3), 216–229, July–September 2006.
Authenticated System Calls. Mohan Rajagopalan, Matti Hiltunen, Trevor Jim, and Richard Schlichting. DSN–2005: The International Conference on Dependable Systems and Networks, June 2005.

DNS

The Domain Name System (DNS) has well-known insecurities and there has been a more secure successor (DNSSEC) in the works for >10 years. However, DNSSEC has yet to be adopted, and we are starting to see real attacks on DNS (cache poisoning). Christof Fetzer and I examined some of the economic reasons for the failure of DNSSEC, and we proposed a scheme to “bootstrap” a more secure DNS from an existing public key directory (SSL).

In 1998 I proposed that a system like DNS could be implemented entirely within a distributed trust management system like QCM; I reimplemented this in SD3, and this is now a common benchmark for trust management systems.

resources

SEDNS home page
Enhancing DNS Security using the SSL Trust Infrastructure. Christof Fetzer, Gert Pfeifer, and Trevor Jim. Tenth IEEE International Workshop on Object-oriented Real-time Dependable Systems (WORDS 2005), February 2005.
Incentives and Disincentives for DNSSEC Deployment. Christof Fetzer and Trevor Jim. March 1, 2004.
Policy-Directed Certificate Retrieval. Carl A. Gunter and Trevor Jim. Technical Report MS–CIS–99–07, Department of Computer and Information Science, University of Pennsylvania. September 1998. A revised version is published in Software Practice and Experience, 30(15):1609–1640, 2000.
SD3: A Trust Management System with Certified Evaluation. Trevor Jim. IEEE Symposium on Security and Privacy, May 2001.

Programming languages

Cyclone

Safe programming languages have become mainstream: programmers are now using languages like Java, C#, and Javascript in preference to languages like C. The advantages of safety (no buffer overflows) and garbage collection (which enables safety and reduces development time) ensure that safe languages will only extend their dominance in the future.

However, C remains essential in cases where efficiency is paramount, or where lack of computational resources, memory, or bandwidth rule out garbage collection and other techniques typically used in safe languages. In particular, core components of safe languages like garbage collectors and runtime systems are usually written in C. This is a dangerous situation, given the dismal security record of C programs with regards to buffer overflows.

The Cyclone project addresses this by retrofitting safety into the C language. Cyclone is a dialect of C that prevents unsafe programs (no buffer overflows) while retaining one of C's defining characteristics: control is in the hands of the programmer. For example, Cyclone allows manual memory management (alloc and free instead of garbage collection), but requires the programmer to add annotations to programs that Cyclone will check to ensure safety. It is possible to write provably-safe garbage collectors entirely in Cyclone.

resources

Cyclone home page
Cyclone: A Safe Dialect of C. Trevor Jim, Greg Morrisett, Dan Grossman, Michael Hicks, James Cheney, and Yanling Wang. USENIX Annual Technical Conference, pages 275–288, Monterey, CA, June 2002.
Region-based Memory Management in Cyclone. Dan Grossman, Greg Morrisett, Trevor Jim, Michael Hicks, Yanling Wang, and James Cheney. ACM Conference on Programming Language Design and Implementation, pages 282–293, Berlin, Germany, June 2002.
Combining Garbage Collection and Safe Manual Memory Management. Michael Hicks, Dan Grossman, and Trevor Jim. Second workshop on Semantics, Program Analysis, and Computing Environments for Memory Management (SPACE 2004), Venice, Italy, January 2004.
Experience with Safe Manual Memory-Management in Cyclone. Michael Hicks, Greg Morrisett, Dan Grossman, and Trevor Jim. The 2004 International Symposium on Memory Management, October 2004.
Cyclone: A Type-Safe Dialect of C. Dan Grossman, Michael Hicks, Trevor Jim, and Greg Morrisett. C/C++ Users Journal, January 2005.

Typed assembly language and run-time code generation

Typed assembly language adds type annotations to machine code (binaries). These annotations can be mechanically checked and ensure properties like memory safety (no buffer overflows). Luke Hornof and I extended this idea to support run-time code generation: our machine code annotations permit us to write provably-safe programs that generate and execute provably-safe machine code at runtime.

We later integrated our work more closely into the Popcorn compiler of Morrisett et al., and this collaboration led to our work on the Cyclone programming language.

resources

Certifying Compilation and Run-time Code Generation. Luke Hornof and Trevor Jim. Higher Order and Symbolic Computation, 12(4):337–375, December 1999. An earlier version of this paper appeared in ACM Workshop on Partial Evaluation and Semantics-Based Program Manipulation, January 22–23, 1999.
Compiling for Template-Based Run-Time Code Generation. Frederick Smith, Dan Grossman, Greg Morrisett, Luke Hornof, and Trevor Jim. Journal of Functional Programming, May 2003.

Type inference

I did a PhD thesis on problems in type inference. I was the first to point out the distinction between the properties of principal types and principal typings, and I showed that principal typings enable language features like “smartest recompilation,” incremental type inference, and accurate type error messages. Furthermore, I showed that the lack of principal typings is at the root of the undecidability of type inference for polymorphic recursion.

Separately, I built a type inference algorithm based on polarity that can infer types well beyond those inferred by Hindley-Milner type inference (e.g., it infers types with universal quantification at arbitrary rank). The algorithm makes critical use of principal typings and is also an example of what Hosaya and Pierce call local type inference.

I have also worked with Jens Palsberg on type inference problems in object type systems.

resources

What are principal typings and what are they good for? Trevor Jim. MIT/LCS TM–532, Massachusetts Institute of Technology, August 1995; revised November 1995. Extended version of a paper appearing in ACM Symposium on Principles of Programming Languages, 1996.
Rank 2 type systems and recursive definitions. Trevor Jim. MIT/LCS TM–531, Massachusetts Institute of Technology, August 1995; revised November 1995.
Type Inference with Simple Selftypes is NP-complete. Jens Palsberg and Trevor Jim. Nordic Journal of Computing 4(3):259–286, Fall 1997.
Type inference in systems of recursive types with subtyping. Trevor Jim and Jens Palsberg. July 1997.
Automatic Discovery of Covariant Read-Only Fields. Jens Palsberg, Tian Zhao, and Trevor Jim. ACM Transactions on Programming Languages and Systems, Volume 27, Issue 1, January 2005. An earlier version appeared in Ninth International Workshop on Foundations of Object-Oriented Languages, January 2002.

Standard ML of New Jersey

I was a member of the team that built the original Standard ML of New Jersey compiler. Andrew Appel and I built the first native back end for the compiler, which took took the (then) novel approach of a having a very large number of passes, each of which implemented a simple program transformation. We introduce the transformation to closure-passing style. We were also the first to implement a one-pass translation to continuation-passing style that avoids constructing so-called administrative redexes.

resources

SML/NJ home page
Continuation-passing, closure-passing style. Andrew W. Appel and Trevor Jim. Proc. Sixteenth ACM Symposium on Principles of Programming Languages, 293–302, January 1989.
Shrinking Lambda Expressions in Linear Time. Andrew W. Appel and Trevor Jim. Journal of Functional Programming 7(5):515–540, September 1997.

Denotational semantics

For my master's thesis I studied the problem of full abstraction in denotional semantics. Lack of full abstraction indicates a mismatch between a semantics and the programming language it intends to model. The basic finding is that Berry's stable domains cannot be a fully abstract model for any language defined by a simple class of term rewriting rules.

resources

Full Abstraction and the Context Lemma. Trevor Jim and Albert R. Meyer. SIAM Journal of Computing 25(3), 663–696, June 1996.

Distributed computing

Trust management

I built three trust management systems (SD3 and two versions of QCM) that are essentially distributed query languages applied to a security problem. What distinguishes these distributed databases from previous work is that the distributed structure of the system is dynamically discovered by queries into the database itself, in the same way that hyperlinks are discovered by following links to web pages. The connectivity of the system is therefore not static as in previous work.

DXQ

I am working with Mary Fernandez on another distributed query language, DXQ, that combines ideas from QCM and SD3 with Mary's XML query language implementation, Galax. The focus of this work is on distributed computation rather than the security applications of SD3 and QCM.

We are using DXQ to implement peer-to-peer systems for distributed resource management.

resources

DXQ home page

Galax home page

DXQ: A Distributed XQuery Scripting Language. Mary Fernandez, Trevor Jim, Kristi Morton, Nicola Onose, Jerome Simeon. 4th Int'l Workshop on XQuery Implementation, Practice and Experience (XIME-P), June 2007.
Highly Distributed XQuery with DXQ. Mary Fernandez, Trevor Jim, Kristi Morton, Nicola Onose, and Jerome Simeon. 27th ACM SIGMOD International Conference on Management of Data, demo paper, June 2007.

iMobile

resources

An Alerting and Notification Service on the AT&T Enterprise Messaging Network. Serban Jora, Rittwik Jana, Yih-Farn Chen, Matti Hiltunen, Trevor Jim, Huale Huang, Ashish K. Singh, and Radhakrishnan Muthumanickam. IASTED International Conference on Internet and Multimedia Systems and Applications (EuroIMSA 2005), February 2005.
iMobile EE: An Enterprise Mobile Service Platform. Yih-Farn Chen, Huale Huang, Rittwik Jana, Trevor Jim, Matti Hiltunen, Sam John, Servan Jora, Radhakrishnan Muthumanickam, and Bin Wei. ACM Journal on Wireless Networks, July 2003.

DOSE

I built a distributed database system called DOSE with Christof Fetzer. DOSE took a free, off-the-shelf database and deployed it on a cluster to create a cheap dependable distributed database—the RAID idea applied to databases instead of disk drives.

resources

Dependable Distributed Computing using Free Databases. Christof Fetzer and Trevor Jim. Service Availability: Second International Service Availability Symposium, ISAS 2005, pages 123–136, Lecture Notes in Computer Science, Volume 3694, April 2005.

SwitchWare

The SwitchWare project at the University of Pennsylvania built a network in which packets contain small programs that execute at each node they pass through. We used my QCM trust management system to implement security policies in the network.

resources

The SwitchWare Active Network Implementation. D. Scott Alexander, Michael W. Hicks, Pankaj Kakkar, Angelos D. Keromytis, Marianne Shaw, Jonathan T. Moore, Carl A. Gunter, Trevor Jim, Scott M. Nettles, and Jonathan M. Smith. Notes of the ACM SIGPLAN Workshop on ML, 67–76, September 1998.

File synchronization

I implemented a file synchronizer called Unison with Benjamin Pierce. Unison supports bi-directional, cross-platform synchronization.

Unison is widely used and is included with most Linux distributions.

resources

Unison home page