Papers
EyeDecrypt—Private Interactions in Plain Sight abstract, preprint, demo.
Andrea Forte, Juan Garay, Trevor Jim, and Yevgeniy Vahlis. 9th Conference on Security and Cryptography for Networks (SCN 2014), September 3-5, 2014.
We introduce EyeDecrypt, a novel technology for privacy-preserving human-computer interaction. EyeDecrypt allows only authorized users to decipher data shown on a display, such as an electronic screen or plain printed material; in the former case, the authorized user can then interact with the system (e.g., by pressing buttons on the screen), without revealing the details of the interaction to others who may be watching or to the system itself. The user views the decrypted data on a closely-held personal device, such as a pair of smart glasses with a camera and heads-up display, or a smartphone. The data is displayed as an image overlay on the personal device, which we assume cannot be viewed by the adversary. The overlay is a form of augmented reality that not only allows the user to view the protected data, but also to securely enter input into the system by randomizing the input interface. EyeDecrypt consists of three main components: a visualizable encryption scheme; a dataglyph-based visual encoding scheme for the ciphertexts generated by the encryption scheme; and a randomized input and augmented reality scheme that protects user inputs without harming usability. We describe all aspects of EyeDecrypt, from security definitions, constructions and analysis, to implementation details of a prototype developed on a smartphone.
Why is My Smartphone Slow? On The Fly Diagnosis of Poor Performance on Mobile Internet abstract, pdf.
Chaitrali Amrutkar, Matti Hiltunen, Trevor Jim, Kaustubh Joshi, Oliver Spatscheck, Patrick Traynor, and Shobha Venkataraman. The 43rd Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN 2013), June 24-27, 2013, Budapest.
The perceived end-to-end performance of the mobile Internet can be impacted by multiple factors including websites, devices, and network components. Constant changes in these factors and network complexity make identifying root causes of high latency difficult. In this paper, we propose a multidimensional diagnosis technique using passive IP flow data collected at ISPs for investigating factors that impact the performance of the mobile Internet. We implement and evaluate our technique over four days of data from a major US cellular provider’s network. Our approach identifies several combinations of factors affecting performance. We investigate four combinations in-depth to confirm the latency causes chosen by our technique. Our findings include a popular gaming website showing poor performance on a specific device type for over 50% of the flows, and web browser traffic on older devices accounting for 99% of poorly performing traffic. Our technique can help operators identify the factors with the highest impact on latency in the mobile Internet.
Delayed Semantic Actions in Yakker abstract, pdf.
Trevor Jim and Yitzhak Mandelbaum. 11th International Workshop on Language Descriptions, Tools, and Applications (LDTA), March 2011.
Yakker is a parser generator that supports both semantic actions and speculative parsing techniques such as backtracking and context-free lookahead. To avoid executing semantic actions in parses which will eventually be discarded, we divide parsing into two phases. In the first (early) phase, the parser explores multiple possible parse trees before settling on a final parse tree or parse trees, without executing semantic actions. The second (late) phase executes the delayed semantic actions.

We structure the early phase as a transducer which maps the input language to an output language of labels. A string in the output language is a history of the semantic actions that would have been executed in a parse of the input. The late phase is implemented as a deterministic, recursive descent parse of the history.

We formalize delayed semantic actions and discuss a number of practical issues involved in implementing them in Yakker, including our support for regular right part grammars and dependent parsing, the design of the history data structure, and memory management techniques critical for efficient implementation.
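
To make the two-phase scheme concrete, here is a minimal OCaml sketch, not Yakker's actual API: the early phase is assumed to emit a history of labels, and the late phase replays that history, looking up and running the delayed actions. The label and action names are invented for illustration.

    (* A minimal sketch of the two-phase idea, not Yakker's actual API: the early
       phase parses without running semantic actions and records only a history
       of labels; the late phase replays the history, executing the delayed
       actions deterministically. *)

    type label =
      | Pos of int          (* input position reached *)
      | Act of int          (* which delayed semantic action fired *)

    type history = label list   (* the output of the early-phase transducer *)

    (* Hypothetical table of delayed semantic actions, indexed by label. *)
    let actions : (int, int -> unit) Hashtbl.t = Hashtbl.create 16

    (* Late phase: a deterministic left-to-right walk over the history. *)
    let replay (h : history) : unit =
      let pos = ref 0 in
      List.iter
        (function
          | Pos p -> pos := p
          | Act i ->
              (match Hashtbl.find_opt actions i with
               | Some f -> f !pos                  (* run the action at its position *)
               | None -> failwith "unknown action label"))
        h

    let () =
      Hashtbl.replace actions 1 (fun p -> Printf.printf "action 1 at position %d\n" p);
      replay [ Pos 0; Act 1; Pos 5; Act 1 ]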

Semantics and Algorithms for Data-dependent Grammars abstract, pdf.
Trevor Jim, Yitzhak Mandelbaum, and David Walker. 37th ACM SIGACT-SIGPLAN Symposium on Principles of Programming Languages (POPL), January 2010.
Traditional parser generation technologies are incapable of handling the demands of modern programmers. In this paper, we present the design and theory of a new parsing engine, Yakker, capable of handling the requirements of modern applications including (1) full scannerless context-free grammars with (2) regular expressions as right-hand sides for defining nonterminals. Yakker also includes (3) facilities for binding variables to intermediate parse results and (4) using such bindings within arbitrary constraints to control parsing. These facilities allow the kind of data-dependent parsing commonly needed in systems applications, particularly those that operate over binary data. In addition, (5) nonterminals may be parameterized by arbitrary values, which gives the system good modularity and abstraction properties in the presence of data-dependent parsing. Finally, (6) legacy parsing libraries, such as sophisticated libraries for dates and times, may be directly incorporated into parser specifications. We illustrate the importance and utility of this rich format specification language by presenting its use on examples ranging from difficult programming language grammars to web server logs to binary data specification. We also show that our grammars have important compositionality properties and explain why such properties are important in modern applications such as automatic grammar induction.

In terms of technical contributions, we provide a traditional high-level semantics for our new grammar formalization and show how to compile these grammars into nondeterministic automata. These automata are stack-based, somewhat like conventional push-down automata, but are also equipped with environments to track data-dependent parsing state. We prove the correctness of our translation of data-dependent grammars into these new automata and then show how to implement the automata efficiently using a variation of Earley's parsing algorithm.
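
As a small illustration of data-dependent parsing in the spirit of features (3) and (4), the following OCaml sketch (ordinary parser combinators, not Yakker itself) binds a length field and uses it to constrain the rest of the parse, as in length-prefixed binary formats.

    (* A minimal combinator sketch of data-dependent parsing: a value bound
       earlier in the parse (a length field) constrains what is parsed next. *)

    type 'a parser = string -> int -> ('a * int) option   (* input, offset -> value, new offset *)

    let return x : 'a parser = fun _ i -> Some (x, i)

    let ( >>= ) (p : 'a parser) (f : 'a -> 'b parser) : 'b parser =
      fun s i -> match p s i with
        | Some (x, j) -> f x s j        (* the rest of the grammar may depend on x *)
        | None -> None

    (* A decimal digit sequence, interpreted as an integer length. *)
    let number : int parser = fun s i ->
      let j = ref i in
      while !j < String.length s && s.[!j] >= '0' && s.[!j] <= '9' do incr j done;
      if !j = i then None else Some (int_of_string (String.sub s i (!j - i)), !j)

    (* Exactly n bytes: the data-dependent part. *)
    let take (n : int) : string parser = fun s i ->
      if i + n <= String.length s then Some (String.sub s i n, i + n) else None

    let char c : unit parser = fun s i ->
      if i < String.length s && s.[i] = c then Some ((), i + 1) else None

    (* field ::= number ':' payload, where the payload has exactly `number` bytes *)
    let field : string parser =
      number >>= fun n -> char ':' >>= fun () -> take n

    let () =
      match field "5:hello world" 0 with
      | Some (payload, _) -> print_endline payload     (* prints "hello" *)
      | None -> print_endline "parse error"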

An Emulation of GENI Access Control abstract, pdf.
Soner Sevinc, Larry Peterson, Trevor Jim, and Mary Fernandez. 2nd Workshop on Cyber Security Experimentation and Test (CSET), August 2009.
This paper describes an emulation of a distributed access control system proposed for use in the GENI network testbed. We use our trust management system, CERT-DIST, to realize the system policy, and measure its performance by mapping PlanetLab’s centralized access control scheme to GENI’s distributed scheme and then replaying logs of PlanetLab access control activity. Our log analysis indicates that any such system must be resilient to both misconfigurations and attacks, and our emulation results show the effect of caching schemes and certificate expiration intervals in reducing load on servers and improving response time.
Using Static Analysis for Ajax Intrusion Detection abstract, pdf.
Arjun Guha, Shriram Krishnamurthi, and Trevor Jim. International World Wide Web Conference (WWW), April 2009.
We present a static control-flow analysis for JavaScript programs running in a web browser. Our analysis tackles numerous challenges posed by modern web applications including asynchronous communication, frameworks, and dynamic code generation. We use our analysis to extract a model of expected client behavior as seen from the server, and build an intrusion-prevention proxy for the server: the proxy intercepts client requests and disables those that do not meet the expected behavior. We insert random asynchronous requests to foil mimicry attacks. Finally, we evaluate our technique against several real applications and show that it protects against an attack in a widely-used web application.
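
The OCaml sketch below shows only the enforcement side of such a proxy, assuming the static analysis has already produced a model mapping client states to the requests the page could legitimately issue next; the model, state names, and request paths here are invented for illustration.

    (* A minimal sketch of the enforcement check only (the static analysis that
       produces the model is not shown).  The model maps each client state to
       the set of server requests the page could issue next; the proxy drops
       anything outside that set. *)

    module S = Set.Make (String)

    type model = (string, S.t) Hashtbl.t   (* state -> allowed request paths *)

    let allowed (m : model) (state : string) (request : string) : bool =
      match Hashtbl.find_opt m state with
      | Some reqs -> S.mem request reqs
      | None -> false

    let () =
      let m : model = Hashtbl.create 8 in
      Hashtbl.replace m "inbox" (S.of_list [ "/fetch_mail"; "/mark_read" ]);
      (* "/delete_all" is not reachable from the inbox page in this model,
         so the proxy refuses to forward it. *)
      Printf.printf "forward /fetch_mail: %b\n" (allowed m "inbox" "/fetch_mail");
      Printf.printf "forward /delete_all: %b\n" (allowed m "inbox" "/delete_all")
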
Efficient Earley Parsing with Regular Right-hand Sides abstract, pdf.
Trevor Jim and Yitzhak Mandelbaum. Workshop on Language Descriptions, Tools, and Applications (LDTA), March 2009.
We present a new variant of the Earley parsing algorithm capable of efficiently supporting context-free grammars with regular right-hand sides. We describe the core state-machine-driven algorithm, the translation of grammars into state machines, and the reconstruction algorithm. We also include a theoretical framework for presenting the algorithm and for evaluating optimizations. Finally, we evaluate the algorithm by testing its implementation.
Highly Distributed XQuery with DXQ abstract, pdf.
Mary Fernandez, Trevor Jim, Kristi Morton, Nicola Onose, and Jerome Simeon. 27th ACM SIGMOD International Conference on Management of Data, demo paper, June 2007.
Many modern applications, from Grid computing to RSS handling, need to support data processing in a distributed environment. Currently, most such applications are implemented using a general purpose programming language, which can be expensive to maintain, hard to configure and modify, and may require hand optimization of the distributed data processing operations. We present Distributed XQuery (DXQ), a simple, yet powerful, extension of XQuery to support distributed applications. This extension includes the ability to deploy networks of XQuery computing nodes, to remotely invoke XQuery programs on those nodes, and to ship code between nodes. We demonstrate the capabilities of DXQ by building a network of name servers that capture the functionality of DNS, the Domain Name System. We show that our system can flexibly accommodate different patterns of distributed computation, and discuss some simple but essential distributed optimizations.
Defeating Script Injection Attacks with Browser-Enforced Embedded Policies abstract, pdf.
Trevor Jim, Nikhil Swamy, and Michael Hicks. 16th International World Wide Web Conference (WWW 2007), May 2007.
Web sites that accept and display content such as wiki articles or comments typically filter the content to prevent injected script code from running in browsers that view the site. The diversity of browser rendering algorithms and the desire to allow rich content make filtering quite difficult, however, and attacks such as the Samy and Yamanner worms have exploited filtering weaknesses. To solve this problem, this paper proposes a simple mechanism called Browser-Enforced Embedded Policies (BEEP). The idea is that a web site can embed a policy inside its pages that specifies which scripts are allowed to run. The browser, which knows exactly when it will run a script, can enforce this policy perfectly. We have added BEEP support to several browsers, and built tools to simplify adding policies to web applications. We found that supporting BEEP in browsers requires only small and localized modifications, modifying web applications requires minimal effort, and enforcing policies is generally lightweight.
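
As a rough illustration of the whitelist flavor of a BEEP policy, the OCaml sketch below checks a script's hash against a list of hashes the site embeds with its pages. The real policy is JavaScript run by a modified browser, and the use of MD5 via OCaml's Digest module is only a stand-in for whatever hash the policy would actually use.

    (* Illustration only: BEEP's actual policy is a JavaScript hook embedded in
       the page and invoked by a modified browser before each script runs.
       This sketch shows the whitelist idea; Digest (MD5) is a stand-in hash. *)

    module S = Set.Make (String)

    (* Hashes of the scripts the site intends to run, embedded with the page. *)
    let whitelist =
      S.of_list [ Digest.to_hex (Digest.string "function greet(){alert('hi')}") ]

    (* Called by the (hypothetical) browser hook with the source of each script
       it is about to execute, including any injected via user-supplied content. *)
    let may_run (script_source : string) : bool =
      S.mem (Digest.to_hex (Digest.string script_source)) whitelist

    let () =
      Printf.printf "site script: %b\n" (may_run "function greet(){alert('hi')}");
      Printf.printf "injected:    %b\n" (may_run "document.location='http://evil'")
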
System Call Monitoring Using Authenticated System Calls abstract, pdf.
Mohan Rajagopalan, Matti Hiltunen, Trevor Jim, and Richard Schlichting. IEEE Transactions on Dependable and Secure Computing, 3(3), 216–229, July–September 2006.
System call monitoring is a technique for detecting and controlling compromised applications by checking at runtime that each system call conforms to a policy that specifies the program's normal behavior. Here, we introduce a new approach to implementing system call monitoring based on authenticated system calls. An authenticated system call is a system call augmented with extra arguments that specify the policy for that call, and a cryptographic message authentication code that guarantees the integrity of the policy and the system call arguments. This extra information is used by the kernel to verify the system call. The version of the application in which regular system calls have been replaced by authenticated calls is generated automatically by an installer program that reads the application binary, uses static analysis to generate policies, and then rewrites the binary with the authenticated calls. This paper presents the approach, describes a prototype implementation based on Linux and the PLTO binary rewriting system, and gives experimental results suggesting that the approach is effective in protecting against compromised applications at modest cost.
Authenticated System Calls abstract, pdf.
Mohan Rajagopalan, Matti Hiltunen, Trevor Jim, and Richard Schlichting. DSN–2005: The International Conference on Dependable Systems and Networks, June 2005.
System call monitoring is a technique for detecting and controlling compromised applications by checking at runtime that each system call conforms to a policy that specifies the program's normal behavior. Here, a new approach to system call monitoring based on authenticated system calls is introduced. An authenticated system call is a system call augmented with extra arguments that specify the policy for that call and a cryptographic message authentication code (MAC) that guarantees the integrity of the policy and the system call arguments. This extra information is used by the kernel to verify the system call. The version of the application in which regular system calls have been replaced by authenticated calls is generated automatically by an installer program that reads the application binary, uses static analysis to generate policies, and then rewrites the binary with the authenticated calls. This paper presents the approach, describes a prototype implementation based on Linux and the PLTO binary rewriting system, and gives experimental results suggesting that the approach is effective in protecting against compromised applications at modest cost.
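
The toy OCaml sketch below illustrates the flow of an authenticated system call: the installer attaches a per-call-site policy and a MAC, and the kernel recomputes the MAC before allowing the call. The key, the policy encoding, and the use of MD5 over a key-prefixed string are all placeholders; a real implementation would use a proper keyed MAC such as HMAC.

    (* A toy sketch of the authenticated-system-call check; only the flow is
       meaningful.  Digest (MD5) over a key-prefixed string stands in for a
       real keyed MAC, and the key, policy format, and numbers are invented. *)

    let key = "secret shared between installer and kernel"   (* hypothetical *)

    type authenticated_call = {
      call_site : int;        (* address of the call in the rewritten binary *)
      syscall   : int;        (* system call number *)
      policy    : string;     (* e.g. allowed arguments for this site *)
      mac       : string;     (* computed by the installer *)
    }

    let mac_of site syscall policy =
      Digest.to_hex (Digest.string (Printf.sprintf "%s|%d|%d|%s" key site syscall policy))

    (* Installer side: rewrite a call site into an authenticated call. *)
    let install ~call_site ~syscall ~policy =
      { call_site; syscall; policy; mac = mac_of call_site syscall policy }

    (* Kernel side: verify before executing the system call. *)
    let verify (c : authenticated_call) : bool =
      c.mac = mac_of c.call_site c.syscall c.policy

    let () =
      let c = install ~call_site:0x401234 ~syscall:2 ~policy:"open:/etc/app.conf,O_RDONLY" in
      Printf.printf "legitimate call verifies: %b\n" (verify c);
      Printf.printf "tampered policy rejected: %b\n"
        (not (verify { c with policy = "open:/etc/shadow,O_RDWR" }))
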
Dependable Distributed Computing using Free Databases abstract, view online.
Christof Fetzer and Trevor Jim. Service Availability: Second International Service Availability Symposium, ISAS 2005, pages 123–136, Lecture Notes in Computer Science, Volume 3694, April 2005.
Designing and programming dependable distributed applications is very difficult. Databases provide features like transactions and replication that can help in the implementation of dependable applications. In particular, various free databases make it economically feasible to run a database on each computer in a system. Hence, one can partition database tables across multiple hosts to harness the processing power and disks of multiple machines. We describe DOSE, a system that simplifies partitioning tables across multiple hosts. DOSE exposes the partitions to the programmer rather than giving the illusion of a single table. Our focus is on providing a simple implementation that works for freely available databases, on automatic tuning of the partitions for best performance, and on applying the fault tolerance mechanisms of the databases to build dependable distributed systems. We show how we use this system to implement a distributed work queue.
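
A minimal OCaml sketch of the partitioning idea only, assuming hash partitioning of keys over a fixed set of hosts (DOSE's actual interface, automatic tuning, and fault-tolerance machinery are not shown):

    (* Each row key is mapped to one of n hosts, each running its own free
       database; queries on that key go to that host's partition of the table. *)

    let hosts = [| "db0.example.com"; "db1.example.com"; "db2.example.com" |]

    let host_for (key : string) : string =
      hosts.(Hashtbl.hash key mod Array.length hosts)

    let () =
      List.iter
        (fun k -> Printf.printf "key %-10s goes to %s\n" k (host_for k))
        [ "job-17"; "job-18"; "job-19" ]
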
Enhancing DNS Security using the SSL Trust Infrastructure abstract, pdf.
Christof Fetzer, Gert Pfeifer, and Trevor Jim. Tenth IEEE International Workshop on Object-oriented Real-time Dependable Systems (WORDS 2005), February 2005.
The main functionality of the Domain Name System (DNS) is to translate symbolic names into IP addresses. Due to the criticality of DNS for the proper functioning of the Internet, many improvements have been proposed for DNS in terms of security and dependability. However, the current secure DNS (DNSSEC) standard still has several problems that need further consideration. For example, online updates and denial of service attacks are not sufficiently addressed. These problems are serious obstacles that might prevent DNSSEC from replacing the traditional DNS. In this paper we discuss several of these technical and economic problems. To address these issues, we propose a simple extension to the existing DNS. It is SSL-based, and individual domains can decide independently of one another if and when to adopt the extensions. We show how to implement these extensions with the help of a simple proxy DNS server.
An Alerting and Notification Service on the AT&T Enterprise Messaging Network
Serban Jora, Rittwik Jana, Yih-Farn Chen, Matti Hiltunen, Trevor Jim, Huale Huang, Ashish K. Singh, and Radhakrishnan Muthumanickam. IASTED International Conference on Internet and Multimedia Systems and Applications (EuroIMSA 2005), February 2005.
Cyclone: A Type-Safe Dialect of C
Dan Grossman, Michael Hicks, Trevor Jim, and Greg Morrisett. C/C++ Users Journal, January 2005.
Automatic Discovery of Covariant Read-Only Fields abstract.
Jens Palsberg, Tian Zhao, and Trevor Jim. ACM Transactions on Programming Languages and Systems, Volume 27, Issue 1, January 2005. An earlier version appeared in Ninth International Workshop on Foundations of Object-Oriented Languages, January 2002.
Read-only fields are useful in object calculi, pi calculi, and statically typed intermediate languages because they admit covariant subtyping, unlike updateable fields. For example, Glew's translation of classes and objects to an intermediate calculus relies crucially on covariant subtyping of read-only fields to ensure that subclasses are translated to subtypes. In this article, we present a type inference algorithm for an Abadi-Cardelli object calculus in which fields are marked either as updateable or as read-only. The type inference problem is P-complete, and our algorithm runs in O(n³) time. The same complexity results hold for the calculus in which the fields are not explicitly annotated as updateable or read-only; perhaps surprisingly, the annotations do not make type inference easier. We show that type inference is equivalent to the problem of solving type constraints, and this forms the core of our algorithm and implementation.
Experience with Safe Manual Memory-Management in Cyclone abstract.
Michael Hicks, Greg Morrisett, Dan Grossman, and Trevor Jim. The 2004 International Symposium on Memory Management, October 2004.
The goal of the Cyclone project is to investigate type safety for low-level languages such as C. Our most difficult challenge has been providing programmers control over memory management while retaining type safety. This paper reports on our experience trying to integrate and effectively use two previously proposed, type-safe memory management mechanisms: statically-scoped regions and unique pointers. We found that these typing mechanisms can be combined to build alternative memory-management abstractions, such as reference counted objects and arenas with dynamic lifetimes, and thus provide a flexible basis. Our experience—porting C programs and building new applications for resource-constrained systems—confirms that experts can use these features to improve memory footprint and sometimes to improve throughput when used instead of, or in combination with, conservative garbage collection.
Incentives and Disincentives for DNSSEC Deployment abstract, pdf.
Christof Fetzer and Trevor Jim. March 1, 2004.
Like many of the Internet's foundational infrastructures, the Domain Name System (DNS) suffers from a number of security vulnerabilities. Efforts to produce a more secure successor, DNSSEC, have been underway for more than ten years, and a specification is only now reaching a final form; this is an eternity in Internet time. We argue that the reasons for this delay are mainly economic rather than technical, and that these reasons will persist even after the specification is finalized and robust implementations are available. Thus, widespread adoption of DNSSEC is by no means certain. We offer several suggestions to increase DNSSEC's odds of success.
Combining Garbage Collection and Safe Manual Memory Management pdf.
Michael Hicks, Dan Grossman, and Trevor Jim. Second workshop on Semantics, Program Analysis, and Computing Environments for Memory Management (SPACE 2004), Venice, Italy, January 2004.
iMobile EE: An Enterprise Mobile Service Platform abstract, pdf.
Yih-Farn Chen, Huale Huang, Rittwik Jana, Trevor Jim, Matti Hiltunen, Sam John, Serban Jora, Radhakrishnan Muthumanickam, and Bin Wei. ACM Journal on Wireless Networks, July 2003.
iMobile is an enterprise mobile service platform that allows resource-limited mobile devices to communicate with each other and to securely access corporate contents and services. The original iMobile architecture consists of devlets that provide protocol interfaces to different mobile devices and infolets that access and transcode information based on device profiles. iMobile Enterprise Edition (iMobile EE) is a redesign of the original iMobile architecture to address the security, scalability, and availability requirements of a large enterprise such as AT&T. iMobile EE incorporates gateways that interact with corporate authentication services, replicated iMobile servers with backend connections to corporate services, a reliable message queue that connects iMobile gateways and servers, and a comprehensive service profile database that governs operations of the mobile service platform. The iMobile EE architecture was also extended to provide personalized multimedia services, allowing mobile users to remotely control, record, and request video contents. iMobile EE aims to provide a scalable, secure, and modular software platform that makes enterprise services easily accessible to a growing list of mobile devices roaming among various wireless networks.
Compiling for Template-Based Run-Time Code Generation abstract.
Frederick Smith, Dan Grossman, Greg Morrisett, Luke Hornof, and Trevor Jim. Journal of Functional Programming, May 2003.
Cyclone is a type-safe programming language that provides explicit run-time code generation. The Cyclone compiler uses a template-based strategy for run-time code generation in which pre-compiled code fragments are stitched together at run time. This strategy keeps the cost of code generation low, but it requires that optimizations, such as register allocation and code motion, be applied to templates at compile time. This paper describes a principled approach to implementing such optimizations. In particular, we generalize standard flow-graph intermediate representations to support templates, define a formal mapping from (a subset of) Cyclone to this representation, and describe a dataflow-analysis framework that supports standard optimizations across template boundaries.
Region-based Memory Management in Cyclone abstract, pdf.
Dan Grossman, Greg Morrisett, Trevor Jim, Michael Hicks, Yanling Wang, and James Cheney. ACM Conference on Programming Language Design and Implementation, pages 282–293, Berlin, Germany, June 2002.
Cyclone is a type-safe programming language derived from C. The primary design goal of Cyclone is to let programmers control data representation and memory management without sacrificing type-safety. In this paper, we focus on the region-based memory management of Cyclone and its static typing discipline. The design incorporates several advancements, including support for region subtyping and a coherent integration with stack allocation and a garbage collector. To support separate compilation, Cyclone requires programmers to write some explicit region annotations, but a combination of default annotations, local type inference, and a novel treatment of region effects reduces this burden. As a result, we integrate C idioms in a region-based framework. In our experience, porting legacy C to Cyclone has required altering about 8% of the code; of the changes, only 6% (of the 8%) were region annotations.
Cyclone: A Safe Dialect of C abstract, pdf.
Trevor Jim, Greg Morrisett, Dan Grossman, Michael Hicks, James Cheney, and Yanling Wang. USENIX Annual Technical Conference, pages 275–288, Monterey, CA, June 2002.
Cyclone is a safe dialect of C. It has been designed from the ground up to prevent the buffer overflows, format string attacks, and memory management errors that are common in C programs, while retaining C's syntax and semantics. This paper examines safety violations enabled by C's design, and shows how Cyclone avoids them, without giving up C's hallmark control over low-level details such as data representation and memory management.
SD3: A Trust Management System with Certified Evaluation abstract, pdf.
Trevor Jim. IEEE Symposium on Security and Privacy, May 2001.
We introduce SD3, a trust management system consisting of a high-level policy language, a local policy evaluator, and a certificate retrieval system. A unique feature of SD3 is its certified evaluator: as the evaluator computes the answer to a query, it also computes a proof that the answer follows from the security policy. Before the answer is returned, the proof is passed through a simple checker, and incorrect proofs are reported as errors. The certified evaluator reduces the trusted computing base and greatly increases our confidence that the answers produced by the evaluator follow from the specification, despite complex optimizations. To illustrate SD3's capabilities, we show how to implement a secure name service, similar to DNSSEC, entirely in SD3.
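
The toy OCaml checker below conveys the shape of certified evaluation, though not SD3's actual proof format: the evaluator is assumed to return a derivation alongside its answer, and a small independent checker verifies that every step is an instance of a policy rule whose premises were derived earlier.

    (* A toy illustration of certified evaluation.  Policies are Horn clauses
       over ground atoms, a proof is an ordered list of (fact, rule) steps, and
       the checker trusts nothing but the policy and the proof itself. *)

    type atom = string                                  (* a ground fact, e.g. "can_read(alice)" *)
    type rule = { head : atom; body : atom list }       (* Horn clause over ground atoms *)
    type proof = (atom * rule) list                     (* derivation, in order *)

    let check (policy : rule list) (p : proof) (answer : atom) : bool =
      let derived = Hashtbl.create 16 in
      let step_ok (fact, r) =
        List.mem r policy                               (* the rule really is in the policy *)
        && r.head = fact                                (* and concludes this fact *)
        && List.for_all (Hashtbl.mem derived) r.body    (* from already-derived premises *)
        && (Hashtbl.replace derived fact (); true)
      in
      List.for_all step_ok p && Hashtbl.mem derived answer

    let () =
      let r1 = { head = "employee(alice)"; body = [] } in
      let r2 = { head = "can_read(alice)"; body = [ "employee(alice)" ] } in
      let policy = [ r1; r2 ] in
      let proof = [ ("employee(alice)", r1); ("can_read(alice)", r2) ] in
      Printf.printf "proof accepted: %b\n" (check policy proof "can_read(alice)")
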
Dynamically Distributed Query Evaluation pdf.
Trevor Jim and Dan Suciu. ACM Symposium on Principles of Database Systems, May 2001.
Certificate Distribution with Local Autonomy abstract, pdf.
Pankaj Kakkar, Michael McDougall, Carl A. Gunter, and Trevor Jim. The Second International Working Conference on Active Networks, October 2000.
Any security architecture for a wide area network system spanning multiple administrative domains will require support for policy delegation and certificate distribution across the network. Practical solutions will support local autonomy requirements of participating domains by allowing local policies to vary but imposing restrictions to ensure overall coherence of the system. This paper describes the design of such a system to control access to experiments on the ABONE active network testbed. This is done through a special-purpose language extending the Query Certificate Manager (QCM) system to include protocols for secure mirroring. Our approach allows significant local autonomy while ensuring global security of the system by integrating verification with retrieval. This enables transparent support for a variety of certificate distribution protocols. We analyze requirements of the ABONE application, describe the design of a security infrastructure for it, and discuss steps toward implementation, testing and deployment of the system.
Generalized Certificate Revocation abstract, pdf.
Carl A. Gunter and Trevor Jim. ACM Symposium on Principles of Programming Languages, January 2000.
We introduce a language for creating and manipulating certificates, that is, digitally signed data based on public key cryptography, and a system for revoking certificates. Our approach provides a uniform mechanism for secure distribution of public key bindings, authorizations, and revocation information. An external language for the description of these and other forms of data is compiled into an intermediate language with a well-defined denotational and operational semantics. The internal language is used to carry out consistency checks for security, and optimizations for efficiency. Our primary contribution is a technique for treating revocation data dually to other sorts of information using a polarity discipline in the intermediate language.
A Polar Type System abstract, ps.gz.
Trevor Jim. Workshop on Intersection Types and Related Systems, July 2000.
We introduce a restriction of the type system of intersection types and universal type quantification, in which quantifiers must only appear at positive occurrences in types, and intersections at negative occurrences. We show that the system is an extension of the system of rank two intersection types, and, therefore, of ML. The system is much more powerful than ML; for example, all pure normal forms are typable in the system. We show that the system has principal pairs, and that type inference for the system is decidable. These properties are retained when the system is extended by a form of polymorphic recursion.
What is QCM? pdf.
Carl A. Gunter and Trevor Jim. August 1999.
Policy-Directed Certificate Retrieval abstract, ps.
Carl A. Gunter and Trevor Jim. Technical Report MS–CIS–99–07, Department of Computer and Information Science, University of Pennsylvania. September 1998. A revised version is published in Software Practice and Experience, 30(15):1609–1640, 2000.
Any large scale security architecture that uses certificates to provide security in a distributed system will need some automated support for moving certificates around in the network. We believe that for efficiency, this automated support should be tied closely to the consumer of the certificates: the policy verifier. As a proof of concept, we have built QCM, a prototype policy language and verifier that can direct a retrieval mechanism to obtain certificates from the network. Like previous verifiers, QCM takes a policy and certificates supplied by a requester and determines whether the policy is satisfied. Unlike previous verifiers, QCM can take further action if the policy is not satisfied: QCM can examine the policy to decide what certificates might help satisfy it and obtain them from remote servers on behalf of the requester. This takes place automatically, without intervention by the requester; there is no additional burden placed on the requester or the policy writer for the retrieval service we provide. We present examples that show how our technique greatly simplifies certificate-based secure applications ranging from key distribution to ratings systems, and that QCM policies are simple to write. We describe our implementation, and illustrate the operation of the prototype.
The SwitchWare Active Network Implementation ps.
D. Scott Alexander, Michael W. Hicks, Pankaj Kakkar, Angelos D. Keromytis, Marianne Shaw, Jonathan T. Moore, Carl A. Gunter, Trevor Jim, Scott M. Nettles, and Jonathan M. Smith. Notes of the ACM SIGPLAN Workshop on ML, 67–76, September 1998.
Certifying Compilation and Run-time Code Generation abstract, ps.gz.
Luke Hornof and Trevor Jim. Higher-Order and Symbolic Computation, 12(4):337–375, December 1999. An earlier version of this paper appeared in ACM Workshop on Partial Evaluation and Semantics-Based Program Manipulation, January 22–23, 1999.
A certifying compiler takes a source language program and produces object code, as well as a "certificate" that can be used to verify that the object code satisfies desirable properties, such as type safety and memory safety. Certifying compilation helps to increase both compiler robustness and program safety. Compiler robustness is improved since some compiler errors can be caught by checking the object code against the certificate immediately after compilation. Program safety is improved because the object code and certificate alone are sufficient to establish safety: even if the object code and certificate are produced on an unknown machine by an unknown compiler and sent over an untrusted network, safe execution is guaranteed as long as the code and certificate pass the verifier.

Existing work in certifying compilation has addressed statically generated code. In this paper, we extend this to code generated at run time. Our goal is to combine certifying compilation with run-time code generation to produce programs that are both verifiably safe and extremely fast. To achieve this goal, we present two new languages with explicit run-time code generation constructs: Cyclone, a type safe dialect of C, and TAL/T, a type safe assembly language. We have designed and implemented a system that translates a safe C program into Cyclone, which is then compiled to TAL/T, and finally assembled into executable object code. This paper focuses on our overall approach and the front end of our system; details about TAL/T will appear in a subsequent paper.

Design of an Application-Level Security Infrastructure abstract, ps.gz.
Carl A. Gunter and Trevor Jim. DIMACS Workshop on Design and Formal Verification of Security Protocols, September 3–5, 1997.
We propose a security infrastructure based on authenticated data distribution, implemented via the automatic management of queries and certificates. This approach is appropriate for an infrastructure to be used by application programmers, because they are not experts in cryptographic algorithms or security protocols. Our query certificate managers hide the use of cryptography and message sending from the programmer, and hence prevent programmer errors that could lead to security failures. The system has a formal semantics and correctness guarantees.
Type inference in systems of recursive types with subtyping abstract, ps.gz.
Trevor Jim and Jens Palsberg. July 1997.
We present general methods for performing type inference and deciding subtyping in languages with recursive types. Our type inference algorithm generalizes a common idea of previous work: type inference is reduced to a constraint satisfaction problem, whose satisfiability can be decided by a process of closure and consistency checking. We prove a general correctness theorem for this style of type inference. We define subtyping co-inductively, and we prove by co-induction that a closed and consistent constraint set has a solution. Our theorem makes it easier to find new type inference algorithms. For example, we provide definitions of closure and consistency for recursive types with a greatest type, but not a least type; we show that the definitions satisfy the conditions of our theorem; and the theorem immediately provides a type inference algorithm, thereby solving an open problem.
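
The OCaml sketch below illustrates the coinductive flavor of the approach on a tiny type language with a greatest type Top: cyclic OCaml values stand in for recursive types, and the set of already-assumed pairs plays the role of closure plus consistency checking. It is an illustration of the idea, not the constraint-based algorithm of the paper.

    (* Subtyping on (possibly cyclic) recursive types with a greatest type Top.
       The `seen` list records pairs already assumed, which is what guarantees
       termination on cycles; a mismatched pair of constructors is rejected,
       playing the role of a consistency failure. *)

    type ty = Top | Int | Arrow of ty * ty

    let subtype (s0 : ty) (t0 : ty) : bool =
      let seen : (ty * ty) list ref = ref [] in
      let rec go s t =
        List.exists (fun (a, b) -> a == s && b == t) !seen   (* already assumed *)
        || begin
             seen := (s, t) :: !seen;
             match s, t with
             | _, Top -> true                                 (* Top is the greatest type *)
             | Int, Int -> true
             | Arrow (s1, s2), Arrow (t1, t2) ->
                 go t1 s1 && go s2 t2                         (* contravariant / covariant *)
             | _ -> false                                     (* inconsistent: reject *)
           end
      in
      go s0 t0

    let () =
      let rec stream = Arrow (Int, stream) in                  (* stream = Int -> stream *)
      let rec stream2 = Arrow (Int, Arrow (Int, stream2)) in   (* its unrolling *)
      Printf.printf "stream <= Int -> Top : %b\n" (subtype stream (Arrow (Int, Top)));
      Printf.printf "unrolling <= stream  : %b\n" (subtype stream2 stream)
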
Shrinking Lambda Expressions in Linear Time abstract, ps.gz.
Andrew W. Appel and Trevor Jim. Journal of Functional Programming 7(5):515–540, September 1997.
Functional-language compilers often perform optimizations based on beta and delta reduction. To avoid speculative optimizations that can blow up the code size, we might wish to use only shrinking reduction rules guaranteed to make the program smaller: these include dead-variable elimination, constant folding, and a restricted beta rule that inlines only functions that are called just once.

The restricted beta rule leads to a shrinking rewrite system that has not previously been studied. We show some efficient normalization algorithms that are immediately useful in optimizing compilers; and we give a confluence proof for our system, showing that the choice of normalization algorithm does not affect final code quality.
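
A small OCaml sketch of two of the shrinking rules mentioned above, dead-variable elimination and inlining of a binding used exactly once, on a toy lambda-with-let language; binders are assumed to be uniquely named so naive substitution is safe, and none of this is the paper's actual algorithm.

    (* A census (count) drives a single shrinking pass: a let-bound value with
       no uses is dropped, and one with exactly one use is inlined (for a
       function, this is the "called just once" beta rule). *)

    type exp =
      | Var of string
      | Lam of string * exp
      | App of exp * exp
      | Let of string * exp * exp

    let rec count x = function
      | Var y -> if x = y then 1 else 0
      | Lam (_, e) -> count x e
      | App (e1, e2) -> count x e1 + count x e2
      | Let (_, e1, e2) -> count x e1 + count x e2

    let rec subst x v = function
      | Var y -> if x = y then v else Var y
      | Lam (y, e) -> Lam (y, subst x v e)
      | App (e1, e2) -> App (subst x v e1, subst x v e2)
      | Let (y, e1, e2) -> Let (y, subst x v e1, subst x v e2)

    let is_value = function Var _ | Lam _ -> true | _ -> false

    let rec shrink = function
      | Let (x, e1, e2) when is_value e1 && count x e2 = 0 -> shrink e2              (* dead *)
      | Let (x, e1, e2) when is_value e1 && count x e2 = 1 -> shrink (subst x e1 e2) (* inline once *)
      | Let (x, e1, e2) -> Let (x, shrink e1, shrink e2)
      | Lam (x, e) -> Lam (x, shrink e)
      | App (e1, e2) -> App (shrink e1, shrink e2)
      | Var _ as e -> e

    let rec show = function
      | Var x -> x
      | Lam (x, e) -> "(fun " ^ x ^ " -> " ^ show e ^ ")"
      | App (e1, e2) -> "(" ^ show e1 ^ " " ^ show e2 ^ ")"
      | Let (x, e1, e2) -> "(let " ^ x ^ " = " ^ show e1 ^ " in " ^ show e2 ^ ")"

    let () =
      let prog =
        Let ("f", Lam ("y", Var "y"),
          Let ("unused", Lam ("z", Var "z"),
            App (Var "f", Var "a")))
      in
      (* prints ((fun y -> y) a): the dead binding is gone and f was inlined *)
      print_endline (show (shrink prog))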

Type Inference with Simple Selftypes is NP-complete abstract, ps.gz.
Jens Palsberg and Trevor Jim. Nordic Journal of Computing 4(3):259–286, Fall 1997.
The metavariable self is fundamental in object-oriented languages. Typing self in the presence of inheritance has been studied by Abadi and Cardelli, Bruce, and others. A key concept in these developments is the notion of selftype, which enables flexible type annotations that are impossible with recursive types and subtyping. Bruce et al. demonstrated that, for the language TOOPLE, type checking is decidable. Open until now is the problem of type inference with selftype.

In this paper we present a simple type system with selftype, recursive types, and subtyping, and we prove that type inference is NP-complete. With recursive types and subtyping alone, type inference is known to be P-complete. Our example language is the object calculus of Abadi and Cardelli. Both our type inference algorithm and our lower bound are the first such results for a type system with selftype.

Full Abstraction and the Context Lemma abstract, ps.gz.
Trevor Jim and Albert R. Meyer. SIAM Journal on Computing 25(3), 663–696, June 1996.
It is impossible to add a combinator to PCF to achieve full abstraction for models such as Berry's stable domains in a way analogous to the addition of the "parallel-or" combinator that achieves full abstraction for the familiar cpo model. In particular, we define a general notion of rewriting system of the kind used for evaluating simply typed lambda terms in Scott's PCF. Any simply typed lambda calculus with such a "PCF-like" rewriting semantics is shown necessarily to satisfy Milner's Context Lemma. A simple argument demonstrates that any denotational semantics that is adequate for PCF, and in which certain simple Boolean functionals exist, cannot be fully abstract for any extension of PCF satisfying the Context Lemma. An immediate corollary is that stable domains cannot be fully abstract for any extension of PCF definable by PCF-like rules.
What are principal typings and what are they good for? abstract, ps.gz.
Trevor Jim. MIT/LCS TM–532, Massachusetts Institute of Technology, August 1995; revised November 1995. Extended version of a paper appearing in ACM Symposium on Principles of Programming Languages, 1996.
We demonstrate the pragmatic value of the principal typing property, a property distinct from ML's principal type property, by studying a type system with principal typings. The type system is based on rank 2 intersection types and is closely related to ML. Its principal typing property provides elegant support for separate compilation, including "smartest recompilation" and incremental type inference, and for accurate type error messages. Moreover, it motivates a new rule for typing recursive definitions that can type some interesting examples of polymorphic recursion.
Rank 2 type systems and recursive definitions abstract, ps.gz.
Trevor Jim. MIT/LCS TM–531, Massachusetts Institute of Technology, August 1995; revised November 1995.
We demonstrate an equivalence between the rank 2 fragments of the polymorphic lambda calculus (System F) and the intersection type discipline: exactly the same terms are typable in each system. An immediate consequence is that typability in the rank 2 intersection system is DEXPTIME-complete. We introduce a rank 2 system combining intersections and polymorphism, and prove that it types exactly the same terms as the other rank 2 systems. The combined system suggests a new rule for typing recursive definitions. The result is a rank 2 type system with decidable type inference that can type some interesting examples of polymorphic recursion. Finally, we discuss some applications of the type system in data representation optimizations such as unboxing and overloading.
Continuation-passing, closure-passing style abstract, ps.gz.
Andrew W. Appel and Trevor Jim. Proc. Sixteenth ACM Symposium on Principles of Programming Languages, 293–302, January 1989.
We implemented a continuation-passing style (CPS) code generator for ML. Our CPS language is represented as an ML datatype in which all functions are named and most kinds of ill-formed expressions are impossible. We separate the code generation into phases that rewrite this representation into ever-simpler forms. Closures are represented explicitly as records, so that closure strategies can be communicated from one phase to another. No stack is used. Our benchmark data shows that the new method is an improvement over our previous, abstract-machine based code generator.
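
The datatype below sketches the flavor of such a representation; it is not the compiler's actual datatype. It describes a first-order CPS language in which every function is named and closures are explicit records pairing a code label with the values of its free variables.

    (* A sketch of a first-order CPS intermediate language: all functions are
       named (Fix), calls never return (App), and closures are built and taken
       apart explicitly with Record and Select, so no stack is needed. *)

    type value =
      | Var of string
      | Label of string               (* the name of a known function *)
      | Int of int

    type cexp =
      | App of value * value list                         (* calls never return *)
      | Record of value list * string * cexp              (* build a record (closure), bind it *)
      | Select of int * value * string * cexp             (* project a field of a record *)
      | Fix of (string * string list * cexp) list * cexp  (* named, mutually recursive functions *)

    (* After CPS and closure conversion, (fun x -> x) 41 might look roughly like:
         fix f (env, x, k) = k x
         in let clo = {f} in
            let code = #0 clo in
            code (clo, 41, halt)                            *)
    let example =
      Fix ([ ("f", [ "env"; "x"; "k" ], App (Var "k", [ Var "x" ])) ],
           Record ([ Label "f" ], "clo",
             Select (0, Var "clo", "code",
               App (Var "code", [ Var "clo"; Int 41; Label "halt" ]))))

    let () = ignore example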

Proceedings

I organized the 2000 Workshop on Proof-Carrying Code, and its proceedings are available online.

Contact info

Trevor Jim / trevor@att.com