Papers
EyeDecrypt—Private Interactions in Plain Sight
abstract,
preprint,
demo.
Andrea Forte and Juan Garay and Trevor Jim and Yevgeniy Vahlis.
9th Conference on Security and Cryptography for Networks (SCN 2014),
September 3-5, 2014.
We introduce EyeDecrypt, a novel technology for privacy-preserving
human-computer interaction. EyeDecrypt allows only authorized users to
decipher data shown on a display, such as an electronic screen or
plain printed material; in the former case, the authorized user can
then interact with the system (e.g., by pressing buttons on the
screen), without revealing the details of the interaction to others
who may be watching or to the system itself. The user views the
decrypted data on a closely-held personal device, such as a pair of
smart glasses with a camera and heads-up display, or a smartphone. The
data is displayed as an image overlay on the personal device, which we
assume cannot be viewed by the adversary. The overlay is a form of
augmented reality that not only allows the user to view the protected
data, but also to securely enter input into the system by randomizing
the input interface. EyeDecrypt consists of three main components: a
visualizable encryption scheme; a dataglyph-based visual encoding
scheme for the ciphertexts generated by the encryption scheme; and a
randomized input and augmented reality scheme that protects user
inputs without harming usability. We describe all aspects of
EyeDecrypt, from security definitions, constructions and analysis, to
implementation details of a prototype developed on a smartphone.
Why is My Smartphone Slow? On The Fly Diagnosis of Poor Performance on Mobile Internet
abstract,
pdf.
Chaitrali Amrutkar, Matti Hiltunen, Trevor Jim, Kaustubh Joshi, Oliver Spatscheck, Patrick Traynor and Shobha Venkataraman.
The 43rd Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN 2013),
24-27th June 2013, Budapest.
The perceived end-to-end performance of the mobile Internet can be
impacted by multiple factors including websites, devices, and network
components. Constant changes in these factors and network complexity
make identifying root causes of high latency difficult. In this paper,
we propose a multidimensional diagnosis technique using passive IP
flow data collected at ISPs for investigating factors that impact the
performance of the mobile Internet. We implement and evaluate our
technique over four days of data from a major US cellular provider’s
network. Our approach identifies several combinations of factors
affecting performance. We investigate four combinations in-depth to
confirm the latency causes chosen by our technique. Our findings
include a popular gaming website showing poor performance on a
specific device type for over 50% of the flows and web browser traffic
on older devices accounting for 99% of poorly performing traffic. Our
technique can direct operators in choosing factors having high impact
on latency in the mobile Internet.
Delayed Semantic Actions in Yakker
abstract,
pdf.
Trevor Jim and Yitzhak Mandelbaum.
11th International Workshop on Language Descriptions, Tools, and
Applications (LDTA),
March, 2011.
Yakker is a parser generator that supports both semantic actions and
speculative parsing techniques such as backtracking and context-free
lookahead. To avoid executing semantic actions in parses which will
eventually be discarded, we divide parsing into two phases. In the
first (early) phase, the parser explores multiple possible parse trees
before settling on a final parse tree or parse trees, without
executing semantic actions. The second (late) phase executes the
delayed semantic actions.
We structure the early phase as a transducer which maps the input
language to an output language of labels. A string in the output
language is a history of the semantic actions that would have
been executed in a parse of the input. The late phase is implemented
as a deterministic, recursive descent parse of the history.
We formalize delayed semantic actions and discuss a number of
practical issues involved in implementing them in Yakker, including
our support for regular right part grammars and dependent parsing, the
design of the history data structure, and memory management techniques
critical for efficient implementation.
Semantics and Algorithms for Data-dependent Grammars
abstract,
pdf.
Trevor Jim, Yitzhak Mandelbaum, and David Walker.
37th ACM SIGACT-SIGPLAN Symposium on Principles of Programming Languages (POPL),
January, 2010.
Traditional parser generation technologies are incapable of handling
the demands of modern programmers. In this paper, we present the
design and theory of a new parsing engine, Yakker,
capable of handling the requirements of modern applications including
(1) full scannerless context-free grammars with (2) regular
expressions as right-hand sides for defining nonterminals. Yakker
also includes (3) facilities for binding variables to intermediate
parse results and (4) using such bindings within arbitrary constraints
to control parsing. These facilities allow the kind of data-dependent
parsing commonly needed in systems applications, particularly those
that operate over binary data. In addition, (5) nonterminals may be
parameterized by arbitrary values, which gives the system good
modularity and abstraction properties in the presence of
data-dependent parsing. Finally, (6) legacy parsing libraries, such
as sophisticated libraries for dates and times, may be directly
incorporated into parser specifications. We illustrate the importance
and utility of this rich format specification language by presenting
its use on examples ranging from difficult programming language
grammars to web server logs to binary data specification.
We also show that our grammars have important compositionality
properties and explain why such properties are important in modern
applications such as automatic grammar induction.
In terms of technical contributions, we provide a traditional
high-level semantics for our new grammar formalization
and show how to compile these grammars into nondeterministic automata.
These automata are stack-based, somewhat like conventional push-down automata,
but are also equipped with environments to track data-dependent parsing
state. We prove the correctness of our translation of data-dependent
grammars into these new automata and then show how to implement the
automata efficiently using a variation of Earley's parsing algorithm.
An Emulation of GENI Access Control
abstract,
pdf.
Soner Sevinc, Larry Peterson, Trevor Jim, and Mary Fernandez.
2nd Workshop on Cyber Security Experimentation and Test (CSET),
August, 2009.
This paper describes an emulation of a distributed access control
system proposed for use in the GENI network testbed. We use our trust
management system, CERT-DIST, to realize the system policy, and
measure its performance by mapping PlanetLab’s centralized access
control scheme to GENI’s distributed scheme and then replaying logs of
PlanetLab access control activity. Our log analysis indicates that any
such system must be resilient to both misconfigurations and attacks,
and our emulation results show the effect of caching schemes and
certificate expiration intervals in reducing load on servers and
improving response time.
Using Static Analysis for Ajax Intrusion Detection
abstract,
pdf.
Arjun Guha, Shriram Krishnamurthi, and Trevor Jim.
International World Wide Web Conference (WWW),
April, 2009.
We present a static control-flow analysis for JavaScript programs
running in a web browser. Our analysis tackles numerous challenges
posed by modern web applications including asynchronous communication,
frameworks, and dynamic code generation. We use our analysis to
extract a model of expected client behavior as seen from the server,
and build an intrusion-prevention proxy for the server: the proxy
intercepts client requests and disables those that do not meet the
expected behavior. We insert random asynchronous requests to foil
mimicry attacks. Finally, we evaluate our technique against several
real applications and show that it protects against an attack in a
widely-used web application.
Efficient Earley Parsing with Regular Right-hand Sides
abstract,
pdf.
Trevor Jim and Yitzhak Mandelbaum.
Workshop on Language Descriptions Tools and Applications (LDTA),
March, 2009.
We present a new variant of the Earley parsing algorithm capable of
efficiently supporting context-free grammars with regular right
hand-sides. We present the core state-machine driven algorithm, the
translation of grammars into state machines, and the reconstruction
algorithm. We also include a theoretical framework for presenting the
algorithm and for evaluating optimizations. Finally, we evaluate the
algorithm by testing its implementation.
Highly Distributed XQuery with DXQ
abstract,
pdf.
Mary Fernandez, Trevor Jim, Kristi Morton, Nicola Onose, and Jerome Simeon.
27th ACM SIGMOD International Conference on Management of Data, demo paper,
June 2007.
Many modern applications, from Grid computing to RSS handling, need to
support data processing in a distributed environment. Currently, most
such applications are implemented using a general purpose programming
language, which can be expensive to maintain, hard to configure and
modify, and require hand optimization of the distributed data
processing operations. We present Distributed XQuery (DXQ), a simple,
yet powerful, extension of XQuery to support distributed applications.
This extension includes the ability to deploy networks of XQuery
computing nodes, to remotely invoke XQuery programs on those nodes,
and to ship code between nodes. We demonstrate the capabilities of
DXQ by building a network of name servers that capture the
functionality of DNS, the Domain Name System. We show that our system
can flexibly accommodate different patterns of distributed
computation, and discuss some simple but essential distributed
optimizations.
Defeating Script Injection Attacks with Browser-Enforced Embedded Policies
abstract,
pdf.
Trevor Jim, Nikhil Swamy, and Michael Hicks.
16th International World Wide Web Conference (WWW 2007),
May 2007.
Web sites that accept and display content such as wiki articles or
comments typically filter the content to prevent injected script code
from running in browsers that view the site. The diversity of browser
rendering algorithms and the desire to allow rich content makes
filtering quite difficult, however, and attacks such as the Samy and
Yamanner worms have exploited filtering weaknesses. To solve this
problem, this paper proposes a simple mechanism called
Browser-Enforced Embedded Policies (BEEP). The idea is that a web site
can embed a policy inside its pages that specifies which scripts are
allowed to run. The browser, which knows exactly when it will run a
script, can enforce this policy perfectly. We have added BEEP support
to several browsers, and built tools to simplify adding policies to
web applications. We found that supporting BEEP in browsers requires
only small and localized modifications, modifying web applications
requires minimal effort, and enforcing policies is generally
lightweight.
System Call Monitoring Using Authenticated System Calls
abstract,
pdf.
Mohan Rajagopalan, Matti Hiltunen, Trevor Jim, and Richard Schlichting.
IEEE Transactions on Secure and Dependable Computing, 3(3),
216–229, July–September 2006.
System call monitoring is a technique for detecting and controlling
compromised applications by checking at runtime that each system call
conforms to a policy that specifies the program's normal
behavior. Here, we introduce a new approach to implementing system
call monitoring based on authenticated system calls. An authenticated
system call is a system call augmented with extra arguments that
specify the policy for that call, and a cryptographic message
authentication code that guarantees the integrity of the policy and
the system call arguments. This extra information is used by the
kernel to verify the system call. The version of the application in
which regular system calls have been replaced by authenticated calls
is generated automatically by an installer program that reads the
application binary, uses static analysis to generate policies, and
then rewrites the binary with the authenticated calls. This paper
presents the approach, describes a prototype implementation based on
Linux and the Plto binary rewriting system, and gives experimental
results suggesting that the approach is effective in protecting
against compromised applications at modest cost.
Authenticated System Calls
abstract,
pdf.
Mohan Rajagopalan, Matti Hiltunen, Trevor Jim, and Richard Schlichting.
DSN–2005:
The International Conference on Dependable Systems and Networks,
June 2005.
System call monitoring is a technique for detecting and controlling
compromised applications by checking at runtime that each system call
conforms to a policy that specifies the program's normal
behavior. Here, a new approach to system call monitoring based on
authenticated system calls is introduced. An authenticated system call
is a system call augmented with extra arguments that specify the
policy for that call and a cryptographic message authentication code
(MAC) that guarantees the integrity of the policy and the system call
arguments. This extra information is used by the kernel to verify the
system call. The version of the application in which regular system
calls have been replaced by authenticated calls is generated
automatically by an installer program that reads the application
binary, uses static analysis to generate policies, and then rewrites
the binary with the authenticated calls. This paper presents the
approach, describes a prototype implementation based on Linux and the
PLTO binary rewriting system, and gives experimental results
suggesting that the approach is effective in protecting against
compromised applications at modest cost.
Dependable Distributed Computing using Free Databases
abstract,
view online.
Christof Fetzer and Trevor Jim.
Service Availability: Second International Service Availability
Symposium, ISAS 2005, pages 123–136, Lecture Notes in Computer
Science, Volume 3694, April 2005.
Designing and programming dependable distributed applications is very
difficult. Databases provide features like transactions and
replication that can help in the implementation of dependable
applications. There are in particular various free databases that make
it economically feasible to run a database on each computer in a
system. Hence, one can partition database tables across multiple hosts
to harness the processing power and disks of multiple machines. We
describe a system that simplifies partitioning tables across multiple
hosts. DOSE exposes the partitions to the programmer rather than
giving the illusion of a single table. Our focus is on providing a
simple implementation that works for freely-available databases, on
automatic tuning of the partitions for best performance, and on
applying the fault tolerance mechanisms of the databases to build
dependable distributed systems. We show how we use this system to
implement a distributed work queue.
Enhancing DNS Security using the SSL Trust Infrastructure
abstract,
pdf.
Christof Fetzer, Gert Pfeifer, and Trevor Jim.
Tenth IEEE
International Workshop on Object-oriented Real-time Dependable
Systems (WORDS 2005), February 2005.
The main functionality of the Domain Name System (DNS) is to translate
symbolic names into IP addresses. Due to the criticality of DNS for
the proper functioning of the Internet, many improvements have been
proposed for DNS in terms of security and dependability. However, the
current secure DNS (DNSSEC) standard has still several problems that
need further consideration. For example, online updates and denial of
service attacks are not sufficiently addressed. These problems are
serious obstacles that might prevent DNSSEC from replacing the
traditional DNS. In this paper we discuss several of these technical
and economic problems. To address these issues, we propose a simple
extension to the existing DNS. It is SSL based and individual domains
can decide independently of each other if and when to adopt the
extensions. We show how to implement these extensions with the help of
a simple proxy DNS server.
An Alerting and Notification Service on the AT&T Enterprise Messaging Network
Serban Jora, Rittwik Jana, Yih-Farn Chen, Matti Hiltunen, Trevor Jim,
Huale Huang, Ashish K. Singh, and Radhakrishnan Muthumanickam.
IASTED International Conference on Internet and Multimedia Systems and Applications
(EuroIMSA 2005), February 2005.
Cyclone: A Type-Safe Dialect of C
Dan Grossman, Michael Hicks, Trevor Jim, and Greg Morrisett.
C/C++ Users Journal, January 2005.
Automatic Discovery of Covariant Read-Only Fields
abstract.
Jens Palsberg, Tian Zhao, and Trevor Jim.
ACM Transactions on Programming Languages and Systems, Volume 27,
Issue 1, January 2005. An earlier version appeared in
Ninth International Workshop on Foundations of Object-Oriented
Languages, January 2002.
Read-only fields are useful in object calculi, pi calculi, and
statically typed intermediate languages because they admit covariant
subtyping, unlike updateable fields. For example, Glew's translation
of classes and objects to an intermediate calculus relies crucially on
covariant subtyping of read-only fields to ensure that subclasses are
translated to subtypes.In this article, we present a type inference
algorithm for an Abadi-Cardelli object calculus in which fields are
marked either as updateable or as read-only. The type inference
problem is P-complete, and our algorithm runs in
O(n3) time. The same complexity results hold
for the calculus in which the fields are not explicitly annotated as
updateable or read-only; perhaps surprisingly, the annotations do not
make type inference easier. We show that type inference is equivalent
to the problem of solving type constraints, and this forms the core of
our algorithm and implementation.
Experience with Safe Manual Memory-Management in Cyclone
abstract.
Michael Hicks, Greg Morrisett, Dan Grossman, and Trevor Jim.
The 2004 International Symposium on Memory Management, October 2004.
The goal of the Cyclone project is to investigate type safety for
low-level languages such as C. Our most difficult challenge has been
providing programmers control over memory management while retaining
type safety. This paper reports on our experience trying to integrate
and effectively use two previously proposed, type-safe memory
management mechanisms: statically-scoped regions and unique pointers.
We found that these typing mechanisms can be combined to build
alternative memory-management abstractions, such as reference counted
objects and arenas with dynamic lifetimes, and thus provide a flexible
basis. Our experience—porting C programs and building new
applications for resource-constrained systems—confirms that experts
can use these features to improve memory footprint and sometimes to
improve throughput when used instead of, or in combination with,
conservative garbage collection.
Incentives and Disincentives for DNSSEC Deployment
abstract,
pdf.
Christof Fetzer and Trevor Jim.
March 1, 2004.
Like many of the Internet's foundational infrastructures, the Domain
Name System (DNS) suffers from a number of security vulnerabilities.
Efforts to produce a more secure successor, DNSSEC, have been
underway for more than ten years, and a specification is only now
reaching a final form; this is an eternity in Internet time. We
argue that the reasons for this delay are mainly economic rather
than technical, and that these reasons will persist even after the
specification is finalized and robust implementations are available.
Thus, widespread adoption of DNSSEC is by no means certain. We
offer several suggestions to increase DNSSEC's odds of success.
Combining Garbage Collection and Safe Manual Memory Management
pdf.
Michael Hicks, Dan Grossman, and Trevor Jim.
Second workshop on
Semantics, Program Analysis, and Computing Environments
for Memory Management (SPACE 2004), Venice, Italy, January 2004.
iMobile EE: An Enterprise Mobile Service Platform
abstract,
pdf.
Yih-Farn Chen, Huale Huang, Rittwik Jana, Trevor Jim, Matti Hiltunen,
Sam John, Servan Jora, Radhakrishnan Muthumanickam, and Bin Wei.
ACM Journal on Wireless Networks,
July 2003.
iMobile is an enterprise mobile service platform that allows
resource-limited mobile devices to communicate with each other and to
securely access corporate contents and services. The original iMobile
architecture consists of devlets that provide protocol interfaces to
different mobile devices and infolets that access and transcode
information based on device profiles. iMobile Enterprise Edition
(iMobile EE) is a redesign of the original iMobile architecture to
address the security, scalability, and availability requirements of a
large enterprise such as AT&T. iMobile EE incorporates gateways
that interact with corporate authentication services, replicated
iMobile servers with backend connections to corporate services, a
reliable message queue that connects iMobile gateways and servers, and
a comprehensive service profile database that governs operations of
the mobile service platform. The iMobile EE architecture was also
extended to provide personalized multimedia services, allowing mobile
users to remotely control, record, and request video contents. iMobile
EE aims to provide a scalable, secure, and modular software platform
that makes enterprise services easily accessible to a growing list of
mobile devices roaming among various wireless networks.
Compiling for Template-Based Run-Time Code Generation
abstract.
Frederick Smith, Dan Grossman, Greg Morrisett, Luke Hornof, and Trevor Jim.
Journal of Functional Programming,
May 2003.
Cyclone is a type-safe programming language that provides explicit
run-time code generation. The Cyclone compiler uses a template-based
strategy for run-time code generation in which pre-compiled code
fragments are stitched together at run time. This strategy keeps the
cost of code generation low, but it requires that optimizations, such
as register allocation and code motion, are applied to templates at
compile time. This paper describes a principled approach to
implementing such optimizations. In particular, we generalize standard
flow-graph intermediate representations to support templates, define a
formal mapping from (a subset of) Cyclone to this representation, and
describe a dataflow-analysis framework that supports standard
optimizations across template boundaries
Region-based Memory Management in Cyclone
abstract,
pdf.
Dan Grossman, Greg Morrisett, Trevor Jim, Michael Hicks, Yanling Wang,
and James Cheney.
ACM Conference on Programming Language Design and Implementation,
pages 282–293, Berlin, Germany, June 2002.
Cyclone is a type-safe programming language derived from C. The
primary design goal of Cyclone is to let programmers control data
representation and memory management without sacrificing type-safety.
In this paper, we focus on the region-based memory management of
Cyclone and its static typing discipline. The design incorporates
several advancements, including support for region subtyping and a
coherent integration with stack allocation and a garbage collector.
To support separate compilation, Cyclone requires programmers to write
some explicit region annotations, but a combination of default
annotations, local type inference, and a novel treatment of region
effects reduces this burden. As a result, we integrate C idioms in a
region-based framework. In our experience, porting legacy C to
Cyclone has required altering about 8% of the code; of the changes,
only 6% (of the 8%) were region annotations.
Cyclone: A Safe Dialect of C
abstract,
pdf.
Trevor Jim, Greg Morrisett, Dan Grossman, Michael Hicks, James Cheney, and Yanling Wang.
USENIX Annual Technical Conference, pages 275–288, Monterey, CA, June 2002.
Cyclone is a safe dialect of C. It has been designed from the ground
up to prevent the buffer overflows, format string attacks, and memory
management errors that are common in C programs, while retaining C's
syntax and semantics. This paper examines safety violations enabled
by C's design, and shows how Cyclone avoids them, without giving up
C's hallmark control over low-level details such as data
representation and memory management.
SD3: A Trust Management System with Certified Evaluation
abstract,
pdf.
Trevor Jim.
IEEE Symposium on Security and Privacy,
May 2001.
We introduce SD3, a trust management system consisting of a high-level
policy language, a local policy evaluator, and a certificate retrieval
system. A unique feature of SD3 is its certified evaluator: as the
evaluator computes the answer to a query, it also computes a proof
that the answer follows from the security policy. Before the answer is
returned, the proof is passed through a simple checker, and incorrect
proofs are reported as errors. The certified evaluator reduces the
trusted computing base and greatly increases our confidence that the
answers produced by the evaluator follow from the specification,
despite complex optimizations.
To illustrate SD3's capabilities, we show how to implement a secure
name service, similar to DNSSEC, entirely in SD3.
Dynamically Distributed Query Evaluation
pdf.
Trevor Jim and Dan Suciu.
ACM Symposium on Principles of Database Systems,
May 2001.
Certificate Distribution with Local Autonomy
abstract,
pdf.
Pankaj Kakkar,
Michael McDougall,
Carl A. Gunter,
and
Trevor Jim.
The Second International Working Conference on Active Networks,
October 2000.
Any security architecture for a wide area network system spanning
multiple administrative domains will require support for policy
delegation and certificate distribution across the network. Practical
solutions will support local autonomy requirements of participating
domains by allowing local policies to vary but imposing restrictions
to ensure overall coherence of the system. This paper describes the
design of a such a system to control access to experiments on the
ABONE active network testbed. This is done through a special-purpose
language extending the Query Certificate Manager (QCM) system to
include protocols for secure mirroring. Our approach allows
significant local autonomy while ensuring global security of the
system by integrating verification with retrieval. This enables
transparent support for a variety of certificate distribution
protocols. We analyze requirements of the ABONE application, describe
the design of a security infrastructure for it, and discuss steps
toward implementation, testing and deployment of the system.
Generalized Certificate Revocation
abstract,
pdf.
Carl A. Gunter
and
Trevor Jim.
ACM Symposium on Principles of Programming Languages,
January 2000.
We introduce a language for creating and manipulating certificates,
that is, digitally signed data based on public key cryptography, and a
system for revoking certificates. Our approach provides a uniform
mechanism for secure distribution of pubic key bindings,
authorizations, and revocation information. An external language for
the description of these and other forms of data is compiled into an
intermediate language with a well-defined denotational and operational
semantics. The internal language is used to carry out consistency
checks for security, and optimizations for efficiency. Our primary
contribution is a technique for treating revocation data dually to
other sorts of information using a polarity discipline in the
intermediate language.
A Polar Type System
abstract,
ps.gz.
Trevor Jim.
Workshop on Intersection Types and Related Systems,
July 2000.
We introduce a restriction of the type system of intersection types
and universal type quantification, in which quantifiers must only
appear at positive occurrences in types, and intersections at negative
occurrences. We show that the system is an extension of the system of
rank two intersection types, and, therefore, of ML. The system is much
more powerful than ML; for example, all pure normal forms are typable
in the system. We show that the system has principal pairs, and that
type inference for the system is decidable. These properties are
retained when the system is extended by a form of polymorphic
recursion.
What is QCM?
pdf.
Carl A. Gunter
and
Trevor Jim. August 1999.
Policy-Directed Certificate Retrieval
abstract,
ps.
Carl A. Gunter and Trevor Jim.
Technical Report MS–CIS–99–07,
Department of Computer and Information Science,
University of Pennsylvania. September 1998.
A revised version is published in
Software Practice and Experience, 30(15):1609–1640, 2000.
Any large scale security architecture that uses certificates to
provide security in a distributed system will need some automated
support for moving certificates around in the network. We believe that
for efficiency, this automated support should be tied closely to the
consumer of the certificates: the policy verifier. As a proof of
concept, we have built QCM, a prototype policy language and verifier
that can direct a retrieval mechanism to obtain certificates from the
network. Like previous verifiers, QCM takes a policy and certificates
supplied by a requester and determines whether the policy is
satisfied. Unlike previous verifiers, QCM can take further action if
the policy is not satisfied: QCM can examine the policy to decide what
certificates might help satisfy it and obtain them from remote servers
on behalf of the requester. This takes place automatically, without
intervention by the requester; there is no additional burden placed on
the requester or the policy writer for the retrieval service we
provide. We present examples that show how our technique greatly
simplifies certificate-based secure applications ranging from key
distribution to ratings systems, and that QCM policies are simple to
write. We describe our implementation, and illustrate the operation of
the prototype.
The SwitchWare Active Network Implementation
ps.
D. Scott Alexander, Michael W. Hicks, Pankaj Kakkar, Angelos D.
Keromytis, Marianne Shaw, Jonathan T. Moore, Carl A. Gunter,
Trevor Jim, Scott M. Nettles, and Jonathan M. Smith.
Notes of the ACM SIGPLAN Workshop on ML,
67–76,
September 1998.
Certifying Compilation and Run-time Code Generation
abstract,
ps.gz.
Luke Hornof and Trevor Jim.
Higher Order and Symbolic Computation, 12(4):337–375, December
1999.
An earlier version of this paper appeared in
ACM Workshop on Partial Evaluation and Semantics-Based Program
Manipulation,
January 22–23, 1999.
A certifying compiler takes a source language program and produces
object code, as well as a ``certificate'' that can be used to verify
that the object code satisfies desirable properties, such as type
safety and memory safety. Certifying compilation helps to increase
both compiler robustness and program safety. Compiler robustness is
improved since some compiler errors can be caught by checking the
object code against the certificate immediately after
compilation. Program safety is improved because the object code and
certificate alone are sufficient to establish safety: even if the
object code and certificate are produced on an unknown machine by an
unknown compiler and sent over an untrusted network, safe execution is
guaranteed as long as the code and certificate pass the verifier.
Existing work in certifying compilation has addressed statically
generated code. In this paper, we extend this to code generated at run
time. Our goal is to combine certifying compilation with run-time code
generation to produce programs that are both verifiably safe and
extremely fast. To achieve this goal, we present two new languages
with explicit run-time code generation constructs: Cyclone, a type
safe dialect of C, and TAL/T, a type safe assembly language. We have
designed and implemented a system that translates a safe C program
into Cyclone, which is then compiled to TAL/T, and finally assembled
into executable object code. This paper focuses on our overall
approach and the front end of our system; details about TAL/T will
appear in a subsequent paper.
Design of an Application-Level Security Infrastructure
abstract,
ps.gz.
Carl A. Gunter and Trevor Jim.
DIMACS Workshop on Design and Formal Verification of Security
Protocols, September 3–5, 1997.
We propose a security infrastructure based on authenticated data
distribution, implemented via the automatic management of queries and
certificates. This approach is appropriate for an infrastructure to be
used by application programmers, because they are not experts in
cryptographic algorithms or security protocols. Our query certificate
managers hide the use of cryptography and message sending from the
programmer, and hence prevent programmer errors that could lead to
security failures. The system has a formal semantics and correctness
guarantees.
Type inference in systems of recursive types with subtyping
abstract,
ps.gz.
Trevor Jim and Jens Palsberg.
July 1997.
We present general methods for performing type inference and deciding
subtyping in languages with recursive types. Our type inference
algorithm generalizes a common idea of previous work: type inference
is reduced to a constraint satisfaction problem, whose satisfiability
can be decided by a process of closure and consistency checking. We
prove a general correctness theorem for this style of type
inference. We define subtyping co-inductively, and we prove by
co-induction that a closed and consistent constraint set has a
solution. Our theorem makes it easier to find new type inference
algorithms. For example, we provide definitions of closure and
consistency for recursive types with a greatest type, but not a least
type; we show that the definitions satisfy the conditions of our
theorem; and the theorem immediately provides a type inference
algorithm, thereby solving an open problem.
Shrinking Lambda Expressions in Linear Time
abstract,
ps.gz.
Andrew W. Appel and Trevor Jim.
Journal of Functional Programming
7(5):515–540, September 1997.
Functional-language compilers often perform optimizations based on
beta and delta reduction. To avoid speculative optimizations that can
blow up the code size, we might wish to use only shrinking reduction
rules guaranteed to make the program smaller: these include
dead-variable elimination, constant folding, and a restricted beta
rule that inlines only functions that are called just once.
The restricted beta rule leads to a shrinking rewrite system that has
not previously been studied. We show some efficient normalization
algorithms that are immediately useful in optimizing compilers; and we
give a confluence proof for our system, showing that the choice of
normalization algorithm does not affect final code quality.
Type Inference with Simple Selftypes is NP-complete
abstract,
ps.gz.
Jens Palsberg and Trevor Jim.
Nordic Journal of Computing
4(3):259–286, Fall 1997.
The metavariable self is fundamental in object-oriented
languages. Typing self in the presence of inheritance has been studied
by Abadi and Cardelli, Bruce, and others. A key concept in these
developments is the notion of selftype, which enables flexible type
annotations that are impossible with recursive types and
subtyping. Bruce et al. demonstrated that, for the language TOOPLE,
type checking is decidable. Open until now is the problem of type
inference with selftype.
In this paper we present a simple type system with selftype, recursive
types, and subtyping, and we prove that type inference is
NP-complete. With recursive types and subtyping alone, type inference
is known to be P-complete. Our example language is the object calculus
of Abadi and Cardelli. Both our type inference algorithm and our lower
bound are the first such results for a type system with selftype.
Full Abstraction and the Context Lemma
abstract,
ps.gz.
Trevor Jim and Albert R. Meyer.
SIAM Journal of Computing
25(3), 663–696, June 1996.
It is impossible to add a combinator to PCF to achieve full
abstraction for models such as Berry's stable domains in a way
analogous to the addition of the ``parallel-or'' combinator that
achieves full abstraction for the familiar cpo model. In particular,
we define a general notion of rewriting system of the kind used for
evaluating simply typed lambda terms in Scott's PCF. Any simply typed
lambda calculus with such a ``PCF-like'' rewriting semantics is shown
necessarily to satisfy Milner's Context Lemma. A simple argument
demonstrates that any denotational semantics that is adequate for PCF,
and in which certain simple Boolean functionals exist, cannot be fully
abstract for any extension of PCF satisfying the Context Lemma. An
immediate corollary is that stable domains cannot be fully abstract
for any extension of PCF definable by PCF-like rules.
What are principal typings and what are they good for?
abstract,
ps.gz.
Trevor Jim.
MIT/LCS TM–532, Massachusetts Institute of Technology, August 1995;
revised November 1995.
Extended version of a paper appearing in
ACM Symposium on Principles of Programming Languages, 1996.
We demonstrate the pragmatic value of the principal typing property, a
property distinct from ML's principal type property, by studying a
type system with principal typings. The type system is based on rank 2
intersection types and is closely related to ML. Its principal typing
property provides elegant support for separate compilation, including
``smartest recompilation'' and incremental type inference, and for
accurate type error messages. Moreover, it motivates a new rule for
typing recursive definitions that can type some interesting examples
of polymorphic recursion.
Rank 2 type systems and recursive definitions
abstract,
ps.gz.
Trevor Jim.
MIT/LCS TM–531, Massachusetts Institute of Technology, August 1995;
revised November 1995.
We demonstrate an equivalence between the rank 2 fragments of the
polymorphic lambda calculus (System F) and the intersection type
discipline: exactly the same terms are typable in each system. An
immediate consequence is that typability in the rank 2 intersection
system is DEXPTIME-complete. We introduce a rank 2 system combining
intersections and polymorphism, and prove that it types exactly the
same terms as the other rank 2 systems. The combined system suggests a
new rule for typing recursive definitions. The result is a rank 2 type
system with decidable type inference that can type some interesting
examples of polymorphic recursion. Finally, we discuss some
applications of the type system in data representation optimizations
such as unboxing and overloading.
Continuation-passing, closure-passing style
abstract,
ps.gz.
Andrew W. Appel and Trevor Jim.
Proc. Sixteenth ACM Symposium on Principles of Programming Languages,
293–302, January 1989.
We implemented a continuation-passing style (CPS) code generator for
ML. Our CPS language is represented as an ML datatype in which all
functions are named and most kinds of ill-formed expressions are
impossible. We separate the code generation into phases that rewrite
this representation into ever-simpler forms. Closures are represented
explicitly as records, so that closure strategies can be communicated
from one phase to another. No stack is used. Our benchmark data shows
that the new method is an improvement over our previous,
abstract-machine based code generator.
Proceedings
I organized the
2000 Workshop on Proof-Carrying Code,
and its proceedings are available
online.
Contact info
Trevor Jim / trevor@att.com