Pyrite: A Framework for Palmtop Data Interchange

Rob Tillotson
robt@debian.org
June 8, 1999

Abstract

The Palm Computing platform was designed with tight desktop integration in mind. Pyrite is an Open Source toolkit which allows desktop applications to interact with Palm handhelds and their data. It is implemented in the Python programming language, and is deliberately quite different from the standard Palm Desktop software. It is designed for maximum flexibility; Python's loose, dynamic object structure is exploited to allow applications to easily extend Pyrite's capabilities. In this paper, I will explain the most significant aspects of Pyrite's design, and describe some of the issues encountered during implementation.

Background

Before the 1996 introduction of the Pilot connected organizer, pocket-sized computing devices fell into two basic categories: organizers and handheld computers. The difference was one of extensibility. Organizers, like the Sharp Wizard and Casio B.O.S.S., were essentially single-purpose devices: they had built-in software to track schedules, phone numbers, and expenses, and there was very little chance of doing more. Handheld computers (such as the HP LX, Psion 3, and Apple Newton) were much more flexible, with larger memory sizes, expansion slots, and most importantly the ability to run a wide variety of add-on software. Missing from all of these devices, however, was the sort of tight desktop integration found in the Palm handhelds.

Starting with the very first Pilot 1000, the Palm handhelds have been designed to work closely with a desktop computer, acting as a mobile ``window'' into the data on the desktop. The handheld itself is intended to be cheap and disposable, with modest hardware (current models have a Motorola 68K series CPU running at 16 MHz, and 1--4 MB of RAM) and a lightweight operating system.
The handheld communicates with the desktop through a serial port, and comes with a drop-in docking cradle. Palm calls its products ``connected organizers,'' to highlight the differences between them and previous handhelds.

In order to make the ``connected organizer'' paradigm work, Palm Computing designed a new operating system with modest resource requirements and integrated desktop communication. Two aspects of the Palm OS design are particularly important to desktop software developers:

Everything is a database. The Palm OS treats the contents of the handheld as a pool of simple, flat databases, with variable length records. There is no support for files as byte streams; applications must always work with databases one record at a time. (Palm OS 3.0 has a ``streaming'' interface which is similar to the C standard I/O library, but it is implemented atop the database system.) Even executable programs are simply databases, containing code and user interface resources.

Record-oriented synchronization. The Palm OS uses its own protocol stack (see [hhsj-unixpilot-protocols-1], [hhsj-unixpilot-sync], and [hhsj-unixpilot-padp]) to communicate with the desktop. The highest-level protocol, DLP, provides a set of operations which closely correspond to the database interface that programs on the palmtop use. The database manager on the palmtop supports synchronization by marking modified records and temporarily archiving deleted records.

The desktop side of the synchronization system is part of the Palm Desktop, a software package provided with every Palm handheld. Third party software can extend the Palm Desktop's synchronization capabilities by adding conduits, which are program modules that can transfer information between a docked handheld and some other data source.

Because the Palm Desktop software is only available for Windows and Macintosh, Palm handheld users who do not use one of those platforms have had to develop their own replacements for it.
Most of these are based on pilot-link [pilot-link]. Originally released in late 1996, pilot-link includes a portable library that implements the Palm synchronization protocol stack, and a set of core utilities which use that library. Many other applications have been built upon pilot-link; a few of the more substantial and/or ambitious ones are listed in the bibliography (see [pilotmanager] and [kpilot]). Pyrite uses pilot-link for communication with the palmtop, but does not use any of its other features.

Motivation

When I began working on Pyrite in late 1997, my original intention was to create a Palm Desktop-like application using Python as the implementation language and Tkinter for the graphical user interface. At the time, the only Python language interface to the Palm protocols was the one provided with pilot-link, a relatively straightforward translation of the pilot-link library interface to Python 1.4. As I began to enhance that interface for Python 1.5, I decided to use it as a vehicle to explore a design quite different from the Palm Desktop's conduit system.

To understand why Pyrite is not a copy of the standard synchronization interface, it may be helpful to think in terms of ``parts'' and ``glue.'' The underlying design philosophy of Unix is one of tool-building, in which complex systems are built from smaller parts. Today, the parts we use are bigger and the interconnections between them are more complex; the ``glue'' in today's systems is often a dynamic, interpreted language like Python or Perl. Pyrite's job is to bring the information on the palmtop within reach of Python, making it available in a way that is both convenient and powerful.

The standard synchronization interface connects specific palmtop databases to specific desktop data sources, in the context of a HotSync session. The relationship is one-to-one, and the conduit interface simply provides a convenient way to move data back and forth.
Pyrite does that as well, but decouples the act of communication from its content: synchronization is optional, applications can do everything conduits can do, and palmtop data can be exposed to Python without requiring that a corresponding desktop application exist. From the viewpoint of the Palm application developer, this can be very significant. Consider what happens after a palmtop application is developed: if it can be used to collect a significant amount of data, a desktop interface to it will soon become important. Using the standard conduit interface, this means that the developer is faced with the prospect of writing both a complete desktop application and a conduit, using different development tools (and possibly a different language) than those used to make the palmtop application. Palmtop applications are usually coded in C, cross-compiled using the GNU C compiler or with a special version of CodeWarrior; conduit development on Windows requires Microsoft Visual C++ or Java. Not surprisingly, most third-party applications (especially those from small and/or non-commercial developers) don't come with conduits or desktop software, and their data is trapped inside the palmtop. Pyrite places an abstraction layer between the palmtop and desktop applications. Information from a palmtop application can be made available in a form that is amenable to further manipulation in Python, by adding interface modules which pack and unpack records from the application's databases. This is considerably easier than developing a complete desktop application, and exposes the data on the palmtop to Python's full capabilities as a glue language. Other facilities can be built on top of this data abstraction layer, so that the implementation details of the palmtop application can be hidden even from its own conduits. My goal in designing Pyrite, therefore, was to provide tools for palmtop data interchange in a powerful, flexible way. 
Making this possible requires that Pyrite focus on several important tasks:

- Accessing databases in an intelligent manner, using standard Python interfaces when possible and hiding implementation and storage details from the application programmer.
- Organizing palmtop-related data in a logical fashion, treating collections of databases the same no matter where they are (on the local disk or on the palmtop).
- Supporting standard Palm OS data formats, including the prc/pdb file format (used to store and transfer Palm OS databases on desktop systems) and the databases used by the built-in applications.
- Allowing easy extension, so that new palmtop applications can be supported and new conduits can be added with a minimum of unnecessary code.

The rest of this paper will describe the elements of Pyrite's design and implementation which enable it to perform these tasks.

Design

Plug-Ins

Python's loose, dynamic nature was a major influence in Pyrite's overall design. The externally visible interface to Pyrite is a shallow but wide tree of classes, several of which serve as the base for an entire family of interchangeable components. Pyrite can be roughly divided into two parts: a component framework and a set of components for desktop/palmtop integration. Because the Pyrite component framework is of sufficient generality to be useful on its own, it is also available separately [sulfur].

Pyrite's components are called plug-ins. Plug-ins are essentially just Python modules and packages, but Pyrite builds upon the Python package system by providing more convenient access to plug-ins from within a running application. There are several important differences between plug-ins and ordinary modules and packages:

- Plug-ins are organized by function, not by location. A collection of plug-ins sharing a common interface can span many physical locations; an application does not need to know or care where a particular plug-in is installed in the Python package tree. New capabilities can be added to a particular Pyrite installation without changing Pyrite itself.
- Plug-ins are instantiated automatically. When an application loads a plug-in, the result is an object which serves as the plug-in's visible interface. When many plug-ins share the same visible interface, applications can use them interchangeably.
- Plug-ins are automatically configured. User-visible options in a plug-in are set to the user's preferred values when the plug-in is loaded, and whenever the application's configuration changes. (For example, if there are different option values for particular palmtops, they will be set appropriately depending on which palmtop is in the cradle.)
- Plug-ins can be discovered at runtime. Applications need not incorporate any knowledge about which specific plug-ins are available, because the list of available plug-ins (and their properties) can be obtained at any time.

A plug-in is nothing more than a regular Python module, and a collection of plug-ins is just a standard Python package (or packages). When the plug-in loader finds a requested module, it looks for a class inside it which inherits from the Plugin class, and makes an instance of that class to serve as the application's interface to the plug-in. A reference to the interface object is kept in a cache for the life of the application, and reconfigured when necessary.

Databases

Blocks

The Palm OS takes a relatively simple approach to databases: it doesn't care what is in them, as long as their contents can be divided into records. On the handheld a record is just a memory block; it is up to the application to impose whatever structure is required. Each record has a unique ID, a category (a small integer from 0 to 15) and a few attribute bits, but otherwise the handheld's data manager makes no attempt to structure the contents.
As a result, it is impossible to algorithmically infer anything about the structure of a database from its contents. In order to make a particular database useful, it is necessary to unpack the records into a more meaningful structure when reading them from the database, and do the reverse when writing them. Here, the lightweight nature of the Palm platform is helpful: most Palm applications store their data simply, using standard C types and null-terminated strings, making it easy to understand and manipulate in Python.

The same is true of other record-like data blocks, detailed discussion of which is beyond the scope of this paper. In brief, the other block types include resources (stored in special databases; each has a type and numeric ID), appblocks (each database has one: an extra record stored outside the normal sequence and normally used to hold database-wide information), and preferences (application-specific data stored in a system database).

Pyrite encapsulates this behavior in the Block class (and its subclasses, such as Record and Resource). A Block is similar to a Python dictionary, except that the possible key/value pairs are restricted (each represents one field in the data block) and that it can transform its values to and from a formatted string of bytes suitable for storage in a database.

Inner and Outer Layers

Pyrite's database interface is divided into two layers. An active database is represented by two objects, one ``wrapped'' around the other. The outer object is the one applications actually interact with; its visible interface is much like that of a regular Python list, with additions to support the unique needs of the Palm database paradigm. Inside that, a back-end object manages access to the actual database; the back-end interface is less Python-like, and more like the native one. This division is intended to make Pyrite more flexible and customizable.
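The Block idea described above (a dictionary restricted to declared fields that can convert itself to and from stored bytes) can be sketched in a few lines of modern Python. This is only an illustration under stated assumptions, not Pyrite's actual Block class: the `fields`, `text_field`, `pack`, and `unpack` names, the `NoteRecord` layout, and the Latin-1 text encoding are all hypothetical. The big-endian byte order reflects the 68K CPU mentioned earlier.

```python
import struct
from collections.abc import MutableMapping


class Block(MutableMapping):
    """Dictionary-like data block limited to a fixed set of fields.

    Illustrative only: names and layout are assumptions, not Pyrite's API.
    """

    fields = ()           # (name, struct format code) pairs, in storage order
    text_field = None     # optional trailing null-terminated string field
    byte_order = ">"      # Motorola 68K stores integers big-endian
    encoding = "latin-1"  # assumed text encoding for this sketch

    def __init__(self, **values):
        self._data = {}
        for name, value in values.items():
            self[name] = value

    @classmethod
    def _names(cls):
        names = [name for name, _ in cls.fields]
        if cls.text_field is not None:
            names.append(cls.text_field)
        return names

    @classmethod
    def _format(cls):
        return cls.byte_order + "".join(code for _, code in cls.fields)

    # Mapping protocol: only declared fields may be stored.
    def __getitem__(self, key):
        return self._data[key]

    def __setitem__(self, key, value):
        if key not in self._names():
            raise KeyError(f"{type(self).__name__} has no field {key!r}")
        self._data[key] = value

    def __delitem__(self, key):
        del self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)

    def pack(self):
        """Return the bytes that would be stored in the database."""
        raw = struct.pack(self._format(), *(self[n] for n, _ in self.fields))
        if self.text_field is not None:
            raw += self[self.text_field].encode(self.encoding) + b"\0"
        return raw

    @classmethod
    def unpack(cls, raw):
        """Build a block from stored bytes."""
        size = struct.calcsize(cls._format())
        values = struct.unpack(cls._format(), raw[:size])
        block = cls(**dict(zip((n for n, _ in cls.fields), values)))
        if cls.text_field is not None:
            text = raw[size:].split(b"\0", 1)[0]
            block[cls.text_field] = text.decode(cls.encoding)
        return block


class NoteRecord(Block):
    """Hypothetical record: one byte, one 16-bit integer, then a C string."""

    fields = (("priority", "B"), ("count", "H"))
    text_field = "note"
```

With this sketch, `NoteRecord(priority=2, count=300, note="hi").pack()` produces a three-byte fixed part followed by the null-terminated text, and `NoteRecord.unpack()` reverses the transformation. Assigning to a key that is not a declared field raises `KeyError`.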
The inner object treats records in the database as opaque blocks of arbitrary bytes, and concerns itself only with how and where those blocks are stored. The outer object assigns structure and meaning to those arbitrary blocks of data, wrapping them in Block objects without regard to where or how they are actually stored. As a result, the code specific to a particular Palm application (packing and unpacking its internal record formats) only has to be written once, and it will work with any and all implementations of the low-level database interface.

Implementation Issues

Unfortunately, implementing the back-end database interface was more difficult than I expected. Neither the DLP protocol nor the prc/pdb local file format is well suited for general purpose, random access to the contents of a database: DLP is meant for synchronization, and the prc/pdb format is meant as a way of transferring whole databases at once. (This is why the standard Palm Desktop doesn't store its local data in prc/pdb files, but in a more flexible data structure provided by the Visual C++ runtime library. Similarly, Pyrite works with prc/pdb files using an in-memory cache, which bypasses the most annoying limitations of the format.)

As a result, Pyrite database objects exhibit differences in behavior, even though the interface to all of them is ostensibly the same. Some back-end databases automatically assign unique IDs, others do not. Some can insert new records at any position in the database, others can only append them to the end. It was impossible to hide these differences behind the outer interface; instead, each database has a set of properties which describes these small details. An application can check these properties and modify its behavior appropriately, or request a database with a certain set of properties, if one is available. Although this is an imperfect solution, it seems to work reasonably well in practice.
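The two-layer arrangement, together with the per-database properties, might be sketched as follows. This is a simplified illustration rather than Pyrite's real interface: the class names, the `can_insert` property, and the memo-style pack and unpack helpers are all assumptions made for the example.

```python
class MemoryBackend:
    """Inner layer: stores opaque byte strings and nothing else."""

    # Hypothetical capability flags describing this back end.
    properties = {"can_insert": True}

    def __init__(self):
        self._blocks = []

    def __len__(self):
        return len(self._blocks)

    def read(self, index):
        return self._blocks[index]

    def append(self, raw):
        self._blocks.append(bytes(raw))

    def insert(self, index, raw):
        self._blocks.insert(index, bytes(raw))


class AppendOnlyBackend(MemoryBackend):
    """A back end that, like some real ones, cannot insert in the middle."""

    properties = {"can_insert": False}

    def insert(self, index, raw):
        raise NotImplementedError("this back end can only append")


class Database:
    """Outer layer: list-like, and gives the stored bytes a meaning."""

    def __init__(self, backend, pack, unpack):
        self.backend = backend
        self.properties = backend.properties
        self._pack = pack
        self._unpack = unpack

    def __len__(self):
        return len(self.backend)

    def __getitem__(self, index):
        return self._unpack(self.backend.read(index))

    def append(self, record):
        self.backend.append(self._pack(record))

    def insert(self, index, record):
        self.backend.insert(index, self._pack(record))


def pack_memo(text):
    """Application-specific packing: a single null-terminated string."""
    return text.encode("latin-1") + b"\0"


def unpack_memo(raw):
    return raw.split(b"\0", 1)[0].decode("latin-1")


def add_at_front(db, record):
    """Application code adapting to what the back end can do."""
    if db.properties["can_insert"]:
        db.insert(0, record)
    else:
        db.append(record)
```

The packing functions are written once and work unchanged with either back end. Only `add_at_front` needs to consult the properties, which mirrors the compromise described above.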
Stores

The Pyrite database layer is primarily concerned with the contents of individual databases. Management of database storage is handled by another interface layer, the Store. A Store is an abstract representation of a related collection of databases, such as the contents of the palmtop. Stores follow the Palm OS conventions for file system organization (a flat pool of databases referenced by name, creator, and type). The purpose of the Store object is to map this view onto an underlying representation, providing a consistent interface to applications no matter where the databases actually are.

Most Stores correspond to a particular kind of back-end database object. For example, each Store which works with prc/pdb files (there are several) uses the caching back-end described previously, and the DLP store uses a back-end object derived from pilot-link. To deal with differences in back-end capabilities, the application may provide ``hints'' to a Store about each database it opens, including requests for particular behaviors, suggested database header values, and other Store-specific information.

Palmtop Application Support

Above, I described how Pyrite uses Block objects to translate data between its stored form and an expanded, dictionary-like representation. Determining the proper kind of Block to use, however, is another matter entirely, and that is where application support modules enter the picture. Fully supporting a particular Palm application requires several bits of knowledge about that application, for example:

- The name, creator ID, type, and other identifying information of its databases
- The format of records, appblocks, and/or resources in its databases
- How its databases are organized, if different from the usual conventions
- What preferences it uses, and their format

For each palmtop application Pyrite supports, all of this knowledge is encapsulated in an application support module.
A typical application support module contains one or more Block subclasses and at least one custom outer-layer database class which uses them. For example, the Mail application module has four Block subclasses: one representing a record in the mail database, and the other three representing preferences. Each application module is also a plug-in, with an interface that allows the opening and creation of databases, reading and writing of preferences, and so forth.

Early versions of Pyrite used an automatic classification scheme to detect supported databases. After importing App.Address, for instance, Pyrite would be made aware of Address databases; then, when the application opened `AddressDB', the returned database object would be treated as an address database. In more recent releases, I removed that system in favor of the current one, which requires the desktop application to explicitly load the desired plug-in and request that it open the database. While the current scheme is not as convenient as the previous one, it places the responsibility for database classification where it belongs: inside the one module that contains all of the other application-specific knowledge.

Synchronization

Like the Palm Desktop, Pyrite's synchronization process involves the running of conduits. Conduits are plug-ins, loaded in the same manner as other parts of Pyrite. During synchronization, the conduits in use are called in turn, in three passes. The first, pre-sync pass allows the conduits to do advance preparation, such as gathering data from an outside source. Actual synchronization is done during the second pass; the serial connection is opened on demand, the first time it is used, and closed at the end of the pass. The third and final pass allows the conduits to do post-processing and cleanup.
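The three-pass sequence, with a connection that is opened lazily and closed at the end of the second pass, can be sketched as below. The method names (`pre_sync`, `sync`, `post_sync`), the `LazyLink` wrapper, and the fake connection are hypothetical and chosen only for illustration; they are not taken from Pyrite itself.

```python
class Conduit:
    """Base class with a hook for each of the three passes."""

    def pre_sync(self):
        pass

    def sync(self, link):
        pass

    def post_sync(self):
        pass


class LazyLink:
    """Opens the (expensive) connection the first time it is needed."""

    def __init__(self, opener):
        self._opener = opener
        self._connection = None

    @property
    def connection(self):
        if self._connection is None:
            self._connection = self._opener()
        return self._connection

    def close(self):
        if self._connection is not None:
            self._connection.close()
            self._connection = None


def run_sync(conduits, link):
    # Pass 1: preparation, no connection to the handheld.
    for conduit in conduits:
        conduit.pre_sync()
    # Pass 2: the only pass allowed to talk to the handheld.
    try:
        for conduit in conduits:
            conduit.sync(link)
    finally:
        link.close()
    # Pass 3: cleanup, again without a connection.
    for conduit in conduits:
        conduit.post_sync()


class FakeConnection:
    """Stand-in for a serial connection; records when it opens and closes."""

    def __init__(self, log):
        self.log = log
        log.append("open")

    def close(self):
        self.log.append("close")


class LoggingConduit(Conduit):
    def __init__(self, name, log):
        self.name = name
        self.log = log

    def pre_sync(self):
        self.log.append(f"{self.name}: pre")

    def sync(self, link):
        link.connection  # first access triggers the open
        self.log.append(f"{self.name}: sync")

    def post_sync(self):
        self.log.append(f"{self.name}: post")
```

Running two `LoggingConduit` instances through `run_sync` shows the connection being opened only when the first conduit touches it during the second pass, and closed before any post-processing begins.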
The reason for this three-pass process is efficiency of communication: using the serial connection is ``expensive'', in both battery power and the user's time, and it should not be left idle unnecessarily. Since the connection to the handheld is open only during the second pass, the first and last passes can be used to do time-consuming work without consideration of connection time.

There is, however, one small inconsistency in this type of conduit structure. Until the connection is opened, it is impossible to tell which palmtop is in the cradle. It is possible to guess the identity of the palmtop in advance, based on either a user-supplied default or on the identity of the most recently seen palmtop, but until the synchronization pass begins there is no way to accurately know which palmtop will be present. As a result, conduits have to be careful not to do anything in the pre-sync pass that might be affected by a change of identity during the other two passes. Dealing with this inconsistency is still an open issue in Pyrite's design; one possible solution might be to do the pre-sync pass outside of the normal synchronization sequence, once for each known palmtop (an ideal application for tools such as cron or at).

Pyrite does not impose any particular internal structure on its conduits, but it does provide help for basic synchronization tasks. Like the Palm Desktop, Pyrite includes a generic synchronization method; given a palmtop database and two desktop databases (primary and archive), it synchronizes their contents according to the logic described in the official documentation [palm:cdk-guide]. In theory, it should be possible for the generic synchronization code to reconcile records at the field level, as long as Pyrite has full support for the corresponding palmtop application; at the time of this writing, however, the code to do that is not yet complete.
Pyrite Applications

Pyrite's plug-in services, along with others such as command-line parsing and configuration file support, are provided through an Application class. Each Pyrite-enabled application is wrapped in a subclass of Application. The application's core code is a method on this subclass, and Pyrite services are available through calls to other methods on the application object itself.

Much of the Application class is concerned with configuration. Like most complex programs, Pyrite has many settings which can be configured by the user or system administrator. In addition to system-wide settings (such as the name of a port associated with a docking cradle, or the location of a shared data directory), many applications and plug-ins have local settings. When an application starts up, its Application object loads option values from two configuration files, one in /etc and one in the user's home directory, storing them in a layered registry so that the user's settings override those set by the administrator. When a plug-in is loaded, it is configured based on the stored option values. The Application object also parses the command line, extracting settings provided as arguments.

Making matters more difficult, however, is the fact that palmtop identity makes a difference. Most non-trivial Pyrite applications require some access to information associated with a particular palmtop, so the Application object keeps track of the ``current user''. Making a connection to a palmtop can cause the ``current user'' to change, since there is no guarantee that a particular palmtop will be in the cradle. The Application class deals with this by reconfiguring itself and all plug-ins every time the ``current user'' changes, in case different settings are associated with each palmtop.

The other significant purpose of the Application class is to act as a context for plug-ins.
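The layered registry and the reconfiguration on a change of ``current user'' can be illustrated with the standard library's `collections.ChainMap`. This is a sketch under assumptions, not Pyrite's implementation: the `Registry`, `Plugin`, and `Application` classes below are simplified stand-ins, the option names are invented, and the precedence order (command line, then per-palmtop, then user, then system) is an assumption that goes beyond what is stated above.

```python
from collections import ChainMap


class Registry:
    """Holds option layers; earlier layers override later ones."""

    def __init__(self, system, user, per_palmtop=None, command_line=None):
        self.system = system                   # e.g. from a file in /etc
        self.user = user                       # e.g. from the home directory
        self.per_palmtop = per_palmtop or {}   # palmtop name -> options
        self.command_line = command_line or {}

    def for_palmtop(self, name):
        # Assumed precedence: command line > palmtop > user > system.
        return ChainMap(
            self.command_line,
            self.per_palmtop.get(name, {}),
            self.user,
            self.system,
        )


class Plugin:
    """Plug-in whose options are refreshed whenever it is configured."""

    defaults = {}

    def __init__(self):
        self.options = dict(self.defaults)

    def configure(self, settings):
        for key, default in self.defaults.items():
            self.options[key] = settings.get(key, default)


class Application:
    def __init__(self, registry):
        self.registry = registry
        self.plugins = []
        self.current_user = None

    def load(self, plugin):
        # Configure on load, using whatever palmtop is currently assumed.
        plugin.configure(self.registry.for_palmtop(self.current_user))
        self.plugins.append(plugin)
        return plugin

    def set_current_user(self, name):
        # A different palmtop may carry different settings, so every
        # loaded plug-in is configured again.
        self.current_user = name
        settings = self.registry.for_palmtop(name)
        for plugin in self.plugins:
            plugin.configure(settings)


class Backup(Plugin):
    """Hypothetical plug-in with two user-visible options."""

    defaults = {"backup_dir": ".", "port": "/dev/pilot"}
```

In this sketch a `Backup` plug-in loaded before any palmtop is known picks up the user's and administrator's values. Calling `set_current_user` with a palmtop that has its own settings then replaces only the options that palmtop overrides.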
When a plug-in is loaded from within an Application object, many of the plug-in's methods will call their counterparts in the Application object in preference to their own default behavior. This way, application-specific behavior is possible without requiring changes to existing plug-ins.

Future Directions

Pyrite began as a simple refinement to pilot-link, and grew far beyond my initial expectations. While it is (in some sense) relatively complete, there is still much work to do. In addition to the ever-present need for more and better documentation, there are several potential improvements which appear to be worth pursuing in the long term.

First and foremost, Pyrite needs to support as many palmtop applications as possible. It is not necessary to connect the palmtop application to any particular desktop software; simply exposing its data to Python will make future integration easier. The work involved in adding an application support module is not difficult; instead, the primary obstacle to growth in this area is the closed nature of many palmtop applications. Open access to data on the palmtop is vital not only to the future of Pyrite but to the future of Palm/Unix compatibility in general, and gentle evangelism of palmtop developers about the benefits of open, documented data formats may prove to be very beneficial.

Next, Pyrite could be made less dependent on Python. Tying a library to a particular language can be a disadvantage even in the relatively flexible world of Open Source, so one of my long-term goals for Pyrite is to make it available to applications written in other languages. In the short term, the Pyrite class structure could be translated directly into another language, using the existing Python code as a prototype. Unfortunately, many popular languages are not amenable to a direct translation, and dynamic loading of classes is tricky in most compiled languages.
In the long term, the most promising possibility is to use a distributed object system (CORBA, for example) to extend Pyrite's reach, and to allow Pyrite components to be used by programs in languages other than Python.

Finally, I plan to add more user-level support to Pyrite, in the form of graphical interfaces to most of Pyrite's functions. Other user interface improvements, including easier configuration and installation, will also be important. Pyrite could become the basis for a complete desktop application suite, using plug-ins to enable it to work with data from many palmtop applications. While it is not my goal to make Pyrite the only interface to Palm connected organizers and their data, I aim for it to be the most powerful and flexible one.

Availability

Pyrite and related tools are free software; most of the code is under the GNU Library General Public License, but parts are also available separately under a more liberal license like that of the Python interpreter and libraries. The source code can be downloaded from the Pyrite home page:

http://purl.oclc.org/net/n9mtb/cq/

Two Pyrite-oriented mailing lists are available: pyrite-announce is moderated and carries announcements only, and pyrite-discuss is open to public discussion. To subscribe, send a message to pyrite-announce-subscribe@egroups.com or pyrite-discuss-subscribe@egroups.com, respectively.