Client-side proxies: Existing client-side proxies

Client-side proxies	Master's thesis, May 2000	Tomas Viberg

5 Existing client-side proxies

While the focus of the last section was when and why someone should use client-side proxies, this section focuses on how the approach is actually used. Specifically, the questions examined here concern the implementation of available proxies and, if they do not fully realise the potential of the approach, how they might have been implemented. As a rough classification, these questions deal with the external and the internal aspects of client-side proxies. External is what is visible to the end-user, mainly the user interface. The specifics of the internal aspects, including questions about application architecture and performance, are mostly of interest to advanced users, administrators, developers, etc, but also of some interest to the end-user, since they affect the perception of usability and efficiency.

In addition to those examined in the previous section, six new proxies enter the field for closer examination. A4Proxy is a Windows-specific anonymising proxy, functionally similar to the Crowds proxy described in section 2 [A4Proxy 00]. Another acquaintance from section 2 is the WebMate proxy providing browse and search assistance [WebMate 99], again subject to scrutiny. ByProxy [ByProxy 98] and Muffin [Muffin 00] are extensible client-side proxies. Although they provide predefined modules for different processing tasks, the main feature is that they allow third-party developers to extend the functionality of the proxies by implementing modules of their own. Muffin is limited to processing HTTP streams, while ByProxy also has support for the mail (SMTP) and news (NNTP) protocols. WebMate, ByProxy and Muffin are all written in Java and supposedly platform-independent. While not entirely platform-independent, the privacy-enhancing Junkbuster proxy at least has open source-code [Junkbuster 99]. It blocks unwanted URLs, deletes unauthorised cookies and removes HTTP headers that might identify the user. Finally, Proxomitron is a client-side filtering proxy targeted at HTML text and HTTP headers, with both pre-configured filters and support for creating additional filters [Proxomitron 00]. Like A4Proxy, it is only available for the Windows platform.

This examination will not consider every aspect of every application. Of main interest are the examples that stand out, for good or bad, and these will be emphasised in the following subsections.

5.1 User interaction

For many years, the desktop metaphor has been predominant in computer-user interaction. With the advent of the Internet, and especially the World Wide Web, user interaction has partially changed shape. Today, hypertext documents viewed with Web browsers are a familiar and well understood means of user interaction, and many applications and online services embrace this method to provide advanced functionality. While interaction is dependent on the relatively limited expressiveness of the hypertext mark-up language, it can provide a simple and consistent interface to different types of services. In client-side proxies it is possible to facilitate user interaction in several ways, since the proxy is neither a pure stand-alone application nor an online service, but rather a hybrid of the two. This section explores different models of interaction, their impact on platform/client independence and the overall quality of user interfaces.

5.1.1 Interaction models

Apart from giving access to functionality and configuration, one of the responsibilities of the user interface is to communicate the state of the application to the user. In general, this means that the user interface (or parts thereof) should be visible near the client application and the processed content. Despite this, we have seen that most of the client-side proxies examined rely on application windows separate from the client applications for user interaction. In a way, this is natural, since it visualises the separation of proxy and client functionality. From another viewpoint, it is not so natural. Although there is a clear technological boundary between proxy and client, this boundary might not seem so obvious to the end-user. Rather, there is often a close semantic relationship between the processing of the content stream performed by the proxy and presentation in the client application. Possible ways to visualise this relationship are to integrate the user interface with the client application or to embed it in the requested document. One obvious requirement is that the content protocol or client application supports this.

There are exceptions to the principle of visibility, for example concerning content blocking (PureSight, SurfWatch) and other prohibiting or monitoring applications. In applications like these, designed for almost complete transparency, the end-users have no need (or business) to access the inner chambers of the application. Rather, the users should be more or less unaware of the fact that the applications are performing their duties, only revealing themselves when the user tries to do something unauthorised.

On the next step on the visibility ladder, we find applications like Freedom and WebWasher. They work actively with the content stream but without requiring incessant monitoring and without producing any additional information apart from the processed content. It might be enough for these kinds of applications to indicate that they are functioning properly. The Windows-specific applications show this via icons in the Windows system tray (figure 6). Granted, this is a highly platform-specific feature, but a user-friendly one since a simple mouse click can give access to the full user interface.

Figure 6. WebWasher in the Windows system tray.

By necessity, this level of visibility must also suffice for another class of proxies, for example ByProxy, A4Proxy and NewsProxy. ByProxy and A4Proxy work with multiple types of content and protocols, a diversity that makes it nearly impossible to use any other model of interaction than separate application windows or client-dependent integration. The same applies to the single-protocol proxy NewsProxy, since the news protocol does not support any reasonable way to incorporate the user interface in the content stream.

For other proxies, the user interface should be close to the workspace of the user, providing immediate feedback and configuration. However, since most examined applications use their own windows for interaction, this is not the case. SELECT, Proxomitron and Muffin are examples that use application windows although the content they work with makes it possible to use and adapt the actual content stream for user interaction purposes. The content involved is mainly HTML documents, where the protocol and the content language allow applications to interact with the user through the content stream. For hypertext documents, which predominantly means Web pages, there are three major alternatives for using the content stream for user interaction with the aid of Web browsers: presenting the user interface in a separate browser window, in a separate frame or directly in the requested document. Using a separate browser window has drawbacks similar to those of separate application windows - they might not catch the attention of the user or some other application might cover them. A covered window is mainly a problem for novice users, since they are not always aware that the windows represent a three-dimensional space. As opposed to integrated interfaces, with separate proxy windows it is possible for a user to arrange the desktop freely. Dedicated windows also ensure that requested documents are unaltered, as long as the functionality of the proxy does not involve content adaptation. However, separate windows do not facilitate a close union of user interface and content. Of the possible ways to integrate, the only one utilised by any of the proxies in this examination is embedding the interface in the requested document. WebMate inserts a controller applet (figure 7) at the bottom of each requested page, allowing the user to access the user interface in a separate applet window (figure 8). There is also a stand-alone application window for the administration of basic proxy functionality, but all functionality directed to the user is accessible via the controller applet.

Figure 7 (left). WebMate controller applet.
Figure 8 (below). WebMate main interface.

It could be possible to embed the interface in other types of content. A user might for example interact with an underlying proxy through e-mail messages. In situations where it is important that the user is not interrupted and where interaction can be asynchronous, this could make sense. However, direct interaction is preferred in most cases, and "normal" user interfaces provide a way of interaction that is undoubtedly more intuitive.

Muffin and ByProxy also use their own windows but since they are extensible, it is possible for third-party developers to provide closer integration of user interface and processed content, at least for some modules. This is especially true for Muffin, being an HTTP-only proxy. ByProxy works as a proxy for multiple protocols, making it harder to provide interfaces integrated through the content stream unless the extension module focuses solely on HTTP processing.

5.1.2 Integration, separation and independence

The choice of interaction model also has implications on the client-independence of the application. Proxies that integrate their user interface with the client application are mostly more dependent on the capabilities of specific clients than those that use their own application windows. WebMate and SELECT are two examples of this, since they use Java applets and/or JavaScript embedded in the requested Web document as part of their user interface. Although the most popular browsers available today support these technologies, other browsers do not. A high degree of both integration and client independence requires the user interface to be described in the basic language supported by the client application. Fulfilling this demand is most feasible in the context of the Web and Web browsers, but inconsistent support for various HTML features can still make the user interface unsuitable for some clients. To achieve complete client independence, a clear separation of the proxy user interface and the client application is necessary.

As we have seen, separation is the approach used by the majority of existing client-side proxies examined in this work, while none uses a pure HTML interface for interaction. Junkbuster comes closest, since its only attempt at a user interface is pure HTML, but this is merely a summary of version information and some variables that have been set during initialisation. To configure Junkbuster, the user must edit the configuration files manually.

The choice of interaction model also has implications for the dependence on, or independence of, specific operating system platforms. An interface using native graphical functionality and components is not platform-independent. An application limiting itself to interface components readily available in the graphical toolkits of diverse platforms might be more portable and less platform-dependent. For example, applications using the Windows-specific system tray are probably more platform-dependent than those that do not. Interfaces implemented in languages like Java or HTML are certainly more independent, but they are still limited to platforms that support the chosen implementation technique. In reality, all major platforms have the ability to display HTML and, perhaps to a lesser extent, Java interfaces. However, a decision not to use platform-specific components and functionality can have other effects on the overall quality and usability of the interface.

5.1.3 User interface quality

The quality of the interface has obvious impact on the usability and perceived complexity of user interaction. So what is a good interface? One determinant of a good interface is to what extent it fulfils the expectations of the user. If an interface complies with the look-and-feel of the underlying operating system, most users will consider it good enough since they are accustomed to similar environments. The standard interface components provided by operating systems more or less force platform-dependent interfaces into the appearance mainstream, thus introducing an element of standardisation. While this does not guarantee a high quality interface, at least it guarantees that users will not be completely confused. Let us consider some aspects of this, borrowed from Microsoft's user interface guidelines for Windows applications [Microsoft 00].

The first assumption is that the best interface is no interface. Instead of relying on interaction, the application just works, which in the end is what the user wants. A good example is the pornography blocker PureSight. Relying on computation rather than interaction, there is generally no need for the user to interact with the application. There is also the no-interface paradigm used by many UNIX-style applications, basing interaction on command-line arguments passed to the program at start-up, and by manual editing of configuration files. The Junkbuster proxy illustrates the approach. For a user that is familiar with this milieu, it can be a usable interface, perhaps a parallel to keyboard short-cuts in graphical environments. However, for users accustomed to the graphical interfaces, these console applications can be very frustrating to use. They give virtually no visual aid regarding the functionality of the application.

If there has to be a user interface, strongly focused applications are normally easier to manage and configure, as has already been stated in the previous examination of the news filtering proxy NewsProxy. This is also true for other proxies with tasks limited to a specific area, such as WebWasher and SurfWatch. A strong focus is imperative to create a simple interface, concentrating on essential functionality and promoting fast initial learning. To different degrees, Proxomitron, WebWasher, SurfWatch and PureSight all live up to this, having focused tasks and providing familiar environments with default configurations that allow a user to start using the application quickly and worry about the details later. Like WebWasher, A4Proxy provides access to all functionality in a single window, with a tabbed dialog to navigate through different configuration windows. However, the contents and presentation of the different windows are diverse, lending a certain degree of complexity to the interface.

The extensible proxies, Muffin and ByProxy, are generally harder to configure. The main reason for this is that, apart from configuration of the base application itself, it also requires installation and configuration of different third-party extension modules. Relating to many, possibly very different tasks, it is difficult to maintain a consistent configuration view, thus increasing the complexity of the process. For example, the Muffin interface is extensive enough, but not as consistent and easily understood as the WebWasher interface (figure 9). The total amount of configuration needed might not differ that much between focused and extensible proxies, at least not if the user wants to create aggregate behaviour with multiple single-task proxies. In such situations, the overhead of configuring several different applications adds to the complexity.

Figure 9. Interface samples from Muffin and WebWasher.

No matter the task, it is important that the user is in control of what happens. A good example is the Proxomitron proxy, where the user has easy access to functionality and can control what the proxy does to the content stream. Filter rules are accessible in table-like lists, with checkboxes to enable or disable them. Clicking on a rule brings up a dialog where the user can edit the behaviour of the specified filter. On the other side of the control spectrum is Junkbuster, allowing no runtime interaction whatsoever. Command-line arguments, raw text configuration and restarting to apply changes do not give most users a sense of control.

PureSight, Proxomitron, WebWasher and the other operating system-specific proxies use platform-native interfaces, which generally are faster and more responsive than platform-independent solutions. Java interfaces suffer from the overhead of the virtual machine, and the connection between proxy and client needed for HTML interfaces introduces communication overhead. The Java-based proxies show this clearly, since the interfaces of SELECT, WebMate, Muffin and ByProxy are all slower and less responsive than their platform-native counterparts.

Another thing that determines the perceived quality of an interface is the way it handles different modes. An application is in a special mode when it displays for example a dialog window that demands user attention before it is possible to continue normal operation. The ideal solution is a modeless interface that never interrupts the user. WebWasher exhibits such an interface, while the common case is that interfaces occasionally force a switch of mode. If this is necessary, the mode should at least be obvious and visible, such as file dialogs. A bad example of modal behaviour is the Muffin proxy. During configuration, numerous different windows might be opened, causing confusion as to what mode the application is currently in and what the results of an action will be.

For a user to feel in control of application behaviour, the interface must also provide directness. A user should be able to manipulate information directly within the application, and the interface should give access to all of the application's functionality and configuration options. This is normal behaviour for user interfaces, since their purpose is to be the link between user and application. Accordingly, a majority of the examined proxies provide access to the full spectrum of relevant information directly through their interface. However, some proxies store important information in configuration files separate from the application and the only way to access the information is through manual editing. Most notably, this applies to Junkbuster, having no interface, and NewsProxy, where the interface does not facilitate filter configuration. Available and visible information and presentation of possible choices reduce the reliance on a user's ability to recall the right actions. It is easier to recognise the appropriate actions, and directness in an interface thus alleviates the mental burden of the user.

That it is easier to recognise something than to recall it from memory leads to the next ingredient of a good user interface, consistency. There are two levels of this, consistency within the application and consistency within the operating environment. If an application is consistent with the general look-and-feel of the surrounding operating system, users already accustomed to this environment can transfer their existing knowledge to new software. A familiar and predictable interface facilitates quicker learning, which enables the user to focus more on the task at hand. Platform-specific proxies generally look and behave like other applications on the same platform (figure 10).

Freedom, SurfWatch, A4Proxy, Proxomitron, PureSight and other applications that are consistent within the operating environment have a lower learning threshold than for example Java-based applications. Since the Web also has become a familiar environment for many users, hypertext interfaces have a similar advantage. Although the interface does not look like the surrounding operating system, it looks like other Web documents. Users that understand the design of the Web will consider the interface consistent within its environment. In contrast, consistency is not a common characteristic of platform-independent Java applications. The applet interface of WebMate and the stand-alone Java interfaces of SELECT, Muffin and ByProxy lack the common design style that is one of the strengths of platform-specific applications. Although standardisation is not the only path to usable interfaces, it is de facto very important.

Figure 10. Consistent within the operating environment - PureSight and Proxomitron.

Following general design guidelines, environment-consistent applications are in general also consistent within themselves. This level of consistency requires that command names, presentation style, behaviour of operations, placement of elements, etc remain the same throughout the interface. An example of inconsistent behaviour is the rating buttons of the SELECT proxy. In the minimal rating interface, the buttons are located at the top of the window, which is inevitable since the window contains only these buttons and a button to expand the interface. When a user expands the interface, the rating buttons suddenly are close to the bottom of the window, creating an unnecessary inconsistency in the interface.

Users also expect some kind of response on the actions they perform, and application developers should make the effort to provide noticeable feedback on user actions. Again, the normal behaviour of the examined proxies is to provide feedback, communicating application status through messages, animations, etc. As usual, there are also exceptions. Editing filter rules in NewsProxy does not result in immediate response, since editing is separate from the application. To detect syntactical errors in the edited rules, the user has to restart the application. Another "feature" of systems with lacking feedback is frozen screens. The SELECT proxy demonstrates this. Whenever network communication takes place, the interface dies and is not resurrected until (and if) the communication is finished. Another annoying detail is that when a user switches between the minimal and the complete interface, the interface completely disappears for quite a while before it appears again.

The major determinant of a good interface is simplicity, providing smooth access to the complete functionality of an application. Extensive functionality might work against simplicity, and interfaces that maintain a strong focus and reduce the available information to the base requirements are generally simpler and more usable than more complex interfaces. For a proxy, the base requirements might be no interface at all, since proxies mostly run in the background. Depending on the task and the additional information produced, the interface design is more or less important to the proxy user. Nevertheless, even proxies providing completely transparent run-time services require installation and some configuration. A well-designed interface is always better than a poorly designed one, even if it is seldom used.

5.2 Application architecture

Just like with people, interior qualities are harder to evaluate than exterior. It requires an in-depth examination of what happens inside to get a thorough understanding. Gaining such an understanding of computer software internals requires study of application source-code or detailed system documentation. This poses a problem, since sources or documentation might not be readily available, especially for commercial systems. The extent and complexity of source-code also makes the task time-consuming beyond the limits set by the scope of this work. With this method disqualified, a black-box approach must suffice, looking at the outer signs to draw conclusions about the architectural issues regarding client-side proxies.

5.2.1 Monolithic or modular

One architectural issue is whether the application is modular or monolithic. Somewhat simplified, a monolithic application consists of one large application file, while a modular application is split into different modules with specialised functionality. Modular applications create links to external modules dynamically, while running. Applications built with statically linked modules at compilation time are not modular. In the context of this section, a modular application is one that uses dynamic linking. Among other things, the choice between static and dynamic linking has impact on application efficiency and ease of updating.

Of the examined client-side proxies, Junkbuster seems to be the only monolithic application, although built from modular source-code. Other applications might seem monolithic at a glance, but they probably use dynamic linking of platform-specific libraries, for example to gain access to graphics and network functionality. It is difficult to be certain of this using a black-box approach, but it is standard behaviour for modern platform-dependent applications. What is certain is that the Java applications SELECT, WebMate, Muffin and ByProxy are modular. All linking is dynamic in Java.

A modular approach could facilitate run-time loading and unloading of functional modules. By loading only basic functionality at start-up, application initialisation might be considerably faster. Loading additional functionality only when demanded could lessen the application's overall use of memory and processing power. However, the overhead introduced by dynamic loading might have negative effects on performance, and monolithic applications are generally faster. For Java applications, with both completely dynamic linking and dependence on the virtual machine, performance is often a problem. On the good side, a modular, dynamically linked application might be easier to update, since it does not need complete reconstruction after every update. This is particularly apparent in Java environments. Simply replace a class file containing a certain module with an updated version, restart the application, and the changes take effect. Easy updates of individual modules can improve the overall stability of an application. Of course, an updated module can also introduce new problems resulting in serious errors in dynamically linked environments, while the compiler might have discovered the problem at compile time in a monolithic application.
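The idea of loading functionality only when it is demanded can be illustrated with a minimal Java sketch. It is not taken from any of the examined proxies; the class name LazyModules and the module names are assumptions made purely for illustration. Each module is registered as a recipe and is only constructed the first time it is requested.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical registry that constructs each module only on first use. */
final class LazyModules {
    private final Map<String, Supplier<Runnable>> recipes = new LinkedHashMap<>();
    private final Map<String, Runnable> loaded = new LinkedHashMap<>();

    /** Registering a module is cheap: nothing is constructed yet. */
    void register(String name, Supplier<Runnable> recipe) {
        recipes.put(name, recipe);
    }

    /** Returns the module, constructing it the first time it is requested. */
    Runnable get(String name) {
        Supplier<Runnable> recipe = recipes.get(name);
        if (recipe == null) {
            throw new IllegalArgumentException("unknown module: " + name);
        }
        return loaded.computeIfAbsent(name, n -> recipe.get());
    }

    /** Number of modules that have actually been constructed so far. */
    int loadedCount() {
        return loaded.size();
    }

    public static void main(String[] args) {
        LazyModules modules = new LazyModules();
        // Module names are placeholders, not modules of any real proxy.
        modules.register("cookie-filter", () -> () -> System.out.println("filtering cookies"));
        modules.register("ad-filter", () -> () -> System.out.println("filtering ads"));
        System.out.println("loaded at start-up: " + modules.loadedCount());
        modules.get("ad-filter").run();
        System.out.println("loaded after first use: " + modules.loadedCount());
    }
}
```

The trade-off discussed above is visible here: start-up does no construction work, but the first request for a module pays the loading cost.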

5.2.2 Transparency

Designed as invisible middlemen working to improve the perceived performance of network communication, a major feature of the original proxy servers is transparency. Transparent service is also a trademark of the more versatile client-side proxies examined in this thesis. To behave as was originally intended, a proxy should perform its duties without drawing attention to itself. Monitoring and adaptation of the content stream should be invisible or appear as part of the functionality of the client application or operating environment. Ideally, the user should forget about the proxy once it is started.

Freedom, PureSight and SurfWatch provide the most transparent service, working with the low-level network functionality of Microsoft Windows. They access the content stream directly through the operating system, offering easy installation and complete transparency. There is no need to configure specific client applications since the operating system automatically monitors all communication on behalf of the registered proxies. The obvious gain is that no communication can bypass the proxy, but the downside is that a user can not decide to exclude some particular client application from proxy interference. Due to the smooth low-level integration with the operating system and the fact that these applications do not produce additional information separate from the content stream, a user can normally ignore their existence. After installation and configuration, the single-task filtering proxies WebWasher, Proxomitron, A4Proxy, Junkbuster and NewsProxy are equally unobtrusive. However, they rely on intra-machine communication for their functionality, which normally requires manual configuration of different client applications to make them send their requests through the proxy. While making installation slightly more complex, it lets the user decide which clients to subject to proxy processing.
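As an illustration of per-client configuration, the following Java sketch shows how a single client could opt in to, or out of, a proxy running on the same machine. It uses the standard java.net.ProxySelector class (Java 9 or later); the class name ClientProxyConfig, the host name and the port 8080 are assumptions and do not correspond to any of the examined proxies. The operating system-level interception used by Freedom, PureSight and SurfWatch is not modelled.

```java
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.ProxySelector;
import java.net.URI;
import java.util.List;

final class ClientProxyConfig {
    /** Assumed address of a locally running proxy; the port is a placeholder. */
    static final InetSocketAddress LOCAL_PROXY =
            InetSocketAddress.createUnresolved("localhost", 8080);

    /** A client that opts in sends its HTTP(S) requests through the local proxy. */
    static ProxySelector selectorFor(boolean useLocalProxy) {
        // ProxySelector.of(null) yields a selector that always connects directly.
        return ProxySelector.of(useLocalProxy ? LOCAL_PROXY : null);
    }

    public static void main(String[] args) {
        URI page = URI.create("http://example.org/");
        List<Proxy> viaProxy = selectorFor(true).select(page);
        List<Proxy> direct = selectorFor(false).select(page);
        System.out.println("opted in:  " + viaProxy);
        System.out.println("opted out: " + direct);
    }
}
```

A selector built this way can be handed to a client, for example through the proxy method of the java.net.http.HttpClient builder, so that only the clients chosen by the user are subjected to proxy processing.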

In the end, the task performed by the proxy determines whether true transparency is possible. The basic architecture of a proxy server provides transparency, but if a developer builds functionality that requires user interaction on top of the proxy, there is no guarantee for transparency. The extensible proxies Muffin and ByProxy exemplify this. The basic proxy functionality is running in the background, invisible to the user. An extension module has the option to be as transparent as the surrounding application, but it can also supply functionality that demands the user's attention. For example, the SELECT proxy for collaborative rating is built on top of Muffin. However, since this task clearly demands user interaction, the SELECT proxy is less transparent than Muffin and the average proxy. Although WebMate also requires interaction, it is more transparent. By providing interaction through the client application, it might seem to the user that the client environment and not a separate application provide the functionality. However, confusion might arise if the user moves to a machine where the proxy is not installed. Contrary to expectations, the client application does not provide the anticipated functionality. This is a problem common to all transparent services and applications, whether they are proxies or not.

5.2.3 Sophistication through aggregation

Most proxies focus on a specific task, such as breaking animation, removing personal information from requests, filtering, etc. A user might want to submit the content stream to many types of processing before it reaches the client application and for this, the proxies must have some support for aggregation.

Chaining is one way to support this, meaning that the output from one proxy is the input for another. Communication passes through multiple proxies on the way between client and server (figure 11), making the aggregate functionality of all proxies available to the user. This is the most common way to support aggregation, probably because the basic requirement is simply to change the network port (and optionally, the host) through which communication flows. All examined proxies except ByProxy and NewsProxy support chaining. SELECT does not seem to support chaining either, although it is based on Muffin which has chaining support. A4Proxy is not so straightforward regarding chaining, since its task involves relaying requests through external, privatising proxies. With the option to set a default relay proxy, chaining of local proxies is possible, but not a wise choice. To ensure privacy, the A4Proxy must be last in the chain, applied to communication just before it leaves the local machine. In this way, the proxy can relay communication through any remote, anonymising proxy it wishes.

Figure 11. Proxy chain between client and server.

Order could be important in proxy chains. A privacy-enhancing proxy should be the last stop between the client and the remote network. It also makes sense that a content blocking proxy performs its task before the document is processed by other proxies. Normally, users can control the chaining order by configuring the individual proxies. Although Freedom, PureSight and SurfWatch support chaining, close platform integration hides this aspect of configuration from the user. It is not possible for a user to decide in which order to apply these proxies to the content stream.
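The relation between port configuration and chaining order can be made concrete with a small Java model. It is only a sketch under stated assumptions: the class ProxyChain, the proxy names and the port numbers are hypothetical, and no network communication takes place. Each proxy is described by the port it listens on and the port it forwards to, and the route of a request is found by following these settings from the port the client is configured to use.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class ProxyChain {
    /** One local proxy: the port it listens on and the port it forwards to. */
    static final class Hop {
        final String name;
        final int listenPort;
        final Integer forwardPort; // null means "forward to the remote network"

        Hop(String name, int listenPort, Integer forwardPort) {
            this.name = name;
            this.listenPort = listenPort;
            this.forwardPort = forwardPort;
        }
    }

    /** Follows the forwarding settings from the client's port to the remote network. */
    static List<String> route(List<Hop> hops, int clientPort) {
        Map<Integer, Hop> byPort = new HashMap<>();
        for (Hop hop : hops) {
            byPort.put(hop.listenPort, hop);
        }
        List<String> path = new ArrayList<>();
        Set<Integer> visited = new HashSet<>();
        Integer port = clientPort;
        while (port != null) {
            Hop hop = byPort.get(port);
            if (hop == null) {
                throw new IllegalStateException("no proxy listens on port " + port);
            }
            if (!visited.add(port)) {
                throw new IllegalStateException("chain loops back to port " + port);
            }
            path.add(hop.name);
            port = hop.forwardPort;
        }
        return path;
    }

    /** True if the named proxy is the last stop before the remote network. */
    static boolean isLast(List<String> path, String name) {
        return !path.isEmpty() && path.get(path.size() - 1).equals(name);
    }

    public static void main(String[] args) {
        // Names and ports are placeholders chosen for this example.
        List<Hop> hops = Arrays.asList(
                new Hop("ad-filter", 8001, 8002),
                new Hop("cookie-filter", 8002, 8003),
                new Hop("anonymiser", 8003, null));
        List<String> path = route(hops, 8001);
        System.out.println("request path: " + path);
        System.out.println("anonymiser last: " + isLast(path, "anonymiser"));
    }
}
```

Changing the order of the chain in this model means editing the forwarding port of several proxies, which mirrors the configuration effort required with real chained proxies.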

Chaining of proxies is a simple and well-supported way of aggregating behaviour. It does require configuration of multiple applications and it could be bad for performance, as will be discussed later. An alternative is to support aggregation through extensibility, an approach we recognise from the Pavilion framework in section 2. An extensible proxy allows developers to implement plug-in modules to extend proxy functionality. This supports aggregate behaviour without configuration of multiple applications and without the overhead of communication between chained proxies. It is also possible to apply extensions in a user-defined order, and since configuration is limited to one application, changing the order is probably simpler in extensible environments. Apart from some client-side proxies, many different applications use this approach to enable third-party developers to extend the basic functionality of the application.
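Aggregation through extensibility can be sketched as a list of modules applied in a user-defined order inside a single application. The Java code below is an illustration only; the class ExtensionPipeline and the two example modules are assumptions and are not taken from Muffin, ByProxy or Pavilion. Content is modelled as a plain string to keep the example short.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

final class ExtensionPipeline {
    /** Example module: removes blink tags from the content. */
    static final UnaryOperator<String> STRIP_BLINK =
            s -> s.replace("<blink>", "").replace("</blink>", "");

    /** Example module: appends a comment stating the length of the content it received. */
    static final UnaryOperator<String> ANNOTATE_LENGTH =
            s -> s + "<!-- " + s.length() + " chars -->";

    private final List<UnaryOperator<String>> modules = new ArrayList<>();

    /** Appends a module to the end of the processing order. */
    ExtensionPipeline add(UnaryOperator<String> module) {
        modules.add(module);
        return this;
    }

    /** Moves a module to a new position; no other application needs reconfiguring. */
    void move(int from, int to) {
        modules.add(to, modules.remove(from));
    }

    /** Applies every module in order, feeding each one the previous result. */
    String process(String content) {
        String result = content;
        for (UnaryOperator<String> module : modules) {
            result = module.apply(result);
        }
        return result;
    }

    public static void main(String[] args) {
        ExtensionPipeline pipeline = new ExtensionPipeline()
                .add(STRIP_BLINK)
                .add(ANNOTATE_LENGTH);
        String page = "<p><blink>hi</blink></p>";
        System.out.println(pipeline.process(page));
        pipeline.move(0, 1); // annotate first, then strip
        System.out.println(pipeline.process(page));
    }
}
```

The two runs produce different annotations, since the second module sees either the stripped or the original document depending on the order. Unlike in a proxy chain, no inter-process communication is involved.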

What distinguishes an extensible application is that it allows dynamic loading of extension modules, modules possibly developed long after the first release of the application. A developer only needs to know about the application's programming interface and nothing about implementation particulars. With this knowledge, the developer can develop functionality extensions using the full expressiveness of supported programming languages. ByProxy and Muffin are the only extensible proxies in this examination, and their support for third-party extensions is the topic of the next subsection.

5.2.4 Development of third-party extensions

The extensible proxies Muffin and ByProxy are both implemented in Java, which is probably no coincidence. A basic requirement for extensibility is dynamic loading of extension modules, and Java has built-in support for run-time loading of classes. In addition, Java interfaces make it easy to enforce that an object provides the methods required of an extension module, regardless of module internals.
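A minimal sketch of these two Java features working together is given below. The Extension interface, the example module and the loader are hypothetical names invented for the illustration; they do not correspond to the actual code of Muffin or ByProxy.

```java
// Hypothetical extension contract; the name and the method are illustrative only.
interface Extension {
    String process(String content);
}

// An example module that could have been written long after the host application shipped.
final class UpperCaseExtension implements Extension {
    public String process(String content) {
        return content.toUpperCase();
    }
}

final class ExtensionLoader {
    // Loads a class by name at run time and checks that it honours the contract.
    static Extension load(String className) {
        try {
            Class<?> loaded = Class.forName(className);
            if (!Extension.class.isAssignableFrom(loaded)) {
                throw new IllegalArgumentException(className + " does not implement Extension");
            }
            return loaded.asSubclass(Extension.class).getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not load " + className, e);
        }
    }

    public static void main(String[] args) {
        Extension extension = load(UpperCaseExtension.class.getName());
        System.out.println(extension.process("hello"));
    }
}
```

The host application only refers to the interface. The concrete class is identified by a name that could just as well come from a configuration file, so the module need not exist when the host is compiled.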

In Muffin, a developer must provide a FilterFactory that, among other things, maintains the state of the application between sessions with the help of configuration functionality supplied by Muffin. As the name implies, the factory also supplies Muffin with Filter instances that receive and process content. What aspects of the content a filter can access depends on which interface(s) it implements. A ContentFilter can process requested documents directly through the stream flowing between client and server. An HttpFilter can intercept requests and send anything back to the client, and a RedirectFilter intercepts a request and redirects the client to another resource. A ReplyFilter filters replies from remote servers, and finally, a RequestFilter does the same with client requests. Muffin pre-parses the content stream to give developers easy access to the information of the stream, creating Reply and Request objects that encapsulate header information from client requests and server replies. The content stream is transformed from the original byte-stream format to a stream of specialised objects providing high-level access to the HTML content, such as tags, tag attributes, character data, etc.
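The division of labour between factory and filter can be sketched as follows. The types below are simplified stand-ins with hypothetical names; they are not Muffin's actual interfaces or method signatures, and they cover only the request-filtering role.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Simplified stand-ins for the roles described above. These are NOT Muffin's
// real types; every name in this sketch is hypothetical.
final class SimpleRequest {
    final Map<String, String> headers = new LinkedHashMap<>();
}

// Marker type for anything the factory can hand to the host.
interface SimpleFilter { }

// A filter that wants access to client requests implements this interface.
interface SimpleRequestFilter extends SimpleFilter {
    void filter(SimpleRequest request);
}

interface SimpleFilterFactory {
    SimpleFilter createFilter();
}

// Example module: removes a header that might identify the user.
final class StripRefererFilter implements SimpleRequestFilter {
    public void filter(SimpleRequest request) {
        request.headers.remove("Referer");
    }
}

final class StripRefererFactory implements SimpleFilterFactory {
    public SimpleFilter createFilter() {
        return new StripRefererFilter();
    }
}

final class FilterHost {
    // The host decides what a filter may see by checking which interface it implements.
    static void applyToRequest(SimpleFilterFactory factory, SimpleRequest request) {
        SimpleFilter filter = factory.createFilter();
        if (filter instanceof SimpleRequestFilter) {
            ((SimpleRequestFilter) filter).filter(request);
        }
    }

    public static void main(String[] args) {
        SimpleRequest request = new SimpleRequest();
        request.headers.put("Host", "example.org");
        request.headers.put("Referer", "http://example.org/previous");
        applyToRequest(new StripRefererFactory(), request);
        System.out.println(request.headers);
    }
}
```

Because the filter method is part of an interface, a module that omits it fails at compile time rather than at run time.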

Instead of using internal streams, ByProxy reads the stream into byte-buffer objects. For reading and writing header information, ByProxy provides high-level reply and request objects, named BrowserRequestHeader and ServerDocumentHeader. In addition, ByProxy provides IncomingEmail and OutgoingEmail objects, encapsulating mail-specific information. Through these objects, an email filter can easily access message headers, content body, server information, etc. There are no news-specific objects. Instead of using predefined interfaces, a ByProxy extension specifies the types of objects it is interested in processing. For example, a filter can specify that it wants access only to IncomingEmail objects, and when the proxy receives an email, it calls the sniff method of extensions with registered interest in the object. The sniff method should be available in an object called a sniffer. The sniffer is responsible for acting on or manipulating the data it receives from a so-called proxy agent. The agent handles communication monitoring, and notifies the sniffer when it encounters something of interest. There is no interface to enforce the existence of the sniff method, but it must be available for ByProxy to function properly.
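The consequence of calling a method that no interface enforces can be illustrated with reflection. This is only an assumption about how such a lookup could be implemented, not a description of ByProxy's internals, and the classes below (MailMessage, TaggingSniffer, BrokenSniffer, ReflectiveAgent) are hypothetical.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

// Hypothetical stand-in for a mail object handed to extensions.
final class MailMessage {
    String subject;

    MailMessage(String subject) {
        this.subject = subject;
    }
}

// A sniffer with the expected method, located by name rather than through an interface.
final class TaggingSniffer {
    public void sniff(MailMessage mail) {
        mail.subject = "[checked] " + mail.subject;
    }
}

// A module whose author forgot the method; the compiler has no way of noticing.
final class BrokenSniffer { }

final class ReflectiveAgent {
    // Returns true if the sniffer was invoked, false if it lacks a usable sniff method.
    static boolean dispatch(Object sniffer, MailMessage mail) {
        try {
            Method sniff = sniffer.getClass().getMethod("sniff", MailMessage.class);
            sniff.invoke(sniffer, mail);
            return true;
        } catch (NoSuchMethodException e) {
            // The omission is only discovered here, while the proxy is running.
            return false;
        } catch (IllegalAccessException | InvocationTargetException e) {
            throw new IllegalStateException("sniff could not be invoked", e);
        }
    }

    public static void main(String[] args) {
        MailMessage mail = new MailMessage("Meeting");
        System.out.println(dispatch(new TaggingSniffer(), mail) + " " + mail.subject);
        System.out.println(dispatch(new BrokenSniffer(), mail));
    }
}
```

Both sniffer classes compile without complaint. The missing method in the second one surfaces only when a message arrives.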

One of the major arguments for extensible solutions is to increase the productivity of third-party developers. Since the basic proxy functionality is available through the base application, a developer does not have to worry about the miscellany of the underlying technology. Indeed, this is a trademark of all layered solutions, such as operating systems, network protocols, etc. It also means that the overall application can evolve and become more attractive without constant involvement of the original developers. To increase the productivity of third-party developers, there must be a stable and understandable framework in which to develop extensions. Muffin's consistent use of interfaces ensures some degree of stability, while ByProxy's lax approach could result in serious run-time errors. Well-documented interfaces also make clear what is required of an extension module and because of this, it is probably quicker and easier for a third-party developer to produce extensions for Muffin than for ByProxy. The strength of ByProxy is that it is multi-protocol, allowing developers to process several types of content (Web, mail and news) within the same application environment.

Neither Muffin nor ByProxy provides a user interface specialised for presentation of processing results. They do provide a graphical interface for configuration, but the extension module itself must supply any other interface. Although a module that processes HTML content can easily display what it wants in the processed documents, the lack of interface support has potential negative effects. Several filters adding their information will clutter the requested document, developers creating their own interface will experience productivity loss, and modules without an interface could be less user-friendly.

5.2.5 Platform independence

The client-side proxy architecture is not inherently platform-independent. A proxy relies on the same more or less platform-dependent programming languages and operating environment functionality as other solutions. Nonetheless, there are four discernible levels of platform-independence exhibited by the proxies in this examination.

On the first and most independent level, we find the Java applications. SELECT, WebMate, Muffin and ByProxy can run virtually unchanged on any machine with a suitable Java virtual machine installed. In theory, they are platform-independent, but in reality, they are dependent on platforms with Java support. Despite this, they are more independent than any application targeted at a specific platform, since the virtual machine shields them from the particulars of the underlying operating system. On the next level, Junkbuster and NewsProxy are more platform-dependent, but since their source-code is available, they are at least portable to different platforms. As already discussed, porting is not a trivial undertaking and most users are limited to the pre-ported versions. However, as the Linux operating system and the GNU software have shown, open source projects tend to attract third-party developers whose effort results in availability for more platforms than the commercial alternatives.

The third level houses Proxomitron, WebWasher and A4Proxy. Although tied to the Windows platform, they behave as standard proxies and communicate through local network ports. This kind of network functionality is common across different platforms, and these applications should be portable without extensive structural changes. This is probably not the case with Freedom, SurfWatch and PureSight, all depending on platform-specific network functionality provided by the Windows operating system. They access the content stream directly through the operating system, a possibility that is not as common, or at least not as consistent, as the socket communication used by other proxies. Freedom, SurfWatch and PureSight constitute the fourth level, being thoroughly platform-dependent.

Regardless of the platform-independence of a specific proxy application, proxies are mobile. They can be located on the client machine, on a local network or anywhere on the Internet, and still be accessible to the user. Therefore, moving a proxy to a computer on the network where it is executable makes a platform-dependent proxy independent, at least in the eyes of the user. An obvious requirement is that the proxy either has no user interface or is able to display its interface through the content stream, as Junkbuster and WebMate do. However, moving the proxy to the network has negative side effects. Some of the benefits of a local proxy are lost, such as the ability to enhance user privacy before communication leaves the client machine, and the possibility to utilise local processing power for demanding tasks. In addition, network-based proxies will probably be multi-user systems, adding the complexity of multi-user environments to development and administration.

5.2.6 Performance impact

The introduction of proxies between server and client will have an impact on performance, primarily through increased response times. Several factors influence the degree of performance degradation. If the goal of the proxy is to improve performance, the gains of processing should obviously compensate for the cost. The only proxy for performance enhancement in this examination is WebWasher. By removing advertisements from requested Web pages, WebWasher clearly improves the overall performance. The proxy eliminates requests for ads from busy servers, resulting in faster retrieval of Web documents.

Another factor is the simplicity of the task. Simple processing has less impact on performance. One example is the relatively straightforward text matching used by Proxomitron, SurfWatch, Junkbuster and NewsProxy. While simplicity is a way to minimise performance loss, it generally leads to less sophisticated behaviour. In cases where the processing is more demanding, asynchronous processing might be a way to alleviate the performance impact. This is the approach used by SELECT and WebMate, since they only need a quick glance at document-specific information. After extracting this information, the proxy releases the content stream to the client application and continues its processing. Obviously, there is a period of waiting before the processing results are available, but it allows the user to view the document while waiting. Although the overall loss in performance might be considerable, it is not as noticeable as when all processing must be finished before the document can be displayed.
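The idea of releasing the content before the processing has finished can be sketched with a background task. The example is a minimal illustration under stated assumptions: the class AsyncAnalysisProxy and its word-counting "analysis" are hypothetical placeholders and do not reflect how SELECT or WebMate are implemented.

```java
import java.util.concurrent.CompletableFuture;

final class AsyncAnalysisProxy {
    // Outcome of handling one document: the content is available at once,
    // while the analysis result is delivered later through the future.
    static final class Handled {
        final String content;
        final CompletableFuture<Integer> wordCount;

        Handled(String content, CompletableFuture<Integer> wordCount) {
            this.content = content;
            this.wordCount = wordCount;
        }
    }

    // Placeholder for a demanding analysis; here it merely counts words.
    static int analyse(String document) {
        String trimmed = document.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }

    // Starts the analysis in the background and returns without waiting for it.
    static Handled handle(String document) {
        CompletableFuture<Integer> analysis =
            CompletableFuture.supplyAsync(() -> analyse(document));
        return new Handled(document, analysis);
    }

    public static void main(String[] args) {
        Handled handled = handle("a short example document");
        System.out.println("Released to client: " + handled.content);
        System.out.println("Analysis result: " + handled.wordCount.join());
    }
}
```

The caller can pass the content on immediately and pick up the result whenever it is ready. A blocking proxy would have to complete the analysis before returning anything.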

Where asynchronous processing is not possible, performance could certainly be a problem. Examples are Freedom and A4Proxy, since they encrypt communication and/or introduce privacy-enhancing detours from the optimal path between client and server. The porn blocker PureSight cannot use asynchronous processing either, since the content analysis must be done before deciding whether to show or to block the requested document.

Chaining multiple proxies for aggregate behaviour might have considerable impact on performance, since chaining requires socket communication between different proxies and the results of any content pre-processing are lost at each step along the chain. For example, a proxy could adapt the content to simplify processing. Before sending it to the next proxy in the chain, the proxy must restore the content to its original format, and every proxy in the chain might repeat this procedure of parsing and restoring. From a performance viewpoint, the extensible approach of Muffin and ByProxy could be preferable, since it performs pre- and post-processing of the content only once. However, most client-side proxies do not support extensibility, and even those that do might maintain the view of the content as a data-stream. Such a proxy uses internal streams to give extension modules access to the content. This means that a stream is sent to a module, which parses it and writes the result to another stream that is passed to the next module, and so on, until all modules have had access to the content. This is clearly inefficient compared to building a higher-level data structure from the stream and passing pointers to the structure to the interested modules.
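The difference between the two internal designs can be made concrete by counting how often the content is parsed. The sketch below is a simplified model, not code from any of the examined proxies: the class PipelineComparison, the token-based "parsing" and the two example modules are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.UnaryOperator;

final class PipelineComparison {
    // Counts how many times the raw content had to be parsed.
    static int parseCount = 0;

    // Placeholder for parsing: split the raw text into tokens.
    static List<String> parse(String raw) {
        parseCount++;
        return new ArrayList<>(Arrays.asList(raw.split(" ")));
    }

    // Placeholder for restoring the content to its stream format.
    static String serialise(List<String> tokens) {
        return String.join(" ", tokens);
    }

    // Stream style: every module receives raw text, parses it and writes raw text again.
    static String streamStyle(String raw, List<UnaryOperator<List<String>>> modules) {
        for (UnaryOperator<List<String>> module : modules) {
            raw = serialise(module.apply(parse(raw)));
        }
        return raw;
    }

    // Shared-structure style: parse once and hand the same structure to each module.
    static String sharedStyle(String raw, List<UnaryOperator<List<String>>> modules) {
        List<String> tokens = parse(raw);
        for (UnaryOperator<List<String>> module : modules) {
            tokens = module.apply(tokens);
        }
        return serialise(tokens);
    }

    // Two example modules: one drops "ad" tokens, one appends a marker.
    static List<UnaryOperator<List<String>>> exampleModules() {
        List<UnaryOperator<List<String>>> modules = new ArrayList<>();
        modules.add(tokens -> { tokens.removeIf(t -> t.equals("ad")); return tokens; });
        modules.add(tokens -> { tokens.add("END"); return tokens; });
        return modules;
    }

    public static void main(String[] args) {
        System.out.println(streamStyle("a ad b", exampleModules()) + " parses=" + parseCount);
        parseCount = 0;
        System.out.println(sharedStyle("a ad b", exampleModules()) + " parses=" + parseCount);
    }
}
```

Both variants produce the same document, but the stream style parses once per module while the shared structure is built a single time, regardless of the number of modules.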

Apart from simplifying the task and using extensibility rather than chaining, there are other ways to minimise the performance impact. Caching of documents comes to mind, since it is a function many ordinary proxies provide, but none of the examined proxies use internal caches of any sophistication. Moving ahead of the user to fetch documents that have not yet been requested is another way to improve at least the perceived performance. Pre-fetching increases the overall network traffic, but performance will probably improve for users following links in Web documents. WebMate provides pre-fetching of documents.

While caching, pre-fetching and other performance-enhancing methods could be valuable in a single-proxy environment, they might cause problems in multi-proxy chains. If several proxies attempt to cache or pre-fetch documents, the results are likely to be confusing and inconsistent. From this viewpoint, it is understandable that the performance-enhancing functionality provided by some of the examined proxies is limited to maximising the performance of the individual proxy. For example, the Freedom proxy allows the user to set the length of the privacy-enhancing detour in favour of either performance or security, and PureSight has the ability to remember previous processing results so that the same page does not have to be processed every time it is accessed.
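Remembering previous processing results amounts to a lookup table in front of the expensive analysis. The following sketch shows the general technique only. It assumes, for simplicity, that results are keyed by URL and never expire, and the class VerdictCache and its keyword-based classifier are hypothetical rather than a description of PureSight.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Predicate;

final class VerdictCache {
    private final Map<String, Boolean> verdicts = new HashMap<>();
    private final Predicate<String> classifier;
    private int classifierRuns = 0;

    // The classifier stands in for an expensive content analysis.
    VerdictCache(Predicate<String> classifier) {
        this.classifier = classifier;
    }

    // Returns true if the page should be blocked, analysing the content
    // only the first time a given URL is seen.
    boolean isBlocked(String url, String content) {
        Boolean cached = verdicts.get(url);
        if (cached != null) {
            return cached;
        }
        classifierRuns++;
        boolean verdict = classifier.test(content);
        verdicts.put(url, verdict);
        return verdict;
    }

    int classifierRuns() {
        return classifierRuns;
    }

    public static void main(String[] args) {
        VerdictCache cache = new VerdictCache(text -> text.contains("banned"));
        System.out.println(cache.isBlocked("http://example.org/a", "some banned text"));
        System.out.println(cache.isBlocked("http://example.org/a", "some banned text"));
        System.out.println("Analyses performed: " + cache.classifierRuns());
    }
}
```

A cache of this kind trades accuracy for speed: if the page behind a URL changes, the stored result is stale until it is discarded.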

Up to this point, we have mapped out the territory of client-side proxies. Now it is time to leave already trodden paths, and step into hitherto unknown domains. The next section introduces Blueberry, a prototype proxy extension. Although deeply rooted in the proxy environment, it stretches the boundaries set by other client-side proxies appearing in this work.
