Enhancing The Web Customer’s Experience: Techniques and Business Impacts of Web Personalization and Customization

Michael Drogan
Silberman College of Business Administration
Fairleigh Dickinson University
285 Madison Avenue, Madison NJ 07940-1099, USA

Jeffrey Hsu
Information Systems
Silberman College of Business Administration
Fairleigh Dickinson University
285 Madison Avenue, Madison NJ 07940-1099, USA

Abstract
The ease and speed with which business transactions can be carried out over the Web has been a key driving force in the rapid growth of e-commerce. The ability to track user browsing behavior down to individual mouse clicks has brought the vendor and end customer closer than ever before. It is now possible for vendors to personalize their product messages for individual customers on a massive scale, a phenomenon referred to as “mass customization” (Mobasher, Cooley and Srivastava, 2000). This paper will explore the topic of web personalization/customization. Simple techniques such as the ability for a user to create a personalized “home page” will be discussed as well as more advanced techniques such as web usage mining that do not rely on user input but instead on user activity. Using such techniques, sites can adapt themselves to user preferences without requiring that users take the time to complete profile information. In addition to individual personalization, this paper will explore the topic of group personalization or ‘adaptive’ web sites. As a web site grows and evolves, its original design may no longer be appropriate. Web servers record data about user interactions and accumulate this data over time. This paper will also cover the topic of “spyware.” The purpose of spyware is to record a web user’s surfing patterns and deliver more personalized and targeted advertising based on those usage patterns. Privacy, government issues, and the business impact of customization/personalization are also explored.
Keywords: web personalization, web customization, e-commerce, web development

1. INTRODUCTION
Personalization has become increasingly widespread. In a 2002 study of development professionals by Evans Data, three in four used some sort of dynamic content on their web sites and 56% reported that they were deploying personalization features. However, only 17% said that more than half of their web site was dynamic. This percentage is higher at e-commerce and financial-services sites. In addition, sites are increasingly happy with their efforts; respondents to the Evans survey indicated that web sites geared toward dynamic content would grow to 30% in the near future, up from 18% at the time of the survey (Schindler, 2002).

2. THE NEED FOR WEB PERSONALIZATION
The intense competition among Internet-based businesses to acquire new customers and retain the existing ones has made Web personalization a significant part of e-commerce (Mobasher, Dai, Luo, Sun and Zhu, 2000). In today’s highly competitive e-commerce environment, the success of a site often depends on the site’s ability to retain visitors and turn casual browsers into potential customers. Automatic personalization and recommender system technologies have become critical tools in this arena since they help tailor the site’s interaction with a visitor to his or her needs and interests (Nakagawa, Luo, Mobasher and Dai, 2001). The current challenge in electronic commerce is to develop ways of gaining a deep understanding of the behavior of customers based on data which is, at least in part, anonymous (Mobasher, Dai, Luo, Sun and Zhu, 2000). While most of the research in personalization is directed toward e-commerce functions, personalization concepts can be applied to any web browsing activity. B. Mobasher, one of the most recognized researchers on this topic, defines web personalization as any action that tailors the Web experience to a particular user, or set of users (Mobasher, Cooley and Srivastava, 2000).
Web personalization can be described as any action that makes the Web experience of a user personalized to the user’s taste or preferences. The experience can be something as casual as browsing the Web or as (economically) significant as trading stocks or purchasing a car. The actions can range from simply making the presentation more pleasing to an individual to anticipating the needs of the user and providing the right information, as well as performing a set of routine book-keeping functions automatically (Mobasher, 1999). User preferences may be obtained explicitly, or by passive observation of users over time as they interact with the system (Mobasher, 1999). The target audience of a personalized experience is the group of visitors who will see the same content as each other. Traditional web sites deliver the same content regardless of the visitor’s identity—their target is the whole population of the Web. Personal portal sites, such as MyYahoo! and MyMSN, allow users to build a personalized view of their content—the target here is the individual visitor. Personalization involves an application that computes a result, thereby actively modifying the end-user interaction. A main goal of personalization is to deliver some piece of content (an ad, product, or piece of information, for example) the end user finds so interesting that the session lasts at least one more click. The more times the end user clicks, the longer the average session lasts; longer session lengths imply happier end users, and happier end users help achieve business goals (Rosenberg, 2001). The ultimate objective is to own a piece of the customer’s mindshare and to provide customized services to each customer according to his or her personal preferences – whether expressed or inferred. All this must be done while protecting the customer’s privacy and giving them a sense of power and control over the information they provide (Charlet 1998). 
The bursting of the so-called “IT bubble” has put vastly increased pressure on Internet companies to make a profit quickly. Imagine if in a brick and mortar store it were possible to observe which products a customer picks up and examines and which ones he just passes by. With that information it would be possible for the store to make valuable recommendations. In the online world, such data can be collected. Personalization techniques are generally seen as the true differentiator between brick and mortar businesses and the online world and a key to the continued growth and success of the Internet. This same ability may also serve as a limitation in the future as the public becomes more concerned about privacy and the ethics of sites that collect personal information.

3. PERSONALIZATION VS. CUSTOMIZATION
Personalization and customization seem to be very similar terms. While the techniques do have similarities, it should be noted that there are some generally recognized differences. Customization involves end users telling the web site exactly what they want, such as what colors or fonts they like, the cities for which they want to know the weather report, or the sports teams for which they want the latest scores and information. With customization, the end user is actively engaged in telling the content-serving platform what to do; the settings remain static until the end user re-engages and changes the user interface (Rosenberg, 2001). Examples of customization include sites such as Yahoo! and MSN that allow users to explicitly create their own home pages with content that is meaningful to them. This technology is relatively simple to implement, as there is very little computation involved. It is simply a matter of arranging a web page based on explicit instructions from a user. Such technology is generally used as a basis for setting up a “portal” site.
Personalization is content that is specific to the end user based on implied interest during the current and previous sessions. An example of personalization use is Amazon.com. Amazon’s technology observes users’ purchasing and browsing behavior and uses that information to make recommendations. The technology is cognitive because it "learns" what visitors to a site want by "observing" their behavior. It has the ability to adapt over time, based on changes in a site's content or inventory, as well as changes in the marketplace. Because it observes end users' behavior, personalization has the ability to follow trends and fads (Rosenberg, 2001). The most basic personalization technique is to use what is known as a “cookie.” Cookies are small data files, typically less than 4 kilobytes. They are generated by web sites to track the number of visitors and to learn to which areas of the site those visitors are going. Cookies are accepted by a user’s web browser for storage on the client’s hard drive (Machlis, 1998). At future sessions, the web server accesses the cookie data, including log-on and password, so users don’t have to log on each time they visit. In addition, cookies can help to personalize a site with a message such as “Welcome name.” In the early days of the Internet, there was a great deal of concern regarding these files. Rumors existed that the cookies were able to launch destructive programs to damage a user’s hard drive. These rumors were later found to be baseless (Machlis, 1998). Cookies, when used alone, can provide only very basic personalization. While more modern personalization techniques do incorporate cookie technology, this is only a small piece of the overall personalization process. To date, most personalization systems for the Web have fallen into three major categories: manual decision rule systems, collaborative filtering systems and content-based filtering agents (Mobasher, Cooley and Srivastava, 2000).

4. MANUAL DECISION RULE SYSTEMS
Manual decision rule systems (MDRS) allow Web site administrators to specify rules based on user demographics or static profiles (collected through a registration process), or session history (Mobasher, Cooley and Srivastava, 2000). The rules are used to affect the content, the structure or the appearance of the information served to a particular user. Some systems that belong to this category are Yahoo!’s personalization engine and BroadVision (Vozalis, Nicolaou and Margaritis, 2001). Because they rely on manually specified rules and explicitly collected profiles, these systems are often categorized as customization rather than personalization, although that is not universally accepted. While BroadVision uses MDRS technology, they often refer to their system as personalization. Mobasher and others in the field would probably disagree with this definition based on how they define personalization (Mobasher, Dai, Luo, Sun and Zhu, 2000). Manual decision rule systems are more often associated with portals than with adaptive or personalized web sites. Rule-based personalization systems take a different approach to the problem of preferences than other systems. Instead of matching a user’s input to the profiles of other users, they match that input to a set of rules, or assumptions, about users’ behavior. For example, if a user tells a site that they are 8 years old and like comedies, a movie website using this technology might suggest the movie Aladdin. A user who is 80 years old and expresses the same preference might be offered Grumpy Old Men (Maddox and Blankenhorn, 1998). BroadVision was an early pioneer of MDRS technology with their Enterprise Business Portal series of products. Today, however, this technology is often being replaced with more advanced personalization techniques. Even BroadVision’s newer product offerings for B2B portals are not completely MDRS based but instead incorporate more complicated personalization techniques along with CRM technologies (www.broadvision.com).

5. COLLABORATIVE FILTERING SYSTEMS
One of the most widely used technologies for building personalization-and-recommendation systems is collaborative filtering (CF). Collaborative filtering systems, such as Firefly (now Microsoft) and Net Perceptions, typically take explicit information in the form of user ratings or preferences, and through a correlation engine return information that is predicted to closely match the users’ preferences (Mobasher, Cooley and Srivastava, 2000). Given a target user’s preferences obtained from their ratings, CF-based techniques, such as the “Nearest-Neighbor” approach, compare that record with the historical records of other users in order to find the top X users who have similar tastes or interests. The mapping of a visitor’s surfing record to its “neighborhood” could be based on similarity in ratings of items, access to similar content or pages, or purchase of similar items. The identified neighborhood is then used to recommend items not already accessed or purchased by the active user (Mobasher, Dai, Luo and Nakagawa, 2001). A good example of basic collaborative filtering is the well-known web site Netflix. This site allows users to rent movies online. Once users have a rental history, the system is able to make recommendations based on the history of the user. In addition, users of the site are able to explicitly rate movies. This gives Netflix even more information than the rental history since it is possible that someone would rent a movie but not actually like it. Given a rental history along with the users’ ratings of movies, the Netflix system is able to make accurate recommendations based on the rental history and ratings of other users who have similar tastes. Although Firefly is gone, the concept is certainly not. The most popular web sites, such as Amazon, Netflix and Lands’ End, rely on the collaborative filtering concepts pioneered by Firefly to provide recommendations to their customers (Oreskovic, 2000).
Some of the larger sites, like Amazon.com, rely on proprietary technology that is not produced by a specific vendor. However, there are companies today that have risen to fill the void left by the purchase of Firefly by Microsoft. Some of the more successful today are Art Technology Group (ATG), Blaze Software and Net Perceptions.

6. NET PERCEPTIONS
One of the most successful collaborative filtering companies on the market today is Net Perceptions (www.netperceptions.com). According to the Net Perceptions web site: “Using sophisticated analytics, Net Perceptions automatically injects sales and marketing intelligence such as targeted and relevant product recommendations, in real time into every customer interaction.” Net Perceptions has helped to fill the void left by the acquisition of Firefly and helped many organizations such as half.com, 3M and Musician’s Friend use personalization technologies for business advantage. The last section of this paper looks at those cases in greater detail. In addition, Net Perceptions helped power Amazon.com’s earliest reading recommendations. Amazon has since moved to a proprietary technology (Walker, 2001). Since larger sites have moved to proprietary technology, Net Perceptions has been forced to look beyond just its personalization software to try to make a profit. Today, Net Perceptions is helping J.C. Penney, Kmart and other chain retailers optimize the circulars that get printed and stuffed into Sunday newspapers. The idea is to use Net Perceptions software to analyze the effect of past print promotions by correlating them with detailed sales data, helping retailers figure out which products work best together to boost overall sales (Walker, 2001). It should be noted that Net Perceptions never made a profit throughout the whole “Internet boom.” They did manage to survive, however, which is much more than can be said for others.
They believe that their new strategy of working with established retailers as described above is their key to future profitability (Walker, 2001).

7. CONTENT-BASED FILTERING
Content-based filtering approaches such as those used by the popular “WebWatcher” program rely on content similarity of Web documents to personal profiles obtained explicitly or implicitly from users (Mobasher, Cooley and Srivastava, 2000). Sites using collaborative filtering technology make recommendations based solely on other users’ ratings or accesses, ignoring the content of the objects themselves. A content-based filtering approach attempts to improve the site based on what the pages say and what they are about. For example, one approach would be to analyze the text of Web pages at the site, and add links between pages that have similar texts. Presumably, such adaptations would make the site easier to use and thus improve the “true” quality (Perkowitz and Etzioni, 2000). Content-based filtering involves guessing where the user wants to go and taking the user there or providing a link. Path prediction may be done online, by predicting the user’s goal based on his path so far, or it may be done offline, statically computed based on user models. The WebWatcher program learns to predict what links users will follow on a particular page as a function of a model of their interests (Perkowitz and Etzioni, 2000). A link that WebWatcher believes a particular user is likely to follow will be highlighted graphically and duplicated at the top of the page when it is presented. Upon entering a site, visitors are asked, in broad terms, what they are looking for. Before they depart, they are asked if they have found what they wanted. WebWatcher uses the paths of people who indicated success as examples of successful navigations.
If, for example, many people who were looking for “personal home pages” follow the “people” link, then WebWatcher will tend to highlight that link for future visitors with the same goal. Note that, because WebWatcher groups people based on their stated interests rather than customizing to each individual, it falls on the continuum between individual customization and pure transformation (Perkowitz and Etzioni, 2000).

8. WEB PERSONALIZATION PROCESSES AND TECHNIQUES
Today many of the successful e-commerce systems that provide server-directed automatic Web personalization are based on collaborative filtering (Mobasher, Dai, Luo, Nakagawa, Sun and Wiltshire, 2000). Pure collaborative filtering is not computationally difficult. A user inputs some sort of rating for an item or object and the program recommends other items or links based on preferences of “neighborhoods” of users who entered similar ratings. There are well-known limitations to the basic CF approach. For instance, it becomes hard to scale these techniques to a large number of items while maintaining reasonable prediction performance and accuracy (Mobasher, Dai, Luo, Nakagawa, Sun and Wiltshire, 2000). A primary reason for the performance problem is that the nearest-neighbor approach requires that the neighborhood formation phase be performed as an online process, and for very large data sets this may lead to unacceptable latency for providing recommendations (Mobasher, Dai, Luo and Nakagawa, 2001). Also, the input is a subjective description of the users by the users themselves and may be prone to biases (Mobasher, Cooley and Srivastava, 2000). Since profiles are obtained explicitly from users, they may begin to lose accuracy over time as tastes change (Mobasher, Cooley and Srivastava, 2000). It is also known that CF techniques deliver poor performance when only sparse data is available for computations (Mobasher, Dai, Luo, Nakagawa, Sun and Wiltshire, 2000).
Web usage mining has the potential to overcome these limitations. However, usage-based personalization can be problematic when little usage data is available pertaining to some objects or when the site content changes regularly (Mobasher, Dai, Luo, Sun and Zhu, 2000).

9. WEB USAGE MINING
It is felt that Web usage mining, possibly used in conjunction with standard approaches such as collaborative filtering, can help address some of the shortcomings of these techniques (Mobasher, Dai, Luo, Sun and Zhu, 2000). To date, there have been several techniques proposed for effective personalization. For example, Perkowitz and Etzioni among others have proposed web usage mining as a mechanism for improving and optimizing the structure of a site (Perkowitz and Etzioni, 1997, 1998, 2000). Their techniques are discussed later in this paper. Other strategies include similarity indexing between groups of people as well as offline clustering of user records to reduce the online component of the system to search within a matching cluster (Mobasher, Dai, Luo and Nakagawa, 2001). However, most researchers such as Mobasher, Dai, Nakagawa, Sun and Wiltshire (among others) feel the critical step for effective personalization is to combine web usage mining with other techniques, such as collaborative filtering, to derive quality and actionable “aggregate user profiles” from these usage patterns. In addition, they propose that both usage and content attributes of a site should be integrated into a Web mining framework and used by the recommendation engine in a uniform manner (Mobasher, Dai, Luo, Sun and Zhu, 1999, 2000).

10. WEB PERSONALIZATION FRAMEWORK
Bamshad Mobasher and others have proposed the most recognized framework to date for such a system.
The framework seeks to improve the effectiveness of collaborative filtering techniques on anonymous clickstream data and provide meaningful recommendations to unknown users at the earliest possible stage in their interactions with the site (Mobasher et al., 1998, 1999, 2000, 2001).

11. OFFLINE PROCESSES
The offline component comprises the data preparation and specific usage mining tasks. The goal of the data preparation tasks is to create a server session file where each session is a sequence of “pageviews,” each represented by a Uniform Resource Identifier (URI), that is attributed to a single, but anonymous, user (Mobasher, Cooley and Srivastava, 2000). Pageview identification is the task of determining which page file accesses contribute to a single pageview and is heavily dependent on intrapage structure. On a single-frame site, a pageview can be represented by a single file, but on a multiframe site a pageview may contain several files (Mobasher, Cooley and Srivastava, 2000). When large numbers of these sequences of pageviews (user sessions) are known, an offline process will perform clustering on these user sessions to form candidate neighborhoods. In addition to being performed offline, this clustering is independent of any targeted user (Mobasher, Dai, Luo and Nakagawa, 2001). The online recommendation engine process will then compare a portion of an active user’s session to representatives from the discovered clusters to deliver recommendations to the user. These cluster representatives are known as “aggregate user profiles” (Mobasher, Dai, Luo and Nakagawa, 2001). The recommendation engine considers the active user session in conjunction with the discovered patterns to provide personalized content (Mobasher, Cooley and Srivastava, 2000). A major component of the Mobasher framework is data preparation, which occurs as an offline process.
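As a rough illustration of the goal of data preparation (turning raw server hits into sessions of pageviews attributed to anonymous visitors), consider the following sketch. It is not the framework's actual procedure: the 30-minute inactivity timeout, the use of a visitor key such as an IP address, and the suffix filter that stands in for pageview identification are all assumptions made for the example.

```python
from collections import defaultdict

SESSION_TIMEOUT = 30 * 60  # seconds of inactivity that ends a session (assumed value)
IGNORED_SUFFIXES = (".gif", ".jpg", ".png", ".css", ".js")  # embedded files, not pageviews


def build_sessions(hits, timeout=SESSION_TIMEOUT):
    """Group (visitor_key, unix_time, uri) hits into sessions of pageview URIs."""
    by_visitor = defaultdict(list)
    for visitor, timestamp, uri in hits:
        if uri.lower().endswith(IGNORED_SUFFIXES):
            continue  # crude stand-in for pageview identification
        by_visitor[visitor].append((timestamp, uri))

    sessions = []
    for visitor in sorted(by_visitor):
        current, last_seen = [], None
        for timestamp, uri in sorted(by_visitor[visitor]):
            # A long gap since the previous hit starts a new session.
            if last_seen is not None and timestamp - last_seen > timeout:
                sessions.append(current)
                current = []
            current.append(uri)
            last_seen = timestamp
        if current:
            sessions.append(current)
    return sessions


# Hypothetical hits: (anonymous visitor key, time in seconds, requested URI).
hits = [
    ("10.0.0.1", 0, "/index.html"),
    ("10.0.0.1", 5, "/logo.gif"),
    ("10.0.0.1", 60, "/books/"),
    ("10.0.0.2", 30, "/index.html"),
    ("10.0.0.1", 4000, "/music/"),
]

sessions = build_sessions(hits)
```

The resulting list of sessions plays the role of the server session file that the clustering step would consume.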
The data preparation phase is responsible for, among other things, allowing only URIs that represent meaningful or relevant pageviews to be included in the final server session file (Mobasher, Cooley and Srivastava, 2000). The session file obtained in the data preparation stage can be used as the input to a variety of data mining algorithms such as the discovery of association rules or sequential patterns, clustering, and classification (Mobasher, Cooley and Srivastava, 2000). These data mining algorithms can provide information regarding patterns from usage data. However, knowledge of these patterns is not the final step in this framework. The critical step is to generate actionable aggregate profiles in the form of usage clusters (Mobasher, Cooley and Srivastava, 2000). The Mobasher framework describes three important characteristics that these profiles should possess:
1. Capture possibly overlapping interests of users, since many users may have common interests up to a point (in their navigational history) beyond which their interests diverge (Mobasher, Cooley and Srivastava, 2000).
2. Provide the capability to distinguish among pageviews in terms of their significance within the profile (Mobasher, Cooley and Srivastava, 2000).
3. Have a uniform representation that allows the recommendation engine to easily integrate different kinds of profiles (multiple profiles based on different pageview types, or obtained via different mining techniques) (Mobasher, Cooley and Srivastava, 2000).
To meet these requirements, this framework represents usage profiles as weighted collections of URIs. Each item in a usage profile is a URI uniquely representing a relevant pageview, and can have an associated weight representing its significance within the profile (Mobasher, Cooley and Srivastava, 2000). Once these profiles have been computed, a recommendation still needs to be made.
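The two ideas described above, a usage profile as a weighted collection of URIs and an online step that compares an active session against such profiles, can be sketched together as follows. This is a sketch under stated assumptions rather than the framework's published algorithm: the clusters are taken as given, the weight of a URI is assumed to be the fraction of sessions in the cluster that contain it, the matching score is a cosine-style measure, and the thresholds `min_weight` and `min_score` as well as all URIs are hypothetical.

```python
from collections import Counter
from math import sqrt


def aggregate_profile(cluster_sessions, min_weight=0.5):
    """Turn one cluster of sessions into a weighted collection of URIs.

    The weight of a URI is the fraction of sessions in the cluster that
    contain it (an illustrative choice); low-weight URIs are dropped.
    """
    counts = Counter()
    for session in cluster_sessions:
        counts.update(set(session))  # count each URI once per session
    total = len(cluster_sessions)
    return {uri: n / total for uri, n in counts.items() if n / total >= min_weight}


def match_score(visited, profile):
    """Cosine-style similarity between a binary session vector and a profile."""
    overlap = sum(weight for uri, weight in profile.items() if uri in visited)
    norm = sqrt(len(visited)) * sqrt(sum(w * w for w in profile.values()))
    return overlap / norm if norm else 0.0


def recommendation_set(active_session, profiles, min_score=0.3):
    """Rank unvisited URIs from the profiles that match the active session."""
    visited = set(active_session)
    scores = {}
    for profile in profiles:
        match = match_score(visited, profile)
        for uri, weight in profile.items():
            if uri not in visited and weight * match >= min_score:
                scores[uri] = max(scores.get(uri, 0.0), weight * match)
    return sorted(scores, key=lambda uri: (-scores[uri], uri))


# Hypothetical clusters of sessions, as an offline clustering step might produce.
book_cluster = [
    ["/books/", "/books/scifi", "/cart"],
    ["/books/", "/books/scifi"],
    ["/books/", "/music/"],
    ["/books/", "/books/scifi", "/cart"],
]
music_cluster = [["/music/", "/music/jazz"], ["/music/", "/music/jazz", "/cart"]]

profiles = [aggregate_profile(book_cluster), aggregate_profile(music_cluster)]
recommended = recommendation_set(["/books/"], profiles)
```

Because the profiles are computed offline, the online step only compares the active session with a handful of profiles instead of with every past user.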
The Recommendation Engine
The activities described above are all offline processes in the Mobasher framework. The recommendation engine is the online component of a web personalization system. Its task is to compute a “recommendation set” for the current user session. This recommendation set is computed by matching the current user’s activity against one or more of the aggregate usage profiles generated in the offline process (Mobasher, Cooley and Srivastava, 2000). Notice that in this system the current user session is all that is used for recommendations. While this lack of user history may seem to be a limiting factor, many people use the Internet at different times for different reasons. This framework takes into account that recommendations made to a user in one session may not necessarily be appropriate in later episodes (Mobasher, Cooley and Srivastava, 2000).

Accuracy and Performance
While this initial framework increased the scalability of collaborative filtering systems, it also led to a drop in the accuracy of the recommendations. This tradeoff was initially expected (Mobasher, Dai, Luo and Nakagawa, 2001). Later work in 2001 by Mobasher, Dai, Luo and Nakagawa expanded on this framework’s data preparation procedures, for example with normalization and significance filtering, to improve the effectiveness of the clustering approach to collaborative filtering in the context of anonymous clickstream data (Mobasher, Dai, Luo and Nakagawa, 2001). The experimental results of this work indicate that with proper pre-processing, collaborative filtering based on aggregate usage profiles can generate recommendations with the same level of accuracy as the direct approach, while dramatically improving scalability (Mobasher, Dai, Luo and Nakagawa, 2001).
The real significance of these results is that web personalization can be achieved based entirely on anonymous users’ clickstream data even at very early stages of their visits (Mobasher, Dai, Luo and Nakagawa, 2001). This technique, along with other approaches such as that of Perkowitz and Etzioni (discussed later), may become invaluable for making recommendations to users who do not wish to give any personal information over the Internet or for generating recommendations for a casual web surfer. In addition, it is possible that one day the government might create a law forbidding the collection of any personal information on a web site without explicit user permission, in which case the value of this framework will become even more significant (Evans, 2001).

12. ARTIFICIAL INTELLIGENCE AND ADAPTIVE WEB SITES
So far, we have examined different ways to personalize the web experience to individual users. Through various techniques, it is possible to personalize a site based on a wide range of user interests. However, with most conventional techniques, there are still limitations. On a user’s initial visit to a site, we may not have any information to help in personalization. The same visitor may seek different information at different times (Perkowitz and Etzioni, 1997). Many sites outgrow their original design and accumulate links in unlikely places (Perkowitz and Etzioni, 1997). There may come a day when the government outlaws the collection of personal information over the Internet (Evans, 2001). There are some users who simply will not give any personal information on the Internet (Nunes and Kambil, 2001). A site may be designed for a particular kind of use, but may be used in many different ways in practice (Perkowitz and Etzioni, 1997). While it is generally agreed that personalization is important, there are some who have proposed other methods to improve the web experience of users. One such method is Artificial Intelligence (AI).
The previous methods seek to personalize a site; the goal of AI researchers is to transform the site into a better one (Perkowitz and Etzioni, 2000). The concept is simple: web users interact directly with a server maintained by the inventors of the service or authors of the content being served. As a result, data on their behavior is recorded in web server logs. Obviously, this raw data would be far too time-consuming for the site’s webmaster to process regularly. However, web server logs are excellent targets for automated analysis using AI techniques. The problem then becomes: how can we build a web site that improves itself over time in response to user interactions with the site? (Perkowitz and Etzioni, 1997). If such techniques could be perfected, they would address many of the shortcomings of “traditional” personalization techniques and produce a site that is able to evolve over time based on the cumulative actions of the site users. Such an optimized site would allow first-time visitors, about whom we know nothing, to more easily navigate the site based on the improvements. This approach to adaptive web sites is motivated by these key goals:
1. Avoid additional work for visitors such as completing surveys or questionnaires (Perkowitz and Etzioni, 1998).
2. Make the web site easier to use for everyone, not just specific individuals (Perkowitz and Etzioni, 1998).
3. Protect the site's original design from destructive changes. When creating a web site, a designer creates the look and feel of the site, the structure of the information, and the kinds of interactions available. When making automatic changes to such a site, damage to the site structure must be avoided (Perkowitz and Etzioni, 1998).

13. BUSINESS IMPACT OF PERSONALIZATION
While few dispute that personalization technology is important, there are some who question whether or not the value of this technology is as significant as many think.
In a study by Nunes and Kambil, 300 on-line consumers were surveyed for their opinions on web personalization. The results were surprising. Rather than have advanced personalization technology figure out their interests and present them with products they are predicted to be interested in, most customers would prefer to customize web interactions for themselves (Nunes and Kambil, 2001). Some think that the real way to get individualized interaction between a user and a website is to present the user with a variety of options and let the user choose what is of interest to that individual at that specific time (Nielsen, 1998). The argument here is that only the user knows what he is interested in at any specific moment in time. Yesterday a user might have been shopping for a gift for his grandmother and does not want that gift information stored in a profile for future recommendations. A key argument against customization is that web surfers are generally unwilling to take the time to complete profile information. However, Nunes and Kambil found in their survey of 300 web users that 93% of users reported manually customizing at least one site and 25% had customized four or more sites (Nunes and Kambil, 2001). Most users are concerned about privacy and many are hesitant to provide information to a web site when asked. However, the survey found that users were actually willing to provide information to a site when they were informed that the data would be used for personalization features and not to invade their privacy for some undisclosed purpose (Nunes and Kambil, 2001). Imagine in the case of the one-time gift for grandmother if the site were to ask if the purchase should be added to the user’s profile of interests for future recommendations. This would spare the customer months of recommendations that take the gift purchase into account.
A fundamental key to determining whether personalization technology will have a business impact is to realize that this technology is not necessarily beneficial to all types of sites. Michael Rosenberg, writer and analyst for Itworld.com, describes the types of sites that have the potential for a positive business impact from personalization:

1. Information-heavy content sites, such as technical resource, financial information, or equipment manufacturer support sites. This also includes sites that need to house full documentation but could benefit by applying the experience of "power users" to the general audience (Rosenberg, 2001).
2. Commerce sites that have a large number of SKUs, such as movie and entertainment sites or large retail sites that carry multiple lines of kitchen products. These are sites that carry, sell, and promote various types of goods and want to leverage user interaction beyond simple segmentation (Rosenberg, 2001).
3. Commerce sites that want to leverage direct user taste and preference. These sites might ask end users for ratings on items they like and dislike and use that as the basis for personalization (Rosenberg, 2001).
4. Commerce sites that actively carry, sell, and promote "trendy" items. Personalization allows these companies to catch and capitalize on fast-moving consumer trends without expensive and slow data mining efforts (Rosenberg, 2001).

Rosenberg also describes types of sites that have not experienced a positive business impact from the technology and should not expect to improve their business with it:

1. Sites that have a simple, well-defined business model, such as delivering winning lottery numbers or telephone book services and nothing else (Rosenberg, 2001).
2. Content sites that are focused, have a simple structure, and contain little content, such as a local airport site that describes directions and services (Rosenberg, 2001).
3. Commerce sites that feature a few simple products, such as coffee from a single roasting house or men's underwear from a single designer (Rosenberg, 2001).
4. Email campaigns for follow-up sales based on product affinity, such as selling the case or battery after the initial sale (Rosenberg, 2001).

It is difficult to assess the impact of personalization on sites that were founded on personalization ideas. One example is Amazon.com. The site was built with personalization in mind, so there is no before-and-after comparison with which to quantify the business impact of personalization. Another is Netflix, an enterprise built from the ground up with personalized movie recommendations as a core business function. There are, however, several cases of “brick and mortar” enterprises that implemented e-commerce and personalization with some success. Three noteworthy examples are Half.com, Musician’s Friend and J.Crew.

Half.com

Half.com, which is an eBay company, offers consumers a fixed-price online marketplace to buy and sell new, overstocked and used products at discount prices. Unlike auctions, where the selling price is based on bidding, the seller sets the price at the time the item is listed. The site currently lists a wide variety of merchandise including books, CDs, movies, video games, computers, consumer electronics, sporting goods and trading cards (www.half.com).

To increase customer satisfaction as well as company profits, Half.com decided to implement personalization technology. Product recommendations would be presented at numerous locations on the site, including the product detail, add-to-wish-list, add-to-cart and thank-you pages. Each point of promotion would include three to five personalized product recommendations. In addition, the site would generate personalized, targeted emails. For example, Half.com would send a personalized email with product recommendations that are relevant based on prior purchases.
In addition, they would send personalized emails to attempt to reactivate customers who had not made a purchase in more than six months (www.netperceptions.com).

Half.com decided to try out Net Perceptions technology to meet these needs. As a proof of concept, Net Perceptions and Half.com performed a 15-week effectiveness study of Net Perceptions’ recommendation technology to see whether they could show a business benefit large enough to justify the cost of the product and the implementation (www.netperceptions.com). For the study, visitors were randomly split into groups upon entering the Half.com site. Eighty percent of the visitors were placed in a test group and the remaining twenty percent were placed in a control group. The test group received the recommendations and the control group did not. The results of this test showed Half.com the business benefits of personalization technology. The highlights were:

Normalized sales were 5.2 percent greater in the test group than in the control group.
Visitor-to-buyer conversion was 3.8 percent greater in the test group.
Average spending per account per day was 1.1 percent greater in the test group.
For the email campaign, 7 percent of the personalized emails generated a site visit, compared to 5 percent of the non-personalized emails.
When personalized emails were sent to inactive customers (those who had not made a purchase in six months), 28 percent of them proceeded to the site and actually made a purchase.

J.Crew

J.Crew is one of the clothing industry’s most recognized retailers, with hundreds of clothiers around the world and a catalog on thousands of doorsteps with every new season. J.Crew is a merchandising-driven company, which means the goal is to get customers to exactly what they want as easily as possible (www.atg.com).
Dave Towers, vice president of e-commerce operations, explains: “As a multi-channel retailer, our business is divided between our retail stores, our catalog and our growing business on the Internet.” J.Crew understood the operational cost reductions that could be achieved by migrating customers from the print catalog to J.Crew.com (www.atg.com). To accommodate all of their Internet customers, J.Crew built an e-commerce infrastructure that consistently supports about 7,000 simultaneous users and generates up to $100,000 per hour of revenue during peak times.

J.Crew realized early on that personalization technology would be a critical area of focus if they were to succeed in e-commerce. As Mr. Towers put it, “A lot of our business is driven by our ability to present the right apparel to the right customer, whether it’s pants, shirts or sweaters, and then up-sell the complementary items that round out a customer’s purchase.” J.Crew’s personalization technology has allowed them to refine the commerce experience for Internet shoppers.

J.Crew has taken notice of the advantages that personalization technology has brought to their e-commerce site. The expanded capabilities delivered by personalization have given J.Crew a notable increase in up-sells, or UPTs (units per transaction), thanks to the ability to cross-sell items based on customers’ actions on the site (www.atg.com). Towers explains: “we can present a customer buying a shirt with a nice pair of pants that go with it, and present that recommendation at the right moment in the transaction. The combination of scenarios and personalization enable us to know more about a customer’s preferences and spending habits and allows us to make implicit yet effective recommendations” (www.atg.com). Clearly, J.Crew is the type of e-commerce site that can directly benefit from personalization technology.
With their business model and the right technology implementation, J.Crew is one company that has been able to make very effective and profitable use of the Internet.

Musician’s Friend

Musician’s Friend, which is a subsidiary of Guitar Center, Inc., is part of the world’s largest direct marketer of music gear. Musician’s Friend features more than 24,000 products in its mail-order catalogs and on its Web site. Products offered include guitars, keyboards, amplifiers, percussion instruments, as well as recording, mixing, lighting and DJ gear (www.musiciansfriend.com).

In 1999, Musician’s Friend realized that both its e-commerce and catalog sales were underperforming. The company had vast amounts of customer and product data but was not leveraging this information in any intelligent or productive way (www.netperceptions.com). It sought a solution that would increase its e-commerce and catalog revenues by better understanding its customer and product data interactions and leveraging this knowledge to generate greater demand (www.netperceptions.com).

To meet these objectives, Musician’s Friend decided to implement web personalization technology. The company felt it could personalize the shopper’s experience while at the same time gaining a better understanding of the vast and complex relationships between products, customers, and promotions. Successful implementation would result in more customers, more customer loyalty and increased revenue (www.netperceptions.com).

Musician’s Friend decided to implement Net Perceptions technology. For their site, they did more than make recommendations based simply on the shopper’s preferences. They took user preference information and combined it with knowledge about product relationships, profit margins, overstock conditions and more. Musician’s Friend also leveraged personalization technology to help their catalog business.
The merchandising staff quickly noticed that the technology could help them determine which of the many thousands of products available on the web site to feature in catalog promotions (www.netperceptions.com).

The results were impressive. In 2000, catalog sales increased by 32% while Internet sales increased by 170%. According to Eric Meadows, director of Internet for the company, “We have been able to implement several enhancements to our site as a direct result of the Net Perceptions solution, including using data on the items customers return to refine and increase the effectiveness of the additional product suggestions the site recommends” (www.netperceptions.com). Their personalization solutions helped Musician’s Friend generate a substantial year-over-year increase in items per order; in other words, they intelligently generated greater customer demand (www.netperceptions.com).

14. PLANNING, IMPLEMENTATION AND COST

Once a company has decided to personalize its web site, it has a wide array of products to choose from and many business decisions to make. There are plenty of choices, from the template-driven packages of BroadVision and Vignette to the more flexible development tools of Art Technology Group (ATG). There are also site analysis tools, profiling systems, data analysis engines and collaborative filtering products (Sliwa, 2000). These choices come at very different costs and have different implementation options.

Fortunately, the price of personalization in general has dropped considerably in recent years. In 1998, Forrester Research estimated that the cost of a truly tailored site was about $5.5 million (Maddox and Blankenhorn, 1998). These days, you can buy a copy of Microsoft’s Commerce Server 2000, which provides personalization, for about $13,000 (www.pcworld.com). However, effectively implementing personalization is more than simply purchasing an off-the-shelf software package.
A company must consider how personalization will help to accomplish its business goals and how much is an appropriate amount to spend on the technology. Richard Dean describes seven key issues that should be resolved before implementing personalization:

1. What behavior are you hoping to enable: driving sales, generating traffic, or creating a knowledge base? Before considering products and implementation strategies, create a well-developed plan for what you're trying to accomplish. Figure out how you'll measure success and judge software options against these plans (Dean, 2000).
2. Define your business rules; whatever your product, there is a logic to how it is marketed and sold. Sample business rules include making certain items available only to subscribers, giving some items priority shipping, and suggesting item y if item x is out of stock (Dean, 2000).
3. Is your data ready for use by a consumer-accessible, outward-facing Web site? Is your data organized with the proper intelligence so that products like skis, for example, are linked to bindings? In order to create this kind of data connection, you'll need to work with your marketing team to determine the relationships between products and information. Only when these issues are settled will you be able to offer visitors custom product-sets on the fly (Dean, 2000).
4. Don't underestimate the effort needed to create the necessary logic for your personalization effort, as well as the time and learning curve required to tag the content. Make sure you have sufficient technical resources to complete the personalization effort (Dean, 2000).
5. Don't try to add customization to a large site all at once. Begin with the areas most likely to drive sales, traffic, and customer loyalty, and then branch out (Dean, 2000).
6. After determining the level of customization you'd like to enable, figure out what it will cost to build it (including hardware and support). Then, do a cost-benefit analysis: if, for example, personnel, hardware and software costs are going to be $50,000 the first year, figure out how long it will take to make that money back. Can you create a realistic, measurable return on investment (ROI) model under which the personalization enhancements drive enough sales or increase banner rates to make up for the high expenses? (Dean, 2000).
7. Should you even enable customization on your site at all? There is no simple answer because it depends on what you're trying to achieve (Dean, 2000).

Once these key issues have been addressed, a company needs to decide on a solution. This solution will consist of a combination of the customization product and the implementation. With regard to implementation, there are three main paths a company can follow: Buy, Build or Outsource (Sliwa, 2000).

Buy

There are plenty of good solutions for companies looking to purchase personalization software. When considering these products, however, it should be noted that the biggest cost in site personalization is generally the people and not the technology (Maddox and Blankenhorn, 1998).

Build

Some companies decide to build their own personalization software. The best known in this group is Amazon.com. Amazon considers its personalization capabilities to be a business differentiator and does not want to rely on a vendor to provide this capability. For that reason, it developed its own personalization engine. Since the majority of the cost of this approach consists of programming time and talent, it is difficult to measure accurately.

Outsource

This option may be the most agreeable to organizations that are on a budget or lack in-house programming talent. With this option, you generally hire the professional services division of a software vendor to implement, upgrade and manage their own product. One example of this can be found in the Children’s Place Retail Stores, based in Secaucus, NJ.
The Children’s Place left it up to Net Perceptions to set up its collaborative filtering software with whatever partners needed to be involved. The Children’s Place has been extremely happy with the results. They recently instructed Net Perceptions to initiate a targeted marketing campaign. Hoping for a 9% click-through rate, the company reached 19.1%, with 14% of those customers actually purchasing the suggested product (Sliwa, 2000).

In addition to software vendors charging for implementation, there have been vendors with more imaginative pricing strategies. For example, Dynaptics Corp offers a performance-based model for its personalization software. Dynaptics’ hosting service, called GetPersonal, offers the company’s Personal E.ssistent technology at a subscription fee of $3,000 per month. The company will negotiate the performance-based portion of the pricing model with individual customers based on click-through and conversion rates (Callaghan, 2001). The performance-based part of the pricing model would typically add another $3,000 per month to the company’s fees if the service performs as advertised. Dynaptics’ standard license includes a $10,000 one-time activation fee, a $75,000 annual subscription fee and a 10% maintenance fee.

Clearly, there are plenty of options for a company to choose from. A company must clearly understand its business goals and budgetary limitations before making a decision regarding the technology and implementation.

15. THE ETHICS AND LEGALITY OF PERSONALIZATION

One aspect of web personalization that is a constant point of debate is that information about a person is often collected without their knowledge. Some companies abuse the information they learn from users, resulting in the most hated product of Internet commerce: spam (Ouellette, 2001). The deluge of junk email hitting Internet users is making many of them question the ethics of the companies that are collecting this personal information.
Besides the annoyance of spam, there are other reasons to be concerned about this type of data collection. Consumers who choose to do business online must reveal credit card information, personal information such as their address, and information regarding their shopping habits and purchases. Credit card numbers can be stolen and misused. Also, there are some purchases people wish to keep secret. If this type of information were to fall into the hands of an unethical person or organization, it could lead to great embarrassment or other consequences.

Spyware

There has been much debate recently about the use of “Spyware.” So what is Spyware? The most benign of these programs simply serve advertisements. Others can collect detailed information about a viewer's behavior and send it back to a parent company the person likely knows nothing about. Many change the settings of a browser or other software, sometimes in ways that only someone with sophisticated technical knowledge can reverse (Borland, 2003).

Spyware generally presents the user with not just advertising, but personalized advertising. The web now allows advertisers to do something they cannot easily do with any other consumer experience: spy on consumers to determine their interests and target advertising to those interests. This new form of personalized advertising is doing the same thing as all other personalization. It observes web surfers’ browsing behavior and creates personalized content, in this case advertising, for that user. While this idea may seem harmless enough, these spyware programs are becoming highly controversial with the government, Internet users and Internet Service Providers (ISPs).

None of this is yet illegal, and in most cases, notice of such functions is contained somewhere in a piece of software's terms of service or license agreement. But critics say few people read these agreements.
As a result, innocent web surfers can unknowingly wind up with software that monitors their behavior, soaks up their computing and network resources, and, in extreme cases, can even damage their computers (Borland, 2003).

Earthlink is one example of an ISP that has been affected by spyware programs. Over the past few months, Earthlink has been receiving more and more complaints from its users about Spyware (Borland, 2003). Users generally call regarding another matter, but Spyware turns out to be the source of the problem. When users find out that Spyware exists on their computer and has been sending their surfing habits to the Spyware maker’s parent company, they are almost always quite upset. As Jim Anderson, Earthlink’s vice president for product development, was quoted as saying, “They feel that their trust has been broken” (Borland, 2003).

Gator

One of the most controversial Spyware programs on the Internet today is Gator. Gator is a program whose original stated purpose was to automatically fill in passwords and other areas of web pages. This program is available for free download. However, the real purpose is to load an advertising spyware module called OfferCompanion that displays pop-up ads when visiting some sites (Gill, 2003). Gator boasts that since its software is always running, it can spam users with “Special Offers” and other ads anywhere they go, including competitors’ sites (www.gator.com). Gator is able to do this with high levels of accuracy since it is able to track and spy on users’ browsing behavior. According to “SimplytheBest” Spyware information: Gator targets consumers based on site visitation and/or historical behavior. Your personal information is stored on your personal computer in an encrypted file. Gator accesses this personal information on occasion, using your IP address to help diagnose. Gator provides aggregate statistics about its customers, traffic patterns and related site information to third party vendors.
In order to provide this service, they collect information on your web usage.

It is not just the fact that Gator spies on users to deliver personalized advertising that is causing such a controversy. It is also the manner in which Gator delivers that advertising. Gator is able to display ads in two different and highly controversial ways. Under the first method, a person who is visiting a flower-delivery company’s Web site might receive a pop-up ad for a rival site. Under the second, Gator is able to paste ads of the same dimensions on top of the banner ads being used on popular sites such as Yahoo!. Critics have compared the technology to intercepting Time magazine as it is mailed to readers’ homes and gluing a new ad over the back page (Olsen, 2002). Gator defends these actions by claiming that they are giving consumers choices.

So why would users allow a program to run on their PCs and perform this activity? The answer is free software. The biggest culprit here is the file-sharing program Kazaa. This program is currently one of the most popular downloads on the Internet (www.cnet.com). Spyware is included as part of the installation, and information explaining its functioning is detailed in the license agreement that most people do not read. While there are some aspects of the spyware that users are allowed to opt out of, there is a core spyware program called Cydoor that is mandatory. Kazaa will not run without it (Gill, 2003). The bottom line is that a company must make money somehow. If people want free software, they must be willing to accept spyware at some level.

Government Action

These consumer concerns have not gone unnoticed by the U.S. government. In October 2000, Sen. John Edwards filed the Spyware Control and Privacy Protection Act. Under the legislation, companies that use code to track the activities of Internet users would have to notify consumers in plain language when the users surf their sites or download information.
No information on Internet surfing habits could be collected without first obtaining each consumer’s permission (Evans, 2001). While Congress failed to take action on that bill, it shows the concern of some government officials. Notice that this law would have applied to all forms of personalization technology, not just targeted spam and advertising. Collaborative filtering, including all nearest “neighborhood” approaches, would potentially have been affected by this legislation.

The politically libertarian foundation of the Internet is certain to make any new law a difficult proposition. Many prefer technological solutions, as evidenced by a growing grassroots movement of programmers dedicated to thwarting intrusive programs (Hansen and Borland, 2002).

Amazon.com, one of the most visible and successful users of personalization techniques, has also created some controversy with its information gathering methods. As recently as October 2002, privacy groups urged state authorities to restrict Amazon’s data collection efforts, calling the online retailer untrustworthy (Pruitt, 2003). In a letter sent to more than a dozen state attorneys general, the US Federal Trade Commission and other officials, the privacy advocates called for changes to Amazon’s privacy policy that would give customers greater control over their information, including the ability to keep their purchase records from being transferred and to disassociate their identity from any or all transactions (Pruitt, 2003). The feeling of these groups is that, as a general rule, bookstores should not be selling information on their customers’ reading habits. Amazon has made some concessions to the privacy groups, but it seems that this debate will continue for some time and has the potential to affect most of the personalization technology in use today.

16. CONCLUSION

The subject of web personalization has changed dramatically since the days when the concept was first developed in an MIT research lab in the mid-1990s. Through the mid and late 1990s, web personalization was a favorite topic of scholarly papers from researchers such as Mobasher, Perkowitz, Etzioni, Dai, Nakagawa and many others. Companies realized that personalization was a must for e-business success, but many were hampered by the cost (about $5.5 million according to Forrester Research in 1998) as well as known problems with integration, a lack of qualified technical talent, privacy concerns and the dizzying array of small startup companies that offered a wide range of solutions. This situation is not unique; these are the qualities of any cutting-edge or emerging technology.

Today, web personalization is quickly evolving into a mature technology. Software giants such as Microsoft and IBM are integrating personalization into their latest e-commerce offerings, Commerce Server 2000 and WebSphere respectively. Their R&D divisions are taking the place of the scholarly work that had been performed by academics. This is evident from the advanced personalization technology integrated into these offerings and the precipitous decline in published scholarly work on the topic of personalization.

As with any maturing technology, the price of implementation has been declining dramatically. Whereas the cost of the technology was estimated at about $5.5 million in 1998, there are now many offerings below $100,000 and even many below the $25,000 mark.

A key issue for the future of personalization will be privacy concerns. As covered in this paper, legislation has been introduced in the US Congress that would prohibit collecting information on users’ surfing habits without their permission. While it has not passed, it shows the concerns of many people as well as lawmakers. A key to the privacy issue will be the ability of corporations to police themselves.
To achieve this, industry leaders in the World Wide Web Consortium (W3C) have created the P3P standard to give users more control over the privacy of their information. According to the W3C, about 25% of the top 100 sites have currently adopted the P3P standard. However, they acknowledge that the adoption rate has slowed recently. They attribute the slowdown to the general state of the economy and the possibility of privacy officer teams being downsized (www.w3c.org). It is clear, though, that privacy will play a large role in the future of personalization.

17. REFERENCES

Borland, J., 2003. “Spyware Epidemic Rallies Call For Action.” www.ZDNet.com, February.
Callaghan, D., 2001. “Low-Cost Personalization Option Looms.” Eweek. March.
Charlet, J., 1998. “BroadVision.” Stanford University OIT-21.
Cooley, Mobasher, and Srivastava, 1997. “Web Mining: Information and Pattern Discovery on the World Wide Web.” Technical Report TR 97-027. University of Minnesota.
Dai, Mobasher, Luo, and Nakagawa, 2001. “Discovery and Evaluation of Aggregate Usage Profiles for Web Personalization.” Data Mining and Knowledge Discovery, 6:61-81. Kluwer Academic Publishers.
Dai, Mobasher, Luo, and Nakagawa, 2001. “Effective Personalization Based on Association Rule Discovery from Web Usage Data.” From the 3rd ACM Workshop on Web Information and Data Management, November 9th, Atlanta, Georgia. DePaul University.
Dalton, G., 1997. “OPS: Answer To Cookies?” Information Week. October 13th.
Dean, R., 2000. “Personalizing Your Web Site.” Builder.com. June 2nd.
Evans, J., 2001. “Senator Proposes Spyware Security Bill.” www.infoworld.com. January.
Gill, L., 2003. “PC Spies at the Gate.” www.newsfactor.com. January 3rd.
Greening, D., 2003. “Data Mining on the WEB.” New Architect Magazine on www.webtechniques.com.
Hansen, E. and J. Borland, 2002. “Addressing the Cause, not Symptoms.” www.cnet.com. June 24th.
Lesch, S., 2000. “P3P 1.0: A New Standard in Online Privacy.” www.w3.org.
Lynch, R., 2000. “What’s All the Fuss About?” www.Itworld.com. October.
Maddox, K. and Blankenhorn, D., 1998. Web Commerce: Building a Digital Business. John Wiley and Sons, Inc.
Mobasher, B., 1999. “WebPersonalizer: A Server-Side Recommender System Based on Web Usage Mining.” Published in Technical Report TR-01-004. DePaul University.
Mobasher, Cooley, and Srivastava, 2000. “Automatic Personalization Based on Web Usage Mining.” Communications of the ACM. Vol 43, No. 8.
Mobasher, Dai, Luo, Nakagawa, Sun, and Wiltshire, 2000. “Discovery of Aggregate Usage Profiles for Web Personalization.” Proceedings of the WebKDD Workshop at the ACM SIGKDD, Boston, August.
Mobasher, Dai, Luo, Sun, and Zhu, 2000. “Integrating Web Usage and Content Mining for More Effective Personalization.” In E-Commerce and Web Technologies, Lecture Notes in Computer Science (LNCS). Springer-Verlag.
Moskowitz, L., 1997. “Who Will Regulate Your Online Privacy?” PCWorld. June.
Nakagawa, Luo, Mobasher, and Dai, 2001. “Improving the Effectiveness of Collaborative Filtering on Anonymous Web Usage Data.” Published by DePaul University. From the Proceedings of the IJCAI 2001 Workshop on Intelligent Techniques for Web Personalization (ITWP01), Seattle, WA.
Nielsen, J., 1998. “Jakob Nielsen’s Alertbox for October 4th 1998.” (Commentary) www.useit.com
Nunes and Kambil, 2001. “Personalization? No Thanks.” Harvard Business Review, April.
Olsen, S., 2002. “Gator Rushes to Court Over Ad Technology.” www.cnet.com. August 28th.
Oreskovic, A., 2000. “Flight of the Firefly.” The Industry Standard. Nov 06th.
Ouellette, T., 1999. “Web Personalization.” ComputerWorld Magazine, December.
Perkowitz and Etzioni, 1997. “Adaptive Web Sites: An AI Challenge.” University of Washington. Published in Proc. IJCAI-97. Nagoya, Japan.
Perkowitz and Etzioni, 1998. “Adaptive Web Sites: Automatically Synthesizing Web Pages.” In Proc. AAAI-98 (American Association for Artificial Intelligence), Madison, WI.
Perkowitz and Etzioni, 2000. “Toward Adaptive Web Sites: Conceptual framework and case study.” University of Washington. Published by Elsevier Science B.V.
Pruitt, S., 2003. “Amazon comes under further fire over Privacy Policy.” www.itworld.com. IDG News Services / Boston Bureau. October.
Rosenberg, M., 2001. “The Personalization Story.” www.ITWorld.com. May.
Sarwar, Karypis, Konstan, and Riedl, 2000. “Analysis of Recommendation Algorithms for E-Commerce.” Proceedings of the 2nd ACM Conference on E-Commerce (EC00), Minneapolis.
Sliwa, C., 2000. “Personalization: Buy, Build or Outsource?” ComputerWorld. June.
Vozalis, Nicolaou, and Margaritis, 2001. “Intelligent Techniques for Web Applications: Review and Educational Techniques.” 5th Hellenic European Conference on Computer Mathematics & its Applications (HERCMA 2001), Athens, GR.
Walker, Leslie, 2001. “Web Survivors Take Business Software Where the Money Is.” The Washington Post. January 18.