Internet and Web-Based Database Technology

Amjad A. Abdullat
Computer Information Systems Department
West Texas A&M University
Canyon, Texas 79016

Abstract

The demand for data-intensive Web sites is driving the merger between Web sites and database technologies. Many E-commerce sites and other Internet applications provide a Web interface to information stored in database systems. Two-tier and three-tier client server architectures are common for Internet applications, and in some cases other variations of the client server model are used. Several approaches and technologies can be used to deliver innovative Web-based database solutions that help businesses meet the challenges of the new competitive business environment. This paper discusses database technologies and concepts and the different approaches available for creating a database-driven Web site environment.

Keywords: Web database, Java Server Pages, Java, database connectivity, scripting languages, database interaction, E-commerce

1. Introduction

The emergence of the World Wide Web (WWW) as a primary tool for communication among individuals and business enterprises has transformed computing and caused a dramatic shift in both business applications and business processes. As usage of the Web has escalated, the importance of databases to this growth has become more evident (java.sun.com). The explosive growth of E-commerce has been a significant contributor to this escalation. Businesses of all sizes have stepped up to the challenge of adapting to take advantage of the global network known as the Internet (Laudon and Traver, 2003). As a result of the growth of the Internet and database technology, several mature technologies have been developed and are widely used for Web access.
The common APIs (Application Program Interfaces) for data access are Open Database Connectivity (ODBC), Object Linking and Embedding Database (OLE DB), Java Database Connectivity (JDBC) and ActiveX Data Objects (ADO) (http://jakarta.apache.org). Microsoft’s Active Server Pages (ASP), Sun’s Java Server Pages (JSP) and Allaire’s Cold Fusion have become the three most commonly used Web access tools. In addition, the three-tier architecture has replaced the two-tier model, both to support thin clients and for security reasons. By storing the application on a separate server rather than on the client computer, the Web eliminates the time and cost of application deployment. Extensible Markup Language (XML) has matured and is expected to become the next-generation standard for data transfer and communication within heterogeneous systems and among enterprises (Riccardi, 2003). The major database systems and Web development tools now incorporate XML technology. With the help of additional technologies such as data warehousing, data mining, distributed databases, wireless networking, and digital signatures, Web database technology can move into a new era that facilitates a better B2B marketplace.

2. Web-based Database Technology in Context

E-commerce and other database applications are designed to interact with the user through Web interfaces that display Web pages. The common method of specifying the content and formatting of Web pages is through hyperlinked documents. There are various languages for writing these documents, the most common being HTML (Hypertext Markup Language). Although HTML is widely used for formatting and structuring Web documents, it is not suitable for specifying structured data extracted from a database (Morrison and Morrison, 2002). More recently, XML has emerged as the standard for structuring and exchanging data over the Web.
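To make the contrast with HTML concrete, the following is a minimal Java sketch of how one database row might be rendered as XML, so that each value is labelled by its meaning rather than by its appearance. The class name RowToXml, the element name product, and the sample columns are hypothetical and chosen only for illustration. The sketch relies on the XML APIs bundled with the Java platform and is not taken from any of the tools discussed in this paper.

```java
import java.io.StringWriter;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

final class RowToXml {
    /** Wraps one row (column name -> value) in an XML element named rowName. */
    static String toXml(String rowName, Map<String, String> row) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element root = doc.createElement(rowName);
            doc.appendChild(root);
            // Each column becomes a child element whose tag names the data it holds.
            for (Map.Entry<String, String> column : row.entrySet()) {
                Element field = doc.createElement(column.getKey());
                field.setTextContent(column.getValue());
                root.appendChild(field);
            }
            Transformer out = TransformerFactory.newInstance().newTransformer();
            out.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter text = new StringWriter();
            out.transform(new DOMSource(doc), new StreamResult(text));
            return text.toString();
        } catch (Exception e) {
            // Checked XML exceptions are rethrown unchecked to keep the sketch short.
            throw new IllegalStateException("Could not serialize row", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical row, standing in for a record fetched from a database.
        Map<String, String> row = new LinkedHashMap<>();
        row.put("name", "Widget");
        row.put("price", "19.99");
        System.out.println(toXml("product", row));
    }
}
```

A receiving program can locate the price by its element name alone, which is the kind of self-description that HTML markup does not offer.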
XML can be used to provide information about the structure and meaning of the data in a Web page, rather than just specifying how the page is formatted for display on the screen (Riccardi, 2003).

Changes in database technology, together with the widespread adoption of Internet-based technology to build Intranets within business enterprises, have created an environment that facilitates new innovation in database applications. Web database connectivity has presented several business opportunities that help businesses remain competitive, of which three stand out. First, it allows rapid response to competitive pressure by bringing new services and products to market quickly. Second, it increases customer satisfaction through the creation of Web-based support services. Third, it yields fast and effective information dissemination through universal access, from across the street to across the globe (Laudon and Traver, 2003).

Several characteristics of the Web environment have supported the rapid adoption and implementation of Internet and Intranet business applications. These include the simplicity and similar functionality of browser interfaces, hardware and software independence, and reduced development cost and time (Riccardi, 2003).

First, the simplicity and functional similarity of browser interfaces have significantly reduced traditional barriers to adoption, such as complexity. The uniformity of the most popular browser interfaces has made it easier for users to switch among browsers and among business Web sites. The use of tools such as HTML, DHTML, XML, and scripting languages has resulted in a uniformity of presentation and function that makes it easy to switch from one Internet-based application to another.
Second, the hardware and software independence of browsers has eased the sharing of information across platforms and has resolved many previously thorny cross-platform access issues. In particular, wide access to database information has become possible. Through the Internet, location independence has been achieved. Companies can access their data locally and remotely, making some information publicly available while protecting critical information from public access by placing it behind a firewall or on a different server. The movement of data files and updates to databases through the Internet is commonplace.

Third, development cost and time have been reduced. Inexpensive or even free development and deployment tools are available, making the barriers to entry very low.

3. Types of Websites Using Database Systems

Most businesses and organizations are now expected to have a Web site, although the functionality, level of sophistication, and timelines vary (Laudon and Traver, 2003). Among Web sites that let users extract or deposit information, two distinct types emerge. The first type allows the user to perform read-only functions. The second type allows users to both read and update information (Riccardi, 2003).

Many Web sites do not have a database attached to them. They provide static information coded in HTML, JavaScript, CGI, or other scripting languages. Common to many other sites is the need to extract information from, or deposit it into, a database attached to the site (Riccardi, 2003). Some sites are simply repositories of information that can be queried by the site user. The user can request information and read it, but cannot change it in any manner. These sites provide information about particular products or classes of products in response to the site visitor’s query.
Any set of data that lends itself to storage in a relational database can be attached to a site, and casual exploration of Web sites will convince the reader that the possibilities are almost endless (Kroenke, 2004). Data stored in numeric, character, or graphic formats are all widely used. Other sites provide more interactivity between the user and the database, in that the user can send information back to a database attached to the site. This capability has supported the explosive growth of electronic commerce, as customers can place orders on-line.

4. Trends and Driving Forces for Web Database

The increased demand for enterprise-wide data access has made the integration of database systems and the Web highly relevant and significant. To remain competitive in the new business environment, businesses of all sizes are increasingly dependent on the Web to manage and perform their business processes and activities. Five forces drive the shift to Web-database systems: 1) the changing business environment, 2) the need for enterprise data access, 3) end-user productivity, 4) changes in technology and 5) cost (Laudon and Traver, 2003).

The Internet and its associated technologies have changed the nature of the business environment. The new competitive environment in which many businesses find themselves has forced them to transform and streamline their business processes and offer new products and services. Meeting the challenges of this environment requires an efficient approach to information management. Moreover, traditional approaches and legacy platforms for accessing databases are incapable of meeting the demand for widespread access to data.
Providing the right information at the right time and in the appropriate format to decision makers at each organizational level is considered the single most important function of any MIS department. Managers and knowledge workers need on-demand data access through an easy-to-use graphical interface.

Productivity gains by end users at all business levels are a direct result of the growth and proliferation of personal computers. The increased sophistication of both end users and software applications has changed the focus from how to access data to how to manipulate data to obtain information that gives the business enterprise a competitive advantage.

The rapid and accelerating change in technology is a significant force driving the integration of Web and database systems. For example, microprocessors now deliver performance similar to that of many mainframe or minicomputer systems. The proliferation of the Internet and Intranets and advances in data communications have provided the bandwidth needed to access enterprise database systems effectively. Microcomputer database management systems have reached a level of performance, reliability and cost that allows many developers to enhance existing legacy applications in ways that were not possible before (Date, 2000; Elmasri, 2004).

5. Architecture of Web Database

The architecture of a database system is influenced by the underlying computer system on which it runs, in particular by such aspects as computer architecture, networking, and distribution (Date, 2000). Early DBMS architectures used a mainframe computing environment to provide the main processing for all functions of the system. The reason for this approach was that users accessed the database via dumb terminals that had no processing power and provided only display capabilities.
As the price of hardware declined and PC platforms replaced terminals, DBMSs gradually started to exploit the processing power available on the user side. This approach led to the client server DBMS architecture (Elmasri, 2004; McFadden, et al., 2002).

Client Server Architecture

The basic premise of client server architecture is to divide the processing load between the server and the client. It was developed to address the challenges of a computing environment in which a large number of PCs, file servers, Web servers, database servers, printers, and other resources are connected via a network (McFadden, Hoffer and Prescott, 2002). The idea is to define specialized servers with specific functionalities.

Two-Tier Architecture

The two-tier architecture is a form of the client server architecture that is increasingly being used in commercial DBMSs. It features a PC client and a database server. The PC client contains the presentation code and the SQL statements for data access. The database server processes the SQL statements and sends query results back to the PC client (Date, 2000; Morrison and Morrison, 2002). The database server also performs process management functions. The validation and business logic code can be divided between the client and the database server. While the two-tier architecture is appropriate for stable requirements and a moderate number of clients, it poses several challenges when it comes to software maintenance.

Three-Tier Architecture

The performance of the two-tier architecture can be poor when a large number of clients submit requests to the server at the same time. The three-tier software architecture emerged to overcome the challenges of the two-tier architecture. The added tier (the middle-tier server) sits between the user interface and the data management components (Date, 2000).
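The separation of responsibilities in a three-tier design can be sketched in Java as follows. This is an illustrative outline under stated assumptions, not an implementation from any cited source: the names InventoryStore, InMemoryInventoryStore, OrderService and ThreeTierSketch are hypothetical, and an in-memory map stands in for the database server so that the example runs on its own.

```java
import java.util.HashMap;
import java.util.Map;

/** Presentation tier: only formats results for the user. */
final class ThreeTierSketch {
    static String render(boolean accepted) {
        return accepted ? "Order accepted" : "Order rejected";
    }

    public static void main(String[] args) {
        OrderService service =
                new OrderService(new InMemoryInventoryStore(Map.of("widget", 5)));
        System.out.println(render(service.placeOrder("widget", 3))); // stock 5 -> 2
        System.out.println(render(service.placeOrder("widget", 3))); // only 2 left
    }
}

/** Middle tier: business rules are executed here, not on the client. */
final class OrderService {
    private final InventoryStore store;

    OrderService(InventoryStore store) {
        this.store = store;
    }

    /** Accepts an order only if the quantity is positive and in stock. */
    boolean placeOrder(String productId, int units) {
        if (units <= 0 || store.unitsInStock(productId) < units) {
            return false;
        }
        store.remove(productId, units);
        return true;
    }
}

/** Data tier contract: the only layer that knows how records are stored. */
interface InventoryStore {
    int unitsInStock(String productId);

    void remove(String productId, int units);
}

/** Hypothetical stand-in for a database server, backed by a map. */
final class InMemoryInventoryStore implements InventoryStore {
    private final Map<String, Integer> stock = new HashMap<>();

    InMemoryInventoryStore(Map<String, Integer> initial) {
        stock.putAll(initial);
    }

    public int unitsInStock(String productId) {
        return stock.getOrDefault(productId, 0);
    }

    public void remove(String productId, int units) {
        stock.merge(productId, -units, Integer::sum);
    }
}
```

Because the presentation code never touches the store directly, the in-memory class could in principle be replaced by one that talks to a real DBMS without changing the other two tiers.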
The middle tier provides process management, where business logic and rules are executed, and can accommodate hundreds of users by providing functions such as queuing, application execution, and database staging. The three-tier architecture is used when an effective distributed client/server design is needed that provides increased performance, flexibility, maintainability, reusability, and scalability, while hiding the complexity of distributed processing from the user. The data tier provides database management functionality and is dedicated to data and file services that can be optimized without using any proprietary database management system languages. It provides access to resources based on names instead of locations, and thereby improves scalability and flexibility as system components are added or moved.

N-Tier Technology

To improve performance and provide a flexible division of processing, the N-tier architecture supports additional layers of servers (Date, 2000). Each tier focuses on a single task and communicates with the others. The middle tier can be split into two, with one tier for the Web server and another for the application server. In the N-tier architecture, clients can request objects without knowing the platform, location, or implementation details of the object. The N-tier architecture is the most general form of the client server architecture. In systems with distributed databases, the data tier can itself be divided into multiple tiers. N-tier technology makes the system more stable and portable.

6. Web-Database Interfaces Technology

The amalgamation of Internet technology and database applications has created new computing environments that can be described as rich and complex (Morrison and Morrison, 2002; Riccardi, 2003). Furthermore, the demand for data-intensive Web sites has been a driving force for the merger between Web sites and database technologies.
The need for connectivity between Web sites and databases has created the need for specific, accepted standards capable of accommodating universal access to different data resources (Riccardi, 2003). Universal data access (UDA) is an approach to accessing different data sources and database platforms. In the early 1990s several standards emerged to provide interfaces for accessing database servers. Open Database Connectivity (ODBC), Object Linking and Embedding Database (OLE DB), ADO and JDBC are standards that many in the field no longer consider to be on the leading edge of database processing. However, these standards are widely used in many E-commerce applications. Before exploring the newer interface technologies, it is therefore useful to discuss the traditional standards: ODBC, OLE DB, ADO and JDBC.

Open Database Connectivity (ODBC)

Most database vendors support Open Database Connectivity (ODBC) as the standard interface for connecting to the database. ODBC was developed in the early 1990s to provide a DBMS-independent means of processing relational database data. It provides a common interface between Web servers and database servers and consists of a set of standards by which SQL statements are issued. ODBC defines an application programming interface (API) that allows a client-side program to call the DBMS as long as the client and server platforms have the necessary software installed. ODBC was created to access relational databases and other data sources that are table-like. It is an open and vendor-neutral way of accessing data from any database that supplies an ODBC driver, and it simplifies migration from one database application to another. The adoption of ODBC by Microsoft has made it very popular with software application developers.

OLE DB

While ODBC simplified some database development tasks, it was still limited to table-like data sources (Riccardi, 2003).
To overcome this limitation, Microsoft developed Object Linking and Embedding Database (OLE DB) as an object-oriented interface that hides data server functionality and gives access to other data sources (e.g., text, word processing documents, email, Web content, structured data, images) that are not table-like. OLE DB defines a collection of COM (Component Object Model) objects that can be accessed from any program to connect to various database management systems. DCOM (Distributed Component Object Model) is an extension of COM that provides a distributed, component-based environment that OLE DB providers can use to accommodate data access needs. Because a component combines both process and data into a reusable object, components can act as data consumers and data providers at the same time. Consumers take data from OLE DB interfaces, and providers expose OLE DB interfaces.

ActiveX Data Objects (ADO)

ADO (ActiveX Data Objects) is a programming extension of ASP (Active Server Pages), supported by Microsoft IIS (Internet Information Server), for database connectivity. ADO is a set of objects for utilizing OLE DB data. Its purpose is to provide a mechanism for accessing data sources from different tools such as C, C++, C#, Visual Basic, and scripting languages. Most programmers use ADO to get OLE DB data. ADO contains wrappers that help access OLE DB data sources in a way similar to ODBC.

JDBC

JDBC is Sun’s counterpart to ODBC. It is the most prominent approach for accessing a relational DBMS from a Java program (http://jakarta.apache.org). JDBC defines a database access API that supports basic SQL functionality and enables access to a wide range of relational DBMSs. The JDBC API consists of two main interfaces: an API for application writers and a lower-level driver API for driver writers. For applications and applets to access databases using JDBC, Sun identified four driver types (http://jakarta.apache.org).
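The application-level half of the JDBC API can be illustrated with a short Java sketch. It assumes a hypothetical products table with name and price columns and a hypothetical connection URL; no such database is described in this paper, and a suitable driver for the target DBMS would have to be on the classpath for the query to run. The calls shown (DriverManager.getConnection, PreparedStatement, ResultSet) belong to the standard java.sql package; the class and method names are illustrative only.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

final class JdbcSketch {
    /** Returns the names of products priced at or below maxPrice. */
    static List<String> productNamesUpTo(String jdbcUrl, double maxPrice)
            throws SQLException {
        // Hypothetical table and columns; the ? placeholder keeps the value out of the SQL text.
        String sql = "SELECT name FROM products WHERE price <= ?";
        List<String> names = new ArrayList<>();
        // try-with-resources closes the statement and connection even on failure.
        try (Connection conn = DriverManager.getConnection(jdbcUrl);
             PreparedStatement stmt = conn.prepareStatement(sql)) {
            stmt.setDouble(1, maxPrice);
            try (ResultSet rows = stmt.executeQuery()) {
                while (rows.next()) {
                    names.add(rows.getString("name"));
                }
            }
        }
        return names;
    }

    public static void main(String[] args) {
        try {
            // Hypothetical URL; a real one depends on the DBMS and its driver.
            System.out.println(productNamesUpTo("jdbc:hypothetical://localhost/shop", 20.00));
        } catch (SQLException e) {
            // With no driver registered for the URL, DriverManager fails here.
            System.out.println("Database not reachable: " + e.getMessage());
        }
    }
}
```

The application code refers only to java.sql interfaces. Which driver actually services the calls is decided by the connection URL and the drivers available at run time, which is where the driver API comes in.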
Of the four JDBC driver types, the first described here converts JDBC calls into the network protocol used directly by the DBMS, allowing a direct call from the client to the DBMS server (Type 4 in Sun’s numbering). The second translates JDBC calls into a middleware vendor’s protocol, which the middleware server then translates into the DBMS protocol (Type 3). The third converts JDBC calls into calls on the client API for the DBMS; this driver software must be installed on client machines (Type 2). The last uses the JDBC-ODBC bridge, which provides JDBC access through ODBC drivers (Type 1). The bridge provides a Java API that interfaces to ODBC drivers and so enables processing of ODBC data sources from Java; the ODBC driver software must be loaded on each client machine.

Another JDBC-based approach uses Java with embedded SQL, called JSQL. This is an extension to the ISO/ANSI standard for embedded SQL, which otherwise specifies support for only several programming languages. A JSQL translator transforms the JSQL clauses into standard Java code that accesses the database through a call-level interface. JDBC itself is a low-level middleware tool that provides a database access interface for a Java application (http://jakarta.apache.org).

7. Emerging Interface Standards

The open source movement has played a significant role in the development of new technologies and tools as alternatives to OLE DB, ADO and .NET. Several tools and standards are available for database application processing and for displaying database content on the Web. Open source is not a requirement for the use of JDBC: JDBC is employed on Windows XP, Windows 2000 and other operating systems to access SQL database servers such as Oracle (Riccardi, 2003). Several of these technologies are discussed below: Active Server Pages, XML, JSP, and Apache Tomcat.

Active Server Pages (ASP)

Active Server Pages technology is a Microsoft product that allows dynamic, interactive Web pages to be created on the Web server.
ASP’s approach is to provide an environment in which a script is executed on the server to retrieve user requests from a Web page and then generate active Web content to satisfy the request. ASP technology was designed to work on a Windows operating system running Microsoft Internet Information Server, and it supports ActiveX scripting, allowing a large number of different scripting engines to be used (Rob and Coronel, 2002). ASP is a scripting framework, and VBScript is its default scripting language. ASP provides the flexibility of CGI without the performance issues. Unlike CGI, ASP runs in-process with the server, is multi-threaded, and is optimized to handle large volumes of users.

Extensible Markup Language (XML)

XML is a meta-language used to represent and manipulate the data in structured documents (Morrison and Morrison, 2002). As noted in Section 2, HTML is widely used for formatting and structuring Web documents but is not suitable for specifying structured data extracted from a database, and XML has emerged as the standard for structuring and exchanging data over the Web. XML is concerned with the description and representation of data rather than with how data are displayed. A major feature of XML is that it provides the semantics that facilitate the sharing, exchange, and manipulation of structured documents.

Sun's Java Server Pages (JSP) and Servlets

Sun Microsystems developed Java Server Pages (JSP) and Servlets. JSP is intended as the mechanism for creating dynamic Web pages using HTML, XML, and the Java programming language (http://jakarta.apache.org). The active code in a JSP is written only in Java.
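As a rough, framework-free illustration of what generating a dynamic page in Java amounts to, the sketch below splices data values into static HTML markup. It deliberately uses plain Java rather than the servlet or JSP APIs so that it runs on its own; the class PageSketch, its methods, and the sample product names are hypothetical and are not part of JSP itself.

```java
import java.util.List;

final class PageSketch {
    /** Escapes the characters that are special in HTML text. */
    static String escape(String text) {
        return text.replace("&", "&amp;")
                   .replace("<", "&lt;")
                   .replace(">", "&gt;")
                   .replace("\"", "&quot;");
    }

    /** Static markup with dynamic values inserted, one list item per name. */
    static String renderProductList(List<String> names) {
        StringBuilder html = new StringBuilder("<ul>\n");
        for (String name : names) {
            html.append("  <li>").append(escape(name)).append("</li>\n");
        }
        return html.append("</ul>\n").toString();
    }

    public static void main(String[] args) {
        // In a real application these names would come from a database query.
        System.out.print(renderProductList(List.of("Widget", "Nuts & Bolts")));
    }
}
```

A JSP page expresses the same idea the other way around: the HTML is written directly, and the Java fragments are embedded in it.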
While JSP and Active Server Pages are similar, the main difference is that in ASP active content is written in a scripting language, while in JSP active content is written in Java (not JavaScript). This makes it possible to write complex logic with complex error handling in JSP that may not be possible in ASP. Restricting the programming to Java allows the developer to exploit the full capabilities of a complete object-oriented language. Since Java is machine independent, so is JSP (Riccardi, 2003). This makes it possible to run the applications on platforms such as Windows XP, Windows 2000, IIS, Linux servers, and Windows servers. JSPs are translated into standard Java source code and then compiled just like a regular program. JSP came after Servlets to fill the gap between Java programming and HTML design and scripting (Riccardi, 2003). The relationship between JSP and Enterprise Java Beans (EJB) is analogous to that between ASP and COM/DCOM. JSP allows a developer to easily write static HTML content with embedded dynamic content by enclosing the active content in <% %> or other JSP tags. It is advised that complex business logic be compartmentalized in EJBs for better separation between design and functionality.

While all this extra power may sound wonderful, it comes at a price. Java Servlets and JSP are well suited to people familiar with Java programming. To a non-Java programmer, JSP can be very frustrating to write and debug. For example, since active content in JSP is written in Java, one needs to cast session and request variables into a usable type.

Apache Tomcat

Apache Tomcat resulted from the Jakarta project, which was cosponsored by the Apache Foundation and Sun (http://jakarta.apache.org). The Apache Web server itself does not support servlets. Tomcat is a servlet processor that can work in conjunction with Apache or as a standalone Web server. Tomcat has limited Web server capabilities.
Use of Apache is recommended when running commercial production applications. When Tomcat and Apache run separately on the same server, they must use different ports (http://jakarta.apache.org).

8. Conclusion

Wide access to the Web and the availability of easy-to-use graphical tools such as the Web browser have made it much easier, and more popular, to access applications, including databases, from anywhere. Web connectivity solutions have made it possible for enterprises to exploit the full benefits of their database management systems (DBMS). As a result, Web database technology has become more relevant and significant. Web database systems have been widely used in the E-commerce environment for displaying product items, on-line order forms and shopping carts, remote database access, and many useful Intranet applications. The maturing of this technology will help businesses develop effective strategies to streamline their business processes. The new Web database solutions allow businesses to exploit the benefits of E-commerce applications. However, several challenges remain to be addressed, such as security and reliability.

New developments in Web database technology will have a great impact on closely associated areas. Areas such as distributed databases, data mining, data warehousing, wireless and security will benefit greatly from advances in Web database technology. XML is expected to be the standard for the next generation of Web databases. The development of wireless networks will increase the relevance and significance of Web databases to E-commerce applications.

9. References

Anonymous, (undated), “Database Publishing on the Web,” http://java.sun.com/products/servlet/industry.html
Anonymous, (undated), The Apache Tomcat server, http://jakarta.apache.org
Date, C. J. and H. Darwen, (2000), Foundation for Future Database Systems: The Third Manifesto. Addison-Wesley.
Elmasri and Navathe, (2004), Fundamentals of Database Systems. Addison Wesley.
Kroenke, D., (2004), Database Processing: Fundamentals, Design and Implementation, Seventh Edition. Prentice-Hall.
Laudon, K. and C. Traver, (2003), E-Commerce: Business, Technology and Society. Prentice-Hall.
Mannino, Michael V., (2004), Database: Design, Application, Development, & Administration. Irwin, McGraw Hill.
McFadden, Hoffer, and Prescott, (2002), Modern Database Management. Addison Wesley.
Morrison, M. and J. Morrison, (2002), Database-Driven Web Sites. Course Technology.
Riccardi, G., (2003), Principles of Database Systems With Internet and Java Applications. Addison Wesley.
Rob and Coronel, (2002), Database Systems Design, Implementation and Management. Course Technology.