A Recommendation for a Professional Focus Area in Data Management for the IS2002 Information Systems Model Curriculum

Herbert E. Longenecker, Jr.
hlongenecker@usouthal.edu
Computer and Information Sciences
University of South Alabama
Mobile, Alabama 36688 USA

David Yarbrough
david.yarbrough@ngc.com
Northrop Grumman
Pascagoula, Mississippi 39568 USA

David L. Feinstein
dfeinstein@usouthal.edu
Computer and Information Sciences
University of South Alabama
Mobile, Alabama 36688 USA

Abstract

IS2002 has become a well-defined standard for information systems curricula. The Data Management Association (DAMA 2006) curriculum framework defines a body of knowledge that points to a skill set that can enhance IS2002. While data management professionals are highly skilled individuals requiring as much as a decade of relevant experience before being hired, principles of the profession are a good fit with the requirements for an information systems analyst. The life-cycle relationships of data management professionals to other IT professionals suggest that information systems education and experience can provide an excellent career path to data management. Based on these observations, recommendations are made to add a new professional emphasis area in data management and to add data quality, architecture, and metadata concepts and practices to the IS2002 learning units.

Keywords: IS2002 curriculum, data management, DMBOK, certification, assessment, exit skills

1. INTRODUCTION

In both the IS’97 (Couger et al 1994, 1997; Davis et al 1997) and IS2002 (Gorgone et al 2002) model curricula for information systems, the focus of the programs was educating students who would have the skills necessary for the development and deployment of information systems: “Information Systems, as an academic field, encompasses two broad areas: (1) acquisition, deployment, and management of information technology resources and services (the information systems function) and (2) development and evolution of technology infrastructures and systems for use in organization processes (system development).” The correctness of these assertions has led to the documents becoming the basis of IS accreditation standards (Gorgone and Lidtke 2002). In addition, the model curricula define curriculum exit skill areas, first stated in 1997 and reiterated in 2002. A national survey by Landry et al (2000), analyzed with confirmatory factor analysis, confirmed the validity of these areas. Subsequently, the model has been used as the basis for developing assessment examinations for IS programs by the Center for Computing Education Research, the CCER (Landry et al 2003, 2004). The skill areas were also used jointly by the CCER and the Institute for Certification of Computing Professionals (ICCP) as the basis for the certification “The Information Systems Analyst” (McKell et al 2003, 2004).

The IS’97 and IS2002 curricula are expressed as ten model courses and are further specified by Learning Units (LUs), which are flexibly defined collections of goals and objectives that explain the outcomes expected for each course. These LUs can be mapped to courses at a given university in order to show compliance with the model, as well as to enable assessment. A learning unit can be mapped to one or more university courses. For each mapping, the university defines a local objective.

The Data Management Association has specified a framework for data management curricula (Henderson et al 2004). The document contains a very detailed analysis that results in a body of knowledge describing the objects of the data management profession. The framework suggests that it would be appropriate to develop undergraduate curricula in data management, whose graduates would become part of a professional force exceeding 100,000 people nationally (Henderson 2004).

In this paper we examine the question of whether there should be an undergraduate curriculum furnishing graduates to the data management profession and, if not, what the alternatives are. Indeed, we will advance and support the position that there should NOT be an undergraduate curriculum specifically for data management. However, based on our analysis of the data management framework and the mission of the IS curriculum, we propose a responsive alternative that significantly enhances the IS curriculum while simultaneously opening a strong career path for data management professionals.

2. THE DATA MANAGEMENT PROFESSION

The data management profession has members who are entrusted with the information of major organizations. Many if not most of these professionals work for large organizations and are charged with the definition, development, and management of mission-critical information stores. A recent article (Perez 2006) examined the characteristics of data management professionals; these are reflected in Figure 1. For the most part the profession seems to be composed of very senior people, about 90% of whom have more than 10 years of professional experience. However, on average only 5 of those years have been spent with responsibility for data management. Also, since the mean age of these professionals is approximately 45, it would seem that most of them have been out of college for 15 to 20 years.
That is, data management is NOT an entry-level profession. Rather, it is a profession requiring immense experience and mature judgment. A significant fraction of these professionals have spent 20 or more years developing their current level of experience. These observations are relevant because they imply that a data management curriculum probably should not focus on entry-level professionals.

3. SKILLS COMPARISON OF INFORMATION SYSTEMS AND DATABASE ANALYST PROFESSIONALS

During 2000, US faculty and professionals were surveyed (Haigood 2000) regarding the skill depth of database analysts, software engineers, and information systems professionals. For each skill in the survey, respondents were asked to evaluate the relative skill depth required for their profession. Database analysts are data management professionals. Software engineers most frequently come from computer science programs. Information systems analysts represent the graduates of IS programs.

The data in Figure 2 suggest that database analysts are very similar to IS analysts, except more skilled. On the other hand, there seems to be very little correlation between the skill expectations for database analysts and those for software engineers. Figure 3 shows a specific correlation between IS analysts and database analysts.

In both Figures 2 and 3, what is striking is the similarity of the two professions. That database analysts have higher skill expectations is not surprising, since they are likely to have considerably more experience. What is also interesting is the possibility that, with years of professional experience, the IS professional will mature and develop the sophistication of the database analyst. That is, the entry-level IS professional may have a career path to the database analyst position, and by extension to the data management profession.

4. ANALYSIS OF THE DAMA BODY OF KNOWLEDGE

One of the important contributions of the DAMA framework is the development of a data management body of knowledge.
Clearly the body of knowledge begins to define the profession. In order to understand it, we produced a hierarchical extraction by abstracting each major section of the body of knowledge to a single phrase descriptive of the entire section. This abstraction is represented in Figure 4. The framework itself orders the material differently from Figure 4.

We observed that the items in Figure 4 closely parallel a development life cycle. This is an important observation, since the framework does not make this point. It is also important because data-related tasks would need to exist at each phase of the life cycle if the information systems professional were to migrate to the role of data management professional. Indeed, inspection of Figure 4 shows that many of the topics are already well defined in the IS curriculum.

Notably absent from the IS2002 curriculum are some important areas recognized in the DAMA body of knowledge. For example, data quality is absent from IS2002 (Strong 2005). Data warehousing, data security, and the concept of metadata are not defined, and should perhaps be added to the model curriculum.

5. DAMA BODY OF KNOWLEDGE LIFE CYCLE RELATIONSHIPS

In Figure 5 we present a life-cycle representation consistent with the Capability Maturity Model of the SEI (CMMI Team 2002; CMU 2004) for systems and software engineering. We have added two rows, in which we have inserted the titles presented in Figure 4 at the appropriate locations in the life cycle. Interestingly, we found a perfect fit for all of the components of the DAMA framework in a single row, which we entitled “Information Engineering” to indicate the developmental role implied.
We believe that the INCOSE Body of Knowledge (INCOSE 2006) and the software engineering descriptions implied in the CMMI (2002) are represented adequately in Figure 5. DAMA, through its framework, clearly expresses professional “ownership” of the information engineering row of responsibilities in Figure 5. Likewise, INCOSE clearly identifies its professional “ownership” through its published body of knowledge, and the SEI defines a similar “ownership” for software engineering. Indeed, systems engineering positions, like data management positions, are not entry level. They too require many years of experience.

6. THE IS2002 SKILL SET

The IS2002 skill set is not positioned in life-cycle order. However, it is our observation that the skill set fully comprises rows 1, 2, and 3. That is, Information Systems = {People + Systems + Information + Software} Engineering.

We think this observation means the following. In smaller organizations there is no budget for more than one information technology professional, and consultants may fill the roles of these positions. In medium-sized organizations, the information systems professional probably fills all of the functions. In larger organizations, however, there will be job descriptions for each of these rows. Indeed, there may be positions covering only one or two cells of the matrix, e.g. a Data Modeling Specialist. It is reasonable that such specialists take years to mature. It is therefore also reasonable to expect that the basic concepts and skills implied by the DAMA Framework be learned by beginning IS professionals, to provide a foundation for entry-level positions and for future professional growth into data management positions. The additions to the skill set consist of three new sub-skills, as presented in Figure 6.
While the expressed skill set of IS2002 is broader than that expressed in Figure 5, we believe that data management professionals as well as information systems professionals meet the criteria of the mission of IS (McNurlin and Sprague 1998): “to improve the performance of people in organizations through the use of information technology. … Misunderstanding the necessity for supporting people leads to disastrous consequences: the focus MUST BE to develop excellent people focused business systems with inextricably woven information systems with related, appropriate, accessible data. People do the work of the organization!”

7. SO, SHOULD THERE BE A SEPARATE DATA MANAGEMENT DEGREE PROGRAM?

We have presented evidence that there is a great need for data management professionals, and that these are very skilled people who are usually hired many years after completing a degree program. Although the DAMA body of knowledge points to information engineering skills, our survey evidence suggests that data management professionals have the skill set of an IS professional in addition to the information engineering focus. Therefore, an augmented IS program plus years of maturation should produce a data management professional.

The DAMA Curriculum Framework points to a few data management programs. On close inspection, several of these are certificate programs, and several others are being phased out. For whatever reason, pure data management undergraduate degree programs seem to be unsuccessful. Yet the nation needs data management professionals. Currently, there are more than a thousand successful information systems programs.
We argue that augmenting the IS2002 curriculum specification as recommended below will produce stronger IS programs and establish a viable path for developing data management professionals, perhaps through facilitated professional development: conference speakers will make the case through excellence in personal and professional leadership, and young professionals will sort themselves out.

8. RECOMMENDATIONS

1. Data quality, data architecture, and metadata should be integrated into the IS2002 learning units. These are skills that must be part of the tool set of any information systems professional. An additional course may be required to cover the concepts of data quality.

2. A course in data warehousing concepts should be added. Practical tools such as ETL and reporting tools should be practiced until they become true skills. Warehouse design can be covered minimally. If an institution has an established decision support systems (DSS) course, that course could be expanded to include the concepts and techniques of data warehousing. However, the practical focus must be maintained.

3. A course in data mining concepts with a limited practicum should be offered. Concepts of business intelligence can be integrated into this course. Reporting tools should be practiced. Inference tools and statistical tools can be included. Alternatively, an existing DSS course could be enhanced to include the concepts and a limited practicum for data mining.

4. A set of specifications could be defined for a specialist addendum to the ISA certificate, e.g., the ISA-DM, an “Information Systems Analyst – Specialty in Data Management”.

5. The Information Systems Curriculum Task Force should include a representative from the Data Management Association.

6. Data management should be promoted to students so that they see the vast potential job opportunities available as their careers advance.

9. REFERENCES

Carnegie Mellon University (2004).
“Welcome to the CMMI”, from http://www.sei.cmu.edu/cmmi/

Couger, J. Daniel, Herbert E. Longenecker, Jr., David L. Feinstein, Gordon B. Davis, John T. Gorgone, Dorothy Dologite, George M. Kasper, Joyce C. Little, Joseph S. Valacich and A. Milton Jenkins (1995). “Information Systems 1995”, MISQ, Fall 1995.

Couger, J. Daniel, Gordon B. Davis, David L. Feinstein, John T. Gorgone, and Herbert E. Longenecker, Jr. (1997). “IS’97: Model Curriculum and Guidelines for Undergraduate Degree Programs in Information Systems”, Data Base, Volume 26, Number 1, Winter 1997, pp. I-94.

CMMI (2002). “Capability Maturity Model® Integration (CMMI(SM)), Version 1.1, CMMI(SM) for Systems Engineering, Software Engineering, Integrated Product and Process Development, and Supplier Sourcing (CMMI-SE/SW/IPPD/SS, V1.1), Staged Representation”. CMU/SEI-2002-TR-012; ESC-TR-2002-012.

CMMI Product Team (2002). “Capability Maturity Model® Integration (CMMI(SM)), Version 1.1”, Carnegie Mellon University, 2002.

Davis, G. B., Gorgone, J. T., Couger, J. D., Feinstein, D. L., and Longenecker, H. E. Jr. (1997). “IS ’97 Model Curriculum and Guidelines for Undergraduate Degree Programs in Information Systems”, ACM, New York, NY and AITP (formerly DPMA), Park Ridge, IL.

Gorgone, J. T., Davis, G. B., Valacich, J. S., Topi, H., Feinstein, D. L., and Longenecker, H. E., Jr. (2002). “IS 2002 Model Curriculum and Guidelines for Undergraduate Degree Programs in Information Systems”, ACM, New York, NY, AIS, and AITP (formerly DPMA), Park Ridge, IL.

Gorgone, J. T. and Lidtke, D. (2002). The IS criteria can be found at www.abet.org

Haigood, Brandon (2000). “An Analysis of Computing Skill Sets”, Thesis, University of South Alabama.

Henderson, D., B. Champlin, D. Coleman, P. Cupoli, J. Hoffer, L. Howarth, K. Sivier, A. M. Smith, and E. Smith (2004).
“Model Curriculum Framework for Post Secondary Education Programs in Data Resource Management”, The Data Management Association International Foundation, Committee on the Advancement of Data Management in Post Secondary Institutions, Sub Committee on Curriculum Framework Development.

Landry, J.P., Longenecker, H.E. Jr., Haigood, B., and Feinstein, D.L. (2000). “Comparing Entry-Level Skill Depths Across Information Systems Job Types: Perceptions of IS Faculty”, Americas Conference on Information Systems (AMCIS 2000), August 10-13.

Landry, J.P., Reynolds, J.H., and Longenecker, H.E. Jr. (2003). “Assessing Readiness of IS Majors to Enter the Job Market: An IS Competency Exam Based on the Model Curriculum”, Proceedings of the 2003 Americas Conference on Information Systems, August 4-6.

Landry, Jeffrey P., Pardue, J. Harold, Reynolds, John H., and Longenecker, Herbert E. Jr. (2004). “IS 2002 and Accreditation: Describing the IS Core Areas in Terms of the Model Curriculum”, Information Systems Education Conference (ISECON 2004), November 2004, Newport, RI (awarded distinguished paper).

Longenecker, H. E., D. Henderson, P. Cupoli, and A. Smith (2006). “A Proposal for Developing Undergraduate and Graduate Model Curricula for Data Resource Management Synergistic with the Model Curricula for Information Systems”, DAMA International Symposium & Wilshire Meta-Data Conference, April 27, 2006.

McKell, Lynn J., Reynolds, John H., Longenecker, Herbert E. Jr., and Landry, Jeffrey P. (2003). “Aligning ICCP Certification with the IS2002 Model Curriculum: A New International Standard”, International Business & Economics Research Journal, September 2003, Vol. 2, No. 9, pp. 87-91.

McKell, L.J., Reynolds, J.H., Longenecker, H.E., Landry, J.P., Pardue, J.H. (2004). “Information Systems Analyst (ISA): A Professional Certification Based on the IS2002 Model Curriculum”, Proceedings of the European Applied Business Research Conference, June 14-18.

McKell, L.J., Reynolds, J.H., Longenecker, H.E., Landry, J.P., Pardue, J.H. (2005). “The Center for Computing Education Research (CCER): A Nexus for IS Institutional and Individual Assessment”, Proceedings of the Information Systems Educators Conference 2005, Columbus.

McNurlin, B.C. and Sprague, R.H. Jr. (1998). Information Systems Management in Practice (4th ed.). Englewood Cliffs, NJ: Prentice Hall.

Perez, Andres (2006). “The Elusive Species of the Information Age: The Data Management Professional”, September 6 2006, DAMA International.

Strong, D.M., Fisher, C., Feinstein, D.L., and Longenecker, H.E. (2005). “Teaching, Learning, and Curriculum Development To Support Managing Information as a Product”. In Wang, Pierce, Madnick, and Fisher (Ed.) Information Quality – Part of the Association of MIS Monograph Series (pp. 217-229). Armonk, NY: M.E. Sharpe.

Yarbrough, David (2005). “Instantiating CMMI Level 3 In A Graduate Enterprise Integration Sequence: The Development Of Short-Cycle SDLC Documentation Templates”, Thesis, University of South Alabama.

Appendix of Figures

Figure 1. Characteristics of Data Management Personnel. The data in the three panels are survey results from 1000 attendees of the DAMA International conference (Perez 2006). (The figures are reproduced with permission of DAMA International.)

The figure shows the distribution of ages of Data Management personnel. The mean age is 45.

Data Management professionals have considerable experience. The figure shows that 91% have more than 10 years, while most have more than 20.

Although some Data Management professionals have had 20 years of experience, almost half have less than 10 years in the profession.

Figure 2. Comparison of Computing Professionals’ Expected Skill Depths for 49 Skills. Skill depth expectations are mean responses of the survey group, and are plotted in descending order of expectation for Database Analysts (data from Haigood 2000).

Database Analyst Skill Expectations vs.
Information Systems Analyst

Database Analyst Skill Expectations vs. Software Engineering Professional

Figure 3. Skills Comparison between IS Analyst and Database Analyst (top 20 skills)

Figure 4. Abstraction of the DAMA Framework Body of Knowledge. The data are an abstraction from Henderson et al (2004). We suggest the abstraction be called “Information Engineering”.

“Information Engineering”:
Data Life Cycle Management
Data Requirements Analysis and Documentation
Data Modeling, Access and Security
Planning for Data & Metadata
Data Quality
Data Models and Modeling
Relational Data Model
Data Warehousing
Data Security
Physical Database Access and Management
Data Storage Management
Data Access and Database Programming
Database failures, backup and recovery procedures
Standards Creation, enforcement, maintenance

Figure 5. Life-cycle Relationships between Systems Engineering, Information Engineering, Software Engineering, and Business Process Development. A preliminary version of this figure was presented by Longenecker et al (2006). The figure first appeared in the work of Yarbrough (2005).

Figure 6. IS2002 Skill Set With Proposed Additions. Sub-skills 1.3.1, 1.3.2, and 1.3.3 were defined in 2002. Sub-skills 1.3.4, 1.3.5, and 1.3.6 are recommended additions based on the DAMA Framework.
Skill | Description | Job Ad Words
1.3.1 | Modeling and design, construction, schema tools, DB Systems | modeling, SQL, construction, tools - top down, bottom up designs; schema, development tools; desk-top/enterprise conversions; systems: Access, SQLServer/Oracle/Sybase, data warehousing & mining; scripts, GUI tools
1.3.2 | Triggers, Stored Procedures, Audit Controls: Design/Development | triggers, audit controls - stored procedures, trigger concepts, design, development, testing; audit control concepts/standards, audit control implementation, T-SQL, PL/SQL
1.3.3 | Administration: security, safety, backup, repairs, replicating | monitoring, safety - security, administration, replication, monitoring, repair, upgrades, backups, mirroring, security, privacy, legal standards, HIPAA
1.3.4 | Metadata: architectures, systems, and administration | definition, principles, practices, role of metadata in database design, repository, dictionaries, creation, ETL, administration, usage, tools
1.3.5 | Data Warehouse: design, conversions, reporting | star schema, ETL, data cleansing and storage, reporting tools, business intelligence, analytic queries, SQL OLAP extensions, data mining
1.3.6 | Data Quality: dimensions, assessment, improvement | Data Accuracy, Believability, Relevancy, Resolution, Completeness, Consistency, Timeliness; Data definition quality characteristics, Data model / requirements quality characteristics; Data clean-up of legacy data, Mapping, transforming, cleansing legacy data; Data defect prevention, Data quality employee motivation, Information quality maturity assessment, Gap analysis, data governance and stewardship