Data, Information, Knowledge and Competency
Valdemar W. Setzer
1. Introduction
What does it mean to be competent in a foreign language, say, French? The reader should try to answer this question before proceeding with this paper. It would be interesting to write down the answer and compare it later with ours.
We put this question to many I.T. (Information Technology) professionals during interviews to assess their competencies. The answers varied from "having fluency in that language" to "being able to think in French." These vague characterizations are not very useful if the intention is to develop a data processing system that records competencies and makes it possible to suggest professionals for project teams or for managerial vacancies.
Looking to the literature for help was not effective: "competency" is rarely handled, and there was quite a bit of confusion between this concept and that of "knowledge." Worst of all, there was a general confusion between "knowledge" and "information," and also between "information" and "data." But here we faced a concept for which we had previously developed a definition. In this paper, we will provide this definition, and clear characterizations of the other three concepts.
It would also be interesting if the reader would try at this point to give her/his characterization of what s/he understands by "information" and "knowledge", to compare with ours. But your probable difficulty is not uncommon: in issue #81, dated Aug. 10, 1998, of the excellent electronic magazine "Netfuture – Technology and Human Responsibility," its editor, Stephen Talbott, relates that during two lectures to large audiences of librarians he asked what "information" was, and nobody risked an answer besides "that's the stuff we work with" [Talbott].
This paper begins with the definition of "data". Then characterizations (and not definitions, as will be clear) of "information," "knowledge" and "competency" are given. According to our characterization, "competency" depends on two factors, leading to a matrix representation, the "competency matrix." After general considerations about these concepts, a review of the literature is made. Then it is shown how they were used in the implementation of two different "competency systems" for PROMON (a very large Engineering firm with annual revenues of about 1 billion US dollars) and PRODESP, the São Paulo State Data Processing company, which has more than 1,000 technical professionals in the data processing field. Finally, some considerations are made on the implementation of Competency Centers, congregating employees of some professional area.
2. Data
We define data as a sequence of quantified or quantifiable symbols. Thus, a text is a piece of data. In fact, letters and characters are quantified symbols because there is a finite number of them; any alphabet (including digits and special characters) may be considered as a numbering system. Pictures, figures, recorded sounds and animation are also examples of data, because they may be quantified to the point that it may be difficult to distinguish their originals from a reproduction made from the quantified representation. It is very important to note that, even if incomprehensible to a reader, any text constitutes a piece of data. This will become clearer in the next section.
Therefore, in our definition data are necessarily mathematical entities, and thus are purely syntactic. This means that data may be totally described through structural, formal representations. Being quantified or quantifiable, they can obviously be stored into a computer and processed by it. Inside a computer, a piece of text may be linked to other pieces, through physical contiguity or through "pointers." Pointers are addresses of the storage unit being used. Thus, one gets a "data structure." Pointers may link a point of a text to a quantified representation of a figure, sound, etc.
Data processing in a computer is limited exclusively to structural manipulations of data, carried out by programs. Programs can themselves always be expressed as mathematical functions, and thus are also "data." Examples of such manipulation in the case of texts are their formatting, sorting, comparing with other texts, statistics of words appearing in the text, etc.
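The text manipulations just listed can be illustrated with a minimal sketch. The sample sentence and the function names are our own assumptions for illustration; the point is only that each operation acts on sequences of symbols and never on what they mean.

```python
import textwrap
from collections import Counter

# Hypothetical sample text; to the machine it is only a sequence of symbols.
text = "Paris is a fascinating city and Paris is cold in December"

def word_statistics(s):
    """Count occurrences of each whitespace-separated token."""
    return Counter(s.lower().split())

def sorted_words(s):
    """Sort tokens by the numeric codes of their characters."""
    return sorted(s.lower().split())

def same_text(a, b):
    """Compare two texts symbol by symbol."""
    return a == b

# Formatting: break the text into lines of at most 20 characters.
formatted = textwrap.fill(text, width=20)
stats = word_statistics(text)
ordered = sorted_words(text)
```

None of these functions needs to "know" that Paris is a city; they would work equally well on a text that no reader could understand.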
3. Information
Information is an informal abstraction (that is, it cannot be formalized through a logical or mathematical theory) which is in the mind of some person, representing something of significance to that person. Note that this is not a definition, it is a characterization, because "something," "significance" and "person" cannot be well defined. We assume here an intuitive (naïve) understanding of these terms. For instance, the phrase "Paris is a fascinating city" is an example of information – as long as it is read or heard by somebody, and as long as "Paris" means the capital of France, and "fascinating" has the usual, naïve quality associated with this word.
If the representation of some information is done through data, as in the phrase on Paris, it may be stored in a computer. But note that what is stored this way is not information, but its representation in the form of data. This representation may be transformed by the machine, as in text formatting, a syntactical transformation. The machine cannot act directly on the meaning, because meaning depends on a person who has the information. Obviously, the machine may shuffle the data in such a way that they become unintelligible to the person who receives them; in this case they have ceased to be information for that person. Furthermore, it is possible to transform the representation of some information in such a way that its meaning changes for the person receiving it (for example, by automatically changing the name "Paris" to "London"). There was a change of meaning for the human receiver, but in the computer the change was purely syntactical, a mathematical data manipulation.
It is not possible to process information directly in a computer. For this, it is necessary to reduce it to data. In our case, "fascinating" would have to be quantified, using for instance a scale of 0 to 4. But then in our sense it would not be information any more.
On the other hand data, as far as they are intelligible, are always incorporated by someone as information, because (adult) humans are always looking for meaning and understanding. When the phrase "the average temperature in Paris in December is 5 °C" (by hypothesis) is read or heard, an immediate association is made by the reader (or hearer) with cold, with a certain period of the year, with the particular city, etc. Note that "meaning" cannot be formally defined. Here it will be considered as a mental association with a concept, such as temperature, Paris, etc. The same happens when we see an object with a certain shape, and we say that it is "circular," associating – through our thinking – our mental representation of the perceived object with the concept "circle." For a deep study of thinking, showing that as far as our inner activity is concerned, it is an organ for the perception of concepts, see one of the fundamental works by Rudolf Steiner, his "Philosophy of Freedom" (direct translation of the German original title), especially chapter IV, "The world as perception" [Steiner 1963, p. 76].
Information may be an inner property of some person, or may be received by her. In the first case, it is in the mental sphere, and may originate from an inner perception, like some pain. In the second, it may or may not be received through its symbolic representation as data, that is, in the form of text, figures, recorded sound, animation, etc. As said above, the representation by itself, for example a text, consists exclusively of data. Reading a text, a person may absorb it as information, as long as she understands it. It is possible to associate the reception of information through data with the reception of a message. Nevertheless, information may also be received without being represented through data. For instance, on a cold day, if someone in a heated room extends her arm through the window, she obtains some information – whether or not it is too cold outside. Observe that this information is not represented through symbols, and should not be called a "message." On the other hand, it is possible to have a message which is not expressed through data, for instance in the case of a strong shout or vocal noise: it may contain lots of information for its receiver, but does not contain any data.
When we exemplified data, we used "recorded sound." This is due to the fact that sounds in nature contain much more than what may be recorded: hearing them, there is a whole context that disappears in the recording. The noise produced by sea waves, for example, comes with the view of the sea, its smell, the air humidity, the luminosity, the wind, etc.
A fundamental distinction between data and information is that the former is purely syntactical, and the latter necessarily contains semantics (implied by the word "meaning," used in its characterization). It is interesting to note that it is impossible to introduce semantics into a computer and process it, because the machine itself is purely syntactical (as the whole of mathematics also is). If one examines for instance the field of the so-called "formal semantics" of programming "languages" one would notice that it is in fact just syntax, expressed through an axiomatic theory or through mathematical associations of its constructs with operations performed by a (possibly abstract) computer. As a matter of fact, "programming language" is a misnomer, because what one normally calls a language contains semantics. (A couple of years ago, in a public lecture, we heard N. Chomsky – the famous researcher who established in 1959 the field of "formal languages," and who had intensively looked for syntactic "deep structures" in our language and brain – say that a programming language is not a language at all.) Other misnomers used in the computer field, connected to semantics, are "memory" and "artificial intelligence." We are against their use because they give e.g. the false impression that our memory is equivalent in its function to computer storage devices, or vice-versa. Theodore Roszack makes interesting considerations showing that our memory is infinitely wider [Roszack 1994 p. 97]. John Searle, the author of the famous allegory of the Chinese Room (in which a person, following rules in English, combines Chinese ideograms without understanding them at all, and thus answers questions in this language – this is the way computers process data), which demonstrates that computers have no understanding, argued that computers cannot think because they lack our semantics [Searle 1991 p. 39].
Searle's allegory suggests an example that may help to clarify our concepts a bit further. Suppose we have a table with 3 columns: city names, months (represented by 1 to 12) and average temperatures, such that column titles and city names are in Chinese. For someone who has absolutely no knowledge of Chinese and its ideograms, that table is pure data. If the same table were in English, for an American with reading abilities it would be information, as long as he or she knew what the city names represent and the temperatures were in degrees Fahrenheit. Note that the table in Chinese could be formatted, its lines could be sorted according to city names (given an alphabetic ordering of the ideograms), months, etc. – examples of pure syntactic processing.
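A minimal sketch of this example follows. The three rows and their temperature values are invented for illustration, and Unicode code-point order is assumed as a stand-in for the "alphabetic ordering of the ideograms"; any fixed ordering of the symbols would serve equally well.

```python
# Hypothetical table: (city name, month, average temperature).
# The values are invented for illustration only.
rows = [
    ("北京", 1, -3.0),
    ("巴黎", 12, 5.0),
    ("上海", 1, 5.0),
]

# Sorting by city name compares the numeric codes of the ideograms;
# no understanding of what the names denote is involved.
by_city = sorted(rows, key=lambda row: row[0])

# Sorting by month, then by temperature, is equally syntactic.
by_month = sorted(rows, key=lambda row: (row[1], row[2]))

for city, month, temperature in by_city:
    # Formatting: align the columns for display.
    print(f"{city}\t{month:>2}\t{temperature:>5.1f}")
```

The program sorts and formats the table correctly whether or not anyone who reads the output can interpret the city names.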
4. Knowledge
We characterize knowledge as a personal, inner abstraction of something that has been experienced by someone. In our example, someone has some knowledge of Paris only if she or he has visited it. Later on we will loosen this requirement somewhat.
In this sense knowledge cannot be described in its entirety – what can be described is information. Furthermore, it does not depend just on a personal interpretation, as with information, because it requires a personal experience with the object of knowledge. Thus, knowledge is in the purely subjective realm of humans or animals. Part of the difference between both resides in the fact that a human may be aware of his or her own knowledge and may describe it conceptually in terms of information, for instance through the phrase "I visited Paris, so I know it" – supposing that the reader or hearer understands what is meant.
In our characterization, information may be stored into a computer, but not processed in its meaning; furthermore, as long as it resides in a computer, it is not information anymore, it is pure data. As knowledge is not subjected to representations, it cannot be inserted into a computer. Thus, in our sense it is absolutely wrong to speak of a "knowledge base" in a computer. At most we may have a traditional "database."
A baby has quite a bit of knowledge. For example, she may recognize her mother, she knows that when crying she gets fed, etc. But it is not possible to say that a baby has information, because she does not associate concepts – at least not in the sense of an adult, who may do it consciously. Along this line, it is not possible to say that an animal has information, but it certainly has lots of knowledge.
Thus, there exists information that is related to some knowledge, as in the case of the phrase on Paris, pronounced by someone who knows this city. But there may be information without this relation, for instance if the person read a travel guide before visiting Paris for the first time. Therefore, information may be practical or theoretical, respectively. Knowledge is always practical.
We associated information to semantics. Knowledge is associated to pragmatics, that is, it is related to something existing in the "real world" of which we have a direct experience. (Again, we are assuming here a naïve understanding of "real world.")
5. Competency
Competency is the capacity of executing a task in the "real world." In our example, this could correspond to the capacity of working as a tourist guide in Paris. (Note that in our sense a travel guidebook only contains information.) A person can only be considered competent in some field if he or she has demonstrated through past accomplishments the capacity for executing a required task in that field.
We associated pragmatics to knowledge. Competency is associated with a physical activity. A person may have a good degree of competency e.g. in delivering speeches. For this, she or he must move her or his mouth and produce physical sounds. A competent mathematician is not just a person who is able to solve mathematical problems and eventually create new mathematical concepts – which may be purely inner, abstract, mental (and thus not physical) activities. He or she must also be able to transmit his or her mathematical concepts to others. This transmission is obviously done through physical (outer) actions.
The creativity which may be associated with competency reveals another one of its characteristics. Competency may be connected to freedom, which did not appear in the other three because there was no activity involved with them, other than their acquisition. In our example, a competent guide to Paris will tour two different tourists in different ways, recognizing that they have different interests. Furthermore, such a guide may improvise different tours for two tourists with the same interests but with different personal reactions along the tour or just by having an intuition that the tourists should be treated differently. Cusumano and Selby describe how Microsoft Corporation has organized its software development teams permitting the creativity typical of hackers but at the same time directing it to established objectives, maintaining the compatibility of modules through periodic synchronizations [Cusumano 1997]. Here is another distinct feature of humans and animals in terms of competency: humans are not necessarily directed by their "programs" as animals are, and may be free and creative, improvising different activities in the same environment. In other words, animal competency is always automatic, deriving from a physical necessity. Humans may establish mental objectives for their life, such as cultural or religious ones, having nothing to do with physical needs. These objectives may involve the acquisition of some knowledge and certain competencies, leading to self-development.
Competency requires knowledge and personal capacities for realizing something concrete. Therefore, it is impossible to introduce competency into a computer. One should not say that an automated power lathe has some competency. One should say that it contains data (programs and input data) which are used to control its functioning.
As with knowledge, a competency cannot be fully described. When comparing competencies, one has to know that this comparison just gives a rough idea of the degree of competency a person has. Thus, when classifying a competency into, say, "none," "developing," "proficiency," "strong" and "expert," as proposed in the MIT I/T Competency Model [MIT I/T], or "novice", "advanced beginner", "competent", "proficient" and "specialist", according to Hubert and Stuart Dreyfus [Devlin 1999 p. 187], one should be conscious of the fact that something is being reduced to information (as long as those terms are understood). There is a clear intuitive ordering of those degrees, from none or weak to high competency. Associating a "weight" to each one, as 0 to 4 in the MIT case, and 0 to 5 in the Dreyfuses' (here, 0 should be interpreted as "none"), there has been a quantification of something that is non-quantifiable in its essence. Therefore, one should be aware of the fact that when calculating someone's "total competency" over various fields – possibly required by some project –, a metric is being introduced which reduces some subjective human characteristic to an objective shadow of what it really is, and this may lead to many errors. The situation is worsened with behavioral skills, like "leadership," "ability to interact with others," etc.
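The kind of calculation being cautioned against can be sketched as follows. The weights follow the 0 to 4 scale mentioned above for the MIT levels; the field names, the sample assessment and the simple summation are our own assumptions for illustration, not a description of any actual competency system.

```python
# Weights for the five MIT levels named in the text (0 to 4).
MIT_LEVELS = {
    "none": 0,
    "developing": 1,
    "proficiency": 2,
    "strong": 3,
    "expert": 4,
}

def total_competency(assessment, required_fields):
    """Sum the weights of a person's levels over the required fields.

    Fields missing from the assessment are treated as "none".
    The result is only a rough, quantified shadow of real competency.
    """
    return sum(MIT_LEVELS[assessment.get(field, "none")]
               for field in required_fields)

# Hypothetical assessment of one professional.
person = {"databases": "strong", "networks": "developing"}

# Hypothetical fields required by some project.
project_fields = ["databases", "networks", "project management"]

score = total_competency(person, project_fields)
```

Two people with the same score may have very different real capacities, which is why such a number should serve only as a rough suggestion, to be followed by personal analysis.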
We are not saying that such quantified assessments should not be used; we just want to point out that they should be used with extreme reserve, and one should be aware that they do not represent what competency the person being assessed really has. We think that such assessments may be used just as rough suggestions, and should be followed by personal – and thus subjective – further analysis. If the computer is used to process data, one is in the objective realm. Humans are not objective entities, so they should always be treated with some degree of subjectivity, otherwise they are reduced to machines (this is obviously even worse than treating them as animals).
6. Intellectual fields
Our characterizations apply very well to practical fields, such as data processing or engineering, but they need further elaboration for purely intellectual ones. Let us examine the case of a competent historian. There is no problem with his or her competency: it is manifested through written papers and books, possibly through lectures and courses, etc. On the other hand, we have to extend our characterization of knowledge to encompass such intellectual fields as history: in general historians do not have personal experiences of past times, people and places. Nevertheless, a good historian is certainly a knowledgeable person in her or his field.
Unfortunately, our way out of this apparent incongruence of our characterization will not be accepted by everyone: we postulate, as a working hypothesis, that a good historian has in fact a personal experience – not of physical situations, but of the Platonic "world" of ideas. Ancient facts are recorded in that world as "realities" and are grasped through thinking by a person that immerses him or herself into the study of ancient accounts. The words "intuition" and "insight" deal with mental activities having sometimes to do with a "perception" of that world. In fact, "insight" means, according to the American Heritage Dictionary (1970 edition), "the capacity of discerning the true nature of a situation," "an elucidating glimpse." "True natures" are concepts, hence do not exist physically; we make the hypothesis that through insight, that is, an inner perception, we "glimpse" the world of ideas [Steiner 1963, p. 112].
If one may admit as a working hypothesis that the concept of a circle is a "reality" in the world of ideas, existing independently of any person, then it is not difficult to admit that our thinking is an organ of perception with which we may "experience" the eternal, universal idea of "circle." In this sense, and using our characterization for "knowledge," one may say that a person may have a knowledge of the concept "circle." Note that nobody has ever seen a perfect circle, just as no living person has experienced with his or her senses the French Revolution, or has met Goethe, but both are realities in the latter's "archetypal world."
Thus we have saved good historians from being labeled as having just information and no knowledge…
7. General comments
It is necessary to recognize that our characterizations of data, information, knowledge and competency are not the usual ones. For instance, it is common to consider "data" as a proper subset of "information," that is, data is a particular kind of information. We found it useful to separate completely these two concepts, that is, according to our considerations, data are not part of information. The same applies to information and knowledge, and to knowledge and competency.
It is interesting to observe that, according to our characterizations, there is no (formal) "Information Theory." What Claude Shannon developed was in fact a "Data Theory." Theodore Roszack relates discussions originated from the name "Information Theory" [Roszack 1993 p. 12]. Shannon's theory deals with, for example, the capacity channels have of transmitting data, and not information. Thus, in our sense, one should not talk about "amount of information," but "amount of data" transmitted by some channel. "Bit" is not a unit of information measure, but of data, as shown by its name ("BInary digiT"): numbers, by themselves, do not contain information, they are pure data.
Data are purely objective – they do not depend on their user. Information is objective-subjective, in the sense that it is described in an objective way (texts, pictures, etc.) but its meaning is subjective, depending on its user. Knowledge is purely subjective – each person experiences something in a different way. Competency is subjective-objective, in the sense that it is a purely personal characteristic, but everybody may examine its outcome.
The characterizations made above may be useful for enterprises. They should become conscious of the fact that they don't introduce information into computers, but data. There are two aspects to be considered here. Data should represent as well as possible the information that should be acquired from them. Furthermore, the enterprise's professionals always interpret them. The same data may be used as two different pieces of information. To avoid it, it is not sufficient that the desired information be clearly represented, but that the professionals be prepared to interpret it in the expected manner. Keith Devlin mentions some tragic cases, such as airplane crashes, due to erroneous interpretation of data or to ambiguous representation of information [Devlin, pp. 9 (the case of the Canary Islands in 1977, with 583 deaths) and 76 (the Cali case in 1995, with 159 deaths), respectively].
On the other hand, it is important to know that it is impossible to transmit knowledge: what is transmitted is data, possibly representing some information. To accomplish knowledge transmission from one person to another, it is necessary to provide for personal interactions between both, with the first one vividly showing or describing her experience. Devlin mentions two cases of large enterprises where there was an attempt to transmit knowledge through data, but the transmission became effective only through personal contact [idem, pp. 176 and 177].
As for competency, it can only be acquired by doing something. Thus, enterprises wishing to develop competency among their professionals in a certain area should make them work in that area or participate in projects, preferably with people with high competency.
We found support for some of our ideas in the literature. For example, Y.Malhorta says: "The traditional paradigm of information systems is based on seeking a consensual interpretation of information based on socially dictated norms or the mandate of company bosses. This has resulted in the confusion between knowledge and information. Knowledge and information, however, are distinct entities. While information generated by computer systems is not a very rich carrier of human interpretation for potential action, knowledge resides in the user's subjective context of action based on that information. Hence, it may not be incorrect to suggest that knowledge resides in the user and not in the collection of information, a point made two decades ago by West Churchman, the leading information systems philosopher." [Malhorta 1978]
Note that in our sense, information cannot be generated by a computer. Computers can only reproduce its representation under the form of data, eventually with some change in format or some purely syntactic treatment. Computers may generate data, for example calculating the average temperature of various cities. Furthermore, we have associated "action" to competency, and not to knowledge.
Malhorta also says: "Karl Erik Sveiby, the author of The New Organization Wealth: Managing and Measuring Knowledge-Based Assets (Berret Koehler, 1997), contends that the confusion between knowledge and information has caused managers to sink billions of dollars in information technology ventures that have yielded marginal results. Sveiby asserts that business managers need to realize that unlike information, knowledge is embedded in people, and knowledge creation occurs in the process of social interaction. On a similar note, Ikujiro Nonaka, the first Xerox distinguished professor of knowledge at University of California at Berkeley, has emphasized that only human beings can take the central role in knowledge creation. Nonaka argues that computers are merely tools, however great their information-processing capabilities may be."
We consider the confusion between information and competency much worse than that between information and knowledge. Competency should be approached with much more subjectivity, and should be connected to some physical accomplishment.
According to our characterization, an individual may acquire knowledge without social interaction. For instance, someone may make an extensive visit to Paris alone, without speaking to the local people. Well, Paris is a result of social interactions, but the visit could be made to a lake or mountain.
Nonaka seems to imply that knowledge can be described. We do not agree with this. Finally, in our sense there is no "information processing," just "data processing" by computers. For example, formatting information in a computer consists, in reality, in formatting the data which represent that information. With even greater emphasis, we are against using the expression "knowledge processing" or "knowledge database."
In their book on knowledge management, Davenport and Prusak say: "Knowledge is neither data nor information, though it is related to both, and the differences between these terms are often a matter of degree." [Davenport 1998, p.1]. We agree with the first statement. But in our characterizations, the three are absolutely different, and not just a matter of degree.
They are also in agreement with Malhorta: "Confusion about what data, information and knowledge are - how they differ, what those words mean - has resulted in enormous expenditures on technology initiatives that rarely deliver what the firms spending the money needed or thought they were getting. ... Organizational success and failure can often depend on knowing which of them you need, which you have, and what you can and can't do with each" [p. 1]. We have tried to establish essential differences; we hope they help to end the present confusion.
They characterize "data" (they refer to it in the singular) as "a set of discrete, objective facts about events" [p. 2]. We agree that data are discrete and objective. But we do not agree with the events: data may be generated by computers. For example, they may be the outcome of some calculations having nothing to do with facts of the real world (the events). Feeling cold is an event, but it is not done through data. They state that "data by itself has little relevance and purpose" [p. 2]. We consider data by themselves as being just symbolic representations, having absolutely no relevance and purpose; only when they are used not as data, but as information, relevance and purpose are attached to them – but then they are not data anymore.
They also state that "... there is no inherent meaning in data. Data describes only a part of what happened" [p. 3]. Yes, there is no meaning in data, they are just syntactical descriptions, but per se they have no connection to what they describe. A human must establish this connection. Furthermore, we would not say that "data describe" something. They may be just the representation of some information, but may also be pure garbage, without possibility of extracting any information from them. For example, the table with city names and temperatures in Chinese (see section 3) is pure garbage for someone who does not read or does not understand that language.
Two interesting statements: "Data is important to organizations - largely, of course, because it is essential raw material for the creation of information" [p. 3]. "Firms sometimes pile up data because it is factual and therefore creates an illusion of scientific accuracy" [idem]. We mentioned data's objectivity; furthermore, they can always be expressed mathematically, hence the mentioned illusion.
In their section on information, they describe it as "... a message, usually in the form of a document or an audible or visible communication. As with any message it has a sender and a receiver. Information is meant to change the way the receiver perceives something ... The word 'inform' originally meant 'to give shape to' and information is meant to shape the person who gets it, to make some difference in his outlook or insight" [p. 3]. Our characterization is more general: it does not imply that information is meant by its originator to be transmitted to someone else. Furthermore, as in the example of putting the arm outside a window to assess the cold, information may not be received through a message. But we appreciate their concept of information as a message (as long as it makes sense to the human receiver), because it covers most of the purposes for creating some information. An important point here is "Strictly speaking, then, it follows that the receiver, not the sender, decides whether the message he gets is really information – that is, if it truly informs him" [p. 3]. Later on, we read "Unlike data, information has meaning - the 'relevance and purpose' … Not only does it potentially shape the receiver, it has shape: it is organized to some purpose. Data becomes information when its creator adds meaning" [p. 4]. The problem of the creator notwithstanding (it is mainly the receiver who associates meaning, and the data "creator" could be a computer), it is nice to see that our ideas, developed independently, are in full agreement with some of theirs. Their last phrase in this section is worth mentioning: "The corollary for today's managers is that having more information technology will not necessarily improve the state of information" [p. 5]. This is obvious: the technology is a data technology, and not information technology or, at best, it is the technology of transmitting and representing information.
As their book is on knowledge management, in their section on knowledge they provide a thorough characterization for it: "Knowledge is a fluid mix of framed experience, values, contextual information, and expert insight that provides a framework for evaluating and incorporating new experiences and information. It originates and is applied in the minds of knowers. In organizations, it often becomes embedded not only in documents or repositories but also in organizational routines, processes, practices and norms." [p. 5].
Our characterization restricts knowledge to a personal experience; it does not agree with the rest. In particular, routines and processes may not be in the minds of knowers, and written norms are in our sense just data. They may be read as information, but probably some of those norms are incomprehensible, and thus are just pure data. "While we find data in records or transactions, and information in messages, we obtain knowledge from individuals or group of knowers, or sometimes in organizational routines" [p. 6]. Yes, knowledge resides in individuals, but what they transmit and what may be obtained from them is information (if the receiver understands it), that is, messages, according to the authors, in general under the form of data. But the example of the shout (see section 3) also shows that information may be transmitted without being represented through data.
It is interesting to observe that their valuable book does not mention competency. Sometimes they marginally touch "skills" [p. 11, 77, 97], but their main focus is on storing and transmitting knowledge (rather, their understanding of it), practices and technologies for knowledge management, etc.
An interesting book is the one by Keith Devlin [Devlin 1999]. He attempts "to develop a scientific understanding of information and knowledge" [p. 3]. "It is because it is built on a sound, theoretical investigation of information that this book differs from the majority of business books with the word 'information' or 'knowledge' in their titles …" [p. 5]. But, at the end of the book, he says that his "theory," which he calls "situation theory" or, more precisely, his book is simply "at the very least, the beginnings of a science of information." It could not be more than that, because those two words depend on human factors, something he clearly recognizes in the latter case [idem]. Fundamentally, it seems to us that his "theory" covers specification of contexts ("situations"). In fact, in our sense, for having information it is necessary to have an understanding of the message or perceived phenomenon, as in the examples of feeling the cold air or shouting, cf. section 3. This involves context, given by the person who receives the message or has the perception.
As we do, he also gives importance to a precise conceptualization of data, information and knowledge: "Understanding the subtle distinctions between the three concepts of data, information, and knowledge is essential…" [p. 14]. According to our concepts, the distinctions are not subtle, they are enormous; thus, maybe our concepts are clearer. "… as we have observed, the distinction between information and knowledge is not a clean one, and it in large part a matter of emphasis." [p. 151]. We consider our concepts quite "clean," mainly in practical areas like engineering or data processing (as we have seen, the situation is not so simple as far as purely intellectual fields are concerned).
"Roughly speaking, data is what newspapers, reports, and 'computer information systems' provide us." [p. 14]. Our definition (cf. section 2) is much more precise and generic. Next comes the following: "When people acquire data and fit it into an overall framework of previously acquired information, that data becomes information." [idem], which sounds quite circular. As a good mathematician, Devlin gives an "equation": "Information = Data + Meaning" [idem] and thus arrives at one of the forms of information given here.
According to our concepts, data are representations. Information can only be acquired from some data if a meaning is associated to them, but it may also be acquired without data. By the way, we don't like these representations through equations, as if this were a branch of Mathematics, which it is not by any means. How is it possible to add two different measures? (This is analogous to adding somebody's height to her weight.) Still worse, here we don't even have measures, although it is possible to express data in "bits" according to our concepts. He does not define data and, as we saw, "meaning" is something informal. Thus, this addition has nothing to do with additions in Mathematics, and the whole is not an equation, as he calls it.
At this point, he reaches knowledge: "When a person internalizes information to the degree that he or she can make use of it, we call it knowledge. … As an equation: Knowledge = Internalized information + Ability to utilize the information" [p. 15]. Here we diverge: in the expounded sense, it is possible to have knowledge without making use of it. Furthermore, it is possible to use theoretical information (internalized) to derive some other theoretical information. For us, knowledge requires a personal experience, but the internalization mentioned by him could happen from theoretical data (as in the case of the guide book in section 3). But the disagreement is not total: "Knowledge exists in the individual minds of people." [idem], in spite of probable different conceptions of "mind" (for us, it is not physical). But, at the end of the book he says: "Knowledge is information possessed in a form that makes it available for immediate use to guide action." [p. 200]. If it exists in the minds of people, how is it possible to say that it has some "form"?
We are not going to enter into the details of other formulations he gives to these concepts, because this would turn into a review of the book. It remains to briefly handle what he understands under "competency." He deals very little with it, and calls it "expertise" [p. 185]. Our denomination looks better, because at the lowest competency levels it is not possible to say that a professional is an expert. He introduces the competency grades mentioned in our section 5, characterizing each grade [p. 186]. The "novice" is a person characterized by following rules consciously and unquestioningly. The "advanced beginner" also follows rules, but "modifies some of the rules according to context." The "competent performer still follows rules but does so in a fairly fluid fashion – at least when things proceed normally." "For much of the time the proficient performer does not select and follow rules." As may be seen, these characterizations are not very objective and probably require that the degree of competence be determined by another professional. And what should be said about intellectual areas, such as design, computer programming, etc.? In these areas, the activity is always conscious. He makes the following association: "Stage 1 of expertise [novice] corresponds roughly to information so simply and directly linked to its representation that it could almost be classified as data. Stages 2 and 3 of expertise correspond more or less to the possession of information. Stages 4 and 5 correspond to knowledge." [p. 188]. These analogies do not seem to be very natural. From our point of view, it is like mixing up quite different things. Someone can do nothing just with data, because they have no meaning. With information it is possible to do something concrete, but only in the case it is related (or becomes related while executing an action) with some knowledge, otherwise it has nothing to do with the real world.
In our characterizations, even in its more elementary level, competency always involves some knowledge, because it deals with the real world.
Devlin does not recognize that competency has to do with some ability (skill) over some knowledge area, as we will describe in the next section.
Our example of a competent guide to Paris indicates that competency is an ability of producing something (acting as a guide) over a certain knowledge area (Paris). In this case, a person has to know Paris in order to be a competent guide to it, hence the knowledge involved. Someone is competent in a foreign language (the knowledge area), if s/he has the ability of reading it, or understanding the spoken language, or speaking it, or giving lectures in it, or doing written, simultaneous or consecutive translations from it, etc. Note that a person may have different competencies for every ability in each one of many different foreign languages. But for all foreign languages one may consider the same abilities. This answers the question formulated in the beginning of this paper.
Thus, one may construct for each professional a competency matrix, indicating in its rows the knowledge areas of interest, and in its columns the various abilities which apply to each area. Each cell contains a degree of competency, e.g. one of the 5 or 6 mentioned in section 5 or those that will be described in the sequel; the lack of competence may be indicated by an empty cell.
A professional may not be competent in a certain ability for a certain knowledge area, but may have knowledge (personal experience) thereof. We indicate this fact by assigning a degree of knowledge to the corresponding cell in his matrix. The same applies to information if there is no knowledge, in the senses expounded before. Thus, the competency matrices may also be used for representing knowledge (requiring some practical experiences, such as having done some exercises, having accompanied some project, etc.) and information (representing a mere theoretical knowledge, obtained through reading, studying, taking courses without practical exercises, etc.).
To simplify the matrix, representing some degree of competency in a cell overrides the representation of some degree of knowledge, which in turn overrides the representation of information. This has worked well in the various fields of data processing; interviewed professionals were quite satisfied with this simplification. A further simplification was introduced at PROMON (see section 1), shrinking the knowledge and information degrees from 5 to 3 values (none, weak/reasonable and good/excellent). In the PRODESP system the simplification was even greater: only two degrees were used for competency ("basic" and "advanced"), and only one each for knowledge and information (corresponding to having it or not), besides "none" (empty cell), valid for the three.
For example, if someone has just taken a theoretical course or has read some manuals on a certain field, we insert a degree of information in the corresponding cell. If the same person has done some practical exercises or has carefully examined some products developed using that information, we classify it as knowledge. Only if the professional has already produced some product in that area or has worked in it for some time do we insert a degree of competency.
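As an illustration only, the matrix and the override rule described above can be sketched in Python. Nothing here comes from the PROMON or PRODESP implementations: the names `Kind`, `classify` and `CompetencyMatrix`, and the use of small integers for degrees, are assumptions made for this sketch.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import Dict, Optional, Tuple


class Kind(IntEnum):
    """What a cell records; a higher kind overrides a lower one."""
    INFORMATION = 1  # theoretical only: courses, manuals
    KNOWLEDGE = 2    # practical exercises, or products carefully examined
    COMPETENCY = 3   # has produced something, or worked in the area


def classify(studied: bool, practiced: bool, produced: bool) -> Optional[Kind]:
    """Map the three cases of the text to a kind (None means an empty cell)."""
    if produced:
        return Kind.COMPETENCY
    if practiced:
        return Kind.KNOWLEDGE
    if studied:
        return Kind.INFORMATION
    return None


@dataclass
class CompetencyMatrix:
    """Rows are knowledge areas, columns are abilities; a missing key is an empty cell."""
    cells: Dict[Tuple[str, str], Tuple[Kind, int]] = field(default_factory=dict)

    def record(self, area: str, ability: str, kind: Kind, degree: int) -> None:
        """Store (kind, degree) unless the cell already holds a higher kind."""
        current = self.cells.get((area, ability))
        if current is None or kind >= current[0]:
            self.cells[(area, ability)] = (kind, degree)

    def cell(self, area: str, ability: str) -> Optional[Tuple[Kind, int]]:
        """Return the (kind, degree) of a cell, or None if it is empty."""
        return self.cells.get((area, ability))


matrix = CompetencyMatrix()
matrix.record("French", "reading", Kind.COMPETENCY, 2)
matrix.record("French", "reading", Kind.INFORMATION, 1)  # ignored: lower kind
matrix.record("French", "speaking", classify(True, True, False), 1)
```

Keeping a single (kind, degree) pair per cell mirrors the simplification in which competency overrides knowledge, which in turn overrides information.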
In the engineering and data processing fields, many products and systems are produced through projects. In these cases, the following typical abilities were represented, corresponding to project phases: 1. Analysis (of requirements and objectives); 2. Design (planning and product modeling); 3. Construction (programming, assembling the system or product); 4. Implementation (testing, user training); 5. Support (maintenance, help desk). At PROMON, the first two items were combined in just one, because we noticed that every professional that had one of them had the other too.
The PROMON system had only one matrix, for I.T. In the PRODESP system, the number of matrices is variable. We introduced the following: 1. I.T.; 2. Systems (where each knowledge area is a system developed by the company); 3. Administrative areas (representing the competency outside of I.T., as for instance in legal procedures, management of human resources, etc.); 4. Foreign languages; and 5. Academic studies. Abilities and competency degrees vary with each matrix. For example, in the academic studies matrix there are only two columns for abilities: highest degree attained (complete or incomplete), and number of years the professional has worked in each area in which s/he had some academic study.
The PROMON system was implemented as a prototype, using electronic spreadsheets. Thus, the competency matrix of a professional is simply a spreadsheet. Some database structuring was employed, such as coding the knowledge areas.
On the other hand, at PRODESP a database management system was used, permitting the generalization of all data structures. Each matrix is displayed as a column in which knowledge areas are structured in the form of a two-level tree in the MS Windows standard, that is, with possibility of expanding or contracting the first level. Abilities appear in a second column, also in two levels; the degree of competency appears at the side of the ability. While exhibiting a matrix for a certain professional, the selection of an area with the "mouse" produces automatically the display of the degrees of competency assigned to that professional in the abilities valid for that matrix.
10. Selecting professionals
For the selection of professionals who satisfy a certain combination of competencies, both the PROMON and PRODESP systems employ the same matrix representation used for assigning competencies to professionals. In the former, cells are filled with the minimal desired competencies. When various abilities are desired for the same area or different areas, the logical AND connective is assumed. On the other hand, in the PRODESP system it is possible to specify which of the comparison operators <=, = or >= is going to be used in the comparison with the competencies of professionals. This selects professionals with a competency "less than or equal to" (to see which professionals do not have the minimum required competency, for instance when selecting candidates for a training program), "equal to" or "greater than or equal to" that indicated at the selection window.
Each line of this condition may have: 1) Just one knowledge area (indicating that any professional with any non-empty cell in any ability for that area is selected); 2) An area and an ability (any professional with any degree assigned in that ability for that area); 3) An area, an ability and a degree of competency (the overall comparison operator is used). A line with a selection condition may be combined with the next one through a logical AND or OR connective. In the latter case, it is possible to specify alternatives like "competency in UNIX or LINUX installation."
Besides the selection condition, it is possible to specify an obsolescence factor, giving the year in which the professional has worked in a knowledge area for the last time, as for instance "select the professionals who have worked with Delphi at least up to year 1998."
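The selection rules above can be illustrated with a minimal Python sketch. It is not the PRODESP code. The `Line` record, the numeric degrees, the left-to-right combination of AND/OR, and the treatment of an empty cell as degree 0 under the comparison operators are all assumptions of this sketch, since the text does not fix them.

```python
import operator
from dataclasses import dataclass
from typing import Optional

OPS = {"<=": operator.le, "=": operator.eq, ">=": operator.ge}


@dataclass
class Line:
    area: str
    ability: Optional[str] = None
    degree: Optional[int] = None
    connective: str = "AND"              # joins this line to the next one
    min_last_year: Optional[int] = None  # obsolescence factor for this area


def line_matches(prof: dict, line: Line, op: str) -> bool:
    """prof = {"matrix": {(area, ability): degree}, "last_year": {area: year}}."""
    matrix = prof["matrix"]
    if line.min_last_year is not None:
        if prof.get("last_year", {}).get(line.area, 0) < line.min_last_year:
            return False
    if line.ability is None:             # case 1: any non-empty cell in the area
        return any(area == line.area for area, _ in matrix)
    if line.degree is None:              # case 2: any degree in that cell
        return (line.area, line.ability) in matrix
    # case 3: compare degrees; an empty cell counts as 0 (assumption)
    return OPS[op](matrix.get((line.area, line.ability), 0), line.degree)


def select(professionals: dict, lines: list, op: str = ">=") -> list:
    """Names of professionals satisfying the lines, combined left to right."""
    if not lines:
        raise ValueError("at least one condition line is required")
    selected = []
    for name, prof in professionals.items():
        result = line_matches(prof, lines[0], op)
        for prev, line in zip(lines, lines[1:]):
            current = line_matches(prof, line, op)
            if prev.connective == "AND":
                result = result and current
            else:
                result = result or current
        if result:
            selected.append(name)
    return selected


staff = {
    "ana": {"matrix": {("UNIX", "installation"): 2}, "last_year": {"UNIX": 1999}},
    "bob": {"matrix": {("LINUX", "installation"): 1, ("Delphi", "construction"): 2},
            "last_year": {"LINUX": 1996, "Delphi": 1998}},
}
unix_or_linux = [Line("UNIX", "installation", connective="OR"),
                 Line("LINUX", "installation")]
```

With this data, `select(staff, unix_or_linux)` expresses "competency in UNIX or LINUX installation", and `select(staff, [Line("Delphi", min_last_year=1998)])` expresses the Delphi example with its obsolescence factor.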
When using competency matrices and vectors for the assignment of professionals to projects and positions, one should take into account the observation we made on objective and subjective evaluations (see end of section 5).
During the specification of a selection condition the system assembles an SQL query to the database, using an algorithm which was designed for this purpose.
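The algorithm itself is not described here, so the following Python sketch only shows one plausible way of turning condition lines into a parameterized SQL query. The tables `professional(name)` and `competency(name, area, ability, degree)`, the function `build_query`, the left-to-right parenthesization, and the use of SQLite instead of the Oracle database mentioned in section 12 are all assumptions made for illustration.

```python
import sqlite3

ALLOWED_OPS = {"<=", "=", ">="}


def build_query(lines, op=">="):
    """Build (sql, params) from lines of (area, ability, degree, connective).

    ability and degree may be None; connective ("AND"/"OR") joins a line
    to the next one, and lines are grouped left to right.
    """
    if op not in ALLOWED_OPS:
        raise ValueError(f"unsupported operator: {op!r}")
    if not lines:
        raise ValueError("at least one condition line is required")
    expr, params, pending = None, [], None
    for area, ability, degree, connective in lines:
        if connective not in ("AND", "OR"):
            raise ValueError(f"unsupported connective: {connective!r}")
        match = "c.name = p.name AND c.area = ?"
        line_params = [area]
        if ability is not None:
            match += " AND c.ability = ?"
            line_params.append(ability)
        if ability is not None and degree is not None:
            # Case 3: compare the stored degree; an empty cell counts as 0.
            clause = ("COALESCE((SELECT c.degree FROM competency AS c "
                      f"WHERE {match}), 0) {op} ?")
            line_params.append(degree)
        else:
            # Cases 1 and 2: any non-empty cell is enough.
            clause = f"EXISTS (SELECT 1 FROM competency AS c WHERE {match})"
        expr = clause if expr is None else f"({expr} {pending} {clause})"
        params.extend(line_params)
        pending = connective
    sql = f"SELECT p.name FROM professional AS p WHERE {expr} ORDER BY p.name"
    return sql, params


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE professional(name TEXT PRIMARY KEY);
CREATE TABLE competency(name TEXT, area TEXT, ability TEXT, degree INTEGER,
                        PRIMARY KEY (name, area, ability));
INSERT INTO professional VALUES ('ana'), ('bob'), ('eva');
INSERT INTO competency VALUES ('ana', 'UNIX', 'installation', 2),
                              ('bob', 'LINUX', 'installation', 1);
""")


def run(lines, op=">="):
    """Execute the assembled query and return the selected names."""
    sql, params = build_query(lines, op)
    return [row[0] for row in conn.execute(sql, params)]
```

Values are passed as bound parameters and only the operator and connective, both checked against fixed sets, are interpolated into the SQL text.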
The human resource management system PeopleSoft (HRMS module), which does not have the matrix concept (curiously, except for foreign languages, where various fixed abilities are considered), permits specifying selections with a "degree of importance" (from a fixed number of them) for each term of the selection condition. As there is an internal quantification associated with each degree of competency, a linear combination of degrees of competency multiplied by the degree of importance produces a numerical "global competency" for each selected professional. The system displays the professionals in the order of the global competency. The assignment of those weights is a delicate question. The system should permit their variation, making it possible to simulate various combinations; this is not the case with the PeopleSoft system.
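The linear combination just described, and the kind of weight simulation suggested above, can be sketched as follows. This does not reproduce PeopleSoft's internal quantification: the numeric degrees, the weights and the function names are hypothetical.

```python
def global_competency(degrees: dict, weights: dict) -> int:
    """Sum of (degree of importance x degree of competency) over the weighted terms."""
    return sum(weight * degrees.get(term, 0) for term, weight in weights.items())


def rank(professionals: dict, weights: dict) -> list:
    """Names ordered by decreasing global competency (ties broken by name)."""
    scored = [(global_competency(d, weights), name) for name, d in professionals.items()]
    return [name for score, name in sorted(scored, key=lambda pair: (-pair[0], pair[1]))]


staff = {
    "ana": {("UNIX", "installation"): 3, ("Delphi", "construction"): 1},
    "bob": {("UNIX", "installation"): 1, ("Delphi", "construction"): 3},
}
unix_first = {("UNIX", "installation"): 2, ("Delphi", "construction"): 1}
delphi_first = {("UNIX", "installation"): 1, ("Delphi", "construction"): 3}
```

Ranking the same two professionals with `unix_first` and then with `delphi_first` reverses their order, which is why being able to vary the weights matters.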
11. Applications
The application which generated the study presented here and its first implementation (the PROMON system) had as a goal the selection of professionals for organizing competency centers (see section 12) and the selection of project teams. At first glance, it may seem natural to formulate a single selection condition for the whole team of a project. Obviously, this is not the case: project managers think in terms of professional profiles necessary to form sub-teams.
If there are no available professionals with a required competency, the fact that someone was able to transform, in another project, his information or knowledge into competency is a strong indication that s/he will repeat the feat. For this, it would be necessary to store a history of competency changes for every professional.
Besides being useful for selecting professionals, competency matrices serve to count how many professionals have at least a certain degree of required competency in each area/ability. We called this resulting matrix the general competency count. With it, one has the profile of the enterprise in terms of competencies. It is then possible to detect cells (indicating abilities within knowledge areas) where there are too few professionals with the required competency. To capitalize on the existing professionals in the enterprise, for future filling up of the required cells, the characterizations expounded here may be utilized. Thus, a professional who has a good knowledge does not require further training; what is needed in this case is participating in project teams to acquire the desired competency level in some area/ability. The lack of information indicates the need for some training, taking courses, studying manuals, etc. If a professional has the information, it may be useful to allocate him/her to a course with practical sections, or even to take part in a team to acquire some knowledge and eventually the required competency.
The system may be expanded to represent core competency matrices. Each non-empty cell indicates the fact that the company finds it essential not to hire external services for the corresponding competencies, or not to acquire finished external products. Comparing with the general competency count matrix, it is possible to detect which areas/abilities are lacking in the enterprise or require upgrading.
Competency management systems as described here may be useful for a call center looking for professionals to answer customer questions, for selecting candidates within the company to compete for vacant positions, for finding professionals who may give press interviews on some of the company's projects or products, etc.
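A minimal Python sketch of the general competency count and of the development suggestions just listed is given below. The cell encoding (a kind string plus a numeric degree), the threshold for "too few" and the function names are assumptions of this sketch, not part of either system.

```python
from collections import Counter


def competency_count(matrices: dict, min_degree: int = 1) -> Counter:
    """Per (area, ability), count professionals with competency >= min_degree."""
    counts = Counter()
    for matrix in matrices.values():
        for cell, (kind, degree) in matrix.items():
            if kind == "competency" and degree >= min_degree:
                counts[cell] += 1
    return counts


def scarce_cells(counts: Counter, required_cells: list, minimum: int) -> list:
    """Cells of interest where fewer than `minimum` professionals qualify."""
    return [cell for cell in required_cells if counts[cell] < minimum]


def next_step(cell_value) -> str:
    """Suggested development action for one professional in one cell."""
    if cell_value is None:
        return "training: courses, manuals"
    kind, _degree = cell_value
    if kind == "information":
        return "course with practical work, or take part in a team"
    if kind == "knowledge":
        return "take part in a project team"
    return "already has competency"


matrices = {
    "ana": {("UNIX", "installation"): ("competency", 2)},
    "bob": {("UNIX", "installation"): ("knowledge", 1),
            ("Delphi", "construction"): ("competency", 1)},
    "eva": {("Delphi", "construction"): ("information", 1)},
}
counts = competency_count(matrices)
cells = [("UNIX", "installation"), ("Delphi", "construction"), ("Oracle", "design")]
```

Here `scarce_cells(counts, cells, 2)` would flag every listed cell, and `next_step` shows what each professional would need in order to fill them.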
Another application could be the allocation of teachers in a state or city educational system. For example, the State of São Paulo Education Secretary hires approximately 300,000 teachers. The system could allow for the creation of a matrix with cities, locations and neighborhoods, so that it could indicate not just teachers according to their pedagogical competencies, but also according to preferences of geographical region.
Finally, a similar system could be useful for collecting data on candidates for enterprise openings, because the matrices of a professional may be considered as curriculum systematization. A specific system for openings is the excellent Selector, developed by PCA Software Engineering, in São Paulo (see www.selector.com.br, in Portuguese). It is based upon "curriculum cards". There is a card with basic information, and each enterprise may add its own cards. Any person may enter his/her professional data through the Internet, filling in the basic card and adding further data in specific cards corresponding to enterprise openings, introduced by the latter. In some sense, PCA has created a language for the definition of curriculum cards. The system has an interesting scheme of weights associated to each required competency, thus permitting the ordering of the selected candidates. This method permits knowing which candidates fulfill all the minimum requirements, because they receive the maximum weight sum. Many large enterprises are using the system; through the Internet it is possible to gain a good idea of its principles, implementation and usefulness. Up to the time we wrote this paper, we felt the absence in that system of the concepts formulated here for information, knowledge and competency and the characterization of the latter as the confluence of an ability over a knowledge area, with the corresponding matrix representation.
12. Implementation
At PROMON, the prototype implementation was done with José Márcio Illoz, using a spreadsheet program. A "standard" matrix was implemented, with the knowledge area names in its rows and abilities in the columns. Areas were codified, with their codes being used for making the logical link with a spreadsheet containing the consolidation of competencies of all professionals. This consolidated matrix is used for selecting and counting professionals.
A very important work connected to competency assessment is the establishment of knowledge areas. In the case of PROMON, 160 different areas were collected for I.T., divided in 3 hierarchical levels, which we called "great areas," "areas" and "sub-areas." Unfortunately in the I.T. field it is necessary to enter into too much detail. For example, a professional that has competency in the installation of LINUX may not have it for MS NT.
The PRODESP system was programmed by Mateus Saldanha using Delphi and Oracle. The use of a network database management system permitted the generalization of the system for any number of matrices, and a variable number of abilities per matrix (in the PROMON case, there was just a single matrix with a fixed, pre-determined number of abilities, due to the fixed format imposed by the electronic spreadsheet software). The variable number of abilities per matrix permitted increasing the number of abilities in relation to the PROMON system for the I.T. matrix. In the PRODESP system, there are 3 groups of abilities: infrastructure (hardware and software), system development, and acting as instructor in the knowledge area, in a total of 8 second-level abilities. By the way, the ability of "acting as instructor" exists in every matrix except the matrix of academic studies. This way, area duplication was eliminated. For example, in the PROMON system there was a UNIX entry in the section for infrastructure, that is, installation of this operating system, and the same area for Development, that is, using UNIX for developing systems. One of the consequences was the reduction of hierarchical levels of knowledge areas to just 2, simplifying the system. Finally, in the PRODESP system an obsolescence factor (see section 10) and an access security system were also introduced.
Security in the PRODESP system is organized in 4 access levels. The first corresponds to the "generic user," and is open to any person who has access to the network. Such a user may only use the screen for specifying conditions for the selection of professionals. S/he obtains only the name and basic personal data of those professionals who satisfy the given selection condition. The "personal user" has his/her data stored in the system, with a password. Besides selections, such a person may access his/her own personal data and competency matrices, and may change them. The "supervisor" has all the access of a personal user, plus being able to read the competency matrices of his/her subordinates. Finally, the "system administrator" has unrestricted access to any data.
A delicate question concerns filling up the matrices. At PROMON, we did the interviews and assigned the competency degrees together with each professional. At PRODESP this would be unfeasible, because there are too many professionals (about a thousand, just for I.T.). The solution has been to give a lecture on the system concepts and to let every professional fill in his/her own matrix. Later on, the supervisors or project leaders have to revise the data and try to make them uniform.
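The four access levels can be summarized as a small permission check. The following Python sketch is only an illustration of the rules stated above. The `Role` enumeration and the two functions are hypothetical names, and passwords and the selection screen itself are left out.

```python
from enum import Enum


class Role(Enum):
    GENERIC = "generic user"        # selection screen only
    PERSONAL = "personal user"      # own data and matrices, read and write
    SUPERVISOR = "supervisor"       # personal access + read subordinates' matrices
    ADMIN = "system administrator"  # unrestricted


def can_read_matrix(role: Role, user: str, owner: str, subordinates=()) -> bool:
    """May `user`, acting under `role`, read the competency matrices of `owner`?"""
    if role is Role.ADMIN:
        return True
    if role is Role.SUPERVISOR:
        return owner == user or owner in subordinates
    if role is Role.PERSONAL:
        return owner == user
    return False  # generic users see only names and basic data of selected people


def can_change_matrix(role: Role, user: str, owner: str) -> bool:
    """May `user`, acting under `role`, change the competency matrices of `owner`?"""
    if role is Role.ADMIN:
        return True
    return role in (Role.PERSONAL, Role.SUPERVISOR) and owner == user
```

Note that, as stated above, a supervisor may read but not change the matrices of subordinates.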
13. Competency Centers
An enterprise dedicated to projects may be organized into "competency centers" (CCs). This means that professionals are grouped not into business departments or divisions, but into groups of associated knowledge areas. In this organization, business departments are reduced. Their goal is now to develop new projects for the enterprise or its clients. The project designs are the responsibility of business departments, which request from the Project Management Center one or more managers for a project, and from each CC the technicians necessary to develop it.
Obviously, clear characterizations of information, knowledge and competency held by professionals and surveying them are necessary for the characterization, organization and functioning of a CC.
The reason for organizing CCs is clear: enterprises want to optimize the allocation of human resources, diminishing personnel idle time and choosing for each project or function the right people with the necessary competency. Moreover, such an organization provides for a much greater flexibility and operational dynamics, making it surely more suitable to our hectic, fast changing times.
The advantages are clear. But, what about the disadvantages? We fear that CCs may disrupt social integration and a sense of identity to the company. Professionals may identify themselves with the project they are involved with, but projects are not so stable and durable as departments and divisions are. When a project was the initiative and accomplishment of a department, finishing it would mean remaining in the same department and embarking on another project with some of the same colleagues and in the same administrative environment. Coming from a CC, after the end of a project the professional will return to that center, meeting people that participated in other projects. It is being argued in the company for which we are studying the question of the ITCC that professionals will develop an identity to their CCs. They will be able to interact much more with their peers, who were scattered along many departments in the classical organizational models and had almost no contact with each other. This should also provide for the exchange of information and knowledge, helping the development of competencies through joint work. We hope this view is correct, and that our fears do not materialize. Anyhow, in this field our colleagues and we are in the realm of pure information. That company is in the process of developing knowledge and competency on how CCs work. We will have lots of things to say in the future.
14. Conclusions
In this work original characterizations for information, knowledge and competency were presented. In the dozens of interviews we made to assess competencies of professionals in the I.T. area, those concepts proved to be extremely useful. Interviewed people rapidly got used to them when classifying their degrees of information, knowledge and competency. Another contribution was the characterization of competency as referring to a certain ability over a certain knowledge area. This way, it was possible to introduce more clarity into those concepts, and represent competencies as bi-dimensional matrices, grouping together knowledge areas in different matrices according to the set of abilities which apply to different areas.
These matrices represent, in essence, a systematization of curricula in terms of competencies, knowledge and information possessed by professionals. Traditionally, curricula used for selecting candidates for available positions or in project team formation consist of non-systematic texts. Even when well organized in clear sections, these texts cannot be subjected to data processing, as opposed to our method. It differs from other competency management systems by using matrices and taking into account different degrees of information, knowledge and competency.
It is important to emphasize that the computer only indicates which professionals qualify for the required competencies. After this indication, one should proceed to examining curricula, doing personal interviews, etc. This way one complements the data with a phase of subjective analysis, always necessary when dealing with human questions (see section 5); otherwise people are handled as machines, leading in general to psychological problems, besides faulty selections.
The practical results of competency assessments using our method at the large companies PROMON and PRODESP were very good. Professionals were thankful for the opportunity of having their curricula represented in a systematized way, and the possibility of continuously updating them.
There are 3 great problems for assessing competencies with our method. Firstly, there is a need of leveling the criteria for assigning the various competency degrees, otherwise it is not possible to compare professionals. This problem was solved at PROMON by concentrating all interviews in just one interviewer. But this is not feasible when having hundreds of professionals, because each interview takes at least 1 hour. Secondly, this method does not take into account the quality of projects and the work accomplished by assessed professionals. For this, it would be necessary to introduce one further factor, which should be assessed by managers and project leaders. But then an aspect of assessment by third parties would be introduced, with all the social problems involved. We avoided these problems by not considering that quality. A third problem, which was not faced, was the introduction of a behavioral matrix, with competencies on leadership, on team work, quality of written and oral communications, etc. Many authors give more importance to behavioral competencies than technical ones, as for example Daniel Goleman. But the assessment of these competencies introduces a delicate factor which should probably be avoided in an initial phase of a competency management project: the need for an assessment done by the professional's manager. At PROMON we were required to assess just technical competencies, thus the problem was cut at its root. At PRODESP, we proposed to start without behavioral competencies, in order to avoid the problems originating from a professional being assessed by a third party, which could position them against our project. Maybe because in both cases we avoided such conflicts, the system was extremely well accepted by professionals in both companies.
References
Cusumano, M. and Selby, R.W. How Microsoft Builds Software. Communications of the ACM v20n6, June 1997, pp 53-61.
Davenport, T.H. and L.Prusak. Working Knowledge: How Organizations Manage What They Know. Boston: Harvard Business School Press, 1998.
Devlin, K. InfoSense: Turning Information into Knowledge. N.York: W.H.Freeman, 1999.
Goleman, D. Emotional Intelligence. N.York: Brockman, 1993.
Malhorta, Y. Tools@Work - Deciphering The Knowledge Management Hype. Journal of Quality & Participation, special issue on Learning and Information Management, v21n4, July/Aug 1998, pp. 58-60. Available through