DATABASE CONCEPTS
● Introduction
Like a library, secondary storage is designed to store information. How is this stored information organized? What are databases, and why do you need to know anything about them?
Only a few decades ago, a computer was considered to be an island with only limited access to information beyond its own hard disk. Now, through communication networks and the Internet, individual computers have direct electronic access to almost unlimited sources of information.
In today's world, almost all information is stored in databases. They are an important part of nearly every organization including schools, hospitals, and banks. To effectively compete in today's world, you need to know how to find information and understand how it is stored.
To efficiently and effectively use computers, you need to understand data fields, records, tables, and databases. You need to be aware of the different ways in which a database can be structured and the different types of databases. Also, you need to know the most important database uses and issues.
DATA
As we have discussed throughout this article, information systems consist of A people, procedures, software, hardware, data, and the Internet. This blog focuses on data, which can be defined as facts or observations about people, places, things, and events. More specifically, this blog focuses on how databases store, organize, and use data.
Not long ago, data was limited to numbers, letters, and symbols recorded by keyboards. Now, data is much richer and includes:
∆ Audio capture, interpreted, and save using microphone and voice recognition systems.
∆ Music downloaded from the Internet and saved on smartphones, tablets, and other devices.
∆ Photographs captured by digital cameras, edited by image editing software, and shared with others over the Internet.
∆ Video captured by digital video cameras, TV tuner cards, and webcams.
There are two ways or perspectives to view data. These perspectives are the physical view and the logical view. The physical view focuses on the actual format and location of the data.Data is recorded as digital bits that are typically grouped together into bytes that represent characters using a coding scheme such as Unicode. Typically, only very specialized computer professionals are concerned with the physical view. The other perspective, the logical view, focuses on the meaning, content, and context of the data. End users and most computer professionals are concerned with this view. They are involved with actually using the data with application programs. This chapter presents the logical view of data and how data is stored in databases.
Questions :
Describe some of the different types of data.
What is the physical view of data?
What is the logical view of data?
DATA ORGANIZATION
The first step in understanding databases is to learn how data is organized. In The the logical view, data is organized into groups or categories. Each group s more complex than the one before.
● Character: A character is the most best logical data element. It is a single letter, number, or special character, such as a punctuation mark, or a symbol, such as $.
● Field: The next higher level is a field or group of related characters. In our example, Masurkar is in the data field for the Last Name of an employee. It consists of the individual letters (characters) that make up the last name. A data field represents an attribute (description or characteristic) of some entity (person, place, thing, or object). For example, an employee is an entity with many attributes, including his or her last name.
● Record: A record is a collection of related fields. A record represents a collection of attributes that describe an entity. In our example, the payroll record for an employee consists of the data fields describing the attributes for one employee. These attributes are First Name, Last Name, Employee ID, and Salary
● Table: A table is a collection of related records. For example, the Payroll Table would include payroll information (records) for the employees (entities).
● Database: A database is an integrated collection of logically related tables. For example, the Personnel Database would include all related employee tables including the Payroll Table and the Benefits Table.
KEY FIELD
Each record in a table has at least one distinctive field, called the key field. Also known as the primary key, this field uniquely identifies the record. Tables can be related or connected to other tables by common key fields.
For most employee database, a key field is an employee identification number: Key fields in different tables can be used to integrate the data in a database. For example, in the Personnel Database, both the Payroll and the Benefits tables include the field Employee ID. Data from the two tables could be related by combining all records with the same key field (Employee ID).
BATCH VERSUS REAL-TIME PROCESSING
Traditionally, data is processed in one of two ways. These are batch processing, or what we might call "later," and real-time processing, or what we might call "now."These two methods have been used to handle common record-keeping activities such as payroll and sales orders.
● Batch processing: In batch processing, data is collected over several hours, days, or even weeks. It is then processed all at once as a "batch." If you have a credit card, your bill probably reflects batch processing. That is, during the month, you buy things and charge them to your credit card. Each time you charge something, an electronic copy of the transaction is sent to the credit card company. At some point in the month, the company's data processing department puts all those transactions and those of many other customers) together and processes them at one time. The company then sends you a single bill totaling the amount you owe.
● Real-time processing: Real-time processing, also known as online processing, occurs when data is processed at the same time the transaction occurs. For example, whenever you request funds at an ATM, real-time processing occurs. After you have provided account information and requested a specific withdrawal, the bank's computer verifies that you have sufficient funds in your account. If you do, then the funds are dispensed to you, and the bank immediately updates the balance of your account.
Questions :-
From the logical view, describe how data is organized or categorized.
What are key fields and how are they used?
Compare batch processing and real-time processing.
DATABASES
Many organizations have multiple files on the same subject or person. For example, a customer's name and address could appear in different files within the sales department, billing department, and credit department. This is called data redundancy. If the customer moves, then the address in each file must be updated. If one or more files are overlooked, problems are likely to arise. For example, a product ordered might be sent to the new address, but the bill might be sent to the old address. This situation results from a lack of data integrity.
Moreover, data spread around in different files is not as useful. The marketing department, for instance, might want to offer special promotions to customers who order large quantities of merchandise. To identify these customers, the marketing department would need to obtain permission and access files in the billing department. It would be much more efficient if all data were in a common database. A database can make the needed information available.
NEED FOR DATABASES
For an organization, there are many advantages to having databases:
● Sharing: In organizations, information from one department can be readily shared with others. Billing could let marketing know which customers ordered large quantities of merchandise.
● Security: Users are given passwords or access only to the kind of information they need. Thus, the payroll department may have access to employees' pay rates, but other departments would not.
● Less data redundancy: Without a common database, individual departments have to create and maintain their own data, and data redundancy results. For example, an employee's home address would likely appear in several files, Redundant data causes inefficient use of storage space and data maintenance problems.
● Data integrity: When there are multiple sources of data, each source may have variations. A customer's address may be listed as "Main Street" in one system and "Main St." in another. With discrepancies like these, it is probable that the customer would be treated as two separate people.
DATABASE MANAGEMENT
In order to create, modify, and gain access to a database, special software is required. This software is called a database management system, which is commonly abbreviated as DBMS.
Some DBMSs, such as Microsoft Access, are designed specifically for personal computers. Other DBMS is designed for specialized database servers. DBMS software is made up of five parts or subsystems: DBMS engine, data definition,data manipulation, application generation, and data administration.
● The DBMS engine provides a bridge between the logical view of the data and the physical view of the data. When users request data (logical perspective). the DBMS engine handles the details of actually locating the data (physical perspective)
● The data definition subsystem defines the logical structure of the database by using a data dictionary or schema. This dictionary contains a description of the structure of data in the database. For a particular item of data, it defines the names used for a particular field. It defines the type of data for each field (text, numeric, time, graphic, audio, and video). An example of an Access data dictionary form is presented.
● The data manipulation subsystem provides tools for maintaining and analyzing data. Maintaining data is known as data maintenance. It involves adding new data, deleting old data, and editing existing data. Analysis tools support viewing all or selected parts of the data, querying the database, and generating reports. Specific tools include query-by-example and a specialized programming language called structured query language (SQL).
● The application generation subsystem provides tools to create data entry forms and specialized programming languages that interface or work with common and widely used programming languages such as C++ or Visual Basic. For a data entry form created by the application generation subsystem in Access.
● The data administration subsystem helps to manage the overall database, including maintaining security, providing disaster recovery support and monitoring the overall performance of database operations. Larger organizations typically employ highly trained computer specialists, called database administrators (DBAs), to interact with the data administration subsystem. Additional duties of database administrators include determining processing rights or determining which people have access to what kinds of data in the database.
Questions :-
What is data redundancy?
What is data integrity?
What are the advantages to having databases?
List the five basic subsystems and describe each.
What is a data dictionary? Data maintenance? What are processing rights?
TYPES OF DATABASES
Databases may be small or large, limited in accessibility or widely accessible. Databases may be classified into four types: individual, company, distributed, and commercial
INDIVIDUAL
The individual database is also called a personal computer database. It is a collection of integrated files primarily used by just one person. Typically, the data and the DBMS are under the direct control of the user. They are stored either on the user's hard-disk drive or on a LAN file server.
There may be many times in your life when you will find this kind of database valuable. If you are in sales, for instance, a personal computer database can be used to keep track of your customers. If you are a sales manager, you can keep track of your salespeople and their performance. If you are an advertising account executive, you can keep track of your different projects and how many hours to charge each client.
COMPANY
Companies, of course, create databases for their own use. The company database may be stored on a central database server and managed by a database administrator. Users throughout the company have access to the database through their personal computers linked to local or wide area networks.
Company databases are the foundation for management information systems. For instance, a department store can record all sales transactions in the database. A sales manager can use this information to see which salespeople are selling the most products. The manager can then determine year-end sales bonuses. Or the store's buyer can learn which products are selling well or not selling and make adjustments when reordering. A top executive might combine overall store sales trends with information from outside databases about consumer and population trends. This information could be used to change the whole merchandising strategy of the store.
DISTRIBUTED
Many times the data in a company is stored not in just one location but in several locations.
It is made accessible through a variety of communications networks. The database, then, is a distributed database. That is not all the data in a database is physically located in one place. Typically, database servers on a client/server network provide the link between the data,
For instance, some database information can be at regional offices. Some can be at company headquarters, some down the hall from you, and some even overseas. Sales figures for a chain of department stores, then, could be located at the various stores. But executives at district offices or at the chain's headquarters could have access to all these figures.
COMMERCIAL
A commercial database is generally an enormous database that an organization develops to cover particular subject. It offers access to this database to the public or selected outside individuals for a fee, Sometimes commercial databases also are called information utilities or data banks. An example is NSE Infobase, which offers a variety of information-gathering and reporting services.
Some of the most widely used commercial databases are:
● Dialog Information Services - offer business information, as well as technical and scientific information.
● Dow Jones Interactive Publishing - provides world news and information on business, investments, and stocks.
● NSE Infobase-offers news and information on legal news, public records,and business issues.
Most of the commercial databases are designed for organizational as well as individual use. Organizations typically pay a membership fee plus hourly use fees. Often, individuals are able to search the database to obtain a summary of available information without charge. They pay only for those items selected for further investigation.
Questions :-
List four types of databases and describe each.
Give a brief example of each type of database.
What is the difference between a company database and a distributed database?
DATABASE USES AND ISSUES
Databases offer great opportunities for productivity. In fact, in corporate libraries, electronic databases are now considered more valuable than books and journals. However, maintaining databases means users must make constant efforts to keep them from being tampered with or misused.
STRATEGIC USES
Databases help users to keep up to date and to plan for the future. To support the needs of managers and other business professionals, many organizations collect data from a variety of internal and external databases. This data is then stored in a special type of database called a data warehouse. A technique called data mining is often used to search these databases to look for related information and patterns.
There are hundreds of databases available to help users with both general and specific business purposes, including
● Business directories providing addresses, financial and marketing information,products, and trade and brand names.
● Demographic data, such as county and city statistics, current estimates on population and income, employment statistics, census data, and so on.
● Business statistical information, such as financial information on publicly traded companies, market potential of certain retail stores, and other business data and information,
● Text databases providing articles from business publications, press releases, reviews on companies and products, and so on.
● Web databases covering a wide range of topics, including all of the above. As mentioned earlier, web search sites like Google maintain extensive databases of available Internet content.
SECURITY
Precisely because databases are so valuable, their security has become a critical issue. One concern is that personal and private information about people stored in database may be used for the wrong purposes. For instance, a person's credit history or medical records might be used to make hiring or promotion decisions. Another concern is unauthorized users gaining access to a database. For example, there have been numerous instances in which a computer virus has been launched into a database or network.
Security may require putting guards in company computer rooms and checking the identification of everyone admitted. Some security systems electronically check fingerprints. Security is particularly important to organizations using WANs. Violations can occur without actually entering secured areas. As mentioned in previous chapters, most major corporations today use special hardware and software called firewalls to control access to their internal networks.
Questions :-
What is a data warehouse? What is data mining?
What are some database security concerns?
What is a firewall