The Data Cleansing & Deduplication Blog

Glossary of Common Data Management Terms

Maintaining large volumes of data and business information can be a real challenge. Yet every business must handle it efficiently. Otherwise, the quality of its databases can be compromised and the business decisions based on them can go wrong. While there is a need to invest in reliable data management systems, there is also much to gain from investing in people, equipping them with the right knowledge and tools so they can perform their functions to the fullest.

Creating high-quality, reliable data requires a thorough understanding not only of the system but also of the terms being used. Here is a list of common database terms:

Append

Similar to import; copying all or a subset of records from one table, and adding them to another.
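As a rough illustration, an append can be sketched in Python with tables modeled as lists of dictionaries. The function name `append_records`, the `predicate` parameter and the sample data are assumptions made for this example, not part of any particular product.

```python
def append_records(source, target, predicate=lambda record: True):
    """Copy the records in source that satisfy predicate onto the end of target."""
    # Copy each record so later edits to target do not alter source.
    selected = [dict(record) for record in source if predicate(record)]
    target.extend(selected)
    return len(selected)


prospects = [{"name": "Ann", "state": "NY"}, {"name": "Raj", "state": "CA"}]
customers = [{"name": "Lee", "state": "NY"}]

# Append only the New York subset of prospects to the customers table.
added = append_records(prospects, customers, lambda r: r["state"] == "NY")
```

The source table is left unchanged; only the target grows.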

Attribute

Describes the value found in each field in a table. Every field or column in a database table represents a single attribute of that table. (An attribute is what the data in that field represents, while the value is the actual data that a specific field contains.)

 

Browse

A way of displaying multiple records from a table, or set of related tables, in a tabular format wherein rows display records and columns display fields, like a spreadsheet; aka Browse Table.

Case; Casing

To designate which characters in an alpha string will be uppercase and which will be lowercase. Common casing methods include: uppercase all characters; lowercase all characters; uppercase the first character of the string; uppercase the first character of each “word” (space-separated substring) in the string (aka “Proper” case); lowercase the entire string, then uppercase the first character; or lowercase the entire string, then uppercase the first character of each “word.”
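The six casing methods listed above can be sketched with Python's built-in string tools. The function name `casing_variants` and the dictionary keys are labels invented for this example.

```python
import string


def casing_variants(text):
    """Return the six casing methods from the definition, keyed by a descriptive label."""
    return {
        "upper": text.upper(),
        "lower": text.lower(),
        # Uppercase only the first character, leave the rest untouched.
        "first_upper": text[:1].upper() + text[1:],
        # Uppercase the first character of each space-separated word.
        "proper": " ".join(w[:1].upper() + w[1:] for w in text.split(" ")),
        # Lowercase everything, then uppercase the first character.
        "lower_then_first": text.capitalize(),
        # Lowercase everything, then uppercase the first character of each word.
        "lower_then_proper": string.capwords(text),
    }


variants = casing_variants("joHN mcDONALD")
```

Note that simple rules like these turn "mcDONALD" into "Mcdonald"; names such as "McDonald" usually need extra handling.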

Case-Sensitive

To be aware of the case of character values. In this context, “SPUD,” “Spud” and “spud” would all be considered as different strings, so the case-sensitivity of a function or query will influence the values they will return.
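A small Python sketch shows how case-sensitivity changes the result of a comparison. The helper name `equal_ignoring_case` is an assumption for this example.

```python
def equal_ignoring_case(a, b):
    """Case-insensitive comparison: fold both strings before comparing."""
    return a.casefold() == b.casefold()


values = ["SPUD", "Spud", "spud"]

# A case-sensitive comparison sees three different strings.
distinct_case_sensitive = len(set(values))

# A case-insensitive comparison sees a single value.
distinct_case_insensitive = len({v.casefold() for v in values})
```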

Column

Synonymous with field.

 

Concatenation

Linking a consecutive series of field values, strings or a combination of both together in order to build a data item or field value (e.g. concatenating City + comma + space + State + space + ZIP Code to form the last line of an address).
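The address example can be sketched in Python as follows. The function name `last_line` is hypothetical, and the ZIP Code is kept as a string on the assumption that leading zeros must survive.

```python
def last_line(city, state, zip_code):
    """Concatenate City + comma + space + State + space + ZIP Code."""
    return city + ", " + state + " " + zip_code


address_line = last_line("Boston", "MA", "02108")
```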

Consolidation

Also known as merging (mailing list terminology), consolidation is merging various data sets into a single master set, standardizing table structure, data types, fields and their values.

 

CRM: (Customer Relationship Management)

Customer relationship management (CRM) is a widely-implemented strategy for managing a company’s interactions with customers, clients and sales prospects. It involves using technology to organize, automate, and synchronize business processes—principally sales activities, but also those for marketing, customer service, and technical support. The overall goals are to find, attract, and win new clients, nurture and retain those the company already has, entice former clients back into the fold, and reduce the costs of marketing and client service.

Data Migration

The process of transferring data between storage types, formats, or computer systems. Data migration is usually performed programmatically to achieve an automated migration, freeing up human resources from tedious tasks. It is required when organizations or individuals change computer systems or upgrade to new systems, or when systems merge (such as when the organizations that use them undergo a merger or takeover). To achieve an effective data migration procedure, data on the old system is mapped to the new system providing a design for data extraction and data loading. The design relates old data formats to the new system’s formats and requirements. Programmatic data migration may involve many phases but it minimally includes data extraction where data is read from the old system and data loading where data is written to the new system.
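The extract, map and load phases can be sketched in a few lines of Python. The field names in `FIELD_MAP` and the function name `migrate` are invented for illustration; a real migration would also handle type conversion, validation and error logging.

```python
# Hypothetical design mapping old field names to the new system's field names.
FIELD_MAP = {"cust_name": "full_name", "zip": "postal_code"}


def migrate(old_table, new_table, field_map):
    """Read each old record, map its fields to the new format, and load it."""
    for old_row in old_table:  # extraction: read from the old system
        new_row = {new: old_row.get(old) for old, new in field_map.items()}
        new_table.append(new_row)  # loading: write to the new system
    return new_table


old_system = [{"cust_name": "Ann Lee", "zip": "10001", "fax": "n/a"}]
new_system = migrate(old_system, [], FIELD_MAP)
```

Fields that are absent from the mapping (here, `fax`) are simply not carried over.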

Data Type

Every field in every table in a database must be declared as a specific type of data with defined parameters and limitations (e.g. numeric, character or text, date, logical, etc.), known as a data type.

Database

1) A collection of all the data needed by a person or organization to perform their required functions.
2) A collection of related files or tables.
3) Any collection of data organized to answer queries; or,
4) [Informally,] a database management system.
Databases usually consist of both data and metadata [data about the database’s data]. When a database contains a description of its own structure, it is said to be self-describing. A database is integrated when it includes its relationships among data items as well as the data items themselves.

Database Administrator [DBA]

The person who is ultimately responsible for the functionality, integrity, and safety of the database.

Database Management System [DBMS]

Also called a database manager. An integrated collection of programs designed to enable people to design databases, enter and maintain data, and perform queries.

Database Manager

1) The person with primary responsibility for the design, construction, and maintenance of a database. 2) [Informally,] a database management system.

Deduplication

The removal of redundant data by deleting duplicate records, so that only one copy of each record is kept.
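A minimal sketch of exact-match deduplication in Python, assuming records are dictionaries with hashable values. The function name `deduplicate` is chosen for this example, and keeping the first copy is one possible policy among several.

```python
def deduplicate(records):
    """Keep the first copy of each record and drop exact duplicates."""
    seen = set()
    unique = []
    for record in records:
        # Sort the items so field order does not affect the comparison.
        fingerprint = tuple(sorted(record.items()))
        if fingerprint not in seen:
            seen.add(fingerprint)
            unique.append(record)
    return unique


contacts = [
    {"name": "Ann", "zip": "10001"},
    {"zip": "10001", "name": "Ann"},
    {"name": "Raj", "zip": "94105"},
]
cleaned = deduplicate(contacts)
```

This only catches records that are identical field for field; near-duplicates need a fuzzier comparison.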

Domain

A collection or range of all the possible values a field can contain. Although a field’s domain is typically finite, it may be infinite as well.

Field

Synonymous with column. A component of a relation or table that holds a single attribute of that relation or table.

 

File

1) The separately named unit of storage for all data, programs and indexes on most computer systems. For example, a table or a whole database may be stored in one file;
2) Term used as a synonym for relation or table in some database managers [usually smaller or older], like dBase, FoxPro, Alpha Four/Five, etc.

Filter; Filtering

The act of choosing particular records while filtering out others; also referred to as selecting.

Import

An operation by which an application brings data in from a generic source file (from which embedded programming and control characters have been filtered out) and converts it into its own native file format for use.

Index

1) A method used to display or output records in a specific order.
2) A data structure of pointers used to provide rapid, random access to rows in the table.
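Both senses of the term can be sketched in Python, with a table modeled as a list of rows and "pointers" modeled as row positions. The variable names and sample data are assumptions for this example.

```python
rows = [
    {"name": "Raj", "zip": "94105"},
    {"name": "Ann", "zip": "10001"},
    {"name": "Lee", "zip": "10001"},
]

# Sense 1: an ordering of row positions by name; the table itself is not rearranged.
name_order = sorted(range(len(rows)), key=lambda i: rows[i]["name"])
names_in_order = [rows[i]["name"] for i in name_order]

# Sense 2: a lookup structure from a field value to the positions of matching rows.
zip_index = {}
for position, row in enumerate(rows):
    zip_index.setdefault(row["zip"], []).append(position)

# Direct access to matching rows without scanning the whole table.
matches = [rows[i]["name"] for i in zip_index.get("10001", [])]
```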

Integrity

The property of the database that ensures that the data contained in the database is as accurate and consistent as possible.

Key

A key is a field, or combination of fields, that uniquely identifies a record in a table.

Matchcode

Primarily used for identifying duplicate records either contained in the same table, or shared between several tables or lists. A specialized composite key, a matchcode is typically a collection of name and address components that best represent the unique identity of each whole record and the entity it represents. A matchcode might consist of combined First Name, Last Name, Company, Address and ZIP Code field elements, concatenated into matchcode “keys,” which can be compared during the deduping or purging process.
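One possible way to build a matchcode is sketched below in Python. The recipe (five characters of last name, the first initial, eight characters of address, five of ZIP Code) and the field names are assumptions made for this example; real matchcode recipes vary by product and by data.

```python
import re


def matchcode(record):
    """Build a hypothetical matchcode key from name and address components."""

    def norm(value, length):
        # Uppercase, strip everything except letters and digits, then truncate.
        return re.sub(r"[^A-Z0-9]", "", value.upper())[:length]

    return "|".join([
        norm(record.get("last_name", ""), 5),
        norm(record.get("first_name", ""), 1),
        norm(record.get("address", ""), 8),
        norm(record.get("zip", ""), 5),
    ])


a = {"first_name": "John", "last_name": "Smith", "address": "12 Main St.", "zip": "02108"}
b = {"first_name": "JON", "last_name": "smith", "address": "12 main st", "zip": "02108"}

# Records with equal matchcodes are candidates for the same entity.
likely_duplicates = matchcode(a) == matchcode(b)
```

Because the key ignores case, punctuation and most of the first name, the two records above match even though they are not identical.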

 

Null; Null Value

If a field contains a data item, that field has a specific value. A field that does not contain a data item is said to have a null value. In a numeric field, a null value is not the same as a value of zero; in a character field, a null value is not the same as a blank. Both the numeric zero and the blank character are definite values. A null value indicates that the field’s value is undefined: its value is not known.
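The difference between null, zero and blank can be seen with Python's built-in sqlite3 module. The table and column names here are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (qty INTEGER, note TEXT)")
# First row holds definite values (zero and blank); second row holds nulls.
conn.executemany("INSERT INTO items VALUES (?, ?)", [(0, ""), (None, None)])


def count(where_clause):
    """Count the rows of items that satisfy the given WHERE clause."""
    return conn.execute("SELECT COUNT(*) FROM items WHERE " + where_clause).fetchone()[0]


zero_rows = count("qty = 0")         # matches only the row containing zero
blank_rows = count("note = ''")      # matches only the row containing a blank
null_rows = count("qty IS NULL")     # matches only the row containing null
```

Each query matches exactly one row, which shows that the null row is neither "zero" nor "blank."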

Parse; Parsing

Intelligently separating a field value or string into its component parts (e.g. parsing a Full Name into its five characteristic components: Prefix, First Name, Middle Name [or Initial], Last Name and Suffix). The opposite action is called concatenation.
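A deliberately simple name parser in Python illustrates the idea. The prefix and suffix lists and the function name `parse_full_name` are assumptions for this sketch; production parsers use much larger lookup tables and handle compound surnames, which this one does not.

```python
# Small illustrative lookup lists, not exhaustive.
PREFIXES = {"mr", "mrs", "ms", "dr", "prof"}
SUFFIXES = {"jr", "sr", "ii", "iii", "iv", "phd", "md"}


def parse_full_name(full_name):
    """Split a full name into prefix, first, middle, last and suffix."""
    tokens = full_name.replace(",", " ").split()
    parts = {"prefix": "", "first": "", "middle": "", "last": "", "suffix": ""}
    if tokens and tokens[0].rstrip(".").lower() in PREFIXES:
        parts["prefix"] = tokens.pop(0)
    if tokens and tokens[-1].rstrip(".").lower() in SUFFIXES:
        parts["suffix"] = tokens.pop()
    if tokens:
        parts["first"] = tokens.pop(0)
    if tokens:
        parts["last"] = tokens.pop()
    # Whatever remains between first and last is treated as the middle name.
    parts["middle"] = " ".join(tokens)
    return parts


parsed = parse_full_name("Dr. John Q. Public, Jr.")
```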

 

Post; Posting

An operation that adds, subtracts or replaces values in one table using values in another.
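Posting can be sketched in Python with tables as lists of dictionaries. The function name `post`, its parameters and the sample data are assumptions for this example.

```python
def post(master, updates, key, field, operation="add"):
    """Add, subtract or replace master[field] using matching rows from updates."""
    if operation not in ("add", "subtract", "replace"):
        raise ValueError("unknown operation: " + operation)
    by_key = {row[key]: row for row in master}
    for update in updates:
        row = by_key.get(update[key])
        if row is None:
            continue  # no matching master record; nothing to post
        if operation == "add":
            row[field] += update[field]
        elif operation == "subtract":
            row[field] -= update[field]
        else:
            row[field] = update[field]
    return master


accounts = [{"id": 1, "balance": 100}, {"id": 2, "balance": 40}]
payments = [{"id": 1, "balance": 30}, {"id": 3, "balance": 5}]

# Subtract each payment from the matching account balance.
post(accounts, payments, key="id", field="balance", operation="subtract")
```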

Purge; Purging

Also called deduping by the mailing industry, purging removes duplicate records from within a single file or table (or mailing list), or those shared among several files, tables or lists.

 

Query

1) Literally, a question you ask about data in the database in the form of a command written in a query language, defining sort order and selection, that is used to generate an ad hoc list of records.
2) The output subset of data produced in response to a query.
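Both senses can be seen in a short example using Python's built-in sqlite3 module. The table, columns and data are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (name TEXT, state TEXT)")
conn.executemany(
    "INSERT INTO contacts VALUES (?, ?)",
    [("Raj", "CA"), ("Ann", "NY"), ("Lee", "NY")],
)

# Sense 1: the command, defining selection (WHERE) and sort order (ORDER BY).
query = "SELECT name FROM contacts WHERE state = ? ORDER BY name"

# Sense 2: the subset of data produced in response.
result = [row[0] for row in conn.execute(query, ("NY",))]
```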

Record

Synonymous with row and tuple. An instance of data in a table, a record is a collection of all the facts related to one physical or conceptual entity, often a single object or person. It is usually represented as a row of data in a table, and is sometimes referred to as a tuple in some, particularly older, database management systems.

 

Scrubbing

The cleaning of data to remove any inconsistencies or inaccuracies.

Select; Selection

A query in which only some of the records in the source table appear in the output.

Server

The part of a client/server system that holds the database (the back end). The server also holds the server portion of the database management system.

SQL

Usually pronounced “sequel” or spelled out as “S-Q-L,” it stands for Structured Query Language, the standard language for commands that most relational database software understands. There are different dialects, since every program handles certain types of data differently, but the core commands are largely the same. ODBC uses SQL as the “lingua franca” to transfer information between databases. SQL-92 is the ANSI standard that many products treat as a baseline, although the standard has been revised several times since then.

String

A sequence of alphanumeric characters.

Structure

The basic architecture of a table including: number of fields, their names, sequence, data types and sizes.

Table

Synonymous with relation. A collection of data organized into records and fields (aka rows and columns), with fields being descriptions of the kinds of information contained in each record (attributes); and records being specific instances usually referring to specific objects or persons (entities).

 

Transaction

1) The fundamental unit of change in many (transaction-oriented) databases. A single transaction may involve changes in several tables, all of which must be made together, as one all-or-nothing unit, in order for the database to be internally consistent and correct.

2) A real-life event which is modeled by the changes to the database;

3) The sequence of SQL statements whose effect is not accessible to other transactions until all of its statements have been executed.

 

Transactional Database

A transactional database is a DBMS in which write operations can be rolled back if they are not completed properly. If a transactional database system loses electrical power halfway through a transaction, the partially completed transaction will be rolled back and the database will be restored to the state it was in before the transaction started.
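Rollback can be demonstrated with Python's built-in sqlite3 module, where using a connection as a context manager commits on success and rolls back if an exception is raised. The `accounts` table and the `transfer` function are assumptions for this sketch, and a raised exception stands in for a failure partway through the transaction.

```python
import sqlite3


def transfer(conn, source, target, amount):
    """Move amount between two accounts as a single all-or-nothing transaction."""
    with conn:  # commits if the block succeeds, rolls back if it raises
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, source)
        )
        (balance,) = conn.execute(
            "SELECT balance FROM accounts WHERE name = ?", (source,)
        ).fetchone()
        if balance < 0:
            # Failing here leaves the first UPDATE half-done until it is rolled back.
            raise ValueError("insufficient funds")
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, target)
        )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

try:
    transfer(conn, "alice", "bob", 500)
except ValueError:
    pass  # the partial change has been rolled back

balances = dict(conn.execute("SELECT name, balance FROM accounts"))
```

After the failed transfer, both balances are back to what they were before the transaction started.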

Validation

Verification that a field’s value doesn’t violate any constraints defined for it by the database.
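As an illustration, constraints can be declared in the table definition and enforced by the database itself. This sketch uses Python's built-in sqlite3 module; the table, the five-character ZIP rule and the function name `try_insert` are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Constraints: name must be present, zip must be exactly five characters long.
conn.execute(
    "CREATE TABLE contacts (name TEXT NOT NULL, zip TEXT CHECK (length(zip) = 5))"
)


def try_insert(conn, name, zip_code):
    """Insert a row; return False if the database rejects it as invalid."""
    try:
        with conn:
            conn.execute("INSERT INTO contacts VALUES (?, ?)", (name, zip_code))
        return True
    except sqlite3.IntegrityError:
        return False


accepted = try_insert(conn, "Ann", "10001")
rejected_short_zip = try_insert(conn, "Raj", "123")
rejected_missing_name = try_insert(conn, None, "10001")
```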

Value

The computer representation of a fact about an entity.

 

WinPure is the data management partner you require. Our products are fast, incredibly simple to use and critically effective. Our goal is to provide you with exceptional performance with all of your data management and data integration needs. Click here to download a free trial and start energizing your marketing strategies today!

 

 
