Accessing Organizational Information – Data Warehouse
What is a Data Warehouse?
History of Data Warehousing
• In the 1990s, executives became less concerned with day-to-day business operations and more concerned with overall business functions.
• The data warehouse provided the ability to support decision making without disrupting day-to-day operations. Operational information alone falls short because:
  - It is mainly current and does not include the history needed for better decision making
  - Its quality is often an issue
• Without information history, it is difficult to tell how and why things change over time.
Data Warehouse Fundamentals
• Data warehouse – a logical collection of information, gathered from many different operational databases, that supports business analysis activities and decision-making tasks
• The primary purpose of a data warehouse is to combine information from throughout an organization into a single repository for decision-making purposes. A data warehouse supports only analytical processing.
Data Warehouse Model
• Extraction, transformation, and loading (ETL) – a process that extracts information from internal and external databases, transforms the information using a common set of enterprise definitions, and loads the information into a data warehouse
• The data warehouse then sends subsets of the information to data marts.
• Data mart – contains a subset of data warehouse information
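
The ETL flow and the data mart subset can be sketched in a few lines of Python. This is only a minimal illustration, not a real warehouse: the two source lists, the field names, and the region mapping are all hypothetical.

```python
# Hypothetical operational sources with inconsistent field names and codes.
sales_db = [
    {"cust": "C1", "region": "N", "amount": "120.50"},
    {"cust": "C2", "region": "S", "amount": "80.00"},
]
web_orders = [
    {"customer_id": "C1", "area": "north", "total": 42.0},
]

# Common enterprise definitions (assumed for this sketch).
REGION_NAMES = {"N": "North", "S": "South", "north": "North", "south": "South"}


def extract():
    """Extract: pull raw rows from each source, tagged with their origin."""
    for row in sales_db:
        yield "sales_db", row
    for row in web_orders:
        yield "web_orders", row


def transform(source, row):
    """Transform: map each source's fields onto one shared record layout."""
    if source == "sales_db":
        return {"customer": row["cust"],
                "region": REGION_NAMES[row["region"]],
                "amount": float(row["amount"])}
    return {"customer": row["customer_id"],
            "region": REGION_NAMES[row["area"]],
            "amount": float(row["total"])}


def load(warehouse, record):
    """Load: append the standardized record to the warehouse."""
    warehouse.append(record)


warehouse = []
for source, row in extract():
    load(warehouse, transform(source, row))

# A data mart holds a subset of the warehouse, here only the North region.
north_mart = [r for r in warehouse if r["region"] == "North"]
print(len(warehouse), len(north_mart))
```

The in-memory list stands in for the warehouse only to keep the sketch self-contained; in practice the load step would write to a database.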
Multidimensional Analysis and Data Mining
• A relational database contains information in a series of two-dimensional tables.
• In a data warehouse or data mart, information is multidimensional: it contains layers of columns and rows.
  - Dimension – a particular attribute of information
• Once a cube of information is created, users can begin to slice and dice the cube to drill down into the information.
• Users can analyze information in a number of different ways and with a number of different dimensions.
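
As a rough sketch of slicing and dicing, a small cube can be modeled in Python as a dictionary keyed by its dimensions. The dimensions (product, region, month), the sales figures, and the helper names `slice_cube` and `roll_up` are all assumptions made for illustration.

```python
from collections import defaultdict

# Each cell is keyed by three dimensions: (product, region, month).
# All values are made up for illustration.
cube = {
    ("Widget", "North", "Jan"): 100,
    ("Widget", "South", "Jan"): 60,
    ("Widget", "North", "Feb"): 120,
    ("Gadget", "North", "Jan"): 40,
    ("Gadget", "South", "Feb"): 75,
}
DIMENSIONS = ("product", "region", "month")


def slice_cube(cube, **fixed):
    """Keep only the cells whose dimensions match the fixed values."""
    return {
        key: value
        for key, value in cube.items()
        if all(key[DIMENSIONS.index(dim)] == want for dim, want in fixed.items())
    }


def roll_up(cube, dimension):
    """Total the measure along one dimension, collapsing the others."""
    totals = defaultdict(int)
    index = DIMENSIONS.index(dimension)
    for key, value in cube.items():
        totals[key[index]] += value
    return dict(totals)


north = slice_cube(cube, region="North")                   # fix one dimension
north_jan = slice_cube(cube, region="North", month="Jan")  # fix two dimensions
print(roll_up(cube, "product"))   # totals per product
print(roll_up(north, "month"))    # totals per month within the North slice
```

Moving from the per-product totals to the individual cells in `north_jan` is the kind of step a user takes when drilling down.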
Multidimensional Analysis and Data Mining
• Data mining – the process of analyzing data to extract information not offered by the raw data alone. Also known as "knowledge discovery": computer-assisted tools and techniques for sifting through and analyzing vast data stores in order to find trends, patterns, and correlations that can guide decision making and increase understanding.
• To perform data mining, users need data-mining tools.
• Data-mining tool – uses a variety of techniques to find patterns and relationships in large volumes of information. E.g., retailers can use knowledge of these patterns to improve the placement of items in the layout of a mail-order catalog page or Web page.
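
One simple pattern-finding technique, shown here only as an illustration and not as what any particular tool does, is counting which items are bought together. The orders and the `frequent_pairs` helper below are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical orders; each set lists the items bought together.
orders = [
    {"tent", "sleeping bag", "lantern"},
    {"tent", "sleeping bag"},
    {"lantern", "batteries"},
    {"tent", "sleeping bag", "batteries"},
]


def frequent_pairs(orders, min_count=2):
    """Count how often each pair of items appears in the same order."""
    counts = Counter()
    for order in orders:
        # Sort so that each pair is always counted under the same key.
        for pair in combinations(sorted(order), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_count}


print(frequent_pairs(orders))  # {('sleeping bag', 'tent'): 3}
```

A retailer seeing this result might place tents and sleeping bags near each other on a catalog or Web page.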
Information Cleansing or Scrubbing
• An organization must maintain high-quality data in the data warehouse.
• Information cleansing or scrubbing – a process that weeds out and fixes or discards inconsistent, incorrect, or incomplete information
• Cleansing occurs first during the ETL process and a second time on the information once it is in the data warehouse.
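
A cleansing pass can be sketched as a function that fixes what it can and discards the rest. The customer rows, the state-code mapping, and the validity rules (age between 0 and 120, unique ids) are assumptions chosen for this example.

```python
# Hypothetical customer rows with typical quality problems.
raw_rows = [
    {"id": 1, "name": " Alice ", "state": "ca", "age": 34},
    {"id": 2, "name": "Bob", "state": "California", "age": -5},  # incorrect age
    {"id": 3, "name": "", "state": "NY", "age": 41},             # incomplete
    {"id": 1, "name": "Alice", "state": "CA", "age": 34},        # duplicate id
]

# Assumed standard codes used to make state values consistent.
STATE_CODES = {"ca": "CA", "california": "CA", "ny": "NY"}


def cleanse(rows):
    """Fix what can be fixed; discard rows that stay invalid."""
    clean, seen_ids = [], set()
    for row in rows:
        name = row["name"].strip()
        state = STATE_CODES.get(row["state"].strip().lower())
        if not name or state is None:    # incomplete or unrecognized value
            continue
        if not 0 <= row["age"] <= 120:   # incorrect value
            continue
        if row["id"] in seen_ids:        # duplicate of a row already kept
            continue
        seen_ids.add(row["id"])
        clean.append({"id": row["id"], "name": name,
                      "state": state, "age": row["age"]})
    return clean


print(cleanse(raw_rows))
```

The same function could be run once during ETL and again on rows already in the warehouse.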
Business Intelligence
• Business intelligence – applications and technologies that are used to gather, provide access to, and analyze data and information to support decision-making efforts
• These systems illustrate business intelligence in areas such as customer profiling, customer support, market research, market segmentation, product profitability, statistical analysis, and inventory and distribution analysis.
• E.g., Excel, Access