THE HINDU BUSINESS LINE
Financial Daily
from THE HINDU group of publications

Wednesday, June 21, 2000



Developing a data warehouse

George Albert

IN the previous article we looked at an ideal Data Warehouse Architecture (DWA). DWA is essentially a framework for understanding data warehousing and how the components of data warehousing fit together. On the Internet, a correct data warehousing strategy is crucial for speed and accessibility of information.

Most organisations will not be able to put together the complex DWA at one go. DWA essentially provides a road map to what may be the ultimate data warehouse. Coupled with an understanding of the options at hand, the Data Warehouse Architecture provides a useful way of determining if the organisation is moving toward a reasonable data-warehousing framework.

Organisations also need to understand that building a data warehouse is not just a matter of putting all the content into one large computer. It is more conceptual than that. One has to see what kinds of data the various levels of the organisation need. Often DW designers find that officials at an organisation are not able to say what their data requirements are.

When developing a DW, three aspects must be kept in mind: the scope of the data warehouse, data redundancy, and the type of end-user. The scope of a data warehouse may be broad enough to include all the informational data for the entire enterprise from the beginning of time, or it may be as narrow as a personal data warehouse for a single manager for a single year.

The broader the scope, the greater the value of the data warehouse to the enterprise, but also the more expensive and time-consuming it is to create and maintain. Hence it is good to begin with functional, departmental or divisional data warehouses and then expand them as users provide feedback.

Data redundancy is essentially duplication of data. The greater the number of users and the more distributed the databases, the higher the redundancy, because the same data has to be duplicated in more warehouses.
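One rough way to put a number on this duplication is to divide the total number of stored copies by the number of distinct records. The sketch below is only an illustration: the warehouse names, record identifiers and the `redundancy_ratio` function are hypothetical and are not part of any data warehousing product.

```python
from collections import Counter


def redundancy_ratio(warehouses):
    """Return stored copies divided by distinct records (1.0 means no duplication)."""
    copies = Counter()
    for records in warehouses.values():
        # Count each record once per warehouse that holds it.
        copies.update(set(records))
    if not copies:
        return 1.0
    return sum(copies.values()) / len(copies)


# Hypothetical layouts: one central store versus three regional stores.
central = {"hq": ["cust-1", "cust-2", "cust-3"]}
distributed = {
    "north": ["cust-1", "cust-2"],
    "south": ["cust-2", "cust-3"],
    "west": ["cust-1", "cust-2", "cust-3"],
}

print(redundancy_ratio(central))
print(redundancy_ratio(distributed))
```

In this made-up layout the single store holds each record once, while the three regional stores together hold seven copies of three records, so the ratio rises as the data is spread over more warehouses.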

There are generally three levels of data redundancy that enterprises look at when weighing data warehouse options.

"Virtual" or "point-to-point" data warehouses have the least redundancy, distributed data warehouses have the most, and central data warehouses fall in between. Organisations generally choose a blend of the three to best meet their requirements.

In a virtual or point-to-point data warehousing system, end-users are connected directly to the warehouse using tools in the "data access network". This approach provides the ultimate in flexibility as well as the minimum amount of redundant data that must be loaded and maintained. Virtual warehousing is, however, only an initial strategy; organisations then have to move on to more complex systems.

The central data warehouse is a single physical database that contains all of the data for a specific functional area, department, division or enterprise. Central data warehouses are selected where there is a common need for informational data and there are large numbers of end-users already connected to a central computer or network. The data stored in the warehouse is accessible from one place and must be loaded and maintained on a regular basis.

In a distributed data warehouse, the components of the data warehouse are spread across a number of different physical databases. This is ideal for large decentralised organisations which have pushed decision-making down to the lower levels. It pushes the data needed for decision-making down (or out) to the local area network or local computer serving the local decision-maker. Distributed data warehouses usually involve the most redundant data and, as a consequence, the most complex loading and updating processes, and hence should be attempted only by organisations with large resources.
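The difference in loading effort between the three options can be sketched with a simple count of load jobs per refresh. This is a deliberately crude model under a stated assumption: in the distributed case every site is taken to load from every source system, which is a worst case. The function name and the figures are hypothetical.

```python
def load_jobs_per_refresh(option, source_systems, sites=1):
    """Estimate how many load jobs one refresh cycle needs under each option."""
    if option == "virtual":
        # Users query the existing databases directly, so nothing is loaded.
        return 0
    if option == "central":
        # One physical warehouse, loaded once from each source system.
        return source_systems
    if option == "distributed":
        # Assumed worst case: every site loads from every source system.
        return source_systems * sites
    raise ValueError(f"unknown option: {option}")


for option in ("virtual", "central", "distributed"):
    print(option, load_jobs_per_refresh(option, source_systems=5, sites=4))
```

Real distributed warehouses usually copy only the subset of data each site needs, so the true figure lies somewhere between the central and the worst-case distributed count.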

The end-user also plays a key role in choosing a DW strategy. If you are a small business-to-business (B2B) company, there is no point in going in for a distributed data warehouse. A central data warehouse will do just fine. However, if you are an Amazon.com, then a distributed data warehouse is a must.
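The guidance above can be condensed into a small rule of thumb. The sketch below is one possible reading of the article, not a definitive selection method: the function `suggest_strategy` and its three yes/no inputs are hypothetical simplifications, and a real decision would weigh many more factors.

```python
def suggest_strategy(decentralised, large_resources, shared_central_network):
    """Suggest a starting data warehouse option from three yes/no questions."""
    if decentralised and large_resources:
        # Large decentralised organisations that can afford complex loading.
        return "distributed"
    if shared_central_network:
        # Many end-users already connected to a central computer or network.
        return "central"
    # Otherwise start with direct access to the existing databases.
    return "virtual"


# A small B2B company whose users share one central network.
print(suggest_strategy(decentralised=False, large_resources=False,
                       shared_central_network=True))   # central

# A large, decentralised online retailer with ample resources.
print(suggest_strategy(decentralised=True, large_resources=True,
                       shared_central_network=True))   # distributed
```

Since organisations generally end up with a blend of the three options, the result is best read as a starting point rather than a final architecture.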




Copyright © 2000 The Hindu Business Line.

Republication or redissemination of the contents of this screen is expressly prohibited without the written consent of The Hindu Business Line.