Tuesday, May 25, 2010

Data Reduction Technology

Data size is increasing! Corporations struggle to contain costs while investing heavily in backup and DR solutions to protect critical data and keep it highly available. New data reduction offerings from the major storage vendors help shrink the data footprint, which improves performance and data integrity, eliminates redundant data, reduces data protection costs, improves storage utilization, and speeds up remote backups, replication, and disaster recovery.
There are a number of technologies that fall under the classification of data reduction or deduplication techniques.
Both NetApp and EMC provide data reduction technology.
NetApp Inc. – Deduplication that works at the block level; the most prominent of the offerings aimed at primary storage.
EMC – Celerra Data Deduplication, which actually performs compression before tackling deduplication on file-based data.
The following table shows four major data reduction technologies along with the space they can be expected to save when applied to a typical file server or NAS data set.
Technology                    | "Typical" Space Savings | Resource Footprint
File-level deduplication      | 10%                     | Low
Fixed-block deduplication     | 20%                     | High
Variable-block deduplication  | 28%                     | High
Compression                   | 40% - 50%               | Medium
File-level deduplication, also known as file single-instancing, provides relatively modest space savings but is also relatively lightweight in terms of the CPU and memory resources required to implement it. Fixed-block deduplication provides better space savings but is far more resource-intensive, due to the processing power required to calculate a hash for each block of data and the memory required to hold the index used to determine whether a given hash has been seen before.

Variable-block deduplication provides slightly better space savings than fixed-block deduplication, but the difference is not significant when applied to file system data. Variable-block deduplication is much more effective on data sets that contain misaligned data, such as backup data in backup-to-disk or VTL environments. Its resource footprint is similar to that of fixed-block deduplication: it needs a comparable amount of memory and slightly more processing power.

Compression is often considered to be different from deduplication; however, compression can be described as infinitely variable, bit-level, intra-object deduplication. Technical pedantry aside, it is simply another technique that alters the way data is stored to improve storage efficiency. In fact it offers by far the greatest space savings of all the techniques listed for typical NAS data, and it is relatively modest in terms of its resource footprint: it is fairly compute-intensive but requires very little memory.
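To make the resource trade-off concrete, here is a minimal Python sketch (purely illustrative, not how NetApp or EMC actually implement their products) that estimates the savings from file-level single-instancing versus fixed-block deduplication. The "data" directory, the 4 KB block size, and the use of SHA-1 are assumptions chosen for the example; the in-memory set of block hashes is exactly the memory cost described above.

import hashlib
import os

BLOCK_SIZE = 4096  # hypothetical fixed block size for the example

def file_level_savings(paths):
    # File single-instancing: store one copy per unique whole-file hash.
    seen, total, stored = set(), 0, 0
    for path in paths:
        with open(path, "rb") as f:
            data = f.read()
        total += len(data)
        digest = hashlib.sha1(data).hexdigest()
        if digest not in seen:
            seen.add(digest)
            stored += len(data)
    return total, stored

def fixed_block_savings(paths):
    # Fixed-block dedup: store one copy per unique block hash.
    # The 'seen' index must stay in memory, which is the footprint cost.
    seen, total, stored = set(), 0, 0
    for path in paths:
        with open(path, "rb") as f:
            while True:
                block = f.read(BLOCK_SIZE)
                if not block:
                    break
                total += len(block)
                digest = hashlib.sha1(block).hexdigest()
                if digest not in seen:
                    seen.add(digest)
                    stored += len(block)
    return total, stored

if __name__ == "__main__":
    # "data" is a placeholder directory holding sample files.
    files = [os.path.join("data", n) for n in os.listdir("data")]
    for label, fn in (("file-level ", file_level_savings),
                      ("fixed-block", fixed_block_savings)):
        total, stored = fn(files)
        saved = 100.0 * (1 - stored / total) if total else 0.0
        print(f"{label}: {saved:.1f}% space saved")

Run against a folder containing duplicate and near-duplicate files, the fixed-block pass typically reports higher savings than the file-level pass, at the cost of a much larger hash index.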
Technological Classification
The practical benefits of these technologies depend on several factors:
Point of application – Source vs. Target
Time of application – Inline vs. Post-process
Granularity – File vs. Sub-file level
Algorithm – Fixed-size blocks vs. Variable-length data segments (see the chunking sketch below)
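The last distinction, fixed-size blocks versus variable-length segments, is easiest to see in a small sketch. The Python below is a simplified illustration using an arbitrary rolling-value boundary rule, not any vendor's actual chunking algorithm; the constants and the 6-byte "HEADER" insertion are assumptions for the demo. It chunks the same data twice and then checks how many chunks still match after a few bytes are inserted at the front, the kind of misalignment that is common in backup streams.

import hashlib
import os

FIXED_SIZE = 4096   # fixed block size for the comparison
MIN_CHUNK = 64      # minimum variable segment length (illustrative)
MASK = 0x0FFF       # cut when the low 12 bits of the rolling value are zero (~4 KB average)

def fixed_chunks(data):
    # Boundaries at absolute offsets: any insertion shifts every block.
    return [data[i:i + FIXED_SIZE] for i in range(0, len(data), FIXED_SIZE)]

def variable_chunks(data):
    # Content-defined boundaries: the low bits of the rolling value depend
    # only on the most recent bytes, so cut points follow the content
    # rather than absolute offsets.
    chunks, start, rolling = [], 0, 0
    for i, byte in enumerate(data):
        rolling = ((rolling << 1) + byte) & 0xFFFFFFFF
        if i - start >= MIN_CHUNK and (rolling & MASK) == 0:
            chunks.append(data[start:i + 1])
            start, rolling = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])
    return chunks

def match_ratio(old_chunks, new_chunks):
    # Fraction of new chunks whose hash was already seen in the old set.
    seen = {hashlib.sha1(c).hexdigest() for c in old_chunks}
    hits = sum(1 for c in new_chunks if hashlib.sha1(c).hexdigest() in seen)
    return hits / len(new_chunks)

if __name__ == "__main__":
    original = os.urandom(1 << 20)       # 1 MB of synthetic data
    shifted = b"HEADER" + original       # same data, misaligned by 6 bytes
    print("fixed-block chunks matched:    ",
          round(match_ratio(fixed_chunks(original), fixed_chunks(shifted)), 3))
    print("variable-length chunks matched:",
          round(match_ratio(variable_chunks(original), variable_chunks(shifted)), 3))

With random data the fixed-block pass matches essentially nothing after the shift, while the content-defined pass re-synchronizes after the first boundary, which is why variable-length segmentation pays off on misaligned data such as backup-to-disk or VTL streams.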

Sunday, January 17, 2010

ReplV2 & NAS Shutdown

Celerra Gateway NAS 5.6 / Replication V2

(1) Do I need to stop replication for a planned shutdown of a NAS gateway?
No, you do not need to stop Replication V2, since it maintains common base checkpoints on both the source and the destination.
(2) How do I shut down a Celerra gateway?
If connected through serial:
- Step 1 - Stop all NAS services using the respective Celerra commands, then mount the respective partitions.
- Step 2 - Shut down NAS using server_cpu (if IP replication is configured; otherwise use nas_halt). Before this, the Data Movers should be able to contact each other and at least one should be in state "5".
**If not connected through serial, follow only Step 2.**
**Always use the latest procedures from Powerlink.**

Monday, December 28, 2009

Any conversion for the backend CLARiiON attached to a Celerra GW?

Note1 - Check upgrade compatibility for the FLARE code and DART code, and the compatibility of the NAS GW with the target CX backend (in EMC ESM)
Note2 - Validate FLARE / DART (in some cases a NAS code upgrade may be required)
Note3 - Check whether the required space is available on the first five drives (the vault drives) for the conversion. This is to accommodate the latest FLARE code and its mandatory prerequisites (after accounting for the space already used by the NAS OS LUNs)
Note4 - Shutdown procedure for NAS GW
Note5 - Conversion procedure for CX Backend
Note6 - Power ON procedure for CX and NAS GW
Note7 - Validate accessibility/environment

EMC has internal procedures which are released on a case-by-case basis. It is mandatory to follow those procedures after completing the change control process!

Tuesday, November 24, 2009

Please share some EMC.com tech tips on the difference between the Integrated and Gateway NAS models?

Celerra, the NAS storage platform from EMC, has really impressive features depending on the model the customer purchases. EMC Celerra's main hardware components are the Control Station, the Data Mover(s), and the backend storage (which can be a CLARiiON or a DMX). In the integrated models, all storage volume allocation and configuration for NAS is done through the Control Station without logging in to the backend storage (it is handled by an inbuilt batch script). In the gateway model, the backend storage and the Celerra gateway are separate components, so the Celerra must be zoned to the backend CX or DMX, and to allocate volumes to the Celerra you have to log in to the backend storage (DMX or CX).

In short:
Gateway - Does not have an integrated CX or DMX (fabric switch or direct connect; if a switch is involved, the Data Movers need to be zoned with the CX or DMX)
Integrated - The CX or DMX is included with the NAS (no switch involved and no zoning needed)

Friday, November 20, 2009

How to move CIFS from PDM to VDM?

There is a process within EMC to move a CIFS environment from a PDM (physical Data Mover) to a VDM (virtual Data Mover). Moving from a PDM to a VDM requires EMC approval and EMC-approved documents, and you would need to engage a real Celerra expert. This is typically done with the cifs_move command. EMC recommends that CIFS environments run on a VDM. Downtime is required to move from a PDM to a VDM in a production environment; it could be 2 to 3 hours for the move and validation.