<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>jarneil</title>
	<atom:link href="http://jarneil.wordpress.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://jarneil.wordpress.com</link>
	<description>The thoughts of Jason Arneil</description>
	<lastBuildDate>Tue, 24 Jan 2012 16:23:00 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='jarneil.wordpress.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://s2.wp.com/i/buttonw-com.png</url>
		<title>jarneil</title>
		<link>http://jarneil.wordpress.com</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://jarneil.wordpress.com/osd.xml" title="jarneil" />
	<atom:link rel='hub' href='http://jarneil.wordpress.com/?pushpress=hub'/>
		<item>
		<title>Building an Exadata Compute Node USB imaging drive</title>
		<link>http://jarneil.wordpress.com/2012/01/17/building-an-exadata-compute-node-usb-imaging-drive/</link>
		<comments>http://jarneil.wordpress.com/2012/01/17/building-an-exadata-compute-node-usb-imaging-drive/#comments</comments>
		<pubDate>Tue, 17 Jan 2012 14:58:29 +0000</pubDate>
		<dc:creator>jarneil</dc:creator>
				<category><![CDATA[Exadata]]></category>

		<guid isPermaLink="false">http://jarneil.wordpress.com/?p=1974</guid>
		<description><![CDATA[I&#8217;ve been having a lot of fun recently doing a bare metal restore of an Exadata compute node. This process is reasonably well documented in MOS: 1084360.1. However, one area where it falls down is how to actually image the node. So here is my guide to building an Exadata Compute Node USB [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve been having a <em>lot</em> of fun recently doing a bare metal restore of an Exadata compute node. This process is reasonably well documented in MOS: <a href="https://supporthtml.oracle.com/ep/faces/secure/km/DownloadAttachment.jspx?attachid=1084360.1:BAREMETAL_RESTORECN">1084360.1</a>. However, one area where it falls down is how to actually image the node.</p>
<p>So here is my guide to building an Exadata Compute Node USB imaging drive. This will use VirtualBox on OS X. Yes, you could build it on an existing Exadata node.</p>
<p><strong>Obtaining Exadata Software</strong></p>
<p>The first thing you need to do is download the version you are after. Yes, you can grab it from <a href="https://edelivery.oracle.com/">Oracle edelivery</a>, as long as the version you are after is not <em>too</em> old. There are choices for the various versions, and separate downloads for compute nodes and storage nodes. You will download a zip file. When unzipped, assuming it is a compute node you are building, you will find a computeImageMake_VERSION.tar.zip. You need to uncompress and then untar this <strong>as root</strong>.</p>
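<p>As a rough sketch of that step, assuming the inner archive unzips to a plain tar file called computeImageMake_VERSION.tar (VERSION is a placeholder for whatever release you downloaded, and the real file names may differ):</p>
<p><pre class="brush: bash;">
# Run as root. VERSION is a placeholder, not a real file name.
unzip computeImageMake_VERSION.tar.zip

# Assumption: the zip contains computeImageMake_VERSION.tar
tar -xvf computeImageMake_VERSION.tar
</pre></p>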
<p>You will now have a dl360 directory. It just so happens that dl360 is a model of HP server. If you are building a storage node, it will be dl180, also a type of HP ProLiant server.</p>
<p>Descending into this newly created directory you will see a readme, README_FOR_FACTORY.txt, which explains how to build the image. You will also see makeImageMedia.sh, which actually does the building of the image.</p>
<p><strong>Building the Image</strong></p>
<p>Using a Linux VM in VirtualBox, present the USB stick to the VM. The stick needs to be at least 2GB in size.</p>
<p><pre class="brush: bash;">
[root@localhost ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup00-LogVol00
                       14G   11G  2.3G  83% /
/dev/sda1              99M   13M   82M  14% /boot
tmpfs                 502M     0  502M   0% /dev/shm
Downloads             297G  210G   88G  71% /media/sf_Downloads
/dev/hdc               43M   43M     0 100% /media/VBOXADDITIONS_4.1.6_74713
/dev/sdb1             3.8G  888K  3.8G   1% /media/UNTITLED
</pre></p>
<p>Unmount the device:</p>
<p><pre class="brush: bash;">
[root@localhost ~]# umount /dev/sdb1
</pre></p>
<p>Use fdisk to remove the FAT32 formatting and create a partition:</p>
<p><pre class="brush: bash;">

[root@localhost ~]# fdisk /dev/sdb

Command (m for help): p

Disk /dev/sdb: 4041 MB, 4041211904 bytes
255 heads, 63 sectors/track, 491 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1               1         492     3946495    b  W95 FAT32
Partition 1 has different physical/logical beginnings (non-Linux?):
     phys=(1023, 254, 63) logical=(0, 0, 3)
Partition 1 has different physical/logical endings:
     phys=(1023, 254, 63) logical=(491, 80, 37)

Command (m for help): d
Selected partition 1

Command (m for help): p

Disk /dev/sdb: 4041 MB, 4041211904 bytes
255 heads, 63 sectors/track, 491 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System

Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-491, default 1):
Using default value 1
Last cylinder or +size or +sizeM or +sizeK (1-491, default 491):
Using default value 491

Command (m for help): p

Disk /dev/sdb: 4041 MB, 4041211904 bytes
255 heads, 63 sectors/track, 491 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1               1         491     3943926   83  Linux

Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.
Syncing disks.

</pre></p>
<p>This is the crucial part: the build will fail if the stick is still formatted with FAT when you try to create the image.</p>
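<p>A quick way to confirm this before running the build is to check the Id column for the partition in the fdisk listing. The helper below is only a sketch: part_is_linux is a hypothetical name, and it assumes the classic fdisk -l layout shown above, where an Id of 83 means Linux.</p>
<p><pre class="brush: bash;">
# Hypothetical helper: reads "fdisk -l" output on stdin and succeeds
# only if the named partition carries Id 83 (Linux).
part_is_linux() {
    awk -v part="$1" '
        # Drop a boot flag (*) if present so that Id is always field 5.
        $1 == part { gsub(/\*/, ""); found = ($5 == "83") }
        END { exit (found ? 0 : 1) }
    '
}

# Example (as root): fdisk -l /dev/sdb | part_is_linux /dev/sdb1
</pre></p>
<p>The exit status is zero for the Linux partition created above and non-zero for the original W95 FAT32 entry.</p>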
<p>Now you are ready to run the makeImageMedia.sh script:</p>
<p><pre class="brush: bash;">
[root@localhost dl360]# ./makeImageMedia.sh

Please wait. Calculating md5 checksums for cellbits ...
Please wait. Making initrd ...
180027 blocks
Please wait. Calculating md5 checksums for boot ...

Choose listed USB devices to set up the Oracle CELL installer

sdb   Approximate capacity 3946 MB
Enter the comma separated (no spaces) list of devices or word 'ALL' for to select all: ALL
sdb will be used as the Oracle CELL installer

All data on sdb will be erased. Proceed [y/n]? y

Command (m for help): Building a new DOS disklabel. Changes will remain in memory only,
until you decide to write them. After that, of course, the previous
content won't be recoverable.

Warning: invalid flag 0x0000 of partition table 4 will be corrected by w(rite)

Command (m for help): Command action
   e   extended
   p   primary partition (1-4)
Partition number (1-4): First cylinder (1-491, default 1): Last cylinder or +size or +sizeM or +sizeK (1-491, default 491):
Command (m for help): The partition table has been altered!

Calling ioctl() to re-read partition table.
Syncing disks.
umount2: Invalid argument
umount: /dev/sdb1: not mounted
mke2fs 1.39 (29-May-2006)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
492032 inodes, 983973 blocks
49198 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=1010827264
31 block groups
32768 blocks per group, 32768 fragments per group
15872 inodes per group
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912, 819200, 884736

Writing inode tables: done                           
Creating journal (16384 blocks): done
Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 25 mounts or
180 days, whichever comes first.  Use tune2fs -c or -i to override.
Copying files... will take several minutes


    GNU GRUB  version 0.97  (640K lower / 3072K upper memory)

 [ Minimal BASH-like line editing is supported.  For the first word, TAB
   lists possible command completions.  Anywhere else TAB lists the possible
   completions of a device/filename.]
grub&gt; root (hd0,0)
 Filesystem type is ext2fs, partition type 0x83
grub&gt; setup (hd0)
 Checking if &quot;/boot/grub/stage1&quot; exists... no
 Checking if &quot;/grub/stage1&quot; exists... yes
 Checking if &quot;/grub/stage2&quot; exists... yes
 Checking if &quot;/grub/e2fs_stage1_5&quot; exists... yes
 Running &quot;embed /grub/e2fs_stage1_5 (hd0)&quot;...  15 sectors are embedded.
succeeded
 Running &quot;install /grub/stage1 (hd0) (hd0)1+15 p (hd0,0)/grub/stage2 /grub/grub.conf&quot;... succeeded
Done.
grub&gt; Done creation of installation USB for DL360
</pre></p>
<p>This will now have built your image on the USB drive. It is now a good idea to test that it really has worked and contains the software you require.</p>
<p><strong>Testing your Image</strong></p>
<p>There are a couple of ways of testing you have the image you want. Thanks to my <a href="http://www.e-dba.com">e-dba</a> colleague, Matthew Walden, for pointing these out.</p>
<p>First off, present the USB drive to your VM again.</p>
<p>On OS X when inserting an Exadata USB image, it is unrecognised by the OS and you see this dialogue box:</p>
<p><a href="http://jarneil.files.wordpress.com/2012/01/eject.jpg"><img src="http://jarneil.files.wordpress.com/2012/01/eject.jpg?w=600&#038;h=471" alt="" title="eject" width="600" height="471" class="aligncenter size-full wp-image-1980" /></a></p>
<p>You need to choose eject. Then you can attach the USB drive to the VirtualBox VM. Once this is done, the following folder will appear:</p>
<p><a href="http://jarneil.files.wordpress.com/2012/01/folder.jpg"><img src="http://jarneil.files.wordpress.com/2012/01/folder.jpg?w=600&#038;h=399" alt="" title="folder" width="600" height="399" class="aligncenter size-full wp-image-1981" /></a></p>
<p>You can look inside the image.id file to check what the image was built with:</p>
<p><a href="http://jarneil.files.wordpress.com/2012/01/image-id.jpg"><img src="http://jarneil.files.wordpress.com/2012/01/image-id.jpg?w=600&#038;h=472" alt="" title="image.id" width="600" height="472" class="aligncenter size-full wp-image-1982" /></a></p>
<p>You can clearly see this is a Compute node and is built with the 11.2.1.2.3 software version.</p>
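<p>The same check can be done from a shell inside the Linux VM. This is only a sketch, and it assumes the stick is still /dev/sdb1, that /mnt/usb is a free mount point (a path picked purely for illustration), and that image.id sits at the top level of the filesystem as in the folder view above:</p>
<p><pre class="brush: bash;">
# Run as root. /mnt/usb is an illustrative mount point.
mkdir -p /mnt/usb

# Mount read-only so the freshly built image cannot be altered.
mount -o ro /dev/sdb1 /mnt/usb

# Assumption: image.id lives at the top level of the stick.
cat /mnt/usb/image.id

umount /mnt/usb
</pre></p>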
<p>You can also test that your image will boot using VirtualBox and the <a href="http://www.plop.at/en/bootmanagerdl.html">plop bootloader</a>. This allows you to select a USB device to boot from in VirtualBox. It downloads as an ISO file, which you boot a VM from. You then see the following menu:</p>
<p><a href="http://jarneil.files.wordpress.com/2012/01/plop.jpg"><img src="http://jarneil.files.wordpress.com/2012/01/plop.jpg?w=600&#038;h=464" alt="" title="plop" width="600" height="464" class="aligncenter size-full wp-image-1985" /></a></p>
<p>After presenting your USB drive containing the Exadata image to the VM, you can select the USB device to boot from. Eventually you will see the following splash screen:</p>
<p><a href="http://jarneil.files.wordpress.com/2012/01/boot.jpg"><img src="http://jarneil.files.wordpress.com/2012/01/boot.jpg?w=600&#038;h=375" alt="" title="boot" width="600" height="375" class="aligncenter size-full wp-image-1986" /></a></p>
<p>The above steps should give you some confidence that the USB image you have created will actually boot an Exadata node.</p>
<p>Note the above is only intended for bare metal restore or reimaging of an Exadata node. In later software versions, mounting an ISO image via virtual cdrom on the iLOM works fine and would be my preferred solution.</p>
]]></content:encoded>
			<wfw:commentRss>http://jarneil.wordpress.com/2012/01/17/building-an-exadata-compute-node-usb-imaging-drive/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/ddedfa1a8e6d735d710e404ff39e5eef?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">jarneil</media:title>
		</media:content>

		<media:content url="http://jarneil.files.wordpress.com/2012/01/eject.jpg" medium="image">
			<media:title type="html">eject</media:title>
		</media:content>

		<media:content url="http://jarneil.files.wordpress.com/2012/01/folder.jpg" medium="image">
			<media:title type="html">folder</media:title>
		</media:content>

		<media:content url="http://jarneil.files.wordpress.com/2012/01/image-id.jpg" medium="image">
			<media:title type="html">image.id</media:title>
		</media:content>

		<media:content url="http://jarneil.files.wordpress.com/2012/01/plop.jpg" medium="image">
			<media:title type="html">plop</media:title>
		</media:content>

		<media:content url="http://jarneil.files.wordpress.com/2012/01/boot.jpg" medium="image">
			<media:title type="html">boot</media:title>
		</media:content>
	</item>
		<item>
		<title>Visualising Exadata Disk hierarchy</title>
		<link>http://jarneil.wordpress.com/2011/12/11/visualising-exadata-disk-hierarchy/</link>
		<comments>http://jarneil.wordpress.com/2011/12/11/visualising-exadata-disk-hierarchy/#comments</comments>
		<pubDate>Sun, 11 Dec 2011 16:00:35 +0000</pubDate>
		<dc:creator>jarneil</dc:creator>
				<category><![CDATA[Exadata]]></category>

		<guid isPermaLink="false">http://jarneil.wordpress.com/?p=1962</guid>
		<description><![CDATA[I have previously discussed the terminology involved in the hierarchy of abstractions when talking about Exadata disks. When learning new terminology I always think some form of diagram can make things much, much clearer. Hopefully the following diagrams will be of use to you too. First up, somewhat simplistically, how ordinary storage cell drives [...]]]></description>
			<content:encoded><![CDATA[<p>I have <a href="http://jarneil.wordpress.com/2011/11/20/the-griddisk-is-connected-to-the-celldisk/">previously discussed</a> the terminology involved in the hierarchy of abstractions when talking about Exadata disks. When learning new terminology I always think some form of diagram can make things much, much clearer. Hopefully the following diagrams will be of use to you too.</p>
<p>First up, somewhat simplistically, how ordinary storage cell drives are carved up with griddisks:</p>
<p><a href="http://jarneil.files.wordpress.com/2011/12/griddisk-013.png"><img src="http://jarneil.files.wordpress.com/2011/12/griddisk-013.png?w=600&#038;h=415" alt="" title="griddisk.013" width="600" height="415" class="alignright size-full wp-image-1963" /></a></p>
<p>So we see here how multiple griddisks are created on top of 1 celldisk, though they have been labelled with the name of the ASM Diskgroup that they belong to. The thing to take away, though, is that multiple griddisks are presented to ASM from the 1 celldisk.</p>
<p>A bit more interesting is what a storage cell system disk looks like:</p>
<p><a href="http://jarneil.files.wordpress.com/2011/12/system-disk-012.png"><img src="http://jarneil.files.wordpress.com/2011/12/system-disk-012.png?w=600&#038;h=415" alt="" title="system disk.012" width="600" height="415" class="alignleft size-full wp-image-1965" /></a></p>
<p>So we see that there is no griddisk associated with the SYSTEM diskgroup (yes, newer storage cell versions have this called DBFS_DG). However, we note that there are a multitude of additional partitions to hold the storage cell software, and (assuming an upgrade has been done) one previous version of the cell software.</p>
<p>It should be noted that there is still 1 celldisk created here on this system disk, and the two griddisks are still as normal carved out from this.</p>
<p>Finally a look at how the full hierarchy stacks up:</p>
<p><a href="http://jarneil.files.wordpress.com/2011/12/hierachy-007.png"><img src="http://jarneil.files.wordpress.com/2011/12/hierachy-007.png?w=600&#038;h=415" alt="" title="hierachy.007" width="600" height="415" class="alignleft size-full wp-image-1969" /></a></p>
<p>So here we see how all the terminology stacks up, with the actual physical disk at the bottom of the hierarchy.</p>
<p>The top pieces of the hierarchy are of course the ASM diskgroups upon which you can create your database datafiles.</p>
]]></content:encoded>
			<wfw:commentRss>http://jarneil.wordpress.com/2011/12/11/visualising-exadata-disk-hierarchy/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/ddedfa1a8e6d735d710e404ff39e5eef?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">jarneil</media:title>
		</media:content>

		<media:content url="http://jarneil.files.wordpress.com/2011/12/griddisk-013.png" medium="image">
			<media:title type="html">griddisk.013</media:title>
		</media:content>

		<media:content url="http://jarneil.files.wordpress.com/2011/12/system-disk-012.png" medium="image">
			<media:title type="html">system disk.012</media:title>
		</media:content>

		<media:content url="http://jarneil.files.wordpress.com/2011/12/hierachy-007.png" medium="image">
			<media:title type="html">hierachy.007</media:title>
		</media:content>
	</item>
		<item>
		<title>UKOUG Annual Conference 2011</title>
		<link>http://jarneil.wordpress.com/2011/12/07/ukoug-annual-conference-2011/</link>
		<comments>http://jarneil.wordpress.com/2011/12/07/ukoug-annual-conference-2011/#comments</comments>
		<pubDate>Wed, 07 Dec 2011 22:31:01 +0000</pubDate>
		<dc:creator>jarneil</dc:creator>
				<category><![CDATA[UKOUG]]></category>

		<guid isPermaLink="false">http://jarneil.wordpress.com/?p=1946</guid>
		<description><![CDATA[Having been involved with the UKOUG for more than 10 years now, one of the things that is so nice about returning to the annual UKOUG conference is the sheer number of familiar faces. It also is a great place to hear about other people&#8217;s experiences and an opportunity to broaden your horizons. Sadly due [...]]]></description>
			<content:encoded><![CDATA[<p>Having been involved with the UKOUG for more than 10 years now, one of the things that is so nice about returning to the annual UKOUG conference is the sheer number of familiar faces.</p>
<p>It also is a great place to hear about other people&#8217;s experiences and an opportunity to broaden your horizons.</p>
<p>Sadly due to my youngest daughter&#8217;s terrible timing of being born at this time of the year, I had to miss the fabulous Oaktable Sunday and be at her 4th birthday party instead. Thanks to everyone who told me how good it was. While this did not exactly help, I&#8217;m glad others had fun.</p>
<p>One other regular feature of UKOUG conference time is that the weather becomes extremely cold. It has been a ridiculously mild autumn/winter in the UK so far this year, but as soon as I got to Birmingham I&#8217;m sure the temperature was 10C colder than in the previous week.</p>
<p>Some highlights that I particularly enjoyed:</p>
<p><strong>Monday</strong></p>
<p>Monday at UKOUG had a lot of talks in an Exadata stream, so I spent most of my time swimming there. The highlight presentation for me on Monday was by <a href="http://fritshoogland.wordpress.com/">Frits Hoogland</a>. It started off looking like there would be a small crowd, but people kept flooding in after the start and by 5 minutes in, there was standing room only. A small explosion of a lightbulb blowing failed to put Frits off his stride in his &#8220;Explaining Exadata&#8221; presentation, and Frits gave an excellent overview of the components of Exadata.</p>
<p>I really liked how Frits was trying to strip away the magic and mystique surrounding Exadata &#8211; as he says, it&#8217;s not magic, just built on standard components. He then delved into how ASM works with the storage cells, and gave a good discussion on flash performance numbers. He also gave some good information on real life infiniband performance.</p>
<p>Frits had a nice filmed demo highlighting various features of Exadata using a select on a very large table. Frits&#8217; example of EHCC showed he could get huge space savings. He had lovely visualisations of the I/O and database wait profile while doing the scan of the table.</p>
<p>Frits showed that going from 11.2.0.1 to 11.2.0.2 improved serial direct path speed quite substantially. He then showed the improvement that parallel query could bring.</p>
<p>I also attended an Exadata round table, and there were people from a few companies who had implemented Exadata. I had gone hoping the discussions would be very techie, but a lot of the discussions focused around more of the business issues involved in an Exadata implementation &#8211; not because there were not loads of techies at the roundtable, but I think this is indicative of how disruptive Exadata is. It is hard to confine it to the normal silos that companies have. Seems like having DBAs sitting in the Unix/SA team to manage Exadata really is the way to do it.</p>
<p><strong>Tuesday</strong></p>
<p>Tuesday really was a superb day at UKOUG: presentation after presentation was just outstanding. The quality of the presenters is outstandingly high. First up I thought I&#8217;d try something different and went to <a href="http://nuijten.blogspot.com/">Alex Nuijten</a> talking about analytic functions. Now, I&#8217;m not much of an SQL coder, but it is good to have a change from the usual presentations I go to: RAC, Dataguard &#8211; more towards the &#8220;infrastructure&#8221; end than the SQL end of being a DBA.</p>
<p>Alex is a superb presenter, and I was deeply impressed by the quality of the slides. Being on first thing in the morning meant I think he had a tough crowd, and he kept asking for audience feedback but most people were fairly quiet. He explained most of his material really well. As he said himself, this was not an introductory presentation, so while I got lost in a few of the steps/techniques he showed, I&#8217;m sure for the vast majority of the audience, starting in a better place than me, this would not have been the case.</p>
<p>The next stand out presentation was by <a href="http://connormcdonald.wordpress.com//">Connor McDonald</a> on a year in purgatory, where he had some issues regarding upgrading, and eventually had to perform upgrades over 8 Oracle versions to find a stable release.</p>
<p>All I can say is wow. This was a <em>stunning</em> presentation. </p>
<p>I was on the edge of my seat wanting to know what happened next. That sickening feeling when one of his production upgrades went horribly wrong and he was looking at potential millions of pounds of loss for the business, was just palpable. </p>
<p>Lots of useful tips, including what not to do, but his final message about essentially being on the mainstream in terms of release (OS) I think is so true.</p>
<p>I then had the pleasure of a <a href="http://blog.tanelpoder.com/">Tanel Poder</a> troubleshooting masterclass where he discussed a performance problem and as always emphasised that you need to take a systematic approach to problem solving.</p>
<p>As if that was not enough top presenters and presentations, I had even more to come in the form of a fairly impromptu parallel query masterclass with <a href="http://structureddata.org/">Greg Rahn</a>. Greg managed to go on for nearly 2 hours, and there were numerous useful asides. Greg also just has a great way of presenting complex information, and I really was astonished by the depth of his technical knowledge. This was in the unconference and there were only a few of us lucky enough to attend this one, but I&#8217;m certainly glad I did!</p>
<p><strong>Wednesday</strong></p>
<p>After a couple of <em>really</em> late nights, though I admit not as late as some, I was feeling really tired come Wednesday morning, but as I could only stay for a few presentations I thought I would attend the first presentation of the day. I started off the day with <a href="http://oracledoug.com/serendipity/">Doug Burns</a> and his presentation on partitioned statistics, and I am really glad I did. Doug had sounded a bit under the weather earlier in the week, but he was bang on form with this presentation. Doug has a really great rapport with an audience and, maybe this is the Scot in me, but I think he has a lovely delivery &#8211; a really good presenter. Also the material was very interesting and I picked up some statistics gathering tips &#8211; it&#8217;s hard to get decent statistics on very large tables even if they are partitioned.</p>
<p>One other new feature of UKOUG 2011 was the lunchtime oaktalks, which were 10 minute lightning talks. These were a lot of fun.</p>
<p>Wednesday was somewhat curtailed for me as I had to return to Oxford to see the tinselled vision of my eldest daughter singing in her school Christmas concert.</p>
<p>All in all, it&#8217;s been a great few days.</p>
]]></content:encoded>
			<wfw:commentRss>http://jarneil.wordpress.com/2011/12/07/ukoug-annual-conference-2011/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/ddedfa1a8e6d735d710e404ff39e5eef?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">jarneil</media:title>
		</media:content>
	</item>
		<item>
		<title>When Oracle Patching Goes Wrong &#8211; OUI-67124</title>
		<link>http://jarneil.wordpress.com/2011/11/27/when-oracle-patching-goes-wrong-oui-67124/</link>
		<comments>http://jarneil.wordpress.com/2011/11/27/when-oracle-patching-goes-wrong-oui-67124/#comments</comments>
		<pubDate>Sun, 27 Nov 2011 11:59:34 +0000</pubDate>
		<dc:creator>jarneil</dc:creator>
				<category><![CDATA[Exadata]]></category>
		<category><![CDATA[patchset]]></category>

		<guid isPermaLink="false">http://jarneil.wordpress.com/?p=1925</guid>
		<description><![CDATA[Sometimes things never quite go according to plan. You can test, and test in a UAT or dev environment, but just sometimes, something comes out of left field when you come to roll it into production. Just such an issue appeared when I was really rolling a patch into production. This was Exadata BP 11, [...]]]></description>
			<content:encoded><![CDATA[<p>Sometimes things <em>never</em> quite go according to plan. You can test and test again in a UAT or dev environment, but just sometimes something comes out of left field when you come to roll a change into production. Just such an issue appeared when I was rolling a patch into production for real.</p>
<p>This was Exadata BP 11, but the issue is not really Exadata-specific. The bundle patch was being applied with the opatch auto command and it all appeared to be going well: no indication of a problem appeared in the window where I was applying the patch. But when I checked how many patches were installed in the GI home, instead of seeing the following:</p>
<p><pre class="brush: bash;">
db01(oracle):+ASM1:oracle$ /u01/app/oracle/product/11.2.0.2/grid/OPatch/opatch lsinventory  |grep -i applied

Patch  12914289     : applied on Sat Nov 12 14:14:39 GMT 2011 
Patch  12421404     : applied on Sat Nov 12 14:12:01 GMT 2011 
Patch  12902308     : applied on Sat Nov 12 13:03:27 GMT 2011
</pre></p>
<p>I found only the 12902308 patch applied to the GI home. I knew this bundle patch should leave 3 patches applied to the GI home, so something had gone awry.</p>
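<p>This check is easy to script rather than eyeball. The following is only a sketch: count_applied is a helper name I have made up, and the sample lines are copied from the listing above rather than read from a live inventory.</p>
<p><pre class="brush: bash;">
# Hypothetical helper: count the "applied on" lines in opatch lsinventory
# output read from stdin.
count_applied() {
    grep -ci 'applied on'
}

# Sample input copied from the listing above; on a real system you would pipe
# the output of "opatch lsinventory" into count_applied instead.
printf '%s\n' \
    'Patch  12914289     : applied on Sat Nov 12 14:14:39 GMT 2011' \
    'Patch  12421404     : applied on Sat Nov 12 14:12:01 GMT 2011' \
    'Patch  12902308     : applied on Sat Nov 12 13:03:27 GMT 2011' |
    count_applied
</pre></p>
<p>For the three sample lines this prints 3, the number this bundle patch should leave in the GI home.</p>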
<p>Looking into the log file for the patch application eventually revealed the following:</p>
<p><pre class="brush: bash;">
 -------------------------------------------------------------------------------- 
 The following warnings have occurred during OPatch execution: 
 1) OUI-67303: 
 Patches [   12419090 ] will be rolled back.

 2) OUI-67124:Copy failed from '/u01/app/oracle/BP11/12902308/12421404/files/bin/crsctl.bin' to '/u01/app/oracle/product/11.2.0.2/grid/bin/crsctl.bin'...

 3) OUI-67124:ApplySession failed in system modification phase... 'ApplySession::apply failed: Copy failed from '/u01/app/oracle/BP11/12902308/12421404/files/bin/crsctl.bin' to '/u01/app/oracle/product/11.2.0.2/grid/bin/crsctl.bin'...
</pre></p>
<p>So now we can see where the issue occurred, but we still need to work out how to fix it. Checking the file with ls, everything looked fine, and the permissions looked good too. </p>
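<p>Since the terminal gave no hint of trouble, it may be worth scanning the OPatch log for OUI codes after every run. A minimal sketch, assuming the log is fed in on stdin; oui_codes is a hypothetical helper and the sample lines are abbreviated from the log extract above.</p>
<p><pre class="brush: bash;">
# Hypothetical helper: list each distinct OUI code found on stdin together
# with the number of times it occurs.
oui_codes() {
    grep -o 'OUI-[0-9]*' | sort | uniq -c | awk '{ print $2, $1 }'
}

# Sample lines abbreviated from the log extract above.
printf '%s\n' \
    ' 1) OUI-67303: ' \
    ' 2) OUI-67124:Copy failed from ...' \
    ' 3) OUI-67124:ApplySession failed in system modification phase...' |
    oui_codes
</pre></p>
<p>On a real system you would pipe the log file for the patch session into oui_codes; where that log lives depends on your installation.</p>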
<p>I&#8217;d also like to point out that opatch auto is meant to take care of shutting down the GI stack cleanly, and it basically automates the application of the patch to both the GI and RDBMS homes.  </p>
<p><strong>fuser to the rescue</strong></p>
<p>The last idea was to check whether any processes were using this file. A simple ps was not giving any clue that something was running from this $ORACLE_HOME: though there were lots of processes owned by the oracle user, nothing was obviously running from the GI home. One excellent way of finding out if a process is using a particular file or filesystem is <a href="http://unixhelp.ed.ac.uk/CGI/man-cgi?fuser">fuser</a>. I ran this and saw the following:</p>
<p><pre class="brush: bash;">
fuser -c /u01/app/oracle/product/11.2.0.2/grid/bin/crsctl.bin

/u01/app/oracle/product/11.2.0.2/grid/bin/crsctl.bin:  1106c  2569c  3493c  4348c  4865c  5863c  5887c  6666c  6739c  7036c  7230c  7299c  7303c  8411  8428c  8487c  8545c  8642c  9462c  9754c 10634c 10710c 11278c 11413ce 11919c 12344 12907c 13550c 13674c 14992 15166c 15480c 15987 16282c 16421c 16982c 17390c 17500c 17860c 17932c 18162c 18373c 18667c 19065c 19980c 20017c 20019c 20115c 20139c 20441c 20594c 20942c 21202c 21305c 21761c 21825c 24599c 24792c 
</pre></p>
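<p>Ouch! That is a <em>lot</em> of processes. Strictly speaking, the -c flag makes fuser treat the file as a mount point, so it reports every process using the filesystem that holds the file rather than only the file itself; the trailing c marks a process whose current directory is on that filesystem, and e marks a running executable. Looking at a few, they seemed to be ssh processes owned by oracle off to other servers. It seemed a bit of a pain to go through them all killing each one individually, and this is where fuser comes to the rescue again!</p>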
<p><pre class="brush: bash;">

fuser -ck /u01/app/oracle/product/11.2.0.2/grid/bin/crsctl.bin
</pre></p>
<p>The -k flag kills every process that fuser reports, which with -c means everything using that filesystem, so use it with care. Now I could try manually applying the missing patches.</p>
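<p>If you would rather see what you are about to kill, the PID list can be cleaned up and handed to ps first. This is a sketch only; fuser_pids is a made-up helper that strips the access-type letters, and the sample is the start of the output shown earlier.</p>
<p><pre class="brush: bash;">
# Hypothetical helper: turn a fuser PID list (read from stdin) into bare PIDs,
# one per line, dropping access-type letters such as c and e.
fuser_pids() {
    tr -s '[:space:]' '\n' | grep -E '^[0-9]+[a-zA-Z]*$' | tr -d 'a-zA-Z'
}

# Sample taken from the output shown earlier.
printf '%s\n' ' 1106c  2569c  8411  11413ce' | fuser_pids

# On a live system something like the following would show each process:
# fuser -c "$file" | fuser_pids | while read -r pid; do ps -fp "$pid"; done
</pre></p>
<p>Some fuser implementations write only the PIDs to stdout and everything else to stderr, so the letters may already be gone once the output is piped; the helper copes with either form.</p>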
<p><strong>Manually Patching GI</strong></p>
<p>This is well documented but bears repeating: when you manually apply a patch (or indeed roll back a patch) in the GI Home, you first have to unlock the home as root. You need to run the following:</p>
<p><pre class="brush: bash;">
# /u01/app/oracle/product/11.2.0.2/grid/crs/install/rootcrs.pl -unlock
</pre></p>
<p>Now, as the oracle user, you descend to your patch directory and apply the patch with a simple opatch apply (or napply). Once you have applied all the patches to the GI Home you need to lock it again, which once more you need to run as root:</p>
<p><pre class="brush: bash;">
# /u01/app/oracle/product/11.2.0.2/grid/crs/install/rootcrs.pl -patch
</pre></p>
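<p>Putting the steps together, here is a dry-run sketch that only prints the sequence. gi_patch_plan is a hypothetical helper, the patch directory is the one that appears in the log extract earlier, and any extra opatch options your patch README asks for are deliberately left out.</p>
<p><pre class="brush: bash;">
# Hypothetical dry-run helper: print, in order, the commands for manually
# patching a GI home. Nothing is executed.
# Usage: gi_patch_plan GI_HOME PATCH_DIR [PATCH_DIR ...]
gi_patch_plan() {
    gi_home=$1
    shift
    echo "root:   $gi_home/crs/install/rootcrs.pl -unlock"
    for patch_dir in "$@"; do
        echo "oracle: cd $patch_dir; $gi_home/OPatch/opatch apply"
    done
    echo "root:   $gi_home/crs/install/rootcrs.pl -patch"
}

gi_patch_plan /u01/app/oracle/product/11.2.0.2/grid \
    /u01/app/oracle/BP11/12902308/12421404
</pre></p>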
<p>Avoiding these steps is certainly one advantage of opatch auto. I just wish it made it a bit more obvious when it fails to apply <em>every</em> patch to a home!</p>
]]></content:encoded>
			<wfw:commentRss>http://jarneil.wordpress.com/2011/11/27/when-oracle-patching-goes-wrong-oui-67124/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/ddedfa1a8e6d735d710e404ff39e5eef?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">jarneil</media:title>
		</media:content>
	</item>
		<item>
		<title>The Griddisk is connected to the Celldisk&#8230;</title>
		<link>http://jarneil.wordpress.com/2011/11/20/the-griddisk-is-connected-to-the-celldisk/</link>
		<comments>http://jarneil.wordpress.com/2011/11/20/the-griddisk-is-connected-to-the-celldisk/#comments</comments>
		<pubDate>Sun, 20 Nov 2011 20:29:55 +0000</pubDate>
		<dc:creator>jarneil</dc:creator>
				<category><![CDATA[Exadata]]></category>

		<guid isPermaLink="false">http://jarneil.wordpress.com/?p=1885</guid>
		<description><![CDATA[One of the songs my children like to hear is called dem bones, you are probably familiar with how it tells you which bones are connected to which. When considering the hierarchy of abstractions within Exadata disk drives I am often very much reminded of this song. When presenting storage from an Exadata cell there [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://jarneil.files.wordpress.com/2011/11/skeleton.gif"><img src="http://jarneil.files.wordpress.com/2011/11/skeleton.gif?w=600" alt="" title="skeleton"   class="alignright size-full wp-image-1886" /></a></p>
<p>One of the songs my children like to hear is called <a href="http://www.youtube.com/watch?v=RN9s5vs07WM&amp;feature=related">dem bones</a>; you are probably familiar with how it tells you which bones are connected to which. </p>
<p>When considering the hierarchy of abstractions within Exadata disk drives I am often very much reminded of this song. When presenting storage from an Exadata cell there are 4 layers to deal with before we have something that ASM knows how to operate on, that is something that ASM can actually create a diskgroup with.</p>
<p>It is worth also pointing out that ASM is running on the compute nodes, but as an administrator you will be operating on the storage cell to create the disks that ASM can actually use.</p>
<p>We could work our way from the actual physical drives up to what ASM sees, but let&#8217;s go in reverse, from ASM drilling back down to a &#8220;brown spinny thing&#8221;. </p>
<p>All commands I&#8217;m going to show here were run on the <a href="http://www.e-dba.com">e-dba</a> Proof of Concept Exadata X2-2 quarter rack. Let&#8217;s first take a look at the diskgroups available to the RDBMS:</p>
<p><pre class="brush: plain;">
SQL&gt; select group_number, name 
from v$asm_diskgroup;

GROUP_NUMBER NAME
------------ ------------------------------
	   1 DATA_EX01
	   2 DBFS_DG
	   3 RECO_EX01

</pre></p>
<p>So we see the 3 diskgroups.</p>
<p>Let us look at the individual ASM disks that make up these diskgroups:</p>
<p><pre class="brush: bash;">
SQL&gt; select group_number, disk_number, name, path 
from v$asm_disk 
where name like '%CEL01' 
order by 1,2 asc;
GROUP_NUMBER DISK_NUMBER NAME				   PATH
------------ ----------- --------------------- ------------------------------------------
	   1	       0 DATA_EX01_CD_00_EX01CEL01	o/192.168.10.3/DATA_EX01_CD_00_ex01cel01
	   1	       1 DATA_EX01_CD_01_EX01CEL01	o/192.168.10.3/DATA_EX01_CD_01_ex01cel01
	   1	       2 DATA_EX01_CD_02_EX01CEL01	o/192.168.10.3/DATA_EX01_CD_02_ex01cel01
	   1	       3 DATA_EX01_CD_03_EX01CEL01	o/192.168.10.3/DATA_EX01_CD_03_ex01cel01
	   1	       4 DATA_EX01_CD_04_EX01CEL01	o/192.168.10.3/DATA_EX01_CD_04_ex01cel01
	   1	       5 DATA_EX01_CD_05_EX01CEL01	o/192.168.10.3/DATA_EX01_CD_05_ex01cel01
	   1	       6 DATA_EX01_CD_06_EX01CEL01	o/192.168.10.3/DATA_EX01_CD_06_ex01cel01
	   1	       7 DATA_EX01_CD_07_EX01CEL01	o/192.168.10.3/DATA_EX01_CD_07_ex01cel01
	   1	       8 DATA_EX01_CD_08_EX01CEL01	o/192.168.10.3/DATA_EX01_CD_08_ex01cel01
	   1	       9 DATA_EX01_CD_09_EX01CEL01	o/192.168.10.3/DATA_EX01_CD_09_ex01cel01
	   1	      10 DATA_EX01_CD_10_EX01CEL01	o/192.168.10.3/DATA_EX01_CD_10_ex01cel01
	   1	      11 DATA_EX01_CD_11_EX01CEL01	o/192.168.10.3/DATA_EX01_CD_11_ex01cel01
	   2	       0 DBFS_DG_CD_02_EX01CEL01	o/192.168.10.3/DBFS_DG_CD_02_ex01cel01
	   2	       1 DBFS_DG_CD_03_EX01CEL01	o/192.168.10.3/DBFS_DG_CD_03_ex01cel01
	   2	       2 DBFS_DG_CD_04_EX01CEL01	o/192.168.10.3/DBFS_DG_CD_04_ex01cel01
	   2	       3 DBFS_DG_CD_05_EX01CEL01	o/192.168.10.3/DBFS_DG_CD_05_ex01cel01
	   2	       4 DBFS_DG_CD_06_EX01CEL01	o/192.168.10.3/DBFS_DG_CD_06_ex01cel01
	   2	       5 DBFS_DG_CD_07_EX01CEL01	o/192.168.10.3/DBFS_DG_CD_07_ex01cel01
	   2	       6 DBFS_DG_CD_08_EX01CEL01	o/192.168.10.3/DBFS_DG_CD_08_ex01cel01
	   2	       7 DBFS_DG_CD_09_EX01CEL01	o/192.168.10.3/DBFS_DG_CD_09_ex01cel01
	   2	       8 DBFS_DG_CD_10_EX01CEL01	o/192.168.10.3/DBFS_DG_CD_10_ex01cel01
	   2	       9 DBFS_DG_CD_11_EX01CEL01	o/192.168.10.3/DBFS_DG_CD_11_ex01cel01
	   3	       0 RECO_EX01_CD_00_EX01CEL01	o/192.168.10.3/RECO_EX01_CD_00_ex01cel01
	   3	       1 RECO_EX01_CD_01_EX01CEL01	o/192.168.10.3/RECO_EX01_CD_01_ex01cel01
	   3	       2 RECO_EX01_CD_02_EX01CEL01	o/192.168.10.3/RECO_EX01_CD_02_ex01cel01
	   3	       3 RECO_EX01_CD_03_EX01CEL01	o/192.168.10.3/RECO_EX01_CD_03_ex01cel01
	   3	       4 RECO_EX01_CD_04_EX01CEL01	o/192.168.10.3/RECO_EX01_CD_04_ex01cel01
	   3	       5 RECO_EX01_CD_05_EX01CEL01	o/192.168.10.3/RECO_EX01_CD_05_ex01cel01
	   3	       6 RECO_EX01_CD_06_EX01CEL01	o/192.168.10.3/RECO_EX01_CD_06_ex01cel01
	   3	       7 RECO_EX01_CD_07_EX01CEL01	o/192.168.10.3/RECO_EX01_CD_07_ex01cel01
	   3	       8 RECO_EX01_CD_08_EX01CEL01	o/192.168.10.3/RECO_EX01_CD_08_ex01cel01
	   3	       9 RECO_EX01_CD_09_EX01CEL01	o/192.168.10.3/RECO_EX01_CD_09_ex01cel01
	   3	      10 RECO_EX01_CD_10_EX01CEL01	o/192.168.10.3/RECO_EX01_CD_10_ex01cel01
	   3	      11 RECO_EX01_CD_11_EX01CEL01	o/192.168.10.3/RECO_EX01_CD_11_ex01cel01

</pre></p>
<p>So we have limited the output to disks from just the one cell. We can see that there are disks here being used in the 3 different diskgroups. In the path field we see the IP address of the cell that the ASM disk resides on. The names of the form DATA_EX01_CD_00_EX01CEL01 are griddisks, and these are created and managed on the storage cell.</p>
<p>I really like the naming convention: you can tell a lot about where an ASM disk came from by looking at its name. </p>
<p>So we see the name of the cell: ex01cel01</p>
<p>We see which celldisk this griddisk is created upon: CD_00 (within the ex01cel01 cell)</p>
<p>And we see the diskgroup that this belongs to: DATA_EX01</p>
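<p>Because the convention is so regular, a name can be pulled apart mechanically. A small sketch, assuming every griddisk follows the prefix_CD_nn_cell pattern seen above; parse_griddisk is a name I have invented for illustration.</p>
<p><pre class="brush: bash;">
# Hypothetical helper: split a griddisk name into its three parts, relying
# purely on the naming convention described above.
parse_griddisk() {
    echo "$1" | sed -E 's/^(.*)_(CD_[0-9]+)_(.*)$/diskgroup=\1 celldisk=\2 cell=\3/'
}

parse_griddisk DATA_EX01_CD_00_ex01cel01
</pre></p>
<p>Bear in mind the first field is only the griddisk prefix; it matches the diskgroup name here because that is how this rack was set up.</p>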
<p>Let&#8217;s take a look at what the storage cell can tell us about the griddisks:</p>
<p><pre class="brush: bash;">
CellCLI&gt; list griddisk
	 DATA_EX01_CD_00_ex01cel01	 active
	 DATA_EX01_CD_01_ex01cel01	 active
	 DATA_EX01_CD_02_ex01cel01	 active
	 DATA_EX01_CD_03_ex01cel01	 active
	 DATA_EX01_CD_04_ex01cel01	 active
	 DATA_EX01_CD_05_ex01cel01	 active
	 DATA_EX01_CD_06_ex01cel01	 active
	 DATA_EX01_CD_07_ex01cel01	 active
	 DATA_EX01_CD_08_ex01cel01	 active
	 DATA_EX01_CD_09_ex01cel01	 active
	 DATA_EX01_CD_10_ex01cel01	 active
	 DATA_EX01_CD_11_ex01cel01	 active
	 DBFS_DG_CD_02_ex01cel01  	 active
	 DBFS_DG_CD_03_ex01cel01  	 active
	 DBFS_DG_CD_04_ex01cel01  	 active
	 DBFS_DG_CD_05_ex01cel01  	 active
	 DBFS_DG_CD_06_ex01cel01  	 active
	 DBFS_DG_CD_07_ex01cel01  	 active
	 DBFS_DG_CD_08_ex01cel01  	 active
	 DBFS_DG_CD_09_ex01cel01  	 active
	 DBFS_DG_CD_10_ex01cel01  	 active
	 DBFS_DG_CD_11_ex01cel01  	 active
	 RECO_EX01_CD_00_ex01cel01	 active
	 RECO_EX01_CD_01_ex01cel01	 active
	 RECO_EX01_CD_02_ex01cel01	 active
	 RECO_EX01_CD_03_ex01cel01	 active
	 RECO_EX01_CD_04_ex01cel01	 active
	 RECO_EX01_CD_05_ex01cel01	 active
	 RECO_EX01_CD_06_ex01cel01	 active
	 RECO_EX01_CD_07_ex01cel01	 active
	 RECO_EX01_CD_08_ex01cel01	 active
	 RECO_EX01_CD_09_ex01cel01	 active
	 RECO_EX01_CD_10_ex01cel01	 active
	 RECO_EX01_CD_11_ex01cel01	 active
</pre></p>
<p>These match the names found in V$ASM_DISK. We can look in detail at an individual griddisk:</p>
<p><pre class="brush: bash;">
CellCLI&gt; list griddisk where name='DATA_EX01_CD_00_ex01cel01' detail

	 name:              	 DATA_EX01_CD_00_ex01cel01
	 availableTo:       	 
	 cellDisk:          	 CD_00_ex01cel01
	 comment:           	 
	 creationTime:      	 2011-06-08T13:33:48+01:00
	 diskType:          	 HardDisk
	 errorCount:        	 0
	 id:                	 550975c0-9d2e-47dd-85cd-d2550d394ec9
	 offset:            	 32M
	 size:              	 423G
	 status:            	 active
</pre></p>
<p>So the griddisk DATA_EX01_CD_00_ex01cel01 is created on the celldisk CD_00_ex01cel01 and we can check whether there are other griddisks on this celldisk:</p>
<p><pre class="brush: bash;">
CellCLI&gt; list griddisk where celldisk='CD_00_ex01cel01'

	 DATA_EX01_CD_00_ex01cel01	 active
	 RECO_EX01_CD_00_ex01cel01	 active
</pre></p>
<p>So there are two griddisks created on this celldisk. That is one important thing to remember: you can create multiple griddisks on top of the one celldisk, and it is the griddisks that are presented to ASM.</p>
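<p>To see how the griddisks spread over all the celldisks at once, the list griddisk output can be tallied. This is a sketch that only parses text in the format shown above; griddisks_per_celldisk is a hypothetical helper and the sample is a handful of names from the earlier listing.</p>
<p><pre class="brush: bash;">
# Hypothetical helper: count griddisks per celldisk from "list griddisk"
# style output on stdin, using the CD_nn part of each name.
griddisks_per_celldisk() {
    awk 'match($1, /_CD_[0-9]+_/) {
             count[substr($1, RSTART + 1, RLENGTH - 2)]++
         }
         END { for (cd in count) print cd, count[cd] }' | sort
}

# A few names from the earlier listing, in the same layout.
printf '\t %s\t active\n' \
    DATA_EX01_CD_00_ex01cel01 DATA_EX01_CD_02_ex01cel01 \
    DBFS_DG_CD_02_ex01cel01 \
    RECO_EX01_CD_00_ex01cel01 RECO_EX01_CD_02_ex01cel01 |
    griddisks_per_celldisk
</pre></p>
<p>For this sample CD_00 carries two griddisks and CD_02 carries three, since the DBFS_DG griddisks in the listing start at CD_02.</p>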
<p>Let&#8217;s look at this celldisk in more detail:</p>
<p><pre class="brush: bash;">
CellCLI&gt; list celldisk where name='CD_00_ex01cel01' detail

	 name:              	 CD_00_ex01cel01
	 comment:           	 
	 creationTime:      	 2011-06-08T13:32:08+01:00
	 deviceName:        	 /dev/sda
	 devicePartition:   	 /dev/sda3
	 diskType:          	 HardDisk
	 errorCount:        	 0
	 freeSpace:         	 0
	 id:                	 2dd77a53-53f1-49b5-98a0-d86a19140dc0
	 interleaving:      	 none
	 lun:               	 0_0
	 raidLevel:         	 0
	 size:              	 528.734375G
	 status:            	 normal
</pre><br />
So we see this celldisk is associated with a device /dev/sda and it also has a lun 0_0 associated with it. Because this is a system disk, the celldisk itself has been created on partition /dev/sda3; celldisks on non-system disks use the entire device rather than a partition.</p>
<p>Let&#8217;s have a look at the luns we have:</p>
<p><pre class="brush: bash;">
CellCLI&gt; list lun where diskType='HARDDISK' 

	 0_0 	 0_0 	 normal
	 0_1 	 0_1 	 normal
	 0_2 	 0_2 	 normal
	 0_3 	 0_3 	 normal
	 0_4 	 0_4 	 normal
	 0_5 	 0_5 	 normal
	 0_6 	 0_6 	 normal
	 0_7 	 0_7 	 normal
	 0_8 	 0_8 	 normal
	 0_9 	 0_9 	 normal
	 0_10	 0_10	 normal
	 0_11	 0_11	 normal
</pre></p>
<p>So here we are keeping the output to just the hard drives and ignoring the flashdisks. We see we have 12 luns, which is a 1-1 mapping to the number of physical drives we have within the cell. Let us look in more detail at a lun:</p>
<p><pre class="brush: bash;">
CellCLI&gt; list lun 0_0 detail

	 name:              	 0_0
	 cellDisk:          	 CD_00_ex01cel01
	 deviceName:        	 /dev/sda
	 diskType:          	 HardDisk
	 id:                	 0_0
	 isSystemLun:       	 TRUE
	 lunAutoCreate:     	 FALSE
	 lunSize:           	 557.861328125G
	 lunUID:            	 0_0
	 physicalDrives:    	 20:0
	 raidLevel:         	 0
	 lunWriteCacheMode: 	 &quot;WriteBack, ReadAheadNone, Direct, No Write Cache if Bad BBU&quot;
	 status:            	 normal
</pre></p>
<p>Focusing on lun 0_0, we see that a celldisk has been created upon it, CD_00_ex01cel01, matching what we saw earlier, and as noted it sits on device /dev/sda. We also see that the lun is associated with physical drive 20:0.</p>
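<p>It is also interesting to compare the lunSize here with the size reported for the celldisk earlier. The arithmetic below is just a sketch; lun_overhead_gb is a made-up helper, and my reading that the difference is the space held back for the other partitions on this system disk is an assumption rather than something cellcli states.</p>
<p><pre class="brush: bash;">
# Hypothetical helper: difference in G between a lun size and the size of
# the celldisk created on it.
lun_overhead_gb() {
    awk -v lun="$1" -v celldisk="$2" 'BEGIN { printf "%.3f\n", lun - celldisk }'
}

# lunSize and celldisk size as reported by cellcli above.
lun_overhead_gb 557.861328125 528.734375
</pre></p>
<p>This prints 29.127, so roughly 29G of the lun is not part of the celldisk.</p>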
<p>So now we can drill on down to the actual physical drive:</p>
<p><pre class="brush: bash;">
CellCLI&gt; list physicaldisk 20:0 detail

	 name:              	 20:0
	 deviceId:          	 19
	 diskType:          	 HardDisk
	 enclosureDeviceId: 	 20
	 errMediaCount:     	 0
	 errOtherCount:     	 0
	 foreignState:      	 false
	 luns:              	 0_0
	 makeModel:         	 &quot;SEAGATE ST360057SSUN600G&quot;
	 physicalFirmware:  	 0805
	 physicalInsertTime:	 2010-12-31T14:24:44+00:00
	 physicalInterface: 	 sas
	 physicalSerial:    	 E1P6N9
	 physicalSize:      	 558.9109999993816G
	 slotNumber:        	 0
	 status:            	 normal
</pre><br />
So here we finally see some details of the hard drive itself, including the fact that it is a Seagate drive. We can also link it back here to lun 0_0.</p>
<p>Finally just for fun, we can even use the MegaCli command to obtain info about the drive:</p>
<p><pre class="brush: bash;">
[root@cel01 ~]# /opt/MegaRAID/MegaCli/MegaCli64 PDList -a0
                                     
Adapter #0

Enclosure Device ID: 20
Slot Number: 0
Device Id: 19
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SAS
Raw Size: 558.911 GB [0x45dd2fb0 Sectors]
Non Coerced Size: 558.411 GB [0x45cd2fb0 Sectors]
Coerced Size: 557.861 GB [0x45bb9000 Sectors]
Firmware state: Online, Spun Up
SAS Address(0): 0x5000c50028c59721
SAS Address(1): 0x0
Connected Port Number: 0(path0) 
Inquiry Data: SEAGATE ST360057SSUN600G08051047E1P6N9          
FDE Capable: Not Capable
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None 
Device Speed: 6.0Gb/s 
Link Speed: 6.0Gb/s 
Media Type: Hard Disk Device
Drive:  Not Certified
.
.
</pre></p>
<p>We can see the Device Id of 19 matches both here and in the cellcli output, as do the enclosure and slot that make up the physical disk name 20:0.</p>
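<p>That matching can be done for every drive by reducing the PDList output to the enclosure:slot name that cellcli uses. A rough sketch, assuming the three fields appear in the order shown above; pd_ids is a hypothetical helper.</p>
<p><pre class="brush: bash;">
# Hypothetical helper: reduce MegaCli PDList output on stdin to one line per
# drive in the form "enclosure:slot deviceId=N".
pd_ids() {
    awk -F': *' '
        $1 == "Enclosure Device ID" { enc = $2 }
        $1 == "Slot Number"         { slot = $2 }
        $1 == "Device Id"           { print enc ":" slot, "deviceId=" $2 }'
}

# Sample lines copied from the output above.
printf '%s\n' 'Enclosure Device ID: 20' 'Slot Number: 0' 'Device Id: 19' | pd_ids
</pre></p>
<p>For the sample this prints 20:0 deviceId=19, which lines up with the name and deviceId from list physicaldisk.</p>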
<p>So we essentially have the following chain</p>
<blockquote><p><strong>Griddisk </strong> </p></blockquote>
<blockquote><p><strong>Celldisk </strong></p></blockquote>
<blockquote><p> <strong>Lun</strong></p></blockquote>
<blockquote><p><strong>Physical Drive</strong> </p></blockquote>
<p>The key point is that multiple griddisks can be presented to ASM, all created on top of a single celldisk that effectively maps to a single physical drive.</p>
]]></content:encoded>
			<wfw:commentRss>http://jarneil.wordpress.com/2011/11/20/the-griddisk-is-connected-to-the-celldisk/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/ddedfa1a8e6d735d710e404ff39e5eef?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">jarneil</media:title>
		</media:content>

		<media:content url="http://jarneil.files.wordpress.com/2011/11/skeleton.gif" medium="image">
			<media:title type="html">skeleton</media:title>
		</media:content>
	</item>
		<item>
		<title>Exadata Storage Cells also use hardware RAID? &#8211; Yep it&#8217;s true</title>
		<link>http://jarneil.wordpress.com/2011/11/13/exadata-storage-cells-also-use-hardware-raid-yep-its-true/</link>
		<comments>http://jarneil.wordpress.com/2011/11/13/exadata-storage-cells-also-use-hardware-raid-yep-its-true/#comments</comments>
		<pubDate>Sun, 13 Nov 2011 16:36:01 +0000</pubDate>
		<dc:creator>jarneil</dc:creator>
				<category><![CDATA[Exadata]]></category>

		<guid isPermaLink="false">http://jarneil.wordpress.com/?p=1844</guid>
		<description><![CDATA[We have already seen that the compute nodes in an Exadata system are using hardware RAID to offer increased availability and serviceability for the disk drives in them. What about the Storage Cells themselves? At this point you are quite possibly thinking I&#8217;ve gone a bit nuts. Everyone knows Exadata uses ASM to offer highly [...]]]></description>
			<content:encoded><![CDATA[<p>We have already <a href="http://jarneil.wordpress.com/2011/11/06/exadata-x2-2-compute-node-drive-protection/">seen</a> that the compute nodes in an Exadata system are using hardware RAID to offer increased availability and serviceability for the disk drives in them. What about the Storage Cells themselves?</p>
<p><a href="http://download.oracle.com/docs/cd/E19477-01/820-5830-13/app_ipb_X4275.html"><img src="http://jarneil.files.wordpress.com/2011/11/app_ipb_x4275-4.jpg?w=600" alt="" title="app_ipb_X4275-4"   class="alignright size-full wp-image-1847" /></a></p>
<p>At this point you are quite possibly thinking I&#8217;ve gone a bit nuts. Everyone knows Exadata uses ASM to offer highly resilient storage with all the benefits that ASM brings to the table, and everyone knows you don&#8217;t <em>need</em> hardware RAID to have these benefits.</p>
<p>So surely an Exadata Storage Cell does not use hardware RAID, right?</p>
<p><strong>Storage Cell Hardware</strong></p>
<p>So how can you tell you are working on a Storage Cell, as opposed to a compute node? Well, let&#8217;s check what dmidecode states:</p>
<p><pre class="brush: bash;">
[root@cel01 ~]# dmidecode -s system-product-name 

SUN FIRE X4275 SERVER
</pre></p>
<p>This is actually a V2 box, while the X2-2 box is different in a couple of ways:</p>
<p><pre class="brush: bash;">
[root@cel01 ~]# dmidecode -s system-product-name 

SUN FIRE X4270 M2 SERVER       
</pre></p>
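<p>If you hop between racks of different generations, the product name can be turned into something friendlier. A sketch only: cell_generation is a made-up helper and it knows nothing beyond the two product names shown above.</p>
<p><pre class="brush: bash;">
# Hypothetical helper: map the dmidecode product name of a storage cell to
# the Exadata generation, using only the two names shown above.
cell_generation() {
    name=$(echo "$1" | sed 's/ *$//')   # dmidecode may pad with trailing spaces
    case "$name" in
        'SUN FIRE X4275 SERVER')    echo 'V2 storage cell' ;;
        'SUN FIRE X4270 M2 SERVER') echo 'X2-2 storage cell' ;;
        *)                          echo 'unknown' ;;
    esac
}

# On a cell you would pass "$(dmidecode -s system-product-name)" instead.
cell_generation 'SUN FIRE X4270 M2 SERVER       '
</pre></p>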
<p>The <a href="http://www.oracle.com/us/products/servers-storage/servers/x86/sun-fire-x4270-m2-server-ds-079882.pdf">X4270 M2</a> can actually take 24 2.5&#8243; drives or 12 3.5&#8243; drives. Currently only the 12 disk option is available.</p>
<p>The schematic for this <a href="http://www.oracle.com/us/products/servers-storage/servers/x86/034677.pdf">server</a> is above; basically it is a 2U box that can take up to 12 drives. In Exadata these storage cells are running Linux:</p>
<p><pre class="brush: bash;">

[root@cel01 ~]# uname -r 
2.6.18-194.3.1.0.3.el5
</pre></p>
<p>However, they have our old friend the LSI MegaRAID controller installed:</p>
<p><pre class="brush: bash;">


[root@cel01 ~]# lsscsi -v

[0:2:0:0]    disk    LSI      MR9261-8i        2.12  /dev/sda
  dir: /sys/bus/scsi/devices/0:2:0:0  [/sys/devices/pci0000:00/0000:00:05.0/0000:13:00.0/host0/target0:2:0/0:2:0:0]
[0:2:1:0]    disk    LSI      MR9261-8i        2.12  /dev/sdb
  dir: /sys/bus/scsi/devices/0:2:1:0  [/sys/devices/pci0000:00/0000:00:05.0/0000:13:00.0/host0/target0:2:1/0:2:1:0]
[0:2:2:0]    disk    LSI      MR9261-8i        2.12  /dev/sdc
  dir: /sys/bus/scsi/devices/0:2:2:0  [/sys/devices/pci0000:00/0000:00:05.0/0000:13:00.0/host0/target0:2:2/0:2:2:0]
.
.
</pre></p>
<p>I&#8217;ve abbreviated the output to just the 3 drives, while the full output shows all 12 and the flash cards as well. Ok, so it&#8217;s pretty clear there is the LSI MegaRAID MR9261-8i card, just like the compute nodes. </p>
<p><strong>MegaRAID Configuration</strong></p>
<p>Let&#8217;s take a look at what our old friend is doing in the storage cell:</p>
<p><pre class="brush: bash;">

[root@cel01 ~]# /opt/MegaRAID/MegaCli/MegaCli64 -ShowSummary -aALL
                                    
System
        OS Name (IP Address)       : Not Recognized
        OS Version                 : Not Recognized
        Driver Version             : Not Recognized
        CLI Version                : 8.00.23

Hardware
        Controller
                 ProductName       : LSI MegaRAID SAS 9261-8i(Bus 0, Dev 0)
                 SAS Address       : 500605b00250ef70
                 FW Package Version: 12.12.0-0048
                 Status            : Optimal
        BBU
                 BBU Type          : Unknown
                 Status            : Healthy
        Enclosure
                 Product Id        : HYDE12         
                 Type              : SES
                 Status            : OK

                 Product Id        : SGPIO          
                 Type              : SGPIO
                 Status            : OK

        PD
                Connector          : Port 0 - 3&lt;Internal&gt;&lt;Encl Pos 0 &gt;: Slot 11
                Vendor Id          : SEAGATE
                Product Id         : ST360057SSUN600G
                State              : Online
                Disk Type          : SAS,Hard Disk Device
                Capacity           : 557.861 GB
                Power State        : Active

                Connector          : Port 0 - 3&lt;Internal&gt;&lt;Encl Pos 0 &gt;: Slot 10
                Vendor Id          : SEAGATE
                Product Id         : ST360057SSUN600G
                State              : Online
                Disk Type          : SAS,Hard Disk Device
                Capacity           : 557.861 GB
                Power State        : Active

                Connector          : Port 0 - 3&lt;Internal&gt;&lt;Encl Pos 0 &gt;: Slot 9
                Vendor Id          : SEAGATE
                Product Id         : ST360057SSUN600G
                State              : Online
                Disk Type          : SAS,Hard Disk Device
                Capacity           : 557.861 GB
                Power State        : Active
.
.
.

Storage

       Virtual Drives
                Virtual drive      : Target Id 0 ,VD name
                Size               : 557.861 GB
                State              : Optimal
                RAID Level         : 0

                Virtual drive      : Target Id 1 ,VD name
                Size               : 557.861 GB
                State              : Optimal
                RAID Level         : 0

                Virtual drive      : Target Id 2 ,VD name
                Size               : 557.861 GB
                State              : Optimal
                RAID Level         : 0
.
.
.
</pre></p>
<p>Again, output chopped after 3 drives for brevity. Basically we have 12 Physical Drives mapped to 12 Virtual Drives all with RAID level 0. But each RAID 0 stripe is only across a single drive.</p>
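<p>Rather than scrolling through all 12 entries, the RAID levels can be tallied from the summary. A minimal sketch, assuming the ShowSummary layout shown above; raid_levels is a hypothetical helper.</p>
<p><pre class="brush: bash;">
# Hypothetical helper: tally the "RAID Level" lines of MegaCli -ShowSummary
# output read from stdin.
raid_levels() {
    awk -F': *' '$1 ~ /^ *RAID Level *$/ { n[$2]++ }
                 END { for (l in n) print "RAID " l ": " n[l] " virtual drive(s)" }' | sort
}

# The three RAID Level lines from the chopped output above.
printf '                RAID Level         : %s\n' 0 0 0 | raid_levels
</pre></p>
<p>Against the full output of a cell like this one you would expect a single line reporting 12 virtual drives at RAID 0.</p>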
<p>You can even see that the LSI RAID controller has the same 512MB battery-backed cache:</p>
<p><pre class="brush: bash;">


[root@cel01 ~]# /opt/MegaRAID/MegaCli/MegaCli64 -cfgDsply -aALL                                    
==============================================================================
Adapter: 0
Product Name: LSI MegaRAID SAS 9261-8i
Memory: 512MB
BBU: Present
Serial No: SV03902812
==============================================================================
Number of DISK GROUPS: 12


DISK GROUP: 0
Number of Spans: 1
SPAN: 0
Span Reference: 0x00
Number of PDs: 1
Number of VDs: 1
Number of dedicated Hotspares: 0
Virtual Drive Information:
Virtual Drive: 0 (Target Id: 0)
Name                :
RAID Level          : Primary-0, Secondary-0, RAID Level Qualifier-0
Size                : 557.861 GB
State               : Optimal
Stripe Size         : 1.0 MB
Number Of Drives    : 1
Span Depth          : 1
Default Cache Policy: WriteBack, ReadAheadNone, Direct, No Write Cache if Bad BBU
Current Cache Policy: WriteBack, ReadAheadNone, Direct, No Write Cache if Bad BBU
Access Policy       : Read/Write
Disk Cache Policy   : Disabled
Encryption Type     : None
Physical Disk Information:
Physical Disk: 0
Enclosure Device ID: 20
Slot Number: 0
Device Id: 19
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SAS
Raw Size: 558.911 GB [0x45dd2fb0 Sectors]
Non Coerced Size: 558.411 GB [0x45cd2fb0 Sectors]
Coerced Size: 557.861 GB [0x45bb9000 Sectors]
Firmware state: Online, Spun Up
SAS Address(0): 0x5000c50028c59721
SAS Address(1): 0x0
Connected Port Number: 0(path0)
Inquiry Data: SEAGATE ST360057SSUN600G08051047E1P6N9         
FDE Capable: Not Capable
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None
Device Speed: 6.0Gb/s
Link Speed: 6.0Gb/s
Media Type: Hard Disk Device
Drive:  Not Certified
.
.
</pre></p>
<p>Output chopped after 1 drive, as it does not get any more interesting. You can see that again the virtual drives are in WriteBack mode, which means a write is acknowledged as soon as the data is in the controller cache rather than physically on disk &#8211; again you&#8217;ve got to make sure your batteries are good to give yourself some protection on power failure.</p>
<p>Of course RAID-0 will not give your devices any protection in the event of a hard disk failure, but it is still true to say that an Exadata Storage Cell is using hardware RAID.</p>
<p>Joel Goodman has written an <a href="http://dbatrain.wordpress.com/2011/10/14/maintaining-your-cells-image/comment-page-1/#comments">excellent account</a> of how two of the 12 drives, the system disks, are used to create the various O/S devices.</p>
<p>We can see the differences between a system drive and a non-system drive with the following:</p>
<p><pre class="brush: bash;">
[root@cel01 /]# fdisk -l

Disk /dev/sda: 598.9 GB, 598999040000 bytes 
255 heads, 63 sectors/track, 72824 cylinders 
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System 
/dev/sda1   *           1          15      120456   fd  Linux raid autodetect 
/dev/sda2              16          16        8032+  83  Linux 
/dev/sda3              17       69039   554427247+  83  Linux 
/dev/sda4           69040       72824    30403012+   f  W95 Ext'd (LBA) 
/dev/sda5           69040       70344    10482381   fd  Linux raid autodetect 
/dev/sda6           70345       71649    10482381   fd  Linux raid autodetect 
/dev/sda7           71650       71910     2096451   fd  Linux raid autodetect 
/dev/sda8           71911       72171     2096451   fd  Linux raid autodetect 
/dev/sda9           72172       72432     2096451   fd  Linux raid autodetect 
/dev/sda10          72433       72521      714861   fd  Linux raid autodetect 
/dev/sda11          72522       72824     2433816   fd  Linux raid autodetect
</pre></p>
<p>So that was one of the two system drives, while a non-system drive has the following:</p>
<p><pre class="brush: bash;">

[root@cel01 /]# fdisk -l /dev/sdc

Disk /dev/sdc: 598.9 GB, 598999040000 bytes 
255 heads, 63 sectors/track, 72824 cylinders 
Units = cylinders of 16065 * 512 = 8225280 bytes

Disk /dev/sdc doesn't contain a valid partition table

</pre></p>
<p>So from all these partitions on the system drives we then use <a href="http://linuxmanpages.com/man8/mdadm.8.php">mdadm</a> to create software RAID devices by combining partitions from each system drive:</p>
<p><pre class="brush: bash;">
[root@cel01 ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/md6              9.9G  4.4G  5.0G  47% /
tmpfs                  12G     0   12G   0% /dev/shm
/dev/md8              2.0G  618M  1.3G  33% /opt/oracle
/dev/md4              116M   52M   59M  47% /boot
/dev/md11             2.3G   88M  2.1G   4% /var/log/oracle
</pre></p>
<p>And we can see that these /dev/md devices are made up from the /dev/sd[a-b] devices:</p>
<p><pre class="brush: bash;">
[root@cel01 ~]# mdadm -Q -D /dev/md6
/dev/md6:
        Version : 0.90
  Creation Time : Fri Dec 31 14:08:30 2010
     Raid Level : raid1
     Array Size : 10482304 (10.00 GiB 10.73 GB)
  Used Dev Size : 10482304 (10.00 GiB 10.73 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 6
    Persistence : Superblock is persistent

    Update Time : Fri Nov 11 16:42:07 2011
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

           UUID : 87891e9e:e9bb6307:1e49e958:271166fe
         Events : 0.4

    Number   Major   Minor   RaidDevice State
       0       8        6        0      active sync   /dev/sda6
       1       8       22        1      active sync   /dev/sdb6
</pre></p>
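<p>To confirm that an md device really is built from a partition on each system drive, the member lines can be pulled out of the mdadm output. This is only a minimal sketch: md_members is a hypothetical helper name, and the sample lines fed to it are copied from the output above.</p>
<p><pre class="brush: bash;">
# Hypothetical helper: read "mdadm -Q -D" output on stdin and print
# the member partitions (the lines marked "active sync").
md_members() {
    awk '/active sync/ { print $NF }'
}

# Sample lines copied from the /dev/md6 output above.
printf '%s\n' \
    '    Number   Major   Minor   RaidDevice State' \
    '       0       8        6        0      active sync   /dev/sda6' \
    '       1       8       22        1      active sync   /dev/sdb6' | md_members
</pre></p>
<p>On a live cell the same helper would be fed directly, as in mdadm -Q -D /dev/md6 | md_members.</p>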
<p>So while the Exadata storage server does indeed have a hardware RAID capability the O/S on the storage cell is given higher availability by utilising mdadm software RAID. This allows the unused space on the system drives to still be used in the ASM diskgroups.</p>
]]></content:encoded>
			<wfw:commentRss>http://jarneil.wordpress.com/2011/11/13/exadata-storage-cells-also-use-hardware-raid-yep-its-true/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/ddedfa1a8e6d735d710e404ff39e5eef?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">jarneil</media:title>
		</media:content>

		<media:content url="http://jarneil.files.wordpress.com/2011/11/app_ipb_x4275-4.jpg" medium="image">
			<media:title type="html">app_ipb_X4275-4</media:title>
		</media:content>
	</item>
		<item>
		<title>Exadata X2-2 Compute Node Drive Protection</title>
		<link>http://jarneil.wordpress.com/2011/11/06/exadata-x2-2-compute-node-drive-protection/</link>
		<comments>http://jarneil.wordpress.com/2011/11/06/exadata-x2-2-compute-node-drive-protection/#comments</comments>
		<pubDate>Sun, 06 Nov 2011 18:19:55 +0000</pubDate>
		<dc:creator>jarneil</dc:creator>
				<category><![CDATA[Exadata]]></category>

		<guid isPermaLink="false">http://jarneil.wordpress.com/?p=1814</guid>
		<description><![CDATA[There are indeed a few differences between an Exadata V2 and an X2-2 box and one of the differences is the fact that the compute nodes build LVM logical volumes on top of the hardware enabled RAID device built using the same technology as V2. First off, how can we tell we are working on [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=jarneil.wordpress.com&amp;blog=1749502&amp;post=1814&amp;subd=jarneil&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>There are indeed a few differences between an Exadata V2 and an X2-2 box and one of the differences is the fact that the compute nodes build <a href="http://www.markus-gattol.name/ws/lvm.html">LVM</a> logical volumes on top of the hardware enabled RAID device built using the <a href="http://jarneil.wordpress.com/2011/10/30/exadata-uses-hardware-raid-you-bet-it-does/">same technology</a> as V2.</p>
<p>First off, how can we tell we are working on an X2 system as opposed to V2? You can tell this from the output of the following command:</p>
<p><pre class="brush: bash;">

[root@db01 ~]# dmidecode -s system-product-name

SUN FIRE X4170 M2 SERVER  
</pre></p>
<p>The key to the above is that the X2 uses the X4170 M2 version, while the V2 output will not have the M2 part, though it is still an X4170.</p>
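<p>If you need to make this check in a script, something like the following would do. It is a sketch under assumptions: classify_compute_node is a hypothetical helper name, and the V2 product string is assumed to be the same as the one above minus the M2 token.</p>
<p><pre class="brush: bash;">
# Hypothetical helper: classify a compute node from its product name.
# Assumption: a V2 reports an X4170 string without the "M2" token.
classify_compute_node() {
    case "$1" in
        *"X4170 M2"*) echo "X2-2" ;;
        *"X4170"*)    echo "V2" ;;
        *)            echo "unknown" ;;
    esac
}

# On a live node (as root) the argument would come from dmidecode:
#   classify_compute_node "$(dmidecode -s system-product-name)"
classify_compute_node "SUN FIRE X4170 M2 SERVER"
</pre></p>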
<p>The M2 still has the exact same LSI MegaRAID controller. There is a nice way of summarising your configuration:</p>
<p><pre class="brush: bash;">
[root@db01 ~]# /opt/MegaRAID/MegaCli/MegaCli64 -ShowSummary -aALL
                                     
System
        OS Name (IP Address)       : Not Recognized
        OS Version                 : Not Recognized
        Driver Version             : Not Recognized
        CLI Version                : 8.00.23

Hardware
        Controller
                 ProductName       : LSI MegaRAID SAS 9261-8i(Bus 0, Dev 0)
                 SAS Address       : 500605b00292ed90
                 FW Package Version: 12.12.0-0048
                 Status            : Optimal
        BBU
                 BBU Type          : Unknown
                 Status            : Healthy
        Enclosure
                 Product Id        : SGPIO           
                 Type              : SGPIO
                 Status            : OK

        PD 
                Connector          : Port 0 - 3&lt;Internal&gt;: Slot 3 
                Vendor Id          : HITACHI 
                Product Id         : H103030SCSUN300G
                State              : Dedicated HotSpare
                Disk Type          : SAS,Hard Disk Device
                Capacity           : 278.875 GB
                Power State        : Active

                Connector          : Port 0 - 3&lt;Internal&gt;: Slot 2 
                Vendor Id          : HITACHI 
                Product Id         : H103030SCSUN300G
                State              : Online
                Disk Type          : SAS,Hard Disk Device
                Capacity           : 278.875 GB
                Power State        : Active

                Connector          : Port 0 - 3&lt;Internal&gt;: Slot 0 
                Vendor Id          : HITACHI 
                Product Id         : H103030SCSUN300G
                State              : Online
                Disk Type          : SAS,Hard Disk Device
                Capacity           : 278.875 GB
                Power State        : Active

                Connector          : Port 0 - 3&lt;Internal&gt;: Slot 1 
                Vendor Id          : HITACHI 
                Product Id         : H103030SCSUN300G
                State              : Online
                Disk Type          : SAS,Hard Disk Device
                Capacity           : 278.875 GB
                Power State        : Active

Storage

       Virtual Drives
                Virtual drive      : Target Id 0 ,VD name DBSYS
                Size               : 557.75 GB
                State              : Optimal
                RAID Level         : 5 


Exit Code: 0x00
</pre></p>
<p>This nicely shows you the state of the Physical Drives (PD), shows you have a HotSpare, and also shows the Virtual Drive created on top of them along with the RAID level used. Remember, this is just the same as a V2. This summary command is not available with the 5.00 version of the MegaCli rpm, but it is available with version 8.00 of the rpm.</p>
<p>However a simple df -h command on X2 shows quite a difference from a V2:</p>
<p><pre class="brush: bash;">
[root@db01 ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/VGExaDb-LVDbSys1
                       30G  6.8G   22G  25% /
/dev/sda1             124M   36M   82M  31% /boot
/dev/mapper/VGExaDb-LVDbOra1
                       99G   20G   74G  22% /u01
tmpfs                  81G  615M   80G   1% /dev/shm
</pre></p>
<p>Now this is quite different. We know as usual the LSI RAID controller presents the one device /dev/sda:</p>
<p><pre class="brush: bash;">
[root@db01 ~]# fdisk -l

Disk /dev/sda: 598.8 GB, 598879502336 bytes
255 heads, 63 sectors/track, 72809 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1          16      128488+  83  Linux
/dev/sda2              17       72809   584709772+  8e  Linux LVM

</pre></p>
<p>So 2 partitions are created on top of this device: one is presented as the /boot partition, while the other is used for LVM.</p>
<p><pre class="brush: bash;">
[root@db01 ~]# pvdisplay 
  --- Physical volume ---
  PV Name               /dev/sda2
  VG Name               VGExaDb
  PV Size               557.62 GB / not usable 1.64 MB
  Allocatable           yes 
  PE Size (KByte)       4096
  Total PE              142751
  Free PE               103327
  Allocated PE          39424
  PV UUID               KYm3PX-4V3W-9T5L-QZBC-mW0n-jEDJ-QrJszz
</pre></p>
<p>This output ties in quite nicely with the fdisk output. One PV (Physical Volume) /dev/sda2, which has the VG (Volume Group) VGExaDb created on it.</p>
<p><pre class="brush: bash;">
[root@db01 ~]# lvdisplay
  --- Logical volume ---
  LV Name                /dev/VGExaDb/LVDbSys1
  VG Name                VGExaDb
  LV UUID                pJFeiN-Kqa4-VMpS-YYH0-BrY4-baKd-XivOZE
  LV Write Access        read/write
  LV Status              available
  # open                 1
  LV Size                30.00 GB
  Current LE             7680
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:0
   
  --- Logical volume ---
  LV Name                /dev/VGExaDb/LVDbSwap1
  VG Name                VGExaDb
  LV UUID                fnP95e-qCR9-Z2PT-faHR-HRI6-ReMX-btfeCU
  LV Write Access        read/write
  LV Status              available
  # open                 1
  LV Size                24.00 GB
  Current LE             6144
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:1
   
  --- Logical volume ---
  LV Name                /dev/VGExaDb/LVDbOra1
  VG Name                VGExaDb
  LV UUID                LbuHDM-GSeK-fTRJ-Mgti-xDqz-siK5-yWEj4g
  LV Write Access        read/write
  LV Status              available
  # open                 1
  LV Size                100.00 GB
  Current LE             25600
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:2
</pre></p>
<p>We can see the Volume Group VGExaDb has multiple Logical Volumes (LV) created upon it. These logical volumes are mapped to the /dev/mapper devices that you can see in the df output:</p>
<p><pre class="brush: bash;">
[root@db01 ~]# ls -ltar /dev/VGExaDb/LVDb*
lrwxrwxrwx 1 root root 28 Sep 14 11:57 /dev/VGExaDb/LVDbSys1 -&gt; /dev/mapper/VGExaDb-LVDbSys1
lrwxrwxrwx 1 root root 29 Sep 14 11:57 /dev/VGExaDb/LVDbSwap1 -&gt; /dev/mapper/VGExaDb-LVDbSwap1
lrwxrwxrwx 1 root root 28 Sep 14 11:57 /dev/VGExaDb/LVDbOra1 -&gt; /dev/mapper/VGExaDb-LVDbOra1
</pre></p>
<p>I&#8217;m not really sure what using LVM gives you in place of creating simple partitions, but there is a large chunk of unallocated space that could be used to extend the LVs if you wanted.</p>
<p>In fact you can tell how much free space you have with the following:</p>
<p><pre class="brush: bash;">
[root@db01 ~]# vgdisplay -s
  &quot;VGExaDb&quot; 557.62 GB [154.00 GB used / 403.62 GB free]
</pre></p>
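<p>These figures can be cross-checked against the extent counts in the pvdisplay output shown earlier, since each physical extent is 4096 KB. A small sketch of the arithmetic, where pe_to_gb is a hypothetical helper name and GB is taken to mean 1024 * 1024 KB, as the LVM tools here report it:</p>
<p><pre class="brush: bash;">
# Hypothetical helper: convert a physical extent count to GB,
# given the PE size in KB (GB here means 1024 * 1024 KB).
pe_to_gb() {
    awk -v pe="$1" -v kb="$2" 'BEGIN { printf "%.2f\n", pe * kb / 1024 / 1024 }'
}

pe_to_gb 103327 4096   # Free PE      prints 403.62
pe_to_gb 39424 4096    # Allocated PE prints 154.00
pe_to_gb 142751 4096   # Total PE     prints 557.62
</pre></p>
<p>The 154 GB allocated is also simply the sum of the three logical volumes: 30 GB + 24 GB + 100 GB.</p>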
<p>You can use lvextend to utilise this free space on the LVs, but this is not something that can be done while Oracle is still running on the node in question.</p>
<p>There you have it: the X2-2 compute nodes have LVM logical volumes created on top of the same old LSI MegaRAID hardware RAID device.</p>
]]></content:encoded>
			<wfw:commentRss>http://jarneil.wordpress.com/2011/11/06/exadata-x2-2-compute-node-drive-protection/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/ddedfa1a8e6d735d710e404ff39e5eef?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">jarneil</media:title>
		</media:content>
	</item>
		<item>
		<title>Exadata Uses Hardware RAID? &#8211; You Bet it Does!</title>
		<link>http://jarneil.wordpress.com/2011/10/30/exadata-uses-hardware-raid-you-bet-it-does/</link>
		<comments>http://jarneil.wordpress.com/2011/10/30/exadata-uses-hardware-raid-you-bet-it-does/#comments</comments>
		<pubDate>Sun, 30 Oct 2011 20:01:27 +0000</pubDate>
		<dc:creator>jarneil</dc:creator>
				<category><![CDATA[Exadata]]></category>

		<guid isPermaLink="false">http://jarneil.wordpress.com/?p=1776</guid>
		<description><![CDATA[I was reading an Excellent post on Exadata hardware by Frits Hoogland and I managed to get myself completely bamboozled by the fact that on an Exadata V2 all you see on the compute nodes is one solitary disk device, and a couple of partitions created from this. Now it does not take much to [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=jarneil.wordpress.com&amp;blog=1749502&amp;post=1776&amp;subd=jarneil&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p><a href="http://jarneil.files.wordpress.com/2011/10/mega.jpg"><img src="http://jarneil.files.wordpress.com/2011/10/mega.jpg?w=600" alt="" title="Mega.jpg"   class="alignright size-full wp-image-1777" /></a></p>
<p>I was reading an excellent <a href="http://fritshoogland.wordpress.com/2011/10/28/a-look-into-the-exadata-infrastructure/">post</a> on Exadata hardware by <a href="http://fritshoogland.wordpress.com/">Frits Hoogland</a> and I managed to get myself completely bamboozled by the fact that on an Exadata V2 all you see on the compute nodes is one solitary disk device, and a couple of partitions created from this.</p>
<p>Now it does not take much to work out that a setup created on a single disk would not really be a great selling point for a high end piece of kit, so how do the compute nodes have acceptable levels of disk drive availability and resilience? </p>
<p>Well the traditional answer to increase disk availability is to use hardware RAID and Exadata is no exception in reaching for this solution. Specifically the compute nodes are using the PCI express card pictured opposite as a hardware RAID controller. This is an LSI MegaRAID controller.</p>
<p>You can see that it is installed with the following command:</p>
<p><pre class="brush: bash;">

[root@db01 ~]# lspci -v
0d:00.0 RAID bus controller: LSI Logic / Symbios Logic Unknown device 0079 (rev 03) 
        Subsystem: LSI Logic / Symbios Logic Unknown device 9263 
        Flags: bus master, fast devsel, latency 0, IRQ 66
       .
       .
       . 
</pre></p>
<p>This was run on a V2 compute node and uses the <a href="http://linux.die.net/man/8/lspci">lspci</a> command to display all the attached PCI cards. I&#8217;ve filtered out everything but the LSI MegaRAID controller.</p>
<p>So we now know we have one of these in our Exadata system, but how do we know it is actually doing anything of use for us? I still see the following in /proc/partitions:</p>
<p><pre class="brush: bash;">
[root@db01 ~]# cat /proc/partitions 
major minor  #blocks  name

   8     0  285155328 sda 
   8     1   62910508 sda1 
   8     2   16771860 sda2 
   8     3  205471350 sda3


</pre></p>
<p>This obviously matches up with what we see when looking at df.</p>
<p><pre class="brush: bash;">
[root@db01 ~]# df -h 
Filesystem            Size  Used Avail Use% Mounted on 
/dev/sda1              60G   22G   35G  39% / 
/dev/sda3             193G  130G   54G  71% /u01
</pre></p>
<p>Let&#8217;s use <a href="http://linux.die.net/man/8/lsscsi">lsscsi</a> to check what SCSI devices are attached to the system:</p>
<p><pre class="brush: bash;">
[root@db01 ~]# lsscsi 
[0:2:0:0]    disk    LSI      MR9261-8i        2.12  /dev/sda
</pre></p>
<p>So the LSI module is managing the /dev/sda device, on which the 3 partitions are created.</p>
<p>Let&#8217;s use <a href="http://linux.die.net/man/8/dmesg">dmesg</a> to check how many drives were found at boot time:</p>
<p><pre class="brush: bash;">
[root@db01 ~]# dmesg 
           .
           .
SCSI subsystem initialized
megasas: 00.00.04.38 Fri. Jan. 14 12:24:32 EDT 2011
megasas: 0x1000:0x0079:0x1000:0x9263: bus 13:slot 0:func 0
GSI 21 sharing vector 0x42 and IRQ 21
ACPI: PCI Interrupt 0000:0d:00.0[A] -&gt; GSI 24 (level, low) -&gt; IRQ 66
PCI: Setting latency timer of device 0000:0d:00.0 to 64

 gen2: instance-&gt;base_addr = df2fc000&lt;6&gt;megasas: FW now in Ready state
megasas: cpx is not supported.
megasas_init_mfi: fw_support_ieee=0&lt;6&gt;scsi0 : LSI SAS based MegaRAID driver
  Vendor: HITACHI   Model: H103030SCSUN300G  Rev: A2A8
  Type:   Direct-Access                      ANSI SCSI revision: 06
  Vendor: HITACHI   Model: H103030SCSUN300G  Rev: A2A8
  Type:   Direct-Access                      ANSI SCSI revision: 06
  Vendor: HITACHI   Model: H103030SCSUN300G  Rev: A2A8
  Type:   Direct-Access                      ANSI SCSI revision: 06
  Vendor: HITACHI   Model: H103030SCSUN300G  Rev: A2A8
  Type:   Direct-Access                      ANSI SCSI revision: 06
  Vendor: LSI       Model: MR9261-8i         Rev: 2.12
  Type:   Direct-Access                      ANSI SCSI revision: 05
SCSI device sda: 1169686528 512-byte hdwr sectors (598880 MB)
sda: Write Protect is off
sda: Mode Sense: 1f 00 00 08
SCSI device sda: drive cache: write back, no read (daft)
SCSI device sda: 570310656 512-byte hdwr sectors (291999 MB)
sda: Write Protect is off
sda: Mode Sense: 1f 00 00 08
SCSI device sda: drive cache: write back, no read (daft)
 sda: sda1 sda2 sda3
sd 0:2:0:0: Attached scsi disk sda
         .
         .
</pre></p>
<p>I&#8217;ve edited the output from this for brevity. So dmesg seems to be reporting that 4 hard drives and the LSI RAID controller have been found.</p>
<p>We can use the MegaCli64 command to interrogate the LSI RAID controller. MegaCli64 accepts a huge number of options; here is a view of the Logical Device my controller has control of:</p>
<p><pre class="brush: bash;">
[root@db01 ~]# /opt/MegaRAID/MegaCli/MegaCli64 -LDInfo -LALL -aALL


Adapter 0 -- Virtual Drive Information: 
Virtual Disk: 0 (Target Id: 0) 
Name: 
RAID Level: Primary-5, Secondary-0, RAID Level Qualifier-3 
Size:271.945 GB 
State: Optimal 
Stripe Size: 1.0 MB 
Number Of Drives:3 
Span Depth:1 
Default Cache Policy: WriteBack, ReadAheadNone, Direct, No Write Cache if Bad BBU 
Current Cache Policy: WriteThrough, ReadAheadNone, Direct, No Write Cache if Bad BBU 
Access Policy: Read/Write 
Disk Cache Policy: Disabled 
Encryption Type: None 
Number of Dedicated Hot Spares: 1 
    0 : EnclId - 252 SlotId - 3

Exit Code: 0x00
</pre></p>
<p>You can see from this that the controller is aware of 4 Physical Drives: 3 in the virtual drive and 1 marked as a Dedicated HotSpare. These drives match up quite nicely of course with what dmesg says was found at boot time.</p>
<p>The drives are in a RAID5 set. You can also see some nice information about the configuration of the controller with the CfgDsply MegaCli64 option:</p>
<p><pre class="brush: bash;">

[root@db01~]# /opt/MegaRAID/MegaCli/MegaCli64 -CfgDsply -aALL

============================================================================== 
Adapter: 0 
Product Name: LSI MegaRAID SAS 9261-8i 
Memory: 512MB 
BBU: Present 
Serial No: SV04004860 
============================================================================== 
Number of DISK GROUPS: 1

DISK GROUPS: 0 
Number of Spans: 1 
SPAN: 0 
Span Reference: 0x00 
Number of PDs: 3 
Number of VDs: 1 
Number of dedicated Hotspares: 1 
Virtual Disk Information: 
Virtual Disk: 0 (Target Id: 0) 
Name: 
RAID Level: Primary-5, Secondary-0, RAID Level Qualifier-3 
Size:271.945 GB 
State: Optimal 
Stripe Size: 1.0 MB 
Number Of Drives:3 
Span Depth:1 
Default Cache Policy: WriteBack, ReadAheadNone, Direct, No Write Cache if Bad BBU 
Current Cache Policy: WriteThrough, ReadAheadNone, Direct, No Write Cache if Bad BBU 
Access Policy: Read/Write 
Disk Cache Policy: Disabled 
Encryption Type: None 
Physical Disk Information: 
Physical Disk: 0 

</pre></p>
<p>I&#8217;ve edited for brevity, but this output then goes on to show some information on the Physical Drives. This output is nice in that you can see how much memory the controller has; you also see the number of physical drives in the RAID set and whether there is a hot spare.</p>
<p>It is also worth noting the Cache Policy that the controller is using. WriteBack means the controller will acknowledge the write as soon as the data is resident in the cache of the controller, rather than waiting for it to be written out to disk. Note that in the output above the default policy is WriteBack but the current policy is WriteThrough; given the No Write Cache if Bad BBU setting, that usually indicates the battery was not in a good state (for example charging or in a learn cycle) when the command was run.</p>
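<p>A quick way to spot a virtual drive whose current cache policy differs from its default is to compare the two lines of the LDInfo output. This is only a sketch: cache_policy_check is a hypothetical helper name, and it assumes the field labels are exactly as shown above.</p>
<p><pre class="brush: bash;">
# Hypothetical helper: read "MegaCli64 -LDInfo -LALL -aALL" output on stdin
# and compare the first element of the default and current cache policies.
cache_policy_check() {
    awk -F': *' '
        /^Default Cache Policy/ { split($2, d, ","); def = d[1] }
        /^Current Cache Policy/ {
            split($2, c, ","); cur = c[1]
            if (cur == def) print "OK: " cur
            else            print "WARNING: default " def ", current " cur
        }'
}

# Sample lines copied from the output above.
printf '%s\n' \
    'Default Cache Policy: WriteBack, ReadAheadNone, Direct, No Write Cache if Bad BBU' \
    'Current Cache Policy: WriteThrough, ReadAheadNone, Direct, No Write Cache if Bad BBU' | cache_policy_check
</pre></p>
<p>On a live node the helper would be fed from the controller itself, as in /opt/MegaRAID/MegaCli/MegaCli64 -LDInfo -LALL -aALL | cache_policy_check.</p>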
<p>I think you can see that there is no need to be bamboozled and that Exadata really does use hardware RAID to increase the availability of the drives within the compute nodes.</p>
]]></content:encoded>
			<wfw:commentRss>http://jarneil.wordpress.com/2011/10/30/exadata-uses-hardware-raid-you-bet-it-does/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/ddedfa1a8e6d735d710e404ff39e5eef?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">jarneil</media:title>
		</media:content>

		<media:content url="http://jarneil.files.wordpress.com/2011/10/mega.jpg" medium="image">
			<media:title type="html">Mega.jpg</media:title>
		</media:content>
	</item>
		<item>
		<title>Exadata Storage Cells and ASM Mirroring and Disk partnering</title>
		<link>http://jarneil.wordpress.com/2011/10/26/exadata-storage-cells-and-asm-mirroring-and-disk-partnering/</link>
		<comments>http://jarneil.wordpress.com/2011/10/26/exadata-storage-cells-and-asm-mirroring-and-disk-partnering/#comments</comments>
		<pubDate>Wed, 26 Oct 2011 14:37:00 +0000</pubDate>
		<dc:creator>jarneil</dc:creator>
				<category><![CDATA[ASM]]></category>
		<category><![CDATA[Exadata]]></category>

		<guid isPermaLink="false">http://jarneil.wordpress.com/?p=1753</guid>
		<description><![CDATA[I think it is true to say that the majority of people using ASM are using ASM external redundancy. There is a lot less experience out there in using ASM mirroring. Why would you buy a big, expensive storage array and then not use all its features? Hardware RAID being a primary example of one [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=jarneil.wordpress.com&amp;blog=1749502&amp;post=1753&amp;subd=jarneil&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>I think it is true to say that the majority of people using ASM are using ASM external redundancy. There is a lot less experience out there in using ASM mirroring. Why would you buy a big, expensive storage array and then not use all its features? Hardware RAID being a primary example of one of those features you are paying for.</p>
<p>But then along comes Exadata. It too is big and expensive, but the one thing it does not give you is hardware RAID protection &#8211; data protection is done through ASM mirroring. The same is true of the newly minted <a href="http://www.oracle.com/us/products/database/database-appliance/index.html">Oracle Database Appliance</a>.</p>
<p>You&#8217;ll be aware the Exadata storage comes in the form of so-called storage cells, each filled with 12 drives. This gives not only a minimum of 36 drives with the smallest quarter rack, but also a minimum of 3 storage cells.</p>
<p><strong>ASM Mirroring</strong></p>
<p>I have <a href="http://jarneil.wordpress.com/2008/11/19/asm-mirroring-2/">discussed</a> ASM extent mirroring in the past, but to recap: in a normal redundancy environment ASM writes primary and secondary extents. These extents are written to different <a href="http://download.oracle.com/docs/cd/E11882_01/server.112/e18951/asmcon.htm#OSTMG94058">failure groups</a>. Drives sharing the same components should be in the same failure group, and the primary and secondary copies of an extent should reside in different failure groups.</p>
<p>The idea of course is that, should the component that all disks in a failure group share actually fail, all data is still accessible via the mirrored extents, which are guaranteed to be in a different failure group that does not depend on the failed component.</p>
<p>In Exadata the obvious boundary for failure groups is the storage cell. Your data still needs to be accessible in its entirety should you lose an entire cell &#8211; therefore primary and secondary extents must be stored on different cells. So drives in the same cell are in the same failure group, and drives from different cells are in different failure groups.</p>
<p><strong>Disk Partnering</strong></p>
<p>Each disk in a failure group maintains a set of so-called partner disks. Each ASM disk has up to 10 partner disks that secondary extents for this disk can be written to. However, on both 1/4 and 1/2 rack Exadata boxes running 11.2.0.2 I have only ever seen 8 partner disks being used. The partner disks for a disk must of course reside in a different failure group.</p>
<p>This can be seen below, taken from a 1/4 rack V2 Exadata box. Note I&#8217;m focusing on disk 0, which is on cell01:</p>
<p><pre class="brush: bash;">

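-- List the partner disks of disk 0 in disk group 1
select DISK, NUMBER_KFDPARTNER, NAME, FAILGROUP
from V$ASM_DISK A, X$KFDPARTNER B
where DISK = 0                             -- the disk whose partners we want
and GRP=1                                  -- disk group 1
and B.NUMBER_KFDPARTNER = A.DISK_NUMBER    -- look up each partner's name and failure group
and name like 'DATA%'                      -- only match V$ASM_DISK rows of the DATA disk group
order by 2 asc
/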
  
      DISK NUMBER_KFDPARTNER      NAME                                 FAILGROUP
---------- ----------------- ------------------------------ ------------------------------
      0            16           DATA_EX01_CD_04_EX01CEL02              EX01CEL02
      0            18           DATA_EX01_CD_06_EX01CEL02              EX01CEL02
      0            20           DATA_EX01_CD_08_EX01CEL02              EX01CEL02
      0            21           DATA_EX01_CD_09_EX01CEL02              EX01CEL02
      0            28           DATA_EX01_CD_04_EX01CEL03              EX01CEL03
      0            29           DATA_EX01_CD_05_EX01CEL03              EX01CEL03
      0            30           DATA_EX01_CD_06_EX01CEL03              EX01CEL03
      0            34           DATA_EX01_CD_10_EX01CEL03              EX01CEL03

</pre></p>
<p>Here I have chosen to focus on the first disk, disk 0 in diskgroup 1, and find all of its partner disks. This disk is in cell 1. Cell 1 has disks 0 &#8211; 11, cell 2 has 12 &#8211; 23, and cell 3 has 24 &#8211; 35.</p>
<p>You can see here that the partner disks for this disk are spread evenly over cell 2 (16, 18, 20, 21) and cell 3 (28, 29, 30, 34).</p>
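<p>If you want to check the spread without counting by eye, a small shell sketch like the one below tallies partners per failure group. The function name <code>count_partners_per_failgroup</code> is my own invention for illustration, and it assumes you feed it rows in the four column layout of the listing above (DISK, NUMBER_KFDPARTNER, NAME, FAILGROUP).</p>

```shell
# Count partner disks per failure group.
# Reads rows of the form: DISK NUMBER_KFDPARTNER NAME FAILGROUP
# Header and separator lines are skipped because their first field is not a number.
count_partners_per_failgroup() {
    awk 'NF == 4 { if ($1 ~ /^[0-9]+$/) count[$4]++ }
         END { for (fg in count) print fg, count[fg] }' | sort
}

# Data rows copied from the listing for disk 0.
partners_of_disk0='0 16 DATA_EX01_CD_04_EX01CEL02 EX01CEL02
0 18 DATA_EX01_CD_06_EX01CEL02 EX01CEL02
0 20 DATA_EX01_CD_08_EX01CEL02 EX01CEL02
0 21 DATA_EX01_CD_09_EX01CEL02 EX01CEL02
0 28 DATA_EX01_CD_04_EX01CEL03 EX01CEL03
0 29 DATA_EX01_CD_05_EX01CEL03 EX01CEL03
0 30 DATA_EX01_CD_06_EX01CEL03 EX01CEL03
0 34 DATA_EX01_CD_10_EX01CEL03 EX01CEL03'

printf '%s\n' "$partners_of_disk0" | count_partners_per_failgroup
```

<p>For the rows above this reports 4 partners in EX01CEL02 and 4 in EX01CEL03, and nothing for EX01CEL01, the cell that disk 0 itself lives in.</p>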
<p>I was quite interested in seeing whether there was any overlap with the set of partners chosen by one of disk 0&#8217;s own partners. I looked at the partners of the first partner of disk 0, disk 16:</p>
<p><pre class="brush: bash;">

      DISK NUMBER_KFDPARTNER      NAME                            FAILGROUP
---------- ----------------- ------------------------------ ------------------------------
     16             0           DATA_EX01_CD_00_EX01CEL01         EX01CEL01
     16             2           DATA_EX01_CD_02_EX01CEL01         EX01CEL01
     16             6           DATA_EX01_CD_06_EX01CEL01         EX01CEL01
     16            11           DATA_EX01_CD_11_EX01CEL01         EX01CEL01
     16            27           DATA_EX01_CD_03_EX01CEL03         EX01CEL03
     16            29           DATA_EX01_CD_05_EX01CEL03         EX01CEL03
     16            32           DATA_EX01_CD_08_EX01CEL03         EX01CEL03
     16            35           DATA_EX01_CD_11_EX01CEL03         EX01CEL03

</pre></p>
<p>As you can see, they only have disk 29 in common. Of course disk 16, being in cell 2, could not have chosen any drives from its own cell, but in cell 3 it has only 1 partner in common with disk 0 from cell 1.</p>
<p>At least you can see that the disk partnering algorithm on Exadata ensures partner drives are chosen from different cells, guaranteeing your data will survive the unavailability of a cell.</p>
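<p>The overlap comparison is easy to repeat for other pairs of disks. The following is only a sketch: <code>common_partners</code> is a hypothetical helper name, and it assumes each partner list is passed as a space separated string of disk numbers with no number repeated within a list.</p>

```shell
# Print the disk numbers that appear in both partner lists.
# Each argument is a space separated list of disk numbers.
common_partners() {
    # $1 and $2 are deliberately unquoted so each number becomes its own line;
    # after sorting, a number present in both lists shows up twice in a row.
    printf '%s\n' $1 $2 | sort -n | uniq -d
}

# Partner lists taken from the two listings in this post.
partners_of_0='16 18 20 21 28 29 30 34'
partners_of_16='0 2 6 11 27 29 32 35'

common_partners "$partners_of_0" "$partners_of_16"
```

<p>With the two lists from this post the only number printed is 29.</p>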
]]></content:encoded>
			<wfw:commentRss>http://jarneil.wordpress.com/2011/10/26/exadata-storage-cells-and-asm-mirroring-and-disk-partnering/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/ddedfa1a8e6d735d710e404ff39e5eef?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">jarneil</media:title>
		</media:content>
	</item>
		<item>
		<title>Replacing a failed Exadata Storage Server System Drive</title>
		<link>http://jarneil.wordpress.com/2011/10/21/replacing-a-failed-exadata-storage-server-system-drive/</link>
		<comments>http://jarneil.wordpress.com/2011/10/21/replacing-a-failed-exadata-storage-server-system-drive/#comments</comments>
		<pubDate>Fri, 21 Oct 2011 09:27:59 +0000</pubDate>
		<dc:creator>jarneil</dc:creator>
				<category><![CDATA[Exadata]]></category>

		<guid isPermaLink="false">http://jarneil.wordpress.com/?p=1734</guid>
		<description><![CDATA[I was involved with the procedure to replace a failed system drive in an Exadata Storage Server. The actual physical procedure of removing the old failed drive and inserting the new drive is conducted by an Oracle engineer, but I was connected to the box while this process occurred and monitored the situation afterwards. It [...]]]></description>
			<content:encoded><![CDATA[<p>I was involved with the procedure to replace a failed system drive in an Exadata Storage Server. The actual physical procedure of removing the old failed drive and inserting the new drive is conducted by an Oracle engineer, but I was connected to the box while this process occurred and monitored the situation afterwards.</p>
<p>It is worthwhile first recapping what the storage cells look like. The V2 box I was working on has 12 600GB SAS drives.</p>
<p>The O/S for storage cells is stored on the first 2 drives. These are called the system drives. A small portion of these drives is carved out for storing the current and backup system images. Joel Goodman has <a href="http://dbatrain.wordpress.com/2011/10/14/maintaining-your-cells-image/">written</a> up a fantastic account of what the various partitions on the 2 system drives are used for. The rest of the space on these drives is presented to ASM as griddisks.</p>
<p>The various partitions on the system drives in a cell form RAID mirror pairs using the Linux software RAID tool <a href="http://en.wikipedia.org/wiki/Mdadm">mdadm</a>. When replacing a system drive it is important to ensure these mirrored pairs get resynchronised or you will compromise the availability of your cell.</p>
<p>Replacing a system drive therefore involves a bit more risk than replacing any other storage cell drive, and there are more things to check to ensure your cell comes back to full health.</p>
<p>I should point out that the drive had not completely failed, but had gone into predictive failure. The first thing to look at was the output of the CellCLI command list physicaldisk:</p>
<p><pre class="brush: bash;">
CellCLI&gt; list physicaldisk 20:0 detail

         name:                   20:0 
         deviceId:               8 
         diskType:               HardDisk 
         enclosureDeviceId:      20 
         errMediaCount:          102 
         errOtherCount:          0 
         foreignState:           false 
         luns:                   0_0 
         makeModel:              &quot;SEAGATE ST360057SSUN600G&quot; 
         physicalFirmware:       0805 
         physicalInsertTime:     2010-06-15T01:33:16+01:00 
         physicalInterface:      sas 
         physicalSerial:         E0D6MA 
         physicalSize:           558.9109999993816G 
         slotNumber:             0 
         status:                 predictive failure
</pre></p>
<p>So you can see the status has gone to predictive failure. After the engineer has swapped the drive, you can use <a href="http://linux.die.net/man/8/mdadm">mdadm</a> to check on the health of your partitions:</p>
<p><pre class="brush: bash;">

[root@cel06 ~]# mdadm -Q --detail /dev/md6 
/dev/md6: 
        Version : 0.90 
  Creation Time : Mon Jun 14 16:59:44 2010 
     Raid Level : raid1 
     Array Size : 10482304 (10.00 GiB 10.73 GB) 
  Used Dev Size : 10482304 (10.00 GiB 10.73 GB) 
   Raid Devices : 2 
  Total Devices : 3 
Preferred Minor : 6 
    Persistence : Superblock is persistent
    Update Time : Mon Oct 10 14:11:36 2011 
          State : clean, degraded, recovering 
 Active Devices : 1 
Working Devices : 2 
 Failed Devices : 1 
  Spare Devices : 1
 Rebuild Status : 27% complete
           UUID : e5c9de09:8950bae7:3df3da71:745e4b3c 
         Events : 0.1426
    Number   Major   Minor   RaidDevice State 
       2      65      214        0      spare rebuilding   /dev/sdad6 
       1       8       22        1      active sync   /dev/sdb6
       3       8        6        -      faulty spare
</pre></p>
<p>So you can see that the RAID partition is being rebuilt, and eventually the rebuild did complete. But even some time later, mdadm was still complaining about the faulty spare:</p>
<p><pre class="brush: bash;">
[root@cel06 ~]# mdadm -Q --detail /dev/md6 
/dev/md6: 
        Version : 0.90 
  Creation Time : Mon Jun 14 16:59:44 2010 
     Raid Level : raid1 
     Array Size : 10482304 (10.00 GiB 10.73 GB) 
  Used Dev Size : 10482304 (10.00 GiB 10.73 GB) 
   Raid Devices : 2 
  Total Devices : 3 
Preferred Minor : 6 
    Persistence : Superblock is persistent

    Update Time : Wed Oct 19 14:26:00 2011 
          State : clean 
 Active Devices : 2 
Working Devices : 2 
 Failed Devices : 1 
  Spare Devices : 0

           UUID : e5c9de09:8950bae7:3df3da71:745e4b3c 
         Events : 0.5338

    Number   Major   Minor   RaidDevice State 
       0      65      214        0      active sync   /dev/sdad6 
       1       8       22        1      active sync   /dev/sdb6

       2       8        6        -      faulty spare

</pre></p>
<p>So we are still stuck with the faulty spare output. It looks like it may be a bug in some versions of the storage cell software. To clear it up do the following:</p>
<p><pre class="brush: bash;">
[root@cel06 ~]# mdadm --manage /dev/md6 --remove failed
mdadm: hot removed 8:6
</pre></p>
<p>Then wonderfully, the output from the query command is all clear:</p>
<p><pre class="brush: bash;">
[root@cel06 ~]# mdadm -Q --detail /dev/md6
/dev/md6:
        Version : 0.90
  Creation Time : Mon Jun 14 16:59:44 2010
     Raid Level : raid1
     Array Size : 10482304 (10.00 GiB 10.73 GB)
  Used Dev Size : 10482304 (10.00 GiB 10.73 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 6
    Persistence : Superblock is persistent

    Update Time : Fri Oct 21 10:13:35 2011
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

           UUID : e5c9de09:8950bae7:3df3da71:745e4b3c
         Events : 0.5338

    Number   Major   Minor   RaidDevice State
       0      65      214        0      active sync   /dev/sdad6
       1       8       22        1      active sync   /dev/sdb6
</pre></p>
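<p>Since there are several mirrored partitions on the system drives, it can be handy to boil each report down to one line. What follows is a rough sketch, not anything Oracle supplies: <code>summarise_md_detail</code> is a name I made up, it simply parses the text of an <code>mdadm -Q --detail</code> report like the ones above, and the <code>/dev/md[0-9]*</code> pattern in the commented usage is an assumption about how the devices are named on your cell.</p>

```shell
# Summarise one mdadm -Q --detail report read from stdin.
# Prints the array state, the failed device count, and whether a
# faulty entry is still listed in the device table.
summarise_md_detail() {
    awk -F' : ' '
        /^ *State :/          { state = $2 }
        /^ *Failed Devices :/ { failed = $2 + 0 }
        /faulty/              { faulty = 1 }
        END {
            gsub(/ +$/, "", state)   # values in the report may carry trailing spaces
            printf "state=%s failed=%d faulty_entry=%s\n", state, failed, (faulty ? "yes" : "no")
        }'
}

# Possible usage as root on the cell (device pattern is an assumption):
# for md in /dev/md[0-9]*; do
#     printf '%s: ' "$md"
#     mdadm -Q --detail "$md" | summarise_md_detail
# done
```

<p>A line showing a clean state together with <code>faulty_entry=yes</code> corresponds to the leftover faulty spare seen in this post, which is the case where <code>mdadm --manage</code> with <code>--remove failed</code> cleared things up for me.</p>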
]]></content:encoded>
			<wfw:commentRss>http://jarneil.wordpress.com/2011/10/21/replacing-a-failed-exadata-storage-server-system-drive/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/ddedfa1a8e6d735d710e404ff39e5eef?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">jarneil</media:title>
		</media:content>
	</item>
	</channel>
</rss>
