xaminmo: Josh 2016 (Default)
This is happiness...

tsminst1@tsm:~$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 16.04.3 LTS
Release: 16.04
Codename: xenial

/bin/bash# for i in /dev/sd? ; do smartctl -a $i ; done | grep 'Device Model'
Device Model: Samsung SSD 850 EVO 250GB
xaminmo: Josh 2016 (Default)
Upgrading TSM server from Q9650 Core 2 Quad 3.0GHz, 8GB DDR2 on Win 2008R2.

New system is HP Z600, two-socket, 6-core 2.66GHz Xeon X5650 and 48GB of RAM. Wattage is the same per socket, but two sockets now. 3x the cores, 4x the performance.

SSDs for DB and Log are also moving to EVO 850
xaminmo: Josh 2016 (Default)
I'm sure this will all change in a week.

You can force-reinstall Download Director from here:

Uninstall info, my whining, etc is at my reference site:
xaminmo: Josh 2016 (Default)
This changes periodically, but for today, here is what I would do.

My PowerHA selection process would be:
* 7.1.3 SP06 if I needed to deploy quickly, because I have build docs for that.
* 7.1.4 doesn't exist, but if it came out before deployment, I would consider it. Whichever was a newer
Read more... )
xaminmo: Josh 2016 (Default)
QEMU on Windows will run ppc64 and ppc64le emulation.
It emulates the same as what PowerKVM on an S812L would provide.
It's kind of slow because there is no KVM module, AND Intel vs PPC,
AND emulator mode is single-core/proc/thread.

You can get Windows installer
Read more... )
xaminmo: (Josh 2014)
Because Facebook notes editor has zero formatting functionality in the new version.
PowerHA QuickBuild - Sanitized
xaminmo: (Josh 2014)
Anyone in the ADSM/TSM/Spectrum Protect land, could you take a moment to add a vote to this RFE?

It's a request for the client to support more than 1 producer thread per filesystem, and more than 4 producer threads per DSMC instance. It's dated code that doesn't take into account high performance disk subsystems.


xaminmo: Josh 2016 (Default)
It depends on if it's small files or not. I normally have a small-file pool, which is the DIRMC, VMTLMC, and TOCDESTINATION. The offsite for this is kept reclaimed down to 1 tape, and I try to restore that primary pool first.

For large file, TDP, VM and image full backups, they can ...
xaminmo: Josh 2016 (Default)
Because sometimes it's hard to find, especially when I'm on limited bandwidth, this is for my easy reference.
xaminmo: Josh 2016 (Default)
BACKUP STGPOOL for dedupe runs about 6x slower than direct tape to tape.

1) First, the database has a huge number of random reads for dedupe rehydration.
Tack on any Dedup Deletion activity (SHOW DEDUPDELETEINFO) and anything else that's competing for DB IOPS.
FIX: Put the database on SSD or RAM backed storage.
NOTE: SSD stats are usually lies. Sustained performance is 4,500-12,000 IOPS each, divided by 2 for RAID-1/10, or by 3.5 for RAID-5/6.
FIX: increase server memory and provide more for DB2 bufferpools.
NOTE: This might require manually changing bufferpools, limiting filesystem cache, etc.
FIX: Large amounts of cache for the database containers
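
A rough sketch of the SSD arithmetic from the NOTE above. It assumes the divisor applies to the summed IOPS of all SSDs in the array, which is one reading of the rule of thumb. The function name and RAID labels are made up for illustration.

```python
# Rule-of-thumb sustained IOPS for an SSD array.
# Divisors (2 for RAID-1/10, 3.5 for RAID-5/6) come from the note above;
# applying them to the summed per-SSD figure is an assumption.
RAID_DIVISOR = {"raid1": 2.0, "raid10": 2.0, "raid5": 3.5, "raid6": 3.5}

def effective_iops(per_ssd_iops, ssd_count, raid_level):
    """Estimate array IOPS: raw sum divided by the RAID penalty."""
    return per_ssd_iops * ssd_count / RAID_DIVISOR[raid_level]

low = effective_iops(4500, 4, "raid10")    # pessimistic sustained figure
high = effective_iops(12000, 4, "raid10")  # optimistic sustained figure
print(f"4 SSDs in RAID-10: {low:.0f} to {high:.0f} IOPS")
```

Comparing that range against the DB read rate seen during BACKUP STGPOOL gives a first guess at whether the database disks are the bottleneck.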

2) Next, the file class, while sequential, still has a large number of random read IOPS.
TSM Server has no read ahead for this. It reads the chunks in order, rather than requesting a huge buffer full of chunks.
As such, streaming speed will be limited by DB latency, file-class latency, and actual read IO times.
FIX: Reduce the latency for your file class
FIX: Reduce the latency for your database
FIX: Don't do anything else during BACKUP STGPOOL.
FIX: Run your EXPIRE INVENTORY and IDENTIFY DUPLICATE after, not before.
FIX: Submit a Design Change Request (DCR) for larger chunk read cache to be used for BACKUP STGPOOL.
FIX: Submit a Design Change Request (DCR) for larger tape write buffer.

3) Last, tape buffer underruns can kill performance.
If the write buffer empties, then the tape will stop.
Before it begins again, the tape has to be repositioned backward.
For LTO drives, usually the minimum write speed is 50MB/sec.
Anything less, and you have latency and tape life consumed by "shoe shining".
FIX: Fix/improve issues 1 and 2 above.
FIX: Submit a design change request to allow TSM to interleave more threads onto the same tape at once.
FIX: Use tape drives with lower minimum speeds to prevent underruns
FIX: Don't use tape. Use virtual tape, another dedupe disk pool, or a replica target TSM server.
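
As a quick illustration of the underrun problem, here is a minimal check of whether a given feed rate would keep a drive streaming. The 50MB/sec floor is the "usual" LTO figure from above. The 160MB/sec tape-to-tape rate is a made-up example, and the 6x factor is the slowdown quoted at the top of this post.

```python
LTO_MIN_MB_S = 50  # usual minimum LTO write speed, per the text above

def will_shoe_shine(feed_mb_s, drive_min_mb_s=LTO_MIN_MB_S):
    """True when data arrives slower than the drive can slow down to."""
    return feed_mb_s < drive_min_mb_s

tape_to_tape_mb_s = 160.0               # hypothetical direct copy rate
dedupe_feed_mb_s = tape_to_tape_mb_s / 6  # "about 6x slower" for dedupe

print(f"dedupe feed: {dedupe_feed_mb_s:.1f} MB/sec")
print("shoe shining expected:", will_shoe_shine(dedupe_feed_mb_s))
```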

4) Check TSM server instrumentation.
This will show you where your time is spent, and what to upgrade next.
INSTRUMENTATION BEGIN
(wait several minutes)
INSTRUMENTATION END FILE=/tsm/instrumentation.out

xaminmo: Josh 2016 (Default)
NDMP backups into a TSM storage pool will not be deduplicated.
If you set ENABLENASDEDUPE YES, that only affects NetApp backups.
IBM doesn't make the NDMP code, so they don't support deduplication of anything but NetApp.
That means neither IBM's v7000 Unified backups, nor any other NDMP device, get deduplicated.

As such, go ahead and have your NDMP backups go to a DISK pool or direct to tape.
Sending to your dedupe pool will just clog things up.

xaminmo: Josh 2016 (Default)
This is a defect in DB2 10.5 FP1
The defect does not exist in DB2 9.7 FP6
This problem affects TSM customers with billions of extents (over 30TB deduplicated).
[...] may release late enough to include DB2 10.5 FP3a.

In TSM Server on AIX (unknown if limited to AIX),
if there are more than maxint unique values for BFID,
then COLCARD may become negative.

A negative column cardinality will compromise the index for queries against it,
which will lead to slowdowns and lock escalations within TSM.
This will present as a growing dedupdelete queue, slow expire, slow BACKUP STGPOOL, and slow client backups.

This is not exactly maxint related, as maxint - colcard was higher than the number of columns by about 20%.

You can check for this by logging in to your instance user, and running:

db2 connect to tsmdb1
db2 set schema tsmdb1

The output should say "0 record(s) selected."
If it lists any negative values for tables, then that table's index will be compromised.
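
For illustration only, here is a sketch of how such a check could be scripted. The SQL string is an assumption, not the original query from this post. It relies on the standard DB2 catalog view SYSCAT.COLUMNS, where COLCARD is -1 when statistics have not been collected. The sample rows and table names are invented.

```python
# ASSUMPTION: this SQL is not from the original post. SYSCAT.COLUMNS is the
# standard DB2 catalog view; -1 there means "statistics not collected",
# so only values below -1 are treated as suspicious.
CHECK_SQL = (
    "SELECT tabname, colname, colcard FROM syscat.columns "
    "WHERE tabschema = 'TSMDB1' AND colcard < -1"
)

def negative_colcards(rows):
    """Return rows whose cardinality is negative, ignoring the -1 sentinel."""
    return [r for r in rows if r[2] < -1]

sample = [  # made-up rows, for illustration only
    ("TABLE_A", "BFID", -1852516354),
    ("TABLE_B", "ID", -1),
    ("TABLE_C", "ID", 1200),
]

for tab, col, card in negative_colcards(sample):
    print(f"{tab}.{col}: colcard={card}")
```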

There is no fix for TSM Server 7.1, as no patches are available.
TSM 7.1.1 will release with DB2 10.5 FP3, which will not include a fix for this problem.
As of 2014-08-01, the problem has not been isolated yet.

The workaround is to update column cardinality to a reasonable value.
It doesn't need to be exact. An example command might be:

db2 connect to tsmdb1
db2 set schema tsmdb1

There is no APAR for this, and no hits on Google for "DB2 'negative column cardinality'".
This seems slightly related to: http://www-01.ibm.com/support/docview.wss?uid=swg1IC99408

NOTE: DO NOT INSTALL DB2 FIXPACK SEPARATELY. The TSM bundled DB2 is very slightly different. Standard DB2 fixpacks are not supported. If you decide to do this, you may find command or schema problems. If it works, then you may not be able to upgrade TSM afterward without a BACKUP DB, uninstall, reinstall, RESTORE DB -- at best.

If you have a large dedupe database, your options include:
* Stay at TSM 6.x
* Monitor for negative column cardinality
* Wait for an APAR and efix from IBM.
* Wait for TSM or TSM 7.2.0 in 2015 (or whatever versions will contain fixes).

xaminmo: Josh 2016 (Default)
This is why powerpath for boot devices is a BAD thing. At some point, someone will put a non-powerpath device with a powerpath device inside of rootvg on a production server. Then you end up completely broken:

root@somehost:/>bosboot -ad /dev/hdiskpower7
0301-154 bosboot: missing proto file: /usr/lib/boot/network/chrp.hdiskpower.proto

### This error means you need to run "pprootdev fix"

root@somehost:/>pprootdev fix
pprootdev: PowerPath boot is not currently enabled.

root@somehost:/>lspv | grep rootvg
hdiskpower7 FFFFFFFFBBBBBBBB rootvg active

root@somehost:/>pprootdev on
bosboot verification failed.
Run 'bosboot -vd /dev/ipldevice' to determine cause of failure.

root@somehost:/>bosboot -vd /dev/ipldevice
0301-154 bosboot: missing proto file: /usr/lib/boot/network/chrp.hdiskpower.proto

### I tried to reinstall powerpath, and that patently failed. I need to bring the apps offline, then remove powerpath0, then reboot, then remove powerpath0, then uninstall, then reinstall, then configure powerpath, then reboot, then pprootdev on, then reboot, then pprootdev fix.
That's assuming it will all actually work.

### Sigh. Time to shave a yak.

Or really, time to move to MPIO because they're running reserve_policy=single_path anyway.
xaminmo: Josh 2016 (Default)
In the past, I set up TSM.PWD as root, but this seems to not be what I needed.

I'm posting because the error messages and IBM docs don't cover this.

tsmdbmgr.log shows:
ANS2119I An invalid replication server address return code rc value = 2 was received from the server.

TSM Activity log shows:
ANR2983E Database backup terminated due to environment or setup issue related to DSMI_DIR - DB2 sqlcode -2033 sqlerrmc 168. (SESSION: 1, PROCESS: 9)

db2diag.log shows:

2014-02-26- E415619A371 LEVEL: Error
PID : 15138852 TID : 1 PROC : db2vend
INSTANCE: tsminst1 NODE : 000
HOSTNAME: tsmserver
FUNCTION: DB2 UDB, database utilities, sqluvint, probe:321
DATA #1 : TSM RC, PD_DB2_TYPE_TSM_RC, 4 bytes
TSM RC=0x000000A8=168 -- see TSM API Reference for meaning.

EDUID : 38753 EDUNAME: db2med.35926.0 (TSMDB1) 0
FUNCTION: DB2 UDB, database utilities, sqluMapVend2MediaRCWithLog, probe:656
DATA #1 : String, 134 bytes
Vendor error: rc = 11 returned from function sqluvint.
Return_code structure from vendor library /tsm/tsminst1/sqllib/adsm/libtsm.a:

DATA #2 : Hexdump, 48 bytes
0x0A00030462F0C4D0 : 0000 00A8 3332 3120 3136 3800 0000 0000 ....321 168.....
0x0A00030462F0C4E0 : 0000 0000 0000 0000 0000 0000 0000 0000 ................
0x0A00030462F0C4F0 : 0000 0000 0000 0000 0000 0000 0000 0000 ................

EDUID : 38753 EDUNAME: db2med.35926.0 (TSMDB1) 0
FUNCTION: DB2 UDB, database utilities, sqluMapVend2MediaRCWithLog, probe:696
MESSAGE : Error in vendor support code at line: 321 rc: 168

RC 168 per dsmrc.h means:
#define DSM_RC_NO_PASS_FILE 168 /* password file needed and user is
not root */
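
The return code can be read straight out of the hexdump. A small sketch, assuming the vendor return structure starts with a big-endian 32-bit code (AIX on POWER is big-endian) followed by a NUL-terminated ASCII string:

```python
import struct

# First 12 bytes of "DATA #2" from the db2diag.log entry above.
dump = bytes.fromhex("000000A8" "33323120" "31363800")

rc = struct.unpack(">I", dump[:4])[0]                 # big-endian 32-bit int
text = dump[4:].split(b"\x00", 1)[0].decode("ascii")  # NUL-terminated string

print(rc, repr(text))  # prints: 168 '321 168'
```

So the hexdump carries the same two numbers as the MESSAGE line: source line 321 and return code 168.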

Verified everything required for this:
• passworddir points to the right directory
• DSMI_DIR points to the right directory
• dsmtca runs okay
• dsmapipw runs okay

Verified hostname info was correct

dsmffdc.log shows:
[ FFDC_GENERAL_SERVER_ERROR ]: (rdbdb.c:4200) GetOtherLogsUsageInfo failed, rc=2813, archLogDir = /tsm/arch.

Checked, and the log directory inside dsmserv.opt was typoed as /tsm/arch instead of /tsm/arc as was used to create the instance and as exists on the filesystems.

Updated dsmserv.opt and restarted tsm server. No change other than fixing Q LOG

The TSM.PWD file must be owned by the instance user, not by root.
Make sure to run the dsmapipw as the instance user, or chown the file after.
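
A minimal sketch of checking that ownership before kicking off a database backup. The file path and the helper names are hypothetical. The instance user tsminst1 is taken from the db2diag.log excerpt above.

```python
import os
import pwd

def owned_by_uid(path, uid):
    """True if the file at path is owned by the given numeric uid."""
    return os.stat(path).st_uid == uid

def check_pwd_file(path, instance_user):
    """True if path is owned by instance_user (KeyError if no such user)."""
    uid = pwd.getpwnam(instance_user).pw_uid
    return owned_by_uid(path, uid)

if __name__ == "__main__":
    pwd_file = "/tsm/tsminst1/TSM.PWD"  # hypothetical location
    if os.path.exists(pwd_file) and not check_pwd_file(pwd_file, "tsminst1"):
        print(f"{pwd_file} is not owned by tsminst1; chown it or rerun dsmapipw")
```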


TSM 7.1

Feb. 20th, 2014 11:03 pm
xaminmo: (Josh 2004 Happy)
GAHHHHHHHHHH! If you install TSM client first on a clean AIX 7.1 system, TSM Server won't install. TSM Client comes with a version of xlsmp.rte that reverts the OS level to an unsupported version. You have to go find the base AIX install media and install the version from there. This is a packaging oversight. Someone thought a 5-year-old prerequisite was OK.

Further, you cannot call your TSM Server "tsmserver". Even though this matches the hostname requirements, Operations Center says nope with ANRI0011E.

Also, I'm absolutely required to use a password with at least 6 characters, one upper, one lower, one digit, and two nonalpha characters from a specific list.


A while back I tried to update TSM to 7.1 on Windows, but I installed the new Operations Center first. After that, the new deployment tool that is non-standard to everything except IBM refused to install TSM server, saying there was nothing to upgrade, but also that it couldn't install because TSM server was already installed.

But, someone, somewhere, is getting their bonus.
xaminmo: Josh 2016 (Default)
If you have 6 filesystems backing a sequential access file storage pool, and you remove one filesystem, TSM cannot calculate free space properly.

Instead of looking at the free space of the remaining filesystems, it takes the total space of the filesystems, minus the volumes in that device class.

Since there may still be old volumes in the "removed" directory, it considers the device class 100% full if everything currently existing cannot fit into the remaining directories.

Note that removing a directory from a device class does not invalidate the existing volumes in that directory. So long as the directory is still accessible, the volumes will be usable.

This is a problem when you want to reduce a filesystem but not migrate 100% off of it, as there is no other way to tell TSM not to allocate new volumes in that directory other than to remove that dir from the device class.
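
To make the difference concrete, here is a small sketch of the two calculations as described above. It is a reading of the observed behaviour, not TSM's actual code. The function names and all numbers are made up.

```python
def tsm_style_free(remaining_fs_total_gb, all_volume_gb):
    """Total size of the remaining filesystems minus every volume in the
    device class, including volumes still sitting in the removed directory."""
    return max(0, sum(remaining_fs_total_gb) - sum(all_volume_gb))

def expected_free(remaining_fs_free_gb):
    """What one would expect: free space actually left in the remaining dirs."""
    return sum(remaining_fs_free_gb)

# Made-up numbers: six 1000 GB filesystems, each holding 900 GB of volumes;
# one directory is then removed from the device class.
remaining_total = [1000] * 5
remaining_free = [100] * 5
volumes = [900] * 6  # the removed directory's volumes still count

print("TSM-style free GB:", tsm_style_free(remaining_total, volumes))
print("Expected free GB: ", expected_free(remaining_free))
```

With these numbers the device class looks 100% full even though 500 GB is still free in the remaining directories.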

xaminmo: (Josh 2004 Happy)
IBM plans to move about 110,000 retirees off its company-sponsored health plan. The move affects all IBM retirees once they become eligible for Medicare, beginning December 31, 2013.

IBM Chief Health Director Kyu Rhee told retirees that to keep receiving coverage, they will need to pick a plan offered through Extend Health, a large private Medicare exchange run by New York-based Towers Watson & Co.

Instead of subsidizing retiree health premiums directly, IBM will give retirees an annual contribution via a health retirement account that they can use to buy Medicare Advantage plans and supplemental Medicare policies on the exchange, as well as pay for other medical expenses.

Retirees who don't enroll in a plan through Extend Health won't receive the subsidy.

xaminmo: (Josh 2004 Happy)
I ran into an issue that might be procedural, but I thought you guys might want to know anyway.

We are pursuing with IBM HW support as of 2013-07-18.
I am going to test further in my lab. I suspect this may be related to
bkprofdata and rstprofdata copying over some internal seed for MAC addresses.

I plan to try this in my lab on p5 rackmount servers via an HMC.
If I can reproduce it there, then I expect support's response to be "don't do that".
As such, I also will try a factory reset to see if that will clear the condition.

If I cannot reproduce it there, then it's either SDMC/FSM related (which is going away),
or it's blade/Flex Node related (No other test resources, but maybe L3 can help).

If L3 decides that rstprofdata cannot be used on a different system,
then I would want them to A) Limit the command to that functionality,
and B) Update documentation for both commands to reflect this.

bkprofdata & rstprofdata were used to clone the LPAR layout from one blade to another.
To reset the WWNs, I was able to delete and re-add the virtual fibre adapters.
New LPARs and new virtual fibre adapters automatically get WWNs with the blade/node number as part of the WWN.
This part works as I would expect.

This did not work for resetting the MAC addresses.
Deleting and re-adding virtual ethernet adapters does not change the MAC addresses.
If I add a new adapter that did not exist before to the same slot number,
on the same LPAR ID, on two different Flex nodes, both get the same MAC address.
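
A small sketch for spotting the collision once the adapter lists from both nodes have been collected. The tuple layout, node names, and MAC values are invented for illustration. How the list gets gathered from the management console is left out.

```python
from collections import defaultdict

def duplicate_macs(adapters):
    """adapters: iterable of (node, lpar_id, slot, mac) tuples.
    Return {mac: [(node, lpar_id, slot), ...]} for MACs seen more than once."""
    seen = defaultdict(list)
    for node, lpar_id, slot, mac in adapters:
        seen[mac.lower()].append((node, lpar_id, slot))
    return {mac: where for mac, where in seen.items() if len(where) > 1}

inventory = [  # made-up values
    ("node1", 2, 4, "0A1B2C3D4E04"),
    ("node2", 2, 4, "0a1b2c3d4e04"),  # same LPAR ID and slot on another node
    ("node2", 3, 4, "0A1B2C3D4F04"),
]

for mac, where in duplicate_macs(inventory).items():
    print(mac, "is used by", where)
```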

Current resolution is to override the MAC address with a user specified value in the LPAR profile.
This can be done from Profile -> Virtual -> Ethernet -> Advanced -> checkbox

Change from commandline:
chsyscfg -m Server-7895-23X-SN1012345 -r prof -i \

To remove and re-add:
chsyscfg -m Server-7895-23X-SN1012345 -r prof -i \
chsyscfg -m Server-7895-23X-SN1012345 -r prof -i \

I've never seen this happen on any other POWER series servers, and I've built a lot of p7 systems, ranging from p710 to p780, including matching LPARs between CECs. This is on top of the whole slew of LPARable systems I've built and/or supported.

I looked into the profile data backup files themselves, and there is no mention of system serial, system name, WWN prefix, or MAC prefix.

I restored mode 3 of the profile data backups prior to any config work, and when adding new virtual NICs to LPARs, the MAC addresses still mirror each other.

I plan to test this with two p505 systems on an HMC to see if similar issues occur.

I don't have the resources to test this on blades, or on another SDMC.

We are pursuing with IBM HW support as of 2013-07-18

### END NOTICE ###

After a week, still no response from support,
but I think I found out why this was a problem.

On physical hardware, "lssyscfg -r lpar" will show virtual_eth_mac_base_value=
On the flex nodes, this value is not exposed.

I can't tell if this is an SDMC/FSM limitation, or a flex node limitation.
I know that IVM sees it, but am not sure about HMC.

So, when LPAR profiles are copied over, they will bring the VEMBV,
and there is no way to change it short of deleting and re-creating.

All in all, it may just be easier to use mksyscfg from the start.
An example might be:

mksyscfg -r lpar -m Server-8205-E6D-SN10FFFFF -i profile_name=DefaultProfile,\

But there's already reference online for this sort of command.

Also, while working on a p740 via IVM, I ran into more differences from HMC/SDMC.
When you add a client LPAR with virtual SCSI, IVM automagically creates the VIO server virtual scsi server adapter. In addition, +1 from that slot it creates a virtual serial adapter for mkvterm.

If you're used to adding virtual scsi adapters in order, and you don't skip a slot on the mksyscfg lines, then you'll get this error:
[VIOSE01050173-0290] Cannot create virtual serial adapter in the management partition in the virtual slot number specified 20.

I couldn't find this error anywhere else on the internet, and it was a little confusing since I wasn't making a virtual serial adapter.
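
One way to avoid the collision is to plan the slot numbers up front and leave a gap after each virtual SCSI adapter. A minimal sketch, assuming IVM always takes the slot immediately after each one. The function name and starting slot are made up.

```python
def vscsi_slots(first_slot, count):
    """Slots for `count` virtual SCSI adapters, leaving the slot after each
    one free for the virtual serial adapter that IVM adds on its own."""
    return [first_slot + 2 * i for i in range(count)]

print(vscsi_slots(19, 3))  # prints: [19, 21, 23]
```

With in-order numbering (19, 20, 21), the second adapter would land on slot 20, which matches the slot named in the error above.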
xaminmo: (Logo IBM CATE)
I always run into issues when I work in a multiple VLAN environment, because it's not *that* common for my builds. This is a reminder for me.

The magic is when using multiple VLANs:
1) Don't use the real VLAN ID for the trunk PVID unless you know for certain that was set on the switch. It is stripped off of all packets, and who knows what the PVID of the switch is, if any.
2) Any mismatch between PVID on the SEA and the trunk will cause packets to be dropped.
3) Don't use IEEE VLAN mode for the client adapter unless you're going to add VLAN interfaces from AIX. When not in VLAN mode, the PVID is ADDED to all packets on client adapters.
4) When using multiple trunks on one SEA, they all have to be the same trunk priority. ha_mode=sharing balances based on the order of the virt_adapters field, not on trunk priority.
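
Rules 2 and 4 are easy to check mechanically before touching the switch. A sketch, assuming the SEA and trunk settings have already been copied into plain dicts. The dict layout and adapter names are hypothetical, though pvid and pvid_adapter mirror the SEA attribute names.

```python
def check_sea(sea, trunks):
    """Return a list of problems found; an empty list means the checks passed.

    sea:    dict with 'pvid' and 'pvid_adapter'
    trunks: dict mapping trunk adapter name -> {'pvid': int, 'priority': int}
    """
    problems = []
    pvid_trunk = trunks[sea["pvid_adapter"]]
    if pvid_trunk["pvid"] != sea["pvid"]:  # rule 2: mismatch drops packets
        problems.append(
            f"SEA pvid {sea['pvid']} != {sea['pvid_adapter']} pvid {pvid_trunk['pvid']}"
        )
    priorities = {t["priority"] for t in trunks.values()}
    if len(priorities) > 1:  # rule 4: all trunks need the same priority
        problems.append(f"mixed trunk priorities: {sorted(priorities)}")
    return problems

sea = {"pvid": 999, "pvid_adapter": "ent4"}
trunks = {
    "ent4": {"pvid": 999, "priority": 1},
    "ent5": {"pvid": 998, "priority": 1},
}
print(check_sea(sea, trunks) or "no problems found")
```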
xaminmo: Josh 2016 (Default)
Because I didn't find this online
Note, run-time is about an hour.


Page generated Apr. 26th, 2019 01:51 am