xaminmo: Josh 2016 (Default)
cat <<'EOF' >/etc/systemd/system/db2fmcd.service
[Unit]
Description=DB2V111

[Service]
ExecStart=/opt/tivoli/tsm/db2/bin/db2fmcd
Restart=always
KillMode=process
KillSignal=SIGHUP

[Install]
WantedBy=default.target
EOF
systemctl enable db2fmcd.service
systemctl start db2fmcd.service

cp -p
http://omnitech.net/reference/2018/06/13/tsm-systemd-autostart/
xaminmo: Josh 2016 (Default)
We ran into an issue where a level-zero operator became root, and cleaned up some TSM dedupe-pool containers so he'd stop getting full filesystem alerts.

Things exposed:

How does someone that green get full, unmonitored root access?
* They told false information about timestamps during
http://omnitech.net/reference/2018/04/19/spectrum-protect-container-vulnerability/
xaminmo: Josh 2016 (Default)
I ordered the new backup server on October 27.
Initial setup produced intermittent app crashes, so I was not ready to make it live yet.
I ran BOINC on it for a day, and at one point, all tasks died at once.

Syslog showed EDAC errors starting 11 days after I got the system, calling out
http://omnitech.net/news/2017/11/14/tsm-server-status/
xaminmo: Josh 2016 (Default)
This is happiness...

tsminst1@tsm:~$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 16.04.3 LTS
Release: 16.04
Codename: xenial

/bin/bash# for i in /dev/sd? ; do smartctl -a $i ; done | grep 'Device Model'
Device Model: Samsung SSD 850 EVO 250GB
http://omnitech.net/news/2017/10/30/protect-initial-install/
xaminmo: Josh 2016 (Default)
Upgrading TSM server from Q9650 Core 2 Quad 3.0GHz, 8GB DDR2 on Win 2008R2.

New system is HP Z600, two-socket, 6-core 2.66GHz Xeon X5650 and 48GB of RAM. Wattage is the same per socket, but two sockets now. 3x the cores, 4x the performance.

SSDs for DB and Log are also moving to EVO 850
http://omnitech.net/news/2017/10/24/new-data-protection/
xaminmo: (Josh 2014)
Anyone in the ADSM/TSM/Spectrum Protect land, could you take a moment to add a vote to this RFE?

It's a request for the client to support more than 1 producer thread per filesystem, and more than 4 producer threads per DSMC instance. It's dated code that doesn't take into account high performance disk subsystems.

https://www.ibm.com/developerworks/rfe/execute?use_case=viewRfe&CR_ID=86101

xaminmo: Josh 2016 (Default)
It depends on whether it's small files or not. I normally have a small-file pool, which is the DIRMC, VMCTLMC, and TOCDESTINATION. The offsite for this is kept reclaimed down to 1 tape, and I try to restore that primary pool first.

For large file, TDP, VM and image full backups, they can ...
http://omnitech.net/reference/2014/10/22/after-dr-of-a-tsm-server-do-you-need-to-restore-the-primary-storage-pool-from-the-copy-cool/
xaminmo: Josh 2016 (Default)
BACKUP STGPOOL for dedupe runs about 6x slower than direct tape to tape.
Why?

1) First, the database has a huge number of random reads for dedupe rehydration.
Tack on any Dedup Deletion activity (SHOW DEDUPDELETEINFO) and anything else that's competing for DB IOPS.
FIX: Put the database on SSD or RAM backed storage.
NOTE: SSD stats are usually lies. Sustained performance is 4,500-12,000 IOPS each, divided by 2 for RAID-1/10, or by 3.5 for RAID-5/6.
FIX: increase server memory and provide more for DB2 bufferpools.
NOTE: This might require manually changing bufferpools, limiting filesystem cache, etc.
FIX: Large amounts of cache for the database containers

2) Next, the file class, while sequential, still has a large number of random read IOPS.
TSM Server has no read ahead for this. It reads the chunks in order, rather than requesting a huge buffer full of chunks.
As such, streaming speed will be limited by DB latency, file-class latency, and actual read IO times.
FIX: Reduce the latency for your file class
FIX: Reduce the latency for your database
FIX: Don't do anything else during BACKUP STGPOOL.
FIX: Run your EXPIRE INVENTORY and IDENTIFY DUPLICATE after, not before.
FIX: Submit a Design Change Request (DCR) for larger chunk read cache to be used for BACKUP STGPOOL.
FIX: Submit a Design Change Request (DCR) for larger tape write buffer.

3) Last, tape buffer underruns can kill performance.
If the write buffer empties, then the tape will stop.
Before it begins again, the tape has to be repositioned backward.
For LTO drives, usually the minimum write speed is 50MB/sec.
Anything less, and you have latency and tape life consumed by "shoe shining".
FIX: Fix/improve issues 1 and 2 above.
FIX: Submit a design change request to allow TSM to interleave more threads onto the same tape at once.
FIX: Use tape drives with lower minimum speeds to prevent underruns
FIX: Don't use tape. Use virtual tape, another dedupe disk pool, or a replica target TSM server.

4) Check TSM server instrumentation.
This will show you where your time is spent, and what to upgrade next.
INSTRUMENTATION BEGIN
BACKUP STGPOOL DEDUP COPYPOOL
wait several minutes
INSTRUMENTATION END FILE=/tsm/instrumentation.out
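
The SSD arithmetic in item 1 and the minimum-speed rule in item 3 can be sanity-checked with a few lines of shell. This is only a sketch: the function names are made up for illustration, the divisors (2 for RAID-1/10, 3.5 for RAID-5/6) and the 50MB/sec floor are the rules of thumb from this post, and the drive count and feed rate are hypothetical.

```shell
#!/bin/sh
# Rough sizing helpers for the notes above. Names are illustrative only.

# effective_iops PER_SSD_IOPS SSD_COUNT RAID_DIVISOR
# Rule of thumb from this post: divide by 2 for RAID-1/10,
# or by 3.5 for RAID-5/6. Prints a whole number (fraction truncated).
effective_iops() {
    awk -v iops="$1" -v count="$2" -v div="$3" \
        'BEGIN { printf "%d\n", (iops * count) / div }'
}

# tape_streams FEED_MB_PER_SEC [MIN_MB_PER_SEC]
# Succeeds when the feed rate is at or above the drive's minimum write
# speed (default 50, the usual LTO figure quoted above).
tape_streams() {
    awk -v feed="$1" -v min="${2:-50}" 'BEGIN { exit !(feed >= min) }'
}

# Hypothetical array: four SSDs at the pessimistic 4500 IOPS each.
echo "RAID-10: $(effective_iops 4500 4 2) IOPS"
echo "RAID-5:  $(effective_iops 4500 4 3.5) IOPS"

# Hypothetical 35 MB/sec rehydration rate feeding an LTO drive.
if tape_streams 35; then
    echo "35 MB/sec: drive can stream"
else
    echo "35 MB/sec: expect shoe-shining"
fi
```

Compare the result against the DB read rate that the instrumentation output reports, rather than against the vendor's spec sheet.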


http://omnitech.net/reference/2014/08/12/tsm-dedup-backup-stgpool-performance/
xaminmo: Josh 2016 (Default)
NDMP backups into a TSM storage pool will not be deduplicated.
If you set ENABLENASDEDUPE YES, that only affects NetApp backups.
IBM doesn't make the NDMP code, so they don't support deduplication of anything but NetApp.
That means neither IBM's v7000 Unified backups, nor any other NDMP device, get deduplicated.

As such, go ahead and have your NDMP backups go to a DISK pool or direct to tape.
Sending to your dedupe pool will just clog things up.



http://omnitech.net/reference/2014/08/12/tsm-and-ndmp/
xaminmo: Josh 2016 (Default)
This is a defect in DB2 10.5 FP1
The defect does not exist in DB2 9.7 FP6
This problem affects TSM 7.1.0.0 customers with billions of extents (over 30TB deduplicated).
... may release late enough to include DB2 10.5 FP3a,

In TSM Server 7.1.0.0 on AIX (unknown whether this is limited to AIX),
when RUNSTATS parses BF_AGGREGATED_BITFILES,
and there are more than maxint unique values for BFID,
then COLCARD may become negative.

A negative column cardinality will compromise the index for queries against it,
which will lead to slowdowns and lock escalations within TSM.
This will present as a growing dedupdelete queue, slow expire, slow BACKUP STGPOOL, and slow client backups.

This is not exactly maxint related, as maxint - colcard was higher than the number of columns by about 20%.

You can check for this by logging in to your instance user, and running:

db2 connect to tsmdb1
db2 set schema tsmdb1
db2 'select TABNAME,COLNAME,COLCARD from SYSSTAT.COLUMNS where COLCARD<-1'


The output should say "0 record(s) selected."
If it lists any negative values for tables, then that table's index will be compromised.

There is no fix for TSM Server 7.1, as no patches are available.
TSM 7.1.1 will release with DB2 10.5 FP3, which will not include a fix for this problem.
As of 2014-08-01, the problem has not been isolated yet.

The workaround is to update column cardinality to a reasonable value.
It doesn't need to be exact. An example command might be:

db2 connect to tsmdb1
db2 set schema tsmdb1
db2 "UPDATE SYSSTAT.COLUMNS SET COLCARD=3300000000 WHERE COLNAME='BFID' AND TABNAME='BF_AGGREGATED_BITFILES' AND TABSCHEMA='TSMDB1'"


There is no APAR for this, and no hits on Google for "DB2 'negative column cardinality'".
This seems slightly related to: http://www-01.ibm.com/support/docview.wss?uid=swg1IC99408

NOTE: DO NOT INSTALL DB2 FIXPACK SEPARATELY. The TSM bundled DB2 is very slightly different. Standard DB2 fixpacks are not supported. If you decide to do this, you may find command or schema problems. If it works, then you may not be able to upgrade TSM afterward without a BACKUP DB, uninstall, reinstall, RESTORE DB -- at best.

If you have a large dedupe database, your options include:
* Stay at TSM 6.x
* Monitor for negative column cardinality
* Wait for an APAR and efix from IBM.
* Wait for TSM 7.1.1.1 or TSM 7.2.0 in 2015 (or whatever versions will contain fixes).
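
For the monitoring option, a minimal sketch that could be run from cron as the instance user. It assumes the check query shown earlier and relies only on the "0 record(s) selected." line that the post says a healthy system prints. The function name and the alert text are made up for illustration.

```shell
#!/bin/sh
# colcard_clean: reads captured db2 output on stdin and succeeds only if
# it contains the "0 record(s) selected." line described above.
colcard_clean() {
    grep -q '^ *0 record(s) selected\.'
}

# Intended use as the instance user (not run here; needs a live instance):
#   db2 connect to tsmdb1
#   db2 set schema tsmdb1
#   db2 'select TABNAME,COLNAME,COLCARD from SYSSTAT.COLUMNS where COLCARD<-1' \
#       | colcard_clean || echo "negative COLCARD found" >&2

# Demonstration with canned text instead of a real database:
if printf '\n  0 record(s) selected.\n\n' | colcard_clean; then
    echo "clean"
fi
```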

http://omnitech.net/reference/2014/08/04/db2-10-5-0-1-negative-colcard/
xaminmo: Josh 2016 (Default)
In the past, I set up TSM.PWD as root, but this seems to not be what I needed.

I'm posting because the error messages and IBM docs don't cover this.

tsmdbmgr.log shows:
ANS2119I An invalid replication server address return code rc value = 2 was received from the server.

TSM Activity log shows:
ANR2983E Database backup terminated due to environment or setup issue related to DSMI_DIR - DB2 sqlcode -2033 sqlerrmc 168. (SESSION: 1, PROCESS: 9)

db2diag.log shows:

2014-02-26-13.54.12.425089-360 E415619A371 LEVEL: Error
PID : 15138852 TID : 1 PROC : db2vend
INSTANCE: tsminst1 NODE : 000
HOSTNAME: tsmserver
EDUID : 1
FUNCTION: DB2 UDB, database utilities, sqluvint, probe:321
DATA #1 : TSM RC, PD_DB2_TYPE_TSM_RC, 4 bytes
TSM RC=0x000000A8=168 -- see TSM API Reference for meaning.

EDUID : 38753 EDUNAME: db2med.35926.0 (TSMDB1) 0
FUNCTION: DB2 UDB, database utilities, sqluMapVend2MediaRCWithLog, probe:656
DATA #1 : String, 134 bytes
Vendor error: rc = 11 returned from function sqluvint.
Return_code structure from vendor library /tsm/tsminst1/sqllib/adsm/libtsm.a:

DATA #2 : Hexdump, 48 bytes
0x0A00030462F0C4D0 : 0000 00A8 3332 3120 3136 3800 0000 0000 ....321 168.....
0x0A00030462F0C4E0 : 0000 0000 0000 0000 0000 0000 0000 0000 ................
0x0A00030462F0C4F0 : 0000 0000 0000 0000 0000 0000 0000 0000 ................

EDUID : 38753 EDUNAME: db2med.35926.0 (TSMDB1) 0
FUNCTION: DB2 UDB, database utilities, sqluMapVend2MediaRCWithLog, probe:696
MESSAGE : Error in vendor support code at line: 321 rc: 168

RC 168 per dsmrc.h means:
#define DSM_RC_NO_PASS_FILE 168 /* password file needed and user is
not root */

Verified everything required for this:
• passworddir points to the right directory
• DSMI_DIR points to the right directory
• dsmtca runs okay
• dsmapipw runs okay

Verified hostname info was correct

dsmffdc.log shows:
[ FFDC_GENERAL_SERVER_ERROR ]: (rdbdb.c:4200) GetOtherLogsUsageInfo failed, rc=2813, archLogDir = /tsm/arch.

Checked, and the log directory inside dsmserv.opt was typoed as /tsm/arch instead of /tsm/arc as was used to create the instance and as exists on the filesystems.

Updated dsmserv.opt and restarted tsm server. No change other than fixing Q LOG

SOLUTION:
The TSM.PWD file must be owned by the instance user, not by root.
Make sure to run the dsmapipw as the instance user, or chown the file after.
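
A quick ownership check along these lines would have saved me some time. It is a sketch only: the instance user and the TSM.PWD path below are placeholders, so substitute your own instance user and whatever passworddir points to. It uses "ls -ldn" instead of stat because stat flags differ between AIX and Linux.

```shell
#!/bin/sh
# owner_uid FILE: print the numeric UID that owns FILE.
owner_uid() {
    ls -ldn "$1" | awk '{ print $3 }'
}

# owned_by USER FILE: succeed only if FILE is owned by USER.
owned_by() {
    [ "$(owner_uid "$2")" = "$(id -u "$1")" ]
}

# Placeholder values; replace with your instance user and passworddir.
INSTANCE_USER=tsminst1
PWD_FILE=/tsm/tsminst1/TSM.PWD

if [ -e "$PWD_FILE" ]; then
    if owned_by "$INSTANCE_USER" "$PWD_FILE"; then
        echo "$PWD_FILE ownership looks right"
    else
        echo "$PWD_FILE is not owned by $INSTANCE_USER; chown it" >&2
    fi
fi
```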

http://omnitech.net/reference/2014/02/26/tsm-7-1-config/

TSM 7.1

Feb. 20th, 2014 11:03 pm
xaminmo: (Josh 2004 Happy)
GAHHHHHHHHHH! If you install TSM client first on a clean AIX 7.1 system, TSM Server won't install. TSM Client comes with a version of xlsmp.rte that reverts the OS level to an unsupported version. You have to go find the base AIX install media and install the version from there. This is a packaging oversight. Someone thought a 5-year-old prerequisite was ok.

Further, you cannot call your TSM Server "tsmserver". Even though this matches the hostname requirements, Operations Center says nope with ANRI0011E.

Also, I'm absolutely required to use a password with at least 6 characters, one upper, one lower, one digit, and two nonalpha characters from a specific list.

*sigh*

A while back I tried to update TSM to 7.1 on Windows, but I installed the new Operations Center first. After that, the new deployment tool, which is non-standard to everything except IBM, refused to install TSM server, saying there was nothing to upgrade, but also that it couldn't install because TSM server was already installed.

But, someone, somewhere, is getting their bonus.
xaminmo: Josh 2016 (Default)
If you have 6 filesystems backing a sequential access file storage pool, and you remove one filesystem, TSM cannot calculate free space properly.

Instead of looking at the free space of the remaining filesystems, it takes the total space of the filesystems, minus the volumes in that device class.

Since there may still be old volumes in the "removed" directory, it considers the device class 100% full if everything currently existing cannot fit into the remaining directories.

Note that removing a directory from a device class does not invalidate the existing volumes in that directory. So long as the directory is still accessible, the volumes will be usable.

This is a problem when you want to reduce a filesystem but not migrate 100% off of it, as there is no other way to tell TSM not to allocate new volumes in that directory other than to remove that dir from the device class.
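
To put numbers on it, here is a small worked example of the calculation as I've described it. All figures are hypothetical, the function names are mine rather than TSM terms, and flooring the result at zero is my assumption about how "100% full" comes out.

```shell
#!/bin/sh
# All figures in GB. Function names are illustrative, not TSM terms.

# reported_free REMAINING_FS_TOTAL ALL_VOLUMES_IN_DEVCLASS
# The behavior described above: total size of the directories still in
# the device class, minus every volume in the class. Flooring at zero
# is an assumption.
reported_free() {
    free=$(( $1 - $2 ))
    if [ "$free" -lt 0 ]; then
        free=0
    fi
    echo "$free"
}

# actual_free REMAINING_FS_TOTAL VOLUMES_IN_REMAINING_DIRS
# What is really available in the directories that remain.
actual_free() {
    echo $(( $1 - $2 ))
}

# Hypothetical layout: six 1000 GB filesystems, 5200 GB of volumes in
# total, 900 GB of which sit in the directory removed from the class.
remaining_total=5000
all_volumes=5200
volumes_in_remaining=4300

echo "TSM's view:  $(reported_free "$remaining_total" "$all_volumes") GB free"
echo "Really free: $(actual_free "$remaining_total" "$volumes_in_remaining") GB"
```

With these numbers the device class looks 100% full even though 700 GB is still free in the remaining five directories.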

http://omnitech.net/reference/2014/01/07/tsm-file-class-design-issue/
xaminmo: (Josh 2004 Happy)
tsm: TSM>show deduppending dedupe
ANR1015I Storage pool DEDUPE has 2,018,762,268,864 duplicate bytes pending removal.

tsm: TSM>SHOW DEDUPDELETE
****Dedup Deletion General Status****
Number of worker threads : 8
Number of active worker threads : 0
Number of chunks waiting in queue : 0

****Dedup Deletion Worker Info****
Worker thread 1 is not active
Worker thread 2 is not active
Worker thread 3 is not active
Worker thread 4 is not active
Worker thread 5 is not active
Worker thread 6 is not active
Worker thread 7 is not active
Worker thread 8 is not active
------------------------------------------
Total worker chunks queued : 0
Total worker chunks deleted : 0

tsm: TSM>q proc

Process Process Description Process Status
Number
-------- -------------------- -------------------------------------------------
1 Identify Duplicates Storage pool: DEDUPE. Volume:
/tsm/dedupe/00036730.BFS. State: active.
State Date/Time: 01/01/14 19:44:36. Current
Physical File(bytes): 13,453,908,907. Total
Files Processed: 32. Total Duplicate Extents
Found: 207,302. Total Duplicate Bytes Found:
27,030,923,224.
2 Identify Duplicates Storage pool: DEDUPE. Volume:
/tsm/dedupe/0003672C.BFS. State: active.
State Date/Time: 01/01/14 19:59:29. Current
Physical File(bytes): 82,217,208,517. Total
Files Processed: 1,110. Total Duplicate Extents
Found: 628,508. Total Duplicate Bytes Found:
99,009,523,025.
3 Identify Duplicates Storage pool: DEDUPE. Volume:
/tsm/dedupe/0003657E.BFS. State: active.
State Date/Time: 01/01/14 19:09:54. Current
Physical File(bytes): 32,356,415,194. Total
Files Processed: 1,799. Total Duplicate Extents
Found: 560,040. Total Duplicate Bytes Found:
87,123,137,741.
4 Identify Duplicates Storage pool: DEDUPE. Volume:
/tsm/dedupe/0003672F.BFS. State: active.
State Date/Time: 01/01/14 19:36:57. Current
Physical File(bytes): 2,147,746,191. Total Files
Processed: 2,701. Total Duplicate Extents Found:
565,790. Total Duplicate Bytes Found:
97,240,779,156.
5 Identify Duplicates Storage pool: DEDUPE. Volume:
/tsm/dedupe/0003672D.BFS. State: active.
State Date/Time: 01/01/14 18:47:32. Current
Physical File(bytes): 22,696,147,854. Total
Files Processed: 54. Total Duplicate Extents
Found: 43,421. Total Duplicate Bytes Found:
7,901,680,314.
6 Identify Duplicates Storage pool: DEDUPE. Volume:
/tsm/dedupe/00036731.BFS. State: active.
State Date/Time: 01/01/14 19:16:13. Current
Physical File(bytes): 24,424,088,494. Total
Files Processed: 6. Total Duplicate Extents
Found: 65,229. Total Duplicate Bytes Found:
14,781,615,514.
xaminmo: (Logo Tivoli Certified)
run all uninst* from /opt/IBM/tivoli
remove contents of /opt/IBM/tivoli
remove contents of /opt/tivoli
remove contents of /home/db2inst1

List the remnants in DE:
cd /usr/ibm/common/acsi/bin/
./de_lsrootiu.sh

Delete the UUID and discriminant (directory). My examples were:
./deleteRootIU.sh 2ADC4A33F09F4E85AD27963E850290C3 /opt/IBM/tivoli/tipv2
./deleteRootIU.sh 3DD9564D2E7442788584C1F35B07F2A2 /opt/IBM/tivoli/tipv2Components/TCRComponent
./deleteRootIU.sh 61AE95EAFC824C45BECFD427C959D5B7 /opt/IBM/tivoli/tipv2Components/TCRComponent
./deleteRootIU.sh 7F15FB682C80DFB90EBE3B0BF5D8EDC6 /opt/IBM/tivoli/tsmac
./deleteRootIU.sh C00DA95AFD9B7E0397153CD944B5A255 /opt/IBM/tivoli/tipv2


TAGS: admincenter admin center deploymentengine deployment engine ibm eserver tivoli force uninstall wipe
xaminmo: Josh 2016 (Default)
Admin center and reporting are installed.  I'm trying to log in to see if it works and to do basic setup.

If I'm coming through a SOCKS proxy, it doesn't work at all.  "Connection Reset".

If I'm coming through port remapping, it doesn't work - "Connection Refused" - What hostname?  What IP?

If I run FF 3.6.28 (community RPM) or 3.5.13.1 (IBM BFF) on the AIX box where AdminCenter is installed, the javascript goes into a constant reload cycle.  Completely unusable as the page constantly refreshes itself.  There is no newer Firefox for AIX.

If I run FF on a Windows system in-network (Citrix), it works, to a point.  Many selectors, etc don't work in IBM's GUI tools on FF.  Selectors are missing the top item, which sucks if there is only one item.  I can't add the server.

If I run IE8 on a system in-network (Citrix), the tab crashes, and it says "This tab has been recovered."  Eventually, retrying, it gives up because whatever's in the tab just continues to crash.  It does this with all add-ons disabled too.

*sigh*
xaminmo: Josh 2016 (Default)
So, I tried installing the other way around, admin-center first. AC went on fine.
Reporting didn't fail immediately, but it won't let me pick my language.
The solution for this issue is to not pick to install any additional languages.

SO, did I misread the docs, or are they backwards?
I am Le Tired, so I'll have to check later.
xaminmo: Josh 2016 (Default)
Various issues I've run into and resolved.
Cut/paste out of a doc I'm working on, so the formatting isn't HTML/LJ pretty.
xaminmo: Josh 2016 (Default)
This was because the symlinks for the libobk shared library were incorrect, and/or permissions on the libtdp_r3.sl were incorrect, and/or the agent.lic was not readable by the DB user.

RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of backup command on sbt_1 channel at 05/07/2012 17:00:48
ORA-19506: failed to create sequential file, name="WP1_aeimncnu.102340_1", parms=""
ORA-27028: skgfqcre: sbtbackup returned error
ORA-19511: Error received from media manager layer, error text:
BKI9204E: Additional support information: An exception was thrown at position: esd-rmanapplication.cpp(271) (text=Unknown exception.
).
RMAN>
specification does not match any backup in the repository
RMAN>
Recovery Manager complete.
xaminmo: (Baby poop)
This is a new, clean install of the OS, and a new, clean download of the 6.3.1 reporting tool.

daltsmrpt: /install/2012/TSM/631rpt# cat /stdout
rootRA: com.ibm.tivoli.remoteaccess.LocalUNIXProtocol@298a298a
rootRA.isProtocolAvailable(): true
Exception: Userid is not privileged. java.net.ConnectException: CTGRI0002E Session not established.
(X) commiting registry
(X) shutting down service manager
(X) cleaning up temporary directories

daltsmrpt: /install/2012/TSM/631rpt# whoami
root

daltsmrpt: /install/2012/TSM/631ac# oslevel -s
7100-01-02-1150


If I get this sorted out, I'll post about it.