Friday, August 18, 2017

TNPM and RHEL 7 - Missing CSH library - Cannot Start Datachannel

Please note that on RHEL 7, csh is no longer installed by default. You have to install it to be able to start the Datachannel components.

yum install tcsh

The IBM doc only mentions KSH, but this is not enough.
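A quick pre-flight check can save you a failed startup. The sketch below simply tests whether a csh binary is on the PATH and suggests the fix if it is not (the messages are illustrative, not TNPM output):

```shell
#!/bin/sh
# Pre-flight check: the Datachannel start scripts require csh.
if command -v csh >/dev/null 2>&1; then
  echo "csh found at $(command -v csh)"
else
  echo "csh is missing - install it with: yum install tcsh"
fi
```

On RHEL, the tcsh package is what provides /bin/csh, which is why installing tcsh fixes the missing-csh error.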

Enjoy :)

Sunday, March 6, 2016

Installing TNPM 1.4.1 - Part 2 - What do you need?

Installation packages

Below you can find the installation packages and part numbers you will need:

IBM Software Catalog


CN5HGEN - IBM Tivoli Netcool Performance Manager Wireline Component v1.4.1 for Linux English

CIN3IML - IBM Tivoli Common Reporting 3.1.0.1 for Linux Multilingual

CIES6ML - IBM WebSphere Application Server V8.5.0.1 for Jazz for Service Management for Linux

CIXA2ML - Jazz for Service Management V1.1.0.3 for Linux Multilingual

CI6W6ML - IBM DB2 Enterprise Server Edition V10.1 for Linux on AMD64 and Intel® EM64T systems (x64) Multilingual

CI71NML - IBM DB2 10.1 Enterprise Server Edition - Restricted Use Quick Start and Activation Multiplatform Multilingual

FixCentral - IBM Installation Manager Install Kit for all x86_64 Linux versions supported by version 1.8.3.0

 

Oracle


p13390677_112040 - 64-bit Oracle Server 11g (11.2.0.4) Enterprise Edition for Linux

p13390677_112040 - 32-bit Oracle Client 11g (11.2.0.4) for Linux

System packages (RHEL 6.5)


# yum groupinstall "Base"

# yum install atk.i686 binutils.x86_64 cairo.i686 compat-libcap1.x86_64 compat-libstdc++-33.x86_64 compat-libstdc++-33.i686 elfutils-libelf.x86_64 elfutils-libelf-devel.x86_64 gcc.x86_64 gcc-c++.x86_64 glibc.x86_64 glibc.i686 glibc-common.x86_64 glibc-devel.x86_64 glibc-devel.i686 glibc-headers.x86_64 gtk2.i686 kernel-headers.x86_64 ksh.x86_64 libaio.x86_64 libaio.i686 libaio-devel.x86_64 libaio-devel.i686 libgcc.x86_64 libgcc.i686 libgomp.x86_64 libstdc++.x86_64 libstdc++.i686 libstdc++-devel.x86_64 libstdc++-devel.i686 libX11.i686 libXau.i686 libxcb.i686 libXext.i686 libXp.x86_64 libXp.i686 libXpm.x86_64 libXtst.i686 make.x86_64 openmotif.x86_64 openmotif.i686 openssl098e.i686 sysstat.x86_64 unixODBC.x86_64 unixODBC.i686 unixODBC-devel.x86_64 xorg-x11-xauth PackageKit-gtk-module.x86_64 libcanberra-gtk2.x86_64 gtk2-engines.x86_64 unzip openssh-clients.x86_64


Thursday, November 5, 2015

Installing TNPM 1.4.1 - Part 1 - Motivation

Every time I have to install TNPM, the first thing I do is ask myself: do I have everything I need? This looks like a simple question, but considering that you need to download at least 10 different installation packages (yes, around 13 GB) and match the compatible versions, it can take some time.

Another important question, especially when a new version is released: are the installation instructions correct? Do they guide me through the right way of installing TNPM?

Unfortunately, this question proves to be more pertinent than I would like it to be. For instance, I raised a very important issue when installing TNPM 1.3.2 about using the root user to install the TIP portal. TNPM 1.3.2 did that without fear and even finished with a chmod -R 777 <portal_home> at the end... very smart. IBM recognized the issue and included some instructions on how to install the portal as non-root.

Well, how surprised I was, while reading the instructions for TNPM 1.4.1, to see that we are again told to install the portal (now called JAZZ) as the root user and to do a chmod -R 777 <portal_home> as a post-installation step. This is neither necessary nor mandatory, and it is absolutely not recommended for a public portal.

So, I hope the following posts will help you do it in a better way.

One important remark!!
Although I will use only official IBM tools to install the different packages, and no tricks, it is still up to you to weigh the consequences of not following the official instructions (even if they are not good). It is your call, not mine, to make sure everything works in your environment. IBM support may eventually deny support if they see that your public portal is not running as root and that the application files do not have 777 permissions.

Monday, October 12, 2015

Launch TNPM pages in context from external applications

If you ever needed to integrate TNPM (v1.3.2+) with an external application, you may have asked yourself how to launch the TNPM portal in context with authentication and authorization in place.

It turns out there is a portal tool that allows exactly that. It is called xlaunch and it is available for both the TIP and JAZZ portals. The tool will create a user-based token that can be used to authenticate and launch the TNPM page you want using an HTTP GET.

Steps:


1) Create the xlaunch credentials (all in one line):
 
$ java -cp /<JazzSM_HOME>/profile/installedApps/JazzSMNode01Cell/isc.ear/xlaunchapi.jar com.ibm.isc.api.xlaunch.LaunchPropertiesHelper\$Encode com.ibm.isc.xlaunch.username <username> com.ibm.isc.xlaunch.password <password>

Replace <username> and <password> with the TNPM user you want to SSO.

This will return a string like:

L2NvbS5pYm0uaXNjLnhsYXVuY2gujXNlcm5hbWUvdG5wbS1jb20uaWJtLmlzYy5KbGF1bmNoLnBhc3N3b3JkL3RucG0*

2) Launch in context (autologin using xlaunch credentials)
       
URL: https://<serverip>:16311/ibm/action/launch/<pageid>/<xlaunch_credential>

Example:
https://1.2.3.4:16311/ibm/action/launch/tnpm.console.topology.performance.network.resourceGroups/L2NvbS5pYm0uaXNjLnhsYXVuY2gujXNlcm5hbWUvdG5wbS1jb20uaWJtLmlzYy5KbGF1bmNoLnBhc3N3b3JkL3RucG0*

To get the pageid, open the page in JAZZ, click the "single page" icon in the top right corner and click "About". Under "General" you will find the pageid.
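The steps above can be scripted: the launch URL is just three parts glued together. A minimal sketch (the server address is a placeholder, and the token placeholder stands in for the string returned by step 1):

```shell
#!/bin/sh
# Assemble the in-context launch URL from its three parts.
SERVER="1.2.3.4"                  # placeholder: your JazzSM server IP
PAGEID="tnpm.console.topology.performance.network.resourceGroups"
TOKEN="PASTE_XLAUNCH_TOKEN_HERE"  # placeholder: output of step 1

URL="https://${SERVER}:16311/ibm/action/launch/${PAGEID}/${TOKEN}"
echo "$URL"
# Open the printed URL in a browser, or fetch it with: curl -k "$URL"
```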

3) That's it

Wednesday, July 29, 2015

SevOne and TNPM. A technical comparison.

Some people have asked me about SevOne and how it compares to TNPM. Some even ask whether they should migrate from TNPM to SevOne. My answer is... it depends.

So, to help you make your decision, I prepared the table below. Please keep in mind that my intention is not to say which one is better (I have my personal opinion, though), but only to show the differences (and common points) between them and let you reach your own conclusion.



| Item | IBM TNPM | SevOne |
|------|----------|--------|
| Architecture | 5 specialized core components: Database, DataMart, DataView, DataChannel, DataLoad | 1 core component: the SevOne appliance |
| Installation | Very complex. Can take a long time (days to weeks) | Very simple. Appliance or VM based. A matter of hours to install and start using |
| Upgrade | Very complex. Can take a long time (days to weeks) | Simple. Press the update button and it is done (complex architectures may demand some special care) |
| Scalability | Via hardware upgrade on each core component | Via cluster. Add another appliance side by side in cluster mode |
| SNMP collection | Using collection formulas. Can be easily customized | Using certified collection formulas. Usually needs SevOne intervention |
| BULK collection | Using the PVLINE (proprietary) format | Using the xStats component |
| Component discovery | Using discovery formulas. Can be easily customized and allows custom properties | Not flexible. SevOne always discovers everything it can using SNMP walk. Allows custom properties |
| Component enrichment | Using inventory hooks | Using the API |
| Licensing | Based on the "class of devices" being monitored. Very complex | Per discovered and enabled object |
| Technical portal | WebSphere based. Allows some flexibility and customization | Cannot be customized or changed |
| Services portal | Supported | Not supported. Another solution needs to be used |
| Multi-tenancy | Supported | Supported |
| Reports | Cannot be created by the user | Can be created and shared by the user |
| Backup | Database backup | Nonexistent. To avoid data loss, another appliance needs to be paired in high availability |
| Cross collection | Supported, but complex | Supported |
| Grouping components (service-like measurements) | Supported, but complex. Can use rules to group components | Supported, but each component needs to be maintained manually or via the API |
| Historical data | Limited by the database storage size. Purge scripts can be used | Usually limited to 1 year. This can be changed, but it is highly dependent on the local storage available on each appliance |
| Authentication | LDAP or local | LDAP, RADIUS, TACACS |
| Authorization | Based on roles | Based on roles |
| Data volume | Report generation and data processing are subject to speed degradation as the data volume increases | Report generation and data processing are NOT subject to speed degradation as the data volume increases |
| Data aggregation | Pre-calculated and aggregated before being stored in the database. Raw data and aggregated data are stored side by side | Only raw data is stored. Aggregated data is calculated during report rendering |
| Data export | Can be done using the DataAccess component | Not trivial. The API allows export of reports in CSV format, but is not built for huge data exports |

Thursday, April 2, 2015

Error truncating partition

If, for any reason, you had an issue with your TNPM Datachannel LDR that took more than 3 days to solve, you may find the following error in an LDR walkback file once it starts processing the backlog:

END OF WALKBACK20165: Error truncating partition
ORA-06512: at "PV_ADMIN.PVM_ERROR", line 137
ORA-06512: at "PV_ADMIN.PVM_DATALOAD", line 3558
ORA-06512: at line 1

This is caused by the LDR trying to make changes to a database partition that is already in READ ONLY state. You can confirm it by searching for the following in the Oracle trace log:

$ORACLE_BASE/diag/rdbms/pv/pv/trace/proviso_PV_LDR_01.log
The Lowest Level Error Code is:
ORA-00372: file 487 cannot be modified at this time
ORA-01110: data file 487: '/(...)/PV_C01_1DGA_000_2014082500_001.dbf'

Unfortunately, the LDR will not proceed until the issue is solved and will keep generating walkbacks.

To solve the problem:

1) Connect to the oracle database as PV_ADMIN
2) Execute the query (replace the tablespace_name value with the correct one as shown in the Oracle trace log):

select tablespace_name, status from dba_tablespaces where tablespace_name like '%C01_1DGA_000_2014082500%';

3) You should get two columns, one with the tablespace name and the other with "READ ONLY"

4) Change the tablespace to READ WRITE by executing the following (use the tablespace name returned by the previous SQL query):

alter tablespace tablespace_name read write;

like:
alter tablespace C01_1DGA_000_2014082500 read write;

5) Bounce the LDR and it should work fine
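If the LDR was down long enough, more than one daily partition may have been frozen. A quick way to spot all of them at once (a sketch, run as PV_ADMIN; the standard Oracle DBA_TABLESPACES view exposes the status of every tablespace):

```sql
-- List every tablespace currently in READ ONLY state, in case
-- more than one partition was frozen during the outage.
SELECT tablespace_name, status
  FROM dba_tablespaces
 WHERE status = 'READ ONLY';
```

Each name returned can then be switched back with the same ALTER TABLESPACE ... READ WRITE statement shown above.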


Sunday, October 26, 2014

Piped merge error - what is wrong?

Last week a friend of mine came to me to ask about a strange error he was getting on TNPM. Basically, he had many gaps in report data for all devices, and it was apparently intermittent.

The error message on the log was the following:


V1:3353 2014.10.22-02.08.43 UTC LDR.1-21884:9897        SQLLDR  2 SQL Loader started
V1:3354 2014.10.22-02.08.43 UTC LDR.1-21884:13196       SQLLDR  3 Starting Piped Merge
V1:3355 2014.10.22-02.09.10 UTC LDR.1-21884:13196       MERGE_ERROR     GYMDC10118F Piped Merge Error: No such device or address:Transfer error
V1:3356 2014.10.22-02.09.10 UTC LDR.1-21884:13196       SQLLDRKILL      GYMDC10102W Killed sqlldr pid=a UnixProcess (Inactive: exitStatus nil, Error: Success) , result=a UnixProcess (Inactive: exitStatus nil, Error: Success)
V1:3357 2014.10.22-02.09.10 UTC LDR.1-21884:9897        ORASQLLDR       GYMDC10104F  ErrorCode=nil CommandLine=$ORA_HOME/11.2.0-client32/bin/sqlldr userid=PV_LDR_01/xxxx@pv log=...datachannel/LDR.1/state/2014.10.22-00/MERGED~000.1DGA.BOF.log control=...datachannel/LDR.1/loader.gagg.ctl logErrors=

After some investigation, it became evident that, for some reason, the Unix pipe created during the data merge was getting corrupted.

It is important to know that the LDR component has two options for merging and loading the data files. In the topology editor, if the option "USE_PIPE" is false, it will generate an intermediate file with the merged data and then use this file to load via Oracle sqlldr. If "USE_PIPE" is true, it will create a Unix pipe and Oracle sqlldr will read from it to load the data. Some people say that using the pipe is faster, because you don't have to create the intermediate file, but it can cause issues as well, as we will see.
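The USE_PIPE mechanism is essentially a named pipe (FIFO): one process writes the merged data into it while sqlldr reads from the other end. A tiny illustration of that pattern (the path and data are hypothetical, for the demo only, not TNPM's actual code):

```shell
#!/bin/sh
# Named-pipe pattern: a writer feeds the FIFO while a reader
# consumes it concurrently - no intermediate file on disk.
PIPE=/tmp/demo_ldr_pipe          # hypothetical path for this demo
mkfifo "$PIPE"
( printf 'row1\nrow2\n' > "$PIPE" ) &   # writer (stands in for the merger)
cat "$PIPE"                             # reader (stands in for sqlldr)
wait
rm -f "$PIPE"
```

This also shows why the pipe's location matters: the FIFO is just a filesystem node, so whatever backs that path backs the data transfer.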

The TNPM system I mentioned had been using the pipe method for quite a long time before the issue occurred. So it should be something in the system itself that had changed. And indeed, it was.

When using the pipe method, the pipe is created under /tmp/LDR.X, where X is the LDR channel number. This works fine if /tmp is mounted locally, but if, for any reason, the IT team decides to mount it on a remote data store... well, you will have problems. This is exactly what happened: the IT team decided to mount /tmp on a remote data store for the VMware cluster. Once the Datachannel was running on the cluster, the pipe load was affected.
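A quick way to check what is backing /tmp before suspecting the pipe (a sketch using GNU stat; on a healthy local setup you would expect a type like ext4, xfs or tmpfs rather than an nfs variant):

```shell
#!/bin/sh
# Print the filesystem type backing /tmp; a network type means the
# LDR pipe would effectively live on a remote data store.
FSTYPE=$(stat -f -c %T /tmp)
echo "/tmp is on: $FSTYPE"
case "$FSTYPE" in
  nfs*) echo "WARNING: /tmp is network-mounted - USE_PIPE may misbehave" ;;
  *)    echo "/tmp looks local" ;;
esac
```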

So, we deactivated the pipe (USE_PIPE=false) on the topology editor, deployed the topology and the problem was solved.

I could not find a way to change where the pipe is created, so it must be hard-coded somewhere. If you know how to do it, let me know :)