Monday, January 4, 2016

Server Recovery and lessons learnt

The last month of 2015 was exciting, at least at the beginning. Everything in the project was going as expected when suddenly a mail arrived: one of the production servers was inaccessible. The IT team found that some registry entries on the server were corrupted and that they would have to restore the backup (a tape backup of the entire C drive).

Then came the bad news: the backup was not working either. The IT team restored a backup as old as one year, but it was of no help. The only option was to build a new server and reinstall the Hyperion components on it. And so it began.

Components on this server were
1. EAS
2. FR and Reporting Framework services.
3. ODI Studio and Data loading batch scripts

The good news was that the Oracle Middleware home was on the D drive, which was intact. Hence:

Lesson #1: ALWAYS INSTALL EPM ON A DRIVE OTHER THAN C, OR MORE PRECISELY, NOT ON THE DRIVE THAT HOLDS THE OS FILES AND FOLDERS

Because of this, some essential folders that we needed later could still be used.
The next task was to reinstall HFR and the Framework services. The new server was given the D drive of the old one, so the Oracle folder was present, but that alone was of no use: the registry entries were gone, so the installer did not detect any existing installation.
I decided to use the 'Reinstall this release' option and reconfigure the database, but it did not work.

Lesson #2: Re-installing a component on a recovered server does not work. You have to install afresh, i.e. select the 'New installation' option.
But make sure you use the same directory name and path. So the old Oracle folder has to be renamed first (DO NOT DELETE it, as it is needed at the time of restore).
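
The rename can be scripted so that the old folder is never deleted by accident. Here is a minimal sketch in Python, assuming the old home sits at a path like D:\Oracle. The function name and the `_old_` suffix are my own illustrative choices, not part of any EPM tooling.

```python
from datetime import datetime
from pathlib import Path


def set_aside(folder: str) -> Path:
    """Rename `folder` to `<folder>_old_<timestamp>` and return the new path.

    Nothing is deleted; the renamed folder stays next to the original
    location so that its contents can be copied back later.
    """
    src = Path(folder)
    if not src.is_dir():
        raise FileNotFoundError(f"{src} is not an existing folder")
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    dst = src.with_name(f"{src.name}_old_{stamp}")
    src.rename(dst)  # stays on the same drive, so this is a rename, not a copy
    return dst


# Hypothetical usage, before running the installer with 'New installation':
# old_home = set_aside(r"D:\Oracle")
```

After this, the installer can create a fresh folder at the original path, and the renamed one remains available for the restore steps.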

At the time of configuration, use the 'Re-use schema' option. Selecting drop and recreate tables means saying goodbye to all the existing reports.

By god's grace, the configuration succeeded this time without any issue. One important point to note at the time of configuration is:

Lesson #3: When configuring the web server for FR, make sure that the Admin Server is up and running on the Shared Services server. Otherwise the domain setup will fail.
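
A quick way to check this before starting the configurator is to test whether the Admin Server port accepts connections. The sketch below only proves that something is listening on the port, not that the Admin Server is healthy. The host name is a placeholder, and 7001 is the WebLogic default port, which may differ in your setup.

```python
import socket


def is_listening(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False


# Hypothetical usage; replace the host and port with your Shared Services
# server and the port its Admin Server actually listens on.
# if not is_listening("sharedservices-host", 7001):
#     raise SystemExit("Admin Server is not reachable; start it first.")
```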

Recover RM1 folder.

Lesson #4: Once you have successfully started the FR services, stop them once and replace the contents of the RM1 folder with those of the older one.
It is located here:
D:\Oracle\Middleware\user_projects\epmsystem1\ReportingAnalysis\Data\RM1
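
The swap can be done by hand, but a small script makes it repeatable and keeps the freshly created folder as a fallback. This is a sketch under my own assumptions: the function name, the `_fresh` suffix and the location of the old copy (inside the renamed Oracle folder) are illustrative. Run it only while the FR services are stopped.

```python
import shutil
from pathlib import Path


def restore_folder(old_copy: str, live: str) -> Path:
    """Replace the contents of `live` with those of `old_copy`.

    The freshly created `live` folder is renamed to `<name>_fresh` instead
    of being deleted, so the step can be undone. Returns that backup path.
    """
    old_dir, live_dir = Path(old_copy), Path(live)
    if not old_dir.is_dir():
        raise FileNotFoundError(f"{old_dir} does not exist")
    backup = live_dir.with_name(live_dir.name + "_fresh")
    if backup.exists():
        raise FileExistsError(f"{backup} already exists; move it away first")
    live_dir.rename(backup)             # keep the new, empty repository aside
    shutil.copytree(old_dir, live_dir)  # recreate `live` from the old copy
    return backup


# Hypothetical usage: the first argument is the RM1 folder inside the renamed
# old Oracle folder, the second is the RM1 path shown above.
# restore_folder(old_rm1_path, new_rm1_path)
```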

Phew! The FR server was finally up and running. It was time to move on to ODI.

ODI does not require a reinstall; it works as expected, but the credential details are gone. Now comes the twist: for some reason the ODI work repository schema tables had been deleted!
Nobody has been able to figure out why that happened, since the database is on a different server.
We checked the schema details in the Master repository tables and found that some entries had changed. This was confirmed when we compared them with the Development repositories. Restoring from older schemas did not work. This is still a mystery.

Lesson #5: Never trust disk backups or schema backups taken by other teams.
The one thing that I've learnt the hard way is to take repository exports of ODI and keep them in a safe remote location. Also, it is a good habit to take EPM LCM exports of every component that we can think of.
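
Copying the exports is only half the job; the copy should also be verified, since an unreadable backup is exactly what started this story. Below is a minimal sketch that copies an export folder to another location and compares SHA-256 checksums. The function names and both folder paths are hypothetical, and the exports themselves are assumed to have been produced already by ODI and LCM.

```python
import hashlib
import shutil
from pathlib import Path
from typing import List


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_exports(export_dir: str, remote_dir: str) -> List[str]:
    """Copy every file under export_dir to remote_dir and verify each copy.

    Returns the relative paths that were copied; raises if a copy differs
    from its source.
    """
    src_root, dst_root = Path(export_dir), Path(remote_dir)
    copied = []
    for src in sorted(p for p in src_root.rglob("*") if p.is_file()):
        rel = src.relative_to(src_root)
        dst = dst_root / rel
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)
        if sha256_of(src) != sha256_of(dst):
            raise IOError(f"checksum mismatch for {rel}")
        copied.append(rel.as_posix())
    return copied


# Hypothetical usage with a mapped network share:
# archive_exports(r"D:\Exports\2016-01-04", r"\\backup-host\epm\2016-01-04")
```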

Then we found out that the Development and Production systems were not in sync.

Lesson #6: Always keep all environments in sync. That way, contents can be copied directly from one environment to another when one of them has to be rebuilt.
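
One rough way to spot drift is to take the same kind of export from both environments and compare the two folders file by file. The sketch below does that with checksums. The function names and folder layout are my own assumptions, and since exports can contain environment-specific details, a file listed as "changed" is a prompt to look closer rather than proof of a problem. It reads each file fully into memory, which is fine for small exports only.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def snapshot(root: str) -> Dict[str, str]:
    """Map each file's relative path under root to its SHA-256 digest."""
    base = Path(root)
    return {
        p.relative_to(base).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in base.rglob("*")
        if p.is_file()
    }


def drift(dev_root: str, prod_root: str) -> Dict[str, List[str]]:
    """Report files present on one side only, or differing in content."""
    dev, prod = snapshot(dev_root), snapshot(prod_root)
    return {
        "only_in_dev": sorted(dev.keys() - prod.keys()),
        "only_in_prod": sorted(prod.keys() - dev.keys()),
        "changed": sorted(k for k in dev.keys() & prod.keys() if dev[k] != prod[k]),
    }


# Hypothetical usage on two export folders:
# print(drift(r"D:\Exports\DEV", r"D:\Exports\PROD"))
```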