SAP Startup Troubleshooting
1. IS DATABASE UP AND RUNNING?
This is an absolute prerequisite for the SAP system to start. To test it, run the operating system command R3trans -d as the SIDadm user. Any return code other than 0 means that the database cannot be reached by the dialog work processes, so the SAP system will not start. The following output indicates that the database is up, running, and reachable by the SAP work processes:
<hostname>:SIDadm > R3trans -d
This is R3trans version X.xx (release 742 - 18.11.14 - 20:14:09).
unicode enabled version
R3trans finished (0000).
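The same check can be scripted. The following is a minimal sketch, assuming it runs as the SIDadm user with R3trans on the PATH; the function name check_db is made up for illustration and is not part of any SAP tool.

```shell
#!/bin/sh
# check_db: hypothetical helper name, not an SAP command.
# Runs "R3trans -d" and applies the rule described above:
# only return code 0 means the database is reachable.
check_db() {
    R3trans -d >/dev/null 2>&1
    rc=$?
    if [ "$rc" -eq 0 ]; then
        echo "DB reachable (R3trans rc=0)"
    else
        echo "DB NOT reachable (R3trans rc=$rc)"
    fi
    # Pass the R3trans return code on to the caller.
    return "$rc"
}
```

A call such as `check_db || echo "look at trans.log"` would then flag the problem; R3trans normally writes its details to a trans.log file in the directory it was started from.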
2. IS SAP STARTUP AGENT (SAPSTARTSRV) RUNNING?
There must be ONLY ONE sapstartsrv process running for each instance. A Unix example is shown below.
Example
As an example, take a centralized system with the following three instances running on the same host: ASCS, PAS and DI. The operating system command below should return output like the following:
ps -ef | grep <SID> | grep sapstartsrv
(...) /usr/sap/<SID>/ASCS00/exe/sapstartsrv pf=/usr/sap/<SID>/SYS/profile/<SID>_ASCS00_<hostname> -D
(...) /usr/sap/<SID>/DVEBMGS01/exe/sapstartsrv pf=/usr/sap/<SID>/SYS/profile/<SID>_DVEBMGS01_<hostname> -D
(...) /usr/sap/<SID>/D02/exe/sapstartsrv pf=/usr/sap/<SID>/SYS/profile/<SID>_D02_<hostname> -D
There can be another sapstartsrv process (as below). It belongs to the SAP Host Agent and plays no part in the startup of the instance.
(...) /usr/sap/hostctrl/exe/sapstartsrv pf=/usr/sap/hostctrl/exe/host_profile -D
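The "only one per instance" rule can be checked mechanically. The sketch below is one possible way to do it, assuming the ps -ef output has the shape shown above (a pf=<profile> argument on each sapstartsrv line); the function name dup_sapstartsrv is hypothetical.

```shell
#!/bin/sh
# dup_sapstartsrv: hypothetical helper. Reads "ps -ef" output on stdin and
# prints "<count> <profile>" for every instance profile that is used by more
# than one sapstartsrv process. The hostctrl sapstartsrv is ignored.
# Exit status: 0 = no duplicates, 1 = at least one duplicate.
dup_sapstartsrv() {
    awk '
        /sapstartsrv/ && !/hostctrl/ {
            # Find the pf=... argument and count processes per profile.
            for (i = 1; i <= NF; i++)
                if ($i ~ /^pf=/) count[substr($i, 4)]++
        }
        END {
            for (p in count)
                if (count[p] > 1) { print count[p], p; dup = 1 }
            exit dup + 0
        }
    '
}
```

Typical use would be `ps -ef | dup_sapstartsrv`; no output and exit status 0 mean that each instance profile has a single startup agent.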
If the service does not start and nothing is written to "sapstartsrv.log", the binaries in the local exe folder may be damaged. In that case, they have to be replaced.
After trying to start the instance (either through the SAP MMC, the "startsap" command or using "sapcontrol ... -function Start"), no error is returned. However, the SAP processes (Dispatcher - ABAP; jstart - Java) do not start. No trace file is updated either.
Verify whether any process is already in "GREEN" or "YELLOW" status (e.g., the IGS).
hostname:sidadm 40> sapcontrol -nr 33 -function GetProcessList
12.10.2015 20:15:45
GetProcessList
OK
name, description, dispstatus, textstatus, starttime, elapsedtime, pid
msg_server, MessageServer, GRAY, Stopped, , , 1234
disp+work, Dispatcher, GRAY, Stopped, , , 1111
igswd_mt, IGS Watchdog, GREEN, Running, 2015 07 31 11:59:15, 1:21:09, 2323
hostname:sidadm 41>
All processes must have the "GRAY" status. Otherwise, the SAP Startup Agent (sapstartsrv) will "ignore" the start command. Stop the instance first, so all processes are stopped. Then, start the instance again.
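To spot leftover processes quickly, the GetProcessList output can be filtered. This is a rough sketch that assumes the comma-separated layout shown above, with dispstatus as the third column; the function name not_gray is hypothetical.

```shell
#!/bin/sh
# not_gray: hypothetical helper. Reads "sapcontrol ... -function
# GetProcessList" output on stdin and prints "<name> <dispstatus>" for every
# process whose dispstatus is not GRAY.
# Exit status: 0 = everything is GRAY, 1 = something is still running.
not_gray() {
    awk -F', *' '
        # Rows after the column header line: field 3 is dispstatus.
        header && NF >= 3 && $3 != "GRAY" { print $1, $3; bad = 1 }
        # The column header line itself starts with "name".
        $1 == "name" { header = 1 }
        END { exit bad + 0 }
    '
}
```

With the output shown above, `sapcontrol -nr 33 -function GetProcessList | not_gray` would print `igswd_mt GREEN`, which is the process that has to be stopped before the start command is accepted.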
Log checking sequence:
Database starts
SAP checks whether the database is available and, if it is not, starts it. No dedicated developer trace is created in the work directory, as the database has its own logs. If the database does not start correctly, this should be visible within seconds.
Message server kicks in
The message server is the first part of the system to start; it handles the communication between the instances. Only one message server exists per system, regardless of the number of dialog instances. The message server logs are kept in the dev_ms developer trace, and if there is an issue with the message server it will be pretty obvious, because you will not see a dev_disp trace.
Dispatcher next
Ahhh, the Dispatcher!… Well, the dispatcher dispatches (ha!). Nah, seriously… The dispatcher's job is to receive the requests and direct them to a suitable available work process. During start-up you can see in the dev_disp developer trace how the work processes are started. If something were to go wrong, the only thing you will find in dev_disp is that the work processes died. Usually this trace gives no reason for the failure to launch, but one dev_w* developer trace is created and populated for each work process along the way.
The dialog processes
This is where you’ll find the root of the problem 99.9% of the time. Why?… Each dialog process has a dedicated memory area and a dedicated connection to the database layer. If something is going to go wrong it will most likely happen here. Each work process has its own developer trace dev_w* where you will find detailed information on the error.
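A quick way to skim those traces is to pull out the error lines. The sketch below assumes that error entries in the dev_w* files contain the word "ERROR"; the function name show_wp_errors and the limit of five lines per file are choices made for this illustration.

```shell
#!/bin/sh
# show_wp_errors: hypothetical helper. For each work process trace (dev_w*)
# in the given work directory, prints the file name followed by the first
# five lines containing "ERROR", prefixed with their line numbers.
# Assumption: error entries in the traces contain the word "ERROR".
show_wp_errors() {
    workdir=$1
    for f in "$workdir"/dev_w*; do
        # Skip the unexpanded pattern when no trace exists.
        [ -f "$f" ] || continue
        echo "== $f"
        grep -n 'ERROR' "$f" | head -n 5
    done
    return 0
}
```

Called with the instance work directory as its only argument, it gives a short per-process overview before you open a trace in full.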
So the logical troubleshooting sequence, in a nutshell, is…
- Check the connection to the database with a quick R3trans -d command. It's the quickest and simplest way to rule out DB availability as the issue.
- Go to the work directory and check the developer traces. If the dev_w* logs exist, it means that the message server and dispatcher started and the issue is in the work process logs.
- If the dispatcher and work process developer traces are not created then your issue is in the message server developer trace.
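The trace-based part of this sequence can be sketched as a small function. The name startup_stage is hypothetical; skipping *.old files rests on the assumption that those are traces kept from a previous start, and the middle branch (dev_disp present, no dev_w*) is not covered by the steps above and is added here only as a guess.

```shell
#!/bin/sh
# startup_stage: hypothetical helper. Looks at which developer traces exist
# in the given work directory and says which trace to read first.
startup_stage() {
    workdir=$1
    has_wp=0
    for f in "$workdir"/dev_w*; do
        # Assumption: *.old files are copies from an earlier start.
        case $f in *.old) continue ;; esac
        [ -f "$f" ] && has_wp=1
    done
    if [ "$has_wp" -eq 1 ]; then
        echo "work process traces exist: check dev_w*"
    elif [ -f "$workdir/dev_disp" ]; then
        # Case not described in the steps above (assumption).
        echo "dispatcher trace only: check dev_disp"
    else
        echo "no dispatcher or work process traces: check dev_ms"
    fi
}
```

Run it against the instance work directory once R3trans -d has confirmed that the database is reachable.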
The next time you hit a start-up issue, I’m hoping these simple steps mean we can dispense with the crystal ball.