Microsoft Dynamics NAV instance running after windows restart but not responding - microsoft-dynamics

I have an issue with a Microsoft Dynamics NAV instance.
We restart the Windows server each night (I know that we probably shouldn't, but this is a separate topic and not the point of this question).
After Windows Server starts, the SQL Server and Dynamics NAV instances start. Sometimes (1-2 times per month) the Dynamics NAV instance is marked as 'Running' but doesn't actually respond (web services are not working, the RTC client cannot connect to the instance, etc.). We have to restart the Dynamics NAV instance manually once more to get it working correctly.
Has anyone had similar problems? We looked into the Windows logs but couldn't find anything interesting.
We also wonder whether we should manage the start of the services (SQL Server, Dynamics NAV instances, etc.) ourselves instead of depending on the automatic start of everything after a Windows restart.
Update:
There is actually one error in the Windows Event Log which occurs ONLY on days when the Dynamics NAV instance has not started correctly:
Server instance: XXXX
Tenant ID:
User:
Type: System.AggregateException
Message: A Task's exception(s) were not observed either by Waiting on the Task or accessing its Exception property. As a result, the unobserved exception was rethrown by the finalizer thread.
HResult: -2146233088
Type: System.BadImageFormatException
Message: An attempt was made to load a program with an incorrect format. (Exception from HRESULT: 0x8007000B)
StackTrace:
    at Microsoft.Dynamics.Nav.Runtime.NavLicense.NativeMethods.UnpackLicense(Byte[] license, Int32 licenseSize, StringBuilder header, Int32 headerSize)
    at Microsoft.Dynamics.Nav.Runtime.NavLicense.Create(Byte[] license, LicenseExpiredHandler licenseExpiredHandler)
    at Microsoft.Dynamics.Nav.Runtime.NavDatabaseSecurityAndLicense.get_License()
    at Microsoft.Dynamics.Nav.Runtime.WindowsLanguageDataProvider.IsAvailableLanguage(Int32 languageId)
    at Microsoft.Dynamics.Nav.Runtime.NavEnvironment.FindSupportedLanguage(Int32 languageId, Int32 defaultLanguageId)
    at Microsoft.Dynamics.Nav.Runtime.NavSession.Open(Boolean useUserPersonalization, Byte[] licenseToUse, Boolean allowAppsDisabledMode)
    at Microsoft.Dynamics.Nav.Runtime.NavTaskSchedulerHelpers.RunAsSystemSession(NavTenant tenant, Action`1 action)
    at Microsoft.Dynamics.Nav.Runtime.NavTaskScheduler.TaskRunInfo.InternalRun()
    at Microsoft.Dynamics.Nav.Runtime.NavTaskFactory.<>c__DisplayClass1_0.<RunTask>b__0()
    at System.Threading.Tasks.Task.InnerInvoke()
    at System.Threading.Tasks.Task.Execute()
Source: Microsoft.Dynamics.Nav.Ncl
HResult: -2147024885
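
The two HResult values in the log are signed 32-bit integers. A minimal Python sketch (not part of the original log; the helper name `decode_hresult` is made up for illustration) shows how they map to the usual hexadecimal form, and that the inner value is the same 0x8007000B quoted in the BadImageFormatException message:

```python
def decode_hresult(value):
    """Split a signed 32-bit HRESULT into its hex form, facility and code."""
    unsigned = value & 0xFFFFFFFF            # reinterpret as unsigned 32-bit
    return {
        "hex": f"0x{unsigned:08X}",
        "failure": bool(unsigned >> 31),     # severity bit: 1 means failure
        "facility": (unsigned >> 16) & 0x7FF,  # 11-bit facility field
        "code": unsigned & 0xFFFF,           # low 16 bits: the error code
    }


inner = decode_hresult(-2147024885)   # HResult of the BadImageFormatException
outer = decode_hresult(-2146233088)   # HResult of the AggregateException
print(inner["hex"], inner["facility"], inner["code"])
print(outer["hex"])
```

Facility 7 is the Win32 facility and code 11 is ERROR_BAD_FORMAT, which commonly (though not always) indicates a 32-bit/64-bit mismatch when a native library is loaded.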

I'd suggest the Automatic (Delayed Start) startup type to help with dependencies that are not yet available at boot, such as certificate OCSP validation without internet access. There should be an entry in the Windows logs saying "The service has completed configuration and is ready."
Service auto-restart (recovery) actions might help catch unexpected errors, but since the service is reported as Running, I'm not sure they apply to your situation.
The service tier should not be restarted nightly, as you've pointed out :). It might be easier to solve that issue, but I can't suggest anything without more information.
Also, which version of Dynamics NAV/Business Central?
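
Because the service is reported as Running while it is not responding, one option is an external check that probes the instance and restarts it when the probe fails. The following is only a sketch under stated assumptions: the function names are made up, the restart uses PowerShell's Restart-Service, and an open TCP port is a weak health signal (a hung service tier may still accept connections), so a real check would preferably call an actual web service endpoint.

```python
import socket
import subprocess


def port_responds(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def restart_command(service_name):
    """Build a PowerShell command line that restarts a Windows service."""
    return ["powershell", "-NoProfile", "-Command",
            f"Restart-Service -Name '{service_name}' -Force"]


def check_and_restart(host, port, service_name, run=subprocess.run):
    """Restart the service if its port does not respond.

    Returns True if a restart was triggered, False otherwise.
    """
    if port_responds(host, port):
        return False
    run(restart_command(service_name), check=True)
    return True
```

A scheduled task could call, for example, `check_and_restart("localhost", 7047, "MicrosoftDynamicsNavServer$DynamicsNAV")` some minutes after boot. Both values are assumptions: 7047 is the default SOAP port, and `DynamicsNAV` stands in for your instance name.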

Related

Windows agents implementation

One creates a Windows agent by calling CreateService with one of the following two parameters: SERVICE_USER_OWN_PROCESS or SERVICE_USER_SHARE_PROCESS. When SERVICE_USER_OWN_PROCESS is used, the agent will start with the next login and it will have a name like <service_name>_<some session ID>. Examples of Windows 10 Microsoft agents: MessagingService_ba3d3c, PrintWorkflowUserSvc_ba3d3c or DevicesFlowUserSvc_ba3d3c (call sc query type=userservice to see the active ones). In this case, the suffix is 0xba3d3c, while the Logon Session is 0xba1a53 (close, but not the same), as seen with Process Explorer.
My questions are:
Can I start the agent immediately after installation without logout? It would help with the installer that asks for reboot now.
What is this mysterious "session ID" ? It would help with the testing, to avoid enumeration and guessing.
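
Until the meaning of the suffix is clear, a test can at least locate the per-user instance by its template name rather than guessing the suffix. This is a small sketch, assuming the list of service names has already been collected (for example from the output of sc query type=userservice); the function names are hypothetical, and a regular service whose name happens to end in an underscore plus hex digits would also match.

```python
import re

# <template>_<hex suffix>, e.g. MessagingService_ba3d3c
_USER_SERVICE = re.compile(r"^(?P<template>.+)_(?P<suffix>[0-9a-fA-F]+)$")


def split_user_service_name(name):
    """Return (template, suffix_as_int), or None if the name has no hex suffix."""
    match = _USER_SERVICE.match(name)
    if match is None:
        return None
    return match.group("template"), int(match.group("suffix"), 16)


def instances_of(template, service_names):
    """Return the names that look like per-user instances of the given template."""
    found = []
    for name in service_names:
        parts = split_user_service_name(name)
        # Windows service names are case-insensitive, so compare case-folded.
        if parts is not None and parts[0].casefold() == template.casefold():
            found.append(name)
    return found
```

For the names quoted above, `split_user_service_name("MessagingService_ba3d3c")` yields the template `MessagingService` and the suffix 0xba3d3c.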

CoRegisterClassObject returns error (session 0?)

A customer is running one of our programs, usually run as a service, as an application. The customer is getting the following error on CoRegisterClassObject():
The class is configured to run as a security id different from the caller.
It looks like some type of session 0 error, but why should CoRegisterClassObject() care about session 0? COM should allow both services (session 0) and apps (session > 0) and not care what registers what, shouldn't it?
Also, I don't like the fact that it's not in the list of errors returnable by CoRegisterClassObject(), as per the Microsoft doc webpage.
The error code in question is CO_E_WRONG_SERVER_IDENTITY (0x80004015).
Per this page:
COM security frequently asked questions
Q6 Why does CoRegisterClassObject return CO_E_WRONG_SERVER_IDENTITY? When launching my ATL 1.1 server service as an .exe file, I receive CO_E_WRONG_SERVER_IDENTITY from CoRegisterClassObject. (The class is configured to run as a security ID different from the caller.) This seems to occur whether I skip the CoInitializeSecurity or not. It fails running as a service or as an .exe file.
A. Many services are debugged by running them as console applications in the interactive user identity. Because the service is already registered to run in a different identity (configurable by the Services control panel applet), OLE fails the CoRegisterClassObject and RunningObjectTable::Register(ROTFLAGS_ALLOWANYCLIENT) calls by returning CO_E_WRONG_SERVER_IDENTITY to enforce security and to prevent malicious servers from spoofing the server. To debug by running in the interactive user's identity, make the following changes in the server's registry entries to prevent these failures:
• To prevent CoRegisterClassObject failure, remove the following named value:
[HKEY_CLASSES_ROOT\APPID\{0bf52b15-8cab-11cf-8572-00aa00c006cf}]
"LocalService"="HelloOleServerService"
• To prevent an IRunningObjectTable::Register(ROTFLAGS_ALLOWANYCLIENT) failure, follow these steps:
Remove the following named value:
[HKEY_CLASSES_ROOT\APPID\{0bf52b15-8cab-11cf-8572-00aa00c006cf}]
"LocalService"="HelloOleServerService"
Then add the following named value:
[HKEY_CLASSES_ROOT\APPID\{0bf52b15-8cab-11cf-8572-00aa00c006cf}]
"RunAs"="Interactive User"
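You must restore the modified registry entries after debugging.
I am assuming you would have to replace {0bf52b15-8cab-11cf-8572-00aa00c006cf} with your own COM server's AppID (the key lives under HKEY_CLASSES_ROOT\APPID; the AppID is often, but not necessarily, the same GUID as the CLSID).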
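
The registry changes described in the FAQ can be written as a .reg file. The sketch below only generates the file text for a given AppID; it is an illustration, not something from the FAQ, and the function name `build_debug_reg` is made up. Export the key before importing such a file, since the entries have to be restored after debugging.

```python
def build_debug_reg(app_id):
    """Return .reg text that removes LocalService and sets RunAs for an AppID.

    app_id must include the braces, e.g. "{0bf52b15-8cab-11cf-8572-00aa00c006cf}".
    """
    key = "HKEY_CLASSES_ROOT\\AppID\\" + app_id
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        "[" + key + "]",
        '"LocalService"=-',              # "=-" deletes the named value
        '"RunAs"="Interactive User"',    # run the server as the interactive user
        "",
    ]
    return "\r\n".join(lines)


print(build_debug_reg("{0bf52b15-8cab-11cf-8572-00aa00c006cf}"))
```

If only the CoRegisterClassObject failure matters, the RunAs line can be dropped, as the first bullet of the FAQ answer removes LocalService only.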

Word Automation Service BatchGetSyncJobStatus fails when requesting security token

I'm running a SharePoint 2013 on-premises server on which I have deployed a simple WCF service as a farm solution. The service accepts simple HTTP POST requests that contain a single MS Word document as payload and returns the file converted to PDF.
The service is accessible via HTTP to anonymous users. The WordAutomationService is running under the administrator account of the SharePoint server.
The service class creates a new instance of Microsoft.Office.Word.Server.Conversions.SyncConverter and passes the proxy of SharePoint's running WordAutomationService into the constructor (together with some ConversionJobSettings). Finally, it calls the Convert method on the SyncConverter with the input stream (the Word document) and the output stream (the web response, which will contain the resulting PDF document produced by the WordAutomationService).
When creating the SyncConverter I don't set the UserToken property because the access to the service is by anonymous users. According to the remarks here https://msdn.microsoft.com/en-us/library/microsoft.office.word.server.conversions.syncconverter.usertoken.aspx this seems to be fine:
The default value for this property is a null reference (Nothing in Visual Basic), which is anonymous.
This setup works fine for small Word documents with a couple of pages and returns the expected PDF files. But as soon as the execution time of the WordAutomationService on the SharePoint exceeds a certain time threshold (around 5 seconds) the service fails because it never returns (which leads to a read timeout on the client).
According to the logs it seems the reason for this is that after some time the synchronous conversion job moves the work into a background process:
Sync Stream job conversion takes too long. Don't wait anymore. Check its status later
It then polls the status of this job on a regular basis by calling ConversionServiceApplicationProxy.BatchGetSyncJobStatus. Unfortunately this call fails because internally it tries to create a new channel to talk to this process and for that asks for a security token. The SecurityTokenService however cannot complete the token request and throws an exception:
An unhandled exception has occurred. The security token request cannot be completed. System.InvalidOperationException: The security token request cannot be completed.
at Microsoft.SharePoint.SPSecurityContext.SecurityTokenForServiceContext(Uri contextUri)
at Microsoft.SharePoint.SPChannelFactoryOperations.InternalCreateChannelActingAsLoggedOnUser[TChannel](ChannelFactory`1 factory, EndpointAddress address, Uri via)
at Microsoft.Office.ConversionServices.Service.ConfigChannelFactory`1.CreateChannel(EndpointAddress address)
at Microsoft.Office.ConversionServices.Service.ConversionServiceApplicationProxy.GetChannel(Uri uri)
at Microsoft.Office.ConversionServices.Service.ConversionServiceApplicationProxy.ExecuteOnChannel(Uri endpointAddress, Action`1 action)
at Microsoft.Office.ConversionServices.Service.ConversionServiceApplicationProxy.BatchGetSyncJobStatus(ICollection`1 ucids, Uri endpointAddress)
at Microsoft.Office.ConversionServices.Service.BatchGetStatusPollingThread.Run()
at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
at System.Threading.ThreadHelper.ThreadStart()
StackTrace:
at onetnative.dll: (sig=37460b31-4453-4365-92f5-3a11c267be48|2|onetnative.pdb, offset=28F56)
at onetnative.dll: (offset=15735)
I'm at a loss now how to get rid of the token issue so that the system can create the necessary channel to poll the conversion job status. Any help is highly appreciated. Thanks!
(I can't post the full log because it registers as spam)
I’ve found that, if you were to install SharePoint 2013 on a Domain Controller (a topology that Microsoft said is only good for development but not for production), then the default anonymous user in IIS (IUSR) will not work reliably, and any WCF solution which is accessed via an IIS site that has Anonymous Access configured to use the IUSR account will fail when it attempts to access Security Token Service.
In this case the most expedient solution is to reconfigure IIS to use another anonymous identity, namely the identity tied to the Application Pool.
For example if your site is called NameOfSite, you can run this in an elevated PowerShell:
Set-WebConfigurationProperty `
    -Filter /system.webServer/security/authentication/anonymousAuthentication `
    -Name userName `
    -Value "" `
    -Location "NameOfSite"
This solves the immediate problem at hand which is that SecurityTokenForServiceContext fails. However, if you’ve installed SharePoint 2013 on Windows 2012 R2 as a Domain Controller, then it is not over: WordServerWorker actually will not start in this configuration.
I can also confirm, however, that if you were to install SharePoint 2013 on a standalone server (with <Setting Id="SERVERROLE" Value="SINGLESERVER"/> role in the unattended config file), then the entire solution works end-to-end, and WordServerWorker will actually start properly.
Previously, the most relevant (and unanswered) question on this must be this MSDN posting, “The security token request cannot be completed”. I would assume that in that case, the service was only in a meta-stable state, and one of the IIS workers would have previously obtained credentials via NTLM during local testing.
Usually, when SharePoint service applications interact with each other, they maintain the current user context across WCF calls by using the Service Application Framework (SAF). This allows the services to use SPContext.Current, preserve the correlation ID between calls in the logs, and so on. When this context is lost, the services can no longer communicate with each other. For example, this happens when code starts a new thread but doesn't set up the user for the newly created thread's context.
According to your description, your service is anonymous and doesn't use SAF to maintain the user context, but it uses services that require that context to exist.
A possible solution is to use SAF (which is, in a nutshell, WCF with a tricky configuration) instead of plain WCF services with no authentication.
Edit:
One more possible solution is to wrap your code in RunWithElevatedPrivileges so that your service connects to SharePoint with the application pool identity.

ColdFusion 2016 - Security service not available

CF 2016 on Windows 10 with IIS.
I've checked other threads on similar issues and they don't appear to apply.
My laptop has needed to be crash-started on a number of occasions recently due to the laptop not waking up from sleep mode. A couple of times ColdFusion 2016 didn't start automatically and needed to be manually started. Now, ColdFusion appears to be starting automatically, but now I'm getting an error:
HTTP Error 500.0 - The Security service is not available.
I'm afraid I have no idea where to start on this or even what additional information to provide. So, I would really appreciate any hints.
The remainder of the error has the following information:
Detailed Error Information:
Module: IsapiModule Notification: ExecuteRequestHandler
Handler: ISAPI-dll
Error Code: 0x00000000
Requested URL: http://zbay_sys:80/jakarta/isapi_redirect.dll
Physical Path : C:\ColdFusion2016\config\wsconfig\1\isapi_redirect.dll
Logon Method: Anonymous
Logon User : Anonymous
I really hope I don't have to re-install CF
Glad that you are sorted.
The error message says, "The Security service is not available," so IIS is returning an HTTP 500 error. If the service is not starting, the problem is likely on the ColdFusion side.
Please try the following if you face a similar issue in the future:
1. Stop the ColdFusion service (if it is not already stopped).
2. Launch Command Prompt as Administrator.
3. Browse to cf_root\cfusion\bin and run the following command: coldfusion.exe -start console
4. Once the services have started, try to access the CF Admin.
If it gives an error message, please share it.

sitecore session time-out or server failure on publish or browse for package to install

I am at my wits' end on this and can't figure it out. In Sitecore v6.2 something has changed that is causing the following error message:
"The operation could not be completed. Your session may have been lost due to a time-out or server failure".
It looks like this is coming from Sitecore.Web.UI.Sheer.ClientPage?
The request info:
https://sitecore.test.domain.com/sitecore/shell/sitecore/content/Applications/Content%20Editor.aspx?ic=People%2f16x16%2fcubes_blue.png
the response:
{"commands":[{"command":"Alert","value":"The operation could not be completed.\n\nYour session may have been lost\ndue to a time-out or a server failure.\n\nTry again."}]}
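
To spot this condition in captured traffic or in an automated smoke test, the response body can be parsed and checked for Alert commands. This is a small sketch based only on the response shown above; the function name is made up, and other Sitecore responses may use a different structure.

```python
import json


def alert_messages(response_text):
    """Return the text of every Alert command in a Sheer UI JSON response."""
    payload = json.loads(response_text)
    return [
        command.get("value", "")
        for command in payload.get("commands", [])
        if command.get("command") == "Alert"
    ]


# The response body quoted above, as a raw string so the \n escapes reach the JSON parser.
sample = r'{"commands":[{"command":"Alert","value":"The operation could not be completed.\n\nYour session may have been lost\ndue to a time-out or a server failure.\n\nTry again."}]}'

for message in alert_messages(sample):
    print(message)
```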
At first, I assumed it was because I had plugged in some new HttpModules, so I moved them into the Sitecore pipeline model, but the problem persisted. I then removed them from the application entirely, and the problem still persisted.
A Google search on the error led me to some information about keepalive.aspx, but addressing that made no difference.
I decompiled the code with Reflector, but I can't find where this particular error is raised. It must be in Sitecore.Nexus or something.
According to my superiors we will open a ticket once we get the build resolved, but here's to hoping someone here has some suggestions.
The constant for this error message is THE_OPERATION_COULD_NOT_BE_COMPLETED_YOUR_SESSION_MAY_HAVE_BEEN_LOSTDUE_TO_A_TIMEOUT_OR_A_SERVER_FAILURE_PLEASE_TRY_AGAIN
This might happen if your server restarts while a dialog is open.