
svchost, and nsi in particular, has persistently been associated with memory leaks over the years; Google searches turn up many hits but few solutions. Related questions have been asked multiple times on Super User, but all that I've seen were from people asking how to determine which process contains the leak, which is not much help when I already know that.

I'm opening this question because the problem seems to persist despite claims that this or that solution fixed it, to report the results of my own investigation in case they're helpful, and to see how else I might dig into the problem.

The one thing suggested that might have addressed the problem was application of KB 2950358 (sorry, can't link for lack of rep), but the installer simply says that this update is not applicable to this system.

Machine, OS and software: Win7 Pro x64, 8 GB memory with nVidia GTX 580 video (drivers from nVidia, 372.54 dated August 15, which is 15 days ago at this writing). Processes that are nearly always running include Spotify, Chrome (currently v52.0.2743.116), Skype (currently 7.26.0.101) plus a few Cygwin mintty, bash and ssh processes. Internet Explorer is not installed (beyond the bits that can't be removed). The usual browser add-ins are present, such as Flash for YouTube. Nothing is hugely out of the ordinary, although a few of these programs make heavy use of networking and could, theoretically, be implicated if the likes of KB 2847346 are to be believed. All Windows updates, including the latest optional roll-up update, have been applied.

Omitting some intermediate steps, I separated nsi out into its own svchost, rebooted and then logged the output of tasklist, every second, for the PIDs of the nsi process and the svchost to which nsi used to belong. The results are plotted here; sure enough, the latter is basically flat, but nsi grows at a steady (if not increasing) rate.
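For anyone wanting to reproduce the logging step, here is a minimal sketch of how it could be scripted. It is not the exact method used above: it assumes Python 3.7+ is available, that `tasklist` is called with its `/FI`, `/FO CSV` and `/NH` switches, and that the PIDs to watch are passed on the command line. The function names (`parse_tasklist_csv`, `sample`) are made up for illustration.

```python
import csv
import io
import subprocess
import time


def parse_tasklist_csv(line):
    """Parse one line of `tasklist /FO CSV /NH` output into (image, pid, mem_kb)."""
    # Columns: "Image Name","PID","Session Name","Session#","Mem Usage"
    row = next(csv.reader(io.StringIO(line)))
    image, pid, mem = row[0], int(row[1]), row[4]
    # "Mem Usage" looks like "24,576 K"; keeping only the digits also copes
    # with locales that use a different thousands separator.
    mem_kb = int("".join(ch for ch in mem if ch.isdigit()))
    return image, pid, mem_kb


def sample(pid):
    """Return (image, pid, mem_kb) for one PID, or None if it is not running."""
    out = subprocess.run(
        ["tasklist", "/FI", f"PID eq {pid}", "/FO", "CSV", "/NH"],
        capture_output=True, text=True,
    ).stdout
    for line in out.splitlines():
        if line.startswith('"'):  # skip blank lines and "INFO: No tasks..." messages
            return parse_tasklist_csv(line)
    return None


if __name__ == "__main__":
    import sys

    pids = [int(arg) for arg in sys.argv[1:]]  # PIDs to watch (supplied by the user)
    while True:
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        for pid in pids:
            result = sample(pid)
            if result is not None:
                image, _, mem_kb = result
                print(f"{stamp},{pid},{image},{mem_kb}", flush=True)
        time.sleep(1)
```

Redirecting the output to a file gives one CSV row per PID per second, which can then be plotted. Note that the "Mem Usage" column is, as far as I understand, the working set rather than the commit size, so the two svchost instances are being compared on that basis.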

At the same time, I used procmon to record the events generated by nsi, but all but 6 of them were Thread Create and Thread Exit events, which isn't very helpful. Whatever is causing the problem is not causing nsi to make syscalls of its own.

Before I split nsi out, I did a similar trace for nearly four days and that svchost instance started off at 24 MB and grew continuously to about 2150 MB before I stopped it, with the rate of change apparently increasing with time. In the past, I've seen the offending svchost process upwards of 6 GB but, with the procmon running, that was the point at which I began to run out of memory. A couple of times, some memory was released, but not as much as was allocated. I can link this graph later if anybody wants to see it.
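To put a number on "the rate of change apparently increasing with time", one option is to fit a straight line to the first and second halves of the log and compare the slopes. The sketch below assumes the samples have already been reduced to `(hours, megabytes)` pairs; the helper names are hypothetical and the approach is only a rough check, not a proper statistical test.

```python
def slope(samples):
    """Least-squares slope of (hours, megabytes) samples, in MB per hour."""
    n = len(samples)
    mean_t = sum(t for t, _ in samples) / n
    mean_m = sum(m for _, m in samples) / n
    num = sum((t - mean_t) * (m - mean_m) for t, m in samples)
    den = sum((t - mean_t) ** 2 for t, _ in samples)
    return num / den


def half_slopes(samples):
    """Return (early_rate, late_rate): growth rates of the two halves of the log."""
    if len(samples) < 4:
        raise ValueError("need at least four samples to compare two halves")
    half = len(samples) // 2
    return slope(samples[:half]), slope(samples[half:])
```

If the late rate is clearly larger than the early rate, the growth is accelerating rather than linear. For scale, going from 24 MB to about 2150 MB in nearly four days averages out to something on the order of 22 MB per hour.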

I also monitored that with procmon and can provide a break-down of events, but they're probably not very interesting given that it's established that nsi is at fault and that none of its events are particularly revealing.

Is there any tool for tracking what processes are making requests of a particular service?

What's my next move given that the apparently relevant KB hotfix is not applicable?

strix
  • The KB lists affected files. Have you verified that you do not have one of the affected files? If you do not, then the hotfix you linked to isn't applicable and more than likely has nothing to do with the behavior you describe. *Furthermore, svchost does not have a memory leak; applications that communicate with it have the memory leak. There is a huge difference.* Do you have the software update that included this hotfix installed? – Ramhound Aug 30 '16 at 16:16
  • Install the WPT (part of the Win10 SDK: https://dev.windows.com/en-us/downloads/windows-10-sdk which also runs on Win7 and later), run WPRUI.exe, select **First Level**, **CPU usage**, **VirtualAlloc usage**, **Resident Analysis** and click **start**. Now capture 5 minutes of the memory usage growth and click on **Save** to store the report in an ETL file. Compress the large ETL file into a zip/RAR file, upload it (OneDrive, Dropbox, Google Drive) and post the share link here. I'll try to analyze it and see what the service is doing – magicandre1981 Aug 30 '16 at 17:17
  • @Ramhound yes, they're part of the base OS anyway (`rpcrt`, `rpchttp`). I assume the reason the hotfix said 'not applicable' is that it's fairly old (2014) and these files have already been patched via windows update. The memory leak will show up in the WSS/PWS/commit sizes of the affected processes, but that doesn't mean that the root cause is in the affected process (`nsi`). As explained in KB 2847346, processes that don't cancel notifications can cause `nsi` to leak. – strix Aug 30 '16 at 17:23
  • @magicandre1981 will do, and thanks for the offer! If I can work out how to use it, I will first dig around and see if there's anything else useful I can report. – strix Aug 30 '16 at 17:31
  • @strix - You assumed incorrectly. It's not applicable because you likely have the actual KB that contained the hotfix installed. That is the reason I want you to verify file versions: to confirm for yourself that you're not affected by the memory leak described in the KB you linked to. – Ramhound Aug 30 '16 at 17:37
  • @magicandre1981 http://bit.ly/2c6eoSu; PID of nsi in that trace is 1808. Let me know when you've got it so I can take it offline. This recording is from a nearly-fresh boot without any sensitive apps running. (I did have a look at WPA, but really, making sense of it requires more knowledge of windows' architecture than I have.) NB: I had to go for ADK 8.1 because SDK 10 gave me 'not a valid win32 executable', but I assume that is as good. Thanks again for your help. – strix Aug 30 '16 at 20:12
  • I only see some memory usage because of a WorkerFactory that does some work (ntdll.dll!NtWaitForWorkViaWorkerFactory calls ntoskrnl.exe!RtlpCreateUserThreadEx). This is a side effect of other tools that query network data, like the sidebar (close gadgets that display network statistics) – magicandre1981 Aug 31 '16 at 15:39
  • @magicandre1981 Thanks for trying. Perhaps the problem wasn't being tickled at the time, though if memory serves the process did grow by about 500k in that 7 minute period. Now, after the machine has been running for a while and the problem is more overt, there are about 40 or so threads, most of which have a start address of `ntdll.dll!RtlDestroyHandleTable` in state `Wait:WrQueue` if that's at all meaningful. Periodically, these get destroyed, but they always come back shortly after. Does that give any clue? Reminds me of stuck PID table entries in Unix because the dest. proc is dead. – strix Sep 05 '16 at 16:36
  • @magicandre1981 Since you mention sidebar, I've killed everything in it (which seems to have made sidebar.exe disappear as well) and will track the size of nsi over the next couple of days to see if that makes any difference and will report back. Can't see why it should, but you never know. – strix Sep 05 '16 at 18:50
  • @magicandre1981 After 46 hours, service `nsi` grew from 8 MB to 936 MB even without sidebar active so I guess sidebar isn't it. Any suggestions on how to track down which application is causing `nsi` to allocate and not free memory? Can/should I use WPT/WPA to look for calls to `ntdll.dll!NtWaitForWorkViaWorkerFactory`, and do I need symbols loaded to do so? If so, where do I obtain symbol data? – strix Sep 07 '16 at 16:33
  • Inside WPA you can load the debug symbols: https://msdn.microsoft.com/en-us/windows/hardware/commercialize/test/wpt/symbol-support – magicandre1981 Sep 08 '16 at 04:23
  • were you able to find the cause? – magicandre1981 Oct 12 '16 at 07:24

0 Answers