
Background

I have ten Windows 10 servers that run regression scripts 24/7. A separate machine runs controller software that sends scripts to the 10 servers for execution. When a regression run starts, the following happens:

1) The controller copies the build to the 10 servers using robocopy

2) After all 10 build copies succeed, scripts are sent to each server to be executed

3) Once a script finishes execution on a server, the next script is sent to the server.
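
The flow above can be sketched roughly as follows. This is only an illustration in Python, not the actual controller software: the server names, the `builds` share name, the robocopy retry flags, and the `run_script` callback are all hypothetical, and the copies are done one after another here for simplicity.

```python
import queue
import subprocess
from concurrent.futures import ThreadPoolExecutor

# Hypothetical server names; the real ones are not given in this post.
SERVERS = ["server{:02d}".format(i) for i in range(1, 11)]


def robocopy_command(build_dir, server, share="builds"):
    """Build the robocopy command line for one server (share name is assumed)."""
    destination = "\\\\{}\\{}".format(server, share)  # e.g. \\server01\builds
    # /E copies subdirectories, /R and /W limit retries and the wait between them.
    return ["robocopy", build_dir, destination, "/E", "/R:2", "/W:5"]


def copy_succeeded(exit_code):
    """Robocopy exit codes below 8 mean the copy finished without failures."""
    return exit_code < 8


def copy_build(build_dir, servers, run=subprocess.run):
    """Step 1: copy the build to every server; return the servers that failed."""
    failed = []
    for server in servers:
        result = run(robocopy_command(build_dir, server))
        if not copy_succeeded(result.returncode):
            failed.append(server)
    return failed


def dispatch(scripts, servers, run_script):
    """Steps 2-3: each server takes the next pending script as soon as it is free."""
    pending = queue.Queue()
    for script in scripts:
        pending.put(script)

    def worker(server):
        done = []
        while True:
            try:
                script = pending.get_nowait()
            except queue.Empty:
                return done
            run_script(server, script)  # assumed to block until the script finishes
            done.append(script)

    with ThreadPoolExecutor(max_workers=len(servers)) as pool:
        return dict(zip(servers, pool.map(worker, servers)))
```

In this sketch, `dispatch` would only be called when `copy_build` returns an empty list, which matches step 2: no script is sent until every copy has succeeded.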

Problem

Some of the servers seem to go into a sleep/hung state unless someone is connected to them remotely (via RDC). To confirm this, I did the following:

1) Before a run is started, log out of all remote connections

2) Wait for 5-10 minutes and start the run

3) The build copy mentioned above starts for some of the machines. For the others it does not. I verify this from the logs on the controller

4) The build copy usually takes about 10 minutes. After waiting 15 minutes, I log in via RDC to the machines which have issues (build copy not happening)

5) The moment I log in, the build copy to these machines also starts and finishes in 10 minutes

6) For now, I am logged into all the machines 24/7

What I have tried

I assumed this could be a power settings problem, so I have made the following changes:

1) Set "Turn off hard disk after" to Never

2) Set "Sleep after" to Never

3) Set "Hibernate after" to Never (Some machines don't have this option. I tried the steps in this link Windows 10 Hibernation not available but got a "firmware does not support this option" error on the servers. The machines that have the hang problem don't have this setting)
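
For reference, the same three timeouts can also be set from the command line with `powercfg /change`, where a value of 0 minutes means "never". The sketch below is an assumption about how this could be scripted across machines, not what was actually run here (the changes above were made through the power settings); it assumes the servers are on AC power, and the function names are made up for illustration.

```python
import subprocess

# powercfg /change setting names for the three timeouts listed above (AC power).
TIMEOUT_SETTINGS = ["disk-timeout-ac", "standby-timeout-ac", "hibernate-timeout-ac"]


def never_timeout_commands():
    """Return one powercfg command per setting; a timeout of 0 means 'never'."""
    return [["powercfg", "/change", name, "0"] for name in TIMEOUT_SETTINGS]


def apply_settings(run=subprocess.run):
    """Run each command. check=False so a machine without hibernate support
    (as described above) does not abort the remaining settings."""
    for command in never_timeout_commands():
        run(command, check=False)


if __name__ == "__main__":
    # Needs an elevated prompt on the Windows machine being configured.
    apply_settings()
```

Running `powercfg /a` on a server lists the sleep states that machine supports, which may help explain why the hibernate option is missing on some of them.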

I am not sure which other settings to change. I am hoping someone has faced this problem and can suggest a solution.

  • So, the controller machine is running robocopy and moving files to the remote worker machines over the network? If the robocopy is not starting, then why not? You should be able to troubleshoot by simply running the robocopy manually on the controller. Have you checked basic things? Like is the worker pingable on the network? Is there any robocopy error? It seems like this should be fairly easy to troubleshoot. I would suggest that it might be waiting for some type of user interaction and that could be a result of the user context you are executing the scripts in or the parameters for robocopy – Appleoddity May 24 '20 at 14:45
  • Thanks for your comment. Robocopy is working fine. Out of the 10 machines, the copy happens for 6-7 machines (this varies). For the remaining 3-4 it is paused until I RDC into them. I have also run robocopy manually from the controller, and it hangs as well (until I RDC). The servers are pingable. Robocopy reports no error. My assumption is that this is not a robocopy problem but the machines going into some sort of sleep state. The moment I log in to them, it wakes them up and the copies resume – sudhindraaithal May 24 '20 at 15:04
