I have an issue with a mainframe z/OS server that sends files by FTP to another server in a different data center. The transfers run every day from morning to night, but on Monday and Tuesday nights some of them fail with broken pipe errors.
The failure does not depend on the size of the file being transferred. Sometimes the transfer runs for 20 seconds and moves 25 MB before the connection breaks with the broken pipe error. Other times the failure comes after 6 minutes, with 350 MB transferred.
So there is no clear pattern in size or duration. The only consistent factor is that the failures happen on Monday and Tuesday nights. For the rest of the week there are no issues.
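To double-check the Monday/Tuesday pattern, the failure times can be tallied by weekday and hour. This is only a minimal sketch: it assumes the failure timestamps have been exported as ISO 8601 strings, and the sample values below are made up for illustration, not taken from my real logs. The function name `tally_failures` is hypothetical.

```python
from collections import Counter
from datetime import datetime

# Fixed English names so the output does not depend on the system locale.
DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]


def tally_failures(timestamps):
    """Count failures per (weekday name, hour of day) bucket."""
    counts = Counter()
    for ts in timestamps:
        moment = datetime.fromisoformat(ts)
        counts[(DAYS[moment.weekday()], moment.hour)] += 1
    return counts


if __name__ == "__main__":
    # Made-up sample timestamps (assumption, for illustration only).
    sample = [
        "2024-01-08T21:14:05",
        "2024-01-08T21:47:30",
        "2024-01-09T22:03:11",
    ]
    for (day, hour), n in sorted(tally_failures(sample).items()):
        print(f"{day} {hour:02d}:00  failures={n}")
```

If the failures cluster in a narrow window on those two nights, that window can be compared against scheduled jobs, backups, or network maintenance at either data center.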
The destination server has plenty of disk space, with at least 20 GB free at any time, and the files being transferred are always smaller than 1 GB, so the broken pipe is not caused by a lack of space at the destination. We tried adding a TIMEOUT parameter of 1000 seconds to the mainframe FTP process that sends the data, but that did not help.
In which direction should I focus the investigation?
- Should this be investigated from a network perspective, between source and destination?
- Should the destination server's configuration parameters be reviewed?
- Could this be due to a high load of connections coming into the destination server on Monday and Tuesday nights? If so, is there an option to prioritize connections from the mainframe source to the destination?