You're right, the specification is confusing on this point. In particular, it says:
When the merchant's server receives the Payment message, it must determine whether or not the transactions satisfy conditions of payment. If and only if they do, if [sic] should broadcast the transaction(s) on the Bitcoin p2p network.
and
Customer authorizes payment to the merchant's address and broadcasts the transaction through the Bitcoin p2p network.
The diagram in the specification also shows the wallet broadcasting the transaction.

What is established, however, is that the transaction in the Payment message sent to the merchant is the same transaction the wallet app broadcasts. IMHO, it's safe to assume that both the customer's wallet and the merchant attempt to broadcast the transaction. Since both parties hold the transaction, either of them can make sure it reaches the network.
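
To illustrate why a double broadcast is harmless, here is a rough Python sketch. It is not taken from the specification or from any wallet: `ToyNetwork`, `merchant_receive_payment`, and the placeholder transaction bytes are all hypothetical names made up for this example. It only assumes that nodes identify a transaction by its hash, so a second broadcast of the same bytes is ignored.

```python
import hashlib


def txid(raw_tx: bytes) -> str:
    """Toy identifier: double SHA-256 of the serialized bytes, byte-reversed
    for display (how legacy, non-SegWit transaction ids are computed)."""
    digest = hashlib.sha256(hashlib.sha256(raw_tx).digest()).digest()
    return digest[::-1].hex()


class ToyNetwork:
    """Hypothetical stand-in for the p2p network: a mempool keyed by txid."""

    def __init__(self):
        self.mempool = {}

    def broadcast(self, raw_tx: bytes) -> bool:
        """Return True if the transaction was new, False if already known."""
        tid = txid(raw_tx)
        if tid in self.mempool:
            return False  # duplicate broadcast is a no-op
        self.mempool[tid] = raw_tx
        return True


def merchant_receive_payment(network, raw_txs, satisfies_conditions) -> bool:
    """Merchant side: broadcast if and only if every transaction in the
    Payment message satisfies the conditions of payment."""
    if not raw_txs or not all(satisfies_conditions(tx) for tx in raw_txs):
        return False
    for tx in raw_txs:
        network.broadcast(tx)
    return True


network = ToyNetwork()
raw_tx = b"placeholder bytes standing in for a signed transaction"

# The wallet broadcasts first, then sends the same bytes in the Payment message.
wallet_was_first = network.broadcast(raw_tx)

# The merchant checks the payment (trivially accepted here) and broadcasts too.
merchant_accepted = merchant_receive_payment(network, [raw_tx], lambda tx: True)
```

After both steps the toy mempool holds exactly one entry, because the wallet and the merchant submitted the same transaction. The order could just as well be reversed; whichever party broadcasts second simply has no effect.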